spatial processes and statistical modelling

38
1 Spatial processes and statistical modelling Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8

Upload: major

Post on 10-Feb-2016

51 views

Category:

Documents


0 download

DESCRIPTION

Spatial processes and statistical modelling. Peter Green University of Bristol, UK BCCS GM&CSS 2008/09 Lecture 8. . . . . . . . . Spatial indexing. Continuous space Discrete space lattice irregular - general graphs areally aggregated Point processes - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Spatial processes and statistical modelling

1

Spatial processes and statistical modelling

Peter GreenUniversity of Bristol, UK

BCCS GM&CSS 2008/09 Lecture 8

Page 2: Spatial processes and statistical modelling

2

• Continuous space

• Discrete space– lattice– irregular - general graphs– areally aggregated

• Point processes– other object processes

Spatial indexing

Page 3: Spatial processes and statistical modelling

3

Space vs. time• apparently slight difference• profound implications for

mathematical formulation and computational tractability

Page 4: Spatial processes and statistical modelling

4

Requirements of particular application domains• agriculture (design)• ecology (sparse point pattern, poor data?)• environmetrics (space/time)• climatology (huge physical models)• epidemiology (multiple indexing)• image analysis (huge size)

Page 5: Spatial processes and statistical modelling

5

Key themes• conditional independence

– graphical/hierarchical modelling• aggregation

– analysing dependence between differently indexed data

– opportunities and obstacles• literal credibility of models• Bayes/non-Bayes distinction blurred

Page 6: Spatial processes and statistical modelling

6

Why build spatial dependence into a model?• No more reason to suppose

independence in spatially-indexed data than in a time-series

• However, substantive basis for form of spatial dependent sometimes slight - very often space is a surrogate for missing covariates that are correlated with location

Page 7: Spatial processes and statistical modelling

7

Discretely indexed data

Page 8: Spatial processes and statistical modelling

8

Modelling spatial dependence in discretely-indexed fields• Direct• Indirect

– Hidden Markov models– Hierarchical models

Page 9: Spatial processes and statistical modelling

9

Hierarchical models, using DAGs

Variables at several levels - allows modelling of complex systems, borrowing strength, etc.

Page 10: Spatial processes and statistical modelling

10

Modelling with undirected graphsDirected acyclic graphs are a natural

representation of the way we usually specify a statistical model - directionally:

• disease symptom• past future• parameters data ……whether or not causality is understood.But sometimes (e.g. spatial models) there is

no natural direction

Page 11: Spatial processes and statistical modelling

11

Conditional independenceIn model specification, spatial

context often rules out directional dependence (that would have been acceptable in time series context)

X0 X1 X2 X3 X4

Page 12: Spatial processes and statistical modelling

12

Conditional independenceIn model specification, spatial context

often rules out directional dependenceX20 X21 X22 X23 X24

X00 X01 X02 X03 X04

X10 X11 X12 X13 X14

Page 13: Spatial processes and statistical modelling

13

Conditional independenceIn model specification, spatial context

often rules out directional dependenceX20 X21 X22 X23 X24

X00 X01 X02 X03 X04

X10 X11 X12 X13 X14

Page 14: Spatial processes and statistical modelling

14

Directed acyclic graph

)|()( )(pa vVv

v xxpxp

in general:

for example:

a b

c

dp(a,b,c,d)=p(a)p(b)p(c|a,b)p(d|c)In the RHS, any distributions are legal, and uniquely define joint distribution

Page 15: Spatial processes and statistical modelling

15

Undirected (CI) graphX20 X21 X22

X00 X01 X02

X10 X11 X12Absence of edge denotes conditional independence given all other variablesBut now there are non-trivial constraints on conditional distributions

Regular lattice, irregular graph, areal data...

Page 16: Spatial processes and statistical modelling

16

Undirected (CI) graphX20 X21 X22

X00 X01 X02

X10 X11 X12

C

CC XVXp ))(exp)(

iC

CCii XVXXp ))(exp)|(

)|()|( iiii XXpXXp

then

and so

The Hammersley-Clifford theorem says essentially that the converse is also true - the only sure way to get a valid joint distribution is to use ()

()

clique

Suppose we assume

Page 17: Spatial processes and statistical modelling

17

Hammersley-CliffordX20 X21 X22

X00 X01 X02

X10 X11 X12

C

CC XVXp )(exp)(

)|()|( iiii XXpXXp

A positive distribution p(X) is a Markov random field

if and only if it is a Gibbs distribution

- Sum over cliques C (complete subgraphs)

Page 18: Spatial processes and statistical modelling

18

Partition functionX20 X21 X22

X00 X01 X02

X10 X11 X12

C

CC XVXp )(exp)(

Almost always, the constant of proportionality in

is not available in tractable form: an obstacle to likelihood or Bayesian inference about parameters in the potential functionsPhysicists call the partition function

))( CC XV

X C

CC XVZ )(exp

Page 19: Spatial processes and statistical modelling

19

Markov properties for undirected graphs• The situation is a bit more complicated

than it is for DAGs. There are 4 kinds of Markovness:

• P – pairwise– Non-adjacent pairs of variables are

conditionally independent given the rest• L – local

– Conditional only on adjacent variables (neighbours), each variable is independent of all others

Page 20: Spatial processes and statistical modelling

20

• G – global– Any two subsets of variables separated by a

third are conditionally independent given the values of the third subset.

• F – factorisation– the joint distribution factorises as a product

of functions of cliques• In general these are different, but FGLP

always. For a positive distribution, they are all the same.

Page 21: Spatial processes and statistical modelling

21

Gaussian Markov random fields: spatial autoregression

C

CC XVXp )(exp)(

)|()|( iiii XXpXXp is a multivariate Gaussian distribution, and

If VC(XC) is -ij(xi-xj)2/2 for C={i,j} and 0 otherwise, then

is the univariate Gaussian distribution),(~| 22

iij

jijiii XNXX

ij

iji /12where

Page 22: Spatial processes and statistical modelling

22

Inverse of (co)variance matrix:dependent case

3210241011210012

A B C D

non-zero

non-zero),|,cov( CADB

A B C D

A

B

C

D

Gaussian random fields

Page 23: Spatial processes and statistical modelling

23

Non-Gaussian Markov random fields

ji

ji xx

~

)cosh(log)1(exp

C

CC XVXp )(exp)(

Pairwise interaction random fields with less smooth realisations obtained by replacing squared differences by a term with smaller tails, e.g.

Page 24: Spatial processes and statistical modelling

24

Discrete-valued Markov random fields

C

CC XVXp )(exp)(

Besag (1974) introduced various cases of

for discrete variables, e.g. auto-logistic (binary variables), auto-Poisson (local conditionals are Poisson), auto-binomial, etc.

Page 25: Spatial processes and statistical modelling

25

Auto-logistic model

C

CC XVXp )(exp)(

ji

jiiji

ii xxx~

exp

)|()|( iiii XXpXXp is Bernoulli(pi) with

ij

jijiii xpp )())1/(log(

- a very useful model for dependent binary variables (NB various parameterisations)

(Xi = 0 or 1)

Page 26: Spatial processes and statistical modelling

26

Statistical mechanics models

C

CC XVXp )(exp)(

The classic Ising model (for ferromagnetism) is the symmetric autologistic model on a square lattice in 2-D or 3-D. The Potts model is the generalisation to more than 2 ‘colours’

ji

jii

x xxIXpi

~

][exp)(

and of course you can usefully un-symmetrise this.

Page 27: Spatial processes and statistical modelling

27

Auto-Poisson model

C

CC XVXp )(exp)(

ji

jiiji

iii xxxx~

)!log(exp

)|()|( iiii XXpXXp is Poisson ))(exp(

ij

jiji x

For integrability, ij must be 0, so this onlymodels negative dependence: very limited use.

Page 28: Spatial processes and statistical modelling

28

Hierarchical models and hidden Markov processes

Page 29: Spatial processes and statistical modelling

29

Chain graphs• If both directed and

undirected edges, but no directed loops:

• can rearrange to form global DAG with undirected edges within blocks

Page 30: Spatial processes and statistical modelling

30

Chain graphs• If both directed and

undirected edges, but no directed loops:

• can rearrange to form global DAG with undirected edges within blocks

• Hammersley-Clifford within blocks

Page 31: Spatial processes and statistical modelling

31

Hidden Markov random fields• We have a lot of freedom modelling

spatially-dependent continuously-distributed random fields on regular or irregular graphs

• But very little freedom with discretely distributed variables

use hidden random fields, continuous or discrete

• compatible with introducing covariates, etc.

Page 32: Spatial processes and statistical modelling

32

Hidden Markov models

z0 z1 z2 z3 z4

y1 y2 y3 y4

e.g. Hidden Markov chain

observed

hidden

Page 33: Spatial processes and statistical modelling

33

Hidden Markov random fields

Unobserved dependent field

Observed conditionally-independent discrete field

(a chain graph)

Page 34: Spatial processes and statistical modelling

34

Spatial epidemiology applicationsindependently, for each region i. Options:• CAR, CAR+white noise (BYM, 1989)• Direct modelling of ,e.g. SAR• Mixture/allocation/partition models:

• Covariates, e.g.:

)(~ iii ePoissonY casesexpected

cases

relative risk

)log( i))cov(log( i

izi eYEi

)(

)exp()( Tiizi xeYE

i

Page 35: Spatial processes and statistical modelling

35

Spatial epidemiology applications

Spatial contiguity is usually somewhat idealised

Page 36: Spatial processes and statistical modelling

36

relativerisk

parameters

Richardson & Green (JASA, 2002) used a hidden Markov random field model for disease mapping

)(Poisson~ izi eyi

observedincidence

expectedincidencehidden

MRF

Spatial epidemiology applications

Page 37: Spatial processes and statistical modelling

37)(Poisson~ ii eY

Chain graph for disease mapping

e

Y

zk

)(Poisson~ izi eYi

based on Potts model

Page 38: Spatial processes and statistical modelling

38

Larynx cancer in females in France

SMRs

)|1( ypiz

ii Ey /