International Statistical Review, 49 (1981) 21-43 Longman Group Limited/Printed in Great Britain

An Introduction to Spatial Point Processes and Markov Random Fields

Valerie Isham
Department of Statistical Science, University College London, Gower Street, London WC1E 6BT, England

Summary

Binary-valued Markov random fields may be used as models for point processes with interactions (e.g. repulsion or attraction) between their points. This paper aims to provide a simple nontechnical introduction to Markov random fields in this context. The underlying spaces on which points occur are taken to be countable (e.g. lattice vertices) or continuous (Euclidean space). The role of Markov random fields as equilibrium processes for the temporal evolution of spatial processes is also discussed and various applications and examples are given.

Key words. Gibbs state; Graphical model; Ising model; Lattice process; Markov random field; Nearest neighbour potential; Spatial point process; Spatial-temporal evolution.

1 Introduction

A system in which points are distributed at random in time and/or space is called a point process. In applications these points may represent, for example, the locations of plants, cities, vehicles or galaxies in appropriate regions of Euclidean space. Other examples are when a point represents the place and time of identification of a case of an infectious disease or the position, time and energy of an earthquake.

In this paper we consider spatial processes, in which the points lie in some region which is usually contained in a Euclidean space of d dimensions, for some number d which is greater than one, and where each dimension represents space rather than time. Clearly, if a dimension in this space does represent time then special considerations apply, partly because of the 'directionality' implicit in time which is usually inappropriate in a spatial dimension and also because of the particular interest in the existence of space × time interactions.

Mathematically, a point process in some space, which we shall denote by Ω, is specified if we know the (random) number N(A) of points contained in any arbitrary subset A of Ω. The random function N is, in probability theory, called an integer-valued random measure and it is usual to assume that N is finite, i.e. that there are only a finite number of points of the process in any bounded region. The theory of such processes is well developed and a useful summary is given by Daley & Vere-Jones (1972). A more recent introductory account is given by Cox & Isham (1980), while more mathematical treatments may be found in books by Kallenberg (1976) and Matthes, Kerstan & Mecke (1978).

The most basic point process on the real line is the homogeneous Poisson process. In this, the numbers of points in nonoverlapping sets are independent and the number N(A) of points in any set A has a Poisson distribution with mean λ|A|, where |A| is the length (Lebesgue measure) of A, and λ is called the rate of the process. The process has the important property that for any set A, conditionally upon N(A) = n, these n points are independently and uniformly distributed over A. Another well-known property of the Poisson process in one dimension is that the intervals between successive points are independently and exponentially distributed random variables with parameter λ, and this property is often used to define the process. There are many other point processes in one dimension which can be specified easily in terms of their interval sequences but, unlike the Poisson process, very often their 'counting' properties are far from straightforward. Since the intervals have no natural analogue in higher dimensions, these interval specifications do not readily extend when d > 1. For example, a renewal process in one dimension is specified as having independent, identically distributed intervals. The most obvious generalization of this does not lead to a genuine spatial process. For, if we let {X_i: i = 1, 2, …} be a sequence of independent, identically distributed d-dimensional vectors with nonzero mean, and consider the process of points with coordinates {S_k = X_1 + … + X_k: k = 1, 2, …}, then it is clear that, with high probability, the process will be contained in a funnel about the mean direction. On the other hand, for those processes which are simply specified in terms of the counting measure N, the dimension of the space on which the process is defined is usually unimportant. Thus, for example, the definition of the Poisson process in terms of N extends immediately to higher dimensional spaces.
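
The conditional-uniformity property gives an immediate simulation recipe for a bounded window. The short Python sketch below (an illustration added here, not part of the original paper; the function name and window parameters are mine) draws N from a Poisson distribution with mean λ|A| and then scatters N points uniformly over a rectangle.

```python
import numpy as np

def simulate_poisson(rate, width, height, rng=None):
    """Homogeneous Poisson process on [0, width] x [0, height]:
    draw N ~ Poisson(rate * area), then place the N points
    independently and uniformly over the rectangle."""
    rng = np.random.default_rng() if rng is None else rng
    n = rng.poisson(rate * width * height)
    xs = rng.uniform(0.0, width, size=n)
    ys = rng.uniform(0.0, height, size=n)
    return np.column_stack([xs, ys])

points = simulate_poisson(rate=5.0, width=2.0, height=1.0)
print(points.shape)   # (N, 2), with N having mean 10
```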

In specifying models for spatial processes it is important to incorporate the clustering or inhibition of points which is often a feature of the physical situations which are modelled by point processes. Thus, in § 2 of this paper, we shall describe models for spatial point processes in which there are interactions between the points, for example, of attraction or repulsion. We start by assuming that Ω consists of a finite set of sites, at each of which there is either one or no point, so that the process can be described by a finite set of interacting binary variables. If each pair of sites in Ω is classified as either 'neighbouring' or not, then we can look for joint distributions of the binary variables which reflect this neighbour structure. In particular, we can require that the variables satisfy a kind of Markov property. This is that the conditional distribution of a subset of the variables, given the values of the remaining variables, depends only on the values at sites which are neighbours of the sites corresponding to this subset. A process with this property is a Markov random field.

Now any joint distribution for the variables attached to the sites of Ω can always be expressed in terms of a potential function (i.e. a real-valued function defined on the subsets of Ω which vanishes on the empty set ∅), and in this form is often known as a Gibbs state. In § 2.1 it is shown that if a process is a Markov random field then the corresponding potential function is what is called a nearest neighbour potential; that is, the potential is a sum of interactions between variables, in which a set of variables contributes a nonzero interaction only if the corresponding sites are all neighbours of each other. The converse, that a Gibbs state with a nearest neighbour potential is a Markov random field, also holds. This characterization of Markov random fields in terms of nearest neighbour potentials is then extended to when Ω is countably infinite (i.e. the sites of Ω can be labelled by the integers). Finally in § 2.1 we consider the more general question of how to specify a probability measure in terms of a set of functions which are to represent the conditional distributions of points on finite sets of sites given the points on the remaining sites.

The most usual case of a countably infinite set Ω is when the sites form the vertices of a rectangular lattice; this is of particular importance in statistical physics. For example, the Ising model consists of a set of binary variables located at the vertices of a rectangular lattice, where each variable interacts only with the variables at the four nearest lattice vertices. The model was originally proposed as a model of a ferromagnet (Ising, 1925) with each variable representing the spin (up or down) of an electron. Such models are of great interest and have been the subject of much research. Their development was boosted by the introduction of the spatial Markov property (Dobrushin, 1968a), which led to a considerable expansion in the probability and statistical mechanics literature. It therefore seems appropriate to review some of this work here for a wider audience. The paper is purely expository and is written with the aim of providing a simple nontechnical introduction to the work which has been done on Markov random fields, in particular that in connection with spatial point processes, and also of indicating some particularly useful and relevant papers in the vast literature on the subject.

In § 2.2, we go on to consider 'genuinely' spatial point processes in which the set of sites is no longer assumed to be finite or countably infinite and the process has its points located in a d-dimensional Euclidean space, where, in general, d > 1. Again a neighbour structure is defined on the space in which the points lie and we examine spatial point processes reflecting this structure, to find analogues of the results of § 2.1. The neighbour structure means that with any spatial region we can associate a boundary region which consists of all the sites outside the region which are neighbours of at least one site within the region. Then, a process is a Markov random field if the conditional distribution of the points in a particular region, given all the points outside that region, depends only on those points in the boundary region.

In this case we study the measure defining the process in terms of its density with respect to another measure. It is most natural to take this latter measure to be that of a Poisson process, which is the simplest spatial point process and, as mentioned earlier, has no interactions between its points. For a Markov random field, this density has a particular product form in which only sets of mutually neighbouring points contribute terms to the product, and this property characterizes such fields. As when Ω is countable, we also discuss the general problem of specifying a process in terms of a suitable set of functions which are to represent conditional distributions.

In § 2, the processes are purely spatial and fixed in time, but it is both important and of considerable interest to look at the temporal evolution of such processes. Thus in § 3 we discuss the evolution of spatial processes when there are births, deaths and movement of the points, which are influenced by interactions between the points. Such processes are termed spatial-temporal and we can look for such evolutions which will generate the spatial processes of § 2 as equilibrium processes. For example, when Ω is finite, the class of Markov random fields is shown to be exactly the class of equilibrium distributions of time-reversible birth and death processes in which the birth and death rates are influenced only by neighbouring points. Again, § 3 is divided into two parts with § 3.1 concentrating on processes with a countable (i.e. finite or countably infinite) space Ω, and § 3.2 concerned with the noncountable case.

Finally, in § 4 some specific applications and examples are discussed. Two particular Markov random fields on a rectangular lattice are described. In the first, the neighbours of a lattice vertex are the four nearest vertices and a homogeneous nearest neighbour potential is assumed. The second has a special construction in terms of Markov chains and the neighbours of any vertex turn out to be the eight nearest vertices. We also give an example of a Markov random field in continuous space in which the inhibition of points depends only on the number of neighbouring points.

An unusual application of the theory of Markov random fields is described, in connection with log linear models for contingency tables. This connection is exploited in the definition of a class of models which is contained in the class of hierarchical models and itself contains the class of decomposable models.

The section ends with a brief discussion of some of the problems of fitting models to spatial data.

2 Spatial processes

2.1 Processes with a countable set of sites

Point processes are usually thought of as consisting of randomly located points in continuous spaces, but it is convenient to start by considering the situation in which Ω consists of a countable collection of sites at which process points may be located. We shall assume that no multiple occupancy is allowed, so that a binary (0 or 1) variable is attached to each site, representing the absence or presence of a process point. The aim is to find ways of specifying such processes with various sorts of interaction between the points and to examine some of their properties. We start by assuming that the set of sites Ω is finite.

A realization of the process can be regarded as a finite subset A of Ω, where A is the set of sites at which points are located. Thus, if Ω is finite, the process can be described by a discrete probability distribution, μ, on the set S(Ω) of subsets of Ω. Suppose that μ(A) > 0 for all A ∈ S(Ω). Then we can define a real-valued function ψ on S(Ω) by

$$\psi(A) = -\log\{\mu(A)/\mu(\emptyset)\}, \qquad (2.1)$$

where ∅ denotes the empty set. The function ψ is often called a potential and satisfies ψ(∅) = 0. Rewriting (2.1) we have

$$\mu(A) = C \exp\{-\psi(A)\}, \qquad (2.2)$$

where C is a normalizing constant obtained by summing (2.2) over all subsets A in S(Ω). Note that the potential ψ(A), associated with a realization of the process having points at exactly those sites in A, can always be decomposed into a sum of contributions from each subset of these points. Thus we can write

$$\psi(A) = \sum_{F \subseteq A} U_\psi(F), \qquad (2.3)$$

where

$$U_\psi(F) = \sum_{F' \subseteq F} (-1)^{|F - F'|} \psi(F') \qquad (2.4)$$

and |F| denotes the number of sites in F. The function U_ψ is called the interaction potential. Note also that, in a statistical context, the potential ψ(A) is, essentially, the log likelihood of the realization A.
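
As a small numerical illustration of (2.1)-(2.4) (added here; not part of the original paper, and the toy distribution is arbitrary), the following Python sketch computes the potential and the interaction potential for a distribution on the subsets of three sites and checks the decomposition (2.3).

```python
from itertools import chain, combinations
from math import log

def subsets(s):
    """All subsets of s, as frozensets, including the empty set."""
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, k) for k in range(len(s) + 1))]

def potential(mu, A):
    """psi(A) = -log{mu(A)/mu(empty)}, equation (2.1)."""
    return -log(mu[frozenset(A)] / mu[frozenset()])

def interaction(mu, F):
    """U_psi(F) = sum over F' of (-1)^{|F-F'|} psi(F'), equation (2.4)."""
    return sum((-1) ** (len(F) - len(Fp)) * potential(mu, Fp)
               for Fp in subsets(F))

sites = {0, 1, 2}
weights = {A: 2.0 ** len(A) for A in subsets(sites)}    # arbitrary positive weights
Z = sum(weights.values())
mu = {A: w / Z for A, w in weights.items()}

A = frozenset({0, 2})
# Decomposition (2.3): psi(A) equals the sum of U_psi over all subsets of A.
assert abs(potential(mu, A) - sum(interaction(mu, F) for F in subsets(A))) < 1e-12
```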

Probability distributions written in the form (2.2) are frequently referred to as Gibbs states. The use of Gibbs's name comes from the connection with statistical mechanics, where the Gibbs canonical ensemble specifies the probability density for the positions of n particles moving, under certain assumptions, in a bounded region of ℝ³. This density is proportional to exp{−φ(ξ_1, …, ξ_n)/(kT)}, where φ(ξ_1, …, ξ_n) is the potential energy associated with particles at ξ_1, …, ξ_n, k being Boltzmann's constant and T the absolute temperature. So far, the only restriction put on the probability distribution μ is that of positivity.

Often, however, the sites in Ω have some structure which we would like μ to reflect. In particular, suppose that we have a relation defined on Ω which specifies whether or not any two sites in Ω are neighbours, each site being defined to be its own neighbour. Then we define a set F ⊆ Ω to be a clique if any two sites in F are neighbours; these sites need not be distinct, so that any single site forms a clique. A nearest neighbour potential ψ is a potential for which the corresponding interaction potential U_ψ vanishes on any subset F of Ω unless F is a clique. Thus the only contributions to ψ(A) in the decomposition (2.3) come from subsets of A which are themselves cliques. An interesting class of probability distributions, therefore, consists of those distributions μ for which the corresponding potential ψ, given by (2.1), is a nearest neighbour potential.

For example, suppose that Ω is a finite portion of a 2-dimensional rectangular lattice and that two distinct sites in Ω are neighbours if they are adjacent lattice vertices. Then the cliques are the single sites of Ω together with any pairs of adjacent sites in Ω, and there are no cliques of more than two sites. If the sites are denoted by ξ_{ij}, where (i, j) are the rectangular coordinates of the vertices, then a nearest neighbour potential ψ on Ω must have the form

$$\psi(A) = \sum_{\xi_{ij} \in A} U_\psi(\xi_{ij}) + \sum U_\psi(\xi_{ij}, \xi_{i'j'}), \qquad (2.5)$$

for any A ∈ S(Ω), where the second sum is over pairs of sites in A and U_ψ(ξ_{ij}, ξ_{i'j'}) is nonzero only if either i' = i + 1 and j' = j, or i' = i and j' = j + 1. If we assume that the process is homogeneous, with

$$U_\psi(\xi_{ij}) = \alpha, \qquad U_\psi(\xi_{ij}, \xi_{i+1,j}) = \beta_1, \qquad U_\psi(\xi_{ij}, \xi_{i,j+1}) = \beta_2,$$

then the potential ψ(A) is said to be homogeneous and is given by ψ(A) = nα + n_1β_1 + n_2β_2, where n = |A| is the number of sites in A, n_1 is the number of sites ξ_{ij} in A for which ξ_{i+1,j} is also in A, and n_2 is the number of sites ξ_{ij} in A for which ξ_{i,j+1} is also in A. Further simplification is obtained if the lattice possesses a directional symmetry so that β_1 = β_2.
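
To make the homogeneous potential concrete, here is a short Python sketch (mine, not from the paper; the parameter values are arbitrary) that evaluates ψ(A) = nα + n_1β_1 + n_2β_2 and the corresponding unnormalized Gibbs weight exp{−ψ(A)} for a configuration of occupied lattice vertices.

```python
from math import exp

def homogeneous_potential(A, alpha, beta1, beta2):
    """psi(A) = n*alpha + n1*beta1 + n2*beta2 for a configuration A given
    as a set of occupied vertices (i, j); n1 counts occupied pairs adjacent
    in the i direction, n2 pairs adjacent in the j direction."""
    n = len(A)
    n1 = sum(1 for (i, j) in A if (i + 1, j) in A)
    n2 = sum(1 for (i, j) in A if (i, j + 1) in A)
    return n * alpha + n1 * beta1 + n2 * beta2

def gibbs_weight(A, alpha, beta1, beta2):
    """Unnormalized probability exp{-psi(A)} as in (2.2); the constant C
    would require a sum over all 2^|Omega| configurations."""
    return exp(-homogeneous_potential(A, alpha, beta1, beta2))

A = {(0, 0), (1, 0), (1, 1)}
print(homogeneous_potential(A, alpha=1.0, beta1=-0.5, beta2=-0.5))
# n = 3, n1 = 1, n2 = 1, so psi(A) = 3.0 - 0.5 - 0.5 = 2.0
```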

Another way of incorporating the neighbour structure of Ω is through the use of Markov random fields (Dobrushin, 1968a). A probability distribution μ on S(Ω) is a Markov random field if the conditional probability of a point at a particular site ξ, given the locations of the points on the remaining sites Ω − ξ, is the same as the conditional probability of a point at ξ given only the locations of points on those sites in Ω − ξ which are neighbours of ξ. Again consideration is restricted to distributions μ satisfying, for all A ∈ S(Ω),

$$\mu(A) > 0. \qquad (2.6)$$

Intuitively one would expect a close connection between the class of Markov random fields on S(Ω) and the class of Gibbs states with nearest neighbour potentials, and, in fact, the two classes are the same. We can use the decomposition (2.4) to give a simple proof as follows. Let ξ be a single site in Ω and let A be a subset of Ω not containing ξ. Then

$$\frac{\mu(A \cup \xi)}{\mu(A)} = \frac{\mathrm{pr}\{\text{points at } \xi \text{ and the sites of } A\}}{\mathrm{pr}\{\text{points at the sites of } A \text{ only}\}} = \frac{\mathrm{pr}\{\text{point at } \xi \mid \text{realization } A \text{ on the sites in } \Omega - \xi\}}{\mathrm{pr}\{\text{no point at } \xi \mid \text{realization } A \text{ on the sites in } \Omega - \xi\}}. \qquad (2.7)$$

If μ is a Markov random field then (2.7) must depend only on the positions of points on those sites in Ω − ξ which are neighbours of the site ξ; we shall denote these sites by ∂ξ. Now by (2.2) and (2.4) respectively,

$$\frac{\mu(A \cup \xi)}{\mu(A)} = \exp\{-\psi(A \cup \xi) + \psi(A)\}, \qquad \psi(A \cup \xi) - \psi(A) = \sum_{F \subseteq A} U_\psi(F \cup \xi).$$

If ψ(A ∪ ξ) − ψ(A) is to depend only on the point at ξ and those in A ∩ ∂ξ, it follows that U_ψ(F ∪ ξ) must vanish unless the sites in F ∪ ξ form a clique. For example, if A = {ξ′} where ξ′ is not a neighbour of ξ, then

$$\psi(\xi' \cup \xi) - \psi(\xi') = U_\psi(\xi' \cup \xi) + U_\psi(\xi),$$

which will not depend on ξ′ only if U_ψ(ξ′ ∪ ξ) = 0. Since ξ is arbitrary, the interaction potential U_ψ vanishes on all subsets F of Ω unless F is a clique and thus μ is a Gibbs state with a nearest neighbour potential. The converse result, that if μ is a Gibbs state with a nearest neighbour potential then μ is a Markov random field, is equally straightforward to demonstrate.
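
The equivalence can also be checked numerically on a toy example. The sketch below (an added illustration, not from the paper; the path graph and parameter values are my own choices) builds μ from a nearest neighbour potential on four sites and confirms that the conditional odds (2.7) at a site depend only on the configuration at its neighbours.

```python
from itertools import chain, combinations
from math import exp

sites = [0, 1, 2, 3]                                   # path graph 0-1-2-3
edges = {frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})}
alpha, beta = 0.3, -0.8                                # arbitrary clique parameters

def subsets(s):
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, k) for k in range(len(s) + 1))]

def psi(A):
    """Nearest neighbour potential: only single sites and edges contribute."""
    return alpha * len(A) + beta * sum(1 for e in edges if e <= A)

weights = {A: exp(-psi(A)) for A in subsets(sites)}
Z = sum(weights.values())
mu = {A: w / Z for A, w in weights.items()}

def odds(xi, A):
    """mu(A + xi)/mu(A), the ratio in (2.7), for xi not in A."""
    return mu[A | {xi}] / mu[A]

# Site 0 has the single neighbour 1, so the odds should not change when the
# configuration on the non-neighbouring sites 2 and 3 is altered.
print(odds(0, frozenset({1})), odds(0, frozenset({1, 3})))   # equal
print(odds(0, frozenset()),    odds(0, frozenset({2, 3})))   # equal
```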

The equivalence of the classes of Markov random fields and Gibbs states with nearest neighbour potentials was proved by Hammersley & Clifford (1971) and subsequently has been proved more simply by many authors; see, for example, Grimmett (1973), Preston (1973), Sherman (1973), Besag (1974) and Moussouris (1974). Earlier proofs of this result, when Ω is a finite subset of a rectangular lattice with the natural neighbour structure, were given by Averintsev (1970) and Spitzer (1971a), who assumed translation invariance.

If, now, Ω is a countable collection of sites with a neighbour structure as before, then the definitions of Gibbs states and Markov random fields become more complicated and questions arise as to the existence of these distributions. We assume that each site has a finite number of neighbours. To define Gibbs states and Markov random fields we proceed as follows. Since S(Ω) is no longer countable, the distribution μ of the point process on Ω is specified via a consistent set of probabilities for the locations of points within finite sets of sites. Let μ_Λ(A) denote the probability of obtaining the configuration A on the set Λ, by which we mean that there are points at each site in A and no points at sites in Λ − A. Here A ∈ S(Λ) and Λ belongs to the set S_f(Ω) of finite subsets of Ω. To define a Markov random field when Ω is finite, we required the conditional probability of the configuration A on Λ, given the configuration on Ω − Λ, to be the same as the conditional probability of the configuration A on Λ given only the configuration on the boundary ∂Λ, where ∂Λ is the set of sites in Ω − Λ which are neighbours of the sites in Λ. In the general countable case, the set Λ is embedded in some larger finite set Λ′, which is big enough to contain both Λ and all its neighbouring sites, i.e. which is such that Λ ∪ ∂Λ ⊆ Λ′, and takes the place of Ω in the definition.

Thus, we define a probability distribution μ on Ω to be a Markov random field if, for all A ∈ S(Λ), Λ ∈ S_f(Ω),

$$\mu_\Lambda(A) > 0, \qquad (2.8)$$

and also, the conditional probability of the configuration A on Λ, given the configuration on Λ′ − Λ, is the same as the conditional probability of the configuration A on Λ, given only the configuration on the boundary ∂Λ, for all A ∈ S(Λ) and Λ, Λ′ ∈ S_f(Ω) with Λ ∪ ∂Λ ⊆ Λ′.

When Ω is countable, any potential ψ defined on Ω, i.e. a real-valued function on S_f(Ω) with ψ(∅) = 0, can be decomposed in terms of the interaction potential U_ψ just as when Ω is finite. Equations (2.3) and (2.4) still hold, the only restriction being that A, and therefore F, is a finite subset of Ω. As before, ψ is a nearest neighbour potential if the interaction potential U_ψ vanishes except on cliques; note that we assumed that all cliques are finite. Then, for countable Ω, the distribution μ is said to be a Gibbs state with nearest neighbour potential ψ if the conditional probability of the configuration A on Λ given the configuration Γ on Λ′ − Λ is proportional to

$$\exp[-\psi\{A \cup (\Gamma \cap \partial\Lambda)\}], \qquad (2.9)$$

for some nearest neighbour potential ψ and for all A ⊆ Λ and Λ, Λ′ ∈ S_f(Ω) with Λ ∪ ∂Λ ⊆ Λ′. It can be shown that such a distribution μ must satisfy (2.8). This extension of the definition of a Gibbs state (originally with an arbitrary pair potential) to a countable set Ω (originally a rectangular lattice) was described in a series of papers by Dobrushin (1968b, c; 1969). He also discussed equivalent extensions given by Minlos (1967a, b), based on limits of Gibbs states on finite sets (thermodynamic limits), and Ruelle (1967), who employed the principle of maximum entropy.

First of all, we note that, for a particular nearest neighbour potential ψ, a Gibbs state always exists (we shall return to this later). However, in contrast with when Ω is finite, this state need not be unique; if more than one Gibbs state corresponds to a particular potential ψ, then phase transition is said to occur. It is of particular interest to know what sort of conditions on ψ cause phase transition. Suppose Ω is a rectangular lattice in d dimensions and ψ is a homogeneous nearest neighbour potential (adjacent lattice vertices being defined as neighbours). Then, if d = 1, there is a unique Gibbs state with potential ψ, while if d > 1, this will be true if, for example, the interactions between points (occupied sites) are sufficiently weak or if the density of points is sufficiently low. A third instance of a unique solution is when all the parameters in the potential are uniformly small (the high temperature case). For example, if the model is applied to ferromagnetic material, it is found that below a certain critical temperature the material is magnetized in one of two equally likely directions, while above the critical temperature the material is demagnetized. An extremely clear account of these results is given by Spitzer (1971b). Some of the very many more general results can be found in the work of Dobrushin (1968c) and Lebowitz & Martin-Löf (1972).

The term phase transition is also used in a slightly different sense when a physical substance exists in different states at, say, different temperatures. Everyday examples of this are the solid/liquid/gas phases of water. For straightforward accounts see Kac (1978), Thompson (1972) and many other statistical mechanics textbooks. In general, phase transition occurs if there is some sort of discontinuity in the properties of a physical substance or model. It is, therefore, a topic of the utmost importance in statistical mechanics and the subject of considerable research effort; see, for example, Ruelle (1969) and the volumes of papers edited by Domb & Green (1972-6).

Now, if μ is a Gibbs state with a nearest neighbour potential ψ then μ is a Markov random field. Conversely, if μ is a Markov random field, we can define a function ψ on the finite subsets of Ω by

$$\psi(A) = -\log\{\mu_\Lambda(A)/\mu_\Lambda(\emptyset)\},$$

which does not depend on Λ as long as A ∪ ∂A ⊆ Λ. Then it follows that ψ is a nearest neighbour potential and that μ is a Gibbs state with potential ψ. This last result generalizes that for finite Ω described earlier and proofs of this and related results are given by Preston (1974, Chapter 4).

Both the Gibbs states with nearest neighbour potentials and the Markov random fields have been defined in terms of the conditional probability of obtaining a configuration A of points on a finite set Λ of sites, given the configuration of points on Λ′ − Λ, where Λ′ is some finite set containing Λ and all its neighbouring sites. In the following, we no longer assume the presence of a neighbour relation on Ω, but consider the more general problem of how to specify a point process on Ω via a set of functions μ_Λ(A; Γ) which are intended to represent the conditional probabilities (defined in terms of Radon-Nikodym derivatives) of obtaining the configuration A on the finite subset Λ of Ω, given the configuration Γ on Ω − Λ; thus Λ ∈ S_f(Ω) and Γ ∈ S(Ω − Λ), where, as before, S_f(Ω) and S(Ω) denote the set of all finite subsets of Ω and the set of all subsets of Ω respectively. It is then natural to ask what conditions these functions must satisfy in order to ensure that there exists a point process on Ω with these functions as its conditional probabilities. Also, under what circumstances is there a unique point process? Clearly the μ_Λ(A; Γ) must

(i) be nonnegative,
(ii) sum to one over all subsets A of Λ, for fixed Γ,
(iii) satisfy, for all A ⊆ Λ, A′ ⊆ Λ′ − Λ, Γ ⊆ Ω − Λ′ and Λ ⊆ Λ′ ∈ S_f(Ω),

$$\mu_{\Lambda'}(A \cup A'; \Gamma) = \mu_\Lambda(A; A' \cup \Gamma) \sum_{A'' \subseteq \Lambda} \mu_{\Lambda'}(A'' \cup A'; \Gamma).$$

Notice that the condition (iii) is a simple decomposition of the conditional probability of obtaining the configuration A on Λ and A′ on Λ′ − Λ given the configuration Γ on Ω − Λ′, since the summation on the right-hand side is simply the conditional probability of the configuration A′ on Λ′ − Λ given Γ on Ω − Λ′. It can be shown (Preston, 1974, Chapter 5) that for a set of continuous functions μ_Λ(A; Γ) satisfying conditions (i), (ii) and (iii) above, at least one distribution μ on Ω exists. Note that a cylinder set is the set of all realizations X on Ω which have the same configuration A on a finite subset Λ of Ω, that is {X : X ∩ Λ = A}, for some Λ and A; these cylinder sets form a basis of open sets for the topology on S(Ω), and the continuity of μ_Λ(·; ·) is with respect to the product topology on S(Λ) × S(Ω − Λ). Furthermore, if (i) is replaced by

(i)′ μ_Λ(A; Γ) > 0 for all A ⊆ Λ, Λ ∈ S_f(Ω), Γ ∈ S(Ω − Λ),

then there exists a unique potential ψ such that

$$\mu_\Lambda(A; \Gamma) = C_{\Lambda,\Gamma} \exp\{-\psi(A \cup \Gamma)\}, \qquad (2.10)$$

where

$$C_{\Lambda,\Gamma}^{-1} = \sum_{A \subseteq \Lambda} \exp\{-\psi(A \cup \Gamma)\},$$

for A ⊆ Λ ∈ S_f(Ω), Γ ∈ S_f(Ω − Λ).

Conversely, suppose that we have a potential ψ defined on S_f(Ω). Then we can define a set of functions {μ_Λ(A; Γ)} by

$$\mu_\Lambda(A; \Gamma) = \frac{\exp\{-\psi(A \cup \Gamma) + \psi(\Gamma)\}}{\sum_{A' \subseteq \Lambda} \exp\{-\psi(A' \cup \Gamma) + \psi(\Gamma)\}}, \qquad (2.11)$$

which satisfy (i)′, (ii) and (iii). If these functions are continuous then, from above, we know that they are conditional probabilities for at least one process. Suppose, for any fixed A ∈ S_f(Ω), we define a function f_A on S_f(Ω − A) by

$$f_A(\Gamma) = \psi(A \cup \Gamma) - \psi(\Gamma), \qquad (2.12)$$

and extend the function to S(Ω − A) (since the finite subsets of Ω − A are dense in S(Ω − A)). Then a necessary and sufficient condition on the potential ψ for the continuity of the μ_Λ(A; Γ) is that the functions f_A should be continuous. Note that if ψ is a nearest neighbour potential, then this condition is satisfied and therefore a Gibbs state with this potential exists.

We have been concerned with the numbers of points at the sites of Ω and so, by excluding multiple occupancy of the sites, have been considering a set of binary variables. The problem of specifying systems of binary variables on a lattice, with interactions, arises in a variety of contexts and has attracted a great deal of attention in the mathematics/physics literature. In particular, in the Ising model for a ferromagnet, each variable represents the spin of an electron. It is assumed that the electrons are sited at the vertices of a rectangular lattice, and that interactions only occur between pairs of adjacent electrons (McCoy & Wu, 1973). Similar lattice models are used to represent gases; in this case each variable indicates the presence or absence of a molecule. Interactions are again restricted to pairs of molecules, though these need not be adjacent. Many authors consider sets of more general interacting variables so that results like the Hammersley-Clifford theorem on the equivalence of Gibbs states and Markov random fields are not restricted to binary variables. Finally, although most of the work in this area is of a mathematical nature, being concerned with questions of the existence and uniqueness of distributions on Ω with particular properties, the statistical analysis of processes of fairly arbitrary interacting variables on a rectangular lattice has been discussed by, amongst others, Besag (1974, 1977) and Bartlett (1975). We shall return to this briefly in § 4.

2.2 Processes in ℝ^d

In this section we consider genuinely spatial processes in ℝ^d, and investigate what happens to the results of § 2.1 when Ω is no longer countable. In the same way that it was simpler to consider Ω to be finite, rather than to be countable, so now it is easier if we assume, initially, that Ω is a bounded region of ℝ^d. For then, the number, N(Ω), of points in Ω is finite with probability one. Again we shall consider processes without multiple occurrences.

For a homogeneous Poisson process with rate ρ, the probability density μ(ξ_1, …, ξ_n) of there being exactly n points in Ω at ξ_1, …, ξ_n is given by

$$\mu(\xi_1, \ldots, \xi_n) = \rho^n e^{-\rho|\Omega|}/n! \quad (n = 1, 2, \ldots), \qquad \mu(\emptyset) = e^{-\rho|\Omega|},$$

where |Ω|, previously the number of sites in Ω, is now the Lebesgue measure of Ω. Suppose that a point process has a distribution which is absolutely continuous with respect to the distribution of this Poisson process. Then we can write

$$\mu(\xi_1, \ldots, \xi_n) = g_n(\xi_1, \ldots, \xi_n)\, \rho^n e^{-\rho|\Omega|}/n! \quad (n = 1, 2, \ldots), \qquad \mu(\emptyset) = g_0 e^{-\rho|\Omega|}, \qquad (2.13)$$

where the functions g_n are invariant under permutations of their arguments. Then g_n specifies the likelihood of a particular configuration of points ξ_1, …, ξ_n relative to a Poisson process of rate ρ. Thus,

$$\mathrm{pr}\{N(\Omega) = n\} = \frac{\rho^n e^{-\rho|\Omega|}}{n!} \int_\Omega d\xi_1 \cdots \int_\Omega d\xi_n\, g_n(\xi_1, \ldots, \xi_n) \quad (n = 1, 2, \ldots), \qquad \mathrm{pr}\{N(\Omega) = 0\} = g_0 e^{-\rho|\Omega|}, \qquad (2.14)$$

and, therefore, the functions g_n are such that the distribution of N(Ω), given by (2.14), is normalized to one. Given that N(Ω) = n, the n points are distributed on Ω with a density which is proportional to g_n.

Note that if μ(ξ_1, …, ξ_n) > 0 (n = 1, 2, …) and μ(∅) > 0, then we can always define a function ψ on the finite subsets of Ω by

$$\psi(\emptyset) = 0, \qquad \psi(\xi_1, \ldots, \xi_n) = -\log\{\mu(\xi_1, \ldots, \xi_n)/\mu(\emptyset)\} \quad (n = 1, 2, \ldots),$$

where ψ is a potential, and then the functions g_n can be written in terms of ψ. Alternatively, we can define a closely related function φ_n by

$$\varphi_n(\xi_1, \ldots, \xi_n) = -\log g_n(\xi_1, \ldots, \xi_n) \quad (n = 1, 2, \ldots), \qquad \varphi_0 = -\log g_0, \qquad (2.15)$$

either by restricting the g_n to be positive, or by allowing φ_n to be infinite-valued. An interesting family of point processes can be obtained by assuming that the functions φ_n have expansions of the form

$$\varphi_n(\xi_1, \ldots, \xi_n) = n a_1 + \sum_{i_1 < i_2} a_2(\xi_{i_2} - \xi_{i_1}) + \sum_{i_1 < i_2 < i_3} a_3(\xi_{i_2} - \xi_{i_1}, \xi_{i_3} - \xi_{i_1}) + \cdots + \sum_{i_1 < \cdots < i_s} a_s(\xi_{i_2} - \xi_{i_1}, \ldots, \xi_{i_s} - \xi_{i_1}). \qquad (2.16)$$

The idea here is that each distinct k-tuple (k = 1, …, s) from the n points ξ_1, …, ξ_n makes a contribution of a_k(r*_1, …, r*_{k−1}) to the function φ_n, where r*_1, …, r*_{k−1} are the positions of k − 1 of the points relative to the kth point. The a_k are even functions of their arguments and must be such that (2.14) defines a proper probability distribution. Thus, for example, if when s = 2 we were to have a_2(u) ≤ −δ < 0 for all arguments u, then there would be a factor of at least exp{½n(n − 1)δ} in pr{N(Ω) = n}, which would dominate other factors for sufficiently large n and make the required convergence impossible. Note that a_1 appears in (2.14) in the form ρe^{−a_1}, so that it may be convenient, in special cases, to fix ρ = 1 and modify a_1 appropriately, or to fix a_1 = 0 and modify ρ.
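
For the pairwise case s = 2 of (2.16), φ_n is just a single-point term plus a sum of pair contributions, which is easy to evaluate directly. The sketch below (my illustration, not from the paper; the particular choice of a_2 and the configuration are arbitrary) computes φ_n, from which the relative density is g_n = exp(−φ_n).

```python
import numpy as np

def phi_pairwise(points, a1, a2):
    """phi_n of (2.16) with s = 2: n*a1 plus a2 evaluated on the
    displacement of every distinct pair of points."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    total = n * a1
    for i in range(n):
        for j in range(i + 1, n):
            total += a2(pts[j] - pts[i])
    return total

def a2(u, r=0.1, strength=2.0):
    """An even pair interaction: points closer than r repel each other."""
    return strength if np.linalg.norm(u) < r else 0.0

config = [(0.05, 0.05), (0.08, 0.05), (0.90, 0.90)]
phi = phi_pairwise(config, a1=0.0, a2=a2)
print(phi, np.exp(-phi))   # one close pair: phi = 2.0, g_3 = exp(-2.0)
```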

Some particular examples, with s = 2, assume that particles interact only if they are within some critical distance r of each other. For these processes a_2(u) = 0 for |u| > r. In particular we might take a_2(u) to be constant for |u| < r. An extreme case is the 'hard-core' model in which a_2(u) is infinite-valued for |u| < r, corresponding to a Poisson process in which only realizations with no two points within a distance r of each other are acceptable.
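
Because the hard-core model is simply the Poisson process conditioned on the event that no two points are closer than r, it can be sampled in a bounded window by brute-force rejection, as in the following sketch (mine, not from the paper; the window, rate and r are arbitrary, and the method becomes hopelessly inefficient when rejections are frequent).

```python
import numpy as np

def min_pairwise_distance(pts):
    """Smallest inter-point distance (infinite for fewer than two points)."""
    if len(pts) < 2:
        return np.inf
    diffs = pts[:, None, :] - pts[None, :, :]
    d = np.sqrt((diffs ** 2).sum(-1))
    return d[np.triu_indices(len(pts), k=1)].min()

def hard_core_rejection(rate, width, height, r, rng=None, max_tries=10000):
    """Repeatedly simulate a homogeneous Poisson process on the rectangle
    and accept the first realization with no two points within distance r."""
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(max_tries):
        n = rng.poisson(rate * width * height)
        pts = rng.uniform((0.0, 0.0), (width, height), size=(n, 2))
        if min_pairwise_distance(pts) > r:
            return pts
    raise RuntimeError("no acceptable realization found; reduce the rate or r")

pts = hard_core_rejection(rate=20.0, width=1.0, height=1.0, r=0.05)
print(len(pts))
```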

To return to the general case with μ(ξ_1, …, ξ_n) given in (2.13), suppose that there is a neighbour structure on Ω. That is, suppose that there is a relation on Ω specifying whether or not any two elements of Ω are neighbours; as before, each element is defined to be its own neighbour. For example, we might take two elements of Ω to be neighbours if they are less than some fixed distance r apart. Then a point process on Ω is a Markov random field if the conditional probability density of obtaining any particular configuration ξ_1, …, ξ_n of points on a subset A of Ω, given the configuration on Ω − A, is the same as the conditional probability of ξ_1, …, ξ_n given the configuration on the boundary ∂A of A, consisting of elements in Ω − A which are neighbours of elements in A. It has been shown (Ripley & Kelly, 1977) that if a point process has a distribution μ given by (2.13), such that

$$\mu(\xi_1, \ldots, \xi_n) > 0 \;\Rightarrow\; \mu(\xi_{i_1}, \ldots, \xi_{i_k}) > 0$$

for all subsets i_1, …, i_k of 1, …, n, k = 1, …, n and n = 1, 2, …, then the process is a Markov random field if and only if the functions g_n can be written in the form

$$g_n(\xi_1, \ldots, \xi_n) = \prod f(\xi_{i_1}, \ldots, \xi_{i_k}), \qquad (2.17)$$

where the product in (2.17) is over all subsets i_1, …, i_k of 1, …, n for k = 1, …, n, and the function f is nonnegative and takes the value 1 unless ξ_{i_1}, …, ξ_{i_k} form a clique. Thus, if μ is a Markov random field, the functions φ_n defined in (2.15) satisfy

$$\varphi_n(\xi_1, \ldots, \xi_n) = -\sum \log f(\xi_{i_1}, \ldots, \xi_{i_k}), \qquad (2.18)$$

where the sum in (2.18) is over only those subsets of ξ_1, …, ξ_n which form cliques (compare this with the definition of a Gibbs state with a nearest neighbour potential when Ω is a finite set of sites). If the process is stationary then (2.18) has the form (2.16), if a_k(ξ_{i_2} − ξ_{i_1}, …, ξ_{i_k} − ξ_{i_1}) is nonzero only when ξ_{i_1}, …, ξ_{i_k} form a clique.

Note that Ripley & Kelly (1977) assume the point process to be absolutely continuous with respect to a Poisson process with an arbitrary mean measure ρ(·), and multiple occurrences are possible.

It is worth emphasizing that the neighbour relationship is defined on the set Ω, so that, for example, we cannot define the neighbours of a particular process point to be the three nearest process points. In one dimension, a renewal process is Markov, in the sense that the conditional distribution of the process for t > τ given the history of the process up until τ is the same as the conditional distribution given only the backward recurrence time from τ to the last point of the process occurring at or before τ. However, the renewal process is not, in general, Markov in the sense that we have been considering. If the intervals of the renewal process are bounded, almost surely, by some constant a say, then we can define two points to be neighbours if they are not more than a apart. Then if A = [d_1, d_2], ∂A = [d_1 − a, d_1) ∪ (d_2, d_2 + a] and, given the configuration of points on ∂A, the process on A is conditionally independent of that outside A ∪ ∂A. If the intervals of the renewal process are not bounded then no such property holds.

So far, in this section, Ω has been assumed to be a bounded subset of ℝ^d. In practice, this is likely to be a reasonable assumption since we can probably embed the region of interest, Λ say, in a much larger, but bounded, region Ω of ℝ^d, and then look at the behaviour of the process within Λ, integrating out the process on Ω − Λ. However, if we now assume that Ω is unbounded, we can consider specifying the process in terms of a set of functions, where μ_Λ(ξ_1, …, ξ_n; Γ) is to represent the conditional probability density that, within a bounded region Λ, there are n process points located at ξ_1, …, ξ_n, given the configuration Γ of points on Ω − Λ. Then, as in § 2.1, we want to know what conditions the μ_Λ(ξ_1, …, ξ_n; Γ) must satisfy to ensure the existence of a point process with these functions as its conditional probabilities, and under what circumstances the process is unique.

Clearly the μ_Λ(ξ_1, …, ξ_n; Γ) must satisfy the analogues of the conditions on μ_Λ(·; Γ), described in § 2.1, which are sufficient to ensure the existence of the process when Ω is countable. These conditions are, essentially, that

(a) the μ_Λ(ξ_1, …, ξ_n; Γ) and μ_Λ(∅; Γ) should be nonnegative;
(b) the distribution of N(Λ), obtained by integrating out the ξ_i's, should be normalized to one;
(c) the functions should combine together as conditional probabilities, so that, in a fairly obvious notation, if Λ ⊆ Λ′ then

$$\mu_{\Lambda'}(A \cup A'; \Gamma) = \mu_\Lambda(A; A' \cup \Gamma) \int dA''\, \mu_{\Lambda'}(A'' \cup A'; \Gamma),$$

where A, A″, A′ and Γ are intended to represent configurations of points on Λ, Λ, Λ′ − Λ and Ω − Λ′ respectively.

As before, we shall assume that μ_Λ has a density with respect to the distribution of a Poisson process, i.e. that

$$\mu_\Lambda(\xi_1, \ldots, \xi_n; \Gamma) = g_{n,\Lambda}(\xi_1, \ldots, \xi_n; \Gamma)\, \rho^n e^{-\rho|\Lambda|}/n! \quad (n = 1, 2, \ldots), \qquad \mu_\Lambda(\emptyset; \Gamma) = g_{0,\Lambda} e^{-\rho|\Lambda|}. \qquad (2.19)$$

Then the functions g_{n,Λ} must satisfy an equivalent set of conditions. As usual, we can consider writing the functions μ_Λ or g_{n,Λ} in terms of a potential. For example, there exists a function ψ, mapping finite configurations of points in Ω onto (−∞, ∞], which is such that

$$g_{n,\Lambda}(\xi_1, \ldots, \xi_n; \Gamma) = \frac{\exp\{-\psi(\xi_1, \ldots, \xi_n, \Gamma)\}}{\displaystyle\sum_{s=0}^{\infty} \frac{\rho^s e^{-\rho|\Lambda|}}{s!} \int_\Lambda d\zeta_1 \cdots \int_\Lambda d\zeta_s \exp\{-\psi(\zeta_1, \ldots, \zeta_s, \Gamma)\}}, \qquad (2.20)$$

with ψ(∅) = 0. Conversely, starting with the function ψ, we can try to use (2.20) to define a set of functions g_{n,Λ} corresponding to a set of conditional probabilities. The conditions which must be imposed on ψ for this to be possible are discussed by Preston (1976a, Chapter 6). For example, we have to ensure the existence of the integral in (2.20).

Note that Preston assumes that μ_Λ(·; Γ) is absolutely continuous with respect to the distribution of an arbitrary Poisson process with mean measure ρ(·). Also, in specifying a set of suitable functions μ_Λ which can be taken as conditional probabilities, he allows certain configurations to be classed as 'impossible'. Then, if Γ is such a configuration, μ_Λ(·; Γ) is defined to be identically zero and consideration is restricted to distributions for processes which do assign zero probability to these 'impossible' configurations. For example, this allows us to consider 'hard-core' models.

3 Spatial-temporal processes

3.1 Processes with a countable set of sites

The processes described in § 2 were purely spatial and in this section we discuss the temporal evolution of such processes. Two basic forms of evolution will be considered. In the first, points are created and annihilated at rates governed by the positions of existing points, and each point remains in a fixed position throughout its lifetime. In the second, there is no birth or death, so that the number of points (though possibly infinite) is fixed, but each point is allowed to move and its movements depend on the instantaneous positions of the other points. Clearly these two forms of evolution can be combined to give a process with birth, death and movement of points.

As before it is simplest to start by describing the situation when there is only a finite set of sites, Ω, with some specified neighbour structure, and again multiple occupancy of the sites is excluded. A birth and death process on Ω is a Markov process with birth and death rates β(ξ; A) and δ(ξ; A) respectively. Thus, given that, at time t, the points on Ω are at exactly the sites of A, the conditional probability that, at time t + τ, the configuration is A ∪ ξ, is given by

$$\mathrm{pr}\{\text{configuration } A \cup \xi \text{ at } t + \tau \mid \text{configuration } A \text{ at } t\} = \beta(\xi; A)\tau + o(\tau),$$

for ξ ∈ Ω − A. Similarly,

$$\mathrm{pr}\{\text{configuration } A \text{ at } t + \tau \mid \text{configuration } A \cup \xi \text{ at } t\} = \delta(\xi; A)\tau + o(\tau),$$

where ξ ∈ Ω − A. If we assume that β(ξ; A) > 0 and δ(ξ; A) > 0, for all ξ ∈ Ω − A, A ∈ S(Ω), then it is well known from the theory of Markov chains (see, for example, Doob, 1953, Chapter 5) that a unique equilibrium distribution, μ, for the evolution does exist; that is, μ is an invariant distribution for the evolution, and the distribution of the process at time t converges to μ as t → ∞, irrespective of the initial conditions. Also, μ is positive, that is, μ(A) > 0 for all A ∈ S(Ω).

Now suppose that the birth and death rates at ξ only involve A through the points at sites in A which are neighbours of ξ, that is,

$$\beta(\xi; A) = \beta(\xi; A \cap \partial\xi), \qquad \delta(\xi; A) = \delta(\xi; A \cap \partial\xi).$$

Then one would expect this property to be reflected in the equilibrium distribution. To get a nice result we need to impose the extra restriction on the evolution that it be time-reversible. This means that transitions from A to A ∪ ξ must occur at the same rate as those from A ∪ ξ to A, that is, μ must satisfy

$$\mu(A)\beta(\xi; A) = \mu(A \cup \xi)\delta(\xi; A), \qquad (3.1)$$

for all ξ ∈ Ω − A and all A ∈ S(Ω). Then it follows immediately that μ is a Markov random field, and hence, from § 2.1, μ is a Gibbs state with a nearest neighbour potential. Combining (2.2) and (3.1), we see that this potential, ψ say, satisfies

$$\exp\{\psi(A) - \psi(A \cup \xi)\} = \frac{\beta(\xi; A)}{\delta(\xi; A)}.$$

Conversely, if μ is any Markov random field on Ω, then we can define a birth and death process on Ω by taking

$$\beta(\xi; A) = \mu(A \cup \xi)/\mu(A), \qquad \delta(\xi; A) = 1.$$

The positivity of these rates is ensured by the definition of a Markov random field (see (2.6)) and, since (3.1) is satisfied, this process is a time-reversible nearest neighbour birth and death process with the Markov random field μ as its equilibrium state. Thus the class of Markov random fields is the same as the class of equilibrium distributions for time-reversible nearest neighbour birth and death evolutions.
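
This converse construction is easy to simulate. The sketch below (my illustration, not from the paper; the lattice size, parameters and run length are arbitrary) runs a continuous-time nearest neighbour birth and death process on a 5 × 5 lattice with death rate 1 and birth rate β(ξ; A) = exp{ψ(A) − ψ(A ∪ ξ)}, using the homogeneous potential of § 2.1 with β_1 = β_2; by (3.1) its equilibrium is the corresponding Gibbs state, so the state at a large time is approximately a draw from that Markov random field.

```python
import random
from math import exp

def psi(A, alpha, beta):
    """Homogeneous nearest neighbour potential of a configuration A,
    a set of occupied lattice vertices (i, j), with beta1 = beta2 = beta."""
    pairs = sum(1 for (i, j) in A if (i + 1, j) in A) \
          + sum(1 for (i, j) in A if (i, j + 1) in A)
    return alpha * len(A) + beta * pairs

def simulate_birth_death(sites, alpha, beta, t_end, rng):
    """Nearest neighbour birth and death process with delta = 1 and
    birth rate exp{psi(A) - psi(A + xi)}, so that detailed balance (3.1)
    holds with respect to mu(A) proportional to exp{-psi(A)}."""
    A, t = set(), 0.0
    while True:
        rates = {xi: 1.0 if xi in A
                 else exp(psi(A, alpha, beta) - psi(A | {xi}, alpha, beta))
                 for xi in sites}
        total = sum(rates.values())
        t += rng.expovariate(total)                 # exponential holding time
        if t > t_end:
            return frozenset(A)
        u, acc = rng.random() * total, 0.0
        for xi, rate in rates.items():
            acc += rate
            if u <= acc:
                A = A - {xi} if xi in A else A | {xi}
                break

rng = random.Random(1)
sites = [(i, j) for i in range(5) for j in range(5)]
A = simulate_birth_death(sites, alpha=0.5, beta=-0.4, t_end=200.0, rng=rng)
print(len(A))   # approximately a draw from the Gibbs state on the 5 x 5 lattice
```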

Alternatively, consider a Markov process consisting of a fixed number, n, with n < |Ω|, of point particles moving between the sites of Ω as follows. Suppose that the conditional probability that a point at ξ moves during the time interval (t, t + τ], given that at time t the configuration is A ∪ ξ, is s(ξ; A)τ + o(τ). The function s, called the speed function, is defined for all ξ ∈ Ω and A ∈ S(Ω − ξ), and is not restricted to those A which consist of exactly n − 1 sites. Given that a point at ξ is to move, let P(ξ, η) be the probability of a transition to site η. We assume that the probability of more than one transition in (t, t + τ] is o(τ). Since multiple occupancy of sites is excluded, any transition which would result in two points occupying the same site is suppressed.

Suppose that we make the following assumptions about the functions s and P:

(i) that P(ξ, η) is an irreducible, symmetric transition function with P(ξ, ξ) = 0;
(ii) that s(ξ; A) = s(ξ; A ∩ ∂ξ) > 0 for all ξ ∈ Ω − A, A ∈ S(Ω);
(iii) that, for all ξ, η ∈ Ω − A with ξ ≠ η, A ∈ S(Ω),

$$s(\xi; A \cup \eta)s(\eta; A) = s(\eta; A \cup \xi)s(\xi; A). \qquad (3.2)$$

Then we can define a potential ψ on the subsets of Ω by

$$\psi(\emptyset) = 0, \qquad \psi(A \cup \xi) - \psi(A) = \log s(\xi; A). \qquad (3.3)$$

The potential ψ is obtained from (3.3) by iteration and the assumption (iii) above ensures that ψ is well defined. Since the speed at ξ for given A depends, by assumption (ii), only on the sites which are neighbours of ξ, it follows that ψ is a nearest neighbour potential. If we now consider a process of n points whose movements are governed by these functions P and s, then it follows that the process is time-reversible and has an equilibrium distribution of the form

$$\mu_n(A) = C_n \exp\{-\psi(A)\}. \qquad (3.4)$$

The constant C_n is determined by summing (3.4) over all subsets, A, of Ω which have exactly n elements, so that (3.4) represents the restriction to such subsets of the Gibbs state defined in (2.2). Details of these results can be found in the work of Spitzer (1971b), when Ω is a finite subset of a rectangular lattice, and of Preston (1973) in the more general case.

When Ω is countable, predictably things become much more complicated. A thorough discussion of this case is given by Liggett (1977), who considers both the birth and death evolution and a more general form of the evolution, described above, of a fixed number of moving points. In this the transition function P(ξ, η) is replaced by a function P(ξ, η; A) so that the probability of a transition to η from ξ depends on the configuration, A, of the remaining points. As before multiple occupancy of the sites is excluded. More recent results for some particular evolutions on a countable set of sites, for which multiple occupancy is allowed, are discussed by Spitzer (1979) and Liggett & Spitzer (1979).

When Ω is not finite, the first problem is to investigate the conditions which must be imposed on the functions used to describe the evolution, in order that a unique process does exist which evolves in the specified manner. Then one can look at the further conditions which are needed so that a unique invariant distribution, μ, exists and, finally, at whether the distribution of the process converges to μ as t → ∞, regardless of the initial conditions. As when Ω is finite, a Gibbs state with a nearest neighbour potential will be the invariant distribution for the evolution if the functions specifying the evolution depend appropriately upon the corresponding potential function.

In the book by Griffeath (1979), percolation theory forms the basis for constructions for systems of particles evolving on an infinite lattice, and graphical representations of the systems are used throughout. In particular, with these models, there is no problem of existence.

3.2 Processes in ℝ^d

In this section, we assume that the process of interest is evolving in a noncountable space Ω, and specifically that Ω = ℝ^d, so that at any time we have a genuine spatial process. Consider first a birth and death evolution. It will simplify matters if we assume either that the process is restricted to a bounded region of Ω or that it is finite, that is, that the number of points in existence at any particular time t is finite for all t. We shall make this latter assumption in the following discussion. Then we aim to specify the process by birth and death rates as follows. Given that at time t there are n points located at ξ_1, …, ξ_n, the conditional probability density of a birth at ξ in the time interval (t, t + τ] is β(ξ; ξ_1, …, ξ_n)τ + o(τ). Similarly, given that at time t there are n + 1 points located at ξ, ξ_1, …, ξ_n, the conditional probability of a death at ξ in the time interval (t, t + τ] is δ(ξ; ξ_1, …, ξ_n)τ + o(τ). Just as when Ω is countable, the first problem is to investigate what conditions on β and δ will guarantee the existence of a unique process evolving in the way described. Then the question arises of whether the process approaches an equilibrium state as the evolution progresses. Preston (1976b) gives sufficient conditions on β and δ for the existence of the process and for the existence of the equilibrium distribution, by relating the spatial process to a simple (nonspatial) birth and death process.

It is also important to see how an invariant distribution for the process is connected to the birth and death rates β and δ. As when Ω is finite, it is convenient to consider time-reversible processes. Then, if μ is an invariant distribution it must satisfy

$$\mu(\xi_1, \ldots, \xi_n)\beta(\xi; \xi_1, \ldots, \xi_n) = \mu(\xi, \xi_1, \ldots, \xi_n)\delta(\xi; \xi_1, \ldots, \xi_n), \qquad (3.5)$$

which corresponds to (3.1) in the finite Ω case. We can use (3.5) to generate the distribution μ, starting from

$$\mu(\emptyset)\beta(\xi; \emptyset) = \mu(\xi)\delta(\xi; \emptyset),$$

with an obvious extension of the notation, and adding points one at a time. This procedure will lead to a well-defined process if we obtain the same distribution whether we add ξ first and then η or vice versa. In particular, therefore, writing y(ξ; ξ_1, …, ξ_n) = β(ξ; ξ_1, …, ξ_n)/δ(ξ; ξ_1, …, ξ_n), we need β and δ to be such that the function y satisfies

$$y(\xi; \xi_1, \ldots, \xi_n)\, y(\eta; \xi, \xi_1, \ldots, \xi_n) = y(\eta; \xi_1, \ldots, \xi_n)\, y(\xi; \eta, \xi_1, \ldots, \xi_n); \qquad (3.6)$$

compare this with the condition (3.2) imposed on the function s(ξ; A), for a similar purpose, in § 3.1.

Thus, suppose that we have a process specified by birth and death functions β and δ, with δ(ξ; ξ_1, …, ξ_n) > 0 for all ξ, ξ_1, …, ξ_n, and such that (3.6) is satisfied. If the process is time-reversible and has an equilibrium distribution, then this distribution can be constructed using (3.5). Clearly, since the invariant distribution depends only on β and δ through their ratio y, there will be a range of processes having the same equilibrium distribution.

For a particular function y such that y(ξ; ξ_1, …, ξ_n) > 0 for all ξ, ξ_1, …, ξ_n, we can construct a potential function using

$$\psi(\xi, \xi_1, \ldots, \xi_n) - \psi(\xi_1, \ldots, \xi_n) = -\log y(\xi; \xi_1, \ldots, \xi_n), \qquad \psi(\emptyset) = 0. \qquad (3.7)$$

Then, from (3.5), it follows that the equilibrium distribution is a Gibbs state given by

$$\mu(\xi_1, \ldots, \xi_n) = C \exp\{-\psi(\xi_1, \ldots, \xi_n)\}. \qquad (3.8)$$

Thus if a neighbour structure is defined on Ω, and if the birth and death rates at ξ depend only on the positions of those points which are neighbours of ξ, the corresponding potential ψ given in (3.7) will be a nearest neighbour potential and the equilibrium distribution (3.8) will be a Markov random field.
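
As an illustration of such an evolution (my sketch, not from the paper; the window, interaction radius and parameters are arbitrary), the following Python code simulates a spatial birth and death process in a bounded window with death rate 1 per point and birth rate density exp{−a_1 − a_2 · (number of existing points within distance r)}, a_2 ≥ 0; the ratio y then depends only on neighbouring points, so by (3.7)-(3.8) the equilibrium is the corresponding pairwise Markov random field (a Strauss-type process).

```python
import numpy as np

def n_close(pts, xi, r):
    """Number of existing points within distance r of location xi."""
    if len(pts) == 0:
        return 0
    return int((np.linalg.norm(pts - xi, axis=1) < r).sum())

def spatial_birth_death(width, height, r, a1, a2, t_end, seed=0):
    """Birth rate density b(xi; A) = exp{-a1 - a2 * n_close(A, xi, r)},
    death rate 1 per point; births are generated by thinning a dominating
    uniform proposal of total rate exp(-a1) * area (valid for a2 >= 0).
    Returns the configuration at time t_end."""
    rng = np.random.default_rng(seed)
    pts = np.empty((0, 2))
    t = 0.0
    dominating = np.exp(-a1) * width * height
    while True:
        death_total = float(len(pts))
        t += rng.exponential(1.0 / (dominating + death_total))
        if t > t_end:
            return pts
        if rng.random() < dominating / (dominating + death_total):
            xi = rng.uniform((0.0, 0.0), (width, height))
            if rng.random() < np.exp(-a2 * n_close(pts, xi, r)):   # thinning step
                pts = np.vstack([pts, xi])
        else:
            pts = np.delete(pts, rng.integers(len(pts)), axis=0)

pts = spatial_birth_death(width=1.0, height=1.0, r=0.05, a1=-3.0, a2=1.0, t_end=50.0)
print(len(pts))   # a configuration with short-range inhibition
```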

The technicalities in the work described have been greatly simplified by our assumption that the process is finite. No such restriction is made, for example, by Durrett (1979) who discusses the behaviour of an infinite system of points evolving, in ℝ^d, as a spatial birth and death process in which the death rate, δ, is constant and the birth rate β is additive; that is,

$$\beta(\xi; \xi_1, \xi_2, \ldots, \xi_n) = \sum_{i=1}^{n} \beta(\xi; \xi_i), \qquad \text{where } \int \beta(\xi; \eta)\, d\xi = \beta^* < \infty.$$

This last condition means simply that each point, η, contributes the same amount, β*, to the total birth rate. In particular, Durrett considers the limiting distributions of counts in bounded sets. Holley & Stroock (1978a) also discuss limit theorems for the distributions of counts for some particular birth and death evolutions of infinite systems of points, though on a rectangular lattice. In the processes they consider, the birth and death rates depend either on the process only at neighbouring lattice points or (in one dimension) on the distances to the nearest points in either direction.

Birth and death processes in ℝ¹, for which the birth and death rates at ξ depend on the process only through the locations of the nearest points to ξ on either side, were introduced by Spitzer (1977) and discussed further by Holley & Stroock (1978b). It is important to note that such birth and death processes are essentially different from the nearest neighbour birth and death processes mentioned earlier, for there the neighbour structure was defined in terms of the space Ω, while now it is concerned with the process points. Suppose that we consider an evolution on ℝ¹, for which the conditional probability density of a birth at ξ in the interval (t, t + τ], given the entire process at time t, is β(l, r)τ + o(τ), where l and r are the distances from ξ to the nearest points of the process on either side at time t. Similarly suppose that the conditional probability of a death in (t, t + τ], given the process at t, is δ(l, r)τ + o(τ). We assume that the functions β and δ are positive and bounded.

If we assume the existence of a time-reversible invariant distribution μ for the evolution, then this imposes restrictions on β and δ, or, more precisely, on y(l, r) = β(l, r)/δ(l, r). In particular, y must satisfy

y(x, y) y(x + y, z) = y(y, z) y(x, y + z);   (3.9)

compare this with (3.6) for the finite birth and death process. Then there exists a positive, measurable function f such that

y(x, y) = f(x)f(y)/f(x + y);   (3.10)

this follows from (3.9) on the assumption of mild regularity conditions on y. On the other hand, if we have a birth and death evolution for which the function y satisfies (3.10) in terms of a probability density function f on (0, ∞) with finite mean, then an equilibrium state does exist for the evolution and is a renewal process with interval density f. It is a simple matter to check intuitively that the analogue of the equation (3.1), defining time reversibility, holds in this case. Details of these results are given by Holley & Stroock (1978b).
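As a small illustrative check (not from Holley & Stroock), the functional equation (3.9) can be verified numerically for a ratio y built from a particular interval density via (3.10); here f is taken, purely for illustration, to be a gamma density.

```python
import math

# Illustrative check: take f to be a gamma(2, 1) density and define y via
# (3.10); then y satisfies the functional equation (3.9), and the renewal
# process with interval density f is a time-reversible equilibrium for the
# corresponding nearest-point birth and death evolution on the line.

def f(u, shape=2.0, rate=1.0):
    """Gamma probability density on (0, infinity)."""
    return rate ** shape * u ** (shape - 1) * math.exp(-rate * u) / math.gamma(shape)

def y(a, b):
    return f(a) * f(b) / f(a + b)          # equation (3.10)

x, v, z = 0.7, 1.3, 2.1
lhs = y(x, v) * y(x + v, z)                # left side of (3.9)
rhs = y(v, z) * y(x, v + z)                # right side of (3.9)
print(abs(lhs - rhs) < 1e-12)              # True
```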

Spitzer (1977) also suggests a similar nearest-point evolution for a fixed set of n particles moving in ℝ¹, although for mathematical simplicity he assumes that Ω is not ℝ¹ but a circle of radius R. Suppose that the conditional probability that a point at ξ moves in the time interval (t, t + τ], given the configuration of points at time t, is s(l, r)τ + o(τ), where l and r are the distances to the nearest points on either side of ξ at time t. Let the distance moved by the point have some symmetric distribution which does not depend on ξ, on the configuration of points at t, or on t itself. If s(l, r) satisfies

s(l, r) = f(l + r)/{f(l)f(r)},   (3.11)


for some positive probability density f with a finite mean, then the renewal process with interval density f is a time-reversible invariant state with respect to the evolution. The definition used for a renewal process on a circle is such that if the radius, R, and the number of particles, n, tend to infinity appropriately, the usual renewal process is obtained. Spitzer conjectures that the result applies to renewal processes on ℝ¹.

Returning to processes in ℝᵈ, suppose we consider points moving continuously in some sort of Brownian motion, rather than remaining in a fixed place during some interval and then jumping to another location. In particular, suppose that there are n points each performing a Brownian motion such that, if at time t the points are at ξ₁, …, ξₙ, then the expected movement of the point at ξᵢ in the interval (t, t + τ] is a(ξᵢ; ξ₁, …, ξₙ)τ + o(τ), for some suitably smoothly varying vector drift function a. We assume that the components of the displacement are independent, each with variance σ²τ + o(τ) (after a preliminary transformation of the spatial coordinates if necessary).

Consider a discrete approximation to the problem. If a time-reversible equilibrium distribution μ exists for the evolution, then transitions from (ξ₁, …, ξₙ) to (ξ₁ + δ₁, …, ξₙ + δₙ) must balance those transitions occurring in the opposite direction. Thus, μ must satisfy

μ(ξ₁, …, ξₙ) exp{−(1/(2σ²τ)) Σᵢ |δᵢ − τ a(ξᵢ; ξ₁, …, ξₙ)|²}
    = μ(ξ₁ + δ₁, …, ξₙ + δₙ) exp{−(1/(2σ²τ)) Σᵢ |δᵢ + τ a(ξᵢ + δᵢ; ξ₁ + δ₁, …, ξₙ + δₙ)|²},

where, for a vector δ, |δ|² = δᵀδ. If |δᵢ| is small for each i = 1, …, n and the function a is sufficiently smooth, then

log{μ(ξ₁ + δ₁, …, ξₙ + δₙ)/μ(ξ₁, …, ξₙ)} ≈ (2/σ²) Σᵢ δᵢᵀ a(ξᵢ; ξ₁, …, ξₙ).   (3.12)

If μ is positive, we can write it in the form

μ(ξ₁, …, ξₙ) = Cₙ exp{−ψ(ξ₁, …, ξₙ)},   (3.13)

for some constant Cₙ and real-valued function ψ. It follows from (3.12) that

grad_{ξᵢ} ψ(ξ₁, …, ξₙ) ∝ a(ξᵢ; ξ₁, …, ξₙ),   i = 1, …, n.   (3.14)

Note here that n is fixed, so ψ is only defined on n-tuples of points in ℝᵈ. Thus (3.13) and (3.14) say that, if it exists, the time-reversible equilibrium distribution, μ, is the restriction to n-tuples of a Gibbs state for which the gradient of the potential of an n-tuple is proportional to the drift function of the motion. This result was obtained by Kolmogorov (1937) for the motion of a finite number of particles on a torus and has been extended by various authors. For example, Lang (1977) considers infinite sets of interacting particles diffusing in ℝᵈ, while a clear expository discussion of time-reversible diffusions on manifolds is given by Kent (1978). Note that a single point particle diffusing in a region of ℝⁿᵈ is equivalent to n particles each diffusing in a subset of ℝᵈ. Earlier papers on this subject are by Nelson (1958), Nagasawa (1961) and Nagasawa & Sato (1962).
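As a rough illustration of (3.13)-(3.14) (a sketch constructed here, not part of the paper), the following Euler-Maruyama simulation uses a drift proportional to minus the gradient of a hypothetical pairwise potential ψ; the time-reversible equilibrium of the resulting diffusion is then proportional to exp(−ψ).

```python
import math, random

# Illustrative sketch: n interacting particles on the line, each with drift
# a(x_i; x_1,...,x_n) = -(sigma**2/2) * d(psi)/d(x_i) for the hypothetical
# potential psi(x) = sum_j x_j**2/2 + (kappa/2) * sum_{j<k} (x_j - x_k)**2,
# simulated with an Euler-Maruyama scheme.  With this drift the reversible
# equilibrium is the Gibbs state proportional to exp(-psi).

def dpsi(i, xs, kappa=0.5):
    """Partial derivative of psi with respect to x_i."""
    return xs[i] + kappa * sum(xs[i] - xj for j, xj in enumerate(xs) if j != i)

def simulate(n=5, sigma=1.0, dt=1e-3, steps=20000, seed=0):
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(steps):
        xs = [x - 0.5 * sigma ** 2 * dpsi(i, xs) * dt          # drift step
              + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)    # Brownian increment
              for i, x in enumerate(xs)]
    return xs

print(simulate())
```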

4 Examples and applications

4.1 Processes with a countable set of sites

Suppose that the set of sites Ω is a finite portion of a rectangular lattice and that any two distinct sites in Ω are neighbours if they are adjacent lattice vertices. Thus, any site has at most


four neighbours apart from itself (each site being defined to be its own neighbour). Then, from (2.5), a nearest-neighbour potential ψ satisfies

ψ(A) = Σ U(ξᵢⱼ) + Σ U(ξᵢⱼ, ξᵢ'ⱼ'),   (4.1)

for any A ∈ S(Ω), where the sums are over the single sites and the pairs of distinct sites in A, and U(ξᵢⱼ, ξᵢ'ⱼ') is nonzero only if ξᵢⱼ and ξᵢ'ⱼ' are adjacent sites. It is convenient to change notation here and let the row vector x = (x₁, …, x_{|Ω|}) denote the configuration of points, where the sites are numbered arbitrarily and xᵢ = 1 if the ith site is occupied and xᵢ = 0 otherwise. Then the potential (4.1) can be rewritten in the form

ψ(x) = Σ_{i=1}^{|Ω|} αᵢ xᵢ + Σ_{1≤i<j≤|Ω|} βᵢⱼ xᵢ xⱼ,   (4.2)

where βᵢⱼ is zero unless the ith and jth sites are neighbours. If μ is a probability distribution on the set of all configurations, with a potential ψ given by (4.2), then by definition μ(x) ∝ exp{−ψ(x)}. It follows from this that the conditional probability that there is a point at the ith site, given the configuration x⁽ⁱ⁾, say, at the remaining sites, is

pr(xᵢ = 1 | x⁽ⁱ⁾) = exp(−αᵢ − Σ_{j≠i} βᵢⱼ xⱼ) / {1 + exp(−αᵢ − Σ_{j≠i} βᵢⱼ xⱼ)}.   (4.3)

Equivalently,

pr(xᵢ = 1 | x⁽ⁱ⁾) / pr(xᵢ = 0 | x⁽ⁱ⁾) = exp(−αᵢ − Σ_{j≠i} βᵢⱼ xⱼ),   (4.4)

where both (4.3) and (4.4) involve x⁽ⁱ⁾ only through the variables at sites adjacent to the ith site. This model is termed the autologistic model by Besag (1972).

Usually, it is convenient to regard the process as applying to a larger set of lattice points which contains Ω and all sites which are neighbours of at least one site in Ω. Then we consider the process on Ω conditionally upon the configuration of points on these boundary sites. In this case, the second sum in (4.2) is over all pairs of adjacent sites which have at least one site in Ω, while for a homogeneous process (4.4) reduces to

pr(xᵢ = 1 | x⁽ⁱ⁾) / pr(xᵢ = 0 | x⁽ⁱ⁾) = exp(−α − β₁ z₁⁽ⁱ⁾ − β₂ z₂⁽ⁱ⁾),   (4.5)

where z₁⁽ⁱ⁾ = x₁⁽ⁱ⁾ + x₃⁽ⁱ⁾, z₂⁽ⁱ⁾ = x₂⁽ⁱ⁾ + x₄⁽ⁱ⁾ and xⱼ⁽ⁱ⁾ (j = 1, …, 4) are the values of the variables at the four sites which neighbour site i, as shown:

           x₂⁽ⁱ⁾
  x₁⁽ⁱ⁾     xᵢ     x₃⁽ⁱ⁾
           x₄⁽ⁱ⁾
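The conditional odds (4.5) translate directly into a single-site simulation scheme. The sketch below (illustrative only; the zero boundary values and all parameter values are assumptions made here) evaluates the conditional probability of occupation at a site and applies it in repeated single-site updates of a finite lattice.

```python
import math, random

# Minimal sketch of the homogeneous autologistic model on an m x m lattice
# with fixed zero boundary, using the conditional odds (4.5):
#   pr(x_i = 1 | rest) / pr(x_i = 0 | rest)
#     = exp(-alpha - beta1*(left + right) - beta2*(up + down)).
# Repeated single-site updates with these conditionals (a Gibbs sampler)
# leave the Markov random field invariant.

def conditional_p1(x, i, j, alpha, beta1, beta2):
    m = len(x)
    left  = x[i][j-1] if j > 0     else 0
    right = x[i][j+1] if j < m - 1 else 0
    up    = x[i-1][j] if i > 0     else 0
    down  = x[i+1][j] if i < m - 1 else 0
    odds = math.exp(-alpha - beta1 * (left + right) - beta2 * (up + down))
    return odds / (1.0 + odds)

def gibbs_sample(m=20, alpha=0.0, beta1=0.5, beta2=0.5, sweeps=500, seed=2):
    rng = random.Random(seed)
    x = [[rng.randint(0, 1) for _ in range(m)] for _ in range(m)]
    for _ in range(sweeps):
        for i in range(m):
            for j in range(m):
                x[i][j] = 1 if rng.random() < conditional_p1(x, i, j, alpha, beta1, beta2) else 0
    return x

field = gibbs_sample()
print(sum(map(sum, field)))   # number of occupied sites in one realization
```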

A particular spatial-temporal model which generates the Markov random field given by (4.5) as its equilibrium distribution is due to Besag (1972). Let the conditional probability that the value of the variable at site i changes from x to 1 − x in the interval (t, t + τ], given the configuration, x⁽ⁱ⁾(t), on the remaining sites at time t, have the form

pr{xᵢ(t + τ) = 1 − x | xᵢ(t) = x, x⁽ⁱ⁾(t)} = τ exp{α_x + β_{x1} z₁⁽ⁱ⁾(t) + β_{x2} z₂⁽ⁱ⁾(t)} + o(τ),

in an obvious extension of the notation. If we require a time-reversible evolution, then the


equilibrium distribution for the process must satisfy the following balance equation (compare with (3.1)):

pr(xᵢ = x | x⁽ⁱ⁾) exp{α_x + β_{x1} z₁⁽ⁱ⁾ + β_{x2} z₂⁽ⁱ⁾} = pr(xᵢ = 1 − x | x⁽ⁱ⁾) exp{α_{1−x} + β_{1−x,1} z₁⁽ⁱ⁾ + β_{1−x,2} z₂⁽ⁱ⁾}.

This equation has the same form as (4.5) if

α = α₁ − α₀,   β₁ = β_{11} − β_{01},   β₂ = β_{12} − β_{02}.
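A minimal sketch of such a spatial-temporal evolution, written for illustration (it is not Besag's implementation, and the parameter values are hypothetical), is given below: sites are flipped at the rates just described using a simple event-by-event simulation, and with the parameter identification above the long-run behaviour is the autologistic field (4.5).

```python
import math, random

# Hypothetical sketch of the spatial-temporal evolution: each site i with
# current value x flips to 1-x at rate exp{a_x + b_x1*z1(i) + b_x2*z2(i)},
# where z1, z2 count occupied horizontal and vertical neighbours (zero
# boundary assumed).  Events are generated one at a time at the appropriate
# exponential times.

def neighbour_sums(x, i, j):
    m = len(x)
    z1 = (x[i][j-1] if j > 0 else 0) + (x[i][j+1] if j < m - 1 else 0)
    z2 = (x[i-1][j] if i > 0 else 0) + (x[i+1][j] if i < m - 1 else 0)
    return z1, z2

def flip_rate(x, i, j, a, b1, b2):
    z1, z2 = neighbour_sums(x, i, j)
    v = x[i][j]                               # a, b1, b2 indexed by the current value
    return math.exp(a[v] + b1[v] * z1 + b2[v] * z2)

def simulate(m=10, a=(0.0, 0.0), b1=(0.3, -0.3), b2=(0.3, -0.3), t_max=20.0, seed=3):
    rng = random.Random(seed)
    x = [[rng.randint(0, 1) for _ in range(m)] for _ in range(m)]
    t = 0.0
    while t < t_max:
        rates = [(flip_rate(x, i, j, a, b1, b2), i, j) for i in range(m) for j in range(m)]
        total = sum(r for r, _, _ in rates)
        t += rng.expovariate(total)
        u, acc = rng.random() * total, 0.0
        for r, i, j in rates:                 # choose a site with probability proportional to its rate
            acc += r
            if acc >= u:
                x[i][j] = 1 - x[i][j]         # flip the chosen site
                break
    return x

print(sum(map(sum, simulate())))
```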

Another specific example of a three-parameter family of Markov random fields on a rectangular lattice is described by Pickard (1977b, 1978). A process is constructed on the lattice using Markov chains, and is a Markov random field if the neighbours of any site are defined to be the four adjacent lattice vertices plus the four diagonally adjacent vertices. Pickard has verified that the dependence on the diagonally adjacent vertices is unavoidable in this class of models, so that the process cannot reduce to a Markov random field in which each site has only the four nearest vertices as neighbours. The process has some interesting properties. For example, the correlation between site variables decreases geometrically as the distance between the sites increases; more specifically, if Xᵢⱼ represents the binary variable at site (i, j) then

corr(Xᵢⱼ, X_{i+k, j+l}) = ρ^{|k|+|l|}.

Also, importantly, its Markov chain structure makes the process easy to simulate. An unusual application of the theory of Markov random fields on a finite set of sites

exploits a connection between Markov fields and log linear interaction models for contingency tables (Darroch, Lauritzen & Speed, 1980). The idea is to let the set of sites, Ω, correspond to the set of factors with respect to which the objects in the experiment are to be classified, while the values of the site variables represent the factor levels. Thus the case we have been considering, of binary site variables, corresponds to each factor having two levels (presence/absence, high/low, etc.). Each possible realization of the Markov random field represents a particular cell in the contingency table. As earlier in this section, let x denote a particular realization or cell. Let p(x) be the probability that an object belongs to cell x and suppose that all cells have

positive probability. Then log p can be expanded in the form

log p(x) = Σ_A i_A(x),   (4.6)

where the sum in (4.6) is over all subsets A of the set of factors, and i_A(x) depends on x only through those factors which are in A. The functions i_A are the interactions between the factors.

Compare (4.6) with the expansion (2.3) of the potential as a sum of interaction potentials. A general log linear model involves specifying some interactions to vanish and allowing the

rest to be arbitrary and unknown. A hierarchical model is one in which, if any interaction

i_A vanishes, then any higher-order interaction which involves all the factors in A must also vanish. Such a model can be specified, therefore, by giving the set of all allowable interactions.

Now suppose there is a graph on the set of factors (or, equivalently, a relation on the sites

specifying whether or not any two sites are neighbours), and consider a hierarchical model in which an interaction between factors is allowable if these factors form a clique. Then the

set of all possible probabilities p is exactly the same as the set of all nearest-neighbour Gibbs states with this particular neighbour structure on the sites of Ω. It follows from the results of § 2.1 that p is a Markov random field. Thus the conditional independence properties of Markov random fields apply. For example, if two factors have a vanishing interaction, then, given the levels of the remaining factors for a particular object in the experiment, the levels of these two factors are conditionally independent.

Darroch et al. call such hierarchical models graphical and point out that not all hierarchical


models are of this type. As an example, consider a case when there are just three factors A, B and C and suppose all interactions are allowable except the three-factor interaction ABC. This model is hierarchical but not graphical, for, since all three pairwise interactions are allowable, {A, B, C} is a clique.

An important property of the class of graphical models is that it contains the class of decomposable models (Haberman, 1974). In fact, a graphical model is decomposable if the graph is triangulated in that any cycle linking four or more vertices must have a chord.
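The three-factor example can be checked mechanically. The sketch below (an illustration written for this discussion, not from Darroch et al.) lists the subsets of factors that form cliques of the interaction graph when all pairwise interactions are allowable, confirming that {A, B, C} is a clique and hence that a graphical model could not exclude the interaction ABC.

```python
from itertools import combinations

# Illustrative check of the three-factor example: with all pairwise
# interactions AB, AC, BC allowable, {A, B, C} is a clique of the interaction
# graph, so a graphical model would also have to allow the three-factor
# interaction; the model "all interactions except ABC" is therefore
# hierarchical but not graphical.

factors = {"A", "B", "C"}
allowed_pairs = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("B", "C")]}

def is_clique(subset, edges):
    """A set of factors is a clique if every pair in it is joined by an edge."""
    return all(frozenset(p) in edges for p in combinations(subset, 2))

# the interactions a graphical model with these edges must allow: all cliques
graphical_generators = [s for r in range(1, len(factors) + 1)
                        for s in map(frozenset, combinations(sorted(factors), r))
                        if is_clique(s, allowed_pairs)]

print(frozenset(factors) in graphical_generators)  # True: ABC would have to be allowable
```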

4.2 Processes in ℝᵈ

As an example of a Markov random field on a bounded region Ω of d-dimensional Euclidean space, Kelly & Ripley (1976) and also Ripley (1977) discuss the following process, in which the inhibition of points depends only on the number of neighbouring points. Let two elements of Ω be defined to be neighbours if they are less than some fixed distance r apart, and let the density of the process with respect to the distribution of a Poisson process (see (2.13)) be given by

gₙ(ξ₁, …, ξₙ) = a bⁿ c^{n_r(ξ₁, …, ξₙ)},   (4.7)

where n_r(ξ₁, …, ξₙ) is the number of distinct (unordered) pairs (ξᵢ, ξⱼ), 1 ≤ i < j ≤ n, for which ξᵢ and ξⱼ are neighbours. In (4.7) the constant c is restricted to lie in [0, 1], while b is an arbitrary positive constant. This process has the property that the conditional density of a point at ξ, given the configuration of points in Ω − ξ, depends only on the number of neighbours of ξ in this configuration. Note that if c = 0, equation (4.7) defines a hard-core model in which no two points are less than r apart, while if c = 1 the process is a Poisson process.
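To make (4.7) concrete, the following fragment (hypothetical parameter values) counts the neighbour pairs n_r for a small configuration and evaluates the corresponding unnormalized density.

```python
import math
from itertools import combinations

# Illustrative sketch: the unnormalized density (4.7) relative to a Poisson
# process, g_n = a * b**n * c**n_r, where n_r is the number of unordered
# pairs of points closer than r.  Parameter values are hypothetical.

def n_r(points, r):
    """Number of distinct pairs of points that are neighbours (distance < r)."""
    return sum(math.dist(p, q) < r for p, q in combinations(points, 2))

def g(points, b=2.0, c=0.5, r=0.1, a=1.0):
    return a * b ** len(points) * c ** n_r(points, r)

pts = [(0.10, 0.10), (0.15, 0.12), (0.80, 0.40)]   # one close pair, one isolated point
print(n_r(pts, 0.1), g(pts, r=0.1))
```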

A time-reversible spatial birth and death process which generates this process as its equilibrium distribution must satisfy (3.5), and therefore the ratio of the birth rate to the death rate must be given by

β(ξ; ξ₁, …, ξₙ)/δ(ξ; ξ₁, …, ξₙ) = ρ b c^{n_r(ξ; ξ₁, …, ξₙ)},   (4.8)

where n_r(ξ; ξ₁, …, ξₙ) is the number of the ξ₁, …, ξₙ which are neighbours of ξ. Two possible spatial-temporal evolutions are when

β(ξ; ξ₁, …, ξₙ) = ρb,   δ(ξ; ξ₁, …, ξₙ) = c^{−n_r(ξ; ξ₁, …, ξₙ)},

and when

β(ξ; ξ₁, …, ξₙ) = ρb c^{n_r(ξ; ξ₁, …, ξₙ)},   δ(ξ; ξ₁, …, ξₙ) = 1.

Kelly and Ripley also show that if we want a process for which the conditional density of a point at ξ, given the configuration of points in Ω − ξ, depends on the total number of points in Ω as well as on the number of points which are neighbours of ξ, then the gₙ must be of the form

gₙ(ξ₁, …, ξₙ) = f(n) c^{n_r(ξ₁, …, ξₙ)},

for a suitable function f.

4.3 Model fitting

In this paper we have been concerned with the mathematical structure of spatial processes of interacting points. When it comes to fitting such models to data there are severe problems. Even in the simple case of a finite portion of a rectangular lattice and a homogeneous potential function involving only pairwise interactions (compare with (4.2)), the form of the normalizing


constant in the joint probability distribution of the process makes the use of maximum likelihood techniques intractable.

Besag (1974) suggests the use of coding methods, which he illustrates on real data. For a process having a nearest-neighbour potential function ψ of the form (4.2), the values of the variables on alternate sites are regarded as given. Using a coding scheme of the form

x 0 x 0 x 0 x 0 x 0 x 0 x 0 x,

the variables at the sites marked x are conditionally independent, given the values of the variables at the sites marked 0. Thus the conditional likelihood of the x-coded variables is a product of conditional probabilities, pr{xᵢ | xⱼ⁽ⁱ⁾ (j = 1, …, 4)}, and conditional maximum likelihood estimates of the parameters can be obtained. A second (correlated) set of estimates, which can then be combined in some way with the first set, can be derived by reversing the roles of x and 0. Clearly, more complicated interaction schemes will necessitate conditioning on more site variables, but more sets of estimates will be obtainable by translating the coding scheme.
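The coding argument can be expressed compactly in code. The sketch below (an illustration, not Besag's implementation; the zero boundary treatment and parameter values are assumptions) sums the Bernoulli log-likelihood contributions of the sites of one coding colour, conditioning on the other, and maximizes the result over a crude grid for one parameter.

```python
import math, random

# Illustrative sketch of the coding method for the homogeneous autologistic
# model.  Sites with (i + j) even form the "x-coded" set; conditional on the
# remaining ("0-coded") sites, their variables are independent, so the
# conditional log likelihood is a sum of Bernoulli terms whose success
# probability is given by (4.3)/(4.5).  Zero boundary values are assumed.

def coding_loglik(x, alpha, beta1, beta2, parity=0):
    m, ll = len(x), 0.0
    for i in range(m):
        for j in range(m):
            if (i + j) % 2 != parity:
                continue                           # only the coded sites contribute
            z1 = (x[i][j-1] if j > 0 else 0) + (x[i][j+1] if j < m - 1 else 0)
            z2 = (x[i-1][j] if i > 0 else 0) + (x[i+1][j] if i < m - 1 else 0)
            eta = -alpha - beta1 * z1 - beta2 * z2  # log odds that x_ij = 1
            ll += x[i][j] * eta - math.log(1.0 + math.exp(eta))
    return ll

# crude grid search over alpha, with beta1 = beta2 held fixed, on toy data
rng = random.Random(4)
data = [[rng.randint(0, 1) for _ in range(10)] for _ in range(10)]
best = max((coding_loglik(data, a, 0.3, 0.3), a) for a in [k / 10 for k in range(-20, 21)])
print(best)
```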

An alternative method of estimation, suggested by Besag (1975), is to choose parameter values which maximize the pseudolikelihood, i.e. the product over all sites of the conditional likelihood. This method gives estimates which are consistent as the number of sites increases.
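For comparison, the pseudolikelihood differs only in that every site contributes a conditional term; a minimal sketch (again with an assumed zero boundary and toy data) is as follows.

```python
import math

# Minimal sketch of Besag's pseudolikelihood for the same homogeneous
# autologistic model: the product over *all* sites of the conditional
# probability of the observed value given its four neighbours.  The toy data
# and parameter values are illustrative only.

def log_pseudolikelihood(x, alpha, beta1, beta2):
    m, ll = len(x), 0.0
    for i in range(m):
        for j in range(m):
            z1 = (x[i][j-1] if j > 0 else 0) + (x[i][j+1] if j < m - 1 else 0)
            z2 = (x[i-1][j] if i > 0 else 0) + (x[i+1][j] if i < m - 1 else 0)
            eta = -alpha - beta1 * z1 - beta2 * z2
            ll += x[i][j] * eta - math.log(1.0 + math.exp(eta))
    return ll

toy = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(log_pseudolikelihood(toy, 0.0, 0.2, 0.2))
```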

Another possibility discussed by Besag (1974) is that of approximating the true (two-sided) model by a one-sided model. For example, suppose that we have a model in which the conditional distribution of the variable at site (i, j), given the configuration on the set of predecessors, {(i', j'): either j' < j, or both j' = j and i' < i}, is a simple function of the variables at sites (i−1, j), (i, j−1) and (i−1, j−1). Then, starting from a suitable set of boundary values, we can write down the likelihood of the realization of the process as a product of these conditional probabilities and obtain maximum likelihood estimates of the parameters. By a careful choice of the predecessors which appear in the conditional probability and the form of this probability, the one-sided model can provide a good approximation to the true model.

The first- and second-order properties of a process are very often the ones of most interest, and in this case it may be appropriate to concentrate on fitting these aspects of the data. The correlational properties of the Ising model have been considered by many authors; see, for example, Kaufmann & Onsager (1949) and Green & Hurst (1964). The correlation between site

variables decreases in absolute value (though not geometrically) as the distance between the

sites increases, but whether the limit is zero or not depends on the parameter values involved. If this limit is nonzero, then the process is said to have long-range order and the transition from

long- to short-range order coincides with the transition from one phase to another.

More recently, Bartlett (1971; 1972; 1975, § 2.2) has used a spatial-temporal approach to obtain approximations to the correlations, and Pickard (1976, 1977a) derives a central limit theorem for the sample correlations between nearest neighbours; note that both these authors work with binary variables taking the values ±1 rather than 0, 1.

The correlational properties of such lattice processes with interactions are notoriously difficult to obtain, and exact results are available only in certain special cases. Moreover, it should be noted that two processes on a lattice can be visually very different and yet have the same second-order behaviour. An example of this is given by Enting & Welberry (1978). Thus, concentrating on the second-order properties of a process may be unwise.

The above comments have concerned processes on a lattice but the problems of fitting

spatial point processes with continuous state spaces are no less severe. The features of a


spatial process of most interest tend to be the second-order counting properties (or spectra) and the nearest-neighbour distances; that is, the distances between one point or site and the nearest (or, more generally, kth nearest) point to it. The theoretical properties of processes with interactions between the points, of the kind considered in this paper, usually have to be studied by simulation. Models can then be fitted by comparing the simulations with the observed property. If, for example, repulsion between points is suspected, then the nearest-neighbour distances are an intuitively appealing way of investigating this. Examples of model fitting using these distances are given by Bartlett (1974; 1975, § 3.2.2) and Besag & Diggle (1977).

However, the simulation of these processes is no trivial matter. One possibility is to use a rejection sampling method. The function gₙ in (2.13) specifies how likely a particular set of points is relative to a Poisson process of rate ρ. If, for all n, gₙ(·) ≤ g_max, then we can generate points from a Poisson process of rate ρ and accept the result with probability gₙ(ξ₁, …, ξₙ)/g_max, rejecting it otherwise. Another possibility is to use a spatial-temporal evolution which has the desired process as its equilibrium distribution, and to simulate this evolution over a sufficiently long period. A discussion of the simulation of such processes and of model fitting is given by Ripley (1977).
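A minimal sketch of the rejection method just described is given below, under the assumption that gₙ ≤ 1 (which holds for the density (4.7) with a = 1 and b, c ≤ 1, so that g_max = 1); the parameter values, and the use of the unit square as the region, are illustrative choices only.

```python
import math, random
from itertools import combinations

# Illustrative rejection sampler: generate a Poisson(rho) number of uniform
# points in the unit square and accept the realization with probability
# g_n / g_max.  With a = 1, b <= 1 and c <= 1 in (4.7) we may take g_max = 1,
# so the acceptance probability is b**n * c**n_r.  Values are hypothetical.

def poisson_points(rho, rng):
    n, t = 0, rng.expovariate(1.0)
    while t < rho:                       # count unit-rate arrivals before time rho
        n += 1
        t += rng.expovariate(1.0)
    return [(rng.random(), rng.random()) for _ in range(n)]

def n_r(points, r):
    return sum(math.dist(p, q) < r for p, q in combinations(points, 2))

def rejection_sample(b=1.0, c=0.3, r=0.05, rho=20.0, seed=5):
    rng = random.Random(seed)
    while True:
        pts = poisson_points(rho, rng)
        if rng.random() < b ** len(pts) * c ** n_r(pts, r):   # accept with prob. g_n / g_max
            return pts

print(len(rejection_sample()))
```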

Finally, we note that although we have been concerned with processes for which the conditional distribution of the variable attached to site ξ, given the values of the remaining variables, depends only on those at sites which are neighbours of ξ, another related approach to modelling spatial point processes has been considered by various authors; see, for example, Whittle (1954), and also Bartlett (1974), Besag (1974, 1975, 1977), Mead (1971) and Ord (1975). In this approach, the joint distribution of the variables is assumed to have a simple product form: for example, consider the process on a rectangular lattice with joint probability distribution

Πᵢ q(xᵢ; xⱼ⁽ⁱ⁾, j = 1, …, 4),

for a suitable function q. In particular, this would apply in an autoregressive scheme with

Xᵢ = Σ_{j=1}^{4} aⱼ Xⱼ⁽ⁱ⁾ + εᵢ,

where the εᵢ are independent, identically distributed variables. Usually the εᵢ are assumed to be normally distributed, in which case the lattice variables are continuous, unlike the discrete case with which we have been concerned. Note that in this autoregressive scheme the conditional distribution of Xᵢ, given the values of the variables at all the other lattice points, does not depend only on the Xⱼ⁽ⁱ⁾ (j = 1, …, 4), but on the variables at the 12 sites neighbouring site i, as shown below:

                .
           .    .    .
      .    .    Xᵢ   .    .
           .    .    .
                .

Acknowledgments

I am most grateful to O. Barndorff-Nielsen and D.R. Cox for the highly constructive suggestions which they made on a preliminary draft of this paper, and to F. Spitzer for providing prepublication copies of his recent work with T.M. Liggett. Helpful comments from both referees are acknowledged with thanks.

References

Averintsev, M.B. (1970). On a method of describing discrete parameter random fields (in Russian). Problemy Peredachi Informatsii 6, 100-108.


Bartlett, M.S. (1971). Physical nearest-neighbour models and non-linear time-series. J. Appl. Prob. 8, 222-232. Bartlett, M.S. (1972). Physical nearest-neighbour models and non-linear time series II. Further discussion of

approximate solutions and exact equations. J. Appl. Prob. 9, 76-86. Bartlett, M.S. (1974). The statistical analysis of spatial pattern. Adv. Appl. Prob. 6, 336-358. Bartlett, M.S. (1975). The Statistical Analysis of Spatial Pattern. London: Chapman and Hall. Besag, J.E. (1972). Nearest-neighbour systems and the autologistic model for binary data. J. R. Statist. Soc. B 34,

75-83. Besag, J.E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). J. R. Statist.

Soc. B 36, 192-236. Besag, J.E. (1975). Statistical analysis of non-lattice data. Statistician 24, 179-195. Besag, J.E. (1977). Errors-in-variables estimation for Gaussian lattice schemes. J. R. Statist. Soc. B 39, 73-78. Besag, J.E. & Diggle, P.J. (1977). Simple Monte Carlo tests for spatial pattern. Appl. Statist. 26, 327-333. Cox, D.R. & Isham, V. (1980). Point Processes. London: Chapman and Hall. Daley, D.J. & Vere-Jones, D. (1972). A summary of the theory of point processes. In Stochastic Point Processes,

Ed. P.A.W. Lewis, pp. 299-383. New York: Wiley. Darroch, J.N., Lauritzen, S.L. & Speed, T.P. (1980). Markov fields and log-linear interaction models for contingency

tables. Ann. Statist. 8, 522-539. Dobrushin, R.L. (1968a). The description of a random field by means of conditional probabilities and conditions

of its regularity. Theory Prob. Appl. 13, 197-224. Dobrushin, R.L. (1968b). Gibbsian random fields for lattice systems with pairwise interactions. Funct. Anal. Appl.

2, 292-301. Dobrushin, R.L. (1968c). The problem of uniqueness of a Gibbsian random field and the problem of phase

transitions. Funct. Anal. Appl. 2, 302-312. Dobrushin, R.L. (1969). Gibbsian random fields. The general case. Funct. Anal. Appl. 3, 22-28. Domb, C. & Green, M.S. (Eds). (1972-76). Phase Transitions and Critical Phenomena. London: Academic Press. Doob, J.L. (1953). Stochastic Processes. New York: Wiley. Durrett, R. (1979). An infinite particle system with additive interactions. Adv. Appl. Prob. 11, 355-383. Enting, I.G. & Welberry, T.R. (1978). Connection between Ising models and various probability distributions.

Suppl. Adv. Appl. Prob. 10, 65-72. Green, H.S. & Hurst, C.A. (1964). Order-disorder Phenomena. London: Wiley. Griffeath, D. (1979). Additive and cancellative interacting particle systems. Lecture Notes in Mathematics, No. 724.

Berlin: Springer-Verlag. Grimmett, G.R. (1973). A theorem about random-fields. Bull. Lond. Math. Soc. 5, 81-84. Haberman, S.J. (1974). The Analysis of Frequency Data. I.M.S. monograph. University of Chicago Press. Hammersley, J.M. & Clifford, P. (1971). Markov fields on finite graphs and lattices. (Unpublished manuscript.) Holley, R. & Stroock, D. (1978a). Invariance principles for some infinite particle systems. In Stochastic Analysis,

Eds. A. Friedman and M. Pinsky, pp. 153-173. New York: Academic Press. Holley, R. & Stroock, D. (1978b). Nearest-neighbour birth and death processes on the real line. Acta Mathematica

140, 103-154. Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Z. Physik. 31, 253-258. Kac, M. (1978). Some mathematical problems in statistical mechanics. In Studies in Probability Theory, Ed.

M. Rosenblatt, pp. 180-228. Mathematical Association of America. Kallenberg, O. (1976). Random Measures. Berlin: Akademie-Verlag. Kaufmann, B. & Onsager, L. (1949). Crystal statistics III. Short-range order in a binary Ising lattice. Phys. Rev.

76, 1244-1252. Kelly, F.P. & Ripley, B.D. (1976). A note on Strauss' model for clustering. Biometrika 63, 357-360. Kent, J. (1978). Time-reversible diffusions. Adv. Appl. Prob. 10, 819-835. (See also Adv. Appl. Prob. 11, 888.) Kolmogorov, A.N. (1937). Zur Umkehrbarkeit der Statistischen Naturgesetze. Math. Ann. 113, 766-772. Lang, R. (1977). Unendlich-dimensionale Wienerprozesse mit Wechselwirkung. Z. Wahr. verw. Geb. 38, 55-72;

39, 277-299. (See also T. Shiga (1979). A remark on infinite-dimensional Wiener processes with interactions. Z. Wahr. verw. Geb. 47, 299-304.)

Lebowitz, J.L. & Martin-Löf, A. (1972). On the uniqueness of the equilibrium state for Ising spin systems. Comm. Math. Phys. 25, 276-282.

Liggett, T.M. (1977). The stochastic evolution of infinite systems of interacting particles. Lecture Notes in Mathematics, No. 598. Berlin: Springer-Verlag.

Liggett, T.M. & Spitzer, F. (1979). Ergodic theorems for coupled random walks and other systems with locally interacting components. To appear.

McCoy, B.M. & Wu, T.T. (1973). The Two Dimensional Ising Model. Harvard University Press. Matthes, K., Kerstan, J. & Mecke, J. (1978). Infinitely Divisible Point Processes. London: Wiley. Mead, R. (1971). Models for interplant competition in irregularly distributed populations. In Statistical Ecology,

Vol. 2, Eds. G.P. Patil, E.C. Pielou and W.E. Waters, pp. 13-22. Pennsylvania State University Press. Minlos, R.A. (1967a). Limiting Gibbs' distribution. Funct. Anal. Appl. 1, 140-150. Minlos, R.A. (1967b). Regularity of the Gibbs limit distribution. Funct. Anal Appl. 1, 206-217. Moussouris, J. (1974). Gibbs and Markov random systems with constraints. J. Statist. Phys. 10, 11-33. Nagasawa, M. (1961). The adjoint process of a diffusion with reflecting barrier. Kodai Math. Sem. Rep. 13,

235-248.


Nagasawa, M. & Sato, K. (1962). Remarks to 'The adjoint process of a diffusion with reflecting barrier'. Kodai Math. Sem. Rep. 14, 119-122.

Nelson, E. (1958). The adjoint Markov processes. Duke Math. J. 25, 671-690. Ord, J.K. (1975). Estimation methods for models of spatial interaction. J. Am. Statist. Assoc. 70, 120-126. Pickard, D.K. (1976). Asymptotic inference for an Ising lattice. J. Appl. Prob. 13, 486-497. Pickard, D.K. (1977a). Asymptotic inference for an Ising lattice II. Adv. Appl. Prob. 9, 476-501. Pickard, D.K. (1977b). A curious binary lattice process. J. Appl. Prob. 14, 717-731. Pickard, D.K. (1978). Unilateral Ising models. Suppl. Adv. Appl. Prob. 10, 58-64. Preston, C.J. (1973). Generalised Gibbs states and Markov random fields. Adv. Appl. Prob. 5, 242-261. Preston, C.J. (1974). Gibbs States on Countable Sets. Cambridge University Press. Preston, C.J. (1976a). Random fields. Lecture Notes in Mathematics, No. 534. Berlin: Springer-Verlag. Preston, C.J. (1976b). Spatial birth and death processes. Bull. Int. Statist. Inst. 46, 371-391. Ripley, B.D. (1977). Modelling spatial patterns (with discussion). J. R. Statist. Soc. B 39, 172-212. Ripley, B.D. & Kelly, F.P. (1977). Markov point processes. J. Lond. Math. Soc. 15, 188-192. Ruelle, D. (1967). A variational formulation of equilibrium statistical mechanics and the Gibbs phase rule. Comm.

Math. Phys. 5, 324-329. Ruelle, D. (1969). Statistical Mechanics. New York: Benjamin. Sherman, S. (1973). Markov random fields and Gibbs random fields. Israel J. Math. 14, 92-103. Spitzer, F. (1971a). Markov random fields and Gibbs ensembles. Am. Math. Mon. 78, 142-154. Spitzer, F. (1971b). Random fields and interacting particle systems. Lecture notes given at the 1971 M.A.A. summer

seminar. Mathematical Association of America. Spitzer, F. (1977). Stochastic time evolution of one-dimensional infinite particle systems. Bull. Am. Math. Soc. 83,

880-890. Spitzer, F. (1979). Infinite systems with locally interacting components. To appear. Thompson, C.J. (1972). Mathematical Statistical Mechanics. New York: Macmillan. Whittle, P. (1954). On stationary processes in the plane. Biometrika 41, 434-449.

Résumé

On peut utiliser les champs binaires de Markov comme modèles pour les processus ponctuels avec interactions entre points (tels que l'attraction ou la répulsion). En suivant cette démarche, cet article se présente comme une introduction simple aux champs de Markov, sans développements techniques détaillés. Nous nous intéressons au cas où les espaces sous-jacents contenant ces points sont dénombrables (ex. les nœuds d'un treillis) ou continus (espace euclidien). Nous considérons également les champs de Markov en tant que processus d'équilibre dans les évolutions temporelles des processus spatiaux. Pour finir, nous signalons quelques exemples et quelques applications.

[Paper received May 1980, revised October 1980]
