Directional Analysis of Stationary Point Processes

Martina Sormani

Vom Fachbereich Mathematik der Technischen Universität Kaiserslautern zur Verleihung des akademischen Grades Doktor der Naturwissenschaften (Doctor rerum naturalium, Dr. rer. nat.) genehmigte Dissertation

D 386

Erstgutachterin: Prof. Dr. Claudia Redenbach, Technische Universität Kaiserslautern
Zweitgutachter: Prof. Dr. Jesper Møller, Aalborg University
Datum der Disputation: 11.06.2019


Acknowledgements
First of all, and most of all, I would like to thank my supervisor Claudia Redenbach, who gave me the opportunity to do my PhD and was a great support and guide during this time, not only regarding issues in mathematics. I am grateful to Tuomas Rajala, who shared lots of code and ideas with me, and to Prof. Aila Särkkä, especially for her detailed and accurate text corrections. Thanks to Johannes Freitag for sharing the ice data with us, for his suggestions, and for giving me the opportunity to join the DFG project; thanks also to his PhD student Tetsuro. I would also like to thank the image processing group of the ITWM for letting me use their software and for sharing their knowledge; in particular, thanks to Sonja for all her help, and to Prakash. I am grateful to professors Lothar Heinrich, Jürgen Franke and Gabriele Steidl for sharing their knowledge and for taking the time to discuss with us. Thanks to Disha, who was always with me in good and bad times. Finally, I want to thank my family in Italy, who have always been near to me, and Luis and Diego.
Preface
This work has mainly been supported by the DFG priority programme “Antarktisforschung mit vergleichenden Untersuchungen in arktischen Eisgebieten”: FR 2527/2-1, RE 3002/3-1. Partial funding by the DFG-Graduiertenkolleg 1932 and by the Center for Mathematical and Computational Modelling (CM)² in Kaiserslautern is gratefully acknowledged.
List of Symbols
B0 bounded sets of B
Nlf locally finite subsets of Rd
Nlf σ-algebra on Nlf
x point configuration on Rd
Nx(B) number of points of x in a subset B ⊂ Rd
X spatial point process on Rd
XS X ∩ S
NX(B) number of points of X in a subset B ⊂ Rd
λ intensity of X
∂W border of W
(Ω,F ,P) probability space
K(·) reduced second order moment measure
λ(2) second order product density
P0(·) Palm measure
W ∗ window of observation of the Fry points
λZ intensity of the Fry points
µ,ν measures
G nearest neighbor distance distribution function
F empty space function
g pair correlation function
d(·, ·) distance function
Sd−1 unit sphere in Rd
kd volume of the d-dimensional unit ball
I(·) indicator function
T = RC linear mapping, R rotation matrix, C compression matrix
R0 element of SOn
det(·) determinant of a matrix
tr(·) trace of a matrix
AT transpose of the matrix A
Contents
Introduction

1 Spatial Point Processes
  1.1 General notation
  1.2 Definitions and preliminaries
  1.3 Properties of spatial point patterns
  1.4 Poisson point process (CSR)
  1.5 Summary statistics
    1.5.1 Intensity measures
    1.5.2 Palm distributions
    1.5.3 Second order summary statistics
    1.5.4 First order summary statistics
  1.6 Strauss process
  1.7 The Metropolis Hastings algorithm
    1.7.1 The algorithm
    1.7.2 Convergence of the algorithm
    1.7.3 Simulation of locally stable point processes

2 Directional Analysis
  2.1 Settings
    2.1.1 Aims
    2.1.2 Explicative examples
  2.2 Fry points
  2.3 Integral method
    2.3.1 Estimation of R
  2.4 Projection method
    2.4.1 Estimation of R
  2.5 Ellipsoid method
    2.5.1 Estimation of R
  2.6 Estimation of C
    2.6.1 Integral method
    2.6.2 Projection method

3 Simulation Study
  3.1 Simulation study
  3.2 Estimation of R
    3.2.1 2D
      3.2.1.1 Integral Method
      3.2.1.2 Projection Method
      3.2.1.3 Ellipsoid Method
    3.2.2 3D
      3.2.2.1 Projection Method
      3.2.2.2 Ellipsoid Method
    3.2.3 Discussion
  3.3 Estimation of C
    3.3.1 2D
    3.3.2 3D
    3.3.3 Discussion

4 Directional Analysis - Additional Aspects
  4.1 Influence of noise
    4.1.1 Simulation study
  4.2 Classification algorithms
    4.2.1 Model specification
    4.2.2 MCMC method
    4.2.3 Variational Bayes algorithm
    4.2.4 Comparison of the methods
  4.3 Testing against anisotropy
    4.3.1 Power of the “Projection” test
  4.4 Visualization of the Fry points
    4.4.1 2D
    4.4.2 3D
  4.5 Limit behaviour of the geometric anisotropy transform

5 Application to Ice Data
  5.1 Description of the data
    5.1.1 Division in subsamples
  5.2 Motivation
  5.3 Directional analysis
    5.3.1 Estimation of the interaction radius
    5.3.2 Estimation of R
      5.3.2.1 Talos Dome core
      5.3.2.2 EDML core
      5.3.2.3 Renland core
    5.3.3 Estimation of C
      5.3.3.1 Talos Dome core
      5.3.3.2 EDML core
      5.3.3.3 Discussion
    5.3.4 Representation of the Fry points

Conclusions

Appendices
  A.1 Proof of unbiasedness
  A.2 Expectation of wavelet coefficients
  B.1 Academic Background
  B.2 Akademischer Werdegang

Bibliography
Introduction
In this thesis we consider as the main topic the directional analysis of a stationary point process. The interest in such an analysis arises in the modern point process literature since, thanks to advances in technology, large and complicated point pattern data, in particular in 3D, have become more common. For such patterns, the assumptions of isotropy and stationarity cannot simply be made but need further investigation. Testing stationarity has been considered by several authors, and a wide range of non-stationary models is currently available [1, 2, 27, 39, 49]. Isotropy, on the other hand, is often still assumed without further checking, although several tools to study anisotropy have been suggested in the literature. To make those tools more easily accessible, a paper [51] collecting the existing non-parametric methods was recently published.

In this thesis we focus on and compare three non-parametric methods, which we call the Integral method, the Ellipsoid method and the Projection method. All of them are based on second-order analysis of a point process. The Ellipsoid method was introduced in [52]. The Integral method has been applied in the literature in several versions, for example in [53] or in [35], and is described here in a general context. The Projection method, to the best of our knowledge, is introduced in this thesis; a similar idea in 2D can be found in [26, page 254]. In a simulation study we apply the methods in order to find preferred directions and we compare their performances. Testing isotropy and visualization of anisotropy, both in 2D and 3D, are also considered. Directional methods are especially useful for detecting directions in regular point patterns, since it can be difficult to visually detect anisotropy in such patterns. In contrast, in clustered patterns, the shape and directions of the clusters can already reveal some information. An example of a regular pattern where it is difficult to visually detect anisotropy is the amacrine cells data (Figure 0.0.1, left), which consists of on cells and off cells. These data have been analyzed several times under the assumptions of stationarity and isotropy, but it was recently detected by Wong and Chiu that both the marginal on and off patterns and their superposition show some signs of anisotropy.

Anisotropy can be generated by several mechanisms. In this thesis we focus on the so-called geometric anisotropy mechanism, which has been considered in the literature both for clustered point patterns, such as the Welsh chapel data (Figure 0.0.1, right) [38, 36], and for regular point processes, such as the amacrine cells [66] and air bubbles in polar ice [53, 52]. Motivated by our application to real data, we pay special attention to the regular case. As in [53, 52] we consider the 3D locations of air bubbles in glacial ice cores. For these data the aim of a directional analysis is to obtain information about the deformation of the ice sheet at different depths. This information is necessary for glaciologists in order to build dating models for the ice. A first directional analysis of the ice data can be found in [53] and [52].
Figure 0.0.1: The amacrine cells data (left; on and off cells) and the Welsh chapel data (right).
Finally, we consider the influence of isotropic and stationary noise on the results of the directional analysis of a stationary point process. This study is motivated by the ice application: it has recently been discovered that ice core samples may contain noise bubbles, which form due to the relaxation of the ice after the core is taken out of the drilling hole. In this context, the classification algorithms introduced in [54] and [50] are taken into consideration. The limit behavior of the geometric anisotropy mechanism is also described. An introduction to point process theory and to the main notation is given in Chapter 1. The three main methods, the Integral, the Ellipsoid and the Projection method, are described in Chapter 2, as well as their application in the setting of geometric anisotropy. In Chapter 3 the methods are compared via a simulation study, both in 2D and in 3D. In Chapter 4 we consider the influence of noise, the anisotropy tests and the limiting behavior of the geometric anisotropy mechanism. Finally, in Chapter 5, we apply the methods to the ice data. Parts of this work have been published in
• C. Redenbach, A. Särkkä, M. Sormani (2015). Classification of Points in Superpositions of Strauss and Poisson Processes. Spatial Statistics. 12, 81-95.
• T. Rajala, C. Redenbach, A. Särkkä, M. Sormani (2016). Variational Bayes Approach for Classification of Points in Superpositions of Point Processes. Spatial Statistics. 15, 85-99.
• T. A. Rajala, A. Särkkä, C. Redenbach, M. Sormani (2016). Estimating geometric anisotropy in spatial point patterns. Spatial Statistics. 15, 139-155.
• T. A. Rajala, C. Redenbach, A. Särkkä, M. Sormani (2018). A review on anisotropy analysis of spatial point patterns. Spatial Statistics.
1 Spatial Point Processes
In this chapter we describe the fundamentals of the theory of spatial point processes which are necessary to introduce our work. We start by giving the formal definition of a spatial point process in Section 1.2 and by describing some important properties that a point process may have in Section 1.3. In Section 1.4 we introduce the Poisson point process, which is a fundamental model in spatial point process theory. In Section 1.5 we describe some of the possible summary statistics used to describe point patterns. In Section 1.6 we introduce the Strauss process, which will be considered throughout the thesis, and in Section 1.7 the Metropolis Hastings algorithm that can be used to simulate it. The main references we used in this chapter are [26], [40], [59] and [64].
1.1 General notation
In this section we define some general notation that will be used in the thesis; more specific notation will be introduced later. We denote by I[·] the indicator function and by IB[·] the indicator function of a set B ⊂ Rd which, given x ∈ Rd, is defined as

IB[x] := 1 if x ∈ B, and IB[x] := 0 otherwise.
Given a set B ⊂ Rd, we denote its Lebesgue measure by |B|. In particular, the Lebesgue measure of the d-dimensional ball Br(0) with r = 1 (the unit ball) will be denoted by kd. We denote the (d−1)-dimensional unit sphere by Sd−1 and define the positive half unit sphere as

(Sd−1)+ := {x ∈ Sd−1 such that xd > 0},
where xd denotes the last component of x. We denote the Minkowski sum of two sets A and B in Rd as
A⊕B = {a+ b : a ∈ A, b ∈ B}.
The set Bx = B ⊕ {x} therefore corresponds to the translation of the set B by a point x ∈ Rd. We define the Euclidean norm || · || and denote by d(x, y) := ||x − y|| the distance between two points x, y ∈ Rd. The distance between a point x ∈ Rd and a set B ⊂ Rd is given by

d(x,B) := inf_{y∈B} d(x, y).
Given the space Lp(Rd,R) of Lp-Lebesgue integrable functions from Rd to R, we define the corresponding Lp-norm as || · ||Lp. We denote by det(A) the determinant of a matrix A, by tr(A) its trace and by AT its transpose. Finally, we denote the Dirac delta function by δ(·). We now introduce notation for three particular types of sets in Rd. We denote by S(u, ε, r) the double conical sector centered at the origin with main direction u ∈ Sd−1,
opening angle ε and radius r. In 2D the set will be denoted by S(θ, ε, r), where θ is the angle that u forms with the x-axis (plot 1 of Figure 1.1.1). We denote by L(u, r, hc) the cylinder (3D) or rectangle (2D) with major-axis direction u ∈ Sd−1, height 2r and cross-section half-length hc, with 0 < hc < r. In 2D we use, as for the cone, the notation L(θ, r, hc) (plot 2 of Figure 1.1.1). Finally, we denote by E(u, r, k), where k < 1, the ellipse centered at the origin with major-axis direction u ∈ Sd−1 and semi-axes of length r/k and rk. In 2D we use the notation E(θ, r, k) (plot 3 of Figure 1.1.1).

Figure 1.1.1: The double conical sector S(θ, ε, r) (plot 1), the rectangle L(θ, r, hc) (plot 2) and the ellipse E(θ, r, k) (plot 3).
1.2 Definitions and preliminaries
Spatial point processes are random countable subsets of a space S. The space S is required to be a locally compact topological space with a countable base, on which the Borel σ-algebra is defined. In this thesis we usually consider S = Rd endowed with the σ-algebra B induced by the Euclidean metric. In some cases we also consider S ⊂ Rd, again endowed with the σ-algebra induced by the Euclidean metric, which is also denoted by B. We now give a formal definition of a point process on S, restricting our attention to point processes whose realizations are locally finite subsets of S. Let B0 be the set of bounded elements of B, x a countable subset of S, Nx(S) its cardinality and Nx(B) the cardinality of the point configuration x restricted to a subset B of S. We define the set Nlf of locally finite subsets of S as
Nlf = {x ⊂ S : Nx(B) <∞ ∀B ∈ B0}
On Nlf we define the following σ-algebra
Nlf := σ{{x ∈ Nlf : Nx(B) = k}, B ∈ B0, k ∈ N}.
Definition 1.2.1. Let (Ω,F ,P) be a probability space. A spatial point process X on S is a measurable map

X : (Ω,F) → (Nlf ,Nlf ).
The distribution of X is given by the probability measure PX on the measure space (Nlf ,Nlf ) defined as

PX(F ) = P({ω : X(ω) ∈ F}) ∀F ∈ Nlf .
In applications, spatial point processes are used as statistical models for the analysis of observed patterns of points, called spatial point patterns or spatial point configurations, where the points represent the locations of some objects of interest. A great variety of objects can be considered, in many different contexts; typical examples are the locations of trees in a forest, of stars in a galaxy, or of cells in a tissue. In all these situations the data, at a basic level, simply consist of point coordinates. Since spatial point patterns present a huge variety, one of the primary aims of point process theory is to provide structural methods describing how to find a statistical model which offers a satisfactory explanation of the considered pattern. To this aim, different types of models, which may depend on different parameters, are considered and studied. In practice, the data of a realization of a spatial point process are collected in a bounded observation window W , which affects the analysis of the data and should therefore be carefully taken into consideration.
1.3 Properties of spatial point patterns
In this section we describe some important properties that spatial point patterns (processes) may have. When having a point pattern it is in fact useful to check whether it satisfies cer- tain properties, in order to find a correct model for the data, and, if possible, to simplify its analysis. We start by describing two properties, namely stationarity and isotropy, that will play a central role throughout the thesis. Let X be a point process on Rd.
1) Stationarity: We say that X is stationary, if its distribution is invariant under transla- tions. This means that the point process Y := X + x, where x is an arbitrary fixed point of Rd, has the same probability distribution as X for all x ∈ Rd.
2) Isotropy: We say that X is isotropic, if its distribution is invariant under rotations about the origin. This means that the point process Y := R0X, where R0 ∈ SOn has the same probability distribution as X for all R0 ∈ SOn.
Both the assumption of stationarity and that of isotropy considerably simplify the analysis of a point pattern. To check stationarity, various methods have been proposed in the literature, some of which are fairly standard to use (see for example the quadrat counting method [4, page 165], where one has to assume independence of the points). The hypothesis of isotropy, instead, is often confirmed only by a visual check. In applications we usually distinguish between
1) Regular point patterns: The points show repulsion between each other and are lo- cated such to preserve a certain distance. The repulsion may be caused by some physical limits, for example the points could represent the centers of spheres which have a certain radius r0. (Figure 1.3.1, second plot)
2) Clustered point patterns: The points show attraction between each other and form clusters where the points lie close to each other. An example of a clustered pattern can be the pattern of the seeds spread by a group of plants, where each plant spreads seeds in its proximity. (Figure 1.3.1, third plot)
3) Complete Spatial Randomness (CSR): The points do not show any type of interaction and are randomly and independently scattered in space. (Figure 1.3.1, first plot) The CSR model plays a major role in spatial statistics and will be taken into consideration in the next Section 1.4. In the literature, several models both for regular and clustered patterns have been proposed. In this thesis we particularly focus on regular point patterns. An additional property of spatial point processes is
Simplicity: Realizations of X a.s. consist of pairwise distinct points, so that a.s. no two points of the process coincide. In most applications, including ours, this does not represent a constraint, since for physical reasons it is impossible for two points to be located in exactly the same place.
Figure 1.3.1: Realizations of a CSR process (first plot), a regular point process (second plot) and a clustered point process (third plot).
1.4 Poisson point process (CSR)
Definition 1.4.1. Let µ be a locally finite, diffuse measure on Rd. A point process X on Rd
such that
(i) NX(A) ∼ Poisson(µ(A)) ∀A ∈ B0,
(ii) if A1, . . . , Ak ∈ B0 are disjoint sets, NX(A1), . . . , NX(Ak) are independent random variables,
is called Poisson point process with intensity measure µ.
The Poisson process with intensity measure µ can be defined on a subset S ⊂ Rd in an analogous way. If the measure µ has a density λ with respect to Lebesgue measure, λ is called intensity function. If λ is constant, we say that X is a homogeneous Poisson process. It is easy
to verify that the homogeneous Poisson process is stationary and isotropic. Note that, in the literature, the term CSR usually refers to the homogeneous Poisson point process. The Poisson process is also used as a basis for the construction of more complicated models and is the most analytically tractable model. Note that in Definition 1.4.1, if X is simple, property (i) implies property (ii).
Definition 1.4.2. Let S ∈ B, µ be a diffuse locally finite measure on Rd with µ(S) < ∞, and let n ∈ N. A point process X is a µ-binomial point process on S with n points if X := ∪_{i=1}^n {ξi}, where the ξi are independent and µ-uniformly distributed in S, so that

P(ξi ∈ A) = µ(A)/µ(S), A ∈ B, A ⊂ S.
We now consider the restriction XS of a Poisson process X with intensity measure µ to a set S such that µ(S) < ∞, or equivalently a Poisson process X with intensity measure µ defined on such a set S. This is not a severe restriction in applications, since we usually observe a realization of our point process in a bounded window W .
Proposition 1.4.1. Let X be a Poisson point process on Rd with intensity measure µ. The process XS := X ∩S with S ∈ B0 and µ(S) > 0 conditional on NX(S) = n is a binomial point process with n points on S.
From Proposition 1.4.1 we can deduce a method to simulate XS . We can generate a random number N ∼ Poisson(µ(S)), and then generate N points, µ-uniformly scattered in S. Propo- sition 1.4.1 also allows us to characterize the distribution Π of X defined on the measure space (Nlf ,Nlf ) . In fact
Π(F ) = P(X ∈ F ) = ∑_{n=0}^∞ (e^{−µ(S)} µ(S)^n / n!) ∫_S · · · ∫_S I[{s1, . . . , sn} ∈ F ] (dµ(s1)/µ(S)) · · · (dµ(sn)/µ(S)), F ∈ Nlf . (1.4.1)
When n = 0 the integrals should be replaced by I[∅ ∈ F ].
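This proposition translates directly into a two-step simulation recipe for XS. As a minimal sketch in base R for the homogeneous case on the unit square (spatstat's rpoispp provides the same functionality):

## Minimal sketch: simulate a homogeneous Poisson process on S = [0,1]^2.
## Step 1: draw N ~ Poisson(mu(S)); step 2: scatter N points mu-uniformly in S.
lambda <- 100                  # constant intensity, so mu(S) = lambda * |S| = lambda
N <- rpois(1, lambda)          # random number of points
x <- runif(N)                  # for a homogeneous process, mu-uniform = uniform
y <- runif(N)
plot(x, y, asp = 1, pch = 16)  # one realization on the unit square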
1.5 Summary statistics
In this section we introduce different summary statistics used to describe spatial point patterns. Summary statistics can give different kinds of information about the considered spatial pattern and they can be used to help identify a suitable model for it. In Section 1.5.1 we introduce the so called intensity measures. In Section 1.5.2 we introduce Palm distributions, which
characterize conditional properties of spatial patterns and which are necessary to introduce second order statistics in Section 1.5.3: Ripley's K-function and the pair correlation function g. Finally, in Section 1.5.4 we introduce three possible first order summary statistics: the empty space function F , the nearest neighbor distance distribution G and the J-function, a combination of F and G.
1.5.1 Intensity measures
The first order moment measure Λ, also called the intensity measure of a point process X is defined on the space (S,B) as
Λ(A) = E(NX(A)) ∀A ∈ B,
so Λ(A) represents the expected number of points of X in A. The first order moment measure can have a density λ : S → R+ with respect to the Lebesgue measure. In this case we call λ the intensity function and we can write

Λ(A) = ∫_A λ(ξ) dξ ∀A ∈ B.

If X is stationary, Λ is invariant under translations, so that

Λ(A) = ∫_A λ(ξ + ν) dξ ∀ν ∈ Rd.

This implies that the intensity function is constant, λ(x) = λ, and that Λ(A) = λ|A|. In this case λ can be interpreted as the average number of points per unit volume and, given a realization x of the process in the observation window W , can be estimated by

λ̂ = Nx(W )/|W |. (1.5.1)
In the Poisson process the intensity measure Λ coincides with µ.
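The estimator (1.5.1) is immediate to compute. A minimal sketch in R (the pattern and window here are illustrative; with a spatstat ppp object X one would use npoints(X) / area(Window(X))):

## Sketch of the intensity estimator (1.5.1): lambda_hat = N_x(W) / |W|.
pts <- matrix(runif(200), ncol = 2)  # illustrative pattern on W = [0,1]^2
W_volume <- 1                        # |W| for the unit square
lambda_hat <- nrow(pts) / W_volume   # estimated intensity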
Theorem 1.5.1. (Campbell thm.) Let X be a point process on Rd and f : Rd → R+ a non-negative measurable function. Then
E( ∑_{x∈X} f(x) ) = ∫_{Rd} f(x)λ(x) dx.

In the stationary case the equation can be written as

E( ∑_{x∈X} f(x) ) = λ ∫_{Rd} f(x) dx.
The proof of this theorem can be found e.g. in [57, page 54].
The definition of the first order moment measure can be extended to an arbitrary order n as a measure M(n) on the product space (Sn,⊗nB):

M(n)(A1 × · · · ×An) = E( ∑_{ξ1,...,ξn∈X} ∏_{i=1}^n I[ξi ∈ Ai] ), A1, . . . , An ∈ B0.

We can also define the nth order factorial moment measure Λ(n) as

Λ(n)(A1 × · · · ×An) = E( ∑^{≠}_{ξ1,...,ξn∈X} ∏_{i=1}^n I[ξi ∈ Ai] ), A1, . . . , An ∈ B0, (1.5.2)

where ≠ indicates that the sum runs only over mutually distinct ξ1, . . . , ξn. The name of this measure is due to the fact that

Λ(n)(A× · · · ×A) = E(NX(A)(NX(A)− 1) . . . (NX(A)− n+ 1)).

The measure M(n)(A1 × · · · ×An) represents the expected number of n-tuples that can be formed from the points of the process taking the i-th point in Ai, permitting repetitions if the intersections between some of the Aj are not empty, while Λ(n)(A1 × · · · ×An) represents the same quantity without permitting those repetitions. The measure Λ(n) can have a density with respect to the Lebesgue measure on (Sn,⊗nB), which we denote by λ(n). In the case n = 2, the density λ(2) is called the second order product density. It follows directly from the definition of Λ(2) that
λ(2)(x, y) = λ(2)(y, x) ∀x, y ∈ Rd. (1.5.3)
The Campbell Theorem 1.5.1 can be generalized to the second order factorial moment measure Λ(2) as
Theorem 1.5.2. Let X be a point process on Rd and f : Rd × Rd → R+ a non-negative measurable function. Then

E( ∑^{≠}_{x,y∈X} f(x, y) ) = ∫_{Rd} ∫_{Rd} f(x, y)λ(2)(x, y) dx dy.
We now define the so-called Campbell measures, which will be useful in the next section. The first order Campbell measure on the product space (S ×Nlf ,B ⊗Nlf ) is defined as

C(A× F ) = E(NX(A) I[X ∈ F ]) ∀A ∈ B0, F ∈ Nlf .

Notice that we have C(A×Nlf ) = Λ(A).
The first order reduced Campbell measure is defined as
C!(A× F ) = E( ∑_{ξ∈X} I[ξ ∈ A, X\ξ ∈ F ] ) ∀A ∈ B0, F ∈ Nlf .
These measures can be extended to higher orders in the obvious way.
1.5.2 Palm distributions
The Palm distributions of a spatial point process are probability measures Pξ on (Nlf ,Nlf ), where ξ ∈ S. We will see that Pξ(F ) can be heuristically interpreted as P(X ∈ F |NX(Bε(ξ)) > 0), where ε > 0 is arbitrarily small, so Pξ gives the conditional distribution of X given that there is a point of the process at ξ. Formally, we define the Palm distributions in the following way. Consider F ∈ Nlf , the first moment measure Λ(·) and the measure C(· × F ), where C(·, ·) is the first order Campbell measure. We have directly from the definition that
C(· × F ) ≪ Λ(·),

where ≪ means "absolutely continuous with respect to", since Λ(A) = 0 =⇒ C(A× F ) = 0. From the Radon Nikodym theorem there exists a density dC(· × F )/dΛ : S → R such that

C(A× F ) = ∫_A (dC(ξ × F )/dΛ) dΛ(ξ) ∀A ∈ B0.

It is possible to choose this density such that, fixing F , we obtain a Borel measurable function of ξ and, fixing ξ, we obtain a probability measure on (Nlf ,Nlf ). We call this probability measure the Palm distribution, so

Pξ(·) = dC(ξ × ·)/dΛ.
We now show heuristically that the Palm distribution can be interpreted as the conditional distribution of X given that there is an event in ξ. In fact, given ε small enough, if we define A := Bε(ξ), we can assume that in A there is at most one point of X. With this assumption we have that
C(A× F ) ≈ E(I[X ∈ F, NX(A) > 0]) = P(X ∈ F, NX(A) > 0)

and

C(A× F ) ≈ Pξ(F )Λ(A) ≈ Pξ(F )P(NX(A) > 0),

so

Pξ(F ) ≈ P(X ∈ F, NX(A) > 0) / P(NX(A) > 0) = P(X ∈ F |NX(A) > 0).
In an analogous way, using the reduced Campbell measure, we can define the reduced Palm distribution P!ξ(·). In this case, heuristically, we can interpret P!ξ(·) as the probability distribution of X\ξ given that X has a point at ξ. From the definition of the reduced Palm distribution, using standard techniques in measure theory, the following formula, known as the Campbell-Mecke Theorem, can be proved:

E( ∑_{ξ∈X} h(ξ,X\{ξ}) ) = ∫∫ h(ξ, x) dP!ξ(x) dΛ(ξ) (1.5.4)
for non-negative measurable functions h. Consider now the case that X is stationary. Since the characteristics of the process are the same throughout space, it should not be important which point ξ is fixed when looking at the Palm measure. In fact it can be proved that (see [40]) if we define
P!0(F ) := (1/(λ|A|)) E( ∑_{ξ∈X∩A} I[(X − ξ)\{0} ∈ F ] ), F ∈ Nlf , A ∈ B0, (1.5.5)

then

P!ξ(F ) = P!0(F(−ξ)), F ∈ Nlf ,

where F(−ξ) denotes the event F translated by −ξ.
In the stationary case we can therefore restrict our attention to P!0, which can also be interpreted as the distribution of the remaining points of X given a "typical point" of X. The Campbell-Mecke theorem, in the stationary case, can be rewritten as

E( ∑_{ξ∈X} h(ξ,X\{ξ}) ) = λ ∫∫ h(ξ, x+ ξ) dP!0(x) dξ. (1.5.6)
Consider now a Poisson point process. We can expect that the distribution of the process does not change if we suppose to know the position of one point of the process, since the scattering of the points is completely random and does not depend on the other positions.
Theorem 1.5.3. (Slivnyak thm.) Let X be a Poisson process on Rd with intensity measure µ. Then PX = P!ξ for almost all ξ ∈ Rd.
For a proof of this theorem see [57, Thm 3.3.5, Notes 3.3.3].
1.5.3 Second order summary statistics
Second order summary statistics, although they do not fully characterize a point process, are believed to represent important statistical properties and therefore constitute a widely used tool for the analysis of point patterns. Second order statistics are based on the second order factorial moment measure Λ(2) which was defined in Equation (1.5.2). In this section we assume that X is stationary and that the product density λ(2) exists. In this case it can be proved that
λ(2)(x, y) = λ(2)(0, y − x) =: λ(2)(z), z = y − x
and that therefore, by (1.5.3), λ(2)(z) = λ(2)(−z). We now define the reduced second-order moment measure K by

λ²K(B) := ∫_B λ(2)(z) dz, B ∈ B.

From the definition of K and Λ(2) it follows that

Λ(2)(A×B) = λ² ∫_A K(B−x) dx = λ ∫_A E!0(NX(B−x)) dx, (1.5.11)

where B−x denotes the translation of B by −x and E!0 is the expectation with respect to P!0, so that λ²K(B) = λE!0(NX(B)).
The quantity λK(B) can therefore be interpreted as the expected number of points in B, excluding the origin, conditioned on 0 belonging to X. When observing X on all of Rd, an unbiased estimator for λ²K(B) is given by

∑^{≠}_{x,y∈X} I[x ∈ A, y − x ∈ B] / |A|, A ∈ B0. (1.5.12)
Unbiasedness follows from Theorem 1.5.2. When observing X in a finite window W we need to deal with edge effects, since smaller distances between points are more likely to be observed than larger ones. In this case an unbiased estimator is given by
∑^{≠}_{x,y∈XW} I[y − x ∈ B] / |Wx ∩Wy|, (1.5.13)

where the weights 1/|Wx ∩Wy| are called translation edge correction weights and were introduced by Ohser and Stoyan in [44]. When choosing B as the sphere centered at the origin with radius r, the K-measure coincides, as a function of r, with Ripley's K-function, which is widely used in practice, so K(r) = K(Br). Note that Ripley's K-function, due to the shape of B, assumes both stationarity and isotropy. In Chapter 2, Section 2.3 we will discuss directional versions of the K-function that take anisotropy into account. For a homogeneous Poisson process Ripley's K-function assumes the values

K(r) = kd r^d.
For clustered processes we expect K(r) ≥ kd r^d for small r, and for regular point processes we expect K(r) ≤ kd r^d for small r. The cumulative nature of the K-measure can make it hard to interpret and can sometimes obscure details, which is why its derivative is sometimes considered. Rewriting the K-measure as

K(B) = λ⁻² ∫_B λ(2)(z) dz = ∫_B g(z) dz,

the function g(z) := λ(2)(z)/λ² is called the pair-correlation function. The pair correlation function is more practical than the product density λ(2) since it is independent of the intensity. By the definition of density we can interpret λ(2)(z) dx dy as the probability of having two points in two infinitesimal volumes dx and dy with difference vector z, while λ dx can be interpreted as the probability of having one point in an infinitesimal volume dx. If the two events of having one point in dx and one point in dy are independent, as in the homogeneous Poisson process, we have g ≡ 1. Values of g > 1 for small ||z|| are typical in the case of clustering, while values of g < 1 indicate repulsion between the points and are typical for regular patterns.
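In practice, both K and g are estimated directly from the data. A hedged sketch using the R package spatstat (also used later in the thesis), whose function Kest implements, among others, the translation edge correction of (1.5.13):

## Sketch: empirical second order statistics for an observed pattern.
library(spatstat)
X <- rpoispp(100)                        # CSR pattern on the unit square
K <- Kest(X, correction = "translate")   # K-function, translation edge correction
g <- pcf(X)                              # pair correlation function estimate
plot(K)                                  # compare with K(r) = pi * r^2 under CSR
plot(g)                                  # compare with g = 1 under CSR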
1.5.4 First order summary statistics
In this section we briefly introduce some first order summary statistics for stationary point processes. The nearest neighbor distance distribution function G is defined as
G(r) = P0(d(0, X\0) ≤ r), r > 0.
G(r) is the probability that there is at least one point of the process at distance at most r from 0, given that 0 is a point of the process. The empty space function F is analogous to the G function, with the only difference that the Palm distribution is replaced by PX:
F (r) = PX(d(0, X) ≤ r) r > 0,
where in this case 0 a.s. does not belong to the process. F (r) is then the probability of finding, given a generic point in S, at least one event of the process at distance at most r from this point. Therefore F is the distribution function of the distance between an arbitrary point of S and the nearest point of the process, while G is the distribution function of the distance between the typical point of the process and its nearest neighbor. The J-function is defined as
J(r) = (1−G(r)) / (1− F (r)) = P0(d(0, X\0) > r) / PX(NX(B(0, r)) = 0), ∀r > 0 such that F (r) < 1,
where in the numerator 0 is the typical point of the process. Intuitively, if J(r) takes values smaller than 1, the probability of an empty space of radius r around the typical point of the process is smaller than the probability of the same empty space around a generic point, which is typical for clustered patterns. If instead J(r) is larger than 1, we can suppose the pattern to be more regular. These heuristic observations are confirmed by the fact that for a Poisson process J(r) = 1, as a consequence of Theorem 1.5.3.
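These summary statistics are also readily available in spatstat; a short sketch for a CSR pattern, where J should fluctuate around 1:

## Sketch: first order summary statistics for a CSR pattern.
library(spatstat)
X <- rpoispp(100)
Fhat <- Fest(X)    # empty space function F
Ghat <- Gest(X)    # nearest neighbor distance distribution G
Jhat <- Jest(X)    # J-function; J = 1 for a Poisson process
plot(Jhat)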
1.6 Strauss process
In this section we first introduce the class of processes that have a density with respect to a Poisson process with intensity measure µ, defined on a set S ⊂ Rd with µ(S) <∞. We then take into consideration a particular process belonging to this class: the Strauss process. The Strauss process will be considered in the simulation studies of Chapter 3.
Definition 1.6.1. We say that a process X has density p : Nlf → R+ with respect to the Poisson process with intensity measure µ if
P(X ∈ F ) = ∫_F p(x) dΠ(x), F ∈ Nlf ,
where Π denotes the probability distribution induced on (Nlf ,Nlf ) by the Poisson process.
From the Radon Nikodym theorem we have that every process that induces a probability measure on (Nlf ,Nlf ) which is absolutely continuous with respect to Π has a density with respect to the Poisson process and vice versa.
To be a probability density, p(·) must integrate to 1 over Nlf with respect to Π. Given a function p : Nlf → R+, we want to give sufficient conditions for p to be a probability density. Since p(·) is usually known only up to a normalizing constant, we will consider h(·) = p(·)Z, where Z is the unknown normalizing constant, and we describe conditions which ensure that

∫_{Nlf} h(x) dΠ(x) < ∞.
Two possible conditions are called local stability and Ruelle stability.
Definition 1.6.2. A non negative measurable function h on Nlf is locally stable if
∃K > 0 such that ∀x ∈ Nlf , ∀ξ ∈ S\x : h(x ∪ ξ) ≤ Kh(x)
and Ruelle stable if
∃K > 0, c > 0 such that ∀x ∈ Nlf : h(x) ≤ cKNx(S).
Local stability implies Ruelle stability which implies integrability of h [59].
Definition 1.6.3. We call processes that have a locally stable density with respect to Π
locally stable point processes.
Definition 1.6.4. Given a point process X that has density p(·) with respect to Π we define the Papangelou conditional intensity of X as
λ(x, ξ) = p(x ∪ {ξ}) / p(x),

taking λ(x, ξ) = 0 if p(x) = 0.
Notice that
• The local stability condition implies the existence of a uniform upper bound for the Papangelou conditional intensity.
• The Papangelou conditional intensity does not depend on the normalizing constant of the density p(·), which is unknown in most of the cases.
• Heuristically, the Papangelou conditional intensity λ(x, ξ) of a process X can be inter- preted as
λ(x, ξ)dξ = P(NX(dξ) = 1|X ∩ (dξ)C = x ∩ (dξ)C)
so as the probability of finding a point in an infinitesimal region dξ around ξ given that the point process agrees with the configuration x outside dξ.
Definition 1.6.5. Suppose we have a point process X with Papangelou conditional intensity given by λ(x, ξ). We say that X is attractive if
λ(x, ξ) ≤ λ(y, ξ) ∀x ⊆ y ∈ Nlf
and repulsive if λ(x, ξ) ≥ λ(y, ξ) ∀x ⊆ y ∈ Nlf .
Intuitively, attractivity means that the chance that ξ ∈ X, given that X\ξ = x, is an increas- ing function of x, while repulsivity means the opposite.
We now give the definition of the Strauss process.
Definition 1.6.6. We say that a point process X is a Strauss process with parameters θ = (β, γ, r0), where β > 0, 0 ≤ γ ≤ 1, r0 > 0, if X has density

pθ(x) = (1/Zθ) β^{Nx(S)} γ^{sr0(x)} (1.6.1)

with respect to the measure Π induced by a homogeneous Poisson process on S with intensity 1, where Zθ is the unknown normalizing constant and

sr0(x) = ∑_{{ξ1,ξ2}⊆x: ξ1≠ξ2} I[d(ξ1, ξ2) ≤ r0]
is the number of pairs of distinct points belonging to the point configuration x that have distance less than r0 from each other.
Proposition 1.6.1. The Strauss process is a locally stable, repulsive point process.
Proof. We have that the Papangelou conditional intensity of a Strauss process is equal to
λ(x, ξ) = β^{N(x∪ξ)(S)−Nx(S)} γ^{sr0(x∪ξ)−sr0(x)} = β γ^{tr0(x,ξ)}, ξ /∈ x,

where we have denoted

tr0(x, ξ) = sr0(x ∪ ξ)− sr0(x),

which is the number of points in the configuration x that have distance less than r0 from ξ. The local stability follows from the fact that

γ^{tr0(x,ξ)} ≤ 1, since 0 ≤ γ ≤ 1 and tr0(x, ξ) ≥ 0,
and the repulsivity from the fact that
tr0(x, ξ) ≤ tr0(y, ξ) if x ⊆ y.
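The Papangelou conditional intensity β γ^{tr0(x,ξ)} is cheap to evaluate numerically. A minimal sketch in R (the function name strauss_papangelou and the matrix representation of the configuration are our own illustrative conventions):

## Sketch: Papangelou conditional intensity of a Strauss process,
## lambda(x, xi) = beta * gamma^{t_{r0}(x, xi)} (Proposition 1.6.1).
## x is an n x 2 matrix of points, xi is a point given as a length-2 vector.
strauss_papangelou <- function(x, xi, beta, gamma, r0) {
  d <- sqrt((x[, 1] - xi[1])^2 + (x[, 2] - xi[2])^2)  # distances from xi to x
  t_r0 <- sum(d <= r0)                                # points of x within r0 of xi
  beta * gamma^t_r0                                   # note gamma^0 = 1 even for gamma = 0
}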
The normalizing constant Zθ is not explicitly known and its estimation, if needed, is not straightforward. We mention here a possible approximation used by Cressie and Lawson in [10], based on a Poisson approximation (see [55]):

Zθ ≈ exp( |W |(β − 1) + (β²|W |²/(2|W |)) kd r0^d (γ − 1) ). (1.6.2)
In Definition 1.6.6, β is called the intensity parameter, γ the interaction parameter and r0 the interaction radius. Realizations of the Strauss process have different characteristics depending on the values of these parameters (Figure 1.6.1). Typically, if γ is close to 0, the realizations look more regular than in the case in which γ is close to 1. Therefore the parameter γ will also be called regularity parameter. Consider the extreme cases. If γ = 0, since the density assumes values different from 0 only if sr0(x) = 0, we obtain the so called hardcore process, where points with distance less than r0 are prohibited. Instead, in the case γ = 1, we obtain
a Poisson process, which allows arbitrarily close points. Decreasing γ is not the only way to obtain a more regular pattern; another is to increase r0 while keeping the other parameters fixed. This highlights that the parameters r0 and γ are closely related to each other, and from a pattern it is not easy to tell whether, for example, r0 assumes a high value or γ a small one. This type of correlation can cause problems if estimates of the parameters of a Strauss process are needed. The parameter β is related to the intensity λ of the process. Note that λ cannot be computed explicitly, even if the values of the parameters β, γ and r0 are known. A possible approximation of λ, given the parameters of the process, was introduced by Baddeley and Nair in [3] and is given by

λ ≈ W0(βΓ)/Γ,

where W0 is the principal branch of Lambert's W function (see [9]) and Γ = −kd r0^d log γ.
In Figure 1.6.1 we show some realizations of the Strauss process in the observation window [0, 1] × [0, 1] with different values of the parameters r0 and γ, while the parameter β is fixed to 200. The two rows correspond to two different values of r0 and in every row we consider three different increasing values of γ.
The Strauss process is a pairwise interaction point process and in particular a Gibbs or Markov point process ([40]). Using the properties of Markov point processes, the definition of the Strauss process on a finite set S can be extended to Rd, e.g. by using the local specification characterization as in [40, page 95]. Such an extension is a stationary point process on Rd (the Poisson process with intensity 1 is stationary and tr0 is invariant under translations and rotations).
Figure 1.6.1: Simulations of Strauss processes with different values of the parameters γ and r0. In the first row r0 = 0.02, in the second r0 = 0.06. In the first column γ = 0, in the second γ = 0.3 and in the last column γ = 0.6.
The Strauss process on a finite set S can be simulated for example by using the Metropolis Hastings algorithm as described in Section 1.7.3.
1.7 The Metropolis Hastings algorithm
In this section we briefly introduce the Metropolis Hastings algorithm and show how to apply it to simulate locally stable point processes on a set S ⊂ Rd with |S| < ∞. A more detailed discussion of the topics of this section can be found in [59, Chapter 2].
We first give a short description of Markov chains which are discrete in time but have a general state space. Consider a measure space (Y,Y), with Y countably generated. A discrete-time homogeneous Markov chain on (Y,Y) is a process Yn characterized by an initial distribution ν on (Y,Y) and a transition kernel P : Y × Y → [0, 1] such that

P(Y0 ∈ A) = ν(A), A ∈ Y,
P(Yn ∈ A|Yn−1 = x) = P (x,A), A ∈ Y, x ∈ Y.

Definition 1.7.1. Let µ and ν be two measures on (Y,Y). The total variation norm between µ and ν is defined as

||µ(·)− ν(·)||v = sup_{A∈Y} |µ(A)− ν(A)|.
Definition 1.7.2. We say that the chain Yn converges in equilibrium to a measure π on (Y,Y) as n→∞ if

lim_{n→∞} ||Pn(x, ·)− π(·)||v = 0 for π-a.a. x ∈ Y,

where Pn : Y × Y → [0, 1] is the n-step transition probability, which satisfies

P(Yn ∈ A|Y0 = x) = Pn(x,A), A ∈ Y, x ∈ Y.
The Metropolis Hastings algorithm is an MCMC (Markov Chain Monte Carlo) method whose aim is to obtain a sample from a distribution with density π with respect to a measure µ defined on a measure space (Y,Y). Usually this algorithm is needed when π is known only up to a normalizing constant and direct sampling is therefore not available. The basic idea of the method is to simulate, for a sufficiently long time, a discrete-time Markov chain with state space Y whose equilibrium density is π.
1.7.1 The algorithm
The algorithm consists in building the following discrete-time Markov chain. Suppose that at the n-th iteration the chain is in the state x. The (n+ 1)-th step is built by
• proposing a new state y using a density q(y,x) (with respect to µ),
• accepting or refusing y as the new state of the (n + 1)-th iteration using the acceptance probability

α(y, x) = min{1, H(y, x)} if π(x)q(y, x) > 0, and α(y, x) = 1 if π(x)q(y, x) = 0,

where H(y, x) is called the Hastings ratio and is given by

H(y, x) = π(y)q(x, y) / (π(x)q(y, x)).
Note that H(y, x) depends on π only through ratios, so to apply this algorithm it is not necessary to know the normalizing constant of π.
1.7.2 Convergence of the algorithm
It can be proved that the Metropolis Hastings algorithm converges in equilibrium to the density π, provided the proposal density q(·, ·) is chosen so as to render the constructed Markov chain aperiodic and irreducible [59]. A good proposal density q

• is easy to implement in practice,

• has a high acceptance rate,

• provides good mixing of the chain, so that the whole range of states is visited "often" and not only a part of it,

• guarantees no cyclic behavior of the chain.
Notice that not only the convergence of the algorithm but also its rate of convergence depends on the choice of q; for example, a high rejection rate can make the convergence slower. The Metropolis Hastings algorithm gives us a way of sampling from a density π by running a chain for a number of iterations sufficient for the chain to have reached equilibrium. We have, however, to consider that

(i) If we want a multi-dimensional sample, the sample we obtain by running a single chain is not independent.

(ii) The density of the sample is only asymptotically equal to π.

(iii) We do not know the rate of convergence, i.e., for how many iterations the chain should be run before approximately reaching equilibrium.

Regarding the first point, one could run multiple independent chains, although this leads to a high computational cost. Another possibility is to thin the chain and keep its values only every k-th iteration, obtaining an approximately independent sample. Regarding the third point, since theoretical results are in general difficult to apply, in practice methods such as the ones introduced by Raftery and Lewis in [48] are used. These methods first run the algorithm in order to obtain one or more pilot samples; the number of iterations is then determined by applying convergence diagnostics to the pilot samples. To circumvent the third point one can also use an alternative to the Metropolis Hastings algorithm called dominated coupling from the past (DCFTP) [40]. Once it has converged, DCFTP gives an exact simulation of π. It can, however, happen that the algorithm takes a long time to converge.
1.7.3 Simulation of locally stable point processes
The Metropolis Hastings algorithm can be used to simulate locally stable processes which have a density p with respect to Π, where p is usually known only up to a normalizing constant. In this case the state space is (Y,Y) = (Nlf ,Nlf ). It is also possible to use the Metropolis Hastings algorithm to simulate from the conditional (on having n points) versions of those densities. Let us first consider the unconditional case. The proposal distribution q can be chosen as follows:
• propose a birth with probability q(x), where the new point u ∈ S is sampled from a density b(x, u) with respect to µ.
• propose a death of a preexisting point with probability 1− q(x), where the point ξ ∈ x to delete is sampled from a density d(x, ξ) on the point configuration x.
With this choice of q, the acceptance probability α for a proposed birth is

α(x ∪ u, x) = min{ 1, ( (1− q(x ∪ u)) p(x ∪ u) d(x ∪ u, u) ) / ( q(x) p(x) b(x, u) ) }, x ∈ Nlf , u ∈ S,

and the acceptance probability for a proposed death is given by the analogous reciprocal ratio. A standard choice is

q(x) = 1/2, b(x, u) = 1/|S|, d(x, ξ) = 1/Nx(S).
It can be proved that, under some conditions on b, d and q which are fulfilled by the previous choices, the algorithm converges to a distribution with the specified density p. For the conditional case, when the total number of points is fixed to n, the algorithm starts with a point pattern of n points and at each iteration proposes to replace an old point by a newly proposed point; for details see [40, page 108]. Two other possible ways to simulate locally stable processes are spatial birth and death processes [26] and dominated coupling from the past [40] (exact simulation). An exact simulation of the Strauss process in 2D can be obtained by using the function rStrauss of the R package spatstat. Both 2D and 3D simulations of the Strauss process using the Metropolis Hastings algorithm can be obtained with the function rstrauss from the R package rstrauss available at https://github.com/antiphon/rstrauss.
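To make the birth-death mechanism concrete, here is a hedged sketch of the sampler for a Strauss process on S = [0,1]² with the standard choices above (rstrauss_mh and close_pairs are our own illustrative names; for serious use the packages just mentioned are preferable):

## Hedged sketch: birth-death Metropolis Hastings for a Strauss process on
## S = [0,1]^2, with q(x) = 1/2, b(x, u) = 1/|S| and d(x, xi) = 1/N_x(S).
rstrauss_mh <- function(beta, gamma, r0, n_iter = 1e5) {
  x <- matrix(runif(20), ncol = 2)           # arbitrary starting configuration
  close_pairs <- function(p, pts)            # t_{r0}(pts, p): points within r0 of p
    sum(sqrt((pts[, 1] - p[1])^2 + (pts[, 2] - p[2])^2) <= r0)
  for (i in seq_len(n_iter)) {
    n <- nrow(x)
    if (runif(1) < 0.5) {                    # propose a birth at a uniform point u
      u <- runif(2)
      H <- beta * gamma^close_pairs(u, x) / (n + 1)   # Hastings ratio, |S| = 1
      if (runif(1) < min(1, H)) x <- rbind(x, u)
    } else if (n > 0) {                      # propose the death of a uniform point
      j <- sample.int(n, 1)
      H <- n / (beta * gamma^close_pairs(x[j, ], x[-j, , drop = FALSE]))
      if (runif(1) < min(1, H)) x <- x[-j, , drop = FALSE]
    }
  }
  x                                          # (approximate) Strauss realization
}
## Example call: pat <- rstrauss_mh(beta = 200, gamma = 0.3, r0 = 0.06)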
2 Directional Analysis
In this chapter we describe different methods for the directional analysis of a stationary point process; to exemplify them, they are applied to two simulated data sets, one regular and one clustered. Although the directional methods are introduced in the general case of a stationary point process X defined on Rd, special attention is given to their application to regular patterns subject to a particular type of anisotropy mechanism, called geometric anisotropy, which is described in Section 2.1. In Sections 2.3, 2.4 and 2.5 we describe the directional methods. In Section 2.2 we introduce the so-called Fry points, which will be important throughout the chapter.
2.1 Settings
Let X be a simple stationary point process on Rd with intensity λ and second order product density λ(2). Since we assume that X has no duplicate points, λ(2)(x, x), x ∈ Rd, is not well defined and is set equal to 0. We moreover assume that X is observed in a compact window W ⊂ Rd.
We now describe in detail, and introduce notation for, a particular type of anisotropy mechanism, which has been called geometric anisotropy in [36]. Let X0 be a stationary and isotropic point process and define the point process
X = TX0 = {Tx : x ∈ X0} (2.1.1)
where T : Rd → Rd is an invertible linear mapping, which corresponds to a d× d matrix also denoted by T . We assume here that det(T ) > 0. If det(T ) = 1, the transformation T is called volume preserving. T can be decomposed by using the singular value decomposition
T = R1CR2
where R1 and R2 correspond to rotations and C is a diagonal matrix with strictly positive entries. Since X0 is isotropic we have that
TX0 = R1CR2X0 ∼ R1CX0.
Therefore it is sufficient to consider a linear mapping T of the form
T = RC. (2.1.2)
The matrix C “rescales” X0 along the coordinate axes, whereas the matrix R rotates the deformed process CX0. The axes obtained by rotating the coordinate axes by R are called deformation axes of T .
The point process X obtained after the transformation is a stationary point process with intensity λX = det(T−1)λX0. If the matrix C is not a multiple of the identity matrix, X can be anisotropic. Note that, in the case where X0 is a stationary Poisson process, X remains a stationary Poisson process, only with a different intensity. Geometric anisotropy has already been considered in the literature with X0 clustered or regular, both in the 2D and in the 3D case, for real and simulated data. Regarding simulated data, the clustered case has been considered in 2D in [36] with log-Gaussian Cox processes and shot noise Cox processes, and in [22] and [66] with anisotropic Thomas processes. The regular case has been considered in [53] with Matern hard core processes (in 3D), in [66] with Gibbs hard core processes (in 2D) and in [52] with Strauss processes (both in 2D and in 3D). In this thesis we focus on the regular case in both 2D and 3D. As in [52], in our simulation study in Chapter 3 we consider realizations of Strauss processes. Let now X be a point process on R2 or R3, generated by the geometric anisotropy mechanism. Motivated by our application (Chapter 5) we assume T volume preserving. In 2D the scaling matrix C assumes the form (since det(T ) = 1)

C = ( c   0
      0   1/c ). (2.1.3)
We assume that the strength of compression satisfies 0 < c ≤ 1. In 3D the scaling matrix C assumes the form

C = ( c2/√c1   0     0
      0        c1    0
      0        0     1/(√c1 c2) ), (2.1.4)

where we assume that 0 < c1 ≤ c2/√c1, so that c2 ≥ c1√c1. We call c1 the strength of main compression and c2 the strength of additional compression. If c2 = 1 we have only one axis of compression; the other two deformation axes are elongated with equal strengths. If c1 = c2/√c1 we have one axis of elongation and two axes of compression which are deformed with equal strengths. In both cases T is a spheroidal transform. Let us now consider 0 < c < 1 in 2D and 0 < c1 < c2/√c1 in 3D. Given our (non-restrictive) assumptions on the order of the elements of the diagonal of C, in 2D the process is compressed along the image (under the rotation R) of the x-axis and dilated along the image of the y-axis. In 3D the process is compressed along the images of the y and x axes and dilated along the image of z. Since the compression along the image of y is stronger than the compression along x, we call the image of y the axis of main compression and the image of x the axis of additional compression. In 2D the deformation axes can be simply represented by the angle θ ∈ [0, π] that the axis of compression forms with the x-axis (counterclockwise). From now on we will call θ the direction of compression. The matrix R can be expressed as
R = ( cos θ   − sin θ
      sin θ   cos θ ). (2.1.5)
In 3D we denote the axes of deformation (in order: the axis of elongation, the axis of additional compression and the axis of main compression) by u1, u2, u3. The same notation will be used to denote the directions of the deformation axes with nonnegative z values, which belong to (S2)+. We call these directions the directions of deformation. In the d-dimensional case we extend the notation in the obvious way.
Besides geometric anisotropy, other anisotropy mechanisms could be taken into consideration. An example of anisotropic stationary point processes not generated by geometric anisotropy is given by Poisson processes (or, in general, stationary processes) with increased intensity along directed lines (see for example the Poisson line cluster point process (PLCPP) model in [35] and the models in [56]). These processes can be considered stationary if the distribution of the lines is stationary.
2.1.1 Aims
Given the assumption of geometric anisotropy, our specific aims are to

• estimate the rotation R, i.e., the axes of deformation;

• estimate the matrix C, i.e., the strength c in 2D and the strengths c1 and c2 in 3D.
In Sections 2.3.1, 2.4.1 and 2.5.1 we consider the estimation of R, while in Section 2.6 we consider the estimation of C.
2.1.2 Explicative examples
In this section we show two realizations of 2D point processes, one regular and the other clustered, which we will use to illustrate the basic ideas and the typical results of the considered directional methods. Both examples are constructed using the geometric anisotropy mechanism. For the regular case we chose X0 to be a Strauss process with fixed number of points n = 300 and parameters γ = 0, r0 = 0.04. For the clustered case we chose X0 to be a Matern cluster process with cluster radius 0.03, intensity of the Poisson process of cluster centers equal to 10, and an average of 40 points per cluster. For the simulation we used the function rMatClust of the R package spatstat. In both the clustered and the regular case we fixed R as the identity matrix, applying no rotation to X0, and we fixed the strength of compression c = 0.5. For details on how realizations of these processes can be obtained see Section 3.1; a code sketch of the transformation step is given below. The realization of the regular case is shown in the left plot of Figure 2.1.1 and the realization of the clustered case in the right plot.
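A hedged sketch of the construction X = TX0 with T = RC, here for the clustered example (rMatClust and affine are spatstat functions; the variable names are our own):

## Sketch: geometric anisotropy X = T X0 with T = R C (clustered example).
library(spatstat)
set.seed(1)
X0 <- rMatClust(kappa = 10, scale = 0.03, mu = 40)  # isotropic clustered X0
theta <- 0                                          # direction of compression
c_strength <- 0.5                                   # strength of compression
R <- matrix(c(cos(theta), sin(theta),
              -sin(theta), cos(theta)), nrow = 2)   # rotation matrix (2.1.5)
C <- diag(c(c_strength, 1 / c_strength))            # volume preserving (2.1.3)
X <- affine(X0, mat = R %*% C)                      # transformed pattern X = T X0
plot(X)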
Figure 2.1.1: Realization of the regular example (left) and of the clustered example (right).

In the clustered pattern, the axes of dilation and compression are visually detectable by looking at the shape of the clusters, which are elongated along the axis of elongation and compressed along the axis of compression. In the case of the regular pattern, the compression and dilation axes are not so clearly visible.
2.2 Fry points
In this section we introduce the so-called Fry points, which will be considered in all the following sections. Fry points were first introduced by Fry in [20]. We define the Fry points of a stationary point process X as
ZA := {y − x : x ≠ y, x ∈ A, y ∈ X}, A ∈ B0. (2.2.1)
In Equation (2.2.1) we need to restrict x to A ∈ B0 since, if all points of X on Rd were considered, ZA would not be locally finite. In practice, when observing X in a finite observation window W , we can only observe the pairwise difference vectors
ZW := Z := {y − x : x ≠ y, x, y ∈ XW }, (2.2.2)
which we also call Fry points. The set Z is symmetric with respect to the origin, since y − x and −(y − x) both belong to Z, and it is affected by edge effects. We denote the observation window of Z, which depends on W , by W ∗. From now on we concentrate only on the set Z.
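In spatstat the Fry points of an observed pattern are directly available; a short sketch:

## Sketch: computing and plotting the Fry points Z of a pattern.
library(spatstat)
X <- rMatClust(kappa = 10, scale = 0.03, mu = 40)  # any observed pattern
Z <- frypoints(X)                                  # point pattern of differences y - x
fryplot(X, pch = ".")                              # Fry plot centred at the origin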
In the next sections we will see that the Fry points can be exploited to analyze anisotropy in stationary point processes. Moreover, due to their structure, the Fry points are useful for visualizing anisotropy both in 2D and in 3D (Section 4.4). Let us first look at the properties of the Fry points Z under isotropy. If X is isotropic and if W = Br(c), r ∈ R+, c ∈ Rd, we will prove that the distribution of the Fry points Z is rotationally symmetric with respect to rotations about the origin. The condition on W is necessary since it implies that the window W ∗ is also a ball and therefore invariant under rotations. If W is not a ball we can always restrict ourselves to the biggest ball contained in W . If R0 ∈ SOn we can write
R0Z = {R0y −R0x : x ≠ y, x, y ∈ XW } = {y − x : x ≠ y, x, y ∈ R0(XW )}. (2.2.3)
R0(XW ) (1)∼ (R0X)W (2)∼ XW , (2.2.4)
where in (1) we exploited the fact that W is a ball and the stationarity of X and in (2) the isotropy of X. From Equation (2.2.3) and Equation (2.2.4) we can easily derive that
R0Z ∼ Z ∀R0 ∈ SOn. (2.2.5)
Since Definition (2.2.2) considers X restricted to the observation window W , estimates involving X and Z are both affected by edge effects. For instance, in W smaller distances between points are more likely to be observed than larger ones. Edge effects can be treated in different ways. In estimations involving Z, the translation edge correction weights already introduced in Equation (1.5.13) are particularly useful, since they provide unbiased estimators based only on XW . Another possible edge treatment is given by the so-called minus-sampling. In this case only the differences
{y − x, ||y − x|| < dist(x, ∂W )} (2.2.6)