
SITE CHARACTERIZATION & INSTRUMENTATION, MODULE 8

Dr. P. Anbazhagan

    Module 8: Numerical methods

    Topics:

    Introduction

    Kriging

    Artificial neural networks (ANN)

    Triangulation with linear interpolation

    Natural neighbour

    Inverse distance

    Minimum curvature

    Regression by plane with weights

    Radial basis functions

    Keywords: Kriging, variogram, ANN, Interpolation methods

    8.1 Introduction:

    Surface interpolation and construction of maps have been traditionally used in many

    fields such as physics, geophysics, geology, geodesy, hydrology, meteorology and so

    on. The goal of this module is to present commonly used techniques for solving

interpolation/approximation problems and to evaluate their applicability for solving practical tasks in site characterization. The interpolation/approximation methods presented below are:

    Kriging

    Artificial neural networks (ANN)

    Triangulation with linear interpolation

    Natural neighbour

    Inverse distance

    Minimum curvature

    Regression by plane with weights

    Radial basis functions

    8.2 Kriging method

    8.2.1 Introduction:

Geo-statistics is a scientific approach to estimation problems in geology and mining. It

    is a branch of statistics dealing with spatial phenomena modelled by random

    functions.

Today, geo-statistics is no longer restricted to this kind of application. It is applied in

    disciplines such as hydrology, meteorology, oceanography, geography, forestry,

environmental monitoring, landscape ecology, agriculture, and the study of ecosystem geography and dynamics.


    Underlying each geo-statistical method is the notion of random function. A random

    function describes a given spatial phenomenon over a domain. It consists of a set of

    random variables, each of which describes the phenomenon at some location of the

    domain.

    In most geo-statistical methods, the dependencies between the random variables are

    preferably described by a variogram. The variogram depicts the variance of the

    increments of the quantity of interest as a function of the distance between sites.

    By far, kriging is the most popular geo-statistical method. The aim of kriging is to

    predict the phenomenon at unobserved sites. This is the problem of spatial estimation,

    sometimes called spatial prediction. Examples of spatial phenomena estimations are

    soil nutrient or pollutant concentrations over a field observed on a survey grid,

    hydrologic variables over an aquifer observed at well locations, and air quality

    measurements over an air basin observed at monitoring sites.

    8.2.2 Kriging:

In the real world, it is impossible to get exhaustive values of data at every desired point

    because of practical constraints. Thus, interpolation is important and fundamental to

    graphing, analysing and understanding of 2D data.

Interpolation is the estimation of a variable at an unmeasured location from observed values at surrounding locations.

    The word "kriging" is synonymous with "optimal prediction". It is a method of

    interpolation, which predicts unknown values from data observed at known locations.

This method uses a variogram to express the spatial variation, and it minimizes the error of the predicted values, which are estimated from the spatial distribution of the predicted values.

    Kriging is optimal interpolation based on regression against observed z values of

    surrounding data points, weighted according to spatial covariance values.

The term kriging was coined by Matheron in honor of D. G. Krige, who published an early account of this technique. In its simplest form, a kriging estimate of the field at

    an unobserved location is an optimized linear combination of the data at observed

    locations.

    A full application of a kriging method involves different steps:

1. First, a structural analysis is performed: usual statistical tools such as histograms and empirical cumulative distributions can be used in conjunction with an analysis of the sample variogram (a sketch of computing a sample variogram follows this list).

2. In place of the sample variogram, which does not respect suitable mathematical properties, a theoretical variogram is chosen. The theoretical variogram model is fitted to the sample variogram, informed by the structural analysis.

    3. Finally, from this variogram specification, the kriging estimate is computed at the

    location of interest by solving a system of linear equations of the least squares

    type.
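As a rough illustration of step 1, the sketch below computes a sample (experimental) semi-variogram from scattered observations with NumPy; the lag binning scheme and the synthetic data are assumptions made for illustration, not values taken from the text.

```python
import numpy as np

def sample_variogram(coords, values, lags, tol):
    """Empirical semi-variogram: gamma(h) = 0.5 * mean[(z_i - z_j)^2]
    over all point pairs whose separation falls within h +/- tol."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    sq = (values[:, None] - values[None, :]) ** 2
    iu = np.triu_indices(len(values), k=1)      # count each pair once
    d, sq = d[iu], sq[iu]
    gamma = []
    for h in lags:
        mask = np.abs(d - h) <= tol
        gamma.append(0.5 * sq[mask].mean() if mask.any() else np.nan)
    return np.array(gamma)

# illustrative use with synthetic data (assumed, not from the module)
rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(200, 2))     # e.g. borehole locations (m)
values = np.sin(coords[:, 0] / 20) + 0.1 * rng.standard_normal(200)
lags = np.arange(5, 60, 5.0)
print(sample_variogram(coords, values, lags, tol=2.5))
```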


    8.2.3 Advantages of kriging:

    1. Helps to compensate for the effects of data clustering, assigning individual points

    within a cluster less weight than isolated data points (or, treating clusters more like

    single points)

    2. Gives estimate of estimation error (kriging variance), along with estimate of the

    variable, Z, itself (but error map is basically a scaled version of a map of distance to

    nearest data point, so not that unique)

    3. Availability of estimation error provides basis for stochastic simulation of possible

    realizations of Z(u).

    8.2.4 Kriging Approach and Terminology:

All kriging estimators are but variants of the basic linear regression estimator Z*(u) defined as

Z*(u) − m(u) = Σ_{α=1}^{n(u)} λ_α(u) [Z(u_α) − m(u_α)]

with:

u, u_α: location vectors for the estimation point and one of the neighbouring data points, indexed by α

n(u): number of data points in the local neighbourhood used for estimation of Z*(u)

m(u), m(u_α): expected values (means) of Z(u) and Z(u_α)

λ_α(u): kriging weight assigned to datum Z(u_α) for estimation location u; the same datum will receive a different weight for a different estimation location.

Z(u) is treated as a random field with a trend component, m(u), and a residual component, R(u) = Z(u) − m(u). Kriging estimates the residual at u as a weighted sum of the residuals at surrounding data points. The kriging weights, λ_α, are derived from the covariance function or semi-variogram, which should characterize the residual component. The distinction between trend and residual is somewhat arbitrary and varies with scale.

    Basics of kriging:

The basic form of the kriging estimator is:

Z*(u) − m(u) = Σ_{α=1}^{n(u)} λ_α(u) [Z(u_α) − m(u_α)]

The goal is to determine the weights λ_α that minimize the variance of the estimator,

σ²_E(u) = Var{Z*(u) − Z(u)},

under the unbiasedness constraint E{Z*(u) − Z(u)} = 0.

The random field (RF) Z(u) is decomposed into residual and trend components, Z(u) = R(u) + m(u), with the residual component treated as an RF with a stationary mean of 0 and a stationary covariance (a function of lag, h, but not of position, u):

E{R(u)} = 0

Cov{R(u), R(u + h)} = E{R(u) · R(u + h)} = C_R(h)

The residual covariance function is generally derived from the input semi-variogram model, C_R(h) = C_R(0) − γ(h) = Sill − γ(h).

    Thus, the semi-variogram we feed to a kriging program should represent the residual

    component of the variable. The three main kriging variants,


    1. Simple,

    2. Ordinary, and

    3. Kriging with a trend,

differ in their treatment of the trend component, m(u).

    Simple kriging:

For simple kriging, we assume that the trend component is a constant and known mean, m(u) = m, so that

Z*_SK(u) = m + Σ_{α=1}^{n(u)} λ_α^SK(u) [Z(u_α) − m]

This estimate is automatically unbiased, since E[Z(u_α) − m] = 0, so that E[Z*_SK(u)] = m = E[Z(u)]. The estimation error Z*_SK(u) − Z(u) is a linear combination of random variables representing residuals at the data points, u_α, and the estimation point, u:

Z*_SK(u) − Z(u) = [Z*_SK(u) − m] − [Z(u) − m] = Σ_{α=1}^{n(u)} λ_α^SK(u) R(u_α) − R(u) = R*_SK(u) − R(u)

Using rules for the variance of a linear combination of random variables, the error variance is then given by

σ²_E(u) = Var{R*_SK(u)} + Var{R(u)} − 2 Cov{R*_SK(u), R(u)}

= Σ_{α=1}^{n(u)} Σ_{β=1}^{n(u)} λ_α^SK(u) λ_β^SK(u) C_R(u_α − u_β) + C_R(0) − 2 Σ_{α=1}^{n(u)} λ_α^SK(u) C_R(u_α − u)

To minimize the error variance, we take the derivative of the above expression with respect to each of the kriging weights and set each derivative to zero. This leads to the following system of equations:

Σ_{β=1}^{n(u)} λ_β^SK(u) C_R(u_α − u_β) = C_R(u_α − u),  α = 1, 2, ..., n(u)

Because of the constant mean, the covariance function for Z(u) is the same as that for the residual component, C(h) = C_R(h), so that we can write the simple kriging system directly in terms of C(h):

Σ_{β=1}^{n(u)} λ_β^SK(u) C(u_α − u_β) = C(u_α − u),  α = 1, 2, ..., n(u)

This can be written in matrix form as

K λ_SK(u) = k

where K is the matrix of covariances between data points, with elements K_ij = C(u_i − u_j), k is the vector of covariances between the data points and the estimation point, with elements k_i = C(u_i − u), and λ_SK(u) is the vector of simple kriging weights for the surrounding data points. If the covariance model is licit (meaning the underlying semi-variogram model is licit) and no two data points are co-located, then the data covariance matrix is positive definite and we can solve for the kriging weights using

λ_SK = K⁻¹ k

Once we have the kriging weights, we can compute both the kriging estimate and the kriging variance, which is given by

σ²_SK(u) = C(0) − λ_SK^T(u) k = C(0) − Σ_{α=1}^{n(u)} λ_α^SK(u) C(u_α − u)

    after substituting the kriging weights into the error variance expression above. All this

    math finds a set of weights for estimating the variable value at the location u from

    values at a set of neighbouring data points. The weight on each data point generally

    decreases with increasing distance to that point, in accordance with the decreasing

    data-to-estimation covariances specified in the right-hand vector, k. However, the set

    of weights is also designed to account for redundancy among the data points,

represented in the data point-to-data point covariances in the matrix K. Multiplying k by K⁻¹ (on the left) will downweight points falling in clusters relative to isolated points at the same distance.
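As an illustration of the system above, the following minimal NumPy sketch solves λ_SK = K⁻¹k for a handful of points, then forms the estimate and kriging variance. The exponential covariance model C(h) = sill·exp(−h/range), the known mean m and all numerical values are illustrative assumptions, not values prescribed by the text.

```python
import numpy as np

def cov(h, sill=1.0, rng=30.0):
    # assumed exponential covariance model C(h) = sill * exp(-h / range)
    return sill * np.exp(-h / rng)

def simple_kriging(coords, z, u, m, sill=1.0, rng=30.0):
    """Simple kriging estimate and variance at location u, with known mean m."""
    d_data = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d_est = np.linalg.norm(coords - u, axis=-1)
    K = cov(d_data, sill, rng)            # data-to-data covariances
    k = cov(d_est, sill, rng)             # data-to-estimation covariances
    lam = np.linalg.solve(K, k)           # lambda_SK = K^-1 k
    z_star = m + lam @ (z - m)            # Z*_SK(u) = m + sum lam_a (Z_a - m)
    var = cov(0.0, sill, rng) - lam @ k   # sigma^2_SK(u) = C(0) - lam^T k
    return z_star, var

coords = np.array([[10.0, 20.0], [30.0, 25.0], [22.0, 40.0]])   # illustrative
z = np.array([1.2, 0.7, 0.9])
print(simple_kriging(coords, z, u=np.array([25.0, 30.0]), m=1.0))
```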

    Ordinary kriging:

For ordinary kriging, rather than assuming that the mean is constant over the entire domain, we assume that it is constant in the local neighbourhood of each estimation point, that is, that m(u_α) = m(u) for each nearby data value Z(u_α) that we are using to estimate Z(u). In this case, the kriging estimator can be written

Z*(u) = m(u) + Σ_{α=1}^{n(u)} λ_α(u) [Z(u_α) − m(u)]

= Σ_{α=1}^{n(u)} λ_α(u) Z(u_α) + [1 − Σ_{α=1}^{n(u)} λ_α(u)] m(u)

and we filter the unknown local mean by requiring that the kriging weights sum to 1, leading to an ordinary kriging estimator of

Z*_OK(u) = Σ_{α=1}^{n(u)} λ_α^OK(u) Z(u_α)  with  Σ_{α=1}^{n(u)} λ_α^OK(u) = 1

In order to minimize the error variance subject to the unit-sum constraint on the weights, we actually set up the system to minimize the error variance plus an additional term involving a Lagrange parameter, μ_OK(u):

L = σ²_E(u) + 2 μ_OK(u) [1 − Σ_{α=1}^{n(u)} λ_α(u)]

so that minimization with respect to the Lagrange parameter forces the constraint to be obeyed:

(1/2) ∂L/∂μ_OK(u) = 1 − Σ_{α=1}^{n(u)} λ_α(u) = 0

In this case, the system of equations for the kriging weights turns out to be

Σ_{β=1}^{n(u)} λ_β^OK(u) C_R(u_α − u_β) + μ_OK(u) = C_R(u_α − u),  α = 1, ..., n(u)

Σ_{β=1}^{n(u)} λ_β^OK(u) = 1

where C_R(h) is once again the covariance function for the residual component of the

variable. In simple kriging, we could equate C_R(h) and C(h), the covariance function

    for the variable itself, due to the assumption of a constant mean. That equality does

    not hold here, but in practice the substitution is often made anyway, on the

    assumption that the semi-variogram, from which C(h) is derived, effectively filters the

    influence of large-scale trends in the mean.

    In fact, the unit-sum constraint on the weights allows the ordinary kriging system to

be stated directly in terms of the semi-variogram (in place of the C_R(h) values above).

    In a sense, ordinary kriging is the interpolation approach that follows naturally from a

semi-variogram analysis, since both tools tend to filter trends in the mean.

Once the kriging weights (and Lagrange parameter) are obtained, the ordinary kriging error variance is given by

σ²_OK(u) = C(0) − Σ_{α=1}^{n(u)} λ_α^OK(u) C(u_α − u) − μ_OK(u)
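A common way to implement the ordinary kriging system above is to solve the augmented matrix equation that bundles the covariances, the unit-sum constraint and the Lagrange parameter. The sketch below does this with NumPy; the exponential covariance model and the sample values are illustrative assumptions.

```python
import numpy as np

def cov(h, sill=1.0, rng=30.0):
    # assumed exponential covariance model (illustrative)
    return sill * np.exp(-h / rng)

def ordinary_kriging(coords, z, u, sill=1.0, rng=30.0):
    """Ordinary kriging: solve the augmented system [C 1; 1^T 0] for the
    weights and the Lagrange parameter, then return estimate and variance."""
    n = len(z)
    d_data = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d_est = np.linalg.norm(coords - u, axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov(d_data, sill, rng)     # data-to-data covariances
    A[n, n] = 0.0                          # corner of the constraint row/column
    b = np.ones(n + 1)
    b[:n] = cov(d_est, sill, rng)          # data-to-estimation covariances
    sol = np.linalg.solve(A, b)
    lam, mu = sol[:n], sol[n]
    z_star = lam @ z                       # weights sum to 1 by construction
    var = cov(0.0, sill, rng) - lam @ b[:n] - mu   # sigma^2_OK(u)
    return z_star, var

coords = np.array([[10.0, 20.0], [30.0, 25.0], [22.0, 40.0], [35.0, 38.0]])
z = np.array([1.2, 0.7, 0.9, 1.1])
print(ordinary_kriging(coords, z, u=np.array([25.0, 30.0])))
```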

    Kriging with a Trend:

    Kriging with a trend (the method formerly known as universal kriging) is much like

    ordinary kriging, except that instead of fitting just a local mean in the neighbourhood

    of the estimation point, we fit a linear or higher-order trend in the (x,y) coordinates of

    the data points. A local linear (a.k.a., first-order) trend model would be given by

m(u) = m(x, y) = a_0 + a_1 x + a_2 y

    Including such a model in the kriging system involves the same kind of extension as

    we used for ordinary kriging, with the addition of two more Lagrange parameters and

    two extra columns and rows in the K matrix whose (non-zero) elements are the x and

    y coordinates of the data points. Higher-order trends (quadratic, cubic) could be

    handled in the same way, but in practice it is rare to use anything higher than a first-

    order trend. Ordinary kriging is kriging with a zeroth-order trend model.

    If the variable of interest does exhibit a significant trend, a typical approach would be

    to attempt to estimate a de-trended semi-variogram using one of the methods

    described in the Semi-variogram lecture and then feed this into kriging with a first

    order trend. However, Goovaerts (1997) warns against this approach and instead

    recommends performing simple kriging of the residuals from a global trend (with a

    constant mean of 0) and then adding the kriged residuals back into the global trend.


    Co-kriging:

Co-kriging is kriging that uses information from one or more correlated secondary variables, or multivariate kriging in general. It requires the development of models for the cross-covariance, that is, the covariance between two different variables as a function of lag.

    Indicator Kriging:

It is the kriging of indicator variables, which represent membership in a set of categories. It is used with naturally categorical variables such as facies, or with continuous variables that have been thresholded into categories (e.g., quartiles, deciles). It is especially useful for preserving the connectedness of high- and low-permeability regions.

    8.3 Artificial Neural Networks

    8.3.1 Introduction:

    Artificial neural networks (ANNs) are a form of artificial intelligence, which attempts

    to mimic the function of the human brain and nervous system. ANNs learn from data

    examples presented to them in order to capture the subtle functional relationships

    among the data even if the underlying relationships are unknown or the physical

    meaning is difficult to explain. ANNs are thus well suited to modelling the complex

    behaviour of most geotechnical engineering materials, which by their very nature,

    exhibit extreme variability.

    Geotechnical properties of soils are controlled by factors such as mineralogy, fabric

    and pore water, and the interactions of these factors are difficult to establish solely by

    traditional statistical methods due to their interdependence. Based on the application

    of ANNs, methodologies have been developed for estimating several soil properties

    including the pre-consolidation pressure, shear strength and stress history, compaction

    and permeability, soil classification and soil density.

Problems such as liquefaction during earthquakes and settlement of shallow foundations can also be modelled using the ANN method.

    8.3.2 Overview of Artificial Neural Network:

    ANNs consist of a number of artificial neurons variously known as processing

elements (PEs), nodes or units. For multilayer perceptrons (MLPs), which are the most commonly used ANNs in geotechnical engineering, processing elements are

    usually arranged in layers:

    1) An input layer,

    2) An output layer and

    3) One or more intermediate layers called hidden layers.

Figure 8.1 shows a typical multi-layer ANN arrangement.


Each processing element in a specific layer is fully or partially connected to many

    other processing elements via weighted connections. The scalar weights determine the

    strength of the connections between interconnected neurons.

    A zero weight refers to no connection between two neurons and a negative weight

    refers to a prohibitive relationship. From many other processing elements, an

    individual processing element receives its weighted inputs, which are summed and a

    bias unit or threshold is added or subtracted.

    Figure 8.1: A typical multi-layer ANN showing the input layer for ten different inputs,

    the middle or hidden layer(s), and the output layer having three outputs

    The bias unit is used to scale the input to a useful range to improve the convergence

    properties of the neural network. The result of this combined summation is passed

    through a transfer function (e.g. logistic sigmoid or hyperbolic tangent) to produce the

    output of the processing element.

Figure 8.2 shows an ANN with a hidden layer.

For node j, this process is summarized as:

I_j = θ_j + Σ_i w_ji x_i

y_j = f(I_j)

where

I_j = the activation level of node j;

w_ji = the connection weight between nodes j and i;

x_i = the input from node i, i = 0, 1, ..., n;

θ_j = the bias or threshold for node j;

y_j = the output of node j; and

f(·) = the transfer function.

    Figure 8.2: ANN with a hidden layer
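A minimal sketch of this forward pass for a single processing element, assuming a logistic sigmoid transfer function (one of the options mentioned above); the weights, inputs and threshold are illustrative values, not data from the text.

```python
import numpy as np

def node_output(x, w, theta):
    """Forward pass of one processing element: I_j = theta_j + sum_i w_ji x_i,
    followed by y_j = f(I_j) with a logistic sigmoid transfer function."""
    I = theta + np.dot(w, x)           # weighted sum of inputs plus bias/threshold
    return 1.0 / (1.0 + np.exp(-I))    # logistic sigmoid f(I)

x = np.array([0.2, -0.5, 0.8])         # inputs from the previous layer (assumed)
w = np.array([0.4, 0.1, -0.3])         # connection weights w_ji (assumed)
print(node_output(x, w, theta=0.05))
```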

    The propagation of information in MLPs starts at the input layer where the input data

are presented. The inputs are weighted and received by each node in the next layer. The weighted inputs are then summed and passed through a transfer function to produce the nodal output, which is weighted and passed to processing elements in the

    next layer. The network adjusts its weights on presentation of a set of training data

    and uses a learning rule until it can find a set of weights that will produce the input-

    output mapping that has the smallest possible error. The above process is known as

    learning or training.

    Learning in ANNs is usually divided into supervised and unsupervised. In supervised

    learning, the network is presented with a historical set of model inputs and the

    corresponding (desired) outputs. The actual output of the network is compared with

    the desired output and an error is calculated. This error is used to adjust the

    connection weights between the model inputs and outputs to reduce the error between

    the historical outputs and those predicted by the ANN.

    In unsupervised learning, the network is only presented with the input stimuli and

    there are no desired outputs. The network itself adjusts the connection weights

    according to the input values. The idea of training in unsupervised networks is to

    cluster the input records into classes of similar features.

    ANNs can be categorized on the basis of two major criteria:

    1) The learning rule used, and

    2) The connections between processing elements.

    Based on learning rules, ANNs, as mentioned above, can be divided into supervised

    and unsupervised networks. Based on connections between processing elements,


    ANNs can be divided into feed-forward and feedback networks. In feed forward

    networks, the connections between the processing elements are in the forward

    direction only, whereas, in feedback networks, connections between processing

    elements are in both the forward and backward directions.

    8.3.3 Modelling issues in Artificial Neural Networks:

    In order to improve performance, ANN models need to be developed in a systematic

    manner. Such an approach needs to address major factors such as the determination of

    adequate model inputs, data division and pre-processing, the choice of suitable

    network architecture, careful selection of some internal parameters that control the

    optimization method, the stopping criteria and model validation. These factors are

    explained and discussed below.

    Determination of model inputs:

    An important step in developing ANN models is to select the model input variables

    that have the most significant impact on model performance. A good subset of input

    variables can substantially improve model performance. Presenting as large a number

    of input variables as possible to ANN models usually increases network size, resulting

    in a decrease in processing speed and a reduction in the efficiency of the network.

    A number of techniques have been suggested to assist with the selection of input

    variables. An approach that is usually utilized in the field of geotechnical engineering

    is that appropriate input variables can be selected in advance based on a priori

    knowledge. Another approach used is to train many neural networks with different

    combinations of input variables and to select the network that has the best

    performance.

    A step-wise technique described by Maier and Dandy can also be used in which

    separate networks are trained, each using only one of the available variables as model

    inputs. The network that performs the best is then retained, combining the variable

    that results in the best performance with each of the remaining variables. This process

is repeated for an increasing number of input variables, until the addition of further variables results in no improvement in model performance.

    Another useful approach is to employ a genetic algorithm to search for the best sets of

    input variables. For each possible set of input variables chosen by the genetic

algorithm, a neural network is trained and used to rank different subsets of possible

    inputs. A set of input variables derives its fitness from the model error obtained based

    on those variables.

    A potential shortcoming of the above approaches is that they are model-based. In

    other words, the determination as to whether a parameter input is significant or not is

    dependent on the error of a trained model, which is not only a function of the inputs,

    but also model structure and calibration. This can potentially obscure the impact of

    different model inputs.


    In order to overcome this limitation, model-free approaches can be utilized, which use

    linear dependence measures, such as correlation, or non-linear measures of

    dependence, such as mutual information, to obtain the significant model inputs prior

    to developing the ANN models.

    Division of data:

    ANNs perform best when they do not extrapolate beyond the range of the data used

    for calibration. Therefore, the purpose of ANNs is to non-linearly interpolate

    (generalize) in high-dimensional space between the data used for calibration. ANN

    models generally have a large number of model parameters (connection weights) and

    can therefore over-fit the training data.

    In other words, if the number of degrees of freedom of the model is large compared

    with the number of data points used for calibration, the model might no longer fit the

    general trend, as desired. Consequently, a separate validation set is needed to ensure

    that the model can generalize within the range of the data used for calibration. It is

common practice to divide the available data into two subsets: a training set, to

    construct the neural network model, and an independent validation set to estimate the

    model performance in a deployed environment.

    Usually, two-thirds of the data are suggested for model training and one-third for

    validation. A modification of the above data division method is cross-validation in

    which the data are divided into three sets: training, testing and validation. The training

    set is used to adjust the connection weights, whereas the testing set is used to check

    the performance of the model at various stages of training and to determine when to

    stop training to avoid over-fitting. The validation set is used to estimate the

    performance of the trained network in the deployed environment.
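One possible way to carry out such a three-way division with scikit-learn is sketched below. The roughly two-thirds/one-third style proportions and the synthetic arrays are assumptions made for illustration; the module does not prescribe a specific split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# X: candidate input variables, y: outputs; synthetic placeholders for illustration
rng = np.random.default_rng(0)
X, y = rng.random((300, 5)), rng.random(300)

# first split off an independent validation set (roughly one third), then split
# the remainder into training and testing sets for cross-validation during training
X_rest, X_val, y_rest, y_val = train_test_split(X, y, test_size=1/3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X_rest, y_rest, test_size=0.25,
                                                    random_state=0)
print(len(X_train), len(X_test), len(X_val))
```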

In many situations, the available data set is so small that it needs to be devoted solely to model training, and collecting any more data for validation is difficult.

    leave-k-out method can be used which involves holding back a small fraction of the

    data for validation and using the rest of the data for training. After training, the

performance of the trained network has to be estimated with the aid of the validation set. A different small subset of data is then held back and the network is trained and tested again. This process is repeated many times with different subsets until an optimal

    model can be obtained from the use of all of the available data.

    In the majority of ANN applications in geotechnical engineering, the data are divided

    into their subsets on an arbitrary basis. However, recent studies have found that the

    way the data are divided can have a significant impact on the results obtained. As

    ANNs have difficulty extrapolating beyond the range of the data used for calibration,

    in order to develop the best ANN model, given the available data, all of the patterns

    that are contained in the data need to be included in the calibration set.


    Data Pre-processing:

    Once the available data have been divided into their subsets (i.e. training, testing and

    validation), it is important to pre-process the data in a suitable form before they are

    applied to the ANN. Data pre-processing is necessary to ensure all variables receive

    equal attention during the training process.

    Pre-processing can be in the form of data scaling, normalization and transformation.

Scaling the output data is essential, as they have to be commensurate with the limits of

    the transfer functions used in the output layer. Scaling the input data is not necessary

    but it is almost always recommended. In some cases, the input data need to be

    normally distributed in order to obtain optimal results.

    Transforming the input data into some known forms may be helpful to improve ANN

    performance. However, empirical trials showed that the model fits were the same,

    regardless of whether raw or transformed data were used.
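A hedged example of such scaling, using scikit-learn's MinMaxScaler fitted on the training subset only and then applied unchanged to the other subsets; the feature range and the placeholder arrays are assumptions of this sketch.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(1)
X_train = rng.random((200, 4)) * 50        # placeholder training inputs
X_test = rng.random((50, 4)) * 50          # placeholder testing inputs

# scale inputs into the working range of the transfer function; fitting on the
# training data only avoids leaking information from the other subsets
scaler = MinMaxScaler(feature_range=(0, 1)).fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)
print(X_train_s.min(), X_train_s.max())
```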

    Determination of Model Architecture:

    Determining the network architecture is one of the most important and difficult tasks

    in ANN model development. It requires the selection of the optimum number of

    layers and the number of nodes in each of these. For MLPs, there are always two

    layers representing the input and output variables in any neural network. It has been

    shown that one hidden layer is sufficient to approximate any continuous function

    provided that sufficient connection weights are given.

Following several contradictory findings, Lapedes and Farber (1988) provided a more practical proof that two hidden layers are sufficient: the first hidden layer is used to extract the local features of the input patterns, while the second hidden layer is useful to extract the global features of the training patterns. However, Masters (1993) stated that using

    more than one hidden layer often slows the training process dramatically and

    increases the chance of getting trapped in local minima.

    The number of nodes in the input and output layers is restricted by the number of

    model inputs and outputs, respectively. It has been shown in the literature that neural

    networks with a large number of free parameters (connection weights) are more

    subject to over-fitting and poor generalization. Consequently, keeping the number of

    hidden nodes to a minimum, provided that satisfactory performance is achieved, is

    always better, as it:

    o Reduces the computational time needed for training;

    o Helps the network achieve better generalization performance;

    o Helps avoid the problem of over-fitting and

    o Allows the trained network to be analysed more easily.

    For single hidden layer networks, there are a number of rules-of-thumb to obtain the

    best number of hidden layer nodes. Hecht-Nielsen and Caudill suggested that the

    upper limit of the number of hidden nodes in a single layer network may be taken as

    (2I+1), where I is the number of inputs. The best approach found by Nawari et al.


    (1999) was to start with a small number of nodes and to slightly increase the number

    until no significant improvement in model performance is achieved.

    For networks with two hidden layers, the geometric pyramid rule described by Nawari

    et al. (1999) can be used. The notion behind this method is that the number of nodes

    in each layer follows a geometric progression of a pyramid shape, in which the

    number of nodes decreases from the input layer towards the output layer. Kudrycki

    found empirically that the optimum ratio of the first to second hidden layer nodes is

    3:1, even for high dimensional inputs.
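The two rules of thumb quoted above can be written as small helper functions, as sketched below; treating the 2I + 1 upper limit and the 3:1 first-to-second hidden layer ratio as independent guidelines, and the choice of total hidden-node budget, are assumptions of this sketch.

```python
def single_layer_upper_limit(n_inputs):
    # Hecht-Nielsen / Caudill rule of thumb: at most 2I + 1 hidden nodes
    return 2 * n_inputs + 1

def two_layer_split(total_hidden):
    # Kudrycki's empirical 3:1 ratio of first to second hidden layer nodes
    # (split is approximate when the total is not divisible by 4)
    second = max(1, round(total_hidden / 4))
    return total_hidden - second, second

print(single_layer_upper_limit(10))   # e.g. 10 inputs -> upper limit of 21 nodes
print(two_layer_split(20))            # e.g. 20 hidden nodes -> (15, 5)
```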

    Another way of determining the optimal number of hidden nodes that can result in

    good model generalization and avoid over-fitting is to relate the number of hidden

    nodes to the number of available training samples (Maier and Dandy, 2000).

    A number of systematic approaches have also been proposed to obtain automatically

    the optimal network architecture. The adaptive method of architecture determination

    is an example of the automatic methods for obtaining the optimal network architecture

    that suggests starting with an arbitrary, but small, number of nodes in the hidden

    layers.

    During training, and as the network approaches its capacity, new nodes are added to

    the hidden layers, and new connection weights are generated. Training is continued

    immediately after the new hidden nodes are added to allow the new connection

    weights to acquire the portion of the knowledge base, which was not stored in the old

    connection weights. The above steps are repeated and new hidden nodes are added as

    needed to the end of the training process, in which the appropriate network

    architecture is automatically determined.

    Model Optimization (Training):

    As mentioned previously, the process of optimizing the connection weights is known

    as training or learning. The aim is to find a global solution to what is typically a

    highly non-linear optimization problem. The method most commonly used for finding

    the optimum weight combination of feed-forward MLP neural networks is the back-

    propagation algorithm which is based on first-order gradient descent.

The use of global optimization methods, such as simulated annealing and genetic algorithms, has also been proposed. The advantage of these methods is that they

    have the ability to escape local minima in the error surface and, thus, produce optimal

    or near optimal solutions. However, they also have a slow convergence rate.

    Ultimately, the model performance criteria, which are problem specific, will dictate

    which training algorithm is most appropriate.
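As one possible realisation, the sketch below trains a single-hidden-layer MLP by gradient-descent back-propagation using scikit-learn's MLPRegressor; the architecture, learning rate and synthetic data are illustrative assumptions, not recommendations from the text.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# synthetic training data (assumed): a noisy linear target for demonstration
rng = np.random.default_rng(2)
X_train = rng.random((200, 3))
y_train = X_train @ np.array([1.5, -2.0, 0.5]) + 0.05 * rng.standard_normal(200)

# one hidden layer, logistic transfer function, trained by stochastic gradient
# descent back-propagation (solver='sgd'); parameters are illustrative only
model = MLPRegressor(hidden_layer_sizes=(7,), activation='logistic',
                     solver='sgd', learning_rate_init=0.05,
                     max_iter=2000, random_state=0)
model.fit(X_train, y_train)
print(model.loss_)          # final training loss
```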

    Stopping Criteria:

    Stopping criteria are used to decide when to stop the training process. They determine

    whether the model has been optimally or sub-optimally trained. Training can be

    stopped: after the presentation of a fixed number of training records; when the

    training error reaches a sufficiently small value; or when no or slight changes in the


    training error occur. However, the above examples of stopping criteria may lead to the

    model stopping prematurely or over-training.

    The cross-validation technique is an approach that can be used to overcome such

    problems. It is considered to be the most valuable tool to ensure over-fitting does not

    occur (Smith 1993). A number of stopping criteria can also be used. Unlike cross-

    validation, these stopping criteria require the data be divided into only two sets; a

    training set, to construct the model; and an independent validation set, to test the

    validity of the model in the deployed environment.

    The basic notion of these stopping criteria is that model performance should balance

    model complexity with the amount of training data and model error.

    Model Validation:

    Once the training phase of the model has been successfully accomplished, the

    performance of the trained model should be validated. The purpose of the model

    validation phase is to ensure that the model has the ability to generalize within the

    limits set by the training data in a robust fashion, rather than simply having

    memorized the input-output relationships that are contained in the training data.

    The approach that is generally adopted is to test the performance of trained ANNs on

    an independent validation set, which has not been used as part of the model building

    process. If such performance is adequate, the model is deemed to be able to generalize

    and is considered to be robust.

    The coefficient of correlation, r, the root mean squared error, RMSE, and the mean

    absolute error, MAE, are the main criteria that are often used to evaluate the

    prediction performance of ANN models. The coefficient of correlation is a measure

    that is used to determine the relative correlation and the goodness-of-fit between the

    predicted and observed data. Smith (1986) suggested the following guide for values of

    r between 0.0 and 1.0:

a. r ≥ 0.8: a strong correlation exists between the two sets of variables;

b. 0.2 < r < 0.8: a correlation exists between the two sets of variables; and

c. r ≤ 0.2: a weak correlation exists between the two sets of variables.

    The RMSE is the most popular measure of error and has the advantage that large

    errors receive much greater attention than small errors. In contrast with RMSE, MAE

    eliminates the emphasis given to large errors. Both RMSE and MAE are desirable

    when the evaluated output data are smooth or continuous.
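A minimal sketch of computing these three criteria with NumPy; the observed and predicted arrays are placeholders, not results from the module.

```python
import numpy as np

def validation_metrics(observed, predicted):
    """Coefficient of correlation r, RMSE and MAE between observed and
    predicted values, the criteria discussed above."""
    r = np.corrcoef(observed, predicted)[0, 1]
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    mae = np.mean(np.abs(observed - predicted))
    return r, rmse, mae

obs = np.array([1.0, 2.0, 3.0, 4.0])        # illustrative values
pred = np.array([1.1, 1.9, 3.2, 3.7])
print(validation_metrics(obs, pred))
```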

    Despite the success of ANNs in geotechnical engineering and other disciplines, they

    suffer from some shortcomings that need further attention in the future including

    model transparency and knowledge extraction, extrapolation and uncertainty.

    Together, improvements in these issues will greatly enhance the usefulness of ANN

    models with respect to geotechnical engineering applications.


    8.4 Triangulation with linear interpolation

    The method of triangulation with linear interpolation is historically one of the first

    methods used before the intensive development of computers. It is based on the

division of the domain D into triangles. Each triangle then defines, by its three vertices, a plane; that is why the resulting surface is piecewise linear.
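A hedged example of the idea using SciPy's LinearNDInterpolator, which builds a Delaunay triangulation of the data points and interpolates linearly within each triangle; the sample points are illustrative, and queries outside the convex envelope return NaN, matching the limitation noted below.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

# scattered data points (X, Y, Z); values are illustrative
xyz = np.array([[0.0, 0.0, 1.0], [10.0, 0.0, 2.0],
                [0.0, 10.0, 0.5], [10.0, 10.0, 1.5], [5.0, 4.0, 3.0]])

# Delaunay triangulation with linear interpolation inside each triangle;
# points outside the convex envelope of the data return NaN
f = LinearNDInterpolator(xyz[:, :2], xyz[:, 2])
print(f(4.0, 5.0), f(50.0, 50.0))   # inside -> value, outside -> nan
```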

    8.4.1 Advantages:

    Very fast algorithm

    Resulting surface is interpolative

    8.4.2 Disadvantages:

    The domain of the function f is limited to the convex envelope of the points XYZ.

The resulting surface is not smooth and the iso-lines consist of line segments.

The division into triangles may be ambiguous, as a simple example of two alternative divisions of a rectangle shows: in the first case a valley is created, in the second case a ridge.

    8.4.3 Application:

This method is still used in geodesy and in digital terrain models. As a rule, characteristic points of the terrain are measured; that is, the person performing the terrain survey measures only points where the slope of the terrain changes (tops, edges, valleys and so on) and thus avoids the above-mentioned ambiguity. For the interpretation of such data, the triangulation with linear interpolation method is quite suitable.

    8.5 Natural neighbour

    The Natural neighbour is an interpolation method based on Voronoi tessellation.

    Voronoi tessellation can be defined as the partitioning of a plane with n points into n

    convex polygons such that each polygon contains exactly one point and every point in

a given polygon is closer to its central point than to any other. In other words, if {X_i}, i = 1, ..., n, is a given set of points in ℝ², then the Voronoi polygon corresponding to the point X_i is the set

V_i = {X ∈ ℝ² : ‖X − X_i‖ ≤ ‖X − X_j‖ for all j ≠ i}

To interpolate at a new point X, this point is inserted into the tessellation, creating a new Voronoi cell that overlaps the cells of the neighbouring points A, B, C, D and E that were included in the tessellation. The weights of the points A, B, C, D and E, which are used to compute the interpolated value at X, are respectively the areas of the grey region intersecting each original cell of A, B, C, D and E; they are also known as the natural neighbour coordinates of X.

    Figure 8.3: New Voronoi cell and areas for computation of neighbour point weights.

    The surface formed by natural neighbour interpolation has the useful properties of

    being continuous (C0) everywhere and passing exactly through z values of all data

    points. Moreover, the interpolated surface is continuously differentiable (C1)

    everywhere except at the data points, providing smooth interpolation in contrast to the

    Triangulation with linear interpolation method.
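The sketch below shows only the first step of the method, building the Voronoi tessellation of the data points with scipy.spatial.Voronoi; computing the natural neighbour weights themselves (the areas that the new cell of a query point takes from the existing cells) would require a second tessellation and is not implemented here.

```python
import numpy as np
from scipy.spatial import Voronoi

# data point locations (illustrative); Voronoi builds the tessellation that
# underlies natural neighbour interpolation, one convex cell per data point
pts = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 5.0], [5.0, 4.0], [2.5, 2.5]])
vor = Voronoi(pts)

print(vor.point_region)      # index of the Voronoi region of each data point
print(vor.vertices)          # coordinates of the Voronoi cell vertices
# the natural neighbour weights of a query point would be the areas its new
# cell "steals" from these regions when the point is inserted
```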

    8.5.1 Advantages:

    Fast algorithm

    Resulting surface is interpolative and smooth except at the data points.

    8.5.2 Disadvantages:

    The domain of the function f is limited to the convex envelope of the points XYZ

    The shape of the resulting surface is not acceptable in some fields such as in geology

    or hydrogeology.

    8.5.3 Application:

The Natural neighbour method is mainly used in GIS systems as a digital terrain model and for fast interpolation of terrain data, providing a smooth surface.


    8.6 Inverse distance

This method computes a value of the function f at an arbitrary point (x, y) ∈ D as a weighted average of the values Z_i:

f(x, y) = Σ_{i=1}^{n} w_i Z_i / Σ_{i=1}^{n} w_i, where w_i = 1 / h_i², h_i = √((x − X_i)² + (y − Y_i)² + δ²) and

δ² is a smoothing parameter.

If the number of points n is too large, the value of f(x, y) is calculated only from points belonging to a specified circle surrounding the point (x, y). The method was frequently implemented in the early stages of computer development.
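A minimal NumPy sketch of this weighted average, assuming inverse squared-distance weights and the smoothing parameter δ² as above; the exponent and the data values are illustrative assumptions.

```python
import numpy as np

def inverse_distance(x, y, X, Y, Z, delta2=0.0, power=2):
    """Weighted average f(x, y) = sum(w_i Z_i) / sum(w_i) with
    w_i = 1 / h_i^power and h_i = sqrt((x - X_i)^2 + (y - Y_i)^2 + delta^2);
    delta2 is the smoothing parameter (0 gives interpolation)."""
    h2 = (x - X) ** 2 + (y - Y) ** 2 + delta2
    if delta2 == 0.0 and np.any(h2 == 0.0):
        return Z[np.argmin(h2)]          # query coincides with a data point
    w = 1.0 / np.sqrt(h2) ** power
    return np.sum(w * Z) / np.sum(w)

X = np.array([0.0, 10.0, 0.0, 10.0])     # illustrative data
Y = np.array([0.0, 0.0, 10.0, 10.0])
Z = np.array([1.0, 2.0, 0.5, 1.5])
print(inverse_distance(4.0, 5.0, X, Y, Z, delta2=0.0))
```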

    8.6.1 Advantages:

    Simple computer implementation; for its simplicity, the method is implemented in

    almost all gridding software packages

If δ² = 0, the method provides interpolation.

    8.6.2 Disadvantages:

    High computer time consumption if the number of points n is large (due to

    computation of distances)

Typical generation of "bull's-eyes" surrounding the positions of the data points within the domain D; that is why the resulting function is not acceptable for most applications.

    8.7 Minimum curvature method

This method, and in particular its computer implementation, was developed by Smith and Wessel (1990). The interpolated surface generated by the Minimum curvature method is analogous to a thin, linearly elastic plate passing through each of the data values with a minimum amount of bending. The algorithm of the Minimum curvature method is based on the numerical solution of the modified bi-harmonic differential equation

(1 − T) ∇⁴f(x, y) − T ∇²f(x, y) = 0

with three boundary conditions:

(1 − T) ∂²f/∂n² + T ∂f/∂n = 0 on the edges

∂(∇²f)/∂n = 0 on the edges

∂²f/∂x∂y = 0 at the corners

where

T ∈ [0, 1] is a tensioning parameter,

∇² is the Laplacian operator, ∇²f = ∂²f/∂x² + ∂²f/∂y²,

∇⁴ = (∇²)² is the bi-harmonic operator, ∇⁴f = ∂⁴f/∂x⁴ + ∂⁴f/∂y⁴ + 2 ∂⁴f/∂x²∂y², and

n is the boundary normal.

If T = 0, the bi-harmonic differential equation is solved; if T = 1, the Laplace differential equation is solved, and in this case the resulting surface may have local extremes only at the points XYZ.
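For illustration only, the sketch below treats the T = 1 limit (the Laplace equation) by simple Jacobi relaxation on a regular grid, holding the grid nodes that carry data values fixed; the full Smith and Wessel algorithm, with tension and the boundary conditions stated above, is considerably more involved than this simplification.

```python
import numpy as np

def laplace_grid(nx, ny, data, n_iter=2000):
    """Illustrative T = 1 limit only: relax the Laplace equation by Jacobi
    iteration on a regular grid, keeping grid nodes that carry data fixed.
    Boundary rows/columns simply stay at zero here (a crude simplification)."""
    f = np.zeros((ny, nx))
    fixed = np.zeros((ny, nx), dtype=bool)
    for i, j, z in data:                   # data given as (row, col, value)
        f[i, j], fixed[i, j] = z, True
    for _ in range(n_iter):
        avg = 0.25 * (f[2:, 1:-1] + f[:-2, 1:-1] + f[1:-1, 2:] + f[1:-1, :-2])
        f[1:-1, 1:-1] = np.where(fixed[1:-1, 1:-1], f[1:-1, 1:-1], avg)
    return f

grid = laplace_grid(20, 20, data=[(5, 5, 1.0), (14, 10, -1.0), (8, 16, 0.5)])
print(grid[10, 10])
```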

    8.7.1 Advantages:

The speed of computation is high, and an increasing number of points XYZ has little influence on the computational speed.

    Suitable method for a large number of points XYZ.

    8.7.2 Disadvantages:

    Complicated algorithm and computer implementation

If the parameter T is near zero, the resulting surface may have local extremes away from the point locations.

Poor ability to preserve extrapolation trends.

    8.7.3 Application:

    Universal method suitable for smooth approximation and interpolation (for example

    distribution of temperature, water heads, potential fields and so on).

    8.8 Regression by plane with weights

This method is based on regression by the plane f(x, y) = ax + by + c using a weighted least-squares fit. The weight w_i assigned to the point (X_i, Y_i, Z_i) is computed as an inverse distance from the point (x, y) to the point (X_i, Y_i). Then the minimum of the following function of the three independent variables has to be found:

F(a, b, c) = Σ_{i=1}^{n} w_i (a X_i + b Y_i + c − Z_i)²,

which leads to the solution of the three linear equations:

∂F/∂a = 0, ∂F/∂b = 0, ∂F/∂c = 0

After rearrangement the following equations are obtained:

a Σ w_i X_i² + b Σ w_i X_i Y_i + c Σ w_i X_i = Σ w_i X_i Z_i

a Σ w_i X_i Y_i + b Σ w_i Y_i² + c Σ w_i Y_i = Σ w_i Y_i Z_i

a Σ w_i X_i + b Σ w_i Y_i + c Σ w_i = Σ w_i Z_i

In addition to the regression by plane, some mapping packages offer the possibility of using polynomials of higher order.
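A hedged NumPy sketch of the weighted plane fit: scaling the design matrix by √w_i and solving the least-squares problem yields the same normal equations as minimizing F(a, b, c) above; the small offset eps that guards against a zero distance, and the data values, are assumptions of this sketch.

```python
import numpy as np

def weighted_plane(X, Y, Z, x, y, eps=1e-12):
    """Fit f(x, y) = a x + b y + c by weighted least squares, with weights
    w_i = 1 / distance((x, y), (X_i, Y_i)); eps avoids division by zero."""
    w = 1.0 / (np.hypot(X - x, Y - y) + eps)
    A = np.column_stack([X, Y, np.ones_like(X)])
    # solving (sqrt(w) A) p = sqrt(w) Z in the least-squares sense gives the
    # same normal equations as minimizing sum w_i (a X_i + b Y_i + c - Z_i)^2
    a, b, c = np.linalg.lstsq(np.sqrt(w)[:, None] * A, np.sqrt(w) * Z,
                              rcond=None)[0]
    return a * x + b * y + c

X = np.array([0.0, 10.0, 0.0, 10.0])      # illustrative data
Y = np.array([0.0, 0.0, 10.0, 10.0])
Z = np.array([1.0, 2.0, 0.5, 1.5])
print(weighted_plane(X, Y, Z, x=4.0, y=5.0))
```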

    8.8.1 Advantages:

    Simple algorithm

    Good extrapolation properties

    8.8.2 Disadvantages:

    Resulting function is only approximate

Slow speed of computation if n is large (due to the computation of distances)

    8.8.3 Application:

Surface reconstruction from digitized contour lines. The method was frequently used in the past, particularly when contour maps were transferred from paper sheets to digital maps.

    8.9 Radial basis functions

The method of Radial basis functions uses the interpolation function in the form:

f(x, y) = p(x, y) + Σ_{i=1}^{n} w_i φ(|(x, y) − (X_i, Y_i)|)

where

p(x, y) is a polynomial,

w_i are real weights,

|(x, y) − (X_i, Y_i)| is the Euclidean distance between the points (x, y) and (X_i, Y_i), and

φ(r) is a radial basis function.

Commonly used radial basis functions are (c² is the smoothing parameter):

Multiquadric: φ(r) = √(r² + c²)

Multilog: φ(r) = log(r² + c²)

Natural cubic spline: φ(r) = (r² + c²)^(3/2)

Thin plate spline: φ(r) = (r² + c²) log(r² + c²)


The interpolation process starts with polynomial regression using the polynomial p(x, y). Then the following system of n linear equations is solved for the unknown weights w_i, i = 1, ..., n:

Z_j − p(X_j, Y_j) = Σ_{i=1}^{n} w_i φ(|(X_j, Y_j) − (X_i, Y_i)|),  j = 1, ..., n

As soon as the weights w_i are determined, the z-value of the surface can be computed directly from the equation above at any point (x, y) ∈ D.
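A minimal sketch of this procedure for the multiquadric basis, taking the polynomial p(x, y) as simply the mean of the data values (a simplification of the polynomial regression step described above); the data and the smoothing parameter are illustrative.

```python
import numpy as np

def rbf_multiquadric_fit(X, Y, Z, c2=1.0):
    """Solve the linear system above for the weights w_i, using the
    multiquadric basis phi(r) = sqrt(r^2 + c^2); p(x, y) is taken here as
    the constant mean of Z for simplicity (an assumption of this sketch)."""
    p = Z.mean()
    r = np.hypot(X[:, None] - X[None, :], Y[:, None] - Y[None, :])
    Phi = np.sqrt(r ** 2 + c2)             # n x n matrix of phi(|u_j - u_i|)
    w = np.linalg.solve(Phi, Z - p)        # Z_j - p = sum_i w_i phi(...)
    return p, w

def rbf_evaluate(x, y, X, Y, p, w, c2=1.0):
    r = np.hypot(x - X, y - Y)
    return p + np.sum(w * np.sqrt(r ** 2 + c2))

X = np.array([0.0, 10.0, 0.0, 10.0])       # illustrative data
Y = np.array([0.0, 0.0, 10.0, 10.0])
Z = np.array([1.0, 2.0, 0.5, 1.5])
p, w = rbf_multiquadric_fit(X, Y, Z, c2=4.0)
print(rbf_evaluate(4.0, 5.0, X, Y, p, w, c2=4.0))
```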

    8.9.1 Advantages:

Simple computer implementation; the system of linear equations has to be solved only once (in contrast to the Kriging method, where a system of linear equations must be solved for each grid node; see section 8.2).

    The resulting function is interpolative

    Easy implementation of smoothing

    8.9.2 Disadvantages:

    If the number of points n is large, the number of linear equations is also large;

    moreover the matrix of the system is not sparse, which leads to a long computational

    time and possibly to the propagation of rounding errors. That is why this method, as

    presented, is used for solving small problems with up to a few thousand points.

    Solving large problems is also possible, but requires an additional algorithm for

    searching points in the specified surrounding of each grid node.

    8.9.3 Application:

    Universal method suitable for use in any field.