Fortran subroutines for computing smoothing and interpolating natural splines

TOM LYCHE

Institute of Informatics, University of Oslo, Box 1080, Blindern, Oslo 3, Norway

LARRY L. SCHUMAKER

Center for Approximation Theory, Department of Mathematics, Texas A&M University, College Station, Texas 77843, USA

KAMY SEPEHRNOORI

Department of Petroleum Engineering, University of Texas, Austin, Texas 78712, USA

FORTRAN subroutines are presented for the stable and fast computation of smoothing and interpolating natural splines that solve certain data fitting problems in one dimension. The subroutines are based on algorithms discussed in Ref. 1.

1. THE DATA FITTING PROBLEM

In this paper we are interested in the following:

Problem. Given $x_1 < x_2 < \cdots < x_n$ and real numbers $y_1, \ldots, y_n$, construct a smooth function $s$ to fit this data.

We shall present three methods of solving this data fitting problem. The first is a method which produces a function $s$ which fits the data exactly.

Method 1 (best interpolation). Let $2 \le 2m \le n$. Define

$$U = \{u \in L_2^m(-\infty, \infty) : u(x_i) = y_i,\ i = 1, 2, \ldots, n\} \tag{1.1}$$

where

$$L_2^m(-\infty, \infty) = \{u : u^{(m-1)} \text{ is absolutely continuous on } (-\infty, \infty) \text{ and } u^{(m)} \in L_2(-\infty, \infty)\} \tag{1.2}$$

Construct $s$ such that

$$v(s) = \min_{u \in U} v(u) \tag{1.3}$$

where

$$v(u) = \int_{-\infty}^{\infty} \bigl(u^{(m)}(x)\bigr)^2 \, dx \tag{1.4}$$

Discussion. This method involves finding the 'best interpolant' in the sense that the quantity $v(u)$ is minimized over all appropriately smooth functions which interpolate the given data. It is well known¹ that there is a unique $s$ satisfying equation (1.3), and that it is a natural polynomial spline of order $2m$ with knots at the points $\Delta = \{x_1, \ldots, x_n\}$; i.e. $s$ is a member of the class

$$\mathcal{NS}_{2m}(\Delta) = \{f \in C^{2m-2}(-\infty, \infty) : f \in \mathcal{P}_m \text{ on } (-\infty, x_1) \text{ and } (x_n, \infty), \text{ while } f \in \mathcal{P}_{2m} \text{ on } (x_i, x_{i+1}),\ i = 1, 2, \ldots, n-1\} \tag{1.5}$$

where $\mathcal{P}_m$ is the space of polynomials of order $m$ (degree $m - 1$).

To find $s$ numerically, it is convenient to introduce the so-called B-splines, $B_1, \ldots, B_n$, which form a basis for $\mathcal{NS}_{2m}(\Delta)$.¹ In terms of these B-splines, each $s$ can be written in the form

$$s(x) = \sum_{i=1}^{n} c_i B_i(x) \tag{1.6}$$

The coefficients of the interpolating spline can be found by solving the linear system

$$Bc = Y \tag{1.7}$$

where

$$B = \bigl(B_i(x_j)\bigr)_{i,j=1}^{n}, \quad c = (c_1, \ldots, c_n)^T \quad \text{and} \quad Y = (y_1, \ldots, y_n)^T$$

The matrix $B$ is banded with $2m - 1$ bands.

In some applications it may not be desirable to fit the data exactly. This might be the case, for example, when the data is subject to some noise (say measurement errors). In this case it may be more appropriate to construct a function $s$ which approximately fits the data. The following two methods are designed to do just that.

Method 2 (smoothing). Let $w_1, \ldots, w_n$ be positive numbers, and define

$$E(f) = \sum_{i=1}^{n} w_i \bigl(y_i - f(x_i)\bigr)^2 \tag{1.8}$$

041-1195/83/010002-04 $2.00 2 Adv. Eng. Software, 1983, Vol. 5, No. 1 © 1983 CML Publications

Given $p > 0$, define

$$\Phi_p(f) = v(f) + pE(f) \tag{1.9}$$

Find a function $s$ such that

$$\Phi_p(s) = \min_{f \in L_2^m(-\infty, \infty)} \Phi_p(f) \tag{1.10}$$

Discussion. In this problem we are minimizing a combination of the smoothness $v(f)$ and the weighted interpolation error $E(f)$. The parameter $p$ controls the balance between these two quantities. It is known¹ that the function $s$ can be constructed as a natural polynomial spline of the same form as in Method 1. Here the coefficients in equation (1.6) are determined by solving the linear system

$$(B + p^{-1}WE)c = Y \tag{1.11}$$

where $W$ is a diagonal matrix with $w_1^{-1}, \ldots, w_n^{-1}$ down the diagonal, and where $E$ is a certain $(2m + 1)$-banded singular matrix.¹
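For the cubic case $m = 2$, the minimizer of (1.9) can also be computed without B-splines, using the classical second-derivative formulation usually attributed to Reinsch. The Python sketch below takes that route; it is not the B-spline algorithm of ref. 1 and not the code in SMOOTH. It returns only the values $s(x_i)$ at the data points. The function names are hypothetical, and a small dense solver is used for brevity where a production code would exploit the band structure. Note that $p^{-1}$ and the inverse weights enter in the same roles as in (1.11).

```python
def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(a[r][k]))
        a[k], a[piv] = a[piv], a[k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n + 1):
                a[r][c] -= f * a[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = a[k][n] - sum(a[k][c] * x[c] for c in range(k + 1, n))
        x[k] = s / a[k][k]
    return x


def smooth_values(x, y, w, p):
    """Values g[i] = s(x[i]) of the cubic (m = 2) natural smoothing spline
    minimizing  int (f'')^2 dx + p * sum_i w[i] * (y[i] - f(x[i]))**2."""
    n = len(x)
    k = n - 2                      # number of interior knots
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    alpha = 1.0 / p
    # Q^T g gives differences of divided differences; g^T Q R^{-1} Q^T g = int (s'')^2
    Q = [[0.0] * k for _ in range(n)]
    R = [[0.0] * k for _ in range(k)]
    for j in range(k):
        Q[j][j] = 1.0 / h[j]
        Q[j + 1][j] = -1.0 / h[j] - 1.0 / h[j + 1]
        Q[j + 2][j] = 1.0 / h[j + 1]
        R[j][j] = (h[j] + h[j + 1]) / 3.0
        if j + 1 < k:
            R[j][j + 1] = R[j + 1][j] = h[j + 1] / 6.0
    # Solve (R + alpha Q^T W^{-1} Q) gamma = Q^T y for the interior second derivatives
    A = [[R[i][j] + alpha * sum(Q[r][i] * Q[r][j] / w[r] for r in range(n))
          for j in range(k)] for i in range(k)]
    rhs = [sum(Q[r][i] * y[r] for r in range(n)) for i in range(k)]
    gamma = solve_dense(A, rhs)
    # g = y - alpha W^{-1} Q gamma
    return [y[r] - alpha / w[r] * sum(Q[r][j] * gamma[j] for j in range(k))
            for r in range(n)]


def weighted_error(y, g, w):
    """E of equation (1.8), evaluated from the knot values g."""
    return sum(wi * (yi - gi) ** 2 for wi, yi, gi in zip(w, y, g))
```

Calling `smooth_values(x, y, w, p)` with increasing `p` should move the returned values toward the data, so that `weighted_error` decreases; data lying exactly on a straight line are reproduced for every `p`.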

Our third method is a slight variant of Method 2.

Method 3 (smoothing). Given ERR $\ge 0$, find a function $s$ such that

$$v(s) = \min_{\substack{f \in L_2^m(-\infty, \infty) \\ E(f) \le \mathrm{ERR}}} v(f) \tag{1.12}$$

Discussion. Here the function $s$ is uniquely defined as the one which minimizes $v(f)$ subject to a control on the size of the error $E(f)$. If ERR = 0, this smoothing spline is in fact an interpolating spline as found by Method 1. To construct $s$ satisfying equation (1.12), one starts with a guess for the value of $p$, computes the corresponding $s_p$ and $E(s_p)$, and then iterates until a value $p(\mathrm{ERR})$ is found such that the corresponding spline $s$ satisfies $E(s) \le \mathrm{ERR}$.¹

In the following section we describe FORTRAN packages for the three methods for solving the data fitting problem. We also present some subroutines which are useful in evaluating the resulting B-spline expansion (1.6) for $s$, along with its various derivatives. In some applications (for example, when the spline has to be evaluated a large number of times, as in graphing it), it may be desirable to rewrite the spline $s$ in terms of its piecewise polynomial representation.³ Given the vector $c$, it is not difficult to compute an array CW so that

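$$s(x) = \begin{cases} \displaystyle\sum_{j=1}^{m} CW(1,j)\,(x - x_1)^{j-1}, & x < x_1, \\[2ex] \displaystyle\sum_{j=1}^{2m} CW(i,j)\,(x - x_i)^{j-1}, & x_i \le x < x_{i+1},\ i = 1, \ldots, n-1, \\[2ex] \displaystyle\sum_{j=1}^{m} CW(n,j)\,(x - x_n)^{j-1}, & x_n < x \end{cases} \tag{1.13}$$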

FORTRAN subroutines for accomplishing the conversion from B-spline coefficients to the coefficient array CW of the piecewise polynomial representation are also included.
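Evaluating the representation (1.13) amounts to locating the interval that contains $x$ and applying Horner's rule to the coefficients of that row. A minimal Python sketch follows. It assumes CW is held as a list of rows with 0-based indices, so that `cw[i][j]` multiplies `(x - x_knots[i])**j`; the name `pp_eval` is hypothetical, and this is not the code of PPEV or SEARCH. Only the value of $s$ is computed, and at $x = x_n$ the last row is used.

```python
from bisect import bisect_right


def pp_eval(x_knots, cw, m, x):
    """Evaluate s(x) from a piecewise polynomial representation as in (1.13).

    x_knots : increasing list of the n knots
    cw      : n rows of coefficients; cw[i][j] multiplies (x - x_knots[i])**j
    m       : 2m is the order of the spline
    """
    n = len(x_knots)
    if x < x_knots[0]:
        i, ncoef = 0, m                 # polynomial of order m left of the data
    elif x >= x_knots[-1]:
        i, ncoef = n - 1, m             # polynomial of order m right of the data
    else:
        i = bisect_right(x_knots, x) - 1  # largest i with x_knots[i] <= x
        ncoef = 2 * m
    t = x - x_knots[i]
    value = 0.0
    for j in range(ncoef - 1, -1, -1):  # Horner's rule
        value = value * t + cw[i][j]
    return value
```

For $m = 1$ the natural spline is the piecewise linear interpolant, constant outside the data, which gives an easy check: with knots 0, 1, 3 and values 1, 2, 0 the rows of `cw` are `[1, 1]`, `[2, -1]` and `[0, 0]`.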

2. THE FORTRAN PACKAGE

Our FORTRAN package for solving the data fitting problem by Methods 1-3 consists of ten FORTRAN subroutines. These subroutines are called SPLCO, SMOOTH, BASIS, PREP, BANDET, BANSOL, BSPPWP, PPEV, SPLDER and SEARCH. Since the code is heavily commented, we do not feel that it is necessary to describe each of them here. The main subroutine for the average user is SMOOTH. It is a package which can be used for computing a spline by any of the three methods described in Section 1. We describe its use in the following section.

The subroutines are written in standard FORTRAN. Except for the machine constant AMACH used in SMOOTH, the subroutines are designed to be machine-independent. The constant AMACH is to be set to the smallest computer representable number such that 1.0 + AMACH ≠ 1.0. The choice of AMACH is not extremely critical, and on most machines any value between $10^{-14}$ and $10^{-6}$ will be acceptable.
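A constant of this kind can be found by repeated halving. The Python sketch below illustrates the idea for the floating-point type of the host language; the function name is hypothetical, and the result is the smallest power of two with the stated property rather than the smallest representable number.

```python
def machine_epsilon():
    """Smallest power of two eps such that 1.0 + eps != 1.0."""
    eps = 1.0
    while 1.0 + eps / 2.0 != 1.0:
        eps /= 2.0
    return eps
```

In IEEE double precision this yields $2^{-52}$, about $2.2 \times 10^{-16}$.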

We now discuss some features of the code. The algorithms have been completely vectorized. The case of $m = 2$ (cubic splines) with $n \ge 6$ equally spaced knots is of very frequent occurrence, and hence there is a special code in most of the subroutines for this case. The computation of B-splines and their derivatives is accomplished using well-known stable recurrence relations.¹

All three methods for constructing the spline $s$ require the solution of a linear system of equations. The matrices of these systems are either $B$ or BWE (the matrix $B + p^{-1}WE$ of equation (1.11)), respectively. Since these are banded matrices, the code is designed to take advantage of this structure. Since $B$ is formed from B-splines (and BWE is a perturbation of $B$), we have elected to perform the decomposition without pivoting. (The matrix $B$ differs only slightly from the usual B-spline matrix, for which it is known that decomposition without pivoting is permissible.²) Both BANDET and BANSOL include special code for the cases where $B$ or BWE is diagonal or tridiagonal.
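To illustrate what elimination without pivoting looks like in the tridiagonal special case, here is a minimal Python sketch. It is not the code of BANDET or BANSOL, which treat decomposition and solution as separate steps; the function name and the storage of the three diagonals as separate lists are assumptions made for this example.

```python
def solve_tridiagonal(sub, diag, sup, rhs):
    """Solve a tridiagonal system by elimination without pivoting.

    sub[i]  = A[i][i-1]  (sub[0] is unused)
    diag[i] = A[i][i]
    sup[i]  = A[i][i+1]  (sup[-1] is unused)
    """
    n = len(diag)
    d = list(diag)   # working copies, so the inputs are left unchanged
    r = list(rhs)
    # Forward elimination: remove the subdiagonal row by row
    for i in range(1, n):
        f = sub[i] / d[i - 1]
        d[i] -= f * sup[i - 1]
        r[i] -= f * r[i - 1]
    # Back substitution
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - sup[i] * x[i + 1]) / d[i]
    return x
```

The work and storage grow linearly with the number of unknowns, which is the benefit of exploiting the band structure. The sketch divides by the pivots as they come, so it is only appropriate for matrices for which pivoting is known to be unnecessary.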

As noted in Section 1, the construction of $s$ in Method 3 proceeds iteratively using Method 2. To start the process, an initial guess for $p$ is required. It is desirable to choose this initial $p$ quite small in order to guarantee that the iteration does not end prematurely.¹ In the code presented here, we take the initial guess for $p$ to be a small multiple of the machine constant AMACH. Theoretically, the iteration in Method 3 should continue until $E(s_p) \le \mathrm{ERR}$. In practice, however, it is usually acceptable to stop when $E(s_p)$ is sufficiently close to ERR; we terminate the iteration as soon as $E(s_p) \le \mathrm{ERR} + 0.1\sqrt{\mathrm{ERR}}$.
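The outer loop of such an iteration can be sketched independently of the spline computation. The Python example below starts from a small $p$, increases it geometrically until the stopping test described above (read here as ERR + 0.1·sqrt(ERR)) is met, and then refines the bracket by bisection on a logarithmic scale. This search strategy is a simple stand-in chosen for illustration; the source does not spell out the update rule used in SMOOTH, and the function name, parameters and defaults are hypothetical. The callable `error_of_p` is assumed to return $E(s_p)$ and to be non-increasing in $p$.

```python
import math


def find_p(error_of_p, err, p0=1e-12, factor=10.0, max_iter=200):
    """Return a p > 0 with error_of_p(p) <= err + 0.1*sqrt(err).

    error_of_p : callable giving E(s_p); assumed non-increasing in p
    err        : the tolerance ERR of Method 3 (err >= 0)
    """
    tol = err + 0.1 * math.sqrt(err)
    lo, hi = None, p0
    # Phase 1: grow p geometrically until the stopping test is satisfied
    for _ in range(max_iter):
        if error_of_p(hi) <= tol:
            break
        lo, hi = hi, hi * factor
    else:
        raise RuntimeError("exceeded prescribed number of iterations")
    if lo is None:
        return hi  # the initial guess already satisfies the test
    # Phase 2: bisection in log(p); hi always satisfies the test, lo never does
    for _ in range(60):
        mid = math.sqrt(lo * hi)
        if error_of_p(mid) <= tol:
            hi = mid
        else:
            lo = mid
    return hi
```

In practice `error_of_p` would wrap a Method 2 fit followed by an evaluation of (1.8). With ERR = 0 the test can only be met by an interpolant, so a caller would handle that case separately, as the output code 4 in Table 1 suggests.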

In addition to subroutines for producing the coefficients of the spline fit $s$, we have also included subroutines for the evaluation of $s$ and/or its derivatives at any real number $x$. Two subroutine packages are presented for this purpose. The first, called SPLDER, uses the B-spline expansion (1.6), and will produce derivatives of any order. The second evaluation subroutine is called PPEV, and is based on the piecewise polynomial representation (1.13) of $s$. For convenience of programming, it has been written only for computation of derivatives up to the second.

In order to keep storage requirements at a minimum, this collection of subroutines has been designed to make maximal reuse of workspace. To this end we have used a one-dimensional work array (called WORK) for the storage of all intermediate results. This array should be defined in the driver and should have dimension at least N*(M*M + 5*M + 3).

We have also attempted to make the code as robust as possible by attempting to anticipate all possible anomalies. Thus (see the next section for details) the code has been written with various escape routes with printed messages to explain what has happened.

While the set of subroutines presented here has been designed so that the user can integrate them into a package of his own design, we have also included a driver which takes advantage of the great flexibility built into the subroutines. Its use is described in the following section.

3. USING THE PACKAGE

In this section we describe how to use the package presented here. Some numerical examples are presented in the following section.

The basic operation of SMOOTH is shown in the flow chart in Fig. 1. The data fitting problem is described by the input of the quantities:

m = positive integer such that 2m is the order of the spline

n = the number of data points (with n ≥ 2m)

X = an array of the n increasing abscissas of the data points

Y = an array of the n data values.

The further operation of SMOOTH is controlled by the parameter IPROB. The values 1, 2, 3 for this parameter correspond to Methods 1, 2 and 3, respectively. In particular, IPROB = 1 produces an interpolating spline, while IPROB = 2 or 3 produces a smoothing spline. If SMOOTH has been used to solve a data fitting problem and is going to be called again with the same values of m, n and X, some work can be saved by using IPROB = -1, -2 or -3.

If IPROB = ±2, it is also necessary to input the quantities:

W = an array of positive numbers giving the weights attached to each data point (cf. equation (1.8))

p = the smoothing parameter (cf. equation (1.9)).

Similarly, if IPROB = ±3, it is necessary to input W and

ERR = the tightness of fit in Method 3 (cf. equation (1.12)).

Figure 1. A flow chart for SMOOTH (boxes: INPUT m, n, X, Y; INPUT IPROB; INPUT p, W or INPUT ERR, W; OUTPUT C, CW)

Figure 2. Numerical Example 1 (ERR = 0, 0.1, 0.5, 2)

Figure 3. Numerical Example 2 (ERR = 0, 20, 100, 250)

The output of the subroutine SMOOTH is the following:

IPROB = an integer describing whether or not the computation proceeded as desired (see Table 1 for details)

C = the array of coefficients, equation (1.6)

CW = the array of coefficients, equation (1.13)

The use of the subroutines SPLDER and PPEV for evaluating $s^{(\mathrm{ider})}(\mathrm{arg})$ is straightforward. One simply inputs ider and arg and calls the desired subroutine.

Table 1. Output values of IPROB

 1 = solved interpolation problem 1
 2 = solved smoothing problem 2
 3 = solved smoothing problem 3
 4 = requested 3, but ERR was too small, so did interpolation
 5 = requested 2, but p was not positive
 6 = requested 3, but ERR was not positive
 7 = requested 3, but exceeded prescribed number of iterations
 8 = requested 2, but p was too small, should do least squares polynomial fit
 9 = the value of m was less than 1
10 = 2m was larger than n
11 = the x's were out of order
12 = not all positive weights W
13 = |IPROB| > 3 on input

4. NUMERICAL EXAMPLES

The package presented here has been thoroughly tested on various machines at the University of Texas at Austin, the University of Oslo, the Free University of Berlin and elsewhere. In this section we give two simple numerical examples to illustrate its use.

Our first example consists of the data m = 2, n = 6, X = {1, 2, 3, 4, 5, 6} and Y = {1, 1.5, 2, 1, 1, 3}. In Fig. 2 we show the results of fitting this data with an interpolating spline, and by smoothing splines with various choices of ERR. It is clear from the figure that as ERR increases, the corresponding smoothing spline misses the data points by a larger amount, but is a smoother function (with less curvature).
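For readers who wish to reproduce the interpolating curve (ERR = 0) of this example without the FORTRAN package, the Python sketch below constructs a natural cubic spline by the textbook second-derivative formulation. This is a different route from the B-spline approach of Section 1, although for $m = 2$ both should describe the same function. The function name is hypothetical, and outside the data the spline is continued linearly, as required of a natural spline with $m = 2$.

```python
from bisect import bisect_right


def natural_cubic_spline(x, y):
    """Return a function evaluating the natural cubic spline through (x[i], y[i])."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    # Tridiagonal system for the interior second derivatives (end values are 0)
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
    rhs = [6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
           for i in range(1, n - 1)]
    for k in range(1, n - 2):            # elimination without pivoting
        f = h[k] / diag[k - 1]
        diag[k] -= f * h[k]
        rhs[k] -= f * rhs[k - 1]
    inner = [0.0] * (n - 2)
    inner[-1] = rhs[-1] / diag[-1]
    for k in range(n - 4, -1, -1):       # back substitution
        inner[k] = (rhs[k] - h[k + 1] * inner[k + 1]) / diag[k]
    M = [0.0] + inner + [0.0]            # second derivatives at all knots

    def s(t):
        if t <= x[0]:                    # linear continuation to the left
            slope = (y[1] - y[0]) / h[0] - h[0] * (2.0 * M[0] + M[1]) / 6.0
            return y[0] + slope * (t - x[0])
        if t >= x[-1]:                   # linear continuation to the right
            slope = (y[-1] - y[-2]) / h[-1] + h[-1] * (M[-2] + 2.0 * M[-1]) / 6.0
            return y[-1] + slope * (t - x[-1])
        i = bisect_right(x, t) - 1       # interval with x[i] <= t < x[i+1]
        u = t - x[i]
        b = (y[i + 1] - y[i]) / h[i] - h[i] * (2.0 * M[i] + M[i + 1]) / 6.0
        c = (M[i + 1] - M[i]) / (6.0 * h[i])
        return y[i] + u * (b + u * (M[i] / 2.0 + u * c))

    return s
```

With `s = natural_cubic_spline([1, 2, 3, 4, 5, 6], [1, 1.5, 2, 1, 1, 3])`, evaluating `s` on a fine grid gives a curve that passes through all six data points.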

Our second example deals with the data n = 11, X = {0, 2, 3, 5, 6, 8, 9, 11, 12, 14, 15} and Y = {10, 10, 10, 10, 10, 10, 10.5, 15, 50, 60, 85}. Figure 3 shows the results of fitting this data with cubic natural splines (m = 2). As in the first example, it is clear that as ERR increases, the smoothing splines miss the data points by a larger amount, but are smoother functions. Note that in this example the data is monotone increasing, but the interpolating and smoothing splines are not monotone until a large amount of smoothing is introduced. For some special methods for producing monotone fits, see ref. 4 and references therein.

5. REMARKS

(1) It is known¹ that for all $n \ge m$, there are unique natural splines satisfying equations (1.3), (1.10) and (1.12), respectively. While it is certainly possible to write codes to compute natural splines for all such $n$, we have elected to restrict ourselves to the case where $n \ge 2m$ in order to simplify the programming. This is no essential restriction in practice since usually the number of data points will be large in comparison with $m$.

(2) Although the package presented here is written for general $m$, in practice it is recommended that the user stick with low values of $m$ such as 1, 2, 3 (corresponding to linear, cubic and quintic splines). Higher values of $m$ will give splines with a greater tendency to oscillate.

(3) The size of the weights, $w_1, \ldots, w_n$, which appear in Methods 2 and 3 should be chosen to reflect the user's confidence in the accuracy of each data point. If they are all of the same accuracy, it is common to choose all weights equal to 1. If a particular data point is judged to be less accurate, the size of the associated $w$ should be reduced accordingly.

(4) The entire package presented here has also been coded in ALGOL.⁵ Earlier ALGOL codes for interpolation and smoothing by natural splines appeared in ref. 6.

COMPUTER PROGRAM

The program listing described in this article was too long to publish in the journal. However, a copy of the program listing can be obtained by writing to the Editor, Advances in Engineering Software. A nominal charge of £7.00 ($14.00) will be made to cover reproduction, postage, etc.

REFERENCES

1 Lyche, T. and Schumaker, L. L. Computation of smoothing and interpolating natural splines via local bases, SIAM J. Numer. Anal., 1973, 10, 1027

2 de Boor, C. and Pinkus, A. Backward error analysis for totally positive linear systems, Numer. Math., 1977, 27, 485

3 Schumaker, L. L. Spline Functions: Basic Theory, Wiley-Interscience, New York, 1981

4 Schumaker, L. L. On shape preserving quadratic spline interpolation, to appear in SIAM J. Numer. Anal.

5 Lyche, T. and Schumaker, L. L. An ALGOL package for computing smoothing and interpolating spline functions, Report 28, Institute of Informatics, University of Oslo, 1978

6 Lyche, T. and Schumaker, L. L. Procedures for computing smoothing and interpolating natural splines, Commun. ACM, 1974, 17, 453
