Journal of Contaminant Hydrology 77 (2005) 41–65

www.elsevier.com/locate/jconhyd

Cost-effective sampling network design for contaminant plume monitoring under general hydrogeological conditions

Jianfeng Wu a,b, Chunmiao Zheng a,*, Calvin C. Chien c

a Department of Geological Sciences, University of Alabama, Tuscaloosa, AL, United States

b Department of Earth Sciences, Nanjing University, Nanjing, China

c Corporate Remediation, DuPont Company, Wilmington, DE, United States

Received 1 October 2003; received in revised form 10 November 2004; accepted 16 November 2004

Abstract

A new simulation–optimization methodology is developed for cost-effective sampling network

design associated with long-term monitoring of large-scale contaminant plumes. The new

methodology is similar in concept to the one presented by Reed et al. (Reed, P.M., Minsker, B.S.,

Valocchi, A.J., 2000a. Cost-effective long-term groundwater monitoring design using a genetic

algorithm and global mass interpolation. Water Resour. Res. 36 (12), 3731–3741) in that an

optimization model based on a genetic algorithm is coupled with a flow and transport simulator and a

global mass estimator to search for optimal sampling strategies. However, this study introduces the

first and second moments of a three-dimensional contaminant plume as new constraints in the

optimization formulation, and demonstrates the proposed methodology through a real-world

application. The new moment constraints significantly increase the accuracy of the plume

interpolated from the sampled data relative to the plume simulated by the transport model. The

plume interpolation approaches employed in this study are ordinary kriging (OK) and inverse

distance weighting (IDW). The proposed methodology is applied to the monitoring of plume

evolution during a pump-and-treat operation at a large field site. It is shown that potential cost

savings up to 65.6% may be achieved without any significant loss of accuracy in mass and moment

estimations. The IDW-based interpolation method is computationally more efficient than the OK-based method and results in more potential cost savings. However, the OK-based method leads to more accurate mass and moment estimations. A comparison of the sampling designs obtained with and without the moment constraints points to their importance in ensuring a robust long-term monitoring design that is both cost-effective and accurate in mass and moment estimations. Additional analysis demonstrates the sensitivity of the optimal sampling design to the various coefficients included in the objective function of the optimization model.

© 2004 Elsevier B.V. All rights reserved.

0169-7722/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.jconhyd.2004.11.006
* Corresponding author. Tel.: +1 205 348 0579; fax: +1 205 348 0818.
E-mail address: [email protected] (C. Zheng).

Keywords: Contaminant transport; Monitoring network design; Interpolation method; Moment analysis; Genetic

algorithm; Massachusetts Military Reservation (MMR)

1. Introduction

During the past two decades, models that solve the governing groundwater flow

and/or solute transport equations in conjunction with an optimization technique have

been increasingly used as aquifer management tools (e.g., Gorelick, 1983; Ahlfeld and

Mulvey, 1988; Wagner and Gorelick, 1989; Culver and Shoemaker, 1992; Karatzas

and Pinder, 1993; Tiedeman and Gorelick, 1993; Rizzo and Dougherty, 1996; Minsker

and Shoemaker, 1998; Zheng and Wang, 1999a; Mayer et al., 2001). The combined

simulation and optimization model is appealing because it can account for the complex

behavior of the groundwater system and identify the best management strategy to

achieve a given set of objectives under prescribed constraints (Wagner, 1995a).

Simulation–optimization models have been developed for a variety of applications. A

representative simulation–optimization model is one that seeks to identify the least-cost

strategy to meet specified constraints. Recently, least-cost strategies have often been

applied to groundwater remediation (McKinney and Lin, 1994; Huang and Mayer,

1997; Bear and Sun, 1998; Aly and Peralta, 1999; Smalley et al., 2000; Zheng and

Wang, 2002).

Groundwater remediation is associated with numerous costs and often has time

horizons of up to 30 years or more. Installation of a cleanup system does not mean the

end of a groundwater remediation project. Instead, long-term monitoring of the system’s

performance is needed to ensure that the remediation objectives are being achieved and

the risks to human health and environment are being properly managed (Chien et al.,

2002). Over-sampling is a common problem in groundwater quality monitoring,

and data collection and analysis for long-term monitoring are very

expensive. For a typical contaminated site, several hundred samples may be collected and

analyzed each year at a cost of hundreds of thousands of dollars. Applying simulation–

optimization models to long-term monitoring network design can achieve substantial

cost savings by eliminating unnecessary samples (Chien et al., 2002). This paper

presents a new methodology that builds on the work of Reed et al. (2000a) for cost-

effective groundwater monitoring design using a genetic algorithm and global mass and

moment estimation.

The optimization of long-term monitoring network design can be accomplished using a

variety of approaches. Selecting an appropriate method involves numerous criteria, the

most important of which include the site-specific long-term performance objectives and

the amount and type of available data. Although monitoring network design has been

studied intensively in the past, most studies have only focused on the theoretical

development and simple hypothetical examples or small-scale field problems (e.g.,

Loaiciga, 1989; McKinney and Loucks, 1992; Hudak and Loaiciga, 1993; Christakos and

Killam, 1993; Andricevic, 1993, 1996; James and Gorelick, 1994; Meyer et al., 1994,

1995; Cieniawski et al., 1995; Wagner, 1995b; Bogaert and Russo, 1999; Montas et al.,

2000; Reed et al., 2000a). Often viable approaches for sampling network design in large-

scale applications are variance-based methods (Rouhani, 1985; Rouhani and Hall, 1988;

Hudak and Loaiciga, 1992; Ridley et al., 1995). For example, Rouhani (1985) carried out

seminal work on groundwater sampling network design based on variance reduction

analysis in which the variances of the estimation errors obtained from kriging interpolation

were sequentially decreased by adding more sampling points to a regional-scale water

table monitoring network until the variance reduction became insignificant. Rouhani and

Hall (1988) improved this variance reduction analysis and applied it to groundwater

network design for contaminant plume monitoring.

Two recent studies are of particular interest to this work. Reed et al. (2000a) presented a

methodology for cost-effective monitoring network design that is used to examine the

feasibility of reducing long-term monitoring costs under conditions similar to those at the

Hill Air Force Base, Utah, with a comparatively small study area (only about 0.02

km²). As discussed in more detail in Section 4, it would be difficult to achieve a

satisfactory result applying the methodology developed by Reed et al. (2000a) to a large

field site such as the Massachusetts Military Reservation in Cape Cod, Massachusetts.

Montas et al. (2000) developed a direct partial enumeration method coupled with a

stochastic flow and transport model to characterize a hypothetical contaminant plume by

minimizing the spatial moments subject to sampling cost constraints. Although the

methodology proposed by Montas et al. (2000) was successful in generating near-optimal

sampling networks that satisfied all imposed constraints, it is not suitable for minimization

of monitoring costs. In addition, whether the proposed methodology is applicable to real-

world sites still needs to be verified further.

This paper is related to the work of Zheng and Wang (2002) who demonstrated the

application of optimization modeling to the design of a pump-and-treat system for the

containment and cleanup of a large trichloroethylene (TCE) plume in a shallow,

unconfined sandy aquifer at the Massachusetts Military Reservation in Cape Cod,

Massachusetts. The current work involves the design of a long-term sampling network to

monitor the plume evolution during the remedial operation in the most cost-effective

manner using a genetic algorithm and global plume estimation. The aim of this study is

to minimize the monitoring costs by eliminating data redundancy without significantly

affecting the accuracy and adequacy of sampled data. The methodology presented in this

study is similar in concept to one presented by Reed et al. (2000a) in that an

optimization model coupling a genetic algorithm and global mass estimation is used to

search for sampling plans. The interpolation methods used in this study for global plume

estimates are likewise ordinary kriging and inverse distance weighted interpolation. The most

notable difference for the methodology presented in this paper is that it adds and

emphasizes the plume moment constraints on the optimization procedure, leading to

more accurate sampling designs.

2. Methodology

The simulation–optimization (S/O) model developed in this study for cost-effective

monitoring network design includes four primary components: (1) a groundwater flow and

transport simulator, (2) a global mass estimator, (3) a plume moment estimator, and (4) an

optimal search technique based on a genetic algorithm (GA). Components (1)–(3) are used

to evaluate various constraints for all potential sampling designs, while component (4) is

used to identify the optimal or near-optimal sampling design. The framework of the S/O

model is similar to that proposed by Reed et al. (2000a). The most important difference is

the plume moment constraints introduced and emphasized in the current study.

A prerequisite for applying the monitoring network design model to any field site is the

existence of a calibrated flow and transport model for the study site. The flow and

transport code is used to simulate the contaminant plume from the initial time t0 to the

expected time tm under any given remedial scenario. The simulated plume is assumed to

represent the future conditions to be monitored. If a model node is selected as a potential

monitoring well location, the simulated concentration value at that location is considered

known. The known concentrations at all potential monitoring well locations are used to

construct a new plume through interpolation. The interpolated plume is then compared, in

terms of both mass and first and second moments, with the simulated plume output from

the transport model. If enough nodes were selected as potential monitoring well locations,

the difference between the simulated and interpolated plumes would be minimal. On the

other hand, the more model nodes are selected as potential monitoring well locations, the

higher the capital and sampling costs would be. Thus there is a tradeoff between the

accuracy of the interpolated plume based on monitoring data and the cost-effectiveness of

the monitoring network.

The objective function for each potential sampling design is evaluated in terms of total

monitoring costs while considering the accuracy of estimated global mass and spatial

plume moments based on sampling data. Two interpolation methods, ordinary kriging

(OK) and inverse distance weighting (IDW), are applied to estimate contaminant

concentrations at all unsampled nodes within the model domain. For each potential

design the global contaminant mass and spatial moments of the plume along different

directions are computed using the respective global mass estimator and plume moment

estimator. Moreover, the fitness value of each potential sampling design is expressed as the

combination of objective function (capital and sampling costs) and penalty costs due to the

violations of both global mass and plume moment estimation errors. Then fitness values

are evaluated by a GA to determine which individual sampling designs are allowed to

reproduce and evolve to the next generation. As the generation evolves, the optimal or

near-optimal solution (sampling design) will be found. Fig. 1 shows the framework used in

this study for implementing the cost-effective groundwater sampling network design.

2.1. Flow and transport simulation

In this study, it is assumed that the flow and transport models are based on the three-

dimensional finite-difference flow code, MODFLOW (McDonald and Harbaugh, 1988),

and its solute transport companion, MT3DMS (Zheng and Wang, 1999b). MODFLOW

has a modular structure that allows it to be easily adapted for a particular application.

Fig. 1. Flowchart for monitoring network design using GA and global plume estimation.

MT3DMS also has a modular structure and has a comprehensive set of options and

capabilities for simulating contaminant transport, including advection, dispersion/

diffusion, and chemical reactions under general hydrogeological conditions. Note that

the contaminant plume simulated by the flow and transport models is assumed to represent

the future conditions to be monitored. Thus, the flow and transport models must be

calibrated before they can be used in sampling network design for monitoring future

migration and remediation.

2.2. Global plume estimation

The global plume estimation includes two parts: global mass estimation and spatial

moment estimations. For each potential sampling design the coordinates and concen-

trations of sampled points are used to estimate the contaminant concentrations at all

unsampled nodes within the model domain using OK or IDW interpolation. The

interpolated concentrations are then used to estimate the global mass (zeroth moment) and

the first and second moments. The degree to which the estimates of global mass and higher

moments agree with those directly output from the transport model serves as the criterion

for evaluating the accuracy of individual sampling designs.

2.2.1. Ordinary kriging (OK) method

Kriging is a best linear unbiased estimator (BLUE). It possesses some unique

properties including conditional unbiasedness, smoothing effect, additivity, and exact

interpolation. Kriging, as an interpolation technique, has been applied widely to estimate

hydrogeologic variables at unsampled locations from scatter data points, such as

transmissivity and hydraulic conductivity (e.g., Lavenue and Pickens, 1992; McKinney

and Loucks, 1992; Eggleston et al., 1996; Fabbri, 1997) and contaminant concentrations

(e.g., Zhu et al., 1997; Reed et al., 2000a).

The ordinary kriging (OK) method filters the mean from the simple kriging estimator by

requiring that the kriging weights sum to one (for more details see Isaaks and Srivastava,

1989; Deutsch and Journel, 1998; Olea, 1999). Note that the contaminant concentration

data can be transformed to be approximately normally distributed (e.g., by applying a

logarithmic transform). Even in the non-normal case, the kriging method still gives the

best approximation to the conditional expectation (Vogely et al., 1978).

In this study the kriging packages modified from the Geostatistical Software Library

(GSLIB) (Deutsch and Journel, 1998) are used to estimate the plume concentrations at all

unsampled nodes within the study area. Reed et al. (2000a) mentioned two potential

shortcomings of using kriging for interpolation: (1) computational complexity; and (2)

difficulty of obtaining the semivariogram. While the first can be overcome with the rapid

growth of computing power, the second makes it necessary to have considerable expertise

to apply this interpolation approach to field-scale problems.
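As a rough illustration of the constraint that the kriging weights sum to one, the following minimal sketch solves the ordinary kriging system at a single target point. It is not the modified GSLIB package used in the study. The exponential semivariogram, its parameters, and the function names are hypothetical choices made only for this example.

```python
import numpy as np

def exponential_variogram(h, sill=1.0, range_=10.0, nugget=0.0):
    """Hypothetical semivariogram model (assumption); gamma(0) is set to 0."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + sill * (1.0 - np.exp(-3.0 * h / range_))
    return np.where(h > 0.0, gamma, 0.0)

def ordinary_kriging(xy, values, target, variogram=exponential_variogram):
    """Estimate the value at `target` from scattered samples by ordinary kriging.

    Returns the estimate and the kriging weights (which sum to one).
    """
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Pairwise distances between sample points.
    dists = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)

    # Kriging matrix: semivariances bordered by ones (unbiasedness constraint).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(dists)
    A[n, n] = 0.0

    # Right-hand side: semivariances to the target, plus the constraint value 1.
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - target, axis=1))

    solution = np.linalg.solve(A, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    return float(weights @ values), weights
```

With a zero nugget the estimator reproduces the sampled value when the target coincides with a sample point, which is the exact-interpolation property mentioned above.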

2.2.2. Inverse distance weighted (IDW) interpolation

The IDW interpolation is one of the most commonly used techniques for interpolation

of scatter points. It is based on the assumption that the interpolation should be influenced

more by the nearby points and less by the more distant points. The interpolated value is a

weighted average of the scatter points and the weight assigned to each scatter point

diminishes as the distance from the interpolation point to the scatter point increases

(Vucetic et al., 2000).

The interpolated concentration, $c_{\mathrm{interp}}$, at an unsampled location within the model domain can be expressed as

$$c_{\mathrm{interp}} = \sum_{i=1}^{n_r} w_i c_{S_i} \tag{1}$$

where $n_r$ is the number of sampling points within the search radius from the interpolation point, $c_{S_i}$ is the concentration at the $i$th sampling point, and $w_i$ is the weight function assigned to each sampling point. The classical form of the weight function is:

$$w_i = \frac{r_i^{-p}}{\sum_{j=1}^{n} r_j^{-p}} \tag{2}$$

where $p$ is an arbitrary positive real number called the power parameter (typically, $p=2$), and $r_i$ is the distance from the $i$th sampling point to the interpolation point.

The key feature of IDW interpolation is its numerical simplicity. However, the

assumption that the interpolation should be influenced more by the points nearby and less

by the points far away may result in considerable estimation errors, especially for the case

where the number of interpolated nodes is much larger than that of total sampled nodes.
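The weighted average in Eqs. (1) and (2) can be sketched in a few lines. This is a minimal illustration, not the code used in the study. The function name, the handling of a target that coincides with a sample, and the choice to return `None` when no sample lies within the search radius are assumptions made for this example.

```python
import numpy as np

def idw_interpolate(sample_xyz, sample_conc, target, power=2.0, search_radius=np.inf):
    """Inverse distance weighted estimate at `target`.

    Returns None when no sample lies within `search_radius` (assumed convention
    for a node whose concentration cannot be estimated).
    """
    sample_xyz = np.asarray(sample_xyz, dtype=float)
    sample_conc = np.asarray(sample_conc, dtype=float)
    r = np.linalg.norm(sample_xyz - np.asarray(target, dtype=float), axis=1)

    inside = r <= search_radius
    if not inside.any():
        return None
    r, c = r[inside], sample_conc[inside]

    # A target located exactly on a sample point takes that sample's value.
    hit = r == 0.0
    if hit.any():
        return float(c[hit][0])

    w = r ** (-power)   # r_i^(-p)
    w /= w.sum()        # normalize so the weights sum to one
    return float(w @ c)
```

For two samples with concentrations 10 and 20 located 0.5 and 1.5 length units from the target and $p=2$, the weights are 0.9 and 0.1, so the estimate is 11.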

2.2.3. Spatial moment estimation

Spatial moment estimation has often been used to define the characteristics of a

contaminant plume versus time (Freyberg, 1986; Garabedian et al., 1991; Adams and

Gelhar, 1992; Srivastava and Brusseau, 1996; Ezzedine and Rubin, 1997; Montas et al.,

2000; Vereecken et al., 2000). The zeroth moment measures the global mass of the plume.

The first moment provides the centroid of the plume, while the second moments provide a

measure of the spread of the plume around its centroid (Ezzedine and Rubin, 1997). This

study utilizes all three moments for the global plume estimation.

The global mass of the plume, $\mathrm{mass}$, at some time $t$ is defined by

$$\mathrm{mass}(t) = \int_{\Omega} \eta\, c(\mathbf{u}, t)\, d\Omega \tag{3}$$

where $c(\mathbf{u}, t)$ is the plume concentration at point $\mathbf{u} = (u_1, \ldots, u_{nd})$ and at time $t$ within the model domain, $nd$ denotes the space dimension and is equal to 2 or 3, depending on the type of the flow and transport model, $\eta$ is the effective porosity, and $\Omega$ denotes the aquifer domain. The coordinate location of the center of the plume mass at time $t$ is defined by the first moment about the origin (Freyberg, 1986; Ezzedine and Rubin, 1997)

$$\mu_1^{u_i}(t) = \frac{1}{\mathrm{mass}} \int_{\Omega} \eta\, c(\mathbf{u}, t)\, u_i\, d\Omega, \quad i \in \{1, \ldots, nd\} \tag{4}$$

where $\mu_1^{u_i}$ denotes the first moment (coordinate) about the origin along the $u_i$ axis. For a three-dimensional problem, $\mu_1^{u_1}$, $\mu_1^{u_2}$, and $\mu_1^{u_3}$ denote the first moments along the $x$, $y$, and $z$ directions, respectively, in the Cartesian coordinate system. For a two-dimensional problem, there are only $\mu_1^{u_1}$ and $\mu_1^{u_2}$, representing the first moments along the $x$ and $y$ directions, respectively.

The second moment about the center of mass, describing a measure of the spread of the plume around its centroid, defines a spatial covariance tensor

$$\boldsymbol{\Omega} = \begin{bmatrix} \mu_2^{u_1 u_1} & \cdots & \mu_2^{u_1 u_{nd}} \\ \vdots & \ddots & \vdots \\ \mu_2^{u_{nd} u_1} & \cdots & \mu_2^{u_{nd} u_{nd}} \end{bmatrix} \tag{5}$$

The component terms of the spatial covariance tensor, $\boldsymbol{\Omega}$, at time $t$ can be expressed as

$$\mu_2^{u_i u_j}(t) = \frac{1}{\mathrm{mass}} \int_{\Omega} \eta\, c(\mathbf{u}, t)\, u_i u_j\, d\Omega - \mu_1^{u_i} \mu_1^{u_j}, \quad i, j \in \{1, \ldots, nd\} \tag{6}$$

where $\mu_2^{u_i u_j}$ denotes the second moment (spatial covariance) along the $u_i u_j$ cross direction. In a three-dimensional groundwater system, there are nine component terms for the spatial covariance tensor: $\mu_2^{xx}$, $\mu_2^{xy}$, $\mu_2^{xz}$, $\mu_2^{yx}$, $\mu_2^{yy}$, $\mu_2^{yz}$, $\mu_2^{zx}$, $\mu_2^{zy}$, and $\mu_2^{zz}$. Due to symmetry, the second spatial moment tensor actually includes only six unknown component terms (i.e., $\mu_2^{xy}=\mu_2^{yx}$, $\mu_2^{xz}=\mu_2^{zx}$, and $\mu_2^{yz}=\mu_2^{zy}$). In a two-dimensional groundwater system, the second spatial moment tensor has only three unknown component terms: $\mu_2^{xx}$, $\mu_2^{xy}$, and $\mu_2^{yy}$.

The components of the spatial covariance tensor are physically related to the spread of

the plume concentration distribution about its center of mass. In applications, the

magnitudes of the moments must be estimated from a discrete set of specified point

concentrations. In this study, the interpolated concentrations described above can be used

to estimate the global mass and the first and second moments for comparison with those

determined from the concentration distributions directly output from the transport model

without any interpolation.
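On a model grid, the integrals defining the zeroth, first, and second moments reduce to sums over cells. The sketch below is one way to discretize them, assuming each cell is represented by its center coordinates, a concentration, a cell volume, and an effective porosity. The function name and argument layout are hypothetical.

```python
import numpy as np

def plume_moments(coords, conc, cell_volume, porosity):
    """Discrete zeroth, first, and second spatial moments of a plume.

    coords: (n_cells, nd) cell-center coordinates
    conc: (n_cells,) concentrations
    cell_volume, porosity: scalars or (n_cells,) arrays
    Returns (mass, centroid, covariance tensor).
    """
    coords = np.asarray(coords, dtype=float)
    # Dissolved mass held in each cell: porosity * concentration * volume.
    w = porosity * np.asarray(conc, dtype=float) * cell_volume

    mass = w.sum()                                            # zeroth moment
    centroid = (w[:, None] * coords).sum(axis=0) / mass       # first moments

    # Raw second moments about the origin, then shift to the center of mass.
    raw = (w[:, None, None] * coords[:, :, None] * coords[:, None, :]).sum(axis=0) / mass
    covariance = raw - np.outer(centroid, centroid)
    return mass, centroid, covariance
```

Applying the same function to the simulated concentrations and to the interpolated concentrations gives the two sets of moments that are compared in the design model.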

2.3. Monitoring network design model

The goal of this study is to minimize long-term monitoring costs while preserving the

accuracy of estimated contaminant mass and plume moments based on the sampled data.

The objective function is the summation of total installation/drilling and sampling costs at

all sampled locations. The monitoring problem can be expressed as follows:

$$\text{Minimize } J = C_1 \sum_{i=1}^{n} x_i l_i + C_2 \sum_{i=1}^{n} y_i d_i \tag{7}$$

$$\text{subject to } \mathrm{error}_{\mathrm{mass}} \le \varepsilon_{\mathrm{mass}} \tag{8}$$

$$\mathrm{error}_{\mathrm{moment}} \le \varepsilon_{\mathrm{moment}} \tag{9}$$

where, in the objective function expressed in Eq. (7), $J$ is the management objective in terms of the total costs for sampling and well installation/drilling; $n$ is the total number of potential monitoring wells; $C_1$ is the cost of each sample; $x_i$ is a binary variable indicating whether sampling takes place at well $i$ (yes if $x_i=1$; no if $x_i=0$); and $l_i$ is the number of samples taken at different elevations in well $i$. If sampling takes place at well $i$, the well is considered to be sampled at multiple (i.e., $l_i$) elevations. $C_2$ is the fixed capital cost for installation/drilling per unit depth of well $i$, $d_i$ denotes the depth of the borehole associated with well $i$, and $y_i$ is a binary variable indicating whether well $i$ is drilled (yes if $y_i=1$; no if $y_i=0$).
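The cost in Eq. (7) is a pair of weighted sums over the candidate wells. A minimal sketch follows; the function name is hypothetical and the numbers in the usage note are made-up values, not data from the study.

```python
def monitoring_cost(x, y, levels, depths, c1, c2):
    """Total cost J of a sampling design, following Eq. (7).

    x[i]: 1 if well i is sampled, else 0
    y[i]: 1 if well i must be drilled, else 0
    levels[i]: number of sampling elevations l_i in well i
    depths[i]: borehole depth d_i of well i
    c1: cost per sample; c2: drilling cost per unit depth
    """
    sampling = c1 * sum(xi * li for xi, li in zip(x, levels))
    drilling = c2 * sum(yi * di for yi, di in zip(y, depths))
    return sampling + drilling
```

For example, with three candidate wells, `x = [1, 0, 1]`, `y = [1, 0, 0]`, `levels = [3, 2, 4]`, `depths = [50.0, 40.0, 60.0]`, `c1 = 100` and `c2 = 20`, the sampling term is 100 × 7 = 700 and the drilling term is 20 × 50 = 1000, giving J = 1700.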

Eqs. (8) and (9) are the constraints on the global mass estimation error and plume moment estimation errors, respectively. Eq. (8) specifies that the error term $\mathrm{error}_{\mathrm{mass}}$ in the estimation of the total dissolved contaminant mass must be less than or equal to the prescribed acceptable tolerance $\varepsilon_{\mathrm{mass}}$. The global mass estimation error $\mathrm{error}_{\mathrm{mass}}$ can be obtained by

$$\mathrm{error}_{\mathrm{mass}} = \left| \frac{\mathrm{mass}_{\mathrm{cal}} - \mathrm{mass}_j}{\mathrm{mass}_{\mathrm{cal}}} \right| \tag{10}$$

where $\mathrm{mass}_{\mathrm{cal}}$ is the contaminant mass in the area of interest as determined by the transport model, and $\mathrm{mass}_j$ is the mass estimate as determined by the global mass estimator based on the sampling design $j$. As mentioned above, we assume that the simulated plume represents the future conditions to be monitored. Then, $\mathrm{mass}_{\mathrm{cal}}$ obtained from the transport model is considered the “true” mass of contaminants dissolved in the area of interest; and Eq. (10) defines the absolute relative error between $\mathrm{mass}_{\mathrm{cal}}$ and the estimated mass, $\mathrm{mass}_j$, based on the interpolated plume for the sampling design $j$.

Eq. (9) is the constraint on the first and second moments of the plume along all directions, which requires that the center and shape of the interpolated plume agree as closely as possible with those output from the transport model. Similar to Eq. (8), it specifies that the weighted sum of moment estimation errors, $\mathrm{error}_{\mathrm{moment}}$, must be less than or equal to the prescribed acceptable tolerance $\varepsilon_{\mathrm{moment}}$. The moment error term $\mathrm{error}_{\mathrm{moment}}$ in Eq. (9) can be stated as

$$\mathrm{error}_{\mathrm{moment}} = \sum_{i=1}^{2} \sum_{k} \left| \frac{\mu_{i,\mathrm{cal}}^{k} - \mu_{i,j}^{k}}{\mu_{i,\mathrm{cal}}^{k}} \right| \mathrm{weight}_i^k \tag{11}$$

where $\mu_{\mathrm{cal}}$ is the first or second moment of the simulated plume as output from the transport model, $\mu_j$ is the estimated moment based on the sampling design $j$, $i$ denotes the order of the moment, $k$ is the direction along which the moment is computed, and $\mathrm{weight}_i^k$ is the weight assigned to the $i$th moment along the $k$ direction. The larger the weight assigned to a moment, the more closely the moment based on the sampling design must match the moment determined by the transport model along that direction. In this study, all weights assigned to both first and second moments along different directions are set identically to 1. As a result, the effect of different coefficients on the sampling network design can be compared on an equal footing, as described below in Section 4.
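The two error terms in Eqs. (10) and (11) can be sketched as follows. Storing the moments in dictionaries keyed by (order, direction) is an assumption made for this example, as are the function names. Note that a reference moment equal to zero would make the relative error undefined, and the sketch does not guard against that case.

```python
def mass_error(mass_cal, mass_j):
    """Absolute relative error of the estimated mass, Eq. (10)."""
    return abs((mass_cal - mass_j) / mass_cal)

def moment_error(mu_cal, mu_j, weights=None):
    """Weighted sum of absolute relative moment errors, Eq. (11).

    mu_cal, mu_j: dicts keyed by (order, direction), e.g. (1, "x") or (2, "xy")
    weights: optional dict with the same keys; all weights default to 1,
             as in the study.
    """
    total = 0.0
    for key, reference in mu_cal.items():
        w = 1.0 if weights is None else weights[key]
        total += w * abs((reference - mu_j[key]) / reference)
    return total
```

With hypothetical values `mass_cal = 100` and `mass_j = 90`, the mass error is 0.1. A first moment of 200 estimated as 190 and a second moment of 50 estimated as 55 contribute 0.05 and 0.1, for a moment error of 0.15.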

The goal of the monitoring network design model is to find the optimal sampling plan

from among many alternatives. Eqs. (7)–(9) define an integer-programming problem

whose solution could be obtained using the branch-and-bound or exhaustive enumeration

scheme. Unfortunately, either of these schemes would require prohibitive computational

efforts. For example, if 100 potential monitoring locations exist at a field site, there will be

$2^{100}$ possible sampling designs to be evaluated. Moreover, constraints (8) and (9) involve

the interpolation procedure for global mass estimation and moment evaluations, which

makes the computational burden even more excessive. Thus, to be practically feasible, an

optimization algorithm should only require that the objective function be evaluated for a

fraction of all possible monitoring network designs before the optimal or near-optimal

solution is obtained. Lee and Ellis (1996) compared eight algorithms for nonlinear integer

optimization of two applications and considered the heuristic algorithms promising for

solving realistic groundwater monitoring network design problems. In this study, a

heuristic search technique, GA, is used for this purpose.
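To give a sense of scale for exhaustive enumeration, the short calculation below counts the designs for 100 candidate locations and converts that count to run time. The evaluation rate of one million designs per second is a purely hypothetical assumption and is far faster than an evaluation that includes plume interpolation would be.

```python
n_locations = 100
n_designs = 2 ** n_locations            # each location is either sampled or not

evals_per_second = 1_000_000            # hypothetical evaluation rate (assumption)
seconds_per_year = 365 * 24 * 3600
years = n_designs / (evals_per_second * seconds_per_year)

print(f"{n_designs} designs")
print(f"about {years:.1e} years at {evals_per_second} evaluations per second")
```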

2.4. Implementation of the genetic algorithm

The GA is a random search technique that is designed to mimic some of the key

features of natural selection and natural genetics in order to identify the optimal or near-

optimal solutions in a specified search space. Detailed descriptions of GA can be found in

many references (e.g., Goldberg, 1989; Sen and Stoffa, 1995; Pham and Karaboga, 2000).

In recent years, GA has been shown to be a valuable tool for solving complex optimization

problems in broad fields, including the field of water resources in papers by McKinney

and Lin (1994), Ritzel et al. (1994), Cieniawski et al. (1995), Wagner (1995b), Wang and

Zheng (1997), Aly and Peralta (1999), Cai et al. (2001), Karpouzos et al. (2001), Wang

and Jamieson (2002), and Zheng and Wang (2002).

The GA used here consists of three operators: selection, crossover, and mutation. For

any monitoring network design problem, there are n potential locations. The GA considers

each sampling alternative to be a string (chromosome) consisting of n zero–one variables,

where a value of 1 in the ith digit (bit) represents sampling from the ith location, and 0 no

sampling. The total number of nonzero digits in the string denotes the number of sampling

locations used in the current design.

The first stage of GA is selection. Individual strings (sampling alternatives) are selected

and reproduced on the basis of their fitness values. Strings with high fitness will have a

higher probability of reproducing and contributing offspring to the next generation. For the

monitoring network design problem proposed in Eqs. (7)–(9), the goal is to minimize the

sampling costs while considering the accuracy of global mass estimation and moment

estimation. Thus the fitness function must account for the constraints. To accomplish that,

the fitness measure can be modified by adding the amount of any constraint violation to

the objective function as a penalty:

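$$F = J + V_1 + V_2 \qquad (12)$$

and

$$V_1 = a_1 \frac{\mathrm{error}_{mass} - e_{mass}}{e_{mass}} + a_2 N_{unestimate}, \quad e_{mass} > 0 \qquad (13)$$

$$V_2 = a_3 \left( \mathrm{error}_{moment} - e_{moment} \right) \qquad (14)$$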

where F is the fitness value, V1 and V2 are the amounts of constraint violation with respect

to the global mass and moment estimation errors, respectively; Nunestimate is the number of

points at which concentration is not estimated as a result of no sampling data point within

the specified search radius, and ai (i=1, 2, 3) are penalty coefficients.
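A minimal sketch of the penalized fitness in Eqs. (12)–(14) follows. It assumes that a constraint contributes a penalty only when it is violated (the excess is clipped at zero), which the text implies but does not state explicitly; the function and argument names are hypothetical, and `cost` stands for the objective J of Eq. (7).

```python
def fitness(cost, error_mass, error_moment, n_unestimated,
            a1, a2, a3, e_mass, e_moment):
    """Penalized fitness F = J + V1 + V2, following Eqs. (12)-(14)."""
    if e_mass <= 0:
        raise ValueError("e_mass must be positive (Eq. 13)")
    # Assumption: no penalty is charged when a constraint is satisfied.
    mass_excess = max(0.0, error_mass - e_mass)
    moment_excess = max(0.0, error_moment - e_moment)
    v1 = a1 * mass_excess / e_mass + a2 * n_unestimated
    v2 = a3 * moment_excess
    return cost + v1 + v2
```

With this form, a design that satisfies both tolerances and leaves no node unestimated is ranked purely by its monitoring cost.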

The selection procedure chooses an interim population representing possible solutions via a random number generator. This procedure begins by picking out four individual strings with respective fitness values of Fi (i=1, 2, 3, 4) at random from the initial population. It then compares the "fitness" between F1 and F2, and between F3 and F4, respectively. Next, a copy of the two winners in this process of "tournament selection" is placed in a temporary "mating pool" and subjected to the crossover operator described below. The tournament selection is repeated M/2 times to produce a new interim population, where M is the total number of individual strings in the population (i.e., the population size). As a result, the size of the new interim population is the same as that of the initial population. The selection process is a key step in the GA that embodies the idea of "survival of the fittest" in nature. Accordingly, the selection operator ensures that the best individual "has probability 1.0 of prevailing in the competition while the worst and middle-ranking individuals have probabilities 0 and 0.5, respectively" (McKinney and Lin, 1994).


Note that a string can be selected more than once and placed in the mating pool. The

next step in a GA is crossover. The crossover operator is performed on each selected pair

described above from the mating pool with a certain probability, referred to as crossover

probability, Pc. There are several ways of doing this. Some commonly used are one-point

crossover, two-point crossover, cycle crossover, and uniform crossover (Pham and

Karaboga, 2000). Since the task of sampling network design is to decide whether a

potential sampling location should be selected or not (i.e., 1 or 0), uniform crossover is

most appropriate for this study. With uniform crossover, a selected bit at a specific position

of one string is exchanged with its corresponding bit at the same position of the other

string. After crossover, a new generation is formed with the same number of strings as in

the previous generation.
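The tournament selection and uniform crossover steps described above can be sketched as follows. Several details are assumptions made for illustration: the string with the lower F wins a tournament (F is cost plus penalties), the four contestants are drawn with replacement, each bit is swapped with probability 0.5 during crossover, and M is even. All function names are hypothetical.

```python
import random

def tournament_pair(population, fitness_values, rng):
    """Pick four strings at random and return copies of the two pairwise winners."""
    i1, i2, i3, i4 = (rng.randrange(len(population)) for _ in range(4))
    # Assumption: lower F wins, because F is cost plus penalties.
    w1 = i1 if fitness_values[i1] <= fitness_values[i2] else i2
    w2 = i3 if fitness_values[i3] <= fitness_values[i4] else i4
    return list(population[w1]), list(population[w2])

def uniform_crossover(parent_a, parent_b, pc, rng, swap_prob=0.5):
    """With probability pc, exchange bits position by position."""
    child_a, child_b = list(parent_a), list(parent_b)
    if rng.random() < pc:
        for k in range(len(child_a)):
            if rng.random() < swap_prob:   # swap_prob is an assumed value
                child_a[k], child_b[k] = child_b[k], child_a[k]
    return child_a, child_b

def next_generation(population, fitness_values, pc, rng):
    """Repeat the tournament M/2 times so the new population again has M strings."""
    new_population = []
    for _ in range(len(population) // 2):
        a, b = tournament_pair(population, fitness_values, rng)
        new_population.extend(uniform_crossover(a, b, pc, rng))
    return new_population
```

Each tournament contributes one pair to the mating pool, so M/2 tournaments keep the population size constant.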

The final genetic operator is mutation. In this procedure, all strings in the new

population are checked bit-by-bit and the bit values are randomly reversed according to a

specified probability Pm. Mutation can be carried out during the crossover operation. The

new generation formed after selection, crossover and mutation, is expected to contain

individual sampling designs with improved fitness values.
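A bit-by-bit mutation of the kind described above might look like the following sketch, in which each bit is reversed independently with probability Pm. The function name is hypothetical.

```python
import random

def mutate(chromosome, pm, rng):
    """Reverse each bit independently with probability pm."""
    return [1 - bit if rng.random() < pm else bit for bit in chromosome]
```

For a 160-bit string and Pm=0.08, this reverses about 13 bits per string on average (160 × 0.08 = 12.8).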

These three operators are conceptually simple. Selection seeks to retain those feasible sampling designs with high information content. The crossover operator helps improve sampling efficiency by combining highly informative sampling plans, and the mutation operator aims to prevent the irrecoverable loss of information (Wagner, 1995b). As the generations evolve, the optimal or near-optimal solution

(sampling design) is obtained. Here the number of generations a GA needs to evolve

depends on the number of potential sampling locations, the interpolation method and the

required accuracy of global mass and moment estimates.

Note that the methodology presented in this study identifies the optimal or near-optimal

sampling design for one monitoring period between times tn and tn+1. It can also be used

to solve monitoring network design problems with multiple monitoring periods. For each

monitoring period, the procedures outlined previously can be repeated for different output

times of the flow and transport model. If the ith potential monitoring well is selected for a

particular monitoring period, the capital (installation and drilling) cost is incurred, i.e., the

value of yi in Eq. (7) is 1, for that period. For all other periods, the value of yi will remain 0

since the capital cost should be counted only once. Because the fixed capital cost is

generally higher than the sampling cost at any monitoring location, the sampling design

for one monitoring period affects those of other periods. In other words, the optimal

monitoring network designs for different monitoring periods are not independent.
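The coupling between monitoring periods can be illustrated with a small sketch in which the capital cost of a well is charged only in the first period that uses it. It assumes one sampling event per selected well per period; the function name and the example designs are hypothetical.

```python
def multi_period_cost(designs, sampling_cost, capital_cost):
    """Total cost over several periods; capital cost is charged once per well."""
    installed = set()
    total = 0.0
    for design in designs:                 # one 0/1 list per monitoring period
        for i, bit in enumerate(design):
            if bit == 1:
                total += sampling_cost     # sampling is paid in every period used
                if i not in installed:     # y_i = 1 only in the first period used
                    total += capital_cost
                    installed.add(i)
    return total

# Two periods, three potential wells; well 0 is reused in the second period.
example_total = multi_period_cost([[1, 0, 1], [1, 1, 0]], 500.0, 20000.0)
```

Reusing an installed well costs only the sampling fee, which is why the design for one period influences the designs for the others.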

3. Application to a large-scale field site

The Massachusetts Military Reservation (MMR), located near the town of Falmouth in Cape Cod, Massachusetts, was established in the early 20th century (Fig. 2). Many activities, including operation of aircraft runways, aircraft and vehicle maintenance, landfilling of waste materials, and firefighter training, have contributed to soil and groundwater contamination on the MMR. Since MMR was added to the U.S. Environmental Protection Agency's National Priorities List as a Superfund Site in 1989, more than US$200 million had been spent up to 1999 for soil and groundwater cleanup. The cost estimate for the entire cleanup exceeds US$800 million. Projected costs in 1999 dollars for groundwater remediation alone are estimated to be greater than US$300 million (Zheng and Wang, 2002).

Fig. 2. The elements of a pump-and-treat (PAT) system for the CS-10 TCE plume and the interpolation subdomain at the MMR in Cape Cod, Massachusetts. The gray-scale-filled concentrations represent the peak value of all vertical layers at each horizontal location at the start of the PAT system. The solid dots represent the optimal pumping well locations. The extracted water, after treatment, is reinjected into the infiltration trenches. The triangles along Sandwich Road denote an existing remedial well fence (after Zheng and Wang, 2002).

The Chemical-Spill 10 (CS-10) site is located in the southeast corner of MMR and was

used by the U.S. Air National Guard for vehicle maintenance and storage. Trichloro-

ethylene (TCE) is the primary contaminant at the CS-10 site. The pump-and-treat (PAT)

remedy was selected as the most feasible approach for groundwater remediation at this site

(Zheng and Wang, 2002). An optimal remedial design for the PAT system was presented

by Zheng and Wang (2002). Fig. 2 illustrates the location of the study site and the elements

of the optimal PAT system for the CS-10 plume. The goal of this study is to develop the

most cost-effective sampling network to monitor the performance of the PAT system as

designed by Zheng and Wang (2002). A total of 160 potential monitoring locations are

assumed for this study (Fig. 3). The total mass and the first and second moments for the

interpolated plume based on the 160 potential monitoring locations are in reasonably close

agreement with those calculated directly from the output of the transport model using all

nodal points within the study area.

Fig. 3. The plan view of all potential sampling locations within the interpolation domain. The concentrations

represent the peak values from model layers 12 to 16 at each horizontal location at the end of the first monitoring

period. Only those concentrations greater than the cleanup level of 5 ppb are shown. Sampling locations outside

the main plume are needed for accurate estimation of the global mass and the first and second spatial moments.


The flow and transport model for the CS-10 site is based on the MODFLOW code

(McDonald and Harbaugh, 1988) and MT3DMS (Zheng and Wang, 1999b). The model

consists of 159 columns, 161 rows and 21 layers, covering an area of approximately 57

km² (22 square miles). The horizontal grid spacing is 34 m (110 ft) in the vicinity of the

TCE plume, increasing gradually toward the boundaries. The vertical layer thickness

ranges from less than 1.5 m (5 ft) to more than 15 m (50 ft). Key input data used in the

flow and transport model are listed in Table 1. More details of the flow and transport

model for the CS-10 site can be found in Zheng and Wang (2002).
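As a quick consistency check on Table 1, the first-order decay coefficient and the quoted half-life are related by t½ = ln 2 / λ. The sketch below assumes a 365.25-day year; the variable names are illustrative.

```python
import math

decay_per_day = 3.16e-5                           # first-order decay coefficient (Table 1)
half_life_days = math.log(2) / decay_per_day      # t_half = ln(2) / lambda
half_life_years = half_life_days / 365.25         # assumed year length
```

The result is close to the 60-year half-life listed in the table.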

Note that the computational complexities of kriging and IDW interpolation are $O(m\,n_s^3)$ and $O(m\,n_s^2)$, respectively (Reed et al., 2000a), where $m$ is the number of nodes requiring interpolation and $n_s$ is the number of sampled TCE concentrations within the model domain. Further note that $m$ is much larger than $n_s$. To reduce the computational effort, a

subdomain of the original numerical grid was selected for performing global mass

estimation and moment evaluation. The horizontal subdomain consists of an 81 by 122

grid as shown in Fig. 3. Each grid cell is a square of a uniform spacing, 34 m (110 ft),

along columns and rows. Thus, for each layer, the size of the computational grid was

reduced from 25,599 to 9882 nodes within the interpolation subdomain. In this study the

interpolation subdomain was vertically restricted to five model layers (12–16 from top

down) that contained most of the TCE plume. This field application example only

considers the first monitoring period, assumed to cover the first 5 years since the

beginning of the PAT remediation.
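The node counts quoted above follow directly from the grid dimensions, as the short check below shows. The helper `relative_work` is a hypothetical operation-count proxy based on the stated complexities, and the value of 55 samples is only an example; asymptotic counts of this kind are not run-time predictions.

```python
# Node counts per layer, from the grid dimensions quoted in the text.
full_nodes_per_layer = 159 * 161      # columns x rows of the full model grid
sub_nodes_per_layer = 81 * 122        # interpolation subdomain
n_layers = 5                          # model layers 12-16

m = sub_nodes_per_layer * n_layers    # nodes requiring interpolation
fraction_kept = sub_nodes_per_layer / full_nodes_per_layer

def relative_work(m_nodes, n_samples, exponent):
    """Operation-count proxy m * ns**exponent (3 for kriging, 2 for IDW)."""
    return m_nodes * n_samples ** exponent

# Hypothetical example with ns = 55 sampled locations.
ratio = relative_work(m, 55, 3) / relative_work(m, 55, 2)
```

The subdomain retains roughly 39% of the nodes in each layer, and restricting the interpolation to five of the 21 layers reduces m further.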

To approximate a normal distribution, the original TCE concentrations were transformed into

logarithmic form. The type of the semivariogram structure for TCE concentrations was

approximately fitted to an exponential model (Deutsch and Journel, 1998):

$$\gamma(h) = c \left[ 1 - \exp\left( -\frac{3h}{a} \right) \right] \qquad (15)$$

where c is the positive variance contribution value or sill value, a is the effective range (a/3 is the integral range), and h is the variable denoting the spatial distance between data

along some direction. For this study, the contaminant plume was determined to have a

longer correlated range in the longitudinal direction (along the y axis) than in the

transverse direction (along the x axis). Thus the semivariogram model used for the TCE

plume was anisotropic. The parameter c was determined to be 0.8, and the parameter a was determined to be 1829 m (6000 ft) along the y axis and 1219 m (4000 ft) along the x axis.

Table 1
Primary input data used in the CS-10 TCE transport model^a

| Parameter | Value |
|---|---|
| Porosity | 0.3 |
| Longitudinal dispersivity | 11.0 m (35 ft) |
| Ratio of horizontal transverse to longitudinal dispersivity | 0.1 |
| Ratio of vertical transverse to longitudinal dispersivity | 0.01 |
| Effective molecular diffusion coefficient | 0 |
| Retardation factor for linear sorption | 1.23 |
| First-order decay coefficient for both dissolved and sorbed phases (half-life) | 3.16×10⁻⁵ day⁻¹ (60 years) |

^a Adapted from Zheng and Wang (2002).
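The exponential model of Eq. (15), with the sill and the two effective ranges quoted above, can be sketched as follows. The treatment of anisotropy is an assumption: the sketch uses geometric anisotropy, rescaling each lag component by the effective range in that direction, which may differ from the implementation actually used. The function name is hypothetical.

```python
import math

def exp_semivariogram(hx, hy, sill=0.8, range_x=1219.0, range_y=1829.0):
    """Exponential semivariogram of Eq. (15) for a lag vector (hx, hy) in metres."""
    # Assumption: geometric anisotropy, each lag component scaled by its range.
    h_scaled = math.hypot(hx / range_x, hy / range_y)
    return sill * (1.0 - math.exp(-3.0 * h_scaled))

# At a lag equal to the effective range, the model reaches about 95% of the sill.
gamma_at_range_y = exp_semivariogram(0.0, 1829.0)
```

For the same separation distance, the semivariogram is smaller along y than along x, which reflects the longer correlation range in the longitudinal direction.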

Note that the use of normal or lognormal transformation implies that the concentrations

are multi-Gaussian distributed. This assumption will result in a "destructurization" effect, which means that the extreme concentrations (near-zero concentrations and plume-center

concentrations) are less spatially correlated (Reed et al., 2004). If the Gaussian assumption

is inappropriate, the assumption of less correlation between the extreme concentrations can

significantly affect the uncertainty evaluation and misrepresent the real site conditions

(Reed et al., 2004).

According to the effective range selected above and the rule recommended by Reed et

al. (2000a), the maximum anisotropic search radii for kriging were defined to be 1829 m

(6000 ft) along the y axis and 1219 m (4000 ft) along the x axis, respectively. Meanwhile,

the search radii for IDW interpolation were set equal to those for kriging to facilitate the

comparison between the two interpolation techniques. Although Reed et al. (2000b) recommended setting Pm to the inverse of the population size (1/M), a somewhat higher Pm was found empirically to work well in this study. The GA operators and parameters were set identically for all runs: M=800, Pc=0.85, and Pm=0.08. The number of generations was set to 80 for all optimization runs except for those mentioned below. In the following

section, the results will be presented and the effects of using different coefficients in

constraints (8) and (9) will be discussed.
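To illustrate how anisotropic search radii interact with interpolation, the sketch below implements a simple two-dimensional IDW estimator that uses only the samples inside an axis-aligned search ellipse and counts the nodes left unestimated. The distance exponent of 2, the elliptical window, the use of unscaled distances for the weights, and all names are assumptions made for illustration, not the configuration used in this study.

```python
import math

def idw_estimate(nodes, samples, values, radius_x, radius_y, power=2.0):
    """IDW at each node using only samples inside the search ellipse.

    Returns (estimates, n_unestimated); unestimated nodes are None.
    """
    estimates = []
    for x, y in nodes:
        num = den = 0.0
        exact = None
        for (sx, sy), v in zip(samples, values):
            dx, dy = sx - x, sy - y
            if (dx / radius_x) ** 2 + (dy / radius_y) ** 2 > 1.0:
                continue                  # outside the search ellipse
            d = math.hypot(dx, dy)
            if d == 0.0:
                exact = v                 # node coincides with a sample
                break
            w = 1.0 / d ** power
            num += w * v
            den += w
        if exact is not None:
            estimates.append(exact)
        elif den > 0.0:
            estimates.append(num / den)
        else:
            estimates.append(None)        # no sample within the search radii
    n_unestimated = sum(e is None for e in estimates)
    return estimates, n_unestimated
```

The count of unestimated nodes returned here corresponds to the quantity penalized through the coefficient a2 in Eq. (13).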

4. Results and discussion

In this study, it is assumed that the cost of each sampling (C1li) is US$500, and the

fixed cost (installation/drilling) per well (C2di) is US$20,000. The total mass of

dissolved TCE, masscal, within the model subdomain from layer 12 to layer 16, was

583.7 kg at the end of the first 5-year monitoring period according to the transport

model. Note that the current analysis does not address the uncertainty in the flow and

transport model. The model prediction is assumed to represent the future condition in the

field. Fig. 3 shows the peak concentration value of TCE of model layers 12–16 at each

horizontal location at the end of the first monitoring period.

4.1. Comparison of OK and IDW methods

Fig. 4a shows the plan view of the near-optimal sampling design based on the OK interpolation scheme. The parameters for the fitness evaluation function given in Eqs. (12)–(14) are set to a1=5.0×10⁴, a2=5.0, a3=3.0×10⁵, emass=0.05, and emoment=0 (parameter set 1, or PM1). The "optimal" design includes 55 monitoring wells, and the estimated TCE mass

within the subdomain of interest is 566.3 kg, yielding an estimation error of 2.96%. The

average estimation error for the TCE plume within the subdomain of interest is 0.43% for

the first moments and 34.35% for the second moments. Compared with the initial design

using all 160 potential sampling locations the optimization model reduces the total costs

by 65.6% within the prescribed mass estimation error. The contours of the interpolation

plume for model layer 14 based on the 55 sampling locations are shown in Fig. 4a.
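The reported cost savings can be checked with a few lines of arithmetic. The sketch assumes that every selected well incurs one installation cost and one sampling cost in the monitoring period, so the saving depends only on the number of wells relative to the 160 potential locations; the function names are hypothetical.

```python
SAMPLING_COST = 500.0      # US$ per sampling
CAPITAL_COST = 20000.0     # US$ per well (installation/drilling)
N_POTENTIAL = 160          # potential monitoring locations

def total_cost(n_wells):
    # Assumption: each selected well is installed and sampled once in the period.
    return n_wells * (SAMPLING_COST + CAPITAL_COST)

def cost_savings_percent(n_wells):
    """Saving relative to sampling all potential locations, in percent."""
    return 100.0 * (1.0 - total_cost(n_wells) / total_cost(N_POTENTIAL))

savings = {n: cost_savings_percent(n) for n in (55, 47, 90, 71, 52)}
```

Under this assumption, the computed values agree with the savings reported for the corresponding designs to within rounding, for example about 65.6% for 55 wells and about 70.6% for 47 wells.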


Fig. 4. The comparison of the optimal sampling designs based on OK and IDW methods for parameter set PM1:

(a) OK-based design; and (b) IDW-based design. The solid triangles represent potential monitoring locations. The

circle dots denote the optimal sampling design. The solid line contours show the interpolated concentration

distribution in model layer 14 at the end of the first monitoring period, while the gray-scale contour map

represents the corresponding concentration distribution output by the transport model.


Fig. 4b shows the "optimal" sampling design of 47 monitoring wells based on the IDW interpolation scheme; all other parameters are the same as those for kriging. Under the IDW scheme, the "optimal" sampling design by the GA reduces the total costs by 70.6%. However, the total TCE mass interpolated from the 47 wells is 792.1 kg, yielding an estimation error of 35.7% relative to the mass calculated by the transport model, masscal. The average estimation error of the TCE plume within the subdomain of interest is 4.03% for the first moments and up to 128.31% for the second moments. The contours of the IDW interpolated plume for model layer 14 are also shown in Fig. 4b. A comparison of the results for parameter set PM1 in Table 2 shows that the IDW interpolation method results in greater cost savings, whereas the global plume estimation based on the OK method is more accurate.

The amount of cost savings and the accuracy of mass and moment estimation are affected by the choice of various coefficients in the fitness function. When the penalty coefficients become large enough, the monitoring costs will increase dramatically. Fig. 5 shows the near-optimal sampling designs obtained under increased penalty coefficients, a1=5.0×10⁷, a2=5.0, a3=3.0×10⁸ (PM2). Also shown in Fig. 5a and b are the interpolated TCE plumes for model layer 14 based on the kriging and IDW schemes, respectively. Under the OK scheme, the global mass estimated for the subdomain of interest is 561.8 kg, resulting in an estimation error of 3.75%. The "optimal" design requires 90 monitoring wells, reducing the total costs by 43.8% relative to the total 160 potential wells. The average estimation error of the TCE plume is 0.42% for the first moments and 2.10% for the second moments. In contrast, the global mass estimated under the IDW scheme is 564.0 kg, yielding an estimation error of 3.38%. Only 71 monitoring wells are required, reducing the total costs by 55.6%. However, the average estimation errors of the TCE plume within the subdomain of interest are 1.59% and 83.61%, respectively, for the first and second moments.

Table 2
Comparisons of the sampling designs based on OK and IDW interpolation methods and the effects of various coefficients on optimal sampling designs

| Parameter set^a | Interpolation method | No. of optimal sampling wells | Potential cost savings (%) | errormass (%) | errormoment, 1st (%)^b | errormoment, 2nd (%)^b |
|---|---|---|---|---|---|---|
| PM1 | OK | 55 | 65.6 | 2.96 | 0.43 | 34.35 |
| PM1 | IDW | 47 | 70.6 | 35.70 | 4.03 | 128.31 |
| PM2 (a1=5.0×10⁷ and a3=3.0×10⁸) | OK | 90 | 43.8 | 3.75 | 0.42 | 2.10 |
| PM2 (a1=5.0×10⁷ and a3=3.0×10⁸) | IDW | 71 | 55.6 | 3.38 | 1.59 | 83.60 |
| PM3 (a3=0) | OK | 52 | 67.5 | 2.89 | 1.31 | 61.02 |
| PM4 (emass=0.10) | OK | 55 | 65.6 | 2.96 | 0.43 | 34.35 |
| PM5 (emass=0.10 and a3=1.5×10⁵) | OK | 47 | 70.6 | 12.13 | 2.15 | 34.58 |
| PM6 (a2=0) | OK | 55 | 65.6 | 2.96 | 0.43 | 34.35 |
| PM7 (a1=0) | OK | 55 | 65.6 | 2.96 | 0.43 | 34.35 |

The three error columns are the global plume estimation errors.

^a PM1 is set to a1=5.0×10⁴, a2=5.0, a3=3.0×10⁵, emass=0.05, and emoment=0; the others (PM2–PM7) are the same as PM1 except for the specified value in the parentheses. Note that the tolerance emass is specified as a fraction, not a percentage.

^b The 1st and 2nd errormoment columns give the average estimation errors for the first and second moments, respectively.

From the viewpoint of economics, the sampling network design obtained using the

IDW method is more effective than that using kriging. However, the comparison of Fig. 5a

and b indicates that the TCE plume interpolated by kriging (solid lines) agrees with that

output by the transport model (gray-scale-filled) significantly better than that interpolated

by IDW. The reason that the IDW interpolation scheme results in a greater mass estimation

error (Fig. 4b) and less accurate plume configuration (Fig. 5b) is that IDW is a lower-

accuracy interpolation scheme compared to kriging. The cost-effectiveness of a sampling

design should not be attained at the expense of less accuracy in the global mass and

moment estimates. Thus, for large-scale field applications, IDW is not suitable for use as

an interpolation scheme in optimal sampling design. The next section will discuss the

effects of fitness function parameters on the sampling design based on the OK method.

Fig. 5. The comparison of the optimal sampling designs based on OK and IDW methods for parameter set PM2: (a) OK-based design; and (b) IDW-based design. All symbols are the same as defined in Fig. 4.

4.2. Effect of fitness function parameters on optimal sampling designs

To compare the results with and without the moment constraints, the penalty coefficient of moment violations in PM1 is changed to a3=0 (PM3). Without the moment constraints, the "optimal" design by the GA includes 52 monitoring wells, and the estimated TCE mass dissolved within the subdomain of interest is 600.6 kg, yielding an estimation error of 2.89%. Compared with the initial design containing all 160 potential

sampling locations, the near-optimal design reduces the total monitoring costs by 67.5%

within the prescribed mass estimation error. In view of this, there seems to be no obvious

difference between the designs with and without moment constraints. However, without

moment constraints, the average estimation error of the TCE plume within the study

subdomain is 1.31% for the first moments and 61.02% for the second moments. The

latter is nearly twice the average error of 34.35% with moment constraints. The contours

of the interpolated TCE plume for model layer 14 based on the 52 sampling locations

obtained without moment constraints are shown in Fig. 6. Compared with Fig. 4a, it is

clear that the addition of moment constraints leads to significantly closer agreement

between the interpolated plume (solid lines) and the calculated plume directly output

from the transport model (gray-scale-filled), particularly for the outer contour of 5 ppb.

Thus it is important to add the moment constraints to the optimal sampling design model.

This can be further demonstrated as follows.

If the value of the mass estimation error tolerance emass is changed from 0.05 in PM1 to 0.10 in PM4, we get an interesting result showing no difference between the two cases, which can be seen by comparing Figs. 7a and 4a. But if the penalty coefficient for moment violations (a3) is decreased from 3.0×10⁵ in PM4 to 1.5×10⁵ in PM5, there are significant differences between the results for PM4 and PM5. Fig. 7b shows the near-optimal sampling design using PM5 and the interpolated TCE plume for model layer 14 as compared with that directly output from the transport model. The "optimal" sampling design by the GA consists of 47 monitoring wells and the global mass estimate using the 47 sampling locations is 512.8 kg. This yields an estimation error of 12.13%, which is larger than 2.96% for PM4 (and also PM1). For PM5 the average estimation error of the TCE plume is 2.15% for the first moments and 34.58% for the second moments. Recall that for PM4 (and also PM1) the average errors for the first and second moments are 0.43% and 34.35%, respectively. While the average moment errors for PM4 and PM5 are quite similar, the difference between the global mass estimation errors is rather substantial due to different penalty coefficients for moment violations. The effect of such a difference can be clearly seen by comparing Fig. 7a and b.

Fig. 6. The optimal OK-based sampling design for parameter set PM3 (without moment constraints). All symbols are the same as defined in Fig. 4.

Furthermore, if the penalty coefficient a2 for PM1 is set to zero in PM6 (i.e., no penalty

for unestimated nodes) and the penalty coefficient a1 for PM1 is set to zero in PM7 (i.e., no

mass constraint), the "optimal" designs by the GA for both PM6 and PM7 are identical to

that for PM1. The moment constraints, therefore, are necessary and more important than the

mass constraint for the purpose of reducing the monitoring costs while ensuring the

accuracy of contaminant mass and spatial moments based on the sampled data. Conversely,

if proper penalty coefficients for moment violations are selected, the optimal or near-

optimal design can also be obtained without imposing constraints on the mass estimate or

the number of unestimated nodes.


Fig. 7. The comparison of the optimal OK-based sampling designs with the penalty coefficient for moment violations (a3): (a) sampling design using a3=3.0×10⁵ (as in PM4); and (b) sampling design using a3=1.5×10⁵ (as in PM5). All symbols are the same as defined in Fig. 4.


The result for PM2 demonstrates that the optimization model can find a sampling design

that is accurate in terms of global mass estimation and moment estimation if the penalty

coefficients are several orders of magnitude (approximately three) larger than the capital

and sampling costs of a monitoring well. However, if the penalty coefficients are reduced to

approximately the same order as the capital and sampling costs, the resulting sampling

design will be more cost-effective while preserving the accuracy of global mass and

moment estimates except for the second moments (Fig. 4a). Whether the sampling design resulting from PM1 or PM2 is used is up to the groundwater manager, who must weigh the trade-off between additional monitoring costs and more accurate mass and plume estimations. The sampling design of 55 monitoring wells as shown in Fig. 4a seems to

possess the right balance between the cost-effectiveness and the accuracy of the mass and

moment estimations. The design as shown in Fig. 5a is more accurate in terms of the mass

and moment estimation, but it is likely too expensive to be practically useful. The effects of

objective function parameters on OK-based sampling designs are summarized in Table 2.

4.3. Discussion of computational efforts

All computations were carried out on a desktop personal computer (PC) with a 2.20-

GHz Pentium 4 processor and the Windows XP operating system. Given a flow and


transport model, the run time for a sampling design optimization problem depends on two

primary factors: one is the speed of GA convergence and the other is the choice of

interpolation method.

It is difficult to know when the optimal or near-optimal solution is attained using GA.

Few guidelines have been suggested in the literature for determining the stopping criterion

for GA. Reed et al. (2000a) pointed out that two conditions must be met before the GA

converges. The first condition is that a single subset of the potential sampling locations

must be selected by about 90% of the individuals within the last generation. The second

one requires that none of the remaining sampling locations be sampled by more than

10% of the individuals within the last generation. Reed et al. (2000b) presented a simple

three-step method for the GA to determine the number of control parameters. They also

proposed the relationships for the population size, convergence rates and genetic drift, and

demonstrated their proposed methodology through a long-term groundwater monitoring

application (Reed et al., 2000a,b). However, these suggested rules are not directly

applicable to this study because a different GA is used. For our study, we have checked

that for PM1 the OK-based sampling design has no improvement in the objective function

from generation 58 to generation 100. For the IDW-based design, the objective function

stops improving even sooner. Thus, to ensure mature convergence, the number of

generations is set to 80 for all optimization runs in this study. Fig. 8 shows the evolutions

of the OK-based and IDW-based objective functions versus the number of generations,

respectively, for the PM1 run. Increasing the population size from 800 to 2000 does not

affect the objective function, but is much more time-consuming.
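For illustration only, the two convergence conditions attributed to Reed et al. (2000a) can be read as a per-location frequency test on the last generation. This is one possible interpretation, with hypothetical names and thresholds, and it is not the stopping rule used in this study, which relies on a fixed number of generations.

```python
def has_converged(population, high=0.90, low=0.10):
    """Frequency test: every location is sampled by >= high or <= low of individuals."""
    m = len(population)
    n = len(population[0])
    # Fraction of individuals in the generation that sample each location.
    freq = [sum(ind[i] for ind in population) / m for i in range(n)]
    return all(f >= high or f <= low for f in freq)
```

A population in which a location is sampled by about half of the individuals would fail this test, indicating that the GA has not yet settled on whether to include that location.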

The second factor is the choice of interpolation method. Using kriging, it takes an average of 50 min for each generation to carry out 800 estimates of global mass and first and second moments. Completion of an optimization run requires a total run time of 67 h. In contrast, IDW interpolation is so fast that each generation is evaluated in an average of 7 min and an optimization run is completed in 9 h. In any case, the run time of less than 70 h for the OK-based model is considered acceptable for real-world applications, and the run time can be expected to decrease quickly as desktop computers become more and more powerful.

Fig. 8. Evolution of the fitness function values over GA generations using OK and IDW methods under parameter set PM1.
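The total run times quoted above are consistent with 80 generations at the stated per-generation averages, as the following arithmetic sketch shows. The variable names are illustrative.

```python
GENERATIONS = 80
POPULATION_SIZE = 800

minutes_per_generation = {"OK": 50.0, "IDW": 7.0}

# Total run time in hours for one optimization run with each method.
run_hours = {name: GENERATIONS * minutes / 60.0
             for name, minutes in minutes_per_generation.items()}

# Average wall-clock time per design evaluation, in seconds.
seconds_per_design = {name: minutes * 60.0 / POPULATION_SIZE
                      for name, minutes in minutes_per_generation.items()}
```

This gives roughly 67 h for the OK-based run and a little over 9 h for the IDW-based run, or a few seconds per evaluated design with kriging.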

5. Conclusions

A new simulation–optimization methodology is developed in this study that

combines a numerical flow and transport simulator, a global mass estimator, a plume

moment estimator, and a genetic algorithm to design cost-effective long-term

monitoring networks under general field conditions. Application of the methodology

to monitoring of plume evolution during a pump-and-treat operation at the MMR site

indicates that potential cost savings up to 65.6% may be achieved without significant

loss of accuracy in global mass and plume moment estimations. The methodology can

also be used to consider the trade-off between the conflicting demands for reduced

monitoring costs and increased accuracy in mass and plume moment estimations.

For the monitoring network design model presented in this paper, the IDW-based

interpolation method is computationally more efficient than the OK-based method and

results in more cost-effective monitoring designs. However, the OK-based method

leads to more accurate mass and plume moment estimations. The plume interpolated

by IDW based on sampled data can deviate significantly from the plume output by the

transport model. Thus the IDW-based method is not recommended for solving the

groundwater monitoring network design problem under general field conditions.

Although the optimization model using the OK-based interpolation method is more

time-consuming, it remains feasible on today’s desktop PCs.
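
For readers unfamiliar with IDW, a minimal sketch of the interpolation step follows. The function name `idw`, the sample layout, and the default exponent of 2 are assumptions for illustration and are not taken from the model described in this paper. Because the weights depend only on distance, no semivariogram has to be fitted, which is one reason IDW is cheaper than kriging.

```python
import math


def idw(samples, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    samples: list of ((x, y, z), value) pairs (hypothetical layout).
    power:   distance exponent; 2.0 is an assumed, commonly used default.
    """
    num = 0.0
    den = 0.0
    for point, value in samples:
        d = math.dist(point, target)
        if d == 0.0:
            # Target coincides with a sampled location: honor the datum.
            return value
        w = d ** -power
        num += w * value
        den += w
    if den == 0.0:
        raise ValueError("at least one sample is required")
    return num / den


# Midway between two samples, the estimate is their average.
estimate = idw([((0, 0, 0), 10.0), ((2, 0, 0), 20.0)], (1, 0, 0))
```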

This study shows that the first and second moment constraints are more important

than the mass (i.e., zeroth moment) constraint. Without the moment constraints, the

plume interpolated from the sampled data cannot sufficiently match the plume output

by the transport model, which defeats the very purpose of plume monitoring. If the

penalty coefficient for the moment constraints is set appropriately, an optimal or near-

optimal design can be reached without the mass constraint. The effects of various

coefficients for the objective function on sampling design indicate that the penalty costs

can be set approximately 5–20 times the expected real monitoring costs to obtain an

optimal or near-optimal sampling design that is both cost-effective and sufficiently

accurate in terms of mass and moment estimations. Also, it is necessary to set the error

tolerances for the plume moment constraints as small as possible. Because the number

of sampled data points is always much smaller than the total number of model nodes in a

numerical simulation model, it is rather difficult to reduce the second moment

estimation errors. Finally, it should be pointed out that considerable expertise and time

may be required to determine the plume semivariogram needed for kriging-based

interpolation.
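
The role of the penalty costs can be illustrated with a simplified penalty formulation. This is a sketch under stated assumptions, not the objective function of this study: the names `Tolerances` and `penalized_cost`, the fixed penalty per violated constraint, and all numbers in the usage lines are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerances:
    """Allowed relative estimation errors (hypothetical names)."""
    mass: float
    first_moment: float
    second_moment: float


def penalized_cost(n_samples, cost_per_sample, errors, tol, penalty):
    """Monitoring cost plus a fixed penalty per violated constraint.

    errors: dict of relative errors keyed like the Tolerances fields.
    penalty: cost added for each constraint whose tolerance is exceeded.
    """
    monitoring_cost = n_samples * cost_per_sample
    violated = sum(
        1
        for name in ("mass", "first_moment", "second_moment")
        if errors[name] > getattr(tol, name)
    )
    return monitoring_cost + penalty * violated


# Assumed numbers: 30 samples at unit cost, penalty set to 10 times that cost.
tol = Tolerances(mass=0.05, first_moment=0.05, second_moment=0.10)
penalty = 10 * 30 * 1.0
feasible = penalized_cost(
    30, 1.0, {"mass": 0.01, "first_moment": 0.02, "second_moment": 0.08}, tol, penalty
)
infeasible = penalized_cost(
    30, 1.0, {"mass": 0.01, "first_moment": 0.02, "second_moment": 0.25}, tol, penalty
)
```

With a penalty several times larger than the monitoring cost, any design that violates a tolerance scores worse than every feasible design of comparable size, so the search is steered toward designs that satisfy the constraints.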

It is noteworthy that the solutions to the sampling design optimization model for this

study are based on a single-objective GA. One run of the single-objective GA model

produces only a single sampling network design, the most cost-effective one subject to

a prescribed set of constraints. Reed et al. (2001) proposed the use of a multi-objective

GA such as the Non-dominated Sorting Genetic Algorithm (NSGA) to determine a trade-

off curve between the optimal cost-effectiveness and the reduced global plume estimation

error. The advantage of a multi-objective GA is appealing, but its applicability to a large

field-scale problem should be tested and evaluated in future studies.
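
To make the idea of a trade-off curve concrete, the sketch below filters a list of candidate designs down to the non-dominated ones, treating lower cost and lower estimation error as better. It shows only the dominance test that multi-objective GAs of the NSGA type build on, not NSGA itself; the function name `pareto_front` and the cost and error values are hypothetical.

```python
def pareto_front(designs):
    """Return the non-dominated (cost, error) pairs, sorted by cost.

    designs: list of (cost, error) tuples; lower is better in both.
    """
    front = []
    for i, (ci, ei) in enumerate(designs):
        # A design is dominated if another is no worse in both objectives
        # and strictly better in at least one.
        dominated = any(
            (cj <= ci and ej <= ei) and (cj < ci or ej < ei)
            for j, (cj, ej) in enumerate(designs)
            if j != i
        )
        if not dominated:
            front.append((ci, ei))
    return sorted(front)


# Hypothetical (number of samples, relative error) pairs.
candidates = [(10, 0.5), (20, 0.2), (20, 0.4), (30, 0.2), (40, 0.05)]
trade_off = pareto_front(candidates)
```

Each point on the resulting front is a design for which no cheaper alternative is at least as accurate, which is the information a single-objective run cannot provide.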

Further research should consider the sampling frequency as well as other techniques of

plume interpolation. Future work will also investigate the uncertainties in the flow and

transport model so that robust monitoring strategies can be developed for the range of

possible conditions existing in the subsurface. In addition, contaminant source

identification using an existing network of monitoring wells will be addressed. Important

issues, such as whether the existing monitoring wells are adequate or where new wells

should be added, will be explored.

Acknowledgments

This study is supported in part by DuPont Company and the National Natural Science

Foundation of China (Nos. 40002022 and 40472130). We are grateful to Gaisheng Liu of

the University of Alabama who provided the initial source code for computing the spatial

moments of a three-dimensional contaminant plume. We also thank Albert Valocchi,

Patrick Reed, and two anonymous reviewers whose constructive comments have led to

significant improvement of this manuscript.

References

Adams, E.E., Gelhar, L.W., 1992. Field study of dispersion in a heterogeneous aquifer: 2. Spatial moments

analysis. Water Resour. Res. 28 (12), 3293–3307.

Ahlfeld, D.P., Mulvey, J.M., Pinder, G.F., 1988. Contaminated groundwater remediation design using simulation,

optimization, and sensitivity theory, 2. Analysis of a field site. Water Resour. Res. 24 (5), 443–452.

Aly, A.H., Peralta, R.C., 1999. Optimal design of aquifer cleanup systems under uncertainty using a neural

network and a genetic algorithm. Water Resour. Res. 35 (8), 2523–2532.

Andricevic, R., 1993. Coupled withdrawal and sampling designs for groundwater supply models. Water Resour.

Res. 29 (1), 5–16.

Andricevic, R., 1996. Evaluation of sampling in the subsurface. Water Resour. Res. 32 (4), 863–874.

Bear, J., Sun, Y., 1998. Optimization of pump-treat-inject (PTI) design for the remediation of a contaminated

aquifer: multi-stage design with chance constraints. J. Contam. Hydrol. 29, 225–244.

Bogaert, P., Russo, D., 1999. Optimal spatial sampling design for the estimation of the variogram based on a

least squares approach. Water Resour. Res. 35 (4), 1275–1289.

Cai, X., McKinney, D.C., Lasdon, L.S., 2001. Solving nonlinear water management models using a combined

genetic algorithm and linear programming approach. Adv. Water Resour. 24, 667–676.

Chien, C.C., Medina Jr., M.A., Pinder, G.F., Reible, D.R., Sleep, B.E., Zheng, C. (Eds.), 2002. Environmental

Modeling and Management: Theory, Practice and Future Directions. Today Media.

Christakos, G., Killam, B.R., 1993. Sampling design for classifying contamination level using annealing search

algorithm. Water Resour. Res. 29 (12), 4063–4076.

Cieniawski, S.E., Eheart, J.W., Ranjithan, S., 1995. Using genetic algorithms to solve a multiobjective

groundwater monitoring problem. Water Resour. Res. 31 (2), 399–409.

Culver, T.B., Shoemaker, C.A., 1992. Dynamic optimal control for groundwater remediation with flexible

management periods. Water Resour. Res. 28 (3), 629–641.

Deutsch, C.V., Journel, A.G., 1998. GSLIB: Geostatistical Software Library and User’s Guide, 2nd ed. Oxford

Univ. Press, New York.

Eggleston, J.R., Rojstaczer, S.A., Peirce, J.J., 1996. Identification of hydraulic conductivity structure in sand and

gravel aquifer: Cape Cod data set. Water Resour. Res. 32 (5), 1209–1222.

Ezzedine, S., Rubin, Y., 1997. Analysis of the Cape Cod tracer data. Water Resour. Res. 33 (1), 1–11.

Fabbri, P., 1997. Transmissivity in the geothermal Euganean basin: a geostatistical analysis. Ground Water 35 (5),

881–887.

Freyberg, D.L., 1986. A natural gradient experiment on solute transport in a sand aquifer: 2. Spatial moments and

the advection and dispersion of nonreactive tracers. Water Resour. Res. 22 (13), 2031–2046.

Garabedian, S.P., LeBlanc, D.R., Gelhar, L.W., Celia, M.A., 1991. Large-scale natural gradient tracer test in sand

and gravel, Cape Cod, Massachusetts: 2. Analysis of spatial moments for a nonreactive tracer. Water Resour.

Res. 27 (5), 911–924.

Goldberg, D.E., 1989. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley,

Reading, MA.

Gorelick, S.M., 1983. A review of distributed parameter groundwater management modeling method. Water

Resour. Res. 19 (2), 305–319.

Huang, C., Mayer, A.S., 1997. Pump-and-treat optimization using well locations and pumping rates as decision

variables. Water Resour. Res. 33 (5), 1001–1012.

Hudak, P.F., Loaiciga, H.A., 1992. A location modeling approach for groundwater monitoring network

augmentation. Water Resour. Res. 28 (3), 643–649.

Hudak, P.F., Loaiciga, H.A., 1993. An optimization method for monitoring network design in multilayered

groundwater flow system. Water Resour. Res. 29 (8), 2835–2845.

Isaaks, E.H., Srivastava, R.M., 1989. An Introduction to Applied Geostatistics. Oxford Univ. Press, New York.

James, B.R., Gorelick, S.M., 1994. When enough is enough: the worth of monitoring data in aquifer remediation

design. Water Resour. Res. 30 (12), 3499–3513.

Karatzas, G.P., Pinder, G.F., 1993. Groundwater management using numerical simulation and the outer-

approximation method for global optimization. Water Resour. Res. 25 (10), 2245–2258.

Karpouzos, D.K., Delay, F., Katsifarakis, K.L., de Marsily, G., 2001. A multipopulation genetic algorithm to

solve the inverse problem in hydrogeology. Water Resour. Res. 37 (9), 2291–2302.

Lavenue, A.M., Pickens, J.F., 1992. Application of a coupled adjoint sensitivity and kriging approach to calibrate a

groundwater flow model. Water Resour. Res. 28 (6), 1543–1569.

Lee, Y.-M., Ellis, J.H., 1996. Comparison of algorithms for nonlinear integer optimization: application to

monitoring network design. J. Environ. Eng. 122 (6), 524–531.

Loaiciga, H., 1989. An optimization approach for groundwater quality monitoring network design. Water Resour.

Res. 25 (8), 1771–1780.

Mayer, A.S., Kelley, C.T., Miller, C.T., 2001. Optimal design for problems involving flow and transport

phenomena in saturated subsurface systems. Adv. Water Resour. 25, 1233–1256.

McDonald, M.G., Harbaugh, A.W., 1988. A modular three-dimensional finite-difference ground water flow

model. USGS Techniques of Water-Resources Investigations, Book 6.

McKinney, D.C., Lin, M.D., 1994. Genetic algorithm solution of groundwater management models. Water

Resour. Res. 30 (6), 1897–1906.

McKinney, D.C., Loucks, D.P., 1992. Network design for predicting groundwater contamination. Water Resour.

Res. 28 (1), 133–147.

Meyer, P.D., Valocchi, A.J., Eheart, J.W., 1994. Monitoring network design to provide initial detection of

groundwater contamination. Water Resour. Res. 30 (9), 2647–2659.

Meyer, P.D., Eheart, J.W., Ranjithan, S., Valocchi, A.J., 1995. Design of groundwater monitoring networks for

landfills. In: Kundzewicz, Z.W. (Ed.), Proceedings of the International Workshop on New Uncertainty

Concepts in Hydrology and Water Resources. Cambridge Univ. Press, pp. 190–196.

Minsker, B.S., Shoemaker, C.A., 1998. Dynamic optimal control of in-situ bioremediation of ground water.

J. Water Resour. Plan. Manage. 124 (3), 149–161.

Montas, H.J., Mohtar, R.H., Hassan, A.E., AlKhal, F.A., 2000. Heuristic space–time design of

monitoring wells for contaminant plume characterization in stochastic flow fields. J. Contam. Hydrol.

43, 271–301.

Olea, R.A., 1999. Geostatistics for Engineers and Earth Scientists. Kluwer Academic Publishers, Boston.

Pham, D.T., Karaboga, D., 2000. Intelligent Optimization Techniques: Genetic Algorithms, Tabu Search,

Simulated Annealing and Neural Networks. Springer-Verlag, New York.

Reed, P.M., Minsker, B.S., Valocchi, A.J., 2000a. Cost-effective long-term groundwater monitoring design using

a genetic algorithm and global mass interpolation. Water Resour. Res. 36 (12), 3731–3741.

Reed, P.M., Minsker, B.S., Goldberg, D.E., 2000b. Designing a competent simple genetic algorithm for search

and optimization. Water Resour. Res. 36 (12), 3757–3761.

Reed, P.M., Minsker, B.S., Goldberg, D.E., 2001. A multiobjective approach to cost effective long-term

groundwater monitoring using an elitist nondominated sorted genetic algorithm with historical data. J.

Hydroinform. 3 (2), 71–89.

Reed, P.M., Ellsworth, T.R., Minsker, B.S., 2004. Spatial interpolation methods for nonstationary plume data.

Ground Water 42 (2), 190–202.

Ridley, M.N., Johnson, V.M., Tuckfield, R.C., 1995. Cost-Effective Sampling of Groundwater Monitoring Wells.

Lawrence Livermore National Laboratory, Livermore, CA. UCRL-JC-118909.

Ritzel, B.J., Eheart, J.W., Ranjithan, S., 1994. Using genetic algorithms to solve a multiple-objective groundwater

pollution containment problem. Water Resour. Res. 30 (5), 1589–1603.

Rizzo, D.M., Dougherty, D.E., 1996. Design optimization for multiple management period groundwater

remediation. Water Resour. Res. 32 (8), 2549–2561.

Rouhani, S., 1985. Variance reduction analysis. Water Resour. Res. 21 (6), 837–846.

Rouhani, S., Hall, T.J., 1988. Geostatistical schemes for groundwater sampling. J. Hydrol. 103, 85–120.

Sen, M., Stoffa, P.L., 1995. Global Optimization Methods in Geophysical Inversion. Elsevier Science

Publishers B.V.

Smalley, J.B., Minsker, B., Goldberg, D.E., 2000. Risk-based in situ bioremediation design using a noisy genetic

algorithm. Water Resour. Res. 36 (10), 3043–3052.

Srivastava, R., Brusseau, M.L., 1996. Nonideal transport of reactive solutes in heterogeneous porous media: 1.

Numerical model development and moment analysis. J. Contam. Hydrol. 24, 117–173.

Tiedeman, C., Gorelick, S.M., 1993. Analysis of uncertainty in optimal groundwater contaminant capture design.

Water Resour. Res. 29 (7), 2139–2154.

Vereecken, H., Doring, U., Hardelauf, H., Jaekel, U., Hashagen, U., Neuendorf, O., Schwarze, H., Seidemann, R.,

2000. Analysis of solute transport in a heterogeneous aquifer: the Krauthausen field experiment. J. Contam.

Hydrol. 45, 329–358.

Vogely, W.A., Sani, E., Monzon, P.G., 1978. The application of kriging methods to oil and gas resource

estimation. Dept. Mineral Economics, Pennsylvania State Univ., submitted to the U.S. Dept. Energy.

Vucetic, S., Fiez, T., Obradovic, Z., 2000. Examination of the influence of data aggregation and sampling density

on spatial estimation. Water Resour. Res. 36 (12), 3721–3730.

Wagner, B.J., 1995a. Recent advances in simulation–optimization groundwater management modeling. U.S.

National Report to International Union of Geodesy and Geophysics 1991–1994, Reviews of Geophysics,

Supplement, pp. 1021–1028.

Wagner, B.J., 1995b. Sampling design methods for groundwater modeling under uncertainty. Water Resour. Res.

31 (10), 2581–2591.

Wagner, B.J., Gorelick, S.M., 1989. Reliable aquifer remediation in the presence of spatially variable hydraulic

conductivity: from data to design. Water Resour. Res. 25 (10), 2211–2225.

Wang, C.G., Jamieson, D.G., 2002. An objective approach to regional wastewater treatment planning. Water

Resour. Res. 38 (3).

Wang, M., Zheng, C., 1997. Optimal remediation policy selection under general conditions. Ground Water 35 (5),

757–764.

Zheng, C., Wang, P.P., 1999a. An integrated global and local optimization approach for remediation system

design. Water Resour. Res. 35 (1), 137–148.

Zheng, C., Wang, P.P., 1999b. MT3DMS: a modular three-dimensional multispecies transport model for

simulation of advection, dispersion and chemical reactions of contaminants in ground water systems:

documentation and user’s guide. Contract Report SERDP-99-1, U.S. Army Engineer Research and

Development Center, Vicksburg, Mississippi (available at http://www.hydro.geo.ua.edu/mt3d).

Zheng, C., Wang, P.P., 2002. A field demonstration of the simulation–optimization approach for remediation

system design. Ground Water 40 (3), 258–265.

Zhu, X.Y., Xu, S.H., Zhu, J.J., Zhou, N.Q., 1997. Study on the contamination of fracture-karst water in Boshan

District, China. Ground Water 35 (3), 538–545.