
MODELING AEROSOL PUFF CONCENTRATION DISTRIBUTIONS

FROM POINT SOURCES

USING ARTIFICIAL NEURAL NETWORKS

Timothy J. De Vito

A Thesis Submitted to the Faculty of the Royal Military College of Canada

In Partial Fulfillment of the Requirements for the Degree of

Master of Engineering in Chemical Engineering

July 2000

© This thesis may be used within the Department of National Defence, but copyright for open publication remains the property of the author.

National Library of Canada / Bibliothèque nationale du Canada

Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques, 395 Wellington Street / 395, rue Wellington, Ottawa ON K1A 0N4, Canada

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


Acknowledgements

I would like to express my gratitude to those who supported me throughout this

research project, without whose help none of this would have been possible. I am

particularly indebted to Dr. W. S. Andrews at the Royal Military College of Canada for

providing the opportunity to undertake this project, and for his unwavering confidence in

me throughout the course of this work.

The personnel at the Defence Research Establishment Valcartier must also be

gratefully acknowledged. They were instrumental in the collection of these data, and

went out of their way to offer me every bit of assistance possible. Special thanks goes to

Gilles Roy, who took time out of his busy schedule to answer my questions, conduct field

trials, and make me feel perfectly at home in Valcartier. Additional thanks go to Jean-

Marc Thériault, Luc Bissonnette and Sylvain Cantin for their insights and assistance.

I would also like to thank the Defence Research and Development Branch

(DRDB) for awarding me with the DRDB-RMC Fellowship. I am also grateful to the

Royal Canadian Regiment Trust, who provided support in the form of the Milton Fowler

Gregg VC Memorial Trust Fund Bursary.

Financial support which made this work possible is gratefully acknowledged from

the Academic Research Program and the Director General Nuclear Safety, both agencies

of the Department of National Defence.

Abstract

A series of over 50 field trials has been conducted in order to determine the

concentration distributions within aerosol puffs resulting from near-instantaneous

releases under atmospheric conditions falling within Pasquill stability classes A and B

(very unstable and moderately unstable, respectively). The aerosol examined was kaolin,

an inert, non-buoyant ceramic having particle size less than 3 µm. Concentration

measurements were made using the Defence Research Establishment Valcartier laser

cloud mapper (LCM), a scanning lidar system operating at 1.06 µm.

Artificial neural network (ANN) models were developed using the LCM data to

predict concentration distributions given a number of easily measured meteorological

parameters. ANN model results were compared to those from traditional Gaussian puff

models, including the US Army Research Laboratory's Gaussian-based dispersion model

COMBIC. ANN models provided significantly better predictions than the Gaussian puff

models. The predicted concentration distributions of one of the ANN models were

parameterized as a function of wind speed and diffusion time, and a tilted Gaussian puff

model was developed. Simple analytical expressions were derived for the dispersion

lengths and puff tilt angle. This parameterized model provided better predictions than

traditional Gaussian puff models.

Table of Contents

List of Figures

List of Tables

List of Symbols

1 Introduction

1.1 Aerosol Dispersion in the Planetary Boundary Layer

1.2 Application of Lidar to Aerosol Dispersion Experiments

1.3 Artificial Neural Networks

1.4 State of the Discipline

1.4.1 Dispersion Modeling

1.4.2 Lidar Inversion

1.4.3 Neural Network Applications to Dispersion Modeling

1.5 Thesis Objectives

2 Background and Theory

2.1 The Gaussian Dispersion Model

2.1.1 Stability Classification Schemes and Dispersion Coefficient Parameterizations

2.1.2 The COMBIC Model

2.2 The Laser Cloud Mapper

2.2.1 LCM Specifications

2.2.2 Lidar Inversion

2.3 Artificial Neural Networks

2.3.1 Multi-Layer Feed Forward Networks and Backpropagation Learning

2.3.2 Generalization and Separation of Data Sets

3 Collection and Analysis of Data

3.1 Experimental Setup and Collection of Data

3.2 Analysis of Inverted LCM Scans

3.3 Preparation of Data for ANN Modeling

3.4 Gaussian Puff Modeling

4 Results and Discussion

4.1 ANN Model Development

4.2 Comparing ANN Models with Gaussian Puff Models

4.3 Sensitivity Analysis

4.3.1 Disabled Inputs

4.3.2 Input Dithering

4.3.3 Analysis of Model Residuals

4.4 ANN Model Concentration Distribution Predictions

4.4.1 Horizontal Concentration Distributions

4.4.2 Vertical Concentration Distributions

4.4.3 Cloud Spread

4.4.4 ANN Model 1A Parameterization

5 Conclusions

5.1 Summary and Conclusions

5.2 Recommendations

List of References

Appendix A Meteorological Measurements and Estimates

Appendix B Neural Network Performance Statistics

Appendix C Model Concentration Prediction Scatter Plots

Vita

List of Figures

Figure 2.1 A comparison between Gaussian puff (top) and plume (bottom) diffusion, showing concentration contour surfaces.

Figure 2.2 The effect of atmospheric stability on the dispersion of plumes. The adiabatic lapse rate is shown as a dashed line, while typical vertical temperature profiles are shown as solid lines for (a) unstable conditions, (b) neutral conditions, and (c) stable conditions (Turner, 1994).

Figure 2.3 Slade parameterization of dispersion coefficients as functions of downwind travel distance from the source; (a) Pasquill stability class A (very unstable); (b) Pasquill stability class B (moderately unstable) (CCPS, 1996).

Figure 2.4 Pasquill parameterization of dispersion coefficients as functions of downwind travel distance from the source; shown are the modifications used in COMBIC, equations (2) to (4); (a) Pasquill stability class A (very unstable); (b) Pasquill stability class B (moderately unstable) (Ayres and Desutter, 1996).

Figure 2.5 Raster scanning pattern used by the LCM.

Figure 2.6 The basic structure of a biological neuron.

Figure 2.7 The basic structure of a processing element.

Figure 2.8 A multi-layer feed-forward ANN with two hidden layers (Haykin, 1994).

Figure 2.9 MLFF network connections and variables.

Figure 2.10 Common transfer functions: (top) the hyperbolic tangent, (bottom) the sigmoid function.

Figure 3.1 Layout of the experimental trial plateau.

Figure 3.2 The bottom sweep of a typical LCM scan shown in raw form (bottom) and inverted (top), displayed using LCVS (bird's eye view). Radial grid lines are spaced about 50 m apart; azimuth grid lines are 10° apart. Note the reduced noise in the inverted scan. The strong return across the top is the sand dune.

Figure 3.3 Data set 1 training set frequency distributions after transformation and duplication.

Figure 3.4 Data set 1 test set frequency distributions after transformation and duplication.

Figure 3.5 The 'image source' accounts for surface interactions of the puff by reflecting material off the ground.

Figure 4.1 Learning curves of ANNs 2 . 2 ~ and 2.7~. Also indicated in the figure are the points during training where RMS improved, and the ANN was automatically saved by the SaveBest command.

Figure 4.2 Scatter plot for ANN model 1A over the (a) test set and (b) validation set.

Figure 4.3 Comparison of geometric variance and mean bias between the various models for data set 1 (a) test set and (b) validation set.

Figure 4.4 Comparison of geometric variance and mean bias between the various models for data set 2 (a) test set and (b) validation set.

Figure 4.5 Change in ANN 1A performance over (a) the training set and (b) the test set. Percentage decrease in R and increase in RMS are shown. Error bars indicate the standard deviation among the 20 ANNs making up model 1A. (time indicates the time of day)

Figure 4.6 Change in ANN 2A performance over (a) the training set and (b) the test set. Percentage decrease in R and increase in RMS are shown. Error bars indicate the standard deviation among the 20 ANNs making up model 1A. (time indicates time of day)

Figure 4.7 Results of dithering inputs by 5% over the test set for (a) model 1A and (b) model 2A. Error bars indicate standard deviation among the 20 ANNs making up each model.

Figure 4.8 ANN model 1A residuals vs. (a) diffusion time, (b) wind speed, (c) temperature, (d) time of day, (e) Pasquill stability class and (f) pressure.

Figure 4.9 ANN model 2A residuals vs. (a) diffusion time, (b) wind speed, (c) temperature, (d) time of day, (e) Pasquill stability class and (f) pressure.

Figure 4.10 ANN model 1A predictions for a horizontal slice through the puff centre at z=0, shown (a) 10 seconds and (b) 30 seconds after release.

Figure 4.11 ANN model 1A predictions and fitted Gaussian curves for profiles along z=0, y=0 (left) and z=0, x=0 (right). Diffusion times (a) 10 seconds and (b) 30 seconds are shown. Note: these are 1-D profiles of the surfaces shown in Figure 4.10.

Figure 4.12 Model 1A normalized concentration contours for vertical cross-sections through the puff centre (y=0), shown (a) 20 s and (b) 40 s after release.

Figure 4.13 Model 2A normalized concentration contours for vertical cross-sections through the puff centre (y=0), shown (a) 20 s and (b) 40 s after release.

Figure 4.14 Model 1A predictions of puff tilt angle, shown at various times after release for (a) T=19 °C, (b) T=22 °C. Note the good linear fit and the slow decay of tilt angle with diffusion time (-1.0 m/s).

Figure 4.15 Model 2A predictions of puff tilt angle, shown at various times after release for (a) T=19 °C, (b) T=22 °C. Note the increasing puff tilt with diffusion time for low T.

Figure 4.16 Evolution of dispersion coefficients with diffusion time for wind speed (a) 1.0 m/s, (b) 1.5 m/s and (c) 2.5 m/s.

Figure 4.17 Vertical profiles of downwind and crosswind dispersion lengths as predicted by model 2A. Shown for wind speed of 1.0 m/s, 10 seconds after release. Remaining inputs are the same as those in Figure 4.10 above.

Figure 4.18 Variation of puff tilt angle with diffusion time and wind speed, shown for T=19 °C.

Figure C.1 Scatter plots for ANN model 1A over (a) the test set and (b) the validation set.

Figure C.2 Scatter plots for ANN model 2A over (a) the test set and (b) the validation set.

Figure C.3 Scatter plots for ANN model 1B over (a) the test set and (b) the validation set.

Figure C.4 Scatter plots for ANN model 2B over (a) the test set and (b) the validation set.

Figure C.5 Scatter plots for data set 1 GPMs (Slade) over (a) the test set and (b) the validation set.

Figure C.6 Scatter plots for data set 1 GPMp (Pasquill) over (a) the test set and (b) the validation set.

Figure C.7 Scatter plots for data set 2 GPMs (Slade) over (a) the test set and (b) the validation set.

Figure C.8 Scatter plots for data set 2 GPMp (Pasquill) over (a) the test set and (b) the validation set.

Figure C.9 Scatter plots for COMBIC over (a) the test set and (b) the validation set.

List of Tables

Table 2.1 Determining the Pasquill stability category (Pasquill and Smith, 1983).

Table 3.1 Total number of vectors in the training, test and validation sets of data sets 1 and 2.

Table 3.2 COMBIC input card used to model kaolin trial 18, scan 5.

Table 4.1 The best networks for each architecture listed in Table B.1 (data set 1, relative-z coordinates), as determined by performance against the test set. Validation set statistics are also shown (RMS = root-mean-square error, R = linear correlation coefficient).

Table 4.2 The best networks for each architecture listed in Table B.2 (data set 2, absolute-z coordinates), as determined by performance against the test set. Validation set statistics are also shown (RMS = root-mean-square error, R = linear correlation coefficient).

Table 4.3 Mean test set statistics for the two architectures for each data set. The error term indicates the standard deviation between the 20 ANNs in each group.

Table 4.4a Model comparison over data set 1 test set.

Table 4.4b Model comparison over data set 1 validation set.

Table 4.5a Model comparison over data set 2 test set.

Table 4.5b Model comparison over data set 2 validation set.

Table 4.6 Fitting parameters for the dispersion coefficients as linear functions of diffusion time and wind speed.

Table 4.7a Model comparison over data set 1 test set.

Table 4.7b Model comparison over data set 1 validation set.

Table A.1 Summary of measurements taken during the kaolin trials.

Table A.2 Calculated wind speed and Pasquill stability class.

Table B.1 Preliminary data set 1 ANNs. Effect of varying the network architecture and epoch size on network performance against the test set. The listed statistics are root mean square (RMS) and linear correlation coefficient (R) between predicted and target outputs.

Table B.2 Preliminary data set 2 ANNs. Effect of varying the network architecture and epoch size on network performance against the test set. The listed statistics are root mean square (RMS) and linear correlation coefficient (R) between predicted and target outputs.

Table B.3 Results of training 20 ANNs with two different architectures on each data set. Each net was initialized to a different random point in weight space.

List of Symbols

A        Pasquill stability class: very unstable

A        receiver area (m²)

a        fitting parameter for COMBIC parameterization

a        fitting parameter for model 1A parameterization

α        mass extinction coefficient (m²/g)

α        momentum coefficient

ANN      Artificial Neural Network

B        Pasquill stability class: moderately unstable

b

b        fitting parameter for model 1A parameterization

β        puff tilt angle

β(r)     elastic backscattering coefficient (m⁻¹ sr⁻¹)

C        Pasquill stability class: slightly unstable

C        aerosol concentration (g/m³)

C        clear-air shot returned signal strength (W)

c

c        fitting parameter for model 1A parameterization

c        speed of light (m/s)

C_o      observed concentration (g/m³)

COMBIC   Combined Obscuration Model for Battlefield Induced Contaminants

C_p      predicted concentration (g/m³)

D        Pasquill stability class: neutral

d        backscatter to extinction ratio

δ        output layer weight update factor

δ        hidden layer weight update factor

DREV     Defence Research Establishment Valcartier

Δv       hidden layer weight adjustment

Δw       output layer weight adjustment

E        Pasquill stability class: slightly stable

E        neural network global error

F        Pasquill stability class: moderately stable

F(r)     lidar system constant

f( )     non-linear transfer function

F10      fraction of predictions within a factor of ten of measurement

F2       fraction of predictions within a factor of two of measurement

GPMp     Gaussian puff model with Pasquill parameterization

GPMs     Gaussian puff model with Slade parameterization

hidden layer PE summation

output layer PE summation

received intensity (W/m²)

intensity of source emission (W/m²)

backscatter to extinction exponent

wavelength (m)

Laser Cloud Mapper

Light Detection and Ranging

Geometric Mean Bias

Pasquill stability class

received signal power (W)

transmitted pulse power (W)

atmospheric pressure (in Hg)

aerosol release mass (g)

radial distance (m)

correlation coefficient

root mean square

volumetric extinction coefficient (m⁻¹)

clear-air extinction coefficient (m⁻¹)

downwind dispersion coefficient (m)

crosswind dispersion coefficient (m)

vertical dispersion coefficient (m)

diffusion time (s)

target output value

ambient temperature (°C)

pulse length (s)

optical depth

mean wind speed (m/s)

geometric mean variance

hidden layer weight

output layer weight

relative downwind distance (m)

ith input variable

downwind distance from source (m)

downwind puff centroid position (m)

relative crosswind distance (m)

learning coefficient

hidden PE output value

relative vertical distance (m)

absolute vertical distance (m)

output PE value

effective release height (m)

Chapter 1

Introduction

Aerosol dispersion modeling is concerned with predicting the concentration

distribution of a contaminant introduced into the atmosphere and its subsequent

dispersion downwind. Most of the work in this field to date has concerned the dispersion

of a pollutant from a continuous release, such as from a smokestack or evaporating pool.

However, the dispersion from a nearly instantaneous release has received much less

attention, both in theoretical treatment and in experimental trials.

Predicting the dispersion of instantaneous releases has numerous applications of

great importance in environmental science, industry and the military, including

determining the effects of an accidental release of toxic or radioactive material, both on

the surrounding environment, and on the health of the public. A good model can help

industry mitigate such incidents, and provide insight into regulatory requirements (CCPS,

1996). In military science, dispersion models are essential in the study of smoke-grenade

obscurants and countermeasures, as well as in improving the defensive procedures for

dispersing chemical and biological weapons.

1.1 Aerosol Dispersion in the Planetary Boundary Layer

The term 'aerosol' generally refers to a liquid or solid substance suspended in a

gaseous medium, and covers a wide range of matter with varying sizes and composition,

including dust, smoke and mists (Williamson, 1973). Aerosols typically have particle

radii ranging from 0.01 µm to 10 µm (Hidy, 1984). Once introduced into the

atmosphere, an aerosol is acted upon by numerous complex processes.

In the upper parts of the atmosphere, ground surface friction has little effect on

the flow of air, and can often be ignored. However, in regions closer to the surface, the

effects of the frictional drag become important, and ultimately force the flow to zero

velocity at the surface itself. Most dispersion problems of interest occur in this lower

region of the atmosphere, commonly known as the planetary boundary layer (PBL),

which can loosely be defined as the depth of the surface-related influence on the flow

(Oke, 1987), and typically extends about 1 km above the surface.

However, the character and depth of the PBL are directly influenced by the

surface, and vary in response to changes in the surface's physical nature (Pasquill and

Smith, 1983). Thus the PBL is characterized by mechanical turbulence, generated by the

drag of the rough underlying surface, and convective turbulence due to the exchange of

heat between the surface and the flow. Since the strength of convective turbulence

generally varies diurnally, so do the depth and nature of the PBL. By day, the earth is

heated more rapidly than the atmosphere, resulting in an upward flux of heat from the

surface to the air. This causes enhanced thermal mixing and the PBL can reach up to 2

km in depth. Conversely, at night, the earth cools more rapidly than the atmosphere,

producing a downward flux of heat, which suppresses convective turbulence. This can

lead to a reduction of the PBL depth to about 100 m (Oke, 1987).

Most of the heat and moisture in the PBL is transferred by mechanical and

convective turbulence, and these processes cause rapid and efficient mixing. These

phenomena also play a key role in the dispersion of aerosols released into the PBL.

1.2 Application of Lidar to Aerosol Dispersion Experiments

Remote sensing has been an important tool used in probing the atmosphere for

decades. Specifically, monitoring scattered electromagnetic (EM) energy from a target

has been a particularly effective way of inferring a vast number of target properties. This

has direct application to atmospheric science, where the targets of interest are airborne

aerosols or molecules. The use of energy at the optical or infrared wavelengths permits

measurable scattering, even for objects of such small dimension. Even in the visibly

'clear' atmosphere, backscattered signals from gases and suspended particles at ranges of

several kilometers may readily be detected with laser radars, or lidars, of modest

performance. It is thus possible to measure the position of clouds or aerosols, their

motion, and perhaps most importantly, their structure.

The term RADAR was coined during the Second World War as an acronym for

RAdio Detection And Ranging, and refers to the process of measuring the time of flight

of reflected radio-wavelength EM radiation from a distant target to determine its range.

This principle has been applied in the subsequent years at shorter and shorter

wavelengths, and was used to study atmospheric properties as early as the 1940s and

1950s. By analogy, the application of the radar principle to energy of optical or near-

optical wavelengths produced by lasers was called LIDAR: LIght Detection And

Ranging.

At the heart of the lidar system is the laser source; although a number of

techniques exist to modulate the transmitted signal, the most common is pulse

modulation, where a single pulse of light of finite duration is transmitted. Typical pulses

have a length of 10-20 nanoseconds and energy in the range of 0.1 to 1.0 J per pulse.

Lasers today have pulse repetition rates of 10 to 100 pulses per second (Silfvast, 1996).

Once a pulse of light is transmitted, it is scattered in all directions by the various

gases or aerosol particles in the air. A certain fraction of this energy is reflected back in

the direction of the source, where it can be collected by appropriate optical components,

located at the position of the lidar. It is then directed to a photo-detector, where a voltage

or current is produced that is proportional to the power of the received pulse. The

magnitude of the received backscattered signal depends on the specific properties of the

target aerosol, including particle size distribution and its refractive properties. In general,

however, the received power is proportional to the density of the scattering aerosol. In

addition to this, the portion of the target responsible for a given lidar return can be

localized quite accurately, using the time of flight of the laser pulse from the source to the

scattering medium and back and the pointing direction of the lidar. Thus the lidar is

capable of measuring the density at a given point in a target aerosol, and the spatial

position of this point simultaneously.
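As a rough illustration of the localization described above, the sketch below converts a round-trip time of flight and a pointing direction into a position relative to the lidar. The relation r = ct/2 is the standard ranging relation; the function names, the azimuth and elevation convention, and the use of the vacuum speed of light for air are assumptions made for this example and are not taken from the LCM itself.

```python
import math

# Speed of light in vacuum (m/s). Treating air as vacuum is an assumption
# of this sketch; the resulting range error is very small.
C = 299_792_458.0


def range_from_time_of_flight(t_round_trip_s: float) -> float:
    """Return the distance (m) to the scattering volume.

    The pulse travels to the target and back, so the one-way
    distance is half the round-trip path length.
    """
    if t_round_trip_s < 0:
        raise ValueError("time of flight must be non-negative")
    return C * t_round_trip_s / 2.0


def to_cartesian(range_m: float, azimuth_deg: float, elevation_deg: float):
    """Convert range and pointing direction to (x, y, z) in metres.

    Hypothetical convention: the lidar sits at the origin, azimuth is
    measured from the x axis in the horizontal plane, and elevation is
    measured upward from the horizontal.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = range_m * math.cos(el)
    return (
        horizontal * math.cos(az),
        horizontal * math.sin(az),
        range_m * math.sin(el),
    )


if __name__ == "__main__":
    r = range_from_time_of_flight(2.0e-6)  # a 2 microsecond round trip
    print(f"range: {r:.1f} m")
    print("position:", to_cartesian(r, azimuth_deg=30.0, elevation_deg=5.0))
```

Each return sample can be assigned a position in this way, which is what allows a scanning lidar to build up a spatial map of the cloud from a single remote location.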

The traditional and by far most common method for measuring concentration

distributions of dispersing aerosols is to use an array of detectors, situated along a line or

arc downwind of the source. As the aerosol passes the array, concentration time-series or

dosage counts are calculated at each detector on the array. The lidar presents numerous

advantages over such in situ measurements.

Laser beams are coherent and can be highly collimated. Thus, pulses can be

directed with high precision, and optical scanning systems can be used to guide the beam

to scan a given volume with a specific pattern. In this way, large volumes of space can

be scanned from one remote position in a matter of seconds. There is no need to

physically set up a limited array of devices, which can be expensive, time consuming,

and in many conditions, impossible. A lidar has no problem measuring hazardous or

inaccessible locations. Further, lidar probing does not disturb the process being

measured. This may not be the case for in situ measurements, where the presence of

detector arrays can disrupt the local flow patterns.

There are drawbacks to using a lidar system for aerosol dispersion measurements.

For one, the spatial resolution is limited by the laser pulse length and beamwidth. Each

lidar return is in fact a spatial average over the volume occupied by the laser pulse. Fast-

response detectors offer much higher resolution and are more effective for measuring

fluctuation statistics. Also, a lidar system requires a finite amount of time to scan a given

volume, and thus a three-dimensional (3-D) concentration map is not a true instantaneous

'snapshot' of the diffusing cloud. Finally, it is necessary to convert the measured

backscattered power signal to a measure of concentration. This process is known as lidar

inversion, and is no trivial task; it requires numerous assumptions that are not always

valid.

1.3 Artificial Neural Networks

Artificial neural network (ANN) modeling is an emerging technique that has

shown rapid growth in the past decade. Today, ANNs are becoming more and more

common in a wide variety of disciplines, from financial market analysis and medical

diagnosis to the many fields of engineering. They have demonstrated a remarkable

ability to solve a wide variety of diverse problems, such as pattern recognition,

classification, process control, time-series prediction, function approximation and data

compression.

The success of ANN models in so many applications can be attributed to a

number of factors. Neural networks are inherently nonlinear structures, and are capable

of recognizing nonlinear relationships between seemingly random variables; hence they

model nonlinear processes well. Also, each component of an ANN is potentially affected

by the global activity of all the other components of the network; thus, contextual

information is dealt with naturally by ANNs (Haykin, 1994). Furthermore, they are able

to handle noisy or incomplete data, and can easily accommodate variables that are

difficult to quantify numerically (e.g., categorical or Boolean inputs). These

characteristics give ANNs a significant advantage over traditional multivariate statistical

regression models. One of the chief advantages of ANNs over traditional statistical

modeling approaches is their ability to handle co-linear input variables. Co-linearity can

significantly impair the performance of a statistical model, but presents no problem for

neural networks. Therefore, when modeling non-linear processes with large amounts of

noisy data, categorical and potentially co-linear variables, neural networks are better

suited for the task than are statistical regression techniques. ANNs have been shown to

significantly outperform statistical models in numerous applications (Gardner and

Dorling, 1996; Yi and Prybutok, 1996).

Neural networks were originally modeled after the functioning of the human

brain, which is a complex, nonlinear, parallel processor, capable of performing certain

computations many times faster than the fastest computers today (Haykin, 1994). Work

on ANNs dates back to the 1940s. Early pioneers of the field showed promising success

with abstract models of the biological neuron and a formulation of the process of

learning. The most striking advances have taken place in the last 20 years, with the

advent of cheaper, faster computers. Over the past two decades, numerous unique and

sophisticated ANN paradigms have been developed, many of which differ markedly in

structure, operation and application. However, there are certain features that are shared

by all neural network paradigms.

Essentially, all ANNs perform the same task: they accept a set of inputs (an input

vector), and produce a corresponding set of outputs (an output vector); that is, they

perform a vector mapping (Wasserman, 1993). The relationship between the input vector

and the output vector is encoded in the free parameters of the ANN model, usually

referred to as the network weights.

The process by which an ANN encodes this mapping relationship in its weights is

effected through a learning algorithm. Typically, a given input vector is presented to the

ANN, along with the associated target output vector. The free weights are adjusted so as

to minimize the difference between the ANN's predicted output vector and the desired

target output value. The precise way in which the weights are adjusted depends on the

specific learning algorithm used by a given ANN paradigm. Numerous such input-output

vector pairs are presented to the network, which adjusts its weights in response to each

pair, until the difference between the predicted and target output vectors reaches a tolerable

minimum. At this point, the network is considered to be 'trained'.

Regardless of the ANN paradigm, the goal of any trained network is to generalize

well. This is the ability to produce the correct output when presented with an input it has

not seen before. A network that can predict the correct output only for those inputs used

for training is of little practical use. It is the performance of a trained ANN against a set

of inputs not used for training that is the true measure of its usefulness as a model.
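The training and generalization ideas in the preceding paragraphs can be sketched in a few lines of Python. The example below is deliberately the simplest possible case: a single linear layer trained with the delta (least-mean-squares) rule on synthetic, noise-free data. It is not the learning algorithm or software used in this research; the data, the learning rate, the number of epochs and all function names are assumptions chosen only to show weights being adjusted pair by pair, followed by an evaluation on inputs that were held out of training.

```python
import numpy as np


def train_delta_rule(inputs, targets, learning_rate=0.1, epochs=50):
    """Adjust weights after each input-target pair to reduce the output error."""
    n_in, n_out = inputs.shape[1], targets.shape[1]
    weights = np.zeros((n_out, n_in))
    bias = np.zeros(n_out)
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            error = t - (weights @ x + bias)      # target minus predicted output
            weights += learning_rate * np.outer(error, x)
            bias += learning_rate * error
    return weights, bias


def mean_squared_error(weights, bias, inputs, targets):
    """Average squared difference between predicted and target outputs."""
    predictions = inputs @ weights.T + bias
    return float(np.mean((predictions - targets) ** 2))


# Synthetic mapping (hypothetical): t = 1.5*x1 - 0.5*x2 + 0.25
rng = np.random.default_rng(0)
true_weights = np.array([[1.5, -0.5]])
true_bias = np.array([0.25])
x_all = rng.uniform(-1.0, 1.0, size=(300, 2))
t_all = x_all @ true_weights.T + true_bias

# The first 200 pairs are used for training; the last 100 are never shown to the network.
x_train, t_train = x_all[:200], t_all[:200]
x_test, t_test = x_all[200:], t_all[200:]

weights, bias = train_delta_rule(x_train, t_train)
test_mse = mean_squared_error(weights, bias, x_test, t_test)
```

The quantity of interest is `test_mse`, the error on the held-out pairs, rather than the error on the training pairs themselves.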

ANN model development for this research was done using the Windows-based

proprietary software package NeuralWorks Professional II/PLUS by NeuralWare

(NeuralWare, 1993a).

1.4 State of the Discipline

This section gives a brief discussion concerning the current state of the field of

study, citing appropriate references where necessary. Three topics will be discussed:

dispersion modeling, lidar inversion and ANN applications to dispersion modeling.

1.4.1 Dispersion Modeling

Atmospheric turbulent diffusion modeling has a long and rich history. Today,

countless different models exist and are in common use, many of which vary

considerably in modeling methodology. Common approaches include Lagrangian

stochastic modeling, solutions to the advection-diffusion equations, and Gaussian

modeling.

Lagrangian stochastic modeling is a numerical modeling procedure, where the

trajectory of a particle (or group of particles) is calculated from a known statistical

description of the turbulent velocity field. Although it is a very powerful method, it is

computationally intensive, and is presently a research tool. Sawford and Wilson (1996)

provide discussions of current models.

Another common numerical technique is to iteratively solve the advection-

diffusion equation where closed-form analytical solutions cannot be found. This allows

for the use of more realistic estimates for wind speed, temperature and eddy diffusivity

profiles, since analytical solutions can only be found for the simplest forms. Some

examples of such numerical 'K-Theory' models can be found in current literature (CCPS,

1996).

By far the simplest and most commonly used modeling technique is the Gaussian

dispersion model. The American Institute of Chemical Engineers (CCPS, 1996)

conducted a survey of 22 of the most commonly-used public and proprietary dispersion

modeling software packages; the majority of these incorporated a Gaussian plume or puff

model into the full dispersion code. The US Environmental Protection Agency's

Gaussian-based dispersion model (Industrial Source Complex Short Term Model v.3) is

the most widely used in North America, and has been accepted by many jurisdictions as a

regulatory model (Lehder, 2000; CCPS, 1996). Similarly, the Gaussian model is in wide

use for regulatory control in Europe (Pankrath, 1995; Olesen, 1995).

Over the past decades, there has been a dramatic increase in the application of

dispersion models based on the Gaussian plume formulation as the core procedure

(Griffiths, 1994). Application of a Gaussian model ultimately requires a specification of

the nature of the standard deviations of the Gaussian distribution. These dispersion

coefficients describe the width of a cloud of contaminant and how the cloud grows as it

diffuses downwind. The accuracy of a Gaussian model depends critically upon these

coefficients.

A number of schemes have been developed over the years to estimate the

dispersion coefficients. The usual approach has been to semi-empirically parameterize

the coefficients as functions of downwind distance and atmospheric stability, based on a

series of carefully performed diffusion experiments. The most widely used of these

parameterizations is that of Pasquill, later modified by Gifford (Hanna et al., 1982).

They are based on ground level concentration measurements, due to continuous releases

over level ground. Others have presented similar schemes, most notably Briggs (1973).

The only parameterization based on measurements from instantaneous releases is

provided by Slade (1968). The parameterizations of Slade and Pasquill will be discussed

further in Chapter 2.

Some theoretical and experimental studies of near-instantaneous releases have

been conducted in the past few years. Hanna (1996) summarizes some of the recent work

on characterizing the downwind spread of a puff release, and the role of wind shear on

puff growth.

Van Ulden (1992) conducted a theoretical treatment of the diffusion of a passive

puff near the ground, based on solutions to the advection-diffusion equation and Monin-

Obukhov similarity theory. Using a coordinate system that follows the puff's horizontal

centre of mass, he developed relations for the standard deviations and skewness of the

concentration distribution. He concluded that wind shear and the interaction between

skewness and vertical diffusion dominate the downwind spread of the puff, leading to

tilted concentration distributions.

Yee et al. (1998) performed a detailed analysis on a series of carefully conducted

puff diffusion field trials. This is one of the very few atmospheric experiments where the

field trials were controlled to such a degree that enough repeat realizations of

instantaneously released clouds could be performed to allow the construction of

statistically sound ensemble averages. Yee analyzed these ensemble average

distributions in the framework of relative diffusion, using Monin-Obukhov similarity

theory. He found that downwind distributions were negatively skewed, while crosswind

distributions were Gaussian in nature. It was also determined that under the conditions of

the trial, the horizontal spread of the puff grew approximately linearly with downwind

distance.

Sato (1995) investigated the longitudinal (downwind) distribution of a diffusing

puff based on a series of release trials. He also found these distributions to be negatively

skewed, and attributed this skewness to the presence of vertical wind shear. At short

times, the growth of the puffs in the downwind direction was found to be proportional to

diffusion time, giving way to a % power relationship at longer times.

1.4.2 Lidar Inversion

With the advent of the use of lidar systems to probe the atmosphere in the past

two decades, numerous techniques have been developed to extract useful optical

properties from backscattered lidar returns. A review of common inversion algorithms in

use can be found in Evans (1988), Elouragini (1995), and Bissonnette (1996).

Bissonnette outlines the major problems remaining in most such inversion attempts. He

concludes that the chief problems are: 1) the need to determine a relationship between

aerosol backscatter and extinction coefficients, 2) the common requirement to specify a

boundary value at some specified range, 3) instabilities or slow convergence of the

solutions, and 4) the need to properly account for multiple-scattering events.

Klett (1981) developed a stable inversion algorithm that assumes a power-law

relation between the backscatter and extinction coefficients. This method is based on the

well-known but unstable 'forward method', but requires the measurement or estimate of

the extinction coefficient at some range beyond the extent of the cloud, rather than in

front of it. This results in an inversion procedure that is more stable with respect to

perturbations in the signal, the postulated relationship between the backscatter and

extinction coefficients, and to the estimate of the boundary condition. Some suggestions

concerning how to make the boundary condition estimate are given, but in practice this

can be quite difficult, and these estimates become less valid as the optical depth of the

cloud becomes small. In such cases, convergence is slow and generally only the front

portion of the returned signal is of use. Klett modified this algorithm to account for

deviations in the relationship between the backscatter and extinction coefficients (Klett,

1985), and showed that stable solutions could be obtained, but the basic problems of the

original method remain. Multiple scattering effects are not accounted for in Klett's

formulations.
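For orientation, the far-end boundary solution commonly attributed to Klett (1981) can be sketched as follows. With the range-corrected log signal S(r) = ln[r²P(r)] and an assumed power law β ∝ σ^k, the extinction coefficient is obtained from σ(r) = exp[(S − S_m)/k] / {1/σ_m + (2/k) ∫ exp[(S − S_m)/k] dr'}, where the integral runs from r to the far boundary r_m and σ_m is the boundary value there. The Python code below is only an illustration under these assumptions: the function name, the trapezoidal integration and the synthetic homogeneous test signal are choices made for this example, and this is not the inversion algorithm adopted for this research.

```python
import numpy as np


def klett_backward(ranges_m, power, sigma_m, k=1.0):
    """Far-end (backward) single-scattering inversion for the extinction coefficient.

    ranges_m : increasing ranges (m); the last entry is the boundary range r_m
    power    : received backscattered power at each range (arbitrary units)
    sigma_m  : assumed extinction coefficient at r_m (1/m)
    k        : exponent of the assumed backscatter-extinction power law
    """
    s = np.log(power * ranges_m**2)          # range-corrected log signal S(r)
    f = np.exp((s - s[-1]) / k)              # exp[(S - S_m)/k]
    # Trapezoidal area of each range interval, then the integral from r[i] to r_m.
    segments = 0.5 * (f[1:] + f[:-1]) * np.diff(ranges_m)
    integral = np.concatenate((np.cumsum(segments[::-1])[::-1], [0.0]))
    return f / (1.0 / sigma_m + (2.0 / k) * integral)


# Synthetic check: a homogeneous medium should be recovered as a constant profile.
ranges_m = np.linspace(100.0, 1000.0, 901)
sigma_true = 0.01                                        # 1/m, hypothetical value
power = sigma_true * np.exp(-2.0 * sigma_true * ranges_m) / ranges_m**2
sigma_est = klett_backward(ranges_m, power, sigma_m=sigma_true, k=1.0)
```

Because the boundary value enters through the term 1/σ_m in the denominator, its influence shrinks as the integral grows toward the lidar, which is the source of the stability discussed above.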

Evans (1984, 1988) developed a stable inversion algorithm that does not require

the estimation of the extinction coefficient boundary condition. Each lidar return is

instead normalized by a clear air calibration shot, which greatly reduces the effects of

system noise, enables the detection of very weak signals, and provides stable solutions

for a wide range of optical depth (Evans, 1984). Others (Uthe and Livingston, 1986;

Uthe, 1981) have adopted a similar approach. This method also assumes a power law

relation between the backscatter and extinction coefficients, and multiple scattering

effects can be accounted for, to some extent, directly in the algorithm. However, such

compensation is specific to the lidar system. This is the inversion algorithm adopted for

this research.

Roy et al. (1993) developed a lidar inversion technique based on the total

integrated backscatter. This technique does not rely on the lidar equation or any of its

assumptions. Rather, the total integrated backscatter is measured for various aerosols

under controlled conditions, and a calibration curve is formed that can then be used for

field trial measurements. The technique shows promise, but the calibration curve is

specific to the aerosol, the dissemination technique, and the lidar system used to obtain

the curves. In addition, for sufficiently dense clouds, multiple scattering affects the

calibration curve, limiting the application of this technique.

1.4.3 Neural Network Applications to Dispersion Modeling

The application of neural networks to problems in atmospheric science has grown

rapidly in the past few years, and Gardner and Dorling (1998) provide a brief overview of

recent developments. However, in the specific field of atmospheric dispersion, relatively

few attempts have been made to model the process using neural networks.

A few studies have been performed where ANN models were used to predict

time-averaged pollutant concentration at specific receptor sites. Gardner and Dorling

(1996) constructed an ANN model to predict the hourly average ozone levels at a specific

site in the UK. Inputs for the model consisted of hourly averages of meteorological

measurements, including irradiance, temperature, humidity, wind speed and direction,

collected over the course of a year. The ANN model showed considerably better results

than a conventional multiple linear regression model, and demonstrated that about 53%

of the variability in the hourly surface ozone concentrations can be attributed to local

meteorology. Yi and Prybutok (1996) used an ANN modeling technique to predict ozone

levels for the Dallas-Fort Worth area. In this study, the model output was the daily

maximum ozone level; inputs consisted of hourly averaged meteorological parameters

and vehicle emission measurements. Their model also outperformed standard linear

regression models.

Boznar et al. (1993) used ANNs to make short-term predictions of thermal power

plant pollutant levels in an industrialized area of Slovenia. In this study, the terrain was

quite complex, and traditional dispersion models repeatedly failed to give accurate

predictions. Meteorological and pollutant release measurements were continuously made

using a network of measuring stations located throughout the region. These

measurements formed the inputs and target outputs for a series of ANN models, where

one model was constructed for each receptor site. The ANN models predicted the short-

term concentration values at each receptor site very well, far outperforming the

predictions of the numerical dispersion models.

Using measurements of downwind tracer concentration together with time-

averaged meteorological data, Rege and Tock (1996) constructed an ANN model to

predict the source emission rate. Continuous releases of ammonia and hydrogen sulfide

were placed near ground level, and detectors were located less than 30 m downwind from

either source. For data in the test set, most of the ANN predictions were within about

10% of the measured emission rates. Traditional Gaussian models that were empirically

modified to trial data failed to predict better than within about 20% of the measured

emission rates.

Each of these studies addressed either ambient atmospheric aerosols, or

continuously released contaminants. None deals with instantaneous releases, and it

appears that only one attempt has been made to model the evolution of concentration

distributions of instantaneous releases using ANNs. Using some of the same lidar data

analyzed for this research, Costa constructed an ANN to predict aerosol concentration in

the framework of absolute diffusion (Costa, 1998; Andrews et al., 1998). However, the

wind speed and direction measurements used to construct the model were sampled far too

infrequently to construct statistically sound averages. This, coupled with the absolute

diffusion coordinate system, limited the performance of the ANN model. No thorough

analysis of the ANN model's predicted concentration distributions was made.

1.5 Thesis Objectives

It is a well-established fact that a thorough description of the process of

atmospheric turbulent diffusion requires the measurement of a number of statistical

properties of the turbulent flow field, including the means and variances of wind speed,

wind direction and temperature. Nevertheless, in numerous practical situations, such

measurements are not available, and it is necessary to predict concentration distributions

from more readily measurable properties of the atmosphere. It is the purpose of this

research to develop a model that uses routinely measured atmospheric parameters to

predict the average concentration distribution from an instantaneous puff release more

accurately than traditional Gaussian puff models.

Typical analyses of puff diffusion data first require the construction of ensemble

average concentration distributions. This requires a large number of repeat releases taken

under identical atmospheric conditions. During the field trials from which the present

data set is taken, atmospheric conditions were highly variable and no such repeat

realizations could be performed. Thus, statistically sound ensemble averages could not

be constructed. To overcome this difficulty, the data were modeled using artificial neural

networks.

In addition to the development of a successful ANN model, a further goal of the

research was to parameterize this model's predicted concentration distributions. Simple

analytical relations are derived between the moments of the distribution and the most

influential meteorological variables.

Chapter 2

Background and Theory

This section provides a theoretical discussion of the major components of the

work. The Gaussian puff model is first presented, along with a description of the

Pasquill-Gifford stability scheme and dispersion length parameterization. A brief

description of the COMBIC model is also given. This is followed by a detailed

derivation of the lidar inversion technique of Evans (1984, 1988). Feed-forward neural

networks are then discussed, together with an explanation of the learning rule used for

network training.

2.1 The Gaussian Dispersion Model

The dispersion process in turbulent flow is made up of two contributions: one of

molecular scale, due to the random thermal agitation of the molecules, and another of

much larger scale, due to random turbulent bulk motion within the fluid. These two

mechanisms differ in two key ways. First, the resultant motions are on entirely different

scales; the typical span of turbulent movement is many orders of magnitude larger than

the mean free path length of the diffusing particles. Second, due to the macroscopic size

of the parcels of air involved in any single turbulent movement, there is a continuity

constraint. As such a parcel moves, it displaces another, and at the same time leaves a

vacant region that must be filled. This leads to the characterization of turbulent flow by a

group of random, closed-loop motions commonly referred to as turbulent eddies. In

atmospheric flows, the contribution of molecular diffusion is several orders of magnitude

less than that due to turbulent diffusion, and can usually be neglected.

A typical turbulent flow contains eddies that cover a wide range of spatial scales,

and are responsible for the dissipation of energy in the flow. This energy dissipation can

be described as a cascade of energy conversion from the larger eddies to the smaller ones,

where the large eddies extract energy from the main advective flow. These eddies are

unstable, however, and smaller eddies extract energy from them. This process continues

as smaller and smaller eddies feed off the larger ones, until finally the eddies are so small

that the viscosity of the fluid forces the conversion of the eddies' energy into heat.

Turbulent eddies displace parcels of air within a puff of contaminant, mixing

polluted air with relatively clean air, and vice versa. This mixing by bulk displacement

eventually causes polluted air to occupy larger volumes at lower concentrations.

However, not all eddies influence the dispersion of a contaminant in the same way.

Small eddies will displace material short distances, and will contribute less to the

dispersion of the pollutant, except perhaps at the edges of the puff, where mixing with

clean air may cause some notable redistribution of material. Conversely, eddies that are

much larger than the puff or plume will tend to displace the entire mass of pollutant as a

whole, a process known as 'meandering', and thus will contribute little to the internal

mixing of the puff. Eddies that have spatial scales comparable to the size of the puff or

plume are the most efficient in causing rapid mixing.

Given the highly random nature of the motion of the turbulent eddies, it is

apparent that the concentration of a dispersing puff or plume is also in general a random

variable, about which one can only make probabilistic predictions (Csanady, 1973). For

this reason, attempts to describe concentration distributions of a contaminant have been

restricted to considering ensemble average concentrations, which show much more

regular behaviour and can be more easily described mathematically (Williamson, 1973).

Indeed, it has been shown experimentally that in a field of homogeneous

turbulence, ensemble average concentrations can be approximated by a Gaussian

distribution (Csanady, 1973). No rigorous theoretical justification for the observed

Gaussian distribution seems to exist; Csanady summarizes some of the more convincing

arguments, but concedes that "...the question why a Gaussian distribution is observed in

experiments is not yet satisfactorily answered...". Few investigations into this question

have been undertaken in the recent literature.

Nonetheless, common practice is to assume a Gaussian distribution for averaged

concentration distributions, and this is the basis of the Gaussian puff model and its

variants. The Gaussian model originated in the works of Roberts (Sutton, 1953), Sutton,

Pasquill and Gifford (Hanna et al., 1982). The prevalence of the Gaussian model can be

attributed to the following:

1. Its predictions agree with experiment as well as other models.

2. The simple form of the equation facilitates mathematical manipulation.

3. It is conceptually appealing.

4. It is consistent with the random nature of turbulence.

5. It is a solution of the Fickian diffusion equation, assuming that both eddy

diffusivity and wind speed are constants.

6. Other so-called theoretical formulas contain large amounts of empiricism in

their final stages.

7. As a result of the above, it is used in most government guidebooks, thus

acquiring an elevated status (Hanna et al., 1982).

The usual practice is to adopt a coordinate system where the x-axis is along the

direction of the mean wind, U, the y-axis is in the cross-wind direction, i.e. perpendicular

to the x-axis and horizontal, and the z-axis is vertical. When dealing with the diffusion of

a puff, i.e., an instantaneous release, it is also common to use a coordinate system whose

origin is at the centre of mass of the puff. Thus the entire coordinate system moves

downwind with the puff as it is advected by the mean wind. This effectively removes the

effects of puff meander from the diffusion process.

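In its simplest form, the Gaussian puff equation takes the following form:

$$C(x, y, z, t) = \frac{Q}{(2\pi)^{3/2}\,\sigma_x(t)\,\sigma_y(t)\,\sigma_z(t)} \exp\left[-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2(t)} + \frac{y^2}{\sigma_y^2(t)} + \frac{z^2}{\sigma_z^2(t)}\right)\right] \qquad (1)$$

where C(x, y, z, t) is the ensemble-average concentration (g/m³),

Q is the mass of aerosol instantaneously released at time t = 0 (g),

x, y, z are the coordinates relative to the centre of mass of the puff (m),

$\sigma_x(t)$, $\sigma_y(t)$, $\sigma_z(t)$ are the standard deviations of the distribution in each of the

coordinate directions; also known as the dispersion coefficients (m).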
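A minimal Python sketch of the Gaussian puff formula in its standard three-dimensional form, written in the puff-centred coordinates defined above, may help fix the notation. The function name, the released mass and the dispersion coefficient values are hypothetical, and ground reflection is not included. The sketch also checks numerically that the concentration integrates to the released mass, which is the conservation-of-mass property of this form.

```python
import numpy as np


def gaussian_puff(x, y, z, mass_g, sigma_x, sigma_y, sigma_z):
    """Ensemble-average concentration (g/m^3) at (x, y, z) relative to the puff centre."""
    norm = mass_g / ((2.0 * np.pi) ** 1.5 * sigma_x * sigma_y * sigma_z)
    exponent = -0.5 * ((x / sigma_x) ** 2 + (y / sigma_y) ** 2 + (z / sigma_z) ** 2)
    return norm * np.exp(exponent)


# Hypothetical values: 100 g released; dispersion coefficients at one instant in time.
mass_g = 100.0
sigma_x, sigma_y, sigma_z = 10.0, 5.0, 2.0   # m

# Peak concentration occurs at the centre of mass of the puff.
peak = gaussian_puff(0.0, 0.0, 0.0, mass_g, sigma_x, sigma_y, sigma_z)

# Numerical integration over +/- 6 standard deviations in each direction.
xs = np.linspace(-6 * sigma_x, 6 * sigma_x, 61)
ys = np.linspace(-6 * sigma_y, 6 * sigma_y, 61)
zs = np.linspace(-6 * sigma_z, 6 * sigma_z, 61)
gx, gy, gz = np.meshgrid(xs, ys, zs, indexing="ij")
cell_volume = (xs[1] - xs[0]) * (ys[1] - ys[0]) * (zs[1] - zs[0])
total_mass = float(
    np.sum(gaussian_puff(gx, gy, gz, mass_g, sigma_x, sigma_y, sigma_z)) * cell_volume
)
```

Because the three dispersion coefficients differ in this example, surfaces of constant concentration are ellipsoids rather than spheres.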

Figure 2.1 below illustrates the relative diffusion coordinate system, and the

qualitative difference between puff and plume diffusion. Note the moving coordinate

system for puff diffusion, which is advected with the mean wind, as indicated by the

arrow. In general, the three dispersion coefficients will be distinct, and the puff contours

will be ellipsoidal in shape, rather than spherical.

Figure 2.1 A comparison between Gaussian puff (top) and plume (bottom) diffusion, showing concentration contour surfaces.

The following assumptions are implicit in the Gaussian puff equation:

1. The average concentration distribution is well represented by a Gaussian, or

normal, distribution in each of the three coordinate directions.

2. Homogeneous turbulence: the statistical parameters that characterize the

turbulent flow field are invariant in space.

3. Meteorological conditions are invariant in time, i.e., wind speed, wind

direction, temperature, stability and all other meteorological parameters are

constants with respect to time.

4. Conservation of mass, i.e., no ground deposition or reaction.

The time dependence of C(x, y, z, t ) in equation (1) is contained in the dispersion

coefficients, which are, in general, functions of time and the meteorological

characteristics of the flow. Using a statistical approach to the diffusion problem, Taylor

(1921) developed relations describing the evolution of the dispersion coefficients with

time for the problem of a continuous plume. Batchelor (1952) used a similar approach to

tackle the problem of relative diffusion of puffs. In both cases, however, growth of the

dispersion coefficients was found to depend on complex statistical properties of the

turbulent flow. These parameters are usually difficult to assess, and require research-

grade turbulence measurements (Hanna et al., 1982).

In the absence of such measurements, it is general practice to use semi-empirical

parameterizations. These are formed by observing the behaviour of dispersing plumes or

puffs under a broad range of conditions, and generally express the dispersion coefficients

as functions of downwind distance from the source and atmospheric stability. To use

such parameterizations, one must first characterize the atmospheric stability, preferably

by a simple scheme based on inexpensive and easily obtained measurements (Pasquill

and Smith, 1983).

2.1.1 Stability Classification Schemes and Dispersion Coefficient Parameterizations

Atmospheric stability generally refers to the vertical temperature stratification of

the atmosphere, and its resultant effect on the degree of dispersion. Three stability

categories are generally recognized: unstable, stable, and neutral. Unstable conditions are

usually formed shortly after dawn on sunny days, when incoming radiation from the sun

heats the surface of the earth, causing the air in the lower levels to be warmer, and

therefore less dense, than the air above it. Such conditions are favourable to the

formation of convective turbulence, since any mass of air displaced slightly up or down

will continue to rise or fall due to the density difference between it and its surroundings

(Sutton, 1949). Thus, in unstable conditions, turbulent mixing is enhanced.

Conversely, stable conditions are usually formed at night, when the surface cools

by emission of long-wave radiation. This often results in an 'inversion', in which

temperature increases with height, causing the suppression of vertical displacements.

The intermediate state is referred to as neutral stability, and is characterized by a slight

decrease in temperature with height, usually very close to the adiabatic lapse rate (about 1

°C per 100 m). Neutral conditions can result from cloudy conditions, which inhibit

incoming and outgoing radiation, and from windy conditions where the wind rapidly

mixes the heated or cooled air vertically, evening out the vertical temperature distribution

(Turner, 1994). Neutral conditions typically show an intermediate level of dispersion.

Figure 2.2 below illustrates the temperature profiles of these stability regimes, together

with typical effects on plume dispersion.

[Figure 2.2: plume shapes shown beside vertical temperature profiles (axis: temperature); surviving panel labels: strong lapse condition, weak lapse condition.]

Figure 2.2 The effect of atmospheric stability on the dispersion of plumes. The adiabatic lapse rate is shown as a dashed line, while typical vertical temperature profiles are shown as solid lines for (a) unstable conditions, (b) neutral conditions, and (c) stable conditions (Turner, 1994).
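The qualitative distinction between the three stability regimes can be expressed as a comparison of the measured vertical temperature gradient with the adiabatic lapse rate. The Python sketch below is only illustrative: the function name and, in particular, the tolerance band that defines "close to adiabatic" are assumptions made for this example and are not part of any classification scheme cited in this work.

```python
ADIABATIC_GRADIENT = -0.01  # K/m, i.e. about 1 degree C of cooling per 100 m of height


def stability_regime(temperature_gradient, tolerance=0.002):
    """Classify the regime from dT/dz (K/m); the tolerance (K/m) is a hypothetical choice."""
    if temperature_gradient < ADIABATIC_GRADIENT - tolerance:
        # Temperature falls with height faster than adiabatic: displaced parcels keep moving.
        return "unstable"
    if temperature_gradient > ADIABATIC_GRADIENT + tolerance:
        # Temperature falls more slowly than adiabatic, or rises with height (inversion).
        return "stable"
    return "neutral"


# A nighttime inversion with temperature rising 1 degree C per 100 m:
inversion_regime = stability_regime(0.01)
```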

One of the most widely used stability classification schemes was first developed

by Pasquill (Pasquill and Smith, 1983), and is applicable for routine meteorological data.

Stability categories are characterized semi-quantitatively by wind speed, incoming

radiation, and the nighttime state of the atmosphere. Specifically, atmospheric stability is

divided into six classes, called 'Pasquill Stability Classes' A to F, where A is the most

unstable category, D is neutral and F is the most stable case. These are based on five

classes of surface wind speeds, three classes of daytime insolation, and two classes of

nighttime cloudiness. Table 2.1 below summarizes the scheme.

Table 2.1 Determining the Pasquill stability category (Pasquill and Smith, 1983).

A: Extremely unstable conditions     D: Neutral conditions
B: Moderately unstable conditions    E: Slightly stable conditions
C: Slightly unstable conditions      F: Moderately stable conditions

Surface wind     Daytime insolation              Nighttime conditions
speed (m/s)      Strong    Moderate   Slight     > 1/2 cloud   < 3/8 cloud
< 2              A         A-B        B          -             -
2-3              A-B       B          C          E             F
3-4              B         B-C        C          D             E
4-6              C         C-D        D          D             D
> 6              C         D          D          D             D

Pasquill and Smith (1983, pg. 336) offer the following notes for using Table 2.1:

1. Strong insolation corresponds to sunny midday in midsummer England; slight insolation to similar conditions in midwinter.

2. Night refers to the period from 1 hour before sunset to 1 hour after sunrise.

3. Category D should be used, regardless of wind speed, for overcast conditions during day or night, and for any sky conditions during the hour preceding or following night, as defined in (2) above.
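The scheme of Table 2.1 and its notes lends itself to a simple lookup. The following Python sketch encodes the table as read here; the function name, the condition keywords, and the treatment of wind speeds that fall exactly on a class boundary are assumptions made for this example. Entries marked "-" in the table are returned as `None`.

```python
# Each row: (upper wind speed bound in m/s, strong, moderate, slight, cloudy night, clear night)
PASQUILL_ROWS = [
    (2.0, "A", "A-B", "B", None, None),
    (3.0, "A-B", "B", "C", "E", "F"),
    (4.0, "B", "B-C", "C", "D", "E"),
    (6.0, "C", "C-D", "D", "D", "D"),
    (float("inf"), "C", "D", "D", "D", "D"),
]

# Hypothetical keywords for the five columns of the table.
CONDITION_COLUMN = {
    "strong": 1,        # strong daytime insolation
    "moderate": 2,      # moderate daytime insolation
    "slight": 3,        # slight daytime insolation
    "night_cloudy": 4,  # night, more than 1/2 cloud cover
    "night_clear": 5,   # night, less than 3/8 cloud cover
}


def pasquill_class(wind_speed_ms, condition, overcast=False):
    """Return the stability category, or None where the table gives no entry."""
    if overcast:
        # Note 3: category D for overcast conditions, regardless of wind speed.
        return "D"
    column = CONDITION_COLUMN[condition]
    for row in PASQUILL_ROWS:
        # Assumption: a speed equal to a class boundary is assigned to the higher class.
        if wind_speed_ms < row[0]:
            return row[column]
    return None


example_class = pasquill_class(2.5, "strong")
```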

Using these stability criteria, Pasquill analyzed a group of experimental trials, and

measured the crosswind and vertical spread of dispersing plumes at various downwind

distances between 100 m and 1 km. The trials were conducted over flat, uniform terrain,

releases were from near ground level, and concentration measurements were time

averaged over about 10 minutes. Other experimental trials have been used to extend

Pasquill's original parameterization to include the effects of surface roughness and

elevated releases (Pasquill and Smith, 1983). These parameterizations are usually put

into the form of power law equations, where the crosswind and vertical dispersion

coefficients are expressed as functions of downwind distance from the source.

The only parameterization of dispersion coefficients for near instantaneous

releases is due to Slade (1968). Slade pooled the results of a number of puff diffusion

experiments to form simple power law relations for crosswind and vertical dispersion

coefficients as functions of downwind distance. Predictions of downwind dispersion

coefficients, $\sigma_x$, were in general lacking, and are usually taken to be equal to $\sigma_y$. These

results are based on far fewer experimental trials than are those of Pasquill (CCPS, 1996),

and the experiments varied in source configuration, release height, meteorological and

terrain conditions (Slade, 1968).

Both the Pasquill and Slade curves for dispersion coefficients are in wide use in

Gaussian based dispersion codes (CCPS, 1996), although some models make minor

modifications or further interpolations. Slade's parameterization is more appropriate for

near instantaneous releases than that of Pasquill, which is based on continuous source

trials. Nonetheless, since many more data exist for continuous plumes than for

instantaneous puffs, several models use the Pasquill parameterization for plumes and

puffs alike (CCPS, 1996; Hanna et al., 1982).

2.1.2 The COMBIC Model

The Combined Obscuration Model for Battlefield Induced Contaminants

(COMBIC) is a model developed by the US Army Research Laboratory to estimate

variations of transmissivity through a battlefield obscured by smoke and dust clouds.

These transmissivity values can be calculated for up to seven different EM bands, along

numerous different lines of sight, and for scenarios involving several sources of varying

type.

COMBIC uses a Gaussian-based dispersion code and employs Gaussian puff and

plume models, depending on the source being modeled. In addition to this, COMBIC

incorporates numerous enhancements to account for effects not considered in the basic

Gaussian equation. These include puff-surface interaction effects, cloud buoyancy

effects, and source effects such as initial cloud momentum, temperature, and radius.

COMBIC also uses a sophisticated boundary layer model based on Monin-Obukhov

similarity theory to calculate vertical temperature, wind speed, and density profiles used

in transport, diffusion and buoyancy calculations (Ayres and Desutter, 1995).

For dispersion times less than 30 seconds, the COMBIC Gaussian models use

dispersion coefficients based on the Pasquill parameterization. In the case of an

instantaneous release, the dispersion coefficients take on the following form:

$$\sigma_y(X) = 0.667\, a\, X^{0.9}$$

where X is the downwind distance from the source (m), and

a, b, and c are numerical parameters derived from the Pasquill

parameterization, and are functions of stability class.

Note that COMBIC uses the same parameters for plume dispersion, with the

difference that $\sigma_x = 0$ (i.e., downwind dispersion is not considered for continuous releases).

Also note the factors of 0.740 and 0.667 for the downwind and crosswind spread

formulas, respectively. These factors are not used in the plume formulas, and are

included to reduce horizontal dispersion in an attempt to compensate for the reduced

effect of meander in puff diffusion (Ayres and Desutter, 1995). Also note that the

downwind factor is larger than the crosswind one, in accordance with the well-

established principle that downwind diffusion proceeds at a greater rate than crosswind

diffusion (Hanna, 1996).

For comparison, the parameterizations due to Slade and Pasquill are shown below

in Figures 2.3 and 2.4, respectively.

[Plots: dispersion coefficients versus travel distance from source (m), 0 to 120 m.]

Figure 2.3 Slade parameterization of dispersion coefficients as functions of downwind travel distance from the source; (a) Pasquill stability class A (very unstable); (b) Pasquill stability class B (moderately unstable) (CCPS, 1996).

[Plots: dispersion coefficients versus travel distance from source (m), 0 to 120 m.]

Figure 2.4 Pasquill parameterization of dispersion coefficients as functions of downwind travel distance from the source; shown are the modifications used in COMBIC, equations (2) to (4); (a) Pasquill stability class A (very unstable); (b) Pasquill stability class B (moderately unstable) (Ayres and Desutter, 1996).

COMBIC does not use the modified Pasquill parameterization shown in Figure

2.4 for dispersion times greater than 30 seconds. More complex semi-empirical relations

for the dispersion coefficients are employed. These relations are based on Monin-

Obukhov similarity theory, and account for the effects of wind shear and surface

roughness.

2.2 The Laser Cloud Mapper

The laser cloud mapper (LCM) is a fast scanning lidar system designed and

developed at the Defense Research Establishment Valcartier (DREV), and has been used

in the study of military obscurant characterization (Roy et al., 1994; Evans et al., 1994),

cloud ice formation (Bissonnette et al., 1997), industrial pollutant emissions monitoring

(Pal et al., 1998), and lidar inversion techniques (Evans, 1988; Roy et al., 1993;

Bissonnette and Hutt, 1995). This section will describe the LCM system, and discuss the

inversion algorithm used to extract the concentration maps from the lidar returns.

2.2.1 LCM Specifications

The LCM consists of a 1.06 μm Nd-YAG laser source, a scanning platform,

secondary optics and a receiver. It is controlled by a PC and the entire system is mounted

in a step van. The laser and collecting optics are co-linear.

The laser is a pulse-modulated source, emitting 10 ns pulses at a repetition

frequency of 100 Hz (one emission every 0.01 s). The beam divergence and receiver

field of view were set at 3 and 4 mrads, respectively, which defines a sampling footprint

of 0.3 m at a radial distance of 100 m. The laser pulse energy is 80 mJ.

The scanning optics guide the laser pulses along a raster pattern, as shown in

Figure 2.5, covering a large area in about 3.5 seconds or less. Each lidar emission is

termed a shot, and 150 backscattered returns are collected for each shot. All the shots

along one constant elevation form a sweep. The speed of the scanning optics was set so

that 44 shots were taken along each horizontal sweep, and 6 or 8 sweeps were performed

per scan. The LCM scanned a volume spanned by a radial distance of about 225 m, a

range of 60° in azimuth, and 10° in elevation. Thus the resolution in azimuth is about

1.40°; depending on whether 6 or 8 sweeps were performed per scan, the resolution of the

LCM in elevation is 2° or 1.43°, respectively. The digitization rate of lidar returns was

100 MHz, giving the LCM a radial resolution of 1.5 m.
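The resolution figures quoted above follow from simple arithmetic, sketched below. The function names are hypothetical, and the convention that n samples spanning an angle leave n - 1 gaps is an assumption adopted here because it reproduces the quoted values.

```python
C_LIGHT = 2.998e8  # speed of light, m/s


def footprint(divergence_rad, range_m):
    """Beam footprint (m) at a given range, small-angle approximation."""
    return divergence_rad * range_m


def angular_resolution(span_deg, n_samples):
    """Angular spacing (deg), assuming n samples leave n - 1 gaps across the span."""
    return span_deg / (n_samples - 1)


def range_resolution(digitization_hz):
    """Radial spacing (m) between digitized returns on the two-way path."""
    return C_LIGHT / (2.0 * digitization_hz)


if __name__ == "__main__":
    print(footprint(3e-3, 100.0))          # sampling footprint at 100 m
    print(angular_resolution(60.0, 44))    # azimuth
    print(angular_resolution(10.0, 6))     # elevation, 6 sweeps
    print(angular_resolution(10.0, 8))     # elevation, 8 sweeps
    print(range_resolution(100e6))         # radial
    print(150 * range_resolution(100e6))   # total radial extent of one shot
```

The last line shows that 150 returns at this radial spacing span roughly the 225 m range quoted for the scanned volume.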

Figure 2.5 Raster scanning pattern used by the LCM.

2.2.2 Lidar Inversion

The lidar returns collected by the LCM system are in the form of a current, which

is then amplified logarithmically. The amplified current can then be converted into a

measure of backscattered power, in watts, using the log-amp calibration curve and the

known circuit parameters. One can then use an inversion algorithm to convert this power

measurement into a measurement of aerosol concentration. However, this is no trivial

task, given the non-linearity and complexity of the interaction between the LCM beam

and the scattering aerosol.

The lidar equation relates the power of a backscattered lidar return from some

range r to the volumetric extinction coefficient of the scattering aerosol, usually denoted

by $\sigma(r)$. This is defined as the fraction by which the flux of energy in the direction of

propagation is reduced per unit length, and has units of m⁻¹ (Hinckley, 1976). It is

generally a function of both position and wavelength. The extinction coefficient of an

aerosol is a parameter of considerable interest, because it can be used to determine that

aerosol's concentration, via the relation:

$$\sigma(r, \lambda) = \alpha(\lambda)\, C(r) \qquad (5)$$

where $\alpha(\lambda)$ is the mass extinction coefficient of the aerosol (m²/g), and

C(r) is the concentration (g/m³).

Thus, given that the mass extinction coefficient for a given aerosol is known, extinction

values are easily converted to measures of aerosol concentration. However, determining

the extinction coefficient $\sigma(r)$ from the backscattered lidar returns requires inversion of

the lidar equation, based in radiation transfer physics.

The radiative transfer equation is a complex relation describing the intensity of

EM radiation (of wavelength $\lambda$) received at a given point, emanating from a target some

distance r away. In differential form, the equation can be written as (considering only the

radial direction for simplicity):

$$\frac{d}{dr} I(r, \lambda) = -\sigma(r, \lambda) \left[ I(r, \lambda) - J(r, \lambda) \right] \qquad (6)$$

where $I(r, \lambda)$ is the received intensity (W/m²),

$J(r, \lambda)$ is the intensity of radiation due to the source (W/m²), and

$\sigma(r, \lambda)$ is the extinction coefficient (m⁻¹).

Equation (6) assumes that the wavelength of the radiation is much smaller than

the typical distance between scatterers (Costa, 1998). That is, the scattered radiation is

incoherent and non-interfering. The source term, $J(r, \lambda)$, takes into account all radiation

scattered into the propagation path from all directions at all points along the path. It also

accounts for thermal emission of the medium into the propagation path. Thus this term

accounts for all diffuse radiation reaching the receptor, while the $I(r, \lambda)$ term represents

the direct transmittance. In general, $J(r, \lambda)$ is difficult to specify; it depends on the phase

function of the scattering aerosol, and its thermal emission properties. Equation (6) can

be considerably simplified by making the assumption that the diffuse radiation is

negligible in comparison with the direct radiation (i.e., $J(r, \lambda) \ll I(r, \lambda)$). That is to say,

once a photon is scattered by an aerosol particle it is permanently removed from the

beam. This assumption of single-scattering is very important, and is often not valid,

particularly in dense media. The effects of this and other assumptions will be discussed

below.

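Given the assumption of negligible source emission (multi-scattered and thermal),

equation (6) is greatly simplified to:

$$\frac{d}{dr} I(r, \lambda) = -\sigma(r, \lambda)\, I(r, \lambda) \qquad (7)$$

which integrates to:

$$I(r, \lambda) = I(r_0, \lambda) \exp\left[ -\int_{r_0}^{r} \sigma(r', \lambda)\, dr' \right] \qquad (8)$$

where $I(r_0, \lambda)$ is the intensity of the beam before any extinction, that is, at the

target. This is the familiar Beer-Lambert law for direct transmittance in the most general

case.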

For the specific case of the LCM, the received intensity is dependent on additional

factors. Since the pulse has a finite duration or length, $\tau$, it illuminates a finite portion of

the air at any time ($c\tau$, where c is the speed of light). Thus, the 'effective pulse length' is

the range interval from which a signal is received at any instant; this distance is $c\tau/2$ due

to the two-way path that the pulse must traverse (Hinckley, 1976). The received signal

intensity is also directly proportional to the solid angle subtended by the receiver; this is

the factor $A/r^2$, where A is the effective receiver area, and r is the range from lidar to

target. Other system-specific factors can be incorporated into a system function, denoted

by F(r). Generally F(r) is taken as a calibration constant which accounts for light

reflection losses in the optical sections of the lidar system, geometric crossover of the

receiver and laser beam, and the quantum efficiency of the receiver (Pollock, 1993;

Costa, 1998).

In addition to the above mentioned factors affecting the received signal intensity,

the signal will also depend on the probability of being backscattered at a range r. The

quantity $\beta(r)$ is the elastic volume backscattering coefficient, with units of m⁻¹ sr⁻¹, and is

a measure of the fraction of incident energy scattered in the backward direction (i.e.,

towards the lidar system) per unit solid angle, per unit path length. Although it is in

general a complex function of aerosol size distribution and shape, backscattering

generally increases with increased aerosol concentration. Finally, the exponential factor

seen in equation (8) will represent the transmissivity along the two-way path from source

to target and back; hence, it must be squared.

Incorporating equation (8) together with all the additional factors affecting the

LCM received signal strength, and noting that power is proportional to intensity, we

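finally get the lidar equation:

$$P(r) = P_0\, \frac{c\tau}{2}\, \frac{A}{r^2}\, F(r)\, \beta(r) \exp\left[ -2 \int_0^r \sigma(r')\, dr' \right] \qquad (9)$$

where P(r) is the received signal power (W),

$P_0$ is the transmitted pulse power (W),

and the functional dependence on $\lambda$ is understood.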

This is the equation that governs the relationship between the received lidar signal power

and the extinction coefficient, and hence the aerosol concentration via equation (5).

However, it rests on a number of assumptions.

In addition to the three assumptions made in arriving at equation (8), a number of

further assumptions were made in deriving the lidar equation. Below is a list of all the

assumptions made in the derivation of the lidar equation:

1. incoherent scatterers,

2. no significant emission or source terms ($J(r, \lambda) \ll I(r, \lambda)$),

3. first-order multiple scattering only,

4. only radiation scattered exactly through 180° is received,

5. only a plane wave is received (or use an effective receiver area),

6. no beam pulse stretching (constant $\tau$),

7. the receiver encompasses the laser beam,

8. the scattering medium does not change in index of refraction, size, shape, or

orientation distribution over the scattering volume,

9. $\sigma(r)$ is independent of $P_0$,

10. particles are not shadowed by one another,

11. the light source is monochromatic, and

12. the laser pulse is rectangular (Evans, 1984, 1988).

Note that the scattering volume is the volume of aerosol instantaneously

illuminated by the pulse. Most of the above assumptions are true for typical lidar systems

and aerosols; some of them impose somewhat opposing restrictions, calling for a

compromise between a small pulse length and scattering volume, and large enough

aperture area and beam divergence (Evans, 1988). Nonetheless, a reasonable

compromise is not overly restrictive to typical systems, and most of the assumptions can

be met. However, the assumption that no multiple scattering events take place is the

most difficult assumption to work with and is the one most easily violated. Dense

aerosols and receivers with large apertures will in general cause the violation of this

assumption. The inversion method attempts to correct for the multi-scattering effects, as

will be discussed below.

As can be seen from equation (5), in order to construct a concentration map for an

aerosol, one must first obtain an extinction map. This must somehow be obtained from

the returned power signal, as given by the lidar equation, as expressed in equation (9).

Even casual inspection of this equation shows that this is no trivial task. It is a non-linear

function of two unknowns (assuming F(r) to be constant). Numerous inversion

techniques have been devised over the past few decades, of varying success; most of

these methods are plagued by instability or inaccuracy. The Automatically Generated

Inversion of the Lidar Equation, or AGILE, developed at DREV over the mid-to-late

1980s (Evans, 1984, 1988), provides a fast inversion of the lidar equation without the

persistent instabilities of previous methods.

It is first necessary to reduce the number of unknowns in the lidar equation.

Based on several observational and theoretical studies, it has been found that when

particulate backscattering dominates (generally true for infrared wavelengths), a power

law can be used to relate $\beta(r)$ and $\sigma(r)$:

$$\beta(r) = d\, \sigma^{k}(r), \qquad (10)$$

where d is a constant, and k depends on lidar wavelength and various properties of

the aerosol. Reported values of the exponent are generally in the range 0.67 < k < 1.0

(Klett, 1981). A survey of the literature indicates that under most conditions, k = 1 is a

good estimate. Theoretical and experimental results over a wide variety of aerosols,

particle shapes, and extinction coefficients indicate that this assumption of linearity is

excellent. However, caution is required: certain effects can cause non-linearity, including

multi-scattering, changes in aerosol size distribution, shape or index of refraction (Klett,

1981).

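Incorporating the rule (10) into the lidar equation gives:

$$P(r) = P_0\, \frac{c\tau}{2}\, \frac{A}{r^2}\, F(r)\, d\, \sigma^{k}(r) \exp\left[ -2 \int_0^r \sigma(r')\, dr' \right] \qquad (11)$$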

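The strength behind the AGILE inversion is in its use of a clear air calibration

shot. Given that the extinction coefficient of clear air, $\sigma_c$, is constant over r, one obtains

from (11) the returned signal strength for the clear air shot, denoted by C(r):

$$C(r) = P_0\, \frac{c\tau}{2}\, \frac{A}{r^2}\, F(r)\, d\, \sigma_c^{k} \exp\left( -2 \sigma_c r \right) \qquad (12)$$

Now, dividing the returned signal by the clear air return (i.e., dividing equation

(11) by equation (12)), and rearranging, we get

$$\frac{P(r)}{C(r)} = \left[ \frac{\sigma(r)}{\sigma_c} \right]^{k} \exp\left[ -2 \int_0^r \sigma(r')\, dr' + 2 \sigma_c r \right] \qquad (13)$$

Note that system constants and the 1/r² dependence have dropped out. Taking

the kth root of this equation, and integrating over r produces

$$\int_0^r \left[ \frac{P(r')}{C(r')} \right]^{1/k} \exp\left( -\frac{2 \sigma_c r'}{k} \right) dr' = \frac{k}{2 \sigma_c} \left[ 1 - T(r)^{2/k} \right]$$

where T(r) is the transmissivity at distance r, defined as $T(r) = e^{-\tau}$, and $\tau$ is the optical

depth, defined as $\tau = \int_0^r \sigma(r')\, dr'$; note that transmissivity is bounded by the

interval [0, 1].

If it is assumed that $\sigma_c$ is small enough to be negligible ($\sigma_c \ll k/2r$, true for the

ranges considered in this analysis) then

$$\int_0^r \left( \frac{P}{C} \right)^{1/k} dr' = \frac{k}{2 \sigma_c} \left( 1 - T^{2/k} \right) \qquad (14)$$

where it is understood that P, C and T are all functions of r. Equation (14) is an

important result. Since transmissivity is a bounded quantity, this indicates that the left side

of the equation is also bounded. That is,

$$0 \le \int_0^r \left( \frac{P}{C} \right)^{1/k} dr' \le \frac{k}{2 \sigma_c}$$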

Thus, as transmissivity decreases (i.e., as the aerosol becomes increasingly dense)

the integral on the left of (14), which is the total normalized integrated backscatter,

approaches a maximum theoretical value. If the measured total integrated backscatter

exceeds this limit, then some experimental error has occurred; perhaps a digitization

error, or a physical assumption is no longer valid (such as that of single-scattering). This

provides a weak check on the system, inversion method and physical assumptions

(Evans, 1984). As will be discussed below, this fact will be used when the BTN

inversion is implemented in practice to attempt to correct for multiscattering effects in

dense clouds.

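Returning to equation (13), taking the kth root, and using the definition of

transmissivity,

$$\left( \frac{P}{C} \right)^{1/k} = \frac{\sigma}{\sigma_c}\, T^{2/k} \exp\left( \frac{2 \sigma_c r}{k} \right) \qquad (15)$$

Again assuming $\sigma_c \ll k/2r$, and rearranging, (15) becomes

$$\sigma = \sigma_c \left( \frac{P}{C} \right)^{1/k} T^{-2/k} \qquad (16)$$

Rearranging equation (14) gives:

$$T^{2/k} = 1 - \frac{2 \sigma_c}{k} \int_0^r \left( \frac{P}{C} \right)^{1/k} dr' \qquad (17)$$

Finally, substituting (17) into equation (16) gives:

$$\sigma = \frac{\sigma_c \left( P/C \right)^{1/k}}{1 - \dfrac{2 \sigma_c}{k} \displaystyle\int_0^r \left( P/C \right)^{1/k} dr'} \qquad (18)$$

where P, C and $\sigma$ are all functions of r. This is the final equation used by the AGILE

algorithm to calculate the extinction coefficient from the returned lidar signal and a clear

air calibration shot.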

When implemented in practice, the AGILE algorithm must account for some

additional factors and potential sources of error. The primary challenge to implementing

most inversion algorithms is two-fold: accounting for multiscattering effects, and the

response of the logarithmic amplifier. If the system response is too slow in recovery or

the cloud is dense enough for multiple scattering, then the limit on the integral in

equation (14) could be surpassed (Evans, 1984). For signals that approach this limit too

rapidly, the AGILE algorithm applies a numerical compensation. This keeps the signal

within the allowable limit (conservation of energy), and permits to some extent correction

for multiple scattering and the down time of the system (Roy et al., 1993).

This is done by multiplying the normalized return (P(r)/C(r)) by the value $T^z$,

where T is the transmissivity at r given by equation (17), and z is an empirical factor less

than unity. Monte Carlo simulations and theories of multiple scattering suggest that this

is a reasonable rule of thumb, and others (Kunkel and Weinman, 1976) have adopted a

similar approach. This correction procedure is only applied to returns that have an

optical depth greater than unity. It is a fairly general procedure in that it is not specific to

a given aerosol, but numerical compensation is specific to the lidar system (Roy et al.,

1993). Evans (1984) has found that a value of z = 0.8 provides good agreement with

measured extinction values for the LCM.

If too much correction is needed to impose the limit of equation (14), the AGILE

algorithm outputs an error value of 1. In practice, values of $\sigma(r) > 0.5$ m⁻¹ are assumed to

be the result of inversion errors, and are discarded. The minimum detectable extinction

of the LCM is estimated to be 10⁻⁴ m⁻¹, and inverted values below this threshold are

considered clear air returns (i.e., zero aerosol concentration).

In addition to its ability to compensate for multiple scattering events, the AGILE

algorithm offers a number of advantages over previous inversion methods. The value of

$\sigma_c$ can be easily obtained by a trial inversion of one shot in which the transmissivity is

measured, or by other means (Evans, 1988). This does not pose a problem for the LCM,

which can perform many clear air shots in a very short time. The value used in the

inversion algorithm is a, = 2 - 1 O" m-l. This value must be determined only once, and

can be used repeatedly for cases where many shots must be processed, provided that the

shots are taken under the same conditions. This is a great improvement over older

methods, where a reference shot must be obtained for every lidar return (Klett, 1981).

Not only does the calibration with a clear air shot eliminate system

constants such as F(r), $P_0$, and the 1/r² attenuation, it also considerably improves the

signal-to-noise ratio. For an AGILE-inverted signal, a histogram of extinction coefficient

frequency will show a standard deviation as much as one half of that obtained for an

inversion not using the clear-air calibration (Evans, 1984). Thus the AGILE inversion

allows for the detection of a very weak return.

Further, the integration of equation (18) is carried out using a hybrid version of

the trapezoid rule and Simpson's rule, providing an accurate method for numerical

integration with a uniform digitization rate, while giving an integrated value at every

digitized point (Evans, 1988). Transmissivity estimates incur less error than previous

methods, since only one numerical integration must be carried out instead of the usual

two (one for the normalized integrated backscatter, and another for the integral of

extinction values).
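The single running integral described above can be illustrated with a short sketch. It is not the AGILE code itself: it assumes the inversion takes the form sigma = sigma_c (P/C)^(1/k) / [1 - (2 sigma_c / k) * integral of (P/C)^(1/k)], it swaps in a plain running trapezoid rule for the hybrid trapezoid/Simpson scheme, and it omits the multiple-scattering compensation. The function names, the clear-air value, and the synthetic cloud are hypothetical.

```python
import numpy as np


def cumulative_trapezoid(y, dx):
    """Running trapezoid-rule integral of y, equal to zero at the first sample."""
    out = np.zeros_like(y, dtype=float)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * dx)
    return out


def agile_invert(ratio, sigma_c, dr, k=1.0):
    """Extinction profile from the normalized return P/C (assumed inversion form)."""
    g = ratio ** (1.0 / k)
    # T^(2/k), from one running integral of the normalized backscatter
    t_pow = 1.0 - (2.0 * sigma_c / k) * cumulative_trapezoid(g, dr)
    if np.any(t_pow <= 0.0):
        raise ValueError("integrated backscatter exceeds the limit k / (2 sigma_c)")
    return sigma_c * g / t_pow


def synthetic_ratio(sigma, sigma_c, dr, k=1.0):
    """Normalized return P/C for a known extinction profile (forward model)."""
    r = np.arange(sigma.size) * dr
    tau = cumulative_trapezoid(sigma, dr)
    return (sigma / sigma_c) ** k * np.exp(-2.0 * (tau - sigma_c * r))


if __name__ == "__main__":
    dr = 1.5                      # radial resolution of the LCM, m
    r = np.arange(150) * dr       # 150 returns per shot
    sigma_c = 1e-5                # hypothetical clear-air extinction, 1/m
    sigma_true = sigma_c + 0.01 * np.exp(-((r - 100.0) / 15.0) ** 2)
    ratio = synthetic_ratio(sigma_true, sigma_c, dr)
    sigma_est = agile_invert(ratio, sigma_c, dr)
    print(np.max(np.abs(sigma_est / sigma_true - 1.0)))  # relative error
```

Because the transmissivity term comes from the same running integral as the normalized backscatter, each digitized point yields an extinction estimate in one pass. The small residual error in the round trip reflects the neglected clear-air term in the exponent.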

The AGILE inversion technique has been validated by comparison with other

inversion techniques and field measurements. Specifically, inverted LCM returns were

compared to computations derived from Klett's inversion method (Klett, 1985), showing

excellent agreement (Evans, 1984). AGILE inversions also showed good agreement with

measurements made by coincident transmissometers, in situ concentration measurements,

and photographic estimates of cloud extent (Evans, 1984).

2.3 Artificial Neural Networks

The principal cellular unit of the brain is the neuron. Within the brain, millions of

these neurons are interconnected in a complex network where information is exchanged

back and forth via electrical signals. A given neuron receives numerous signals along

branch-like structures known as dendrites. If the combined signals are strong enough,

this neuron will 'fire', transmitting a signal along its axon to other neurons. Between the

dendrites of one neuron and the axon of another is a small gap called the synapse, across

which the signal is transferred. The magnitude of the received signal depends both on the

strength of the original transmitted stimulus, and the properties of the synapse. Figure

2.6 below shows the basic structure of the neuron.

Figure 2.6 The basic structure of a biological neuron.

Artificial neural networks attempt to model this mechanism. The basic unit of

ANNs is the processing element (PE), which receives external stimuli, combines them,

and transmits a signal. Each input to a PE is multiplied by a distinct factor, called a

weight, and these products are then summed at the PE. The sum is then modified by a

nonlinear transfer function, and the resultant value is passed on. This output value may

be the final result, or it may act as the input to other PEs, where another connection

weight is applied prior to being summed by the next processing element. The connection

weights are analogous to the synaptic signal strength of neural connections. The basic

structure of a single processing element is shown below in Figure 2.7.

[Diagram: inputs x_1 to x_n, connection weights, summation, and transfer function y_j = f(.).]

Figure 2.7 The basic structure of a processing element.
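The behaviour of a single processing element, as described above, can be sketched in a few lines. The function name is hypothetical, and the hyperbolic tangent is used as the transfer function only by way of example; no bias term is included because none is described.

```python
import math


def processing_element(inputs, weights, transfer=math.tanh):
    """Weighted sum of the inputs, passed through a nonlinear transfer function."""
    if len(inputs) != len(weights):
        raise ValueError("one weight is required per input")
    total = sum(w * x for w, x in zip(weights, inputs))
    return transfer(total)


if __name__ == "__main__":
    # Two inputs whose weighted contributions cancel, so the output is tanh(0).
    print(processing_element([1.0, -2.0], [0.5, 0.25]))
```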

The synapses play an important role in the process of learning. In 1949, Hebb

presented his postulate of learning, which states that the effectiveness of a synapse

between two neurons is increased by the repeated stimulation of one neuron by the other

across that synapse (Haykin, 1994). This mechanism for the learning process was

adopted for artificial neural networks. As examples of the process to be learned are

presented to the ANN, the connecting weights are modified in a systematic way so that

the network eventually 'learns' that process.

2.3.1 Multi-Layer Feed Forward Networks and Backpropagation Learning

The simple single PE shown above in Figure 2.7 does not achieve much in the

way of learning or predicting. The functionality desired in practical applications requires

multiple, interconnected PEs, which are organized into groups called layers. A typical

ANN consists of a sequence of layers with connections between the PEs of successive

layers. A characteristic multi-layer feed-forward (MLFF) ANN is shown below in Figure

2.8. In this figure, the leftmost layer is called the input layer, where data are initially

entered into the network, and the rightmost layer is termed the output layer, where the

ANN predictions are generated. The remaining layers are not part of the input or output,

and are called 'hidden layers'. The following convention will be used to describe a

network's architecture: the number of nodes in each layer is listed, starting with the input

layer and separated by a dash. For example, 9-30-15-1 represents a network with 9 input

nodes, 30 PEs in the first hidden layer, 15 PEs in the second hidden layer, and 1 output

PE. Feed-forward ANNs are so named because the input signals and the internal

intermediate signals are always propagated forward. The flow of information is only

directed towards the output, and no returning paths exist.

[Diagram: input layer, first hidden layer, second hidden layer, output layer.]

Figure 2.8 A multi-layer feed-forward ANN with two hidden layers (Haykin, 1994).
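The dash-separated architecture convention is easy to work with programmatically. The sketch below parses such a description and counts the connection weights between successive layers; the function names are hypothetical, and bias weights are not counted because the text does not mention them.

```python
def parse_architecture(spec):
    """Turn a description such as '9-30-15-1' into a list of layer sizes."""
    return [int(n) for n in spec.split("-")]


def count_weights(spec):
    """Number of connection weights in a fully connected feed-forward network."""
    sizes = parse_architecture(spec)
    return sum(a * b for a, b in zip(sizes, sizes[1:]))


if __name__ == "__main__":
    print(parse_architecture("9-30-15-1"))
    print(count_weights("9-30-15-1"))  # 9*30 + 30*15 + 15*1
```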

The backpropagation learning algorithm gets its name from the way corrections

are applied to the network weights. During training, input vectors are presented to the

network, and each vector is propagated forward, layer by layer, until an output vector is

calculated. This predicted output vector is compared to the target value associated with

that particular input vector, and an error is calculated. This error is then used to adjust

the weights between the last hidden layer and the output layer. An error value is then

computed using the outputs of the last hidden layer, and this error value is used to adjust

the weights in the previous layer. This continues until the weight connections from the

input layer are adjusted. In this way, errors are propagated backward layer by layer. This

is repeated many times for the vectors in the training set until the output error reaches a

minimum.

The derivation of the algorithm used for learning is presented below (Patterson,

1996). For simplicity, only MLFF networks with one hidden layer will be considered,

and the output layer will consist of only a single PE, as is typical of prediction problems.

Consider a network with n input PEs, h hidden layer PEs and 1 output PE. The following

notation will be adopted:

x          n-dimensional input vector (with elements $x_i$, i = 1, 2, ..., n)

$v_{ji}$   weight connection between input-layer PE i and hidden-layer PE j

$w_j$      weight connection between hidden-layer PE j and the output PE

$y_j$      output of jth hidden layer PE

z          output from the output PE for input vector x

t          target output for input vector x

f( )       nonlinear activation function

When a training vector x is presented to the input layer, each PE, j, in the hidden

layer sums the elements of x to $H_j$, first multiplying each $x_i$ by its associated weight

value:

$$H_j = \sum_{i=1}^{n} v_{ji}\, x_i, \qquad j = 1, 2, \ldots, h \qquad (19)$$

Similarly, as the outputs from the hidden layer, $y_j$, are passed to the output PE,

each one is first multiplied by its associated weight value, then summed to I:

$$I = \sum_{j=1}^{h} w_j\, y_j \qquad (20)$$

Thus $H_j$ is the combined input to the jth hidden-layer PE, and I is the combined

input to the output PE. The output from the jth hidden-layer PE is then given by:

$$y_j = f(H_j), \qquad j = 1, 2, \ldots, h \qquad (21)$$

Similarly, the output from the single output PE is:

$$z = f(I) \qquad (22)$$

Combining equations (19) to (22), the output PE produces a value z, due to an input

vector x, where:

$$z = f\left( \sum_{j=1}^{h} w_j\, f\left( \sum_{i=1}^{n} v_{ji}\, x_i \right) \right) \qquad (23)$$

Some of the weights and outputs for a small network are illustrated below in Figure 2.9

for clarity.

Figure 2.9 MLFF network connections and variables

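The goal of the learning process is to minimize the difference between the

network's output, z, and the target output, t, over all the input vectors in the training set.

One must therefore first define an error function, or cost function, to be minimized.

Typically, the mean square error is used to define the error function; thus, the error

associated with a given input vector x is:

$$E = \frac{1}{2} (t - z)^2 \qquad (24)$$

To minimize this error function by adjusting the weights, a gradient descent

method is adopted, so that the weight adjustment, $\Delta w_j$, is in the direction of decreasing

error:

$$\Delta w_j = -\eta\, \frac{\partial E}{\partial w_j} \qquad (25)$$

where $\eta$ is a constant learning coefficient, typically less than 1. The chain rule is then invoked to

evaluate equation (25):

$$\frac{\partial E}{\partial w_j} = \frac{\partial E}{\partial I}\, \frac{\partial I}{\partial w_j} \qquad (26)$$

From equation (20), the second factor is:

$$\frac{\partial I}{\partial w_j} = y_j \qquad (27)$$

The first factor of (26) can be broken down using the chain rule again:

$$\frac{\partial E}{\partial I} = \frac{\partial E}{\partial z}\, \frac{\partial z}{\partial I}$$

From equations (24) and (22) we have

$$\frac{\partial E}{\partial z} = -(t - z) \quad \text{and} \quad \frac{\partial z}{\partial I} = f'(I),$$

respectively, where $f'(I)$ is the derivative of f with respect to I. Consequently,

$$\frac{\partial E}{\partial I} = -(t - z)\, f'(I) \qquad (28)$$

Substituting equations (27) and (28) into (26) we arrive at the expression:

$$\frac{\partial E}{\partial w_j} = -(t - z)\, f'(I)\, y_j \qquad (29)$$

For compactness, define $\delta = (t - z)\, f'(I)$.

Then equation (25) can be written:

$$\Delta w_j = \eta\, \delta\, y_j \qquad (30)$$

This weight update rule is valid for all the weights connecting the hidden layer PEs to the

output layer PE.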

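Next, the weight updates for the weights connecting the input layer to the hidden

layer must be calculated. As is clear from equation (23), these weights are deeply

embedded in the error function. An expression is desired for:

$$\frac{\partial E}{\partial v_{ji}} = \frac{\partial E}{\partial H_j}\, \frac{\partial H_j}{\partial v_{ji}} \qquad (31)$$

The second factor in (31) is easily evaluated from equation (19):

$$\frac{\partial H_j}{\partial v_{ji}} = x_i \qquad (32)$$

The first factor can be broken down using the chain rule:

$$\frac{\partial E}{\partial H_j} = \frac{\partial E}{\partial y_j}\, \frac{\partial y_j}{\partial H_j} \qquad (33)$$

The second factor of (33) is easily evaluated using (21):

$$\frac{\partial y_j}{\partial H_j} = f'(H_j) \qquad (34)$$

and the first factor can be evaluated directly from equations (23) and (24):

$$\frac{\partial E}{\partial y_j} = -(t - z)\, \frac{\partial z}{\partial y_j}$$

and using equation (23):

$$\frac{\partial E}{\partial y_j} = -(t - z)\, f'(I)\, w_j \qquad (35)$$

Substituting equations (32) to (35) into (31) finally yields:

$$\Delta v_{ji} = \eta\, x_i\, f'(H_j)\, (t - z)\, f'(I)\, w_j = \eta\, x_i\, f'(H_j)\, \delta\, w_j \qquad (36)$$

where $\delta$ is defined above. Again, for compactness, we can define $\delta_j$ as:

$$\delta_j = f'(H_j)\, \delta\, w_j$$

Then all the weights connecting the input layer to the hidden layer are adjusted according

to the following rule:

$$\Delta v_{ji} = \eta\, \delta_j\, x_i \qquad (37)$$

After the presentation of an input vector, and the calculation of the weight updates

according to equations (30) and (37), each weight in the ANN is adjusted according to:

$$w_j^{new} = w_j^{old} + \Delta w_j \qquad (38)$$

$$v_{ji}^{new} = v_{ji}^{old} + \Delta v_{ji} \qquad (39)$$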

This is the most basic form of the backpropagation learning rule. Some comments

concerning the details of implementing this algorithm and some useful enhancements are

in order.
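The basic learning rule can be sketched for a network with one hidden layer and a single output PE. This is an illustration under stated assumptions rather than the implementation used in this work: the transfer function is the hyperbolic tangent, the error is E = (t - z)^2 / 2, no bias terms are included, and the function names, array shapes, and example values are hypothetical.

```python
import numpy as np


def forward(x, v, w):
    """Forward pass. v has shape (h, n), w has shape (h,), x has shape (n,)."""
    H = v @ x          # combined input to each hidden PE: sum_i v_ji x_i
    y = np.tanh(H)     # hidden outputs: y_j = f(H_j)
    I = w @ y          # combined input to the output PE: sum_j w_j y_j
    z = np.tanh(I)     # network output: z = f(I)
    return H, y, I, z


def backprop_step(x, t, v, w, eta=0.1):
    """One weight update for a single (input vector, target) pair."""
    H, y, I, z = forward(x, v, w)
    delta = (t - z) * (1.0 - z ** 2)         # delta = (t - z) f'(I)
    delta_j = (1.0 - y ** 2) * delta * w     # delta_j = f'(H_j) delta w_j
    w_new = w + eta * delta * y              # w_j <- w_j + eta delta y_j
    v_new = v + eta * np.outer(delta_j, x)   # v_ji <- v_ji + eta delta_j x_i
    return v_new, w_new


def error(x, t, v, w):
    """Error for one input vector, E = (t - z)^2 / 2."""
    return 0.5 * (t - forward(x, v, w)[3]) ** 2


if __name__ == "__main__":
    v = np.array([[0.1, -0.2], [0.3, 0.4], [-0.5, 0.2]])  # hypothetical weights
    w = np.array([0.2, -0.1, 0.3])
    x, t = np.array([0.5, -0.3]), 0.4
    print("error before:", error(x, t, v, w))
    for _ in range(200):
        v, w = backprop_step(x, t, v, w)
    print("error after: ", error(x, t, v, w))
```

Both sets of updates are computed from the weights as they stood before the step, so the hidden-layer correction uses the old output weights, matching the order of the derivation.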

The choice of the form of the non-linear transfer function f is somewhat flexible. The only strict requirements are that it is bounded, continuous, and continuously differentiable. Typical choices for f are the sigmoid function and the hyperbolic tangent function. These functions are shown in Figure 2.10 below. The shape of these two functions is similar; the main difference is the range onto which the domain is mapped. As can be seen from the figure, the sigmoid function maps values onto the range [0, 1], while the hyperbolic tangent maps onto [-1, 1]. Note also that the derivative of either function can be expressed in terms of the function itself (see Figure 2.10), which is convenient due to the presence of the derivative factors in the weight update equations. The hyperbolic tangent function was used for this work.

$f(x) = \tanh(x), \qquad f'(x) = 1 - (f(x))^2$

Figure 2.10 Common transfer functions: (top) the hyperbolic tangent, (bottom) the sigmoid function.
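As a small illustration of why these derivative identities are convenient, the sketch below returns each transfer function together with its derivative computed from the function value alone. The sigmoid is taken here in its usual logistic form, 1/(1 + e^(-x)), whose derivative is f(1 - f). The function names are hypothetical.

```python
import math

def tanh_and_derivative(x):
    """Hyperbolic tangent and its derivative, 1 - f**2, reusing the function value."""
    f = math.tanh(x)
    return f, 1.0 - f * f

def sigmoid_and_derivative(x):
    """Logistic sigmoid and its derivative, f * (1 - f), reusing the function value."""
    f = 1.0 / (1.0 + math.exp(-x))
    return f, f * (1.0 - f)
```

In a backpropagation pass the function value is already available from the forward pass, so the derivative costs only one extra multiplication per PE.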

Input vectors should not be presented to the ANN as raw values. They should be scaled to a smaller range so that they are more compatible with the transfer functions used for the learning algorithm. Figure 2.10 shows that both the sigmoid and hyperbolic tangent functions behave fairly linearly over the range [-2, 2]. If an input value that is much greater than this is presented to the network, even with small weights in the network, the summations will be large. This will cause the transfer function to become saturated. When saturated, the derivative of the transfer function approaches zero; since the derivative is a factor in the weight update equations, learning stops for PEs with large summation values (NeuralWare, 1993a). For this reason, it is desirable to map the raw 'real-world' values of the input and target output vectors to a small range. Also, if the input and output variables are not of the same order of magnitude, some variables may appear to have more importance than others (Baughman and Liu, 1995). Since the hyperbolic tangent was used for this work, all input values were linearly mapped to the range [-1, 1]. The target output values were linearly mapped to the reduced range [-0.8, 0.8], where the transfer function is more linear. This is common practice, and improves network training.
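The linear mapping described above, together with the inverse needed to turn network outputs back into real-world values, can be sketched as follows. The helper name `make_linear_map` is hypothetical, and the minimum and maximum of each variable are assumed to be known from the data set.

```python
def make_linear_map(lo, hi, new_lo, new_hi):
    """Return (forward, inverse) functions mapping [lo, hi] linearly onto [new_lo, new_hi]."""
    scale = (new_hi - new_lo) / (hi - lo)

    def forward(value):
        # Raw value -> scaled value presented to the network
        return new_lo + (value - lo) * scale

    def inverse(scaled):
        # Scaled network value -> raw value
        return lo + (scaled - new_lo) / scale

    return forward, inverse
```

For example, `make_linear_map(17.0, 24.0, -1.0, 1.0)` would scale an input spanning 17 to 24 onto [-1, 1], while a target variable would use `new_lo=-0.8, new_hi=0.8`.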

Another common modification to the standard backpropagation learning rule is a process called batch updating. Instead of updating the network weights after the presentation of each input vector in the training set, errors and weight adjustments are stored and averaged over an epoch. An epoch can be a complete pass through the training set, or a fraction of it. These average adjustments may better represent general trends in the training set, and erratic changes in response to individual training vectors are avoided. Batch training often improves convergence rates, particularly for noisy data sets, and this approach was used here.
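A minimal sketch of batch updating is given below. It assumes the weights are held in a flat list and that the per-vector adjustments computed during the epoch have been stored as lists of the same length; the function name `batch_update` is hypothetical.

```python
def batch_update(weights, per_vector_updates):
    """Average the stored per-vector adjustments over one epoch and apply them once."""
    n = len(per_vector_updates)
    averaged = [sum(update[k] for update in per_vector_updates) / n
                for k in range(len(weights))]
    return [w_k + d_k for w_k, d_k in zip(weights, averaged)]
```

Adjustments of opposite sign from individual noisy vectors cancel in the average, which is the smoothing effect described above.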

Another technique that can greatly improve convergence rates uses an adjustable learning coefficient $\eta$, and introduces another variable parameter, $\alpha$, known as the momentum coefficient. This enhancement is known as the Extended Delta-Bar-Delta (EDBD) learning rule (NeuralWare, 1993b) and it specifies distinct values of $\eta$ and $\alpha$ for each connection weight; furthermore, these values are themselves adjusted with each iteration. The momentum coefficient is used to add a fraction of the previous weight adjustment to the current weight adjustment. For iteration s,

$$\Delta w_j(s) = \eta\,\delta(s)\, y_j(s) + \alpha\,\Delta w_j(s-1) ,$$

where normally $0 < \alpha < 1$. If a given connection weight is adjusted in the same direction (same sign) over several iterations, the learning and momentum coefficients for that connection are increased. Conversely, if a weight adjustment changes direction over several iterations, $\eta$ and $\alpha$ are decreased. The effect of these enhancements is to reinforce general trends while damping out oscillatory behaviour.
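The momentum term alone already shows the reinforcing and damping behaviour described above. The sketch below implements only the momentum update for a single weight with fixed coefficients; the per-weight adaptation of the learning and momentum coefficients that EDBD adds is not reproduced here. The function name and argument names are hypothetical.

```python
def momentum_update(delta_y, prev_dw, eta, alpha):
    """Weight adjustment for iteration s: eta * delta(s) * y(s) + alpha * previous adjustment."""
    return eta * delta_y + alpha * prev_dw
```

With a steady gradient term of 1, eta = 0.1 and alpha = 0.5, the adjustment grows toward eta / (1 - alpha) = 0.2. If the gradient term alternates in sign, the adjustment settles at a magnitude of eta / (1 + alpha), about 0.067, so oscillations are damped.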

2.3.2 Generalization and Separation of Data Sets

Ultimately, the goal of the training process is to produce an ANN that can predict output values well when presented with inputs it has not seen before. This is called generalization. The degree to which a network can generalize well is determined by a number of factors, including the size of the training set, how well the training set represents the process being modeled, the choice of appropriate input variables, and network architecture (i.e., the number of hidden layers and PEs).

In general, larger architectures can model more complex processes. The additional connection weights give the ANN more flexibility to represent intricate relationships within the training set. Conversely, a small architecture may not be sufficiently robust to accurately model the process. However, there is a trade-off between network complexity and training set size. If the training set is small, an overly complex network will tend to 'memorize' the data set, meaning it will learn to predict the output of training examples very well, but will predict poorly when presented with previously unseen input vectors. This is called over-training, and can be avoided by testing the ANN's generalization capability during the training process.

In order to do this, the full data set must first be divided into a training set and a test set. It is also common practice to separate a third set called the validation set, which is representative of a typical model-deployment scenario. The training set is used in the backpropagation learning rule to build the ANN model, while the test set is used to assess the trained net's ability to generalize. The test and training sets are disjoint, but both are drawn from the same population to ensure that they cover the same domain of input space.

NeuralWorks Professional II/PLUS provides a method to check the ANN's generalization ability as training proceeds, using its 'SaveBest' command. This command first tests the untrained network against the test set, to form a base-line performance estimate. The network is then trained for a specified number of iterations (i.e., random presentations of training vectors), and is tested again against the test set. If the ANN's performance improves from the last evaluation, the network is saved, and training continues. After the specified number of training iterations, the network is again tested, and if generalization improves, it is resaved. This process of training and testing continues until the ANN no longer shows improved generalization from continued learning. In this way, the SaveBest command retains the network that shows the best generalization, and thus avoids over-training.

Chapter 3

Collection and Analysis of Data

Considerable processing of the LCM data was required to put it into a form amenable to both ANN modeling and comparison with Gaussian puff models. A description of the experimental setup of the aerosol release trials is first presented. This is followed by a discussion of the analysis of the inverted concentration maps, and the selection of the training and test sets. Finally, the details of the Gaussian puff models used for ANN model comparison are given.

3.1 Experimental Setup and Collection of Data

The aerosol release trials were conducted at Canadian Forces Base Valcartier, on a military obscurant trial range controlled by DREV during the period 5-12 August, 1997. Forested hills covered much of the surrounding area, but the trial range itself was a very large, level plateau. Most of the surface was exposed soil, with small bumps and ripples varying about 10 cm in height. About 30% of the surface was covered in long grass (about 40 cm) and small shrubs (about 50 cm).

The arrangement of the equipment is summarized below in Figure 3.1. The weather measurement system was placed on a pole about 5 m above the surface, to the left of the LCM scanning volume. Here, measurements of temperature (°C), atmospheric pressure (in Hg), wind speed (m/s) and direction were taken. Wind speed was measured using a cup anemometer, and direction was measured with a bi-directional vane (Davis Weather Wizard III; Davis, 2000). The LCM was about 1.5 m above the ground. A low sand dune was located about 175 m north of the LCM.

Figure 3.1 Layout of the experimental trial plateau.

Unfortunately, the meteorological data were not sampled frequently enough to determine any statistical parameters of the turbulence. Just prior to each release, a single instantaneous measurement was taken from the weather measurement system and the wet bulb anemometer. Temperature and pressure varied slowly from trial to trial, but wind speed and direction fluctuated considerably between trials, with consecutive trials separated by about 3 minutes. Considerable fluctuations also occurred during each trial, but these variations were not measured.

As shown in Figure 3.1 above, the aerosol was released from a particle

disseminator, which was placed in one of two separate locations. These two different

locations were used to remove any bias introduced due to the position of the release

relative to the LCM scanning volume. In either release location, the disseminator was

placed directly on the surface. However, the disseminator's nozzle was located about

0.25 m above the ground, at a slight inclination (about 20° relative to the surface). The

aerosol was placed in the disseminator reservoir, and was forced out the nozzle into the

environment.

The aerosol used for all the trials analyzed here was kaolin, a fine ground ceramic powder (H2Al2Si2O8·H2O), with particle size less than 3 µm. It had a measured mass extinction coefficient α = 1.2 ± 0.2 m²/g at a wavelength of 1.06 µm, which compared favourably with the value used in COMBIC of 1.0 m²/g (Ayres and Desutter, 1995). Kaolin is inert and non-buoyant. For each trial, 50 g of kaolin was released from the disseminator.

Table A.1 in Appendix A summarizes the meteorological measurements taken during the trials. Each kaolin release was scanned six times as it dispersed. Cloud conditions were also recorded during the trials. The skies were very clear on August 5, 7 and 12, with very little or no cloud cover. On August 6, there was at most about 60% grey cloud cover. A fuller description of the actual data collection is provided by Costa (1998).

3.2 Analysis of Inverted LCM Scans

A total of 51 separate kaolin releases were performed over the four days of trials. Each release was scanned by the LCM six times (except one, which was scanned only twice), resulting in a total of 302 LCM scans. Each scan was inverted using the AGILE algorithm. For those releases where six sweeps were taken per scan, this produced a three-dimensional extinction map containing 39 600 points per scan. When there were eight sweeps per scan, the extinction maps contained 52 800 points. Most of these data points were clear air returns, and only a fraction of each scan contained returns from the diffusing kaolin cloud. All data points from each scan that were below the LCM extinction detection threshold of $10^{-4}$ m⁻¹ were considered clear air returns, and were removed from the data set. All values that were a result of inversion error, that is, those that were ≥ 0.5 m⁻¹, were also removed. This left 302 LCM scans that contained, on average, about 1100 extinction measurements of the diffusing kaolin cloud.

Not all of these scans were useful, however. A custom-designed program for the

LCM, called LCVS, allows for quick viewing of raw and inverted extinction maps, in the

form of contour plots. This lets the user easily get an idea of the qualitative behaviour of

the puffs in each scan. Figure 3.2 below shows a typical LCM scan, both in raw form,

and after being inverted by the AGILE algorithm. Using LCVS, it was found that in

many cases the kaolin release had moved mostly or entirely out of the LCM scanning

volume, leaving an extinction map of a very small portion of the puff. Typically this was

the case for the first and sixth scans of a given release.

Figure 3.2 The bottom sweep of a typical LCM scan shown in raw form (bottom) and inverted (top), displayed using LCVS (bird's eye view). Radial grid lines are spaced about 50 m apart; azimuth grid lines are 10° apart. Note the reduced noise in the inverted scan. The strong return across the top is the sand dune.

Each inverted scan was manually inspected using LCVS, and those that appeared to be missing considerable portions of the cloud were removed from the data set. This left 187 scans, many of which were still missing portions of the cloud above and below the LCM scanning volume. This was unavoidable, given the small range in elevation of the scanning volume (10°), and the fact that the LCM itself was mounted about 1.5 m above the ground. However, nearly all of the remaining scans contained the full horizontal extent of the diffusing kaolin puff.

The puff extinction map in each remaining scan had to be isolated. This was

necessary to discriminate between returns from the puff of interest, and returns from

extraneous influences, such as the sand dune (see Figure 3.2) or portions of previous

releases still in the LCM scanning volume. Again using LCVS, each of the 187 scans

was manually inspected, and boundaries were defined for each sweep of each scan. In

the subsequent analyses, only LCM returns within the predefined boundaries of a given

sweep and scan were considered.

It was then necessary to put the remaining isolated scans into the appropriate coordinate system. As described in Chapter 2, the coordinate system was centred on the puff's centre of mass, and was oriented such that the x-axis points downwind. However, there were serious difficulties with the wind speed and direction data taken during the trials (listed in Table A.1). As noted, these were instantaneous readings, taken before each trial; i.e., about every three minutes. These data provided almost no information about either the means or variances of wind speed and direction. It was decided to estimate the mean wind speed and direction during a given release from the trajectory of the kaolin puff from scan to scan.

The centre of mass of each scan was calculated, and from these values the horizontal distance traveled from scan to scan in a given release was derived. Since the time between scans was known, an estimate of the horizontal wind speed and direction could be made. These wind speed estimates replaced those presented in Table A.1 for the remainder of the analysis, and are listed in Table A.2. The wind direction estimates and puff centres of mass were used to translate and rotate each scan into the appropriate coordinate system.

It should be noted here that two separate sets of the LCM data were constructed, each employing a different coordinate system. That is, the same data were represented in two different ways. The first data set was translated and rotated as indicated above, so that the position of each LCM return in a scan was given in relation to that scan's centre of mass. Thus, this data set was analyzed in the framework of relative diffusion, as discussed in Chapter 2, where puff meander was entirely removed from the dispersion process. The second data set was constructed so that the origin of the coordinate system followed the horizontal centre of mass of the puff, but the vertical origin was fixed at ground level. Thus, horizontal meander was removed from the diffusion process, but the vertical meander was incorporated into the puff's dispersion, and would be partially responsible for the vertical spread of the puff. This second coordinate system was employed for two reasons. First, many common Gaussian puff models use absolute coordinates, and include puff meander implicitly as part of the diffusion process. Second, this system may offer some insight into the inhomogeneity of the turbulence due to the surface, and its variability with height from the ground. These data sets are referred to as data set 1 and data set 2, respectively.
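The trajectory-based wind estimate and the two coordinate systems can be illustrated with the sketch below. It is an assumption-laden outline rather than the processing code actually used: each LCM return is taken to be a tuple (x, y, z, c) in ground-fixed coordinates, the centre of mass is assumed to be weighted by the measured value c, and the function names are hypothetical. The flag `relative_z` switches between the data set 1 convention (z relative to the centre of mass) and the data set 2 convention (z relative to the ground).

```python
import math

def centre_of_mass(points):
    """Weighted centre of mass of (x, y, z, c) returns, using c as the weight."""
    total = sum(p[3] for p in points)
    return tuple(sum(p[k] * p[3] for p in points) / total for k in range(3))

def wind_from_centres(com_a, com_b, dt):
    """Mean horizontal speed and direction (radians from the +x axis) between two scans."""
    dx, dy = com_b[0] - com_a[0], com_b[1] - com_a[1]
    return math.hypot(dx, dy) / dt, math.atan2(dy, dx)

def to_puff_frame(point, com, direction, relative_z=True):
    """Translate to the puff's centre of mass and rotate so that +x points downwind."""
    dx, dy = point[0] - com[0], point[1] - com[1]
    cos_d, sin_d = math.cos(direction), math.sin(direction)
    x = dx * cos_d + dy * sin_d       # downwind component
    y = -dx * sin_d + dy * cos_d      # crosswind component
    z = point[2] - com[2] if relative_z else point[2]
    return x, y, z
```

For two scans taken `dt` seconds apart, `wind_from_centres` gives the speed and direction estimate, and `to_puff_frame` then expresses every return in the downwind and crosswind coordinates used as network inputs.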

3.3 Preparation of Data for ANN Modeling

Both data sets were prepared in the same manner, and the following discussion

applies to them both. The measured variables chosen for input are listed below.

Downwind position, x,

Crosswind position, y,

Vertical position, z,

Diffusion time, t,

Mean wind speed, U,

Ambient temperature, T,

Time of day,

Pasquill stability class, P,

Atmospheric pressure, p.

Recall that position variables are relative to the puff's centre of mass (z is relative to the ground for data set 2). Wind speed is as calculated from the puff trajectories, as described above, and Pasquill stability class is estimated from Table 2.1; these values are listed for each trial in Table A.2 of Appendix A. All other values are as described above.

The ANN predicted a single output variable: kaolin concentration, C, in g/m³. Given the relation $\sigma(r, \lambda) = \alpha(\lambda)\, C(r)$ and the fact that α = 1.2 m²/g for kaolin, the inverted LCM extinction maps immediately yield concentration maps.
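The conversion from extinction to concentration, and the removal of clear air and inversion-error returns described in Section 3.2, amount to the following sketch. The function names are hypothetical, and the threshold values are passed in explicitly rather than fixed in the code.

```python
ALPHA_KAOLIN = 1.2  # mass extinction coefficient of kaolin, m^2/g (value quoted in the text)

def extinction_to_concentration(sigma, alpha=ALPHA_KAOLIN):
    """Concentration in g/m^3 from an extinction coefficient in 1/m, using sigma = alpha * C."""
    return sigma / alpha

def keep_puff_returns(sigmas, low, high):
    """Drop clear air returns (below low) and inversion errors (at or above high)."""
    return [s for s in sigmas if low <= s < high]
```

For example, an extinction of 0.6 m⁻¹ corresponds to a concentration of about 0.5 g/m³.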

Neural networks have greater success modeling data that are evenly distributed

over the entire range of values (Wasserman, 1993). For this reason, it is desirable to

balance the distribution of each input and output variable so that certain portions of a

variable's range are not under-represented. Often a non-linear transform can

considerably improve a variable's frequency distribution. Duplicating vectors that have

values in poorly represented regions can also help to even out the frequency distribution.

Numerous non-linear transforms were applied to each of the input variables listed above. No transform showed any marked improvement in the uniformity of any of the input variables, and it was decided that none would be applied. However, for ANN development inputs must be in numerical format, so certain inputs did require some form of transformation. Specifically, time of day and Pasquill stability class had to be transformed.

The time of day inputs were linearly mapped onto the range [0, 1], where a value of zero corresponds to midnight, 00:00:00, and a value of 1 corresponds to the last second of the day, i.e., 23:59:59. Pasquill stability class was transformed so that class A was represented by a value of 1.0, and class B by 2.0. The intermediate class A-B was entered as 1.5. Other transforms for Pasquill stability class were also applied, such as 1-of-n coding and thermometer coding (NeuralWare, 1993c). However, these transformations did not significantly alter network results, and were discarded.
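The encodings discussed above can be sketched as follows. The numerical values for classes A, A-B and B follow the range listed for P later in this section; the 1-of-n and thermometer codes are shown in their generic textbook form, which may differ in detail from the NeuralWare versions. All names are hypothetical.

```python
def time_of_day_input(hours, minutes, seconds=0):
    """Map a clock time linearly onto [0, 1]: 00:00:00 -> 0, 23:59:59 -> 1."""
    return (hours * 3600 + minutes * 60 + seconds) / 86399.0

PASQUILL_ORDER = ["A", "A-B", "B"]
PASQUILL_VALUE = {"A": 1.0, "A-B": 1.5, "B": 2.0}   # single numerical input

def one_of_n(pasquill_class):
    """1-of-n coding: one input per class, only the matching one switched on."""
    return [1.0 if c == pasquill_class else 0.0 for c in PASQUILL_ORDER]

def thermometer(pasquill_class):
    """Thermometer coding: inputs switched on up to and including the class rank."""
    rank = PASQUILL_ORDER.index(pasquill_class)
    return [1.0 if k <= rank else 0.0 for k in range(len(PASQUILL_ORDER))]
```

The single-value coding keeps the input dimension at nine, whereas either of the alternative codings would add two further inputs.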

The output variable, C, did require a non-linear transform to even out its frequency distribution. The raw distribution was heavily skewed toward low concentration values. Also, the range of kaolin concentration values spanned four orders of magnitude. It was found that the transform log(C) significantly improved the uniformity of the concentration frequency distribution. Clear air data points (C = 0) outlining the kaolin clouds were included in the data set to help the ANN recognize the boundary of the puff. The inclusion of these points required a slight modification of the transform. A small offset allowed the clear air values to be logarithmically transformed:

log'(^ + 1 0 ~ ) was chosen. Trained ANN outputs can be converted back into

concentration predictions by applying the inverse of this logarithmic transform.
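A sketch of an offset logarithmic transform and its inverse is given below. Two assumptions are made purely for illustration: the logarithm is taken to base 10, and the offset is left as a caller-supplied parameter, since no particular value is assumed here. The function names are hypothetical.

```python
import math

def to_log_concentration(c, offset):
    """Offset logarithmic transform; the offset keeps clear air points (c = 0) finite."""
    return math.log10(c + offset)

def from_log_concentration(y, offset):
    """Inverse transform, turning a network output back into a concentration."""
    return 10.0 ** y - offset
```

Whatever offset is used for training must be reused unchanged in the inverse, otherwise low concentrations near the puff boundary are badly distorted.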

Before the duplication of any vectors in the data set, a test set and validation set were removed. The validation set was removed first. Since this set represents a typical model deployment scenario, one complete trial was removed from the data set to form the validation set. Trial 44 was selected at random, and was used for both data sets. Approximately 30% of the original set was randomly selected and set aside to form the test set.
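The separation described above might be sketched as follows. It assumes each vector is a dictionary carrying a hypothetical `"trial"` key, and it takes the test fraction from the vectors remaining after the validation trial has been removed, which is one possible reading of "the original set". The function name and the fixed random seed are illustrative choices.

```python
import random

def split_data(vectors, validation_trial, test_fraction=0.3, seed=0):
    """Return (training, test, validation) lists of vectors."""
    # Hold out one complete trial as the validation set
    validation = [v for v in vectors if v["trial"] == validation_trial]
    remaining = [v for v in vectors if v["trial"] != validation_trial]
    # Randomly set aside a fraction of what is left as the test set
    rng = random.Random(seed)
    rng.shuffle(remaining)
    n_test = round(test_fraction * len(remaining))
    return remaining[n_test:], remaining[:n_test], validation
```

Any duplication of vectors to balance the distributions would then be applied to the returned sets, after the split, so that no vector appears in more than one set.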

Once the training and test set were separated, certain vectors were duplicated to balance the input and output space more evenly. Given the high dimension of the input space (i.e., nine), it is very difficult to improve the distribution of one input without disrupting the distribution of another. Furthermore, it should be noted that even if all inputs had perfectly uniform frequency distributions, this would not guarantee that the entire input space is well represented. In order to determine this, joint frequency distributions would have to be considered; this would be a formidable task in nine dimensions. Figure 3.3 and Figure 3.4 show the histograms for the training set and test set, respectively. These figures show data set 1 histograms; data set 2 distributions are similar. Table 3.1 below summarizes the size of the training, test and validation sets for data sets 1 and 2.

Table 3.1 Total number of vectors in the training, test and validation sets of data sets 1 and 2.

Data Set Training Set Test Set Validation Set

Figure 3.3 Data set 1 training set frequency distributions after transformation and duplication.

Figure 3.4 Data set 1 test set frequency distributions after transformation and duplication.

The input variables covered the following ranges:

x ∈ (-70, 30) m,

y ∈ (-30, 30) m,

z ∈ (-15, 15) m for data set 1, z ∈ (0, 30) m for data set 2,

t ∈ (7.556) s,

U ∈ (0.5, 3.7) m/s,

T ∈ [17, 24] °C,

time of day ∈ [0.4083, 0.6625] (i.e., about 10:00 to 16:00),

P ∈ (1.0, 1.5, 2.0),

p ∈ [30.00, 30.45] in Hg.

Once the data sets were fully prepared as discussed above, network training began. There are no well-defined rules for determining the optimal parameters for ANN training. These are largely determined by the process being modeled and the degree to which the training set represents the salient features of the problem. Numerous networks of varying architecture were trained. Epoch size was varied to determine an optimal setting, and numerous adjustments were made to various network parameters in an attempt to find an optimal solution. The EDBD learning rule was used for all ANN modeling.

3.4 Gaussian Puff Modeling

Attempts were made to model the kaolin trials with Gaussian puff models. The traditional Gaussian model, given by equation (1), was used together with both the Slade and Pasquill parameterizations, as summarized in Figures 2.3 and 2.4, respectively. The kaolin trials were also modeled using the full COMBIC dispersion code. The predictions from these models were compared to LCM data, and the performance of these models was compared with that of the ANN models.

Data set 1 is in a form that is directly amenable to comparison with the Gaussian

puff model as given in equation (1). However, since data set 2 was constructed using

absolute z coordinates, a slightly modified version of the equation was used to model this

set. The form of this equation is given below (Sato, 1995).

where z is now the absolute vertical coordinate (relative to the ground), and zh is the

effective release height of the kaolin puff. The second term in square brackets is included

to account for reflection of the aerosol from the ground. This is a common feature of

many Gaussian puff models, and is accomplished by incorporating an 'image source'

positioned beneath the ground at -zh. Figure 3.5 below illustrates the concept of the

image source.

Figure 3.5 The 'image source' accounts for surface interactions of the puff by reflecting material off the ground.

The use of an image source assumes that all the material interacting with the surface is reflected back up into the puff. That is, no deposition due to impaction or chemical reaction is assumed. It is also assumed that the puff is non-buoyant, i.e., it remains at the effective release height as it disperses, and that gravitational settling is negligible. COMBIC also employs an image source to account for puff-surface interactions, and it models non-buoyant puffs using equation (40).
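The image-source idea can be illustrated with the common textbook form of an instantaneous Gaussian puff with total ground reflection. This sketch is not claimed to be identical to equation (40) or to the COMBIC formulation: the normalization, the argument names (puff mass `q`, dispersion coefficients `sx`, `sy`, `sz`, effective release height `zh`) and the function name are assumptions made for illustration. Here x and y are measured from the puff's horizontal centre and z from the ground.

```python
import math

def gaussian_puff(x, y, z, q, sx, sy, sz, zh):
    """Concentration of a Gaussian puff with an image source at -zh (total reflection)."""
    norm = q / ((2.0 * math.pi) ** 1.5 * sx * sy * sz)
    horizontal = math.exp(-x * x / (2.0 * sx * sx)) * math.exp(-y * y / (2.0 * sy * sy))
    direct = math.exp(-(z - zh) ** 2 / (2.0 * sz * sz))   # real source at +zh
    image = math.exp(-(z + zh) ** 2 / (2.0 * sz * sz))    # image source at -zh
    return norm * horizontal * (direct + image)
```

Because the image term returns exactly the material that the direct term would place below ground, the mass above the surface is conserved, which is the no-deposition assumption stated above.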

A modification to COMBIC allows the user to output the model's predicted three-dimensional concentration distribution. The user specifies the size of the grid and the grid cells; cubic grid cells with 1-m sides were used, and the grid size was chosen to entirely contain the modeled puff. Since COMBIC models diffusion relative to the ground, and includes the vertical meander of the puff in the dispersion process, COMBIC predictions were compared with data set 2. Also, COMBIC uses a fixed coordinate system in its model, so the predicted three-dimensional concentration distributions were centred on the puff's horizontal centre of mass before comparison with data set 2 values. A representative input card used for COMBIC modeling is shown below in Table 3.2.

Both the COMBIC model and the standard Gaussian puff models given by equations (1) and (40) require the specification of initial puff size. The dispersion coefficient profiles in Figures 2.3 and 2.4 are for the idealized case of a point source, i.e. a puff having no dimension at t = 0. Such a source is inappropriate for modeling the kaolin releases, since the disseminator immediately sprayed the kaolin over a finite volume. Initial puff radii were estimated using LCVS.

COMBIC models a single release scenario at a time. Since the validation set comprises multiple scans of a single release, it is readily modeled using COMBIC. However, the test set is composed of randomly selected points from all of the 187 different LCM scans. The direct evaluation of COMBIC against the test set would require a separate COMBIC model for each of the 187 scans. Instead, a smaller selection of 33 scans was modeled using COMBIC. These 33 scans were chosen so that a broad range of meteorological conditions from the kaolin release trials was represented. It is felt that the performance of COMBIC against this smaller subset of scans is representative of COMBIC's performance against the full test set.

Table 3.2 COMBIC input card used to model kaolin trial 18, scan 5.

WAVL VIS COMBIC PHAS FILE NAME ka01805 MET MET2 TERA MJNT CLOU SUBA SUBB DONE END CONTINUE WAVL VIS COMBIC PHAS FILE FILE NAME ka01805 ORIG LIST SLOC OLOC TLOC EXTC VIEW GREY TPOS DONE END STOP

Chapter 4

Results and Discussion

This chapter is divided into four parts. The first section presents and discusses the results of various networks trained on data sets 1 and 2. The networks showing the best performance for each data set were selected for further analysis and refinement. In the second section, statistical measures of model performance are defined, and the performance of the ANN models is compared to that of the Gaussian puff models and COMBIC. A sensitivity analysis conducted on the ANN models is then described, in order to determine the effect of each input on model performance. The last section presents an analysis of the concentration distributions predicted by the ANN models, and the significant features of these models are extracted into a simple parameterization. Curve fitting and plotting were effected using SigmaPlot v. 5.00 by SPSS, Inc. (SPSS, 1998).

4.1 ANN Model Development

Numerous ANNs were trained on both data sets using the EDBD learning rule. Training began on networks with simple architectures; more PEs and hidden layers were then added to evaluate the performance of more complex networks. In addition to testing different network architectures, numerous other network parameters that affect training were altered in attempts to improve the performance of the ANN models.

Epoch size was systematically varied to determine the optimal setting. Since network weight adjustments are averaged over the epoch, a small epoch size can often lead to erratic jumps in weight space. This is especially true of noisy data sets, where it is desirable for the network to train to the more general trends of the data, smoothing out the noise of the individual training vectors. This was found to be the case for data sets 1 and 2. Networks of varying architectures were trained with epoch sizes of 100, 200, 500 and 1000. The results of some of the preliminary ANNs trained on data sets 1 and 2 are listed in Appendix B in Tables B.1 and B.2, respectively. The best of these ANNs are tabulated below in Tables 4.1 and 4.2, showing each network's performance against both the test set and the validation set. Each ANN was trained using the SaveBest command, and training continued for up to 5 million iterations (about 50 complete presentations of the training set). Note that an epoch size of 500 was optimal for every architecture for either data set. Networks trained with epoch sizes of 100 and 200 did indeed show very erratic behaviour, with oscillatory RMS that failed to stabilize quickly. The larger epoch setting of 1000 slowed training considerably.

Table 4.1 The best networks for each architecture listed in Table B.1 (data set 1, relative-z coordinates), as determined by performance against the test set. Validation set statistics are also shown (RMS = root-mean-square error, R = linear correlation coefficient).

                               Test Set Statistics    Validation Set Statistics
ANN    Architecture   Epoch    RMS       R            RMS       R

Table 4.2 The best networks for each architecture listed in Table B.2 (data set 2, absolute-z coordinates), as determined by performance against the test set. Validation set statistics are also shown (RMS = root-mean-square error, R = linear correlation coefficient).

                               Test Set Statistics    Validation Set Statistics
ANN    Architecture   Epoch    RMS       R            RMS       R
2.1c   9-10-1         500      0.3565    0.6140       0.3816    0.5571
2.2c   9-20-1         500      0.3549    0.6171       0.3674    0.5923
2.3c   9-30-1         500      0.3508    0.6279       0.3820    0.5572
2.4c   9-10-5-1       500      0.3550    0.6179       0.3969    0.4961
2.5c   9-20-10-1      500      0.3425    0.6494       0.3940    0.5060
2.6c   9-30-15-1      500      0.3407    0.6529       0.3792    0.5555
2.7c   9-40-20-1      500      0.3315    0.6803       0.4098    0.4606
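For reference, the two statistics reported in these tables can be computed as in the sketch below, using their standard definitions: the root-mean-square of the prediction errors and the Pearson linear correlation coefficient between predicted and target values. The function names are hypothetical, and the inputs are assumed to be equal-length lists of transformed network outputs and targets.

```python
import math

def rms_error(predicted, target):
    """Root-mean-square error between predicted and target values."""
    n = len(predicted)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / n)

def linear_correlation(predicted, target):
    """Pearson linear correlation coefficient R between predicted and target values."""
    n = len(predicted)
    mean_p, mean_t = sum(predicted) / n, sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / math.sqrt(var_p * var_t)
```

A lower RMS and a higher R both indicate better agreement, which is how the rows of the tables are compared in the discussion that follows.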

In addition to altering the epoch size, a number of other network adjustments were made in attempts to improve network performance. Some networks were trained with additional weights directly connecting the input layer to the output PE. This bypasses the non-linear transfer function, and allows for linear parts of the problem to be solved directly. These linear connections had negligible effect on network training, suggesting that the process being modeled is highly non-linear. A number of networks were trained with the addition of random noise. This is done by adding a small random value to the summation at each PE before applying the transfer function, and can improve network generalization. In every case, the addition of random noise slowed network training with no noticeable improvement in performance. Given the random nature of instantaneous concentration distributions, it is likely that the data set is sufficiently noisy that the additional noise had no effect. In addition to this, a small, constant offset value of 0.1 was added to the derivative of the transfer function. This ensured a non-zero transfer function derivative, and allowed saturated PEs to continue to learn.
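The derivative offset is a one-line change to the backpropagation step. The sketch below shows it for the hyperbolic tangent, with the derivative written in terms of the PE output; the function name is hypothetical.

```python
def tanh_derivative_with_offset(f_value, offset=0.1):
    """Derivative of tanh expressed through its output f, plus a constant offset."""
    return (1.0 - f_value * f_value) + offset
```

For a fully saturated PE (output of 1 or -1) the plain derivative is zero, so its weights would never change; with the offset the factor is 0.1 and the weights continue to be adjusted.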

Network architecture had the most influence by far on ANN training and performance. As can be seen from Tables 4.1 and 4.2 above, larger architectures showed better performance against the test set, but not against the validation set. It should be noted that test set statistics are a better measure of network performance than validation set statistics. This is due in part to the fact that the test set is composed of about 10 times more vectors than the validation set (see Table 3.1 above). Also, the test set covers the full range of input variables, as shown in Figures 3.3 and 3.4, while the validation set is specific to a single kaolin release. Based only on test set statistics, networks 1.7c and 2.7c clearly show the best performance.

However, it is also customary to use the smallest network that produces adequate
accuracy on the test set (Wasserman, 1993); this guards against over-training. In general,
when modeling with a data set as large and noisy as that used here, it is unlikely that an
ANN will memorize the data. This is especially true when using a cross-validation
learning technique such as the SaveBest command, which continually checks for
improvements in network generalization. However, this data set is somewhat unusual.
Although the training sets are large, input vectors from the same scan will have identical
meteorological inputs, differing only in the position and time variables. This may reduce
the effective size of the training set, increasing the likelihood of over-training. For this
reason, smaller networks that show reasonable performance against the test set were
retained. Networks 1.2c and 2.2c (each with a single hidden layer of 20 PEs) both show
poorer performance against the test set than networks 1.7c and 2.7c, but still give
reasonable performance against the validation set. Thus, the following four networks
were selected for further analysis: 1.2c, 2.2c, 1.7c and 2.7c.
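The cross-validation idea behind a save-best strategy can be sketched generically as follows. This is not the SaveBest command itself; the function `train_with_save_best` and the toy training and evaluation callables are hypothetical stand-ins used only to show the logic of keeping the weights that give the lowest test set RMS.

```python
import copy
from typing import Callable, List, Tuple


def train_with_save_best(
    weights: List[float],
    train_step: Callable[[List[float]], List[float]],
    test_rms: Callable[[List[float]], float],
    n_checks: int,
) -> Tuple[List[float], float, List[int]]:
    """Train in blocks; after each block keep the weights only if test RMS improved."""
    best_weights = copy.deepcopy(weights)
    best_rms = test_rms(weights)
    saved_at: List[int] = []
    for check in range(1, n_checks + 1):
        weights = train_step(weights)      # one block of training iterations
        rms = test_rms(weights)            # evaluate on the held-out test set
        if rms < best_rms:                 # improvement: save this network
            best_rms = rms
            best_weights = copy.deepcopy(weights)
            saved_at.append(check)
    return best_weights, best_rms, saved_at


# Toy stand-ins (hypothetical): one weight that drifts past its optimum at 2.0.
def toy_step(w: List[float]) -> List[float]:
    return [w[0] + 0.5]


def toy_rms(w: List[float]) -> float:
    return abs(w[0] - 2.0)


best_weights, best_rms, saved_at = train_with_save_best([0.0], toy_step, toy_rms, n_checks=6)
```

In the toy run the network is saved at checks 1 to 4 and the later, worse weights are discarded, which is the behaviour that guards against over-training when the test set is truly independent of the training set.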

Typical learning curves are shown below in Figure 4.1. This shows the reduction
in RMS error against the test set as more training vectors are presented to the ANN. For
comparison, the figure shows learning curves for networks 2.2c (9-20-1) and 2.7c (9-40-20-1).
Learning curves for the ANNs trained on data set 1 are similar. In general it was
found that the 9-20-1 ANNs showed only marginal improvement after about 2 million
iterations, while the 9-40-20-1 ANNs showed steady improvement up to about 3 million
iterations. In both cases, the steepest part of the learning curve was within the first
500,000 iterations.

Also clearly shown is the difference in the RMS approached by each network. The
larger architecture networks generally were able to reach a lower RMS much more
rapidly than the smaller nets. Within the first 1 million iterations, the 9-40-20-1 ANNs
had trained sufficiently to produce a test set RMS that was about 5% lower than that of
the 9-20-1 ANNs. When the RMS stabilized, the larger networks showed about 10%
better performance against the test set than did the smaller ANNs.

[Plot: test set RMS versus millions of iterations for ANN 2.2c (9-20-1) and ANN 2.7c (9-40-20-1), with saved points marked.]

Figure 4.1 Learning curves of ANNs 2.2c and 2.7c. Also indicated in the figure are the points during training where RMS improved, and the ANN was automatically saved by the SaveBest command.

In any gradient-descent algorithm, the starting point can be critical. Before
training begins, ANNs are initialized by assigning every connection weight a small
random value. Each of the four ANNs listed above was initialized to a random point in
weight space, and trained using the SaveBest command. This was done 20 times for each
ANN, producing 20 different models from each of the above four. It was found that in
almost all cases, networks with different initializations converged to different solutions in
weight space. However, no one solution was significantly better than any other,
suggesting that the error surface had numerous minima of approximately the same
magnitude. The full results of these training sessions are listed in Table B.3 of Appendix
B. The mean test set statistics are summarized below in Table 4.3, where the error terms
indicate the standard deviations between the 20 ANNs of a given architecture and data
set.

Table 4.3 Mean test set statistics for the two architectures for each data set. The error term indicates the standard deviation between the 20 ANNs in each group.

Data set    Architecture    RMS    R

As can be seen from Tables B.3 and 4.3, the variation in performance between the
20 networks of a given data set and architecture is small. Consequently, none of the 20
models within a group is preferable to any other. However, many have trained to a
different minimum in weight space, and may produce slightly differing predictions.
Thus, for each architecture and data set listed in Table 4.3, the average of the predictions
of the 20 ANNs was taken as the final model. This is also felt to help increase the
generalization ability of the final models. If any one of the 20 ANNs in a group is trained
more or less in a specific region of input space, perhaps resulting in anomalous
predictions in that region, this model's deficiencies are partly suppressed by averaging
the predictions of all 20 trained networks. Two average models were constructed for
each data set. The final average ANN models are labeled as follows: models 1A and 1B
are trained on data set 1, and have 9-20-1 and 9-40-20-1 architectures, respectively.
Similarly, the average models trained on data set 2 will be labeled models 2A and 2B.
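The construction of an average model can be sketched as below. The helper `average_model` and the three toy member models are hypothetical; the sketch also assumes that the averaging is applied directly to the network outputs, which is not stated explicitly here.

```python
from statistics import mean
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], float]


def average_model(members: Sequence[Model]) -> Model:
    """Return a model whose prediction is the mean of the member predictions."""
    def predict(inputs: Sequence[float]) -> float:
        return mean(member(inputs) for member in members)
    return predict


# Toy members (hypothetical): the same response with different small biases,
# standing in for networks that converged to different minima.
members: List[Model] = [
    (lambda v, b=b: sum(v) + b) for b in (-0.3, 0.0, 0.3)
]
ensemble = average_model(members)
prediction = ensemble([0.1, 0.2])  # individual biases cancel in the average
```

The individual biases of the toy members cancel in the average, which mirrors the way anomalous predictions of any single network are partly suppressed.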

4.2 Comparing ANN Models with Gaussian Puff Models

Before the average ANN models could be compared to the Gaussian puff models
and COMBIC, it was first necessary to convert the ANN output into concentration
predictions. This was done by inverting the logarithmic transform that was applied to the
ANN target outputs. The concentration predictions of each of the average ANN models
were then evaluated against the test and validation sets, as were the predictions of the
Gaussian puff models and COMBIC. The Gaussian puff model described in equation (1)
was evaluated against data set 1, using both the Slade and the Pasquill dispersion
coefficient parameterizations. The Gaussian model of equation (40) was evaluated
against data set 2, as was the COMBIC model.

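The statistical evaluation of each of these models is based on the following model
performance measures.

Factor of two: $F2$ = fraction of data for which $0.5 \le C_p/C_o \le 2$,

Factor of ten: $F10$ = fraction of data for which $0.1 \le C_p/C_o \le 10$,

Correlation: $R = \dfrac{\overline{(\ln C_o - \overline{\ln C_o})(\ln C_p - \overline{\ln C_p})}}{\sigma_{\ln C_o}\,\sigma_{\ln C_p}}$,

where $\sigma_{\ln C_o}$ is the standard deviation of $\ln C_o$, and similarly for $\sigma_{\ln C_p}$,

Geometric Mean Bias: $MG = \exp\left(\overline{\ln C_o} - \overline{\ln C_p}\right)$,

Geometric Variance: $VG = \exp\left[\overline{(\ln C_o - \ln C_p)^2}\right]$.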

In the above equations, $C_o$ is an observed concentration, $C_p$ is the corresponding
predicted concentration, and overbars indicate an average over the test or validation set.
The use of the logarithmic forms of correlation, geometric mean and geometric variance
is justified when there is a large range of magnitudes in the observed and predicted
concentrations (CCPS, 1996). Since comparisons are being made between average
concentration predictions and instantaneous concentration measurements, there are many
data pairs with $C_o/C_p$ or $C_p/C_o$ equal to 10, 100 or more. Thus, the logarithmic
forms are more appropriate measures of model performance than are the more common
linear forms (CCPS, 1996; Mohan and Siddiqui, 1997). The ideal value of each
statistical measure defined above is 1.0.
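The performance measures can be computed as in the following sketch. The function name and the toy data are illustrative assumptions; the geometric mean bias is taken here as exp(mean(ln Co) - mean(ln Cp)), the usual convention in which values above unity indicate under-prediction, and all concentrations are assumed to be strictly positive so that the logarithms exist.

```python
import math
from typing import Dict, Sequence


def performance_measures(observed: Sequence[float],
                         predicted: Sequence[float]) -> Dict[str, float]:
    """Return F2, F10, R, MG and VG for paired observed/predicted concentrations."""
    n = len(observed)
    ratios = [cp / co for co, cp in zip(observed, predicted)]
    ln_o = [math.log(co) for co in observed]
    ln_p = [math.log(cp) for cp in predicted]
    mean_o = sum(ln_o) / n
    mean_p = sum(ln_p) / n
    # Covariance and standard deviations of the log-concentrations.
    cov = sum((a - mean_o) * (b - mean_p) for a, b in zip(ln_o, ln_p)) / n
    sd_o = math.sqrt(sum((a - mean_o) ** 2 for a in ln_o) / n)
    sd_p = math.sqrt(sum((b - mean_p) ** 2 for b in ln_p) / n)
    return {
        "F2": sum(0.5 <= r <= 2.0 for r in ratios) / n,
        "F10": sum(0.1 <= r <= 10.0 for r in ratios) / n,
        "R": cov / (sd_o * sd_p),
        "MG": math.exp(mean_o - mean_p),
        "VG": math.exp(sum((a - b) ** 2 for a, b in zip(ln_o, ln_p)) / n),
    }


observed = [1.0, 2.0, 4.0, 8.0]            # toy data, not thesis measurements
halved = [0.5 * c for c in observed]       # a model that under-predicts by half
perfect = performance_measures(observed, observed)
under = performance_measures(observed, halved)
```

For the toy model that under-predicts every value by a factor of two, R remains at unity while MG is 2, showing that correlation alone does not reveal a mean bias.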

Table 4.4a below summarizes the performance against the test set of the two ANN
models (ANN 1A and 1B), and the Gaussian puff model (equation (1)) using the Slade
(GPMs) and Pasquill (GPMp) parameterizations. Table 4.4b shows the performance of
the same models against the validation set.

Table 4.4a Model comparison over data set 1 test set.

Model      R       VG      MG     F2     F10
ANN 1A     0.63    14.43   1.40   0.31   0.84
ANN 1B     0.71     9.14   1.36   0.35   0.89
GPMs       0.44    58.22   1.89   0.25   0.73
GPMp       0.46    38.09   1.43   0.27   0.76

Table 4.4b Model comparison over data set 1 validation set.

Model      R       VG      MG     F2     F10
ANN 1A     0.58    13.52   1.01   0.30   0.83
ANN 1B     0.57    16.14   1.48   0.31   0.83
GPMs       0.52    17.17   0.93   0.30   0.82
GPMp       0.56    15.99   0.72   0.31   0.82

The statistical results for data set 2 are presented below. Table 4.5a shows the
performance against the test set of the two ANN models (ANN 2A and 2B), the Gaussian
puff model (equation (40)) using the Slade (GPMs) and Pasquill (GPMp)
parameterizations, and COMBIC. Table 4.5b shows the performance of the same models
against the validation set.

Table 4.5a Model comparison over data set 2 test set.

Model      R       VG      MG     F2     F10
ANN 2A     0.59    13.89   1.36   0.32   0.85
ANN 2B     0.66     9.22   1.31   0.34   0.89
GPMs       0.37    48.61   1.36   0.27   0.75
GPMp       0.35    50.87   1.34   0.28   0.75
COMBIC*    0.35    50.46   0.59   0.26   0.76

* Note: a different test set was used to evaluate COMBIC. See Section 3.4.

Table 4.5b Model comparison over data set 2 validation set.

Model      R       VG      MG     F2     F10
ANN 2A     0.60    13.15   1.30   0.31   0.85
ANN 2B     0.57    21.78   1.91   0.30   0.81
GPMs       0.46    26.08   0.74   0.25   0.68
GPMp       0.29    78.52   1.15   0.22   0.66
COMBIC     0.29    87.81   1.11   0.19   0.56

Note that in all cases, only about 20% to 35% of the predictions are within a factor of
two of the measurements. This clearly justifies the use of the logarithmic forms of the
statistical measures defined above.

Over both test sets, the ANN models showed significantly better correlation with
concentration measurements than did the Gaussian puff models. The ANN models also
showed better correlation against the data set 2 validation set, but all models gave
comparable correlation over the data set 1 validation set. In no cases did any of the
Gaussian puff models have higher correlation than the ANN models.

Scatter plots of each model over both the test set and validation set are presented
in Appendix C. A sample plot for the ANN model 1A is shown below in Figure 4.2. The
large scatter and low slope of the plot are characteristic of all the models analyzed here.
Both of these traits are to be expected. Each model predicts average concentrations, but
is being compared to instantaneous concentration measurements, which vary greatly from
the average distribution. Indeed, such instantaneous distributions are random, and a
perfect correlation is of course impossible.

Figure 4.2 Scatter plot for ANN model 1A over the (a) test set and (b) validation set.

Each model under-predicts the high concentrations and over-predicts the low
concentrations to varying degrees, as indicated by the low slope (significantly less than
unity) of each plot. Again, this can be attributed to the fact that average concentration
predictions are being compared to instantaneous measurements. The very process of
averaging or smoothing over the wildly fluctuating instantaneous distributions causes the
compression of extreme values into a smaller range. Predictions from these models
would presumably show much less scatter and a slope closer to unity if they were
compared to ensemble averages of measured concentration distributions. However, given
the present data set, statistically sound ensemble averages could not be constructed for
comparison.

The geometric mean and variance of each model can be displayed visually by
plotting VG versus MG. These plots are shown below in Figure 4.3 for data set 1 models,
and Figure 4.4 for data set 2 models. A perfect model would be placed at the point (MG,
VG) = (1, 1), and a model that has no random scatter but suffers a mean bias would lie
along the curve $\ln VG = (\ln MG)^2$, which is the minimum value of geometric variance
corresponding to a given geometric mean bias (CCPS, 1996). This parabola is indicated
in the figures along with lines of constant MG, corresponding to factor of two differences
in the mean.
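The origin of this minimum curve can be seen from a short derivation. It assumes the usual definitions $MG = \exp(\overline{\ln C_o - \ln C_p})$ and $VG = \exp[\overline{(\ln C_o - \ln C_p)^2}]$; the symbols $d$ and $\sigma_d$ are introduced here only for illustration.

```latex
% Let d = ln C_o - ln C_p for each data pair, with mean \bar{d} and variance \sigma_d^2.
\begin{align}
  \ln MG &= \overline{d}, \\
  \ln VG &= \overline{d^{2}}
          = \left(\overline{d}\right)^{2} + \sigma_d^{2}
          = (\ln MG)^{2} + \sigma_d^{2}
          \;\ge\; (\ln MG)^{2}.
\end{align}
% Equality holds only when \sigma_d = 0, i.e. when there is no random scatter
% about the mean bias.
```

The vertical distance of a model above the parabola on a plot of this kind is therefore a direct measure of its random scatter, $\sigma_d^2$.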

[Plots of VG versus geometric mean bias, MG: (a) data set 1 test set statistics, (b) data set 1 validation set statistics.]

Figure 4.3 Comparison of geometric variance and mean bias between the various models for data set 1 (a) test set and (b) validation set.

[Plots of VG versus geometric mean bias, MG: (a) data set 2 test set statistics, (b) data set 2 validation set statistics.]

Figure 4.4 Comparison of geometric variance and mean bias between the various models for data set 2 (a) test set and (b) validation set.

For both data sets, the more complex networks (ANN 1B and 2B) show better
performance over the test set than do the simpler ones (ANN 1A and 2A). The mean bias
of all networks over both test sets is comparable. Over the validation set, however, the
smaller ANNs have less variance and more accurate means. It appears that the complex
ANNs do not generalize well in the region of input space covered by the validation set.
This may indicate that the larger networks are over-trained.

The vectors in the validation set have meteorological inputs that are not
represented by any vectors in the training set. This is not the case for the test set.
Although no vectors are in both the test set and the training set (i.e., the two sets are
disjoint), many vectors in the test set have the same meteorological inputs as vectors in
the training set, since the test set was drawn from the same population of LCM scans. In
such a case, the cross-validation training technique employed by the SaveBest command
may not sufficiently guard against over-training. During training, a network may
continue to show improved performance against the test set simply because it is training
and testing on many of the same meteorological conditions. This may result in an
over-trained network that still shows good generalization over the test set. The validation set
provides a check against this. Taking this into account, it appears that the smaller
networks are better ANN models than are the larger ones. Hence, only networks 1A and
2A will be retained for further analysis.

All the Gaussian puff models estimate the mean within a factor of two of the
observed value, indicating that concentration predictions are within the correct range.
However, these models all suffer from large variance. That is, the Gaussian puff models
predict concentration levels near the correct magnitude, but it is the distribution of these
values that is in error. This may be attributed to the symmetrical shape of the Gaussian
puff model distributions. Neither equation (1) nor equation (40) accounts for the effects of
wind shear, which tends to distort a diffusing puff. The increased wind speed at higher
levels above the ground causes a puff to tilt, so that the top portion is carried downwind
faster than the bottom. This results in a vertically skewed, asymmetrical distribution (van
Ulden, 1992; Sato, 1995). Although the COMBIC model does account for wind shear, its
effect on puff diffusion is not incorporated until after a diffusion time of 30 seconds
(Ayres and Desutter, 1995). No shear-induced puff distortion was observed in any of the
trials modeled using COMBIC, even those with diffusion times greater than 30 seconds.

Another possible source of the large variance observed in the Gaussian puff
models is the estimate of initial puff radii. As previously noted, the initial dimensions of
a kaolin puff, i.e., immediately after it is released from the particle disseminator, were
estimated using the visualization software LCVS. The effective release height needed for
equation (40) was also estimated using LCVS. The first scan from a number of different
kaolin trials was examined (i.e., scans taken at a nominal diffusion time of zero), and it was
found that fairly consistent estimates could be formed. The disseminator immediately
forces the kaolin into a cloud with a vertical extent of about 4 m, centred about 2 m
above the ground. The initial puff length was estimated to be about 10 m in the direction
of the disseminator's nozzle, and about 2 m in the transverse direction. Given these
estimates, and the orientation of the disseminator with respect to the wind direction,
initial puff radii could be estimated for each scan. Of course these methods provided
only rough estimates; to determine the effect of the initial puff radii on each Gaussian
puff model's performance, numerous different values were tried. The best results in each
case were from the estimates listed above, and these are the results reported earlier in
Tables 4.4 and 4.5, and Figures 4.3 and 4.4.

It is often of interest to determine which input variables have the most profound
influence on an ANN model. This can help clarify which inputs are the best descriptive
variables in modeling the process at hand, and may suggest if certain variables can be
excluded from the model. Models 1A and 2A were analyzed to determine the influence
of each input variable on the predicted concentration. This was done in two ways. For
the first analysis, each input PE for each network was disabled one at a time (i.e., set to
zero), and the effect on network performance over the training and test sets was
measured. A similar approach was taken for the second analysis, except that each input
value was adjusted by a fixed percentage, a procedure known as dithering. Both
sensitivity analyses were performed on each of the 20 individual ANNs comprising
models 1A and 2A, and the results were averaged. In addition to this, an analysis of
model residuals was done to determine if there are any trends in model over-prediction or
under-prediction as a function of the input variables.

4.3.1 Disabled Inputs

One way to determine the importance of a given input PE is to fix its value to
zero, and test the resulting performance of the network, leaving all other inputs
unchanged. Since all connection weights attached to a given input PE are multiplied by
the PE value before summation (see equation (19)), a disabled input does not contribute
to any sums in the hidden layer. Consequently, the disabled PE does not contribute to the
predicted output (see equation (20)). In this sense, the disabled PE is effectively removed
from the network, and the change in performance of the ANN can be attributed to the
missing variable.

However, one must exercise caution when interpreting the results of disabling an
input PE. The change in network performance may indeed be due to the importance of
the disabled variable, but the distribution of that input variable also plays a part. Recall
that each input variable is linearly mapped to the range [-1, 1] before presentation to the
ANN. Setting an input PE to zero effectively fixes that variable's value to the middle of
the range of possible values. If a given vector has an extreme input variable value (i.e.,
near ±1), then fixing this variable at zero effects a large change. Conversely, a vector
whose input value is near the middle of the range (i.e., near zero) will not be changed
greatly by disabling the PE.

Therefore, when measuring the change in network performance due to a disabled
PE over a given data set, the distribution of the disabled variable over that data set can
have a significant effect, often making some variables appear more or less important than
they are. For example, if a certain input variable's distribution is centred near the
mid-range value, then fixing this input to zero will effect a smaller change on average and this
variable may seem to have less effect on network performance. Conversely, if a variable
is distributed mostly at the extremes, then fixing the value to zero effects a large change
on average over the data set, and this variable may appear to have more significance than
it truly does.
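Both the disabling procedure and the caveat about input distributions can be illustrated with a minimal sketch. The 2-2-1 network, its weights and the toy data below are hypothetical and are not taken from the models discussed here; a tanh hidden layer with a linear output PE is assumed.

```python
import math
from typing import List, Optional, Sequence

# Hypothetical 2-2-1 network with fixed illustrative weights.
W_HIDDEN = [[0.8, -0.5],   # weights into hidden PE 0 from inputs 0 and 1
            [0.3, 0.9]]    # weights into hidden PE 1 from inputs 0 and 1
W_OUT = [1.2, -0.7]        # weights from the hidden PEs to the output PE


def predict(x: Sequence[float]) -> float:
    """Forward pass: tanh hidden layer followed by a linear output PE."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_HIDDEN]
    return sum(w * h for w, h in zip(W_OUT, hidden))


def rms_error(inputs: Sequence[Sequence[float]], targets: Sequence[float],
              disabled: Optional[int] = None) -> float:
    """RMS error over a data set, optionally with one input PE fixed at zero."""
    total = 0.0
    for x, t in zip(inputs, targets):
        x = list(x)
        if disabled is not None:
            x[disabled] = 0.0          # disable the PE: it adds nothing to any sum
        total += (predict(x) - t) ** 2
    return math.sqrt(total / len(inputs))


def percent_rms_increase(inputs, targets, disabled: int) -> float:
    """Percentage increase in RMS caused by disabling one input PE."""
    base = rms_error(inputs, targets)
    return 100.0 * (rms_error(inputs, targets, disabled) - base) / base


# Toy data: input 0 spans its range, input 1 always sits at mid-range (zero).
inputs: List[List[float]] = [[1.0, 0.0], [-1.0, 0.0], [0.5, 0.0], [-0.5, 0.0]]
targets = [predict(x) + 0.1 for x in inputs]   # constant error so the base RMS is non-zero

increase_x0 = percent_rms_increase(inputs, targets, 0)
increase_x1 = percent_rms_increase(inputs, targets, 1)
```

In this toy case input 1 carries non-zero weights, yet disabling it changes nothing because every value already lies at the middle of the range. This is the distribution effect described above, taken to its extreme.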

Given the importance of variable distributions when disabling inputs, this analysis
was done over both the test set and training set. Change in network performance was
gauged by the percentage drop in R and increase in RMS as a result of disabling a given
input PE. These percentages were calculated for each of the 20 ANNs making up each
model, and the results were averaged. Figure 4.5 below shows the average results of
disabling each input PE in turn for model 1A, over both the test set and the training set.
Figure 4.6 shows the same results for the ANN model 2A.

[Bar charts of the change in R and RMS for each disabled input (x, y, z, t, U, T, time, P, p): (a) ANN 1A (training set), (b) ANN 1A (test set).]

Figure 4.5 Change in ANN 1A performance over (a) the training set and (b) the test set. Percentage decrease in R and increase in RMS are shown. Error bars indicate the standard deviation among the 20 ANNs making up model 1A (time indicates time of day).

[Bar charts of the change in R and RMS for each disabled input: (a) ANN 2A (training set), (b) ANN 2A (test set).]

Figure 4.6 Change in ANN 2A performance over (a) the training set and (b) the test set. Percentage decrease in R and increase in RMS are shown. Error bars indicate the standard deviation among the 20 ANNs making up model 2A (time indicates time of day).

Note that for ANN 1A, the largest differences between the training set (Figure
4.5a) and the test set (Figure 4.5b) evaluations are for the variables t, T, time and p (i.e.,
diffusion time, temperature, time of day and pressure). These are precisely the variables
whose frequency distributions differ notably between the training and test set (cf. Figures
3.3 and 3.4). This illustrates the effect that input variable distribution can have on the
results of a PE-disabling sensitivity analysis. The differences between Figure 4.6a and
4.6b can be attributed to the same cause. Since the distribution of each of the position
variables (x, y, z) is concentrated at the mid-point of each variable's range, their influence
may be underestimated by this analysis. Conversely, the time of day and pressure
distributions are concentrated at the extremes, and the importance of both of these
variables may be exaggerated.

It is clear from the above discussion that a direct comparison of these results
should only be done between variables with similar distributions. First comparing the
position variables, it appears that downwind distance has the most influence on either
model's performance. Model 2A attributes more importance to the vertical coordinate.
This is plausible, since model 2A was constructed in the framework of absolute vertical
diffusion, i.e., the z coordinate is measured from the ground up. Conversely, the z
coordinate for model 1A is relative to the puff's centre of mass, and the ANN may have
exploited the symmetry of this coordinate system, attributing less importance to the
vertical coordinate.

Diffusion time, wind speed and Pasquill class all have relatively uniform
distributions over both training sets. Both models place more importance on wind speed
than either of the other two variables, which is not unreasonable. Wind speed has a direct
and immediate effect on a concentration distribution, while a puff's distribution changes
less rapidly with diffusion time. The Pasquill stability class was estimated using Table
2.1, which contains a large degree of subjectivity, and attempts to combine the effects of
various independent parameters into a single measure of stability. Stability class
estimates using this scheme may not accurately reflect the true thermal stratification of
the lower PBL.

The remaining three inputs (temperature, time of day and pressure) have such
differing and non-uniform distributions that it is difficult to determine their influence on
network performance with any degree of confidence.

4.3.2 Input Dithering

Dithering is similar to PE disabling in that each input is varied one by one, and
the resulting effect on network performance is measured. However, it differs in a
fundamental way. Instead of fixing each input variable to zero, each value is 'dithered',
or adjusted, by a small constant value. The effect of each input on network performance
is expressed as the change in output divided by the change in input. In other words,
dithering estimates the partial derivative of the output with respect to each input variable,
i.e., $\partial y/\partial x_i$ (cf. the notation of Chapter 2). Since each variable is dithered by the same
amount, a variable's frequency distribution does not affect the analysis.

Each constituent ANN of models 1A and 2A was dithered over the test set. The
dithering constant was chosen as 5% of the input mapping range, [-1, 1] (i.e., 0.05). The
fractional change in output due to each dithered PE was expressed as a percent. The
mean absolute value of this percentage change was calculated over the test set for each of
the 20 ANNs per model, and the values were averaged. Figure 4.7 below shows the
results of the dithering analysis for both models.
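The dithering calculation can be sketched as follows. The function `dither_sensitivity` and the exponential toy model are hypothetical; only the dithering constant of 0.05 and the use of the mean absolute percentage change come from the description above. The sketch assumes that the undithered output is never zero.

```python
import math
from typing import Callable, List, Sequence


def dither_sensitivity(predict: Callable[[Sequence[float]], float],
                       inputs: Sequence[Sequence[float]],
                       index: int,
                       delta: float = 0.05) -> float:
    """Mean absolute percentage change in output when input `index` is shifted by `delta`."""
    changes: List[float] = []
    for x in inputs:
        base = predict(x)
        shifted = list(x)
        shifted[index] += delta                      # dither a single input
        changes.append(abs((predict(shifted) - base) / base) * 100.0)
    return sum(changes) / len(changes)


# Hypothetical stand-in for a trained network: strongly dependent on input 0.
def toy_model(x: Sequence[float]) -> float:
    return math.exp(2.0 * x[0] + 0.5 * x[1])


test_inputs = [[-0.5, 0.2], [0.0, 0.0], [0.7, -0.9]]  # inputs already mapped to [-1, 1]
sens_x0 = dither_sensitivity(toy_model, test_inputs, 0)
sens_x1 = dither_sensitivity(toy_model, test_inputs, 1)
```

Because every input is shifted by the same amount wherever it lies in its range, the result for this toy model does not depend on how the inputs are distributed, in contrast to the disabling analysis.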

Figure 4.7 Results of dithering inputs by 5% over the test set for (a) model 1A and (b) model 2A. Error bars indicate standard deviation among the 20 ANNs making up each model.

Note the similarity between the results from either model. The only significant
difference arises from z and pressure, p. Model 2A attributes more importance to the
vertical coordinate than does model 1A, and a likely reason for this was described above.
Model 1A shows that dithering the pressure input has a large effect on the predicted
concentration value. It was not expected that atmospheric pressure would be one of the
most influential inputs. In fact, this result may once again be due in part to the frequency
distribution of this variable. Figure 3.4 shows that the test set distribution of p is
characterized by a few regions of high population separated by regions of zero
population. It may be that the ANN models do not interpolate well in the regions of zero
population, and the dithering of the pressure variable may force the model to predict in
this poorly-trained region. It is not clear why only model 1A places such large
importance on pressure.

The lower significance attributed to the downwind distance, x, in either model
may be due to the large range of input values for this variable. As noted in Chapter 3, the
x coordinate takes on values spanning the range (-70 m, 40 m), while the other two spatial
variables cover considerably smaller regions. Concentration levels are not expected to
vary considerably at very large distances from the puff's centre; large variations generally
occur closer to the puff centre of mass. Dithering a vector that has very large spatial
coordinates (i.e., far from the centre of the puff) will likely have very little effect on the
concentration prediction. Since there are more vectors at large |x| than at large |y| or |z|,
the importance of the x coordinate is diminished in the dithering analysis.

The greater importance of temperature than Pasquill stability class may indicate
that the temperature measurements provide some estimate of atmospheric stability. This
may not seem likely, since it is the vertical temperature gradient that provides a measure
of stability, and measurements at a minimum of two elevations are necessary to estimate
the gradient. However, if insolation is constant (as it was for at least 3 of 4 days during
the kaolin trials), the time of day may give an indirect estimate of the temperature of the
surface, since this depends largely on diurnal heating patterns. It is possible that the
ANN models have learned a relationship among the temperature, time of day, and
perhaps other meteorological inputs that affect atmospheric stability, and hence affect
concentration distributions. In any event, it appears as though all inputs contribute
significantly to the performance of either model.

4.3.3 Analysis of Model Residuals

It is of interest to determine if there are regions of input space where each ANN
model's performance varies. In order to determine this, the so-called model residuals, or
$(\ln C_p - \ln C_o) = \ln(C_p/C_o)$, were plotted against a number of input variables. The
independent variables analyzed were diffusion time, wind speed, temperature, time of
day, Pasquill stability class, and pressure. The test set was divided into a number of
subsets, each covering a specific interval of one of these six input variables. Models 1A
and 2A were then evaluated against each subset, and the model residuals were plotted as
box-plots against each of the independent variables. These box-plots are shown below
for models 1A and 2A in Figures 4.8 and 4.9, respectively. Each box lies horizontally
between the two endpoints of the subset, and is divided vertically by seven divisions.
The middle line in each box is the 50th percentile, and the bottom and top borders of the
box are the 25th and 75th percentiles, respectively. The error bars indicate the 10th and
90th percentiles, while the points represent the 5th and 95th percentiles.
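The seven divisions of each box can be computed as in the sketch below. The function name, the interval edges and the toy data are hypothetical; percentiles are obtained with `statistics.quantiles` using linear interpolation, which may differ slightly from the plotting package originally used.

```python
import math
from statistics import quantiles
from typing import Dict, List, Optional, Sequence, Tuple

PERCENTILES = (5, 10, 25, 50, 75, 90, 95)  # the seven divisions of each box


def residual_percentiles(
    variable: Sequence[float],
    observed: Sequence[float],
    predicted: Sequence[float],
    edges: Sequence[float],
) -> List[Tuple[Tuple[float, float], Optional[Dict[int, float]]]]:
    """Percentiles of ln(Cp/Co) within each interval [edges[k], edges[k+1])."""
    summary = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        residuals = [math.log(cp / co)
                     for v, co, cp in zip(variable, observed, predicted)
                     if lo <= v < hi]
        if len(residuals) < 2:             # too few points to form percentiles
            summary.append(((lo, hi), None))
            continue
        cuts = quantiles(residuals, n=100, method="inclusive")
        summary.append(((lo, hi), {p: cuts[p - 1] for p in PERCENTILES}))
    return summary


# Toy data (hypothetical): wind speed in m/s with observed and predicted concentrations.
wind_speed = [1.0, 1.5, 2.5, 3.0]
observed = [2.0, 4.0, 1.0, 3.0]
predicted = [1.0, 2.0, 2.0, 6.0]
summary = residual_percentiles(wind_speed, observed, predicted, edges=[1.0, 2.0, 4.0])
```

With the residual defined as ln(Cp/Co), a negative median in an interval indicates under-prediction there and a positive median indicates over-prediction.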

Figure 4.8 ANN model 1A residuals vs. (a) diffusion time, (b) wind speed, (c) temperature, (d) time of day, (e) Pasquill stability class and (f) pressure.

Figure 4.9 ANN model 2A residuals vs. (a) diffusion time, (b) wind speed, (c) temperature, (d) time of day, (e) Pasquill stability class and (f) pressure.

Note that for both models, there is a general trend of under-prediction. This is
consistent with Figures 4.3 and 4.4 above, which show that both models 1A and 2A have
mean bias greater than unity. Although both models show somewhat better performance
in certain regions of input space than in others, no clear trends are evident. Model 1A
residuals show more variation between intervals of a given variable than do those of
model 2A. This may suggest that model 1A is under-trained in certain regions of input
space, but it is more likely due to the distribution of test set residuals over each input
variable interval. Generally, regions with more data points have lower residuals on
average than regions with fewer points. Again, this is consistent with the tendency of
these models to under-predict. It is unclear why model 2A shows so much less variation
in residual means than model 1A, but both models appear to predict with the same ability
over the entire input space.

4.4 ANN Model Concentration Distribution Predictions

An analysis of the predicted concentration distributions of the ANN models is
presented in this section. For a number of meteorological conditions, three dimensional
concentration distributions of models 1A and 2A were constructed within the spatial
domain of the input space considered here. The functional forms of these distributions
and their moments are examined, and trends with diffusion time and meteorological
variables are investigated. Finally, ANN model 1A is analyzed in detail, and analytical
expressions are derived relating the properties of the predicted distributions to the more
influential input variables.

4.4.1 Horizontal Concentration Distributions

Both ANN models 1A and 2A predict very smoothly varying concentration
distributions, peaked at the puff centroid and falling off rapidly with distance. In the
downwind and crosswind directions, these distributions are very well approximated by a
Gaussian distribution. Figure 4.10 below shows typical horizontal cross-sections taken
through the puff's centre (z = 0 m) as predicted by ANN model 1A. These two surface
plots show the model's prediction after diffusion times of 10 seconds (Figure 4.10a) and
30 seconds (Figure 4.10b). The remaining model inputs used to construct these sections
are U = 1.0 m/s, T = 19 °C, Pasquill stability class A, p = 30.3 in Hg, and a time of day of
10:48 am.

Sections such as those shown in Figure 4.10 were constructed under a wide
variety of input conditions, and both ANN models 1A and 2A predict curves very similar
in form to those shown below. Indeed, it was found that under almost all conditions, both
models predict horizontal distributions that can be represented quite well by a Gaussian
distribution. Figure 4.11 shows one-dimensional profiles of the surfaces shown in Figure
4.10 with fitted Gaussian curves.

Figure 4.10 ANN model 1A predictions for a horizontal slice through the puff centre at z = 0, shown (a) 10 seconds and (b) 30 seconds after release.

Figure 4.11 ANN model 1A predictions and fitted Gaussian curves for profiles along y = 0 (left) and x = 0 (right) in the plane z = 0. Diffusion times of (a) 10 seconds and (b) 30 seconds are shown. Note: these are 1-D profiles of the surfaces shown in Figure 4.10.

These profiles show very good agreement with the fitted Gaussian curves

(R² > 0.98 in all cases), but it is clear that the fit is better closer to the cloud's centre. As

distance from the puff centroid increases, both ANN models deviate from the Gaussian

distribution, predicting slightly higher concentration levels. It was found that this

behaviour is typical of both models 1A and 2A over a broad range of input conditions. It

should be noted that the very high R² values typical of most Gaussian fits are valid only

for the central region of the curve, since the smaller values at the tail contribute less to the

regression. However, after transforming the data logarithmically to account for this, it

was determined that most ANN predictions show very good agreement with the Gaussian

out to distances of about 3 standard deviations. Again, this is the case for virtually all

input conditions.
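The point about R² being dominated by the central region can be illustrated with a minimal sketch. It assumes a hypothetical profile (a Gaussian with a 2 m standard deviation plus a small constant offset standing in for the non-zero tails) and compares it against the underlying Gaussian rather than a re-fitted one; the function names and all numbers are illustrative and are not taken from the thesis data.

```python
import math

def gaussian(x, amplitude, centre, sigma):
    """Gaussian profile evaluated at position x."""
    return amplitude * math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))

def r_squared(observed, fitted):
    """Coefficient of determination of `fitted` against `observed`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical profile: Gaussian (sigma = 2 m) plus a small constant offset
# that mimics slightly elevated concentrations far from the centroid.
sigma = 2.0
positions = [-10.0 + 0.5 * i for i in range(41)]
profile = [gaussian(x, 1.0, 0.0, sigma) + 0.01 for x in positions]
fit = [gaussian(x, 1.0, 0.0, sigma) for x in positions]

# Linear-space R^2 is dominated by the large central values.
r2_linear = r_squared(profile, fit)

# Log-space R^2 weights the tails far more heavily.
r2_log = r_squared([math.log(c) for c in profile],
                   [math.log(c) for c in fit])

# Log-space R^2 restricted to within 3 standard deviations of the centre.
core = [i for i, x in enumerate(positions) if abs(x) <= 3.0 * sigma]
r2_log_core = r_squared([math.log(profile[i]) for i in core],
                        [math.log(fit[i]) for i in core])
```

In this toy case the linear-space R² stays very close to 1 even though the tails disagree, while the log-space value over the full range is much lower and the value restricted to 3 standard deviations lies in between.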

Although there is no reason a priori to assume that the models' predicted

distributions should follow a Gaussian distribution, the fact that model predictions are

non-zero far from the puff centroid suggests that these slight over-predictions are not

physically realistic. The behaviour of the tails of the ANNs' predicted distributions can

be largely attributed to the number of clear air data points included in the data set. As

noted in Chapter 3, data points with zero concentration were incorporated into both data

sets, specifically to help the ANNs learn the boundary of the puffs. Earlier ANN models

were trained on data sets with no such clear-air data points. These models predicted

distributions with significantly larger non-zero tails than those of models 1A and 2A. It

is likely that if more clear-air data points near the puff boundaries were included in the

data set, the ANN models would predict distributions that more rapidly approach zero

away from the puff centroid. However, maintaining the balance of the output frequency

distribution places a strict constraint on the number of such points that can be included in

the data set.

The character of the ANN distribution tails cannot be entirely attributed to the

number of clear air points in the data set. While the ANN models predict very symmetric

distributions in the crosswind direction, downwind distributions typically exhibit

asymmetry about the centroid. Indeed, this behaviour has been reported by others; both

Sato (1995) and Yee, et al. (1998) found that downwind concentration distributions were

negatively skewed, such that the trailing half of the puff had an elongated tail. The

downwind profiles shown in Figure 4.11 also show this trend to some degree, as did most

ANN predictions. Yee determined that the trailing half of the puff's distribution could be

approximated quite well by an exponential distribution. Such a distribution was fitted to

ANN model predictions under a number of input conditions, but did not provide a

significantly better approximation than the Gaussian distribution. In general, the

skewness of the ANN model predictions in the downwind direction is small, and the

Gaussian distribution is felt to provide a sufficiently accurate approximation.

4.4.2 Vertical Concentration Distributions

The vertical concentration distributions predicted by ANN models 1A and 2A

differ significantly due to the different vertical coordinate systems employed by each

model. Both models predict tilted distributions, such that the upper portions of the puff

diffuse downwind at a greater rate than the lower portions. This is consistent with a

sheared surface layer, where wind speed increases with elevation above the ground.

Typical vertical cross-sections are shown below in Figures 4.12 and 4.13 for models 1A

and 2A, respectively. These sections are taken through the crosswind centre of the puff

(i.e., at y=0), and show normalized concentration contours. Remaining inputs are the

same as those indicated in Figures 4.10 and 4.11 above.

Figure 4.12 Model 1A normalized concentration contours for vertical cross-sections through the puff centre (y=0), shown (a) 20 s and (b) 40 s after release.

Figure 4.13 Model 2A normalized concentration contours for vertical cross-sections through the puff centre (y=0), shown (a) 20 s and (b) 40 s after release.

In order to determine the degree of puff tilting predicted by each model, the

downwind centroid was calculated along planes of constant z. At a given vertical

position z, the downwind centroid, X_c(z), is given by:

$$X_c(z) = \frac{\iint x\,\chi(x,y,z)\,dx\,dy}{\iint \chi(x,y,z)\,dx\,dy}$$

where χ(x,y,z) denotes the predicted concentration.

For each ANN model, X_c was calculated as a function of vertical position, under a

number of different input conditions. It was found that for both models, the downwind

centroid position varies approximately linearly with vertical distance (R² > 0.98 in most

cases). Significant deviations from linearity were only observed under conditions of low

temperatures and long diffusion times, but even in these cases, the approximation of

linearity is satisfactory (R² > 0.94).
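The centroid calculation and the linearity check described above can be sketched as follows. This is a minimal illustration, assuming the concentration field is available as a callable on a regular grid; the synthetic sheared Gaussian used as a stand-in for an ANN prediction, the function names, and all numerical values are hypothetical.

```python
import math

def downwind_centroid(conc, xs, ys, z):
    """Concentration-weighted mean x over a plane of constant z."""
    num = 0.0
    den = 0.0
    for x in xs:
        for y in ys:
            c = conc(x, y, z)
            num += x * c
            den += c
    return num / den

def fit_line(zs, values):
    """Ordinary least-squares slope, intercept and R^2 of values versus zs."""
    n = len(zs)
    z_mean = sum(zs) / n
    v_mean = sum(values) / n
    szz = sum((z - z_mean) ** 2 for z in zs)
    szv = sum((z - z_mean) * (v - v_mean) for z, v in zip(zs, values))
    slope = szv / szz
    intercept = v_mean - slope * z_mean
    ss_res = sum((v - (intercept + slope * z)) ** 2 for z, v in zip(zs, values))
    ss_tot = sum((v - v_mean) ** 2 for v in values)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical stand-in for a model prediction: a Gaussian whose downwind
# centre moves 1.2 m per metre of height (illustrative values only).
def synthetic_puff(x, y, z, slope=1.2, sx=3.0, sy=2.0, sz=2.0):
    return math.exp(-((x - slope * z) ** 2) / (2.0 * sx ** 2)
                    - y ** 2 / (2.0 * sy ** 2)
                    - z ** 2 / (2.0 * sz ** 2))

xs = [-30.0 + 0.5 * i for i in range(121)]
ys = [-10.0 + 1.0 * i for i in range(21)]
zs = [-5.0 + 1.0 * i for i in range(11)]

centroids = [downwind_centroid(synthetic_puff, xs, ys, z) for z in zs]
tilt_slope, offset, r2 = fit_line(zs, centroids)
```

For this synthetic field the fitted slope recovers the imposed tilt; with real model output the R² of the fit indicates how well the linear approximation holds.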

For both models, the tilt angle displayed the same two general trends.

Specifically, puffs exhibited larger tilt angle under conditions of higher wind speed, and

the degree of puff tilting decayed slowly with diffusion time. Both of these results are

consistent with the physical nature of a sheared surface layer. Given that wind speed

approaches zero at the surface, higher wind speeds indicate larger shear close to the

ground. Since the effect of wind shear is to stretch the puff in the downwind direction,

larger shear results in puff distributions with greater vertical skewness, and hence, larger

tilt angle.

In his theoretical analysis of puff dispersion in a sheared surface layer, van Ulden

(1992) concluded that under neutral conditions, a diffusing puff will maintain its shape as

it diffuses, i.e., its tilt angle is invariant with time. He also argued that under stable

conditions, a puff's tilt angle should increase linearly with diffusion time. Although he

presented no discussion of the development of puff tilt angle for unstable conditions, he

determined that the interaction of wind shear and vertical diffusion largely determines the

degree of puff tilting. Specifically, the effect of wind shear is to increase the vertical

skewness of the puff, while the turbulent vertical mixing acts to destroy this skewness. In

unstable conditions, there is an increased level of convective turbulence generated by the

large temperature gradient between the surface and the air above it. These convective

turbulent eddies contribute to enhanced vertical mixing. Thus, in unstable conditions, the

puff skewness generated by the wind shear is destroyed at a greater rate by this increased

vertical mixing, and tilt angle will decrease as the puff diffuses downwind.

Typical vertical profiles of X_c for model 1A are shown below in Figure 4.14, for

two different temperatures, and show the slow decay of puff tilt angle with diffusion

time. These X_c profiles are typical of model 1A predictions under a wide range of input

conditions; a more detailed analysis of model 1A tilt angle is presented below.

Figure 4.14 Model 1A predictions of puff tilt angle, shown at various times after release for (a) T=19 °C, (b) T=22 °C. Note the good linear fit and the slow decay of tilt angle with diffusion time (U=1.0 m/s).

Model 2A shows similar behaviour under most conditions, but it should be noted

that for low temperatures, a somewhat different trend is apparent. Specifically, model 2A

predicts increasing tilt angle with diffusion time for T ≤ 20 °C. This effect is less severe

as temperature increases, and for T > 20 °C, the model predicts decaying tilt angle with

diffusion time, similar to model 1A predictions. It is unclear why this trend is observed

only in conditions of low temperature; the model may have learned some relationship

between the temperature measurements and the thermal stability, or it may be an artifact

of the specific conditions of the experimental trials.

Typical x-centroid profiles for model 2A are shown below in Figure 4.15. The

low temperature predictions of increasing tilt angle with diffusion time may be due to

gravitational settling of the kaolin as diffusion proceeds. As demonstrated above in

Figure 4.13, model 2A predicts distributions that fall closer to the surface as the puff

disperses downwind. This effect, together with the influence of wind shear, results in a

'squashed' distribution having increased vertical skewness.

Figure 4.15 Model 2A predictions of puff tilt angle, shown at various times after release for (a) T=19 °C, (b) T=22 °C. Note the increasing puff tilt with diffusion time for low T.

The vertical contours of model 2A concentration distributions provide some

insight about the interaction of the aerosol with the surface. Most contours, such as that

shown above in Figure 4.13, exhibit large concentration gradients close to the ground,

with the distribution falling off to very low levels at z=0. This indicates that very little of

the aerosol is reflected from the ground back up into the puff. Aerosol puffs that are

reflected from the surface will generally show increased concentration levels closer to the

ground. This eventually leads to vertical concentration profiles that are highest at the

surface, and taper off with height above the ground.

However, such behaviour was not predicted by model 2A. This suggests that

either the aerosol release height was sufficiently high that diffusing aerosol did not

interact significantly with the surface, or that there was non-negligible ground deposition.

Although kaolin is non-reactive, deposition due to impaction with the surface roughness

elements (i.e., vegetation) can result in significant aerosol deposition. This effect is

difficult to assess; usually, surface deposition rates are determined empirically, and

depend on the properties of both the aerosol and the surface (Hanna et al., 1982). A

simple yet effective way to deal with dry surface deposition is to incorporate a reflection

coefficient into the image source model. This approach simply multiplies the image

source term of equation (40) by an empirically determined constant less than unity,

effectively reducing the amount of aerosol reflected back up into the puff. COMBIC

employs such a system, but determining the correct value of the reflection coefficient is

almost impossible without some prior knowledge of the deposition rate of the aerosol.
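The effect of a reflection coefficient can be sketched for the vertical factor of a Gaussian puff. The sketch assumes the usual direct-plus-image form, in which a real source at the release height is paired with an image source below the ground; the exact form of equation (40), the function name, and the numerical values are assumptions made for illustration only.

```python
import math

def vertical_term(z, release_height, sigma_z, reflection=1.0):
    """Vertical factor of a Gaussian puff over a partially reflecting ground.

    The real source sits at +release_height and its image at -release_height.
    The image contribution is scaled by `reflection` (0 = no reflection,
    1 = perfect reflection).
    """
    if not 0.0 <= reflection <= 1.0:
        raise ValueError("reflection coefficient must lie in [0, 1]")
    direct = math.exp(-((z - release_height) ** 2) / (2.0 * sigma_z ** 2))
    image = math.exp(-((z + release_height) ** 2) / (2.0 * sigma_z ** 2))
    return direct + reflection * image

# Illustrative values only: 2 m release height, 3 m vertical spread,
# evaluated at ground level (z = 0).
ground_full = vertical_term(0.0, 2.0, 3.0, reflection=1.0)
ground_half = vertical_term(0.0, 2.0, 3.0, reflection=0.5)
ground_none = vertical_term(0.0, 2.0, 3.0, reflection=0.0)
```

At ground level the direct and image terms are equal, so the ground-level value scales as (1 + reflection) times the direct term. A smaller coefficient therefore lowers near-surface concentrations, which is the behaviour attributed to deposition in the text.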

4.4.3 Cloud Spread

The spatial spread of the predicted concentration distributions of both models was

examined under a variety of input conditions. This was done by calculating the second

moments of the distribution, defined as:

with similar expressions for σ_y² and σ_z². It was found that for both models, cloud spread

varied significantly with diffusion time and wind speed, but showed little variation with

the remaining input variables. Figure 4.16 below shows the temporal development of the

second moments of predicted concentration distributions for model 1A. Three different

wind speeds are shown, and the remaining model inputs are the same as those indicated

above in Figure 4.10. Model 2A cloud moments show similar behaviour, but generally

showed larger vertical dispersion owing to the vertical meander of the puff.
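A minimal sketch of how such second moments might be computed from gridded concentrations is given below. It assumes the standard concentration-weighted, centred definition, i.e. σ_x² is the weighted mean of (x - x̄)² with the concentration as weight, and likewise for y and z; this definition, the function name, and the synthetic test field are assumptions for illustration.

```python
import math

def second_moments(points):
    """Centroid and centred second moments of concentration-weighted points.

    `points` is an iterable of (x, y, z, concentration) tuples on a regular
    grid. Returns ((xc, yc, zc), (var_x, var_y, var_z)).
    """
    total = 0.0
    first = [0.0, 0.0, 0.0]
    second = [0.0, 0.0, 0.0]
    for x, y, z, c in points:
        total += c
        for k, coord in enumerate((x, y, z)):
            first[k] += coord * c
            second[k] += coord * coord * c
    centroid = tuple(f / total for f in first)
    variances = tuple(s / total - m * m for s, m in zip(second, centroid))
    return centroid, variances

# Synthetic Gaussian cloud with known spreads (illustrative values, metres).
sx, sy, sz = 4.0, 2.0, 3.0
points = [
    (x, y, z, math.exp(-x * x / (2 * sx ** 2)
                       - y * y / (2 * sy ** 2)
                       - z * z / (2 * sz ** 2)))
    for x in range(-20, 21)
    for y in range(-10, 11)
    for z in range(-15, 16)
]

centroid, variances = second_moments(points)
sigmas = tuple(math.sqrt(v) for v in variances)
```

For the synthetic cloud the recovered spreads match the imposed values closely, provided the grid extends several standard deviations from the centroid.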

All dispersion coefficients show linear growth with diffusion time, to good

approximation (R² > 0.95 in most cases). This is in agreement with the observations

reported by Yee, et al. (1998), who found that the downwind and crosswind widths of

instantaneously released puffs show nearly linear growth. Sato (1995) found similar

results for puff releases at short diffusion times (i.e., t < 100 s).

Figure 4.16 Evolution of dispersion coefficients with diffusion time for wind speed (a) 1.0 m/s, (b) 1.5 m/s and (c) 2.5 m/s.

The relative magnitudes of the dispersion coefficients illustrated in Figure 4.16

above were observed under all input conditions. Both models predict downwind puff

dimensions that are significantly larger than those in the crosswind or vertical directions.

This can be attributed to the effects of wind shear, which enhance diffusion in the

downwind direction. At short diffusion times, the vertical puff width is also notably

larger than the crosswind width. This is likely due in part to the dissemination technique

used in the kaolin release trials. The disseminator nozzle was inclined slightly from the

plane of the surface, effectively giving kaolin clouds larger initial vertical spread.

Both models predict that downwind dispersion coefficients grow more rapidly

with diffusion time than do the crosswind or vertical widths. They also predict that σ_x

grows more rapidly at higher wind speeds. Again, these trends are consistent with the

enhanced downwind dispersion effected by wind shear.

Model 2A predicts that dispersion coefficients increase with height above the

ground. The downwind and crosswind second moments were calculated about the tilted

axis of the puff at various levels above the surface. It was found that puff spread was

greater at higher levels above the surface. A typical vertical profile of dispersion lengths

is shown below in Figure 4.17.

Figure 4.17 Vertical profiles of downwind and crosswind dispersion lengths (dispersion coefficients in m) as predicted by model 2A. Shown for wind speed of 1.0 m/s, 10 seconds after release. Remaining inputs are the same as those in Figure 4.10 above.

This increase in puff spread with vertical height is due to the vertical

inhomogeneity of the surface layer caused by the presence of the ground. Wind shear is

largely responsible for the enhanced downwind puff spread with height above the

surface. However, the vertical increase in spread in both the downwind and crosswind

directions is a result of the varying eddy spectrum with height above the surface. Very

close to the ground, only the smallest turbulent eddies are present, since the motion of

large eddies is restricted by the rigid surface. However, as distance from the surface

increases, larger eddies are present, resulting in greater dispersion at larger heights.

4.4.4 ANN Model 1A Parameterization

The ANN models developed here can be easily embedded into a simple computer

program for deployment, requiring relatively little computational time to produce results.

Nevertheless, for the sake of convenience, it is often desirable to have simple, explicit

analytical formulas to describe a model. To this end, ANN model 1A predictions were

further analyzed, and the significant features of the model were encapsulated into

analytical expressions of relatively simple form.

The sensitivity analyses described above demonstrate that each of the nine input

variables used to construct the ANN models contributes in a significant way to the

models' performance. However, determining the behaviour of predicted concentration

distributions over the entire range of all nine variables would be an extremely difficult

task. Therefore, efforts were directed at a smaller subset of the input variables.

Specifically, the three spatial variables, diffusion time and wind speed were used as the

primary descriptive variables. Temperature was also varied during this analysis to

determine its effect on predicted distributions. Pasquill stability class was adjusted with

the wind speed, in order to maintain consistency with its definition as shown in Table 2.1.

Conditions of strong daytime insolation were assumed for this analysis, consistent with

the conditions of most of the release trials.

The remaining inputs, time of day and atmospheric pressure, were held at fixed

values. These constant values were chosen such that they lie in the mid-range for each

variable, at values that are fairly well represented by the training set. This ensures that

the ANN model is interpolating in a well-trained region of input space. The chosen fixed

values for time of day and pressure were 10:48 am and 30.30 in Hg, respectively.

Based on the analysis of the concentration distributions presented above, model

1A predictions were modeled by a three-dimensional tilted Gaussian distribution. This

distribution takes the following form:

where β is the tilt angle (i.e., the slope of the X_c(z) curve from the z-axis), and all

remaining variables and parameters are as defined above. The four free parameters, i.e.,

tilt angle and the three dispersion coefficients, were determined as functions of wind

speed and diffusion time. The effect of varying temperature was also examined, but most

analyses were performed at T=19 °C, since this value lies near the mid-range of

temperatures, and is the most populated temperature value in the training set (cf. Figure

3.3).
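As a rough illustration of what a tilted Gaussian of this kind can look like, the sketch below assumes a sheared form in which the downwind centre of the distribution is displaced by a slope times the height z. This form, the normalization, the function name, and every numerical value are assumptions made for illustration and may differ from equation (41).

```python
import math

def tilted_gaussian(x, y, z, mass, tilt_slope, sigma_x, sigma_y, sigma_z):
    """Gaussian puff whose downwind centre shifts linearly with height.

    Assumed form: at height z the distribution is centred at
    x = tilt_slope * z. Coordinates are relative to the puff centroid.
    """
    norm = mass / ((2.0 * math.pi) ** 1.5 * sigma_x * sigma_y * sigma_z)
    x_rel = x - tilt_slope * z
    exponent = (x_rel ** 2 / (2.0 * sigma_x ** 2)
                + y ** 2 / (2.0 * sigma_y ** 2)
                + z ** 2 / (2.0 * sigma_z ** 2))
    return norm * math.exp(-exponent)

# Hypothetical parameters: unit mass, slope 1.2, spreads of 3, 2 and 2 m.
params = dict(mass=1.0, tilt_slope=1.2, sigma_x=3.0, sigma_y=2.0, sigma_z=2.0)

peak = tilted_gaussian(0.0, 0.0, 0.0, **params)
on_axis = tilted_gaussian(2.4, 0.0, 2.0, **params)   # on the tilted axis at z = 2 m
off_axis = tilted_gaussian(0.0, 0.0, 2.0, **params)  # directly above the centroid
```

In this form the concentration at a given height is largest on the tilted axis rather than directly above the centroid, which reproduces the downwind displacement of the upper part of the puff.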

Vertical profiles of the x-centroid, such as those shown above in Figure 4.14, were

constructed under a number of wind speeds and diffusion times. As already noted, model

1A predicts that tilt angle decreases with diffusion time and increases with wind speed.

These trends are summarized below in Figure 4.18. Similar trends were observed for

different temperatures.

Figure 4.18 Variation of puff tilt angle with diffusion time and wind speed, shown for T=19 °C. Legend: U=1.0 m/s, U=1.5 m/s, U=2.5 m/s, linear fit.

Tilt angle was found to vary approximately linearly with diffusion time and wind

speed, and a plane of the form β(U,t) = a + bU + ct provided a sufficient approximation

(R² > 0.96). The fitting parameters were determined to be:

a = 0.99 ± 0.03

b = 0.20 ± 0.01

c = -0.0049 ± 0.0006.
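The fitted plane for tilt angle can be evaluated directly. The short sketch below uses the central values of the three fitting parameters and assumes that wind speed is in m/s and diffusion time in seconds, consistent with the figures; the function name is hypothetical and the parameter uncertainties are ignored.

```python
def tilt_angle(wind_speed, diffusion_time, a=0.99, b=0.20, c=-0.0049):
    """Planar fit a + b*U + c*t for the model 1A tilt angle at T = 19 degC.

    wind_speed is assumed to be in m/s and diffusion_time in seconds.
    """
    return a + b * wind_speed + c * diffusion_time

tilt_10s = tilt_angle(1.0, 10.0)   # U = 1.0 m/s, t = 10 s
tilt_30s = tilt_angle(1.0, 30.0)   # U = 1.0 m/s, t = 30 s
tilt_fast = tilt_angle(2.5, 10.0)  # U = 2.5 m/s, t = 10 s
```

For U = 1.0 m/s the fitted tilt falls from about 1.14 at 10 s to about 1.04 at 30 s, and at 10 s it rises to about 1.44 when the wind speed is increased to 2.5 m/s. The fit should only be applied within the range of wind speeds and diffusion times covered by the analysis.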

The dispersion lengths were determined by calculating the second moments of the

predicted concentration distributions for various wind speeds and diffusion times. The

downwind and vertical second moments were calculated about the tilted axis of the puff.

The variation of the dispersion coefficients with wind speed and diffusion time is very

similar to that shown above in Figure 4.16, and a plane provided a good fit to the

predicted puff spreads. Table 4.6 below summarizes the fitting parameters to the plane

σ_i(U,t) = a + bU + ct, for i = x, y, z. The dispersion coefficients did show some

variation with temperature; lower temperatures generally showed greater puff spread.

However, the variations were minor, and for the sake of simplicity, they were not

incorporated into the parameterization.

Table 4.6 Fitting parameters for the dispersion coefficients as linear functions of diffusion time and wind speed.

dispersion coefficient a b c R²

The performance of equation (41) together with the simple parameterizations for

tilt angle and dispersion coefficients was evaluated against data set 1 test set and

validation set. For comparison, the parameterized model's performance statistics are

listed below together with those of the other models considered above. Table 4.7a shows

statistics for the test set, and Table 4.7b for the validation set.

Table 4.7a Model comparison over data set 1 test set.

Model                  R      VG      MG     F2     F10
ANN 1A                 0.63   14.43   1.40   0.31   0.84
1A Parameterization    0.49   27.38   1.34   0.27   0.77
GPMs                   0.44   58.22   1.89   0.25   0.73
GPMp                   0.46   38.09   1.43   0.27   0.76

Table 4.7b Model comparison over data set 1 validation set.

Model                  R      VG      MG     F2     F10
ANN 1A                 0.58   13.52   1.01   0.30   0.83
1A Parameterization    0.54   15.94   0.88   0.29   0.81
GPMs                   0.52   17.17   0.93   0.30   0.82
GPMp                   0.56   15.99   0.72   0.31   0.82

Although the parameterization of model 1A shows significantly worse

performance than the full ANN model, it still outperforms both traditional Gaussian puff

models over the test set, and shows comparable performance over the validation set. It is

clear that much of the predictive power of the ANN model is lost in the parameterization,

and that the effects of the neglected input variables on model performance are significant.

Nonetheless, the parameterization provides a simple analytical model for puff diffusion in

a sheared surface layer, with a clear physical interpretation. Under the conditions

covered by the present data set, this parameterized model provides a more accurate

alternative to existing Gaussian puff models.
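The statistics reported in Tables 4.7a and 4.7b can be computed from paired observed and predicted concentrations. The sketch below assumes the definitions commonly used in dispersion model evaluation: R is the linear correlation coefficient, MG the geometric mean bias, VG the geometric variance, and F2 and F10 the fractions of predictions within a factor of 2 and 10 of the observations. These definitions, the function name, and the toy data are assumptions; the thesis may treat details such as zero concentrations differently.

```python
import math

def performance_measures(observed, predicted):
    """Return (R, VG, MG, F2, F10) for paired, strictly positive concentrations."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)

    # Geometric measures are based on ln(observed / predicted).
    log_ratio = [math.log(o / p) for o, p in zip(observed, predicted)]
    mg = math.exp(sum(log_ratio) / n)
    vg = math.exp(sum(d * d for d in log_ratio) / n)

    # Fractions of predictions within a factor of 2 and 10 of the observation.
    ratios = [p / o for o, p in zip(observed, predicted)]
    f2 = sum(1 for q in ratios if 0.5 <= q <= 2.0) / n
    f10 = sum(1 for q in ratios if 0.1 <= q <= 10.0) / n
    return r, vg, mg, f2, f10

# Toy data: a model that over-predicts every observation by a factor of 3.
observed = [1.0, 2.0, 4.0, 8.0]
predicted = [3.0 * o for o in observed]
r, vg, mg, f2, f10 = performance_measures(observed, predicted)
```

A perfect model gives R = 1, MG = 1, VG = 1 and F2 = F10 = 1. The toy case shows that a model can be perfectly correlated with the observations while still being biased (MG = 1/3) and missing the factor-of-2 band entirely, which is why the tables report several measures together.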

Chapter 5

Conclusions

5.1 Summary and Conclusions

A number of ANN models were constructed to predict the concentration

distribution of an instantaneously released aerosol in the planetary boundary layer. These

models were based on data collected from 50 field trials, where 50 g lots of kaolin

powder were released from near ground level. Three-dimensional concentration maps

were measured using a scanning lidar system, which provided over 100,000

concentration measurements used to train the ANNs. Based on easily measured

meteorological parameters, these ANN models were able to significantly outperform

traditional Gaussian puff models, including the US Army Research Laboratory's

Gaussian-based dispersion code COMBIC.

The field data were analyzed in two separate coordinate systems. One system

followed the centre of mass of the puff as it diffused downwind, and thus removed the

effect of puff meander from the diffusion process. The other coordinate system followed

the horizontal centre of mass of the puff, but had a fixed vertical coordinate, relative to

the surface. Both systems are commonly used in theoretical and practical treatments of

aerosol diffusion, and the second system facilitates the assessment of surface effects on

the dispersion process.

Numerous multi-layer feed-forward neural networks were trained on the field data

using the extended delta-bar-delta backpropagation learning algorithm. It was found that

network architecture had the greatest effect on model performance. More complex

networks (9-40-20-1) repeatedly showed better performance against the test set, but failed

to generalize well against the validation set. It was found that the cross-validation

learning technique employed might not have sufficiently guarded against over-training.

For this reason, networks with the smallest architectures (9-20-1) that showed good

performance against both the test set and the validation set were retained for further

analysis.

ANN models with identical architectures and parameter settings converged to

different points in weight space given different weight initializations. This suggests that

the error surface had numerous minima of comparable magnitude. For this reason, the

final model for each data set was constructed by averaging the output predictions from 20

trained ANN models, each with the same 9-20-1 architecture and network parameter

settings, but initialized to different points in weight space.
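The averaging step can be sketched in a few lines. The sketch assumes each trained network is available as a callable that maps an input vector to a single predicted concentration; the toy callables below are hypothetical stand-ins, not trained networks.

```python
def ensemble_predict(models, inputs):
    """Average the scalar predictions of several independently trained models."""
    predictions = [model(inputs) for model in models]
    return sum(predictions) / len(predictions)

# Hypothetical stand-ins for networks started from different initial weights:
# each "model" is just a callable returning a slightly different prediction.
toy_models = [
    lambda v: 0.9 * v[0],
    lambda v: 1.0 * v[0],
    lambda v: 1.1 * v[0],
]

averaged = ensemble_predict(toy_models, [2.0])
```

Averaging in this way reduces the dependence of the final prediction on any single weight initialization, which is the motivation given above for combining 20 networks.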

The performance of the final averaged ANN model for each coordinate system

was compared to traditional Gaussian puff models. Two standard Gaussian puff

equations were employed, one for each coordinate system. Two of the most commonly

used dispersion coefficient parameterizations were used with the Gaussian puff

equations; one developed by Pasquill (Pasquill and Smith, 1983), the other by Slade

(1968). The US Army Research Laboratory's Gaussian-based dispersion code COMBIC

was also tested against the field data.

The average ANN models showed better performance than the Gaussian puff

models against both the test set and the validation set. In general, ANN models gave

significantly better correlation, smaller mean bias, and far less variance than exhibited by

the Gaussian puff models.

Sensitivity analyses performed on both ANN models revealed that all nine input

variables used to describe the dispersion process contributed significantly to the

performance of each model. Specifically, it was found that the three spatial variables,

wind speed and ambient temperature had the most effect on model predictions. Diffusion

time was also found to be an influential input.

An analysis of the predicted concentration distributions of both models was

performed to gain insight on the trends learned by the neural networks, and to determine

the significance of these trends with regard to the process of puff dispersion. The models

predicted very smoothly peaked concentration distributions that were well approximated

by a Gaussian distribution, in accord with observations reported in the literature.

Predicted distributions also showed small negative skew in the downwind direction,

consistent with observations reported in the literature for puff releases in a sheared

surface layer.

ANN models predicted tilted distributions, such that the upper portion of the puff

was displaced further downwind than the lower portion as a direct result of wind shear.

Tilt angle was found to increase with wind speed and decay slowly with diffusion time.

These predictions are consistent with theoretical treatments of puff diffusion in a sheared

surface layer.

ANN model 2A (absolute z-coordinates) predicted distributions that fall closer to

the ground as diffusion proceeds, indicating that gravitational settling of the kaolin cloud

may be more significant than expected. These distributions also indicated that surface

reflection of the aerosol was minimal, and that aerosol ground deposition may have been

responsible for the low concentration predictions near the surface. This model also

predicted increasing dispersion at greater distances above the surface. This is consistent

with the common hypothesis of increasing eddy diffusivity with vertical distance.

Both models predicted dispersion coefficients that grow approximately linearly

with diffusion time, consistent with reported observations in the literature. Downwind

dispersion was greater than in either the crosswind or vertical direction, and proceeded at

a greater rate.

ANN model 1A (relative z-coordinates) was parameterized in terms of the most

influential input variables. A tilted Gaussian distribution was used to approximate the

ANN's predicted concentration distribution, and simple analytical formulas were

developed relating tilt angle and the dispersion coefficients to wind speed and diffusion

time. The parameterization did not have the same predictive power as the full ANN

model, but it provided a simple analytical model for puff dispersion in a sheared surface

layer, and significantly outperformed traditional Gaussian puff models over the test set.

5.2 Recommendations

A more robust and accurate ANN model could be developed if a more detailed

description of the character of the flow field was measured. The most severe limitation

of the present data set is the lack of an accurate measurement of mean wind speed and

direction. Since the effects of wind shear were determined to be very influential in the

process of puff dispersion near the ground, vertical profiles of mean velocity should also

be measured. In addition, a more accurate quantification of atmospheric stability could

be obtained from measured vertical temperature profiles. These measurements do not

pose significant experimental challenges, and can be easily made using common

meteorological instrumentation.

Although a goal of this research was to construct a model using easily measured

meteorological parameters, it is well known that an accurate description of turbulent

diffusion requires the determination of the statistical parameters of the turbulent flow

field. Rapidly sampled (~10 Hz) measurements of wind speed, wind direction and

temperature would provide a good description of the turbulence, from which important

descriptive variables could be derived. Quantities such as turbulent intensity, vertical

heat flux, Monin-Obukhov length, and friction velocity could be determined from such

measurements, and provide useful inputs for an ANN model.
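As one example of a derived quantity, turbulence intensity can be estimated from a rapidly sampled wind speed record. The sketch below assumes the common definition (standard deviation of wind speed divided by its mean over the averaging period); the function name and sample values are hypothetical. Friction velocity, heat flux and Monin-Obukhov length additionally require vertical velocity and temperature fluctuations, which this sketch does not cover.

```python
import statistics

def turbulence_intensity(wind_speeds):
    """Standard deviation of sampled wind speed divided by its mean."""
    mean_speed = statistics.mean(wind_speeds)
    if mean_speed <= 0.0:
        raise ValueError("mean wind speed must be positive")
    return statistics.pstdev(wind_speeds) / mean_speed

# Hypothetical short record of wind speed samples in m/s.
samples = [1.0, 2.0, 3.0]
intensity = turbulence_intensity(samples)
```

In practice the record would contain many samples per averaging period (for example, several minutes of data at about 10 Hz), and the averaging period itself is a choice that affects the result.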

A significant amount of the present data set was discarded because a number of

diffusing puffs passed out of the LCM scanning volume. In future field trials, the aerosol

source should be positioned such that most of the dispersing cloud is contained within the

LCM scanning volume. Increasing the scanning volume would alleviate this problem to

some degree, but at the expense of either angular resolution or scanning time. Also,

positioning the lidar system closer to the ground would provide important measurements

of near-surface concentration levels. Such measurements would provide insights into the

interaction of a diffusing aerosol with the ground.

Given that an ANN model is empirically based, it can only be reliably used for

interpolation. Therefore, the predictive capability of any ANN model is restricted to the

conditions under which the data was collected. A more robust model can be developed if

further field trials are conducted under an extended range of atmospheric conditions.

Specifically, to enhance the present data set, puff releases should be performed under

more stable conditions, perhaps at night, or just before dawn. The range of wind speeds

and temperatures should also be expanded. Varying the type and mass of aerosol

released would also help build a more robust model. Finally, a better dissemination

technique should be employed, or at the very least, source effects should be measured and

controlled. The kaolin disseminator used for the present field trials produced puffs with

very large initial dimensions, which were difficult to quantify.

Aside from considering better descriptive variables for inputs as noted above, two

major considerations should be made when constructing ANN models to predict 3-D

concentration distributions. First, some method to include clear-air zero-concentration

data points in the data set without greatly skewing the output frequency distribution

should be developed. This would likely give the model better predictive capability away

from the central portion of the puff. Second, an alternate method should be considered

for presenting the lidar data to the ANN for training. The typical method of separating

the data set into training and test sets by random selection of vectors is not appropriate for

this type of data set, and the usual cross-validation learning techniques do not properly

guard against over-training. Input vectors from individual lidar scans should be kept

together, and networks should be trained on complete scans, one at a time. Over-training

can be checked by cross-validation between scans.
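One way to keep the vectors from each scan together is to split the data by scan identifier rather than by individual vector. The sketch below assumes each record carries a scan identifier; the record layout, function name, and the 20% test fraction are hypothetical choices for illustration.

```python
import random

def split_by_scan(records, test_fraction=0.2, seed=0):
    """Split records into training and test sets without dividing any scan.

    `records` is a list of (scan_id, input_vector, target) tuples. Whole
    scans are assigned to either the training set or the test set.
    """
    scan_ids = sorted({scan_id for scan_id, _, _ in records})
    rng = random.Random(seed)
    rng.shuffle(scan_ids)
    n_test = max(1, round(test_fraction * len(scan_ids)))
    test_ids = set(scan_ids[:n_test])
    train = [r for r in records if r[0] not in test_ids]
    test = [r for r in records if r[0] in test_ids]
    return train, test

# Hypothetical data: 10 scans with 5 measurement vectors each.
records = [(scan, [scan, k], 0.0) for scan in range(10) for k in range(5)]
train_set, test_set = split_by_scan(records)
```

Because neighbouring points within a scan are strongly correlated, a split of this kind gives a test set that is more independent of the training data than a random selection of individual vectors.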

List of References

Andrews, W. S., Costa, J., and Roy, G., "Measuring and Modeling the Influence of Atmospheric Effects on the Concentration Distributions within Transient Aerosol Plumes", Proc. of the 1998 Battlespace Atmospheric and Cloud Impacts on Military Operations Conference, 243-250, 1998.

Ayres, S. D., and Desutter, S., "Combined Obscuration Model for Battlefield Induced Contaminants (COMBIC92) Model Documentation", U.S. Army Atmospheric Sciences Laboratory, White Sands Missile Range, 1995.

Batchelor, G. K., "Diffusion in a Field of Homogeneous Turbulence II. The Relative Motion of Particles", Proc. Cambridge Phil. Soc., 48, 345, 1952.

Baughman, D. R., and Liu, Y. A., Neural Networks in Bioprocessing and Chemical Engineering, Academic Press, San Diego, 1995.

Bissonnette, L. R., Bastille, C., and Vallee, G., "Estimation of Cloud Droplet Size Density Distribution from Multiple Field-of-View Lidar Returns", Report DREV R-9705, Valcartier, 1997.

Bissonnette, L. R., "Lidar inversion methods: an introduction", Proc. 8th Int. Workshop on Multiple Scattering Lidar Experiments, 102, 1996.

Bissonnette, L. R., and Hutt, D. L., "Multiply Scattered Aerosol Lidar Returns: Inversion Method and Comparison with in situ measurements", Applied Optics, 34, 6959-6975, 1995.

Boznar, M., Lesjak, M., and Mlakar, P., "A Neural Network-based Method for the Short-term Predictions of Ambient SO2 Concentrations in Highly Polluted Industrial Areas of Complex Terrain", Atm. Env., 27B, 2, 221-230, 1993.

Briggs, G. A., "Diffusion Estimation for Small Emissions", U.S. NOAA E.R.L. Report ATDL-106, Oak Ridge, 1973.

Center for Chemical Process Safety (CCPS), Guidelines for Use of Vapor Cloud Dispersion Models, 2nd ed., American Institute of Chemical Engineers, New York, 1996.

Costa, J., "Measuring and Modeling the Atmospheric Concentration Distributions of Aerosols Released From Transient Point Sources", M. Eng. Thesis, Royal Military College of Canada, 1998.

Csanady, G. T., Turbulent Diffusion in the Environment, Reidel Publishing Company, Dordrecht, Holland, 1973.

Davis, (Online). Davis Instruments, <http://www.davisnet.com/> (July, 2000).

Elouragini, S., "Useful Algorithms to Derive the Optical Properties of Clouds from a Back-scatter Lidar Return", J. Mod. Optics, 42, 7, 1439-1446, 1995.

Evans, B. T. N., Yee, E., Roy, G., and Ho, J., "Remote Detection and Mapping of Bioaerosols", J. Aerosol Sci., 25, 8, 1549-1566, 1994.

Evans, B. T. N., "LiDAR Signal Interpretation and Processing with Consideration for Military Obscurants", Report DREV R-4477/88, Valcartier, 1988.

Evans, B. T. N., "On the Inversion of the Lidar Equation", Report DREV R-4343/84, Valcartier, 1984.

Gardner, M. W., and Dorling, S. R., "Artificial Neural Networks (the Multilayer Perceptron)-A Review of Applications in the Atmospheric Sciences", Atm. Env., 32, 14/15, 2627-2636, 1998.

Gardner, M. W., and Dorling, S. R., "Neural Network Modelling of the Influence of Local Meteorology on Surface Layer Ozone Concentrations", Proc. 2nd Int. Conf. on GeoComputation, 359-370, 1996.

Griffiths, R. F., "Errors in the Use of the Briggs Parameterization for Atmospheric Dispersion Coefficients", Atm. Env., 28, 17, 2861-2865, 1994.

Hanna, S. R., "Along-Wind Dispersion of Short-Duration Accidental Releases of Hazardous Gases", Proc. 9th Joint Conf. on Applications of Air Pollution Meteorology with A&WMA, Atlanta, 28 January - 2 February, 1996.

Hanna, S., Briggs, G., and Hosker, R., Handbook on Atmospheric Diffusion, National Technical Information Center, U.S. Dept of Energy, Springfield, 1982.

Haykin, S., Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Company, Inc., New York, 1994.

Hidy, G. M., Aerosols: An Industrial and Environmental Science, Academic Press, Inc., Orlando, 1984.

Hinckley, E. D., Laser Monitoring of the Atmosphere, Springer-Verlag, Berlin, 1976.

Klett, J. D., "Lidar inversion with variable backscatter/extinction ratios", Applied Optics, 24, 11, 1638-1643, 1985.

Klett, J. D., "Stable analytical inversion solution for processing lidar returns", Applied Optics, 20, 2, 211-220, 1981.

Kunkel, K. E., and Weinman, J. A., "Monte Carlo Analysis of Multiply Scattered Lidar Returns", J. Atmos. Sci., 33, 1772-1781, 1976.

Lehder, (Online). Lehder Environmental Services Ltd., <http://www.lehder.com/> (July, 2000).

Mohan, M., and Siddiqui, T. A., "An Evaluation of Dispersion Coefficients for use in Air Quality Models", Boundary-Layer Meteorology, 84, 177-206, 1997.

NeuralWare, Reference Guide: Software Reference for Professional II/PLUS and NeuralWorks Explorer, NeuralWare, Pittsburgh, 1993a.

NeuralWare, Neural Computing: A Technology Handbook for Professional II/PLUS and NeuralWorks Explorer, NeuralWare, Pittsburgh, 1993b.

NeuralWare, Using NeuralWorks: A Tutorial for NeuralWorks Professional II/PLUS and NeuralWorks Explorer, NeuralWare, Pittsburgh, 1993c.

Oke, T. R., Boundary Layer Climates, Routledge, London, 1987.

Olesen, H. R., "Regulatory Dispersion Modelling in Denmark", Workshop on Operational Short-range Atmospheric Dispersion Models for Environmental Impact Assessment in Europe, Mol, Nov. 1994, published in Int. J. Environment and Pollution, 5, 4-6, 412-417, 1995.

Pal, S. R., Hlaing, D., and Carswell, A. I., "Scanning Lidar Application for Pollutant Sources in an Industrial Complex", SPIE, 3504, 76-86, 1998.

Pankrath, J., "Atmospheric Dispersion Models for Regulatory Purposes in the Federal Republic of Germany. Part 1: Regulatory Modelling", Workshop on Operational Short-range Atmospheric Dispersion Models for Environmental Impact Assessment in Europe, Mol, Nov. 1994, published in Int. J. Environment and Pollution, 5, 4-6, 427-430, 1995.

Pasquill, F., and Smith, F. B., Atmospheric Diffusion, 3rd ed., John Wiley & Sons, Rexdale, 1983.

Patterson, D. W., Artificial Neural Networks: Theory and Applications, Prentice-Hall, Toronto, 1996.

Pollock, D. H., IR/EO Handbook Volume 7: Countermeasure Systems, Environmental Research Institute of Michigan, Ann Arbor, 1993.

Rege, M., and Tock, R., "A Simple Neural Network for Estimating Emission Rates of Hydrogen Sulfide and Ammonia from Single Point Sources", J. Air & Waste Manage. Assoc., 46, 953-962, 1996.

Roy, G., Bonnier, D., DeVillers, Y., Couture, G., Hutt, D., and Vallee, G., "Canadian National Report on the SOCMET Winter Test Held at DREV, Canada in March 1993", Report DREV-TM-9408, Valcartier, 1994.

Roy, G., Vallee, G., and Jean, M., "Lidar-inversion Technique Based on Total Integrated Backscatter Calibrated Curves", Applied Optics, 32, 6754, 1993.

Sato, J., "An Analytical Study on Longitudinal Diffusion in the Atmospheric Boundary Layer", The Geophysical Magazine Series 2, 1, 2, 105-151, 1995.

Sawford, B. L., and Wilson, J. D., "Review of Lagrangian Stochastic Models for Trajectories in the Turbulent Atmosphere", Boundary-Layer Meteorology, 78, 191-210, 1996.

Silfvast, W. T., Laser Fundamentals, Cambridge University Press, New York, 1996.

Slade, D. H., Editor, "Meteorology and Atomic Energy", TID-24190, USAEC, 163-175, 1968.

SPSS, SigmaPlot 5.0 User's Guide, SPSS, Inc., Chicago, 1998.

Sutton, O. G., Micrometeorology: A Study of Physical Processes in the Lowest Layers of the Earth's Atmosphere, McGraw-Hill Book Company, Inc., Toronto, 1953.

Sutton, O. G., Atmospheric Turbulence, 2nd ed., John Wiley & Sons, Inc., New York, 1949.

Taylor, G. I., "Diffusion by Continuous Movements", Proc. London Math. Soc., 20, 196, 1921.

Turner, D. B., Workbook of Atmospheric Dispersion Estimates: An Introduction to Dispersion Modeling, 2nd ed., Lewis Publishers, Ann Arbor, 1994.

Uthe, E. E., and Livingston, J. M., "Lidar Extinction Methods Applied to Observations of Obscurant Events", Applied Optics, 25, 678, 1986.

Uthe, E. E., "Lidar Evaluation of Smoke and Dust Clouds", Applied Optics, 20, 1503, 1981.

van Ulden, A. P., "A Surface-Layer Similarity Model for the Dispersion of a Skewed Passive Puff Near the Ground", Atm. Env., 26A, 4, 681-692, 1992.

Wasserman, P., Advanced Methods in Neural Computing, Van Nostrand Reinhold, New York, 1993.

Williamson, S., Fundamentals of Air Pollution, Addison-Wesley Publishing Company, Don Mills, 1973.

Yee, E., Kosteniuk, P. R., and Bowers, J. F., "A Study of Concentration Fluctuations in Instantaneous Clouds Dispersing in the Atmospheric Surface Layer for Relative Turbulent Diffusion: Basic Descriptive Statistics", Boundary-Layer Meteorology, 87, 409-457, 1998.

Yi, J., and Prybutok, V. R., "A Neural Network Model Forecasting for Prediction of Daily Maximum Ozone Concentrations in an Industrialised Urban Area", Environmental Pollution, 92, 3, 349-357, 1996.

Appendix A

Meteorological Measurements and Estimates

Note: In accord with meteorological convention, wind direction is the direction from

which the wind is blowing. The delay time entry in Table A.1 refers to the approximate

time after kaolin release that the first LCM scan began, while scan to scan time is the

approximate time between subsequent scans of the same release.
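The direction convention in the note can be made concrete with a short sketch. It assumes the standard 16-point compass at 22.5 degree spacing; the names `COMPASS_POINTS`, `direction_from_degrees`, and `downwind_bearing` are hypothetical and do not come from this work. Because the tabulated direction is where the wind comes from, a released puff travels along the opposite bearing.

```python
# Standard 16-point compass, clockwise from north, 22.5 degrees apart.
COMPASS_POINTS = [
    "N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
    "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW",
]


def direction_from_degrees(label):
    """Bearing (degrees clockwise from north) the wind is blowing FROM."""
    return COMPASS_POINTS.index(label) * 22.5


def downwind_bearing(label):
    """Bearing (degrees clockwise from north) the wind is blowing TOWARD."""
    return (direction_from_degrees(label) + 180.0) % 360.0


for label in ("WNW", "SSW", "N"):
    print(label, direction_from_degrees(label), downwind_bearing(label))
```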

Table A.1 Summary of measurements taken during the kaolin trials.

                                          NW    0    10.8
                                          W     16   10.8
                                          NW    13   10.8
                                          NW    21   10.8
                                          NW    15   10.7
                                          NW    19   10.7
                                          WNW   0    10.8
                                          WSW   29   10.7
                                          SW    15   10.7
                                          WNW   26   10.8
                                          NW    15   10.7
         13   10:41   22   30.15   0.28   SSW   17   10.9
8/6/97   14   15:27   21   30.29   1.39   SW    0    8.2
         15   15:30   22   30.29   0.00   W     0    8.2
         16   15:33   22   30.29   3.06   S     0    8.1
         17   15:36   23   30.28   2.78   S     0    8.0
         18   15:39   23   30.28   1.67   WNW   0    8.4
         19   15:42   24   30.28   2.22   SSW   0    8.4
         20   15:45   24   30.28   3.61   SW    0    8.4
         21   15:48   24   30.28   3.06   SSW   0    8.4
         22   15:51   24   30.28   5.83   W     0    8.3
         23   15:54   24   30.28   4.44   W     0    8.4
8/7/97   24   10:18   18   30.28   5.83   W     0    8.4
         25   10:21   18   30.28   6.67   SW    0    8.2
         26   10:24   17   30.28   4.44   SW    0    8.4
         27   10:33   17   30.29   5.00   SW    0    8.4
         28   10:36   17   30.28   5.00   SW    0    8.4
         29   10:39   17   30.29   5.80   WSW   0    8.4
         30   10:42   18   30.28   4.17   SW    0    8.4
         31   10:45   18   30.28   4.44   W     0    8.4
         32   10:48   18   30.28   6.67   W     0    8.5
         33   10:51   18   30.28   3.61   W     0    8.2
         34   10:%    18   30.27   5.28   W     0    8.4
         37   11:02   19   30.45   3.61   S     0    8.4
         38   11:05   19   30.45   2.22   SSW   0    8.4
         39   11:08   18   30.44   3.06   SE    0    8.4
         40   11:11   19   30.45   1.39   E     0    8.4
         41   11:17   18   30.45   0.83   E     0    8.4
         42   11:20   19   30.44   0.83   ENE   0    8.4
         43   11:23   20   30.44   0.00   W     0    8.4
         44   11:26   20   30.44   1.39   S     0    8.4
         45   11:29   20   30.00   0.30   S     0    8.4
         46   11:32   20   30.43   0.00   SW    0    8.4
         49   11:44   20   30.44   2.22   NW    0    8.4
         50   11:47   20   30.43   2.22   NW    0    8.4
         51   11:50   20   30.43   1.67   NE    0    8.5
         52   11:53   20   30.43   1.67   ESE   0    8.5

Table A.2 Calculated wind speed and Pasquill Stability Class.

date      trial no.   speed (m/s)   class
8/5/97    2           2.39          A-B
                                    A-B
                                    A-B
                                    A-B
                                    A
                                    A
                                    B
                                    A
                                    A-B
                                    A
                                    A
          13          1.10          A
8/6/97    14          0.96          A-B
                      0.41          A-B
                      1.54          A-B
                      1.00          A-B
                      0.57          A-B
                      1.77          A-B
                      2.25          B
                      1.96          A-B
                      1.83          A-B
                                    A-B
                                    B
                                    A-B
                                    A-B
                                    B
                                    B
                                    B
                                    A-B
                                    A
                                    A-B
          35          2.90          A-B
8/12/97   36          0.06          A
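For orientation, the sketch below shows how a daytime Pasquill stability class is commonly read from surface wind speed and insolation using the standard Pasquill-Gifford lookup table. It is an assumption that the classes in Table A.2 were assigned this way; the insolation category, the handling of the wind-speed bin edges, and the names `DAYTIME_CLASSES` and `pasquill_class` are illustrative only.

```python
# Standard daytime Pasquill-Gifford table: each entry is
# (upper wind-speed bound in m/s, {insolation category: stability class}).
# Treating each upper bound as exclusive is an assumption of this sketch.
DAYTIME_CLASSES = [
    (2.0, {"strong": "A", "moderate": "A-B", "slight": "B"}),
    (3.0, {"strong": "A-B", "moderate": "B", "slight": "C"}),
    (5.0, {"strong": "B", "moderate": "B-C", "slight": "C"}),
    (6.0, {"strong": "C", "moderate": "C-D", "slight": "D"}),
    (float("inf"), {"strong": "C", "moderate": "D", "slight": "D"}),
]


def pasquill_class(wind_speed, insolation):
    """Return the daytime stability class for a wind speed (m/s) and insolation."""
    if wind_speed < 0:
        raise ValueError("wind speed must be non-negative")
    for upper_bound, classes in DAYTIME_CLASSES:
        if wind_speed < upper_bound:
            return classes[insolation]


print(pasquill_class(2.39, "strong"))
print(pasquill_class(2.25, "moderate"))
```

The night-time half of the table, which depends on cloud cover, is omitted here because the tabulated trials took place during the day.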

Appendix B

Neural Network Performance Statistics

Table B.1 Preliminary data set 1 ANNs. Effect of varying the network architecture and epoch size on network performance against the test set. The listed statistics are root mean square (RMS) and linear correlation coefficient (R) between predicted and target outputs.

                              Test Set Statistics
ANN    Architecture   Epoch   RMS      R

Table B.2 Preliminary data set 2 ANNs. Effect of varying the network architecture and epoch size on network performance against the test set. The listed statistics are root mean square (RMS) and linear correlation coefficient (R) between predicted and target outputs.

                              Test Set Statistics
ANN    Architecture   Epoch   RMS      R
2.1a   9-10-1         100     0.3753   0.5687
2.1b   9-10-1         200     0.3607   0.6110
2.1c   9-10-1         500     0.3565   0.6140
2.1d   9-10-1         1000    0.3611   0.6011
2.2a   9-20-1         100     0.3685   0.5813
2.2b   9-20-1         200     0.3588   0.6044
2.2c   9-20-1         500     0.3549   0.6171
2.2d   9-20-1         1000    0.3562   0.6128
2.3a   9-30-1         100     0.3608   0.6010
2.3b   9-30-1         200     0.3559   0.6183
2.3c   9-30-1         500     0.3508   0.6279
2.3d   9-30-1         1000    0.3540   0.6191
2.4a   9-10-5-1       100     0.3636   0.5961
2.4b   9-10-5-1       200     0.3593   0.6035
2.4c   9-10-5-1       500     0.3550   0.6179
2.4d   9-10-5-1       1000    0.3572   0.6133
2.5a   9-20-10-1      100     0.3496   0.6340
2.5b   9-20-10-1      200     0.3439   0.6494
2.5c   9-20-10-1      500     0.3425   0.6494
2.5d   9-20-10-1      1000    0.3463   0.6404
2.6a   9-30-15-1      100     0.3492   0.6356
2.6b   9-30-15-1      200     0.3406   0.6533
2.6c   9-30-15-1      500     0.3407   0.6529
2.6d   9-30-15-1      1000    0.3461   0.6404
2.7a   9-40-20-1      100     0.3466   0.6453
2.7b   9-40-20-1      200     0.3367   0.6640
2.7c   9-40-20-1      500     0.3315   0.6803
2.7d   9-40-20-1      1000    0.3448   0.6446
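The two statistics named in the captions of Tables B.1 and B.2 can be computed as in the sketch below. It assumes RMS means the root mean square error between predicted and target outputs and R means the Pearson linear correlation coefficient; any scaling or normalization actually applied in this work is not reproduced, and the function names and sample values are hypothetical.

```python
import math


def rms_error(predicted, target):
    """Root mean square difference between predicted and target outputs."""
    n = len(target)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / n)


def linear_correlation(predicted, target):
    """Pearson linear correlation coefficient R between two sequences."""
    n = len(target)
    mean_p = sum(predicted) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, target))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / math.sqrt(var_p * var_t)


# Made-up network outputs and targets, for illustration only.
predicted = [0.10, 0.40, 0.35, 0.80]
target = [0.00, 0.50, 0.30, 0.90]
print(round(rms_error(predicted, target), 4))
print(round(linear_correlation(predicted, target), 4))
```

A lower RMS and a higher R both indicate closer agreement with the targets, which is how the rows of the tables are compared.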

Table B.3 Results of training 20 ANNs with two different architectures on each data set. Each net was initialized to a different random point in weight space.

Data set 1        Data set 1        Data set 2        Data set 2
9-20-1            9-40-20-1         9-20-1            9-40-20-1
RMS      R        RMS      R        RMS      R        RMS      R
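The architecture labels and the repeated random initializations in Table B.3 can be illustrated with a small sketch. A label such as 9-20-1 is read here as 9 inputs, one hidden layer of 20 nodes, and 1 output. The tanh transfer function, the uniform initialization range, and the names `init_network`, `forward`, and `count_parameters` are assumptions of this sketch, not details of the software used in this work.

```python
import math
import random


def init_network(layer_sizes, seed, scale=0.1):
    """Create weights and biases at a random point in weight space.

    layer_sizes such as (9, 20, 1) gives nodes per layer; seed selects the
    starting point. Returns a list of (weights, biases), one pair per layer.
    """
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [rng.uniform(-scale, scale) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Propagate one input vector through the network (tanh assumed)."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


def count_parameters(layer_sizes):
    """Number of adjustable weights and biases for an architecture."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


# Twenty untrained 9-20-1 networks, each started from a different seed.
nets = [init_network((9, 20, 1), seed) for seed in range(20)]
outputs = [forward(net, [0.5] * 9)[0] for net in nets]
print(count_parameters((9, 20, 1)), count_parameters((9, 40, 20, 1)))
```

Training each of these networks separately and tabulating RMS and R per run would show how much the final performance depends on the starting point.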

Appendix C

Model Concentration Prediction Scatter Plots

Figure C.1 Scatter plots for ANN model 1A over (a) the test set and (b) the validation set.

Figure C.2 Scatter plots for ANN model 2A over (a) the test set and (b) the validation set.

Figure C.3 Scatter plots for ANN model 1B over (a) the test set and (b) the validation set.

Figure C.4 Scatter plots for ANN model 2B over (a) the test set and (b) the validation set.

Figure C.5 Scatter plots for data set 1 GPMs (Slade) over (a) the test set and (b) the validation set.

Figure C.6 Scatter plots for data set 1 GPMp (Pasquill) over (a) the test set and (b) the validation set.

Figure C.7 Scatter plots for data set 2 GPMs (Slade) over (a) the test set and (b) the validation set.

Figure C.8 Scatter plots for data set 2 GPMp (Pasquill) over (a) the test set and (b) the validation set.

Figure C.9 Scatter plots for COMBIC over (a) the test set and (b) the validation set.

Vita

Name: D. Timothy James DeVito

Education: University of Guelph, 1992-1993 Guelph, ON

Queen's University, 1993-1996 Kingston, ON B.Sc. (Honours) 1st Class Physics, 1996

Royal Military College of Canada, 1998-2000 Kingston, ON Current program

Experience: Research Engineer, 1998-2000 Royal Military College of Canada

Publications: Modeling Aerosol Puff Concentration Distributions from Point Sources Using Artificial Neural Networks. Proceedings of the 2000 Battlespace Atmospheric and Cloud Impacts on Military Operations Conference.

Awards: NSERC Postgraduate Scholarship-B, 2000-2002

Defence Research & Development Branch-Royal Military College Fellowship, 1999-2000

Milton Fowler Gregg VC Memorial Trust Fund Bursary, Royal Military College of Canada, 1998-1999

University of Guelph Entrance Scholarship, 1992-1993
