a comparison of cascade impactor data reduction methods


Aerosol Science and Technology 37: 187–200 (2003)
© 2003 American Association for Aerosol Research
Published by Taylor and Francis
0278-6826/03/$12.00 + .00
DOI: 10.1080/02786820390112506

A Comparison of Cascade Impactor Data Reduction Methods

Patrick T. O’Shaughnessy (1) and Otto G. Raabe (2)

(1) Department of Occupational and Environmental Health, The University of Iowa, Iowa City, Iowa
(2) Center for Health and the Environment, University of California, Davis, California

The mass median aerodynamic diameter, $d_g$, and geometric standard deviation, $\sigma_g$, of an aerosol are typically determined from a reduction of data obtained with a cascade impactor under the simplifying assumption that each stage of the impactor has an ideal collection efficiency. Two such reduction techniques are described and compared to an “inversion” method that incorporates the actual collection efficiencies of an impactor. Theoretical comparisons were made to demonstrate the difference in the estimates of $d_g$ and $\sigma_g$ between the 2 ideal reduction methods and the inversion technique. Results indicated that, in general, both $d_g$ and $\sigma_g$ are overestimated when the collection efficiencies are assumed to be ideal. A spreadsheet application of the inversion method is described.

Received 18 September 2001; accepted 15 July 2002.
The authors would like to thank Dr. Stephen Hillis of the University of Iowa Statistical Consulting Center for his help in formulating the method used to evaluate the standard errors of the parameter estimates.
Address correspondence to Patrick T. O’Shaughnessy, Department of Occupational and Environmental Health, The University of Iowa, 100 Oakdale Campus, 180 IREH, Iowa City, IA 52242-5000. E-mail: [email protected]

INTRODUCTION

A cascade impactor is a multistage aerosol sampling device that separates particles by size according to their inertial properties in a moving air stream (Lodge and Chan 1986). The size distribution of many aerosols resulting from an analysis, or “reduction,” of impactor data approximates the lognormal distribution characterized by a frequency distribution that is skewed toward the larger particles (Hinds 1982; Raabe 1971).

Given that an aerosol’s size distribution is approximately lognormal and unimodal, the 2 statistical parameters required to completely describe such a distribution are the geometric mean of the diameters, $d_g$, and geometric standard deviation, $\sigma_g$ (Hinds 1982). Furthermore, for a lognormally distributed variable, the geometric mean is equivalent to the median diameter, which is that diameter associated with 50% of the cumulative distribution of the mass collected by the impactor. Cascade impactors are typically calibrated in terms of the “aerodynamic” diameters of the aerosol particles impacting on each stage. Therefore in this discussion, $d_g$ will be taken to be the mass median aerodynamic diameter (MMAD) of the aerosol.

Several different impactor data reduction methods exist for determining $d_g$ and $\sigma_g$. The purpose of this paper is to describe and compare 3 reduction methods and their application to the use of computer spreadsheet programs. The methods will be described in an order associated with an increase in their computational complexity dictated by attempts to minimize the simplifying assumptions inherent to the initial methods described.

IMPACTOR DATA REDUCTION METHODS

Probit Method

A common technique for determining $d_g$ and $\sigma_g$ from impactor data is to first plot the cumulative mass percent versus the related cut diameter of each stage on log probability paper, where the stage cut diameter is defined as the aerodynamic diameter of a particle collected on the stage with 50% efficiency (Baron and Heitbrink 1993; Johnson and Swift 1997; Hinds 1982). If the mass fractions derived from the impactor data reduction represent those derived from a dust with a lognormal distribution of particle diameters, the data sets will plot as a straight line on graph paper of that type. Historically, the best-fit line through impactor data plotted on log probability paper was drawn by hand. An estimate of $d_g$ can be obtained directly from the plot by visually determining the diameter related to the point where the best-fit line intersects the line associated with 50% probability.

If the particles were normally distributed, the standard deviation would simply be the difference between the diameter at 84.1% and the diameter at 50%, as this range (34.1%) represents the area under the normal distribution between the mean and one standard deviation above it (Hinds 1982). However, the particle distribution is lognormal. Hence the natural logarithm of $\sigma_g$ is

$$\ln \sigma_g = \ln d_{84.1\%} - \ln d_{50\%}. \tag{1}$$


By the properties of logarithms this difference can be expressed as a ratio and, after taking the antilog of both sides of the equation, $\sigma_g$ can be expressed as

$$\sigma_g = \frac{d_{84.1\%}}{d_{50\%}}. \tag{2}$$

More recently, computer code has been written to perform a least-squares linear regression of the data to increase the accuracy of the estimates of $d_g$ and $\sigma_g$ (Knutson and Lioy 1995; Hinds 1986). A least-squares linear regression involves determining the parameters, $\alpha$ and $\beta$, of the linear equation

$$y(x) = \alpha + \beta x \tag{3}$$

that minimizes the sum of the squared difference between each observed $y$ value, $y_i$, and the corresponding expected value, $y(x_i)$, for $2 < i \le n$ total $x$ and $y$ values with $n - 2$ degrees of freedom:

$$\sum [y_i - y(x_i)]^2 = \min. \tag{4}$$

Because the relationship between stage cut diameter and mass cumulative percent is nonlinear, a numerical transformation of these parameters must be performed prior to the linear regression analysis. The linear transformation of the cut diameters is relatively simple in that the natural logarithm is calculated for each cut diameter because the underlying distribution is assumed to be lognormal. The more involved process of computing a linear transformation of the cumulative percent values is accomplished by first realizing that the scaling of the probability axis is derived from the integral of the standardized normal probability density function (PDF) that defines the cumulative fraction $\Phi(z)$ under the Gaussian distribution curve:

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp\left(\frac{-z^2}{2}\right) dz, \tag{5}$$

where $z = (x - \mu)/\sigma$, in which $\mu$ and $\sigma$ are the mean and standard deviation of the data values, respectively, and the standardized normal variable, $z$, has $\mu = 0$ and $\sigma = 1$.

Establishing $z$ as the independent variable, the resulting linear equation is

$$\ln d = \alpha + \beta z. \tag{6}$$

The corresponding graph is related to the use of the linear probit scale rather than the nonlinear probability scale. Under this premise, the intercept, $\alpha$, is equivalent to $\ln d_g$, as this represents the log diameter at $z = 0$, which is equivalent to a probability of 50%; hence $d_g = \exp(\alpha)$. Likewise, the slope, $\beta$, can be expressed as the change in the natural log of the diameters relative to a unit change in $z$:

$$\beta = \frac{\ln d_{z=1} - \ln d_{z=0}}{1 - 0}. \tag{7}$$

Given that $z$ values of 1 and 0 are equivalent to probabilities of 84.1 and 50%, respectively, Equation (7) is identical to Equation (1) and therefore $\beta$ is equivalent to the natural logarithm of $\sigma_g$, hence $\sigma_g = \exp(\beta)$.

Spreadsheet Development. As described above, the cumulative percent calculated for each impactor stage can be transformed into its equivalent (linear) $z$ value and plotted relative to the logarithm of the cut diameters. Current versions of most spreadsheets contain a predefined function that incorporates the numerical solution relating $\Phi(z)$ to $z$. In spreadsheets that contain such a function, the acronym “NORMSINV” is used to name the function. Therefore the stage cumulative percents (transformed into cumulative fractions), calculated as part of the impactor data reduction, can be transformed into their corresponding $z$ values and plotted on a linear scale versus the stage cut diameters plotted on a log scale (Figure 1). Furthermore, the built-in linear regression capabilities of these spreadsheets can be utilized to determine $d_g$ and $\sigma_g$ by taking the antilogarithm of the slope and intercept resulting from a linear regression of $\log d_c$ versus $z$.
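The spreadsheet steps of the Probit method can also be sketched outside a spreadsheet. The following Python fragment is only an illustration under stated assumptions: `statistics.NormalDist().inv_cdf` stands in for NORMSINV, the function name `probit_fit` is hypothetical, and the cut diameters and cumulative fractions are synthetic values generated from an ideal lognormal distribution with $d_g = 4$ µm and $\sigma_g = 2$, not data from this study.

```python
import math
from statistics import NormalDist

def probit_fit(cut_diameters, cum_fractions):
    """Regress ln(d) on z (Equation 6) and return (d_g, sigma_g).

    cum_fractions are cumulative mass fractions (0 < p < 1) below each cut diameter.
    """
    z = [NormalDist().inv_cdf(p) for p in cum_fractions]  # NORMSINV equivalent
    y = [math.log(d) for d in cut_diameters]
    n = len(z)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    s_zz = sum((zi - z_mean) ** 2 for zi in z)
    s_zy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    beta = s_zy / s_zz               # slope = ln(sigma_g)
    alpha = y_mean - beta * z_mean   # intercept = ln(d_g)
    return math.exp(alpha), math.exp(beta)

# Synthetic (assumed) data: an ideal lognormal aerosol with d_g = 4, sigma_g = 2.
cuts = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
cum = [NormalDist().cdf(math.log(d / 4.0) / math.log(2.0)) for d in cuts]
d_g, sigma_g = probit_fit(cuts, cum)
print(d_g, sigma_g)
```

Because the synthetic points lie exactly on a straight line in probit coordinates, the fit should return the generating parameters; measured impactor data would scatter about the line.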

Inverted Probit Method

As the proof given in the previous section implies, the use of linear regression analysis to determine $d_g$ and $\sigma_g$ relies numerically on establishing $z$ as the independent variable. This condition is counter to the physical aspect of the impactor sampling process whereby $z$ is a product of the mass measured on each stage, which is dependent on the cut diameter of each stage. Therefore the linear relationship should be inverted to correctly account for the actual relationship between the cumulative fractions as the dependent variable and cut diameters as the independent variable. A determination of $d_g$ and $\sigma_g$ can then be accomplished from the $\alpha$ and $\beta$ values resulting from the linear regression analysis of the inverted variables where

$$\ln d_g = \frac{-\alpha}{\beta}, \tag{8}$$

$$\ln \sigma_g = \frac{1}{\beta}. \tag{9}$$
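As a minimal sketch of Equations (8) and (9), the regression can be run with $\ln d$ as the independent variable and $z$ as the dependent variable. The function name `inverted_probit_fit` and the synthetic data (an ideal lognormal aerosol with $d_g = 4$ µm, $\sigma_g = 2$) are assumptions made for illustration; no weighting is applied here.

```python
import math
from statistics import NormalDist

def inverted_probit_fit(cut_diameters, cum_fractions):
    """Regress z on ln(d), then apply Equations (8) and (9)."""
    x = [math.log(d) for d in cut_diameters]
    z = [NormalDist().inv_cdf(p) for p in cum_fractions]
    n = len(x)
    x_mean = sum(x) / n
    z_mean = sum(z) / n
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_xz = sum((xi - x_mean) * (zi - z_mean) for xi, zi in zip(x, z))
    beta = s_xz / s_xx
    alpha = z_mean - beta * x_mean
    d_g = math.exp(-alpha / beta)     # Equation (8)
    sigma_g = math.exp(1.0 / beta)    # Equation (9)
    return d_g, sigma_g

# Synthetic (assumed) data from an ideal lognormal with d_g = 4, sigma_g = 2.
cuts = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
cum = [NormalDist().cdf(math.log(d / 4.0) / math.log(2.0)) for d in cuts]
d_g_inv, sigma_g_inv = inverted_probit_fit(cuts, cum)
print(d_g_inv, sigma_g_inv)
```

For points that fall exactly on a line the two regressions agree; they differ only when the data scatter, because each minimizes errors along a different axis.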

Sample Variance Inequalities. Proper application of a least-squares linear regression is performed under the assumption that the variance of the dependent variable is constant regardless of the magnitude of the independent variable. However, the probability scale of log-probability paper (or log-probit paper) disproportionately magnifies the effect of an error in cumulative percent near the extremes of the scale compared to values near 50% (Hinds 1982, 1986). (The same is also true when applying a logarithmic scale to the ordinate but is often ignored, as when determining log-log or semilog relationships.)

Figure 1. Spreadsheet-generated log-probit paper with regression line on points determined from impactor data reduction.

In general, if a regression is performed on dependent values transformed by the function, $f(y_i)$, rather than directly on the data values, $y_i$, then the estimated standard deviation (or “uncertainty”) associated with each data value, $s_{y_i}$, must be modified by

$$s_{f(y_i)} = \frac{d f(y_i)}{dy}\, s_{y_i} \tag{10}$$

to compensate for the distortion of each $s_{y_i}$ caused by the scaling transformation (Bevington and Robinson 1992). For example, if the dependent variable is transformed by taking the natural logarithm of each value, the standard deviations would be modified by

$$s_{\ln(y_i)} = \frac{d(\ln y_i)}{dy}\, s_{y_i} = \frac{1}{y_i}\, s_{y_i}. \tag{11}$$

For the case described here, the cumulative mass fraction, $\Phi(z)$, is transformed into a corresponding $z$ value by the inverse of Equation (5). With reference to Equation (11), the standard deviation of the variable $z$, $s_z$, is given in terms of the observed standard deviation of each $\Phi(z)$ value, $s_\Phi$, by

$$s_{z_i} = \frac{dz}{d\Phi(z)}\, s_{\Phi i} = \frac{\sqrt{2\pi}}{\exp(-z_i^2/2)}\, s_{\Phi i}, \tag{12}$$

where for an impactor the subscript “$i$” is used to index the total number of stages of the impactor.

Figure 2. Distortion in probability scale relative to no distortion at a cumulative percent level of 50%.
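To make the scaling factor in Equation (12) concrete, the short sketch below evaluates $dz/d\Phi(z)$ at a few cumulative fractions and expresses it relative to its value at 50%. The function names are hypothetical, and the fragment assumes only Equation (12) and the standard normal inverse available in the Python standard library.

```python
import math
from statistics import NormalDist

def probit_distortion(cum_fraction):
    """dz/dPhi(z) from Equation (12), evaluated at a cumulative fraction."""
    z = NormalDist().inv_cdf(cum_fraction)
    return math.sqrt(2.0 * math.pi) / math.exp(-z * z / 2.0)

def relative_distortion(cum_fraction):
    """Distortion relative to the 50% level, where z = 0."""
    return probit_distortion(cum_fraction) / probit_distortion(0.5)

for p in (0.01, 0.05, 0.2, 0.5, 0.8, 0.95, 0.99):
    print(p, round(relative_distortion(p), 3))
```

The factor is symmetric about 50% and grows rapidly toward either tail, which is why equal errors in cumulative fraction translate into unequal errors in $z$.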

As shown in Figure 2, the value of $dz/d\Phi(z)$ remains relatively constant between cumulative fractions of 0.2 and 0.8, but increases dramatically below and above those levels. The curve shown in Figure 2 also demonstrates the amount of distortion to be expected from plotting on a probability axis given equal $s_\Phi$ values. This distortion results in an enlargement of any error associated with $z$ values plotted below or above the 20% ($z < -0.85$) and 80% ($z > 0.85$) lines, respectively (Figure 3).

Figure 3. The effect of scaling distortion on sample error assuming a 5% error about each point.

To compensate for differences between each $s_z$ resulting from an application of Equation (12), the method of weighted least squares (WLS) can be employed (Bevington and Robinson 1992). WLS is a generalized form of least-squares regression that allows for the incorporation of the $s_z$ that necessarily vary. In general, WLS weights the squared errors between the measurements, $y_i$, and the regression values, $y(x_i)$, given in Equation (4), by the reciprocal of each variance term $s^2_{f(y_i)}$ (Bevington and Robinson 1992; Raabe 1978). The WLS method then determines the intercept, $\alpha$, and slope, $\beta$, that minimize the summation given in Equation (13).

$$\sum \frac{[y_i - y(x_i)]^2}{s^2_{f(y_i)}} = \sum \frac{[y_i - (\alpha + \beta x_i)]^2}{s^2_{f(y_i)}} = \min. \tag{13}$$

Since the variance term is used to weight the squared difference between measured and predicted levels of $y$, it is not important to accurately determine the absolute value of each variance, but only to ensure that each has a magnitude that is correctly relative to all other variance terms.

In summary, determining a minimum value of Equation (13) is accomplished after (1) transforming the cumulative fractions, $\Phi(z)$, associated with each stage into an equivalent $z$ value; (2) making the logarithmic transformation of the diameters; and (3) transforming the standard deviation of the cumulative fractions for each stage, $s_{\Phi i}$, with the use of Equation (12) to obtain

$$\sum \frac{[z_i - z(\ln d_i)]^2}{\left( \dfrac{\sqrt{2\pi}}{e^{-z_i^2/2}}\, s_{\Phi i} \right)^2} = \min, \tag{14}$$

where

$$z(\ln d_i) = \alpha + \beta (\ln d_i). \tag{15}$$

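To then determine the values of $\alpha$ and $\beta$ that minimize Equation (13), the partial derivatives of Equation (13) with respect to both parameters are evaluated when set equal to zero (Bevington and Robinson 1992). The computational equations that satisfy that criterion are given below.

$$\alpha = \frac{1}{\Delta} \left( \sum \frac{x_i^2}{s_{z_i}^2} \sum \frac{y_i}{s_{z_i}^2} - \sum \frac{x_i}{s_{z_i}^2} \sum \frac{x_i y_i}{s_{z_i}^2} \right), \tag{16}$$

$$\beta = \frac{1}{\Delta} \left( \sum \frac{1}{s_{z_i}^2} \sum \frac{x_i y_i}{s_{z_i}^2} - \sum \frac{x_i}{s_{z_i}^2} \sum \frac{y_i}{s_{z_i}^2} \right), \tag{17}$$

where:

$$\Delta = \sum \frac{1}{s_{z_i}^2} \sum \frac{x_i^2}{s_{z_i}^2} - \left( \sum \frac{x_i}{s_{z_i}^2} \right)^2. \tag{18}$$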

Commercially available spreadsheets do not have an option to perform weighted least-squares regression. Therefore to determine $\alpha$ and $\beta$, other software must be used or extra columns must be developed in a spreadsheet to compute the summations given in Equations (16)–(18). Equations (8) and (9) can then be used to determine $d_g$ and $\sigma_g$, respectively, from the values of $\alpha$ and $\beta$ determined from Equations (16)–(18).
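The summations in Equations (16)–(18) translate directly into a few lines of code. The sketch below is one possible arrangement, not the authors' spreadsheet: the function name `weighted_least_squares` and the example numbers are assumptions, with `x` standing for $\ln d_i$, `y` for $z_i$, and `s` for the $s_{z_i}$ values from Equation (12).

```python
def weighted_least_squares(x, y, s):
    """Return (alpha, beta) from Equations (16)-(18).

    x, y: data values; s: standard deviation attached to each y value.
    """
    w = [1.0 / (si * si) for si in s]                       # 1 / s_zi^2
    sum_w = sum(w)
    sum_wx = sum(wi * xi for wi, xi in zip(w, x))
    sum_wy = sum(wi * yi for wi, yi in zip(w, y))
    sum_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sum_wxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = sum_w * sum_wxx - sum_wx ** 2                   # Equation (18)
    alpha = (sum_wxx * sum_wy - sum_wx * sum_wxy) / delta   # Equation (16)
    beta = (sum_w * sum_wxy - sum_wx * sum_wy) / delta      # Equation (17)
    return alpha, beta

# Assumed example: points exactly on y = 1 + 2x with unequal uncertainties.
x_demo = [0.0, 1.0, 2.0, 3.0]
y_demo = [1.0, 3.0, 5.0, 7.0]
s_demo = [1.0, 2.0, 0.5, 3.0]
alpha_demo, beta_demo = weighted_least_squares(x_demo, y_demo, s_demo)
print(alpha_demo, beta_demo)
```

Only the relative sizes of the `s` values matter: multiplying every entry by the same constant leaves $\alpha$ and $\beta$ unchanged, consistent with the remark that the variances need only be correct relative to one another.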

Variance Assumptions. Application of the method described above relies on knowledge of the standard deviation of the cumulative fractions for each stage, $s_{\Phi i}$. However, an evaluation of the size distribution of an aerosol with the use of a cascade impactor is often obtained with a single sample. In this case, a value for each $s_{\Phi i}$ cannot be determined directly. As given in the appendix, the variance of the stage cumulative fractions, $s^2_{\Phi i}$, can be determined if the variance of the mass, $s^2_{m_i}$, collected on each stage is known. This likewise leads to the problem of determining each $s^2_{m_i}$ when only one sample is taken.

Rather than rely on the naïve assumption that $s^2_m$ is equivalent for all stages, we propose that a more realistic assumption be that $s^2_m$ is directly proportional to the mass on each stage. Although not meant to be conclusive, to test this assumption a series of 12 trials was performed during which a pulverized grain dust was aerosolized into a 1 m³ environmental chamber. During each trial the chamber flow rate, aerosol generation rate, impactor flow rate, and placement of the impactor within the chamber were held constant. After sampling, impactor substrates were weighed in a climate-controlled room with a 6-place balance and the variance of the mass collected on each stage was determined. The mass fraction on each stage for each trial was computed.

Inversion Method

The inverted probit reduction method described above reduces the error induced by making linear transformations of nonlinear variables. However, that method still incorporates the simplifying assumption that each stage of the impactor has an ideal, perfectly sharp collection efficiency. Several data “inversion” techniques have been developed to incorporate the actual stage collection efficiencies (Cooper 1993; Dzubay and Hasan 1990; Puttock 1981; Raabe 1978; Ramachandran and Vincent 1997). A least-squares fitting method for analyzing particle size distribution data was described by Kottler (1950). Raabe (1978) was the first to develop a stage-efficiencies weighted least-squares method for impactor data to calculate the maximum likelihood fitted parameters of the distribution based on the variation in the variance of the stage mass fractions while also incorporating the actual impactor stage efficiency curves to account for their deviation from the ideal. A spreadsheet application of that method is described below and applied to the reduction of data derived from the commonly used Marple personal cascade impactor (Series 290, Anderson Inst. Inc., Smyrna, GA, referred to in subsequent text as “the impactor”).

Incorporation of Stage Efficiency Curves. The regression techniques described above for determining $d_g$ and $\sigma_g$ are permissible if ideal stage collection efficiencies are assumed because a single diameter can be associated with the mass collected on a stage. However, if the actual, nonideal stage efficiency curves are considered in the analysis of $d_g$ and $\sigma_g$, then each stage mass fraction, $f_i$, is associated with all diameters spanning the stage collection efficiency curve of that stage. If one assumes that a lognormally distributed aerosol with a given $d_g$ and $\sigma_g$ is measured by the impactor, then the fraction of dust collected over a small size range can be calculated for a given stage if the collection efficiency for that size range is known. The total mass fraction for a particular stage can then be determined by summing the individual fractions determined over the entire range of the collection efficiency curve. Given the resulting set of computed mass fractions for each stage, $f_i(d_g, \sigma_g)$, the WLS method can be used to determine the values of $d_g$ and $\sigma_g$ that minimize the sum of the weighted squares of the differences between the measured fractions, $f_i$, and the computed fractions as given in Equation (19) (Raabe 1978).

$$\sum \frac{[f_i - f_i(d_g, \sigma_g)]^2}{s^2_{f_i}} = \min. \tag{19}$$

Unlike that given in Equation (13), the weighting factor, $1/s^2_{f_i}$, of Equation (19) does not compensate for differences in scaling but is rather the reciprocal of an estimate of the variance for each measured fraction (Bevington and Robinson 1992; Raabe 1978). Again, this variance is unknown if only one sample is taken. However, the distribution of a fraction, $f$, is best described by the binomial distribution, which has a variance defined as $f(1 - f)$ given one sample. As part of the experimental analysis used to verify the assumption used to estimate the variance of the mass on each stage, $s^2_m$, the variance of the mass fractions, $s^2_f$, on each stage was likewise computed and compared to values based on the assumption that $s^2_f = f(1 - f)$. Furthermore, as mentioned above, only the relative, not absolute, value of each variance term is needed for application in Equation (19).

Spreadsheet Development. Properties of various impactors have been extensively evaluated by many investigators since the invention of the cascade impactor by May (1945). For example, Raabe et al. (1988) provide results of careful stage calibration of a circular-jet impactor. Rader et al. (1991) developed equations to fit measured efficiency curves for the stages of the Marple personal cascade impactor previously calibrated by Rubow et al. (1987). The general form of the equation used by Rader et al. to describe each stage efficiency, $E_i$, is

$$E_i(d_{ar}) = \tanh\left[ \left( \frac{d_{ar}}{a_i} \right)^{b_i} \right], \tag{20}$$

where $d_{ar}$ is the aerodynamic resistance diameter first described by Raabe (1976) and $a_i$ and $b_i$ are the fitted parameters that best describe the curves for each stage. The aerodynamic diameter, $d_{ae}$, is related to $d_{ar}$ by the slip-correction factor, $C$,

$$d_{ar} = d_{ae} \sqrt{C(d_{ae})},$$

and $d_{ar}$ is therefore referred to as the “slip-corrected equivalent of the aerodynamic diameter” by Rader et al. (1991).
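Equation (20) is simple to evaluate, and it also fixes the stage cut diameter: setting $E_i = 0.5$ gives $(d_{ar}/a_i)^{b_i} = \operatorname{atanh}(0.5)$. The sketch below illustrates both points. The parameter values `a = 5.0` and `b = 4.0` are placeholders chosen for the example, not the fitted values of Rader et al. (1991), and the slip-correction step is omitted.

```python
import math

def stage_efficiency(d_ar, a, b):
    """Collection efficiency of one stage, Equation (20)."""
    return math.tanh((d_ar / a) ** b)

def cut_diameter(a, b):
    """Diameter d_ar collected with 50% efficiency: (d/a)^b = atanh(0.5)."""
    return a * math.atanh(0.5) ** (1.0 / b)

# Hypothetical stage parameters, for illustration only.
a_demo, b_demo = 5.0, 4.0
d50_demo = cut_diameter(a_demo, b_demo)
print(d50_demo, stage_efficiency(d50_demo, a_demo, b_demo))
```

A larger exponent $b_i$ gives a sharper curve, approaching the ideal step function assumed by the two probit methods.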

The use of explicit equations to define the collection curves of the impactor evaluated here simplified the application of this method for use in a spreadsheet. However, as emphasized by Raabe (1978), this method will work equally well given a series of interpolated values for each curve. Furthermore, impactor calibration methods other than those described by Rader et al. (1991) may be applied to obtain collection curves for impactors (e.g., Raabe et al. (1988)). As applied to the inversion spreadsheet, the efficiency values for 101 diameters between 0.32 µm and 100 µm (−0.5 to 2.0 log10 diameter by increments of 0.025 log diameter) and for each stage were calculated with the use of Equation (20). The resulting curves are shown in Figure 4.

Figure 4. Impactor stage collection efficiency curves as determined by Rader et al. (1991).

Under the assumption that the aerosol sampled by the impactor is unimodal and lognormally distributed, the theoretical cumulative fraction, $\Phi(z)$, less than each of the 101 diameters was determined with the use of a spreadsheet (Excel, Microsoft Corp., Seattle, WA) and the predefined function NORMDIST. That function computes the $\Phi(z)$ value associated with a particular diameter given a mean diameter and standard deviation. In this case, the log10 of the diameters is normally distributed. Therefore the function was supplied with the log of each diameter as well as the log of chosen values for $d_g$ and $\sigma_g$. The mass fraction, $f_i(\Delta d)$, between each of the 101 diameters was then computed by subtracting adjacent $\Phi(z)$ values. These $f_i(\Delta d)$ values were then modified to account for internal losses and sampling inlet efficiency with the use of equations supplied by Rader et al. (1991).
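The diameter grid and the size-class mass fractions can be reproduced with a few lines of code. This is a sketch under assumptions: `NormalDist(...).cdf` stands in for NORMDIST, the function names are hypothetical, the example values $d_g = 4$ µm and $\sigma_g = 2$ are arbitrary, and the corrections for internal losses and inlet efficiency are not included.

```python
import math
from statistics import NormalDist

def diameter_grid(log_min=-0.5, log_max=2.0, step=0.025):
    """101 diameters from 10**-0.5 to 10**2 µm in steps of 0.025 log10 units."""
    n = round((log_max - log_min) / step)   # 100 intervals
    return [10.0 ** (log_min + k * step) for k in range(n + 1)]

def bin_fractions(diameters, d_g, sigma_g):
    """Mass fraction between adjacent diameters for a lognormal aerosol."""
    dist = NormalDist(mu=math.log10(d_g), sigma=math.log10(sigma_g))
    cum = [dist.cdf(math.log10(d)) for d in diameters]   # NORMDIST equivalent
    return [hi - lo for lo, hi in zip(cum, cum[1:])]

grid = diameter_grid()
fractions = bin_fractions(grid, d_g=4.0, sigma_g=2.0)
print(len(grid), len(fractions), sum(fractions))
```

The fractions sum to slightly less than 1 because a small part of the distribution lies outside the 0.32 to 100 µm window.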

The amount of aerosol within a certain size range that deposits onto a given stage is equivalent to the product of the amount in that size range that penetrates through the previous stage and the stage collection efficiency for that size range (Raabe 1978; Rader et al. 1991; Ramachandran and Vincent 1997). Therefore starting with the modified $f_i(\Delta d)$ values, the percent penetrating through and depositing on each stage was determined for each of the 100 size classes analyzed. The estimated fraction deposited on each stage, $f_i(d_g, \sigma_g)$, was then calculated by determining the sum of the 100 fractions for each size range computed for each stage. These estimated fractions were then used to develop the summation defined by Equation (19).
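The penetration and deposition bookkeeping described above can be sketched as a nested loop over size classes and stages. Several choices here are assumptions made for the illustration rather than details given in the text: the function name `stage_deposition`, the stage parameters `(a, b)`, the treatment of whatever penetrates the last stage as a backup-filter fraction, the normalization of the result to sum to 1, and the four-bin toy input.

```python
import math

def stage_deposition(bin_mass, bin_diameters, stage_params):
    """Predicted mass fraction on each stage plus a final backup filter.

    bin_mass: mass fraction in each size class.
    bin_diameters: representative diameter of each size class.
    stage_params: (a, b) pairs for Equation (20), ordered top stage first.
    """
    deposited = [0.0] * (len(stage_params) + 1)   # last slot = backup filter
    for mass, d in zip(bin_mass, bin_diameters):
        penetrating = mass
        for i, (a, b) in enumerate(stage_params):
            eff = math.tanh((d / a) ** b)         # Equation (20)
            deposited[i] += penetrating * eff     # collected on stage i
            penetrating *= 1.0 - eff              # passed to the next stage
        deposited[-1] += penetrating              # remainder reaches the filter
    total = sum(deposited)
    return [m / total for m in deposited]

# Toy input with hypothetical stage parameters, for illustration only.
demo_stages = [(10.0, 4.0), (5.0, 4.0), (2.0, 4.0)]
demo_fractions = stage_deposition([0.25] * 4, [0.5, 3.0, 7.0, 30.0], demo_stages)
print(demo_fractions)
```

Because each size class is tracked through every stage, a particle size that falls on the sloping part of one efficiency curve contributes mass to several stages, which is the effect the ideal-efficiency methods ignore.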

Nonlinear root-solving techniques are available to find the values of $d_g$ and $\sigma_g$ that minimize Equation (19) (Bevington and Robinson 1992; Press et al. 1989). Software programming can be used to implement these numerical methods; however, most spreadsheets contain a precoded function to perform nonlinear root-solving automatically. The particular one utilized as part of this analysis, “Solver” in the spreadsheet Excel, utilizes a generalized reduced gradient (GRG2) nonlinear optimization code (Frontline Systems Inc., Incline Village, NV). In this case, Solver is used to minimize the contents of the cell containing the summation defined by Equation (19) while changing the contents of the cells containing the values of $d_g$ and $\sigma_g$ used to determine the predicted fractions, $f_i(d_g, \sigma_g)$, with the additional constraints of $d_g \ge 0.01$ and $\sigma_g \ge 1.01$.
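For readers without access to Solver, the minimization of Equation (19) can be approximated with any bounded optimizer. The sketch below swaps in a simple compass (pattern) search, which is not the GRG2 algorithm used by Solver. The stage parameters in `STAGES` are hypothetical, the geometric midpoint of each size class is used as its representative diameter, slip correction and loss corrections are ignored, the weights use the binomial assumption $s^2_f = f(1 - f)$ applied to the measured fractions, and the "measured" fractions are synthetic values generated from $d_g = 4$ µm and $\sigma_g = 2.5$.

```python
import math
from statistics import NormalDist

STAGES = [(10.0, 4.0), (5.0, 4.0), (2.5, 4.0), (1.0, 4.0)]   # hypothetical (a_i, b_i)
EDGES = [10.0 ** (-0.5 + 0.025 * k) for k in range(101)]     # 101 diameters
MIDS = [math.sqrt(lo * hi) for lo, hi in zip(EDGES, EDGES[1:])]

def predicted_fractions(d_g, sigma_g):
    """Stage and filter fractions for a lognormal aerosol (forward model)."""
    dist = NormalDist(math.log10(d_g), math.log10(sigma_g))
    cum = [dist.cdf(math.log10(d)) for d in EDGES]
    out = [0.0] * (len(STAGES) + 1)
    for lo, hi, d in zip(cum, cum[1:], MIDS):
        pen = hi - lo
        for i, (a, b) in enumerate(STAGES):
            eff = math.tanh((d / a) ** b)
            out[i] += pen * eff
            pen *= 1.0 - eff
        out[-1] += pen
    total = sum(out)
    return [f / total for f in out]

def objective(params, measured):
    """Equation (19) with weights 1 / [f (1 - f)] from the measured fractions."""
    pred = predicted_fractions(params[0], params[1])
    return sum((f - p) ** 2 / max(f * (1.0 - f), 1e-12)
               for f, p in zip(measured, pred))

def compass_search(measured, start, step=0.5, tol=1e-6, lower=(0.01, 1.01)):
    """Try +/- step on each parameter; halve the step when nothing improves."""
    best = list(start)
    best_val = objective(best, measured)
    while step > tol:
        improved = False
        for k in range(2):
            for sign in (1.0, -1.0):
                trial = list(best)
                trial[k] = max(lower[k], trial[k] + sign * step)
                val = objective(trial, measured)
                if val < best_val:
                    best, best_val, improved = trial, val, True
        if not improved:
            step *= 0.5
    return best[0], best[1], best_val

measured_demo = predicted_fractions(4.0, 2.5)    # synthetic "measurement"
fit_dg, fit_sg, fit_val = compass_search(measured_demo, start=(3.0, 2.0))
print(fit_dg, fit_sg, fit_val)
```

In practice the starting point would come from a preliminary fit such as the Probit method, and a library optimizer could replace the compass search without changing the objective.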

A common difficulty that arises when attempting to find the minimum (or maximum) of a complex numerical system, such as the one described here, is the likelihood of determining a “local” minimum rather than the desired minimum. To avoid this problem here, the Probit method described above can be used to find initial values for $d_g$ and $\sigma_g$ that will likely be near those found when using the Inversion method. Employing the Probit method is also useful for performing an initial evaluation as to whether the cumulative mass fractions fall approximately on a straight line when plotted and therefore indicate that the aerosol size distribution is lognormal and unimodal.

Confidence in the Parameter Estimates

A measure of the confidence in the estimates of the parameters $d_g$ and $\sigma_g$ can be obtained by computing their standard error (SE). This is a relatively straightforward process if $d_g$ and $\sigma_g$ were computed using linear least squares regression as when using the Probit method. In that case, the SE values are determined from an estimate of the experimental error variance, $\sigma^2$, by computing the error sum of squares (SSE) equivalent to the summation given in Equation (19) for the weighted least squares case. An unbiased estimate of $\sigma^2$ can then be obtained by dividing the SSE by the appropriate number of degrees of freedom, which is the sample size, $n$, minus the number of parameters, $p$. The variance of a single parameter estimate, say $p(1)$, is then computed from SSE/($n - p$) multiplied by a function of the independent variable (or matrix of multiple independent variables), $f(x)$, by

$$s^2_{p(1)} = \frac{\mathrm{SSE}}{n - p}\, f(x) \tag{21}$$

and the standard error of $p(1)$, $s_{p(1)}$, is the square root of $s^2_{p(1)}$. (The nature of the term, $f(x)$, depends on the parameter evaluated. A statistical text may be consulted for a complete description of the formulation of SE values for estimated parameters.) Given a SE value for a parameter estimate, the $(1 - \alpha)\,100\%$ confidence limits for the true parameter value can be obtained from

$$p(1) \pm t_{\nu, \alpha/2}\, s_{p(1)}, \tag{22}$$

where $\nu$ represents the $n - p$ degrees of freedom. The output displayed when using a spreadsheet to perform a univariate linear regression analysis will typically contain both the SE values and corresponding confidence intervals for the calculated slope and intercept.
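For the unweighted straight-line case, Equations (21) and (22) reduce to the textbook expressions in which $f(x) = 1/S_{xx}$ for the slope and $f(x) = 1/n + \bar{x}^2/S_{xx}$ for the intercept, where $S_{xx} = \sum (x_i - \bar{x})^2$. The sketch below applies them to made-up data; the function names are hypothetical, and the critical value $t_{3,\,0.025} = 3.182$ is taken from standard $t$ tables because the Python standard library has no $t$ distribution.

```python
import math

def regression_with_se(x, y):
    """Ordinary least squares for y = alpha + beta x with standard errors."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    beta = s_xy / s_xx
    alpha = y_mean - beta * x_mean
    sse = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                                    # SSE / (n - p), p = 2
    se_beta = math.sqrt(s2 / s_xx)                        # Equation (21), slope
    se_alpha = math.sqrt(s2 * (1.0 / n + x_mean ** 2 / s_xx))  # intercept
    return alpha, beta, se_alpha, se_beta

def confidence_interval(estimate, se, t_crit):
    """Equation (22): estimate +/- t * SE."""
    return estimate - t_crit * se, estimate + t_crit * se

# Made-up data for illustration; n = 5, so nu = 3 degrees of freedom.
x_ci = [1.0, 2.0, 3.0, 4.0, 5.0]
y_ci = [2.1, 3.9, 6.2, 7.8, 10.1]
alpha_ci, beta_ci, se_alpha_ci, se_beta_ci = regression_with_se(x_ci, y_ci)
beta_low, beta_high = confidence_interval(beta_ci, se_beta_ci, t_crit=3.182)
print(alpha_ci, beta_ci, se_alpha_ci, se_beta_ci, beta_low, beta_high)
```

When the regression is on $\ln d$ or $z$, these intervals apply to $\alpha$ and $\beta$; limits for $d_g$ and $\sigma_g$ follow by transforming the interval end points.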

The method described above results in an unbiased estimate of the SE of a parameter estimate obtained from a linear least squares regression analysis. This estimation is also “exact” in that the function, $f(x)$, can be solved explicitly. However, an exact solution of $f(x)$ cannot be obtained for a nonlinear function. Therefore the SEs of the parameter estimates cannot be solved explicitly. One method for estimating SE values for parameters associated with a nonlinear function is to utilize a Taylor series expansion to approximate the nonlinear model with linear terms (Neter et al. 1996). Application of this “Gauss–Newton” method can result in both a least-squares estimate of the parameters, $d_g$ and $\sigma_g$, as well as their standard errors. However, in this case, the method was used only to determine the standard errors once best estimates were determined using the spreadsheet’s precoded iterative solving routine. An explanation of this method is given in the appendix.

Method Comparison

Theoretical comparisons between the 2 linear regression methods described above and the Inversion method were made under the assumption that use of the Inversion method resulted in the most accurate estimate of the true $d_g$ and $\sigma_g$ values. Therefore rather than use the Inversion method to determine unknown $d_g$ and $\sigma_g$ values from a set of measured stage fractions, known $d_g$ and $\sigma_g$ values were applied to a spreadsheet containing the equations describing the stage collection efficiency curves to determine the predicted stage fractions. These stage fractions therefore represent those that would result from a perfectly lognormally distributed aerosol captured by the impactor with nonideal stage collection efficiencies.

A primary difficulty associated with the use of the linear regression methods occurs when the data points do not fall on a straight line. This may occur when the distribution of the aerosol deviates from that of a perfect lognormal distribution. Deviation from a straight line may also occur if the aerosol distribution is not unimodal. However, for these comparisons only the mass fractions of a unimodal, lognormal distribution were determined. Any deviations from the straight line therefore occurred as a consequence of predicting an insignificant amount of mass for stages near the top or bottom of the impactor. For example, very low mass fractions were predicted for the lower stages given an aerosol with a large $d_g$ and low $\sigma_g$. This caused the points associated with the lower stages to drop relative to the slope created by the upper stage points. In practice, one may perform a linear regression on only those points that form a straight line. For comparison purposes, however, only those combinations of $d_g$ and $\sigma_g$ that resulted in linear regression through all points with an $r^2$ value greater than 0.995 were compared to the Inversion method. This limitation excluded the comparison of distributions with $\sigma_g < 2.0$ and $d_g > 8$.

Given the limitations described above, the comparisons were conducted for the 15 combinations of $d_g$ values of 1, 2, 4, 6, and 8 µm and $\sigma_g$ values of 2.0, 2.5, and 3.0. The stage mass fractions predicted from stage collection efficiency equations were applied to the separate spreadsheets used to determine $d_g$ and $\sigma_g$ by the linear regression methods. The percent differences in the resulting values of $d_g$ and $\sigma_g$, compared to those of the original set applied, were then computed for each combination of $d_g$ and $\sigma_g$.
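The two linear regression reductions can be sketched as follows. This is an illustrative, unweighted version: the function names are hypothetical and the variance weighting used in the spreadsheets is omitted. In the Probit form the logarithm of the cut diameter is the dependent variable, so the intercept and slope give $d_g$ and $\sigma_g$ directly; in the Inverted Probit form the normal variate $z$ is the dependent variable and the parameters are recovered by inverting the fitted line.

```python
import math
from statistics import NormalDist


def least_squares_line(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def probit_points(cuts, fractions):
    """Return (z, log10 d) lists from stage fractions ordered largest cut first.

    fractions holds one value per stage followed by the backup filter.
    """
    norm = NormalDist()
    total = sum(fractions)
    zs, logs = [], []
    for k, d50 in enumerate(cuts):
        # Cumulative mass fraction smaller than this stage's cut diameter.
        cum_less = sum(fractions[k + 1:]) / total
        if 0.0 < cum_less < 1.0:
            zs.append(norm.inv_cdf(cum_less))
            logs.append(math.log10(d50))
    return zs, logs


def probit_fit(cuts, fractions):
    """Fit log10 d = alpha + beta*z, so dg = 10**alpha and sigma_g = 10**beta."""
    zs, logs = probit_points(cuts, fractions)
    alpha, beta = least_squares_line(zs, logs)
    return 10 ** alpha, 10 ** beta


def inverted_probit_fit(cuts, fractions):
    """Fit z = alpha + beta*log10 d, so dg = 10**(-alpha/beta), sigma_g = 10**(1/beta)."""
    zs, logs = probit_points(cuts, fractions)
    alpha, beta = least_squares_line(logs, zs)
    return 10 ** (-alpha / beta), 10 ** (1 / beta)
```

When the stage fractions come from ideal cut-offs applied to an exactly lognormal aerosol, the points fall on a straight line and both fits return the input parameters. Differences between the two fits appear only when the points scatter about the line.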

Corrections for internal losses and inlet sampling efficiency were not applied when making these comparisons, as they were assumed to hold true regardless of method. Furthermore, the stage cut diameters applied to the linear regression methods were those reported by Rader et al. (1991) rather than the commonly used cut diameters reported by Rubow et al. (1987).

RESULTS AND DISCUSSION

Variance Assumptions

For the 12 dust trials performed, the variance of the mass collected on each stage and back up filter was computed. A statistical analysis (F test) determined a significant difference between the smallest (0.0002, stage 8) and largest mass variance (0.137, stage 3) (p < 0.001). Therefore an assumption that the mass variance, $s_m^2$, is constant when applied to the calculation of the variance of the cumulative mass fraction, $s_\Phi^2$, is not valid. Because the variances obtained by estimation were of a different magnitude than those calculated, each set of estimated and calculated variances was normalized relative to the sum of all variances obtained for each set. As shown in Figure 5, a calculation of $s_\Phi^2$ for each stage using the assumption $s_{m_i}^2 = m_i$ gives a closer approximation to the actual variance of the mass measured for each stage compared to values obtained under the assumption $s_{m_i}^2 = \text{constant}$. The SSE between the measured variances and the estimated variances was 236 when applying the first assumption compared to 796 when applying the second.

Figure 5. Normalized stage cumulative fraction variances compared to predicted variances.

Likewise, the variance of the 12 measured mass fractions, $s_{f_i}^2$, was computed for each stage and back up filter. A statistical analysis (F test) determined a significant difference between the smallest (0.152, stage 8) and largest variance (2.769, stage 2) (p < 0.001). Therefore the need to utilize weighted least squares to compensate for differences in $s_f^2$ is justified when using the Inversion method. A comparison was made between these calculated variances and estimations of the variance based on the assumption that $s_{f_i}^2 = f_i(1 - f_i)$ (Figure 6). After normalizing each set of values as a percentage of the sum of all values, the SSE (145.3) was less than that obtained while assuming a linear relationship between the two (188.6). Although these analyses were not performed in such a way as to demonstrate conclusive results to support use of the variance assumptions, they do offer reasonable guidance for an estimation of these variances when only one sample has been taken.

Figure 6. Normalized stage mass fraction variances compared to predicted variances.
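The normalization and SSE comparison used for these variance checks can be sketched as below. The function names and the numbers in the usage lines are hypothetical and serve only to show the arithmetic; they are not the trial data.

```python
def normalize_percent(values):
    """Express each value as a percentage of the sum of all values."""
    total = sum(values)
    return [100.0 * v / total for v in values]


def sse_between(measured, estimated):
    """Sum of squared errors between two normalized variance profiles."""
    m = normalize_percent(measured)
    e = normalize_percent(estimated)
    return sum((mi - ei) ** 2 for mi, ei in zip(m, e))


# Hypothetical stage fractions and measured variances, for illustration only.
stage_fractions = [0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05]
measured_variances = [0.4, 0.9, 1.7, 2.2, 1.6, 0.8, 0.5]
binomial_estimate = [f * (1.0 - f) for f in stage_fractions]   # s_f^2 = f(1 - f)
linear_estimate = list(stage_fractions)                        # s_f^2 proportional to f

sse_binomial = sse_between(measured_variances, binomial_estimate)
sse_linear = sse_between(measured_variances, linear_estimate)
```

Because both profiles are rescaled to sum to 100, the comparison depends only on the shape of each variance profile across stages and not on its absolute magnitude.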

Method Comparison

As shown in Figures 7 and 8, use of the 2 linear regression reduction methods resulted in values of $d_g$ and $\sigma_g$ greater than those supplied to the Inversion method. However, the Inverted Probit method agreed more closely with the Inversion method than the Probit method did.

Figure 7. Percent difference in $d_g$ (a) and $\sigma_g$ (b) predicted by the Probit method relative to those applied to a spreadsheet that incorporated the actual stage efficiency curves.

The increased estimate of $d_g$ by the 2 linear regression methods is primarily related to the nature of the actual stage collection efficiency curves relative to the ideal collection assumed when using these 2 methods. The sigmoidal nature of the collection curves indicates that each stage will fail to collect some particles greater than the cut diameter and will likewise collect some particles less than the cut diameter. Conversely, under ideal collection each stage will collect only those particles with sizes falling between its cut diameter and the larger cut diameter of the previous stage. Actual collection on each stage will therefore result in collecting particles with sizes above and below this range. If for all stages the mass of particles smaller than the stage cut diameter is greater than the mass of particles larger than the cut diameter of the previous stage, the linear regression methods will overestimate $d_g$. As shown in Figure 9, this is the case for the Marple personal cascade impactor regardless of the underlying size distribution.

The long tail of the collection efficiency curve for stage 1 (Figure 4) has a pronounced effect on this difference in the estimate of $d_g$. The Inversion method correctly associates the mass collected on stage 1 with the contribution of particles less than the cut diameter for that stage, whereas the linear regression methods assume the mass to be contributed by all particles greater than the cut diameter, thereby adding to the inflation of the estimate of $d_g$ by those methods.

Figure 8. Percent difference in $d_g$ (a) and $\sigma_g$ (b) predicted by the Inverted Probit method relative to those applied to a spreadsheet that incorporated the actual stage efficiency curves.

A sensitivity analysis was also performed in which stage fractions predicted from the equations describing the stage collection efficiency curves were applied as “measured” mass fractions to the spreadsheet containing the Inversion method. If starting values for $d_g$ and $\sigma_g$ were purposely chosen to be much different than those used to calculate the stage fractions, then the Solver routine would often converge on a minimum located at a combination of $d_g$ and $\sigma_g$ that was not near the original set. However, when the Probit method was used to generate starting values, as suggested above, the largest percent errors between the supplied and determined $d_g$ and $\sigma_g$ were $7.5 \times 10^{-5}$ and $5.6 \times 10^{-5}$, respectively.

The 95% confidence intervals about the estimates of $d_g$ and $\sigma_g$ shown in Figure 1 were determined using the unbiased method associated with linear regression analysis. This method resulted in confidence intervals of 1.04 µm about the estimated $d_g$ value of 4.34 µm and 1.03 about the estimated $\sigma_g$ value of 2.64. The estimation technique described in the appendix for determining confidence intervals about parameters estimated by nonlinear least squares was also applied to the same impactor data set. This method resulted in confidence intervals of 0.25 µm about an estimated $d_g$ value of 3.85 µm and 0.14 about the estimated $\sigma_g$ value of 2.40. The fitted distributions using these 2 methods are plotted in Figure 10 together with a histogram of the relative stage mass percent relative to aerodynamic particle diameter, $d_{ae}$. As shown in Figure 10, the distribution determined with the Probit method is wider and shifted toward the larger particle diameters, a relationship consistent with the general results found when comparing these 2 methods.

Figure 9. The ratio of all mass collected below, relative to all mass collected above, successive stage cut diameters given various combinations of $d_g$ and $\sigma_g$.

Figure 10. Histogram of the relative mass percent on each stage compared to lognormal distributions resulting from application of the Probit and Inversion methods.

CONCLUSION

The spreadsheet applications of 2 linear regression methods and an inversion method for the reduction of data collected by a cascade impactor were described. Application of least-squares linear regression to impactor data plotted on log-probability paper requires that the stage cut diameter be the dependent variable in order to derive the $d_g$ and $\sigma_g$ values from the resulting intercept and slope, respectively. To correct for this condition, a linear regression method was described that inverts the relationship between cut diameter and the associated stage cumulative percent. These linear regression methods were compared to an inversion method that incorporates the stage collection efficiency curves in the analysis of $d_g$ and $\sigma_g$. The method comparison demonstrated that both linear regression methods overestimated both $d_g$ and $\sigma_g$ within the ranges analyzed. The linear regression methods will therefore underestimate the fraction of the smaller, respirable particles. Furthermore, the increased complexity of the Inverted Probit method over that of the Probit method does not appear to be justified, as this method also produces $d_g$ and $\sigma_g$ values consistently higher than the Inversion method. Future work will involve the addition of features to the Inversion method spreadsheet to allow for the analysis of bimodal and trimodal distributions. In that regard, the extent to which the impactor can distinguish the separate median diameters of a bimodally distributed aerosol will be evaluated. An electronic copy of any, or all, spreadsheets described in this paper is available upon request to the corresponding author.

NOMENCLATURE

$d$    aerosol diameter
$\log d$    log base 10 of aerosol diameter
$d_{ar}$    slip-corrected aerosol diameter
$d_g$    mass median aerodynamic diameter of the aerosol lognormal size distribution
$f_i$    measured mass fraction on the $i$th stage of an impactor
$f_i(d_g, \sigma_g)$    expected mass fraction on the $i$th stage based on impactor stage collection efficiency curves and unimodal, lognormal size distribution defined by $d_g$ and $\sigma_g$
$f_i(\Delta d)$    expected mass fraction for the size interval $\Delta d$ on the $i$th stage of an impactor
$s_{f_i}$    estimate of the standard deviation of the mass fraction of the $i$th stage of an impactor
$s_{m_i}$    estimate of the standard deviation of mass collected on the $i$th stage of an impactor
$s_{y_i}$    estimate of the standard deviation of a dependent variable measured at a dependent level, $x_i$
$s_{\Phi_i}$    estimate of the standard deviation of the cumulative fraction of mass collected on the $i$th stage of an impactor
$s_{z_i}$    $s_{\Phi_i}$ normalized to compensate for scaling distortions when transformed to its equivalent normal standard variate, $z$
$x_i$    an independent variable indexed by $i$ from 1 to $n$
$y_i$    a dependent variable associated with the independent variable, $x_i$
$z$    normally distributed variable having a mean of 0 and standard deviation of 1
$z_i$    $z$ value associated with the $i$th stage of a cascade impactor
$\alpha$    intercept of a linear regression
$\beta$    slope of a linear regression
$\Phi(z)$    the cumulative distribution function of the standard normal distribution, $N(0, 1)$
$\sigma_g$    geometric standard deviation of the aerosol lognormal size distribution
$\sigma_i^2$    population variance of mass collected on the $i$th stage of an impactor

REFERENCES

Baron, P. A., and Heitbrink, W. A. (1993). Factors Affecting Aerosol Measurement Quality. In Aerosol Measurement: Principles, Techniques, and Applications, edited by K. Willeke and P. A. Baron. Van Nostrand Reinhold, New York, pp. 146–176.

Bevington, P. R., and Robinson, D. K. (1992). Data Reduction and Error Analysis for the Physical Sciences, 2nd ed., McGraw-Hill, New York.

Cooper, D. W. (1993). Methods of Size Distribution Data Analysis and Presentation. In Aerosol Measurement: Principles, Techniques, and Applications, edited by K. Willeke and P. A. Baron. Van Nostrand Reinhold, New York, pp. 146–176.

Dzubay, T. G., and Hasan, H. (1990). Fitting Multimodal Lognormal Size Distributions to Cascade Impactor Data, Aerosol Sci. Technol. 13:144–150.

Hinds, W. C. (1982). Aerosol Technology: Properties, Behavior, and Measurement of Airborne Particles, John Wiley & Sons, New York.

Hinds, W. C. (1986). Data Analysis. In Cascade Impactor: Sampling & Data Analysis, edited by J. P. Lodge and T. L. Chan. ACGIH, Inc., Cincinnati, OH.

Johnson, D., and Swift, D. (1997). Sampling and Sizing Particles. In The Occupational Environment—Its Evaluation and Control, edited by S. R. DiNardi. American Industrial Hygiene Association, Fairfax, VA.

Knutson, E. O., and Lioy, P. J. (1995). Measurement and Presentation of Aerosol Size Distributions. In Air Sampling Instruments for Evaluation of Atmospheric Contaminants, 8th ed., edited by B. S. Cohen and S. V. Hering. ACGIH, Cincinnati, OH, pp. 121–137.

Kottler, F. (1950). The Distribution of Particle Sizes, J. Franklin Inst. 250:339–352, 419–441.

Lodge, J. P., and Chan, T. L., eds. (1986). Cascade Impactor: Sampling & Data Analysis, ACGIH, Inc., Cincinnati, OH.

May, K. R. (1945). The Cascade Impactor: An Instrument for Sampling Coarse Aerosols, J. Sci. Instrum. (London) 22:187–195.

Neter, J., Kutner, M. H., Nachtsheim, C. J., and Wasserman, W. (1996). Applied Linear Statistical Models, 4th ed., Irwin, IL, pp. 531–552.

Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T. (1989). Numerical Recipes in Pascal: The Art of Scientific Computing, Cambridge University Press, New York.

Puttock, J. S. (1981). Data Inversion for Cascade Impactors: Fitting Sums of Log-Normal Distributions, Atmos. Environ. 15(9):1709–1716.

Raabe, O. G. (1971). Particle Size Analysis Utilizing Grouped Data and the Log-Normal Distribution, Aerosol Sci. Technol. 2:289–303.

Raabe, O. G. (1976). Aerosol Aerodynamic Size Conventions for Inertial Sampler Calibration, APCA Journal 26(9):856–860.

Raabe, O. G. (1978). A General Method for Fitting Size Distributions to Multicomponent Aerosol Data Using Weighted Least-Squares, Environ. Sci. Technol. 12(10):1162–1167.

Raabe, O. G., Braaten, D. A., Axelbaum, R. L., Teague, S. V., and Cahill, T. A. (1988). Calibration Studies of the DRUM Impactor, J. Aerosol Sci. 19:183–195.

Rader, D. J., Mondy, L. A., Brockmann, J. E., Lucero, D. A., and Rubow, K. L. (1991). Stage Response Calibration of the Mark III and Marple Personal Cascade Impactors, Aerosol Sci. Technol. 14:365–379.

Ramanchandran, G., and Vincent, J. H. (1997). Evaluation of Two Inversion Techniques for Retrieving Health-Related Aerosol Fractions from Personal Cascade Impactor Measurements, Am. Ind. Hyg. Assoc. J. 58:15–22.

Rubow, K. L., Marple, V. A., Olin, J., and McCawley, M. A. (1987). A Personal Cascade Impactor: Design, Evaluation and Calibration, Am. Ind. Hyg. Assoc. J. 48(6):532–538.

APPENDIX

Variance of Cumulative Fractions

For a cascade impactor with $n$ stages, with the top (largest cut diameter) stage numbered 1 and $n + 1$ the filter, each stage collects a mass of material $m_i$ (with standard deviation $s_{m_i}$) associated with particles larger than the cutoff diameter for that stage. The cumulative mass fraction, $F$, less than the cutoff diameter for a stage, $k$, is therefore the ratio of all mass collected by higher numbered stages to the total mass collected on all stages as defined below.

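$$F_k = \frac{\sum_{i=k+1}^{n+1} m_i}{\sum_{i=1}^{n+1} m_i} = \frac{v}{u + v} = \frac{1}{\frac{u}{v} + 1} = \frac{1}{w},$$

where $u$ and $v$ are independent of each other and given by

$$u = \sum_{i=1}^{k} m_i, \qquad v = \sum_{i=k+1}^{n+1} m_i.$$

Since $u$ and $v$ are unrelated, the variance of $(u/v)$ is given by

$$s_{(u/v)}^2 = \left[\frac{s_u^2}{u^2} + \frac{s_v^2}{v^2}\right]\left[\frac{u}{v}\right]^2,$$

where

$$s_u^2 = \sum_{i=1}^{k} s_{m_i}^2, \qquad s_v^2 = \sum_{i=k+1}^{n+1} s_{m_i}^2.$$

Also,

$$s_w^2 = s_{(u/v)}^2$$

so that the variance of the cumulative fraction for stage $k$ is

$$s_{F_k}^2 = \left[\frac{s_w^2}{w^2}\right][F_k]^2 = \left[\frac{s_w^2}{w^2}\right]\left[\frac{1}{w}\right]^2 = \frac{s_w^2}{w^4}.$$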
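The derivation above translates into a short calculation, sketched here in Python. The function name and the masses in the usage lines are hypothetical. The usage applies the assumption $s_{m_i}^2 = m_i$; a constant could be substituted to represent the alternative assumption.

```python
def cumulative_fraction_variance(masses, mass_variances, k):
    """Variance of F_k, the mass fraction below the cut diameter of stage k.

    masses and mass_variances list stages 1..n followed by the backup filter;
    k is the 1-based stage number (1 <= k <= n), and both u and v must be nonzero.
    """
    u = sum(masses[:k])                  # mass on stages 1..k
    v = sum(masses[k:])                  # mass on stages k+1..n+1
    s2_u = sum(mass_variances[:k])
    s2_v = sum(mass_variances[k:])
    w = u / v + 1.0                      # F_k = 1 / w
    s2_w = (s2_u / u ** 2 + s2_v / v ** 2) * (u / v) ** 2
    return s2_w / w ** 4


# Hypothetical masses for a 3-stage impactor plus backup filter.
masses = [1.0, 2.0, 3.0, 4.0]
variances = list(masses)                 # assumption: variance of m_i equals m_i
s2_F2 = cumulative_fraction_variance(masses, variances, k=2)
```

Substituting $w = (u + v)/v$ shows that the result equals $(s_u^2 v^2 + s_v^2 u^2)/(u + v)^4$, which gives a convenient check on a spreadsheet implementation.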

Confidence in the Parameter Estimates

The least squares criterion for the nonlinear response function $f_i(d_g, \sigma_g)$ was given in Equation (19) as

$$\sum \frac{[f_i - f_i(d_g, \sigma_g)]^2}{s_{f_i}^2} = \min.$$

If the 2 parameters, $d_g$ and $\sigma_g$, are expressed as a vector, $\gamma$, the response function can be rewritten as $f_i(\gamma)$, which is the mean response for the $i$th stage plus filter. To minimize the least squares criterion, the Gauss–Newton, or linearization, method uses a Taylor series expansion to approximate the nonlinear regression model with linear terms and then employs ordinary least squares to estimate the parameters (Neter et al. 1996). Considering the case where the weights, $1/s_{f_i}^2$, are not applied, then given starting values, $g_k^{(0)}$, for the 2 parameters, the mean responses for the $n$ stages and back up filter are approximated by the linear terms in the Taylor series expansion, which, for the $i$th stage, is

$$f_i(\gamma) \cong f_i(g^{(0)}) + \sum_{k=0}^{p-1} \left[\frac{\partial f_i(\gamma)}{\partial \gamma_k}\right]_{\gamma = g^{(0)}} \left(\gamma_k - g_k^{(0)}\right),$$

where $g^{(0)}$ is the vector of starting values for $\gamma$. If the notation is simplified as

$$\alpha_i^{(0)} = f_i(g^{(0)}), \qquad D_{ik}^{(0)} = \left[\frac{\partial f_i(\gamma)}{\partial \gamma_k}\right]_{\gamma = g^{(0)}}, \qquad \beta_k^{(0)} = \gamma_k - g_k^{(0)},$$

then

$$f_i(\gamma) \cong \alpha_i^{(0)} + \sum_{k=0}^{p-1} D_{ik}^{(0)} \beta_k^{(0)}$$

and therefore an approximation of the nonlinear regression model is

$$f_i \cong \alpha_i^{(0)} + \sum_{k=0}^{p-1} D_{ik}^{(0)} \beta_k^{(0)} + \varepsilon_i,$$

where $\varepsilon_i$ is the random error term for the $i$th case. Defining

$$F_i^{(0)} = f_i - \alpha_i^{(0)},$$

then the linear approximation can be expressed in matrix notation as

$$F^{(0)} \cong D^{(0)} \beta^{(0)} + \varepsilon,$$

which has the same form as the general matrix expression for a linear regression model:

$$Y = X\beta + \varepsilon.$$
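One way to see the linearization at work is the unweighted Gauss–Newton iteration sketched below, in which each pass solves the ordinary least squares problem $F^{(0)} \cong D^{(0)}\beta^{(0)}$ and updates the parameter vector by $\beta^{(0)}$. The sketch rests on several assumptions: the response function `ideal_fractions` uses ideal cut-offs as a stand-in for the calibrated efficiency curves, the cut diameters in `CUTS_UM` are illustrative, the derivatives are forward differences, and no step-size safeguard is included. It is not the spreadsheet's Solver routine.

```python
import math
from statistics import NormalDist

CUTS_UM = [21.3, 14.8, 9.8, 6.0, 3.5, 1.55, 0.93, 0.52]  # assumed cuts, largest first


def ideal_fractions(gamma, cuts=CUTS_UM):
    """Stand-in response function f_i(gamma): ideal-cut stage fractions plus filter."""
    dg, sigma_g = gamma
    nd = NormalDist()
    cum = [nd.cdf(math.log(c / dg) / math.log(sigma_g)) for c in cuts]
    return [1.0 - cum[0]] + [cum[i - 1] - cum[i] for i in range(1, len(cuts))] + [cum[-1]]


def jacobian(model, g, rel_step=1e-6):
    """Forward-difference matrix D with D[i][k] = d f_i / d gamma_k at g, plus f(g)."""
    base = model(g)
    columns = []
    for k in range(len(g)):
        h = rel_step * abs(g[k])
        shifted = list(g)
        shifted[k] += h
        columns.append([(fs - fb) / h for fs, fb in zip(model(shifted), base)])
    return [list(row) for row in zip(*columns)], base


def gauss_newton_step(model, g, observed):
    """One linearized update for 2 parameters: solve (D'D) beta = D'F, return g + beta."""
    D, alpha = jacobian(model, g)
    F = [fo - fa for fo, fa in zip(observed, alpha)]        # F_i = f_i - alpha_i
    a11 = sum(r[0] * r[0] for r in D)
    a12 = sum(r[0] * r[1] for r in D)
    a22 = sum(r[1] * r[1] for r in D)
    b1 = sum(r[0] * fi for r, fi in zip(D, F))
    b2 = sum(r[1] * fi for r, fi in zip(D, F))
    det = a11 * a22 - a12 * a12
    beta = [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]
    return [g[0] + beta[0], g[1] + beta[1]]


def gauss_newton(model, start, observed, iterations=20):
    """Repeat the linearized update a fixed number of times from the starting values."""
    g = list(start)
    for _ in range(iterations):
        g = gauss_newton_step(model, g, observed)
    return g


observed = ideal_fractions([4.0, 2.5])                      # synthetic, noise-free data
estimate = gauss_newton(ideal_fractions, [3.8, 2.4], observed)
```

As with the Solver routine, the outcome depends on the starting values, so values close to the expected solution, such as those from the Probit method, are the safer choice.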

Given this linear approximation of the nonlinear model, an approximate variance-covariance matrix of the regression coefficients can be estimated by

$$s^2\{g\} = \text{MSE}\,(D'D)^{-1},$$

where

$$\text{MSE} = \frac{\text{SSE}}{n - p} = \frac{\sum [f_i - f_i(\gamma)]^2}{n - p}.$$

With the application of weighted least squares, the least squares criterion given in Equation (19) can be expressed as

$$\sum \left[\frac{f_i}{s_{f_i}} - \frac{f_i(\gamma)}{s_{f_i}}\right]^2 = \min.$$

The variance-covariance matrix can therefore be evaluated as given above, except that each of the partial derivatives, $D_{ik}$, defined above must be multiplied by $1/s_{f_i}$, the square root of the weights, $w_i = 1/s_{f_i}^2$. Since each derivative is multiplied by every other derivative when evaluating $D'D$, this is equivalent to multiplying each pair of derivatives by the weights, $w_i$. In matrix form this involves creating an $n \times n$ matrix, $W = \text{diag}[w_i]$, consisting of the $w_i$ along the diagonal and all zeros elsewhere, and computing the variance-covariance matrix by

$$s^2\{g\} = \text{MSE}_w\,(D'WD)^{-1},$$

where

$$\text{MSE}_w = \frac{1}{n - p} \sum \frac{[f_i - f_i(\gamma)]^2}{s_{f_i}^2}.$$

Application to Impactor Inversion Routine. Once the Solver function has been used to determine the values of $d_g$ and $\sigma_g$ that minimize the criterion function, these optimal values can be used to determine the matrix of derivatives, $D$. The derivatives are first computed while holding $\sigma_g$ at its optimal value and varying $d_g$ by a very small amount, $\Delta d_g$ (say, $10^{-9}$). The derivative associated with each of the stage plus filter mass fractions is then

$$D_{ik} = \frac{f_i(d_g + \Delta d_g, \sigma_g) - f_i(d_g, \sigma_g)}{\Delta d_g}.$$

The same is performed while holding $d_g$ constant and varying $\sigma_g$ by a small amount. Assuming all 8 stages of the personal cascade impactor are used, this will then develop a 9 × 2 matrix of derivatives, $D$. Likewise, a 9 × 9 matrix, $W$, can be shown in the spreadsheet as a 9-row by 9-column table with the weights along the diagonal and all 0s elsewhere. The transpose of $D$, $D'$, can also easily be formulated in the spreadsheet. Matrix “manipulation” functions in the spreadsheet can then be used to multiply and invert the matrices in order to evaluate the 2 × 2 matrix $(D'WD)^{-1}$. The diagonal values of this matrix are then multiplied by $\text{MSE}_w$ to determine the variances of the estimates of $d_g$ and $\sigma_g$; the square roots of these variances are the standard errors. Finally, a 95% confidence interval about the estimates can be obtained by multiplying the standard errors by the $t$ value at $\alpha/2$ and $n - p$ degrees of freedom, which in this case is $t_{0.975,7} = 2.365$.
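The spreadsheet procedure just described can be sketched in Python as follows. The sketch is illustrative and rests on several assumptions: `ideal_fractions` uses ideal cut-offs as a stand-in for the calibrated efficiency curves, the cut diameters in `CUTS_UM` are assumed values, the "measured" fractions in the usage lines are synthetic, and the weights use the assumption $s_{f_i}^2 = f_i(1 - f_i)$. The default $t$ value applies only to 9 observations and 2 parameters (7 degrees of freedom).

```python
import math
from statistics import NormalDist

CUTS_UM = [21.3, 14.8, 9.8, 6.0, 3.5, 1.55, 0.93, 0.52]  # assumed cuts, largest first


def ideal_fractions(dg, sigma_g, cuts=CUTS_UM):
    """Stand-in for f_i(dg, sigma_g): ideal-cut stage fractions plus backup filter."""
    nd = NormalDist()
    cum = [nd.cdf(math.log(c / dg) / math.log(sigma_g)) for c in cuts]
    return [1.0 - cum[0]] + [cum[i - 1] - cum[i] for i in range(1, len(cuts))] + [cum[-1]]


def parameter_intervals(model, dg, sigma_g, observed, variances,
                        delta=1e-9, t_value=2.365):
    """Standard errors and confidence half-widths from MSE_w (D'WD)^-1."""
    fitted = model(dg, sigma_g)
    n, p = len(observed), 2
    weights = [1.0 / v for v in variances]                  # w_i = 1 / s_fi^2
    # Forward differences, varying one parameter at a time.
    col_d = [(a - b) / delta for a, b in zip(model(dg + delta, sigma_g), fitted)]
    col_s = [(a - b) / delta for a, b in zip(model(dg, sigma_g + delta), fitted)]
    # Elements of the symmetric 2 x 2 matrix D'WD.
    a11 = sum(w * x * x for w, x in zip(weights, col_d))
    a12 = sum(w * x * y for w, x, y in zip(weights, col_d, col_s))
    a22 = sum(w * y * y for w, y in zip(weights, col_s))
    det = a11 * a22 - a12 * a12
    inv_diag = (a22 / det, a11 / det)                       # diagonal of (D'WD)^-1
    mse_w = sum(w * (fo - ff) ** 2
                for w, fo, ff in zip(weights, observed, fitted)) / (n - p)
    std_errors = [math.sqrt(mse_w * v) for v in inv_diag]
    half_widths = [t_value * se for se in std_errors]
    return std_errors, half_widths


# Synthetic example: fitted values perturbed by +/- 2% to mimic measurement scatter.
fitted = ideal_fractions(4.0, 2.5)
observed = [f * (1.02 if i % 2 else 0.98) for i, f in enumerate(fitted)]
variances = [f * (1.0 - f) for f in observed]
std_errors, half_widths = parameter_intervals(ideal_fractions, 4.0, 2.5,
                                              observed, variances)
```

Only the diagonal of the inverse is needed for the standard errors, so the 2 × 2 inverse is written out explicitly rather than calling a matrix library.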