

ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY 39, 112–119 (1998)
ENVIRONMENTAL RESEARCH, SECTION B

ARTICLE NO. ES971615

Structure– and Property–Activity Relationship Models for Prediction of Microbial Toxicity of Organic Chemicals to Activated Sludge

N. Nirmalakhandan, E. Egemen, C. Trevizo, and S. Xu
Civil, Agricultural and Geological Engineering Department, New Mexico State University, Las Cruces, New Mexico 88003

Received August 25, 1997

Two quantitative structure–activity relationship (QSAR) and two quantitative property–activity relationship (QPAR) models reported in the literature for predicting toxicity of synthetic organic chemicals to activated sludge microorganisms are summarized and compared. The QSAR models were developed using solvatochromic parameters and molecular connectivity indices; the QPAR models, using octanol–water partition coefficient and aqueous solubility. Experimental data on 16 chemicals not used in developing the above models are used to compare and evaluate the predictive ability of these QSAR/QPAR models. Based on the quality of the original models, their predictions, ease of application, and availability of model parameters, molecular connectivity indices and log P appear to be the most suitable in toxicity predictions. © 1998 Academic Press

Key Words: microbial toxicity; organic chemicals; toxicity predictions; activated sludge; structure–activity relationship models; property–activity relationship models; respirometry.

INTRODUCTION

Toxicity of synthetic organic chemicals (SOCs) to microorganisms is an important consideration in assessing the chemicals’ environmental impacts against their economic benefits. Microorganisms play important roles in several environmental processes, both natural and engineered. In natural systems, microorganisms drive the nutrient cycle and are also at the base of the food chain. In engineered systems, microbial processes offer cost-effective options to control and mitigate environmental pollution by SOCs. In addition, their response to SOCs may be extrapolated to higher life forms, the testing of which is more expensive and elaborate (Cronin et al., 1991).

Traditionally, an assortment of microorganisms are used to biomineralize municipal and domestic wastewaters. These wastes are composed mainly of readily biodegradable organic chemicals; however, due to rapid industrialization, several SOCs at toxic levels have now been detected in municipal sewer lines, resulting in plant upsets and discharge permit violations (Volskay and Grady, 1988). Since most municipal wastewater treatment plants were not originally designed to receive such SOCs, plant operators and regulatory agencies are concerned about the toxic effects of SOCs to microorganisms. Advance knowledge of toxicity of SOCs can benefit plant operators in optimizing plant operation; setting pretreatment standards; protecting receiving water quality; establishing sewer discharge permits to safeguard the plant; and allocating waste load.

0147-6513/98 $25.00
Copyright © 1998 by Academic Press
All rights of reproduction in any form reserved.

Microbial processes have also emerged to be feasible technologies for treating industrial and hazardous wastes and for remediating groundwaters and soils contaminated with SOCs. While several SOCs have been demonstrated to be amenable to biological transformations into less harmful end products, success is severely limited by the toxic and inhibitory threshold levels of the SOCs. Prior knowledge of toxicity of SOCs may be of benefit in these applications too.

Even though numerous toxicological studies on higher aquatic life forms have been reported in the literature, systematic toxicity assays of SOCs on microorganisms used in waste mineralization have been undertaken only recently (Blum and Speece, 1990; Tang et al., 1992; Nirmalakhandan et al., 1993; Sun et al., 1994). Apart from the IC50 values, these data sets also encode valuable toxicological information: interspecies correlations; organism sensitivity; chemical structure–toxicity and chemical property–toxicity relationships, etc. Since the resources available for comprehensive and exhaustive toxicity testing of all the SOCs in current use are severely limited, it is prudent to maximize the knowledge that can be gleaned from available data on which considerable resources have been expended.

Quantitative structure–activity relationship (QSAR) and property–activity relationship (QPAR) techniques have emerged as rational tools to explore available test data for extracting the maximum possible information from them and for developing predictive models. The premise of these techniques is that properties and activities of organic molecules are well correlated to their molecular structures and basic physical/chemical properties. Using these techniques, it is now possible to derive models from available test results that could predict reliable data for related, untested chemicals. While the state-of-the-art QSAR and QPAR techniques cannot yet completely replace experimental testing, they are being extensively used to complement test data in design of new chemicals with desired characteristics (Cramer, 1980); environmental assessments (Borman, 1990); screening and grouping of similar chemicals for collective evaluation (Clements et al., 1993); prioritization of testing (Zeeman et al., 1993); classification of mechanisms and modes of toxic action (Bradbury, 1994); and prediction of joint toxicity in mixtures of chemicals (Nirmalakhandan et al., 1993; Hall et al., 1996; Xu and Nirmalakhandan, 1997).

Such models can conserve the limited resources available for toxicity testing in a timely manner and guide the regulatory risk assessment process in a rational and scientific manner. Recognizing these advantages, the U.S. Environmental Protection Agency (EPA), for example, has adapted more than 50 QPAR models for use on a daily basis for estimating toxicity of untested SOCs to several different fish species (EPA-560/6-88-001). In addition, EPA uses SAR methodology to assess the environmental fate, hazards, and risks of proposed new industrial chemicals prior to their commercial manufacture (Zeeman et al., 1993). The OECD and the German Federal Environmental Agency also use QSARs in similar regulatory actions (Fiedler et al., 1990).

QSAR/QPAR Models for Predicting Microbial Toxicity

The basic requirement for developing QSAR or QPAR models is a consistent and robust training data set (Cronin and Dearden, 1995). Based on the number of chemicals tested, the test procedures used, and the stated research objectives, the studies reported by Blum and Speece (1990), Tang et al. (1992), Nirmalakhandan et al. (1993), Sun et al. (1994), and Hall et al. (1996) can be considered internally consistent and well-designed microbial toxicity databases. They have been used by their respective authors to derive QSAR and QPAR models using solvatochromic parameters, molecular connectivity indices, octanol–water partition coefficient, and aqueous solubility. These parameters are described briefly below.

Solvatochromic parameters, $V_I$, $\pi^*$, $\alpha_m$, and $\beta_m$. The use of solvatochromic parameters in structure–activity studies was pioneered by Kamlet and co-workers. Their QSAR models were based on linear solvation energy relationships (LSERs), where four solvatochromic parameters are used in tandem: intrinsic molar volume, $V_I$; polarity/polarizability, $\pi^*$; and hydrogen bond donor acidity, $\alpha_m$, and basicity, $\beta_m$. The first parameter, $V_I$, could be calculated from the molecular structures; the other three parameters have to be determined experimentally or estimated using ground rules and from related chemicals. Solvatochromic values for about 300 chemicals could be found tabulated in several of the works of Kamlet and co-workers (e.g., Kamlet et al., 1983). Hickey and Passino-Reader (1991) have recently presented a compilation of ground rules for the estimation of these parameters.

Molecular connectivity indices. The concept of molecular connectivity indices (MCIs) was originally proposed by Randic (1975) and since has been extensively developed by Kier and Hall (1976, 1986). These indices encode structural and atomic information of a molecule and have been demonstrated to correlate well with a wide range of physical, chemical, and biological properties of chemicals. They can be calculated using a standard algorithm for any chemical following the methodology pioneered by Kier and Hall (1976, 1986). In this study, connectivity indices calculated by a slightly modified algorithm proposed by Nirmalakhandan (1988) are used.
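As an illustration of how such indices are computed, the sketch below evaluates zero-order and first-order connectivity indices from a hydrogen-suppressed molecular graph using the standard Randic/Kier–Hall sums. It is not the modified algorithm of Nirmalakhandan (1988) used in this study, and the function name and data layout are hypothetical. For a saturated hydrocarbon the valence delta of each carbon equals its number of bonded carbons, so cyclopentane can be checked against the values listed for it in Table 1 (3.54 and 2.50); heteroatoms would need valence delta values that this sketch does not derive.

```python
import math

def connectivity_indices(deltas, bonds):
    """Return (zero-order, first-order) connectivity indices.

    deltas: delta value of each non-hydrogen atom (supplied by the caller)
    bonds:  list of (i, j) atom-index pairs, one per bond
    """
    chi0 = sum(1.0 / math.sqrt(d) for d in deltas)
    chi1 = sum(1.0 / math.sqrt(deltas[i] * deltas[j]) for i, j in bonds)
    return chi0, chi1

# Cyclopentane: five ring carbons, each bonded to two others (delta = 2).
cyclopentane_deltas = [2, 2, 2, 2, 2]
cyclopentane_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]

chi0, chi1 = connectivity_indices(cyclopentane_deltas, cyclopentane_bonds)
print(round(chi0, 2), round(chi1, 2))  # 3.54 2.5
```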

Octanol–water partition coefficient. Use of the octanol–water partition coefficient as log P in structure–activity studies was pioneered by Hansch and co-workers, and has been the parameter of choice in numerous QPAR studies. Listings of log P values can be found in the numerous papers published in this area. These values can also be estimated using a group contribution method starting from a parent molecule or by a fragment contribution method starting from molecular fragments (Leo et al., 1975). However, in some cases, these estimation methods may yield different results for the same chemical, depending on the starting point of the estimations. In addition, for sparingly soluble chemicals the calculated log P values may deviate significantly from the experimentally measured values.

Aqueous solubility, S. The use of aqueous solubility in QPAR applications is relatively uncommon. Since log S has been correlated with the solvatochromic parameters (e.g., Kamlet et al., 1987), with MCIs (e.g., Nirmalakhandan and Speece, 1988, 1989), and with P (e.g., Hansch et al., 1968), it has been hypothesized that toxicity could be directly correlated with S (Trevizo and Nirmalakhandan, 1997). Experimentally measured S data for a large number of chemicals have been tabulated in the literature; alternatively, they can be estimated for a wide range of chemicals with a high degree of reliability (Kamlet et al., 1987; Nirmalakhandan and Speece, 1988, 1989).

Objective of This Study

Most of the QSAR/QPAR models reported in the literature, including those for toxicity, have been of a data-fitting nature. End users of QSAR/QPAR models would be reluctant to adopt results of such studies unless the predictive ability of the models is demonstrated on external testing data sets. The primary objective of this study is to evaluate the predictive ability of the two QSAR and two QPAR models reported in the literature for microbial toxicity.

Toxicity was quantified by IC50, the dose of the test chemical that inhibited microbial respiration by 50% compared with an undosed control. For the comparison, activated sludge microorganisms are chosen as common target organisms, and 16 SOCs are chosen as the test set of chemicals. Activated sludge was chosen because of its common use in municipal wastewater treatment plants. The chemicals were chosen so as to contain a mixture of structural and elemental features that were represented only individually in the original training sets used in developing the QSAR/QPAR models. Models reported originally by Blum and Speece (1990), Hall et al. (1996), and Trevizo and Nirmalakhandan (1997) are chosen here for comparison.

METHODS AND MATERIALS

QSAR/QPAR Modeling of IC50

A brief description of each of the QSAR/QPAR models compiled from the literature is presented below.

LSER model. Blum and Speece (1990) reported toxicity data and QSAR models for activated sludge (as well as methanogens, nitrifiers, and Microtox, a commercial microbial test culture). The following LSER-based QSAR model for activated sludge was reported in that study (where the IC50 values were in micromoles per liter; these have been converted to millimoles per liter in this study):

$$\log \mathrm{IC}_{50}\ (\text{mmol/liter}) = 2.24 - 4.15\,V_I/100 + 3.71\,\beta_m - 0.41\,\alpha_m, \quad n = 52,\ r^2 = 0.92. \tag{1}$$

The training data set used by Blum and Speece (1990) included 8 of the 16 test set of chemicals assayed in this study. Therefore, those 8 chemicals were deleted from their data set to derive an LSER-based model in this study using the remaining 44 chemicals as the training set. The resulting model is similar in form and quality to their original one:

$$\log \mathrm{IC}_{50}\ (\text{mmol/liter}) = 1.99 - 3.74\,V_I/100 + 3.65\,\beta_m - 0.30\,\alpha_m, \quad n = 44,\ r^2 = 0.88. \tag{2}$$
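A minimal sketch of how Eq. (2) might be evaluated is given below, assuming the equation reads log IC50 = 1.99 - 3.74 (V_I/100) + 3.65 β_m - 0.30 α_m with IC50 in mmol/liter. The function names are hypothetical. The cyclopentane inputs are the Table 1 entries (V_I/100 = 0.50, β_m = 0, α_m = 0), for which Table 2 lists a Model 1 prediction of 1.3 mmol/liter.

```python
def lser_log_ic50(vi_over_100, beta_m, alpha_m):
    """log10 IC50 (mmol/liter) from Eq. (2), as read in the lead-in above."""
    return 1.99 - 3.74 * vi_over_100 + 3.65 * beta_m - 0.30 * alpha_m

def lser_ic50(vi_over_100, beta_m, alpha_m):
    """IC50 in mmol/liter, obtained by undoing the base-10 logarithm."""
    return 10 ** lser_log_ic50(vi_over_100, beta_m, alpha_m)

# Cyclopentane (Table 1): V_I/100 = 0.50, beta_m = 0.00, alpha_m = 0.00
print(round(lser_ic50(0.50, 0.0, 0.0), 1))  # 1.3
```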

MCI model. Hall et al. (1996) determined the toxicity of 62 chemicals to activated sludge and developed QSAR models following the molecular connectivity approach. The zero-order ($^0\chi^v$) and first-order ($^1\chi^v$) valence connectivity indices were used to derive QSAR models for subsets of the 62 chemicals. The following models are chosen from their study for evaluation: for alcohols,

$$\log \mathrm{IC}_{50}\ (\text{mmol/liter}) = 3.43 - 0.75\,{}^1\chi^v, \quad n = 14,\ r^2 = 0.83; \tag{3}$$

for aromatics,

$$\log \mathrm{IC}_{50}\ (\text{mmol/liter}) = 3.54 - 1.28\,{}^1\chi^v, \quad n = 12,\ r^2 = 0.66; \tag{4}$$

for haloaliphatics,

$$\log \mathrm{IC}_{50}\ (\text{mmol/liter}) = 2.92 - 0.48\,{}^0\chi^v, \quad n = 14,\ r^2 = 0.79; \tag{5}$$

for alkanes,

$$\log \mathrm{IC}_{50}\ (\text{mmol/liter}) = 2.02 - 0.63\,{}^1\chi^v, \quad n = 11,\ r^2 = 0.75. \tag{6}$$
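Because each chemical class has its own equation, applying the MCI approach means first selecting the model by class. The sketch below shows one way to organize Eqs. (3) to (6), assuming they read log IC50 = intercept - slope × index, with the zero-order valence index used only for haloaliphatics. The dictionary layout, key names, and function name are hypothetical; the class codes and index values are those of Table 1.

```python
# (intercept, slope, index used) for Eqs. (3)-(6); class codes as in Table 1.
MCI_MODELS = {
    "Alc": (3.43, 0.75, "chi1v"),  # alcohols, Eq. (3)
    "Aro": (3.54, 1.28, "chi1v"),  # aromatics, Eq. (4)
    "Hal": (2.92, 0.48, "chi0v"),  # haloaliphatics, Eq. (5)
    "Alk": (2.02, 0.63, "chi1v"),  # alkanes, Eq. (6)
}

def mci_ic50(code, chi0v, chi1v):
    """Predicted IC50 (mmol/liter) from the class-specific MCI equation."""
    intercept, slope, index = MCI_MODELS[code]
    chi = chi0v if index == "chi0v" else chi1v
    return 10 ** (intercept - slope * chi)

# Index values taken from Table 1.
print(round(mci_ic50("Alc", 5.05, 2.37), 1))  # 2,2,2-trichloroethanol
print(round(mci_ic50("Hal", 5.39, 3.07), 1))  # 1,2,3-trichloropropane
```

A chemical whose class code is not among the four keys raises a `KeyError`, which mirrors the limitation that these equations apply only within their congeneric groups.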

Log P model. Blum and Speece (1990) evaluated the octanol–water partition coefficient as a QPAR parameter to model toxicity to activated sludge (as well as methanogens, nitrifiers, and Microtox, a commercial microbial test culture). The following log P-based QPAR model for activated sludge was selected for comparison from their results (where the IC50 values were in micromoles per liter; these have been converted to millimoles per liter in this study):

$$\log \mathrm{IC}_{50}\ (\text{mmol/liter}) = 2.12 - 0.76 \log P, \quad n = 53,\ r^2 = 0.82. \tag{7}$$

The training set of the above model contained five of the chemicals assayed in this study; therefore, their model was redeveloped excluding the five chemicals to yield a very similar model:

$$\log \mathrm{IC}_{50}\ (\text{mmol/liter}) = 2.11 - 0.74 \log P, \quad n = 48,\ r^2 = 0.87. \tag{8}$$

Log S model. Trevizo and Nirmalakhandan (1997) evaluated aqueous solubility to derive a QPAR model for toxicity to activated sludge (as well as methanogens, nitrifiers, Nitrobacter, Polytox, and Microtox). The following solubility-based QPAR model for activated sludge was selected from their study for comparison:

$$\log \mathrm{IC}_{50}\ (\text{mmol/liter}) = -0.10 + 0.61 \log S\ (\text{mmol/liter}), \quad n = 33,\ r^2 = 0.76. \tag{9}$$
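The two property-based models are single-variable and can be sketched side by side. The code below assumes Eq. (8) reads log IC50 = 2.11 - 0.74 log P and Eq. (9) reads log IC50 = -0.10 + 0.61 log S, with S and IC50 in mmol/liter; the function names are hypothetical. The inputs are the Table 1 entries for 1,2,3-trichloropropane (log P = 1.98, log S = 1.11).

```python
def logp_ic50(log_p):
    """Predicted IC50 (mmol/liter) from Eq. (8), the log P model."""
    return 10 ** (2.11 - 0.74 * log_p)

def logs_ic50(log_s):
    """Predicted IC50 (mmol/liter) from Eq. (9); log_s is log10 of S in mmol/liter."""
    return 10 ** (-0.10 + 0.61 * log_s)

# 1,2,3-Trichloropropane (Table 1): log P = 1.98, log S = 1.11
print(round(logp_ic50(1.98), 2))
print(round(logs_ic50(1.11), 2))
```

A chemical with no tabulated log S (the "na" entries in Table 1) simply cannot be passed to the second function, which is why the log S model covers fewer test chemicals.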

Experimental Measurement of IC50

IC50 values for the test set of 16 chemicals on activated sludge were determined in this study using a computer-interfaced, 12-reactor Comput-OX Respirometer (N-CON Systems, Crawford, GA). Details of this system have been presented elsewhere (Nirmalakhandan et al., 1993).

Preparation of activated sludge. The activated sludge test cultures were obtained fresh daily from the return line to the head end of the aeration tank at the nearby Las Cruces Wastewater Treatment Plant, which receives mainly municipal sewage. Each of the respirometric reactors was seeded with 10 ml of activated sludge and topped with tap water to bring the final volume to 60 ml. All the reactors were then capped with potassium hydroxide pellets in holders attached to the caps.

Determination of IC50. The toxicant dose was dissolved in 2 ml of acetone and injected into the liquid phase of the seeded reactors. During the test, the carbon dioxide produced by the organisms is absorbed by the KOH pellets, and the vacuum caused in the head space opens a valve to admit a precise amount of oxygen. The amount of oxygen consumed is recorded by the computer to determine the oxygen uptake rate of each of the reactors. The percentage inhibition at different concentrations of the toxicant was taken as the reduction in oxygen uptake rate of the spiked reactor compared with that of the control reactor. The percentage inhibition values were then plotted against the respective concentrations, from which the concentration causing 50% inhibition, IC50, was found (see Hall et al., 1996).
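The text reads IC50 off a plot of percentage inhibition against concentration. One way this could be mimicked numerically is sketched below. The interpolation on a log-concentration axis is an assumption of this sketch, not a procedure stated in the text, and the function names and example numbers are hypothetical.

```python
import math

def percent_inhibition(rate_dosed, rate_control):
    """Reduction in oxygen uptake rate relative to the control, in percent."""
    return 100.0 * (1.0 - rate_dosed / rate_control)

def ic50_by_interpolation(concentrations, inhibitions):
    """Concentration giving 50% inhibition, interpolated between the two
    bracketing doses (linear in log10 concentration; an assumed choice)."""
    points = sorted(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi:
            if i_hi == i_lo:
                return c_lo
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("50% inhibition is not bracketed by the tested doses")

# Hypothetical doses (mmol/liter) and inhibitions (%), for illustration only.
print(round(ic50_by_interpolation([1.0, 10.0], [25.0, 75.0]), 2))  # 3.16
```

If no pair of doses brackets 50% inhibition, the function raises an error rather than extrapolating, since extrapolated IC50 values would not be supported by the measurements.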

TABLE 1
QSAR/QPAR Parameters for the Chemicals Assayed

QSAR/QPAR parameter^b: solvatochromic ($V_I/100$, $\beta_m$, $\alpha_m$); connectivity ($^0\chi^v$, $^1\chi^v$); Log P; Log S

| ID No. | Chemical | Code^a | $V_I/100$ | $\beta_m$ | $\alpha_m$ | $^0\chi^v$ | $^1\chi^v$ | Log P | Log S |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2,2,2-Trichloroethanol | Alc | 0.51 | 0.92 | 0.51 | 5.05 | 2.37 | 1.54 | na^c |
| 2 | 2,2-Dichloroethanol | Alc | 0.42 | 0.77 | 0.45 | 3.99 | 2.03 | 0.49 | na |
| 3 | 1,2-Dichloro-2-methyl propane | Hal | 0.64 | 0.10 | 0.00 | 5.47 | 2.72 | 3.03 | na |
| 4 | 1,2,3-Trichloropropane | Hal | 0.63 | 0.10 | 0.00 | 5.39 | 3.07 | 1.98 | 1.11 |
| 5 | Cyclopentane | Alk | 0.50 | 0.00 | 0.00 | 3.54 | 2.50 | 2.05 | 0.34 |
| 6 | 1,1,2-Trichloroethane | Hal | 0.52 | 0.10 | 0.00 | 4.68 | 2.51 | 2.05 | 1.52 |
| 7 | 1,3-Dichloropropene | Hal | 0.54 | 0.05 | 0.00 | 4.12 | 2.19 | 1.60 | na |
| 8 | m-Cresol | Aro | 0.63 | 0.34 | 0.58 | 4.75 | 2.54 | 1.97 | 2.34 |
| 9 | p-Cresol | Aro | 0.63 | 0.34 | 0.58 | 4.75 | 2.54 | 1.97 | 2.35 |
| 10 | 2-Nitrophenol | Aro | 0.68 | 0.57 | 0.76 | 4.09 | 2.17 | 1.85 | 1.29 |
| 11 | 4-Nitrophenol | Aro | 0.68 | 0.32 | 0.93 | 4.09 | 2.25 | 1.85 | 2.17 |
| 12 | 2,4-Dinitrophenol | Aro | 0.82 | 0.77 | 0.92 | 4.50 | 2.37 | 1.91 | 1.65 |
| 13 | 2,4-Dichlorophenol | Aro | 0.72 | 0.18 | 0.78 | 5.94 | 3.09 | 3.07 | 1.44 |
| 14 | 2,3,4-Trichlorophenol | Aro | 0.81 | 0.08 | 0.87 | 7.00 | 3.58 | 3.85 | na |
| 15 | 2,3,5-Trichlorophenol | Aro | 0.81 | 0.08 | 0.87 | 7.00 | 3.57 | 3.85 | na |
| 16 | 2,4-Dinitrotoluene | Aro | 0.87 | 0.54 | 0.32 | 5.05 | 2.65 | 2.15 | na |

^a Alc, alcohols; Hal, halogenated aliphatics; Alk, alkanes; Aro, aromatics.
^b P = octanol–water partition coefficient (-); S = aqueous solubility (mol/liter).
^c Data not available.

RESULTS

The database consisting of the 16 test chemicals assayed in this study and the QSAR/QPAR model parameters for the four models selected for comparison are tabulated in Table 1. The IC50 values measured in this study, the IC50 values predicted by the respective models, and the corresponding factors of error are presented in Table 2 for the test set of 16 chemicals. The factor of error is calculated as the ratio of the predicted IC50 value to the measured value; if the ratio is less than 1, then its inverse is used. The salient features of the four models and the quality of their predictions are summarized in Table 3.

TABLE 2
Comparison of IC50 Values Predicted by Four QSAR/QPAR Models

Measured IC50 (mmol/liter, this study); predicted IC50 (mmol/liter) and factor of error, FE (-)

| ID No. | Chemical | Code^a | Measured IC50 | Model 1 IC50 | Model 1 FE | Model 2 IC50 | Model 2 FE | Model 3 IC50 | Model 3 FE | Model 4 IC50 | Model 4 FE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2,2,2-Trichloroethanol | Alc | 18.1 | 1922.5 | 106.0 | 44.9 | 2.5 | 8.9 | 2.0 | – | – |
| 2 | 2,2-Dichloroethanol | Alc | 76.8 | 1233.0 | 16.0 | 80.8 | 1.1 | 55.9 | 1.4 | – | – |
| 3 | 1,2-Dichloro-2-methyl propane | Hal | 5.0 | 0.9 | 5.4 | 2.0 | 2.5 | 0.7 | 7.6 | – | – |
| 4 | 1,2,3-Trichloropropane | Hal | 3.8 | 1.0 | 3.9 | 2.2 | 1.8 | 4.1 | 1.1 | 3.8 | 1.0 |
| 5 | Cyclopentane | Alk | 2.3 | 1.3 | 1.8 | 2.8 | 1.2 | 3.7 | 1.6 | 1.3 | 1.8 |
| 6 | 1,1,2-Trichloroethane | Hal | 7.7 | 2.6 | 3.0 | 4.7 | 1.6 | 3.6 | 2.1 | 6.7 | 1.1 |
| 7 | 1,3-Dichloropropene | Hal | 3.3 | 1.4 | 2.4 | 8.8 | 2.6 | 8.0 | 2.4 | – | – |
| 8 | m-Cresol | Aro | 5.4 | 4.9 | 1.1 | 1.9 | 2.8 | 4.2 | 1.3 | 21.3 | 4.0 |
| 9 | p-Cresol | Aro | 4.8 | 4.9 | 1.0 | 1.9 | 2.5 | 4.2 | 1.1 | 21.6 | 4.5 |
| 10 | 2-Nitrophenol | Aro | 2.3 | 20.6^b | 9.0 | 5.8 | 2.5 | 5.2 | 2.3 | 4.9 | 2.1 |
| 11 | 4-Nitrophenol | Aro | 0.9 | 2.2 | 2.5 | 4.6 | 5.1 | 5.2 | 5.8 | 16.7 | 18.6 |
| 12 | 2,4-Dinitrophenol | Aro | 0.9 | 29.7 | 32.3 | 3.2 | 3.5 | 4.7 | 5.1 | 8.1 | 8.8 |
| 13 | 2,4-Dichlorophenol | Aro | 0.5 | 0.5 | 1.2 | 0.4 | 1.2 | 0.6 | 1.4 | 6.0 | 13.3 |
| 14 | 2,3,4-Trichlorophenol | Aro | 0.2 | 0.1 | 1.6 | 0.1 | 1.8 | 0.2 | 1.0 | – | – |
| 15 | 2,3,5-Trichlorophenol | Aro | 0.1 | 0.1 | 1.4 | 0.1 | 1.5 | 0.2 | 1.1 | – | – |
| 16 | 2,4-Dinitrotoluene | Aro | 1.1 | 4.1 | 3.8 | 1.4 | 1.3 | 3.1 | 2.8 | – | – |
| | Average factor of error | | | | 12.0 | | 2.2 | | 2.5 | | 6.1 |

^a Alc, alcohols; Hal, halogenated aliphatics; Alk, alkanes; Aro, aromatics.
^b Values in boldface indicate that measured toxicity is significantly greater than predicted toxicity.

TABLE 3
Summary of Findings

Comparison of model features

| | LSER Eq. (2) | MCI Eq. (3) | MCI Eq. (4) | MCI Eq. (5) | MCI Eq. (6) | LogP Eq. (8) | LogS Eq. (9) |
|---|---|---|---|---|---|---|---|
| Chemicals covered^a | Mixed | Alc | Aro | Hal | Alk | Mixed | Mixed |
| No. of variables, v | 3 | 1 | 1 | 1 | 1 | 1 | 1 |
| No. of cases, n | 44 | 14 | 12 | 14 | 11 | 48 | 33 |
| No. of cases per variable, j | 15 | 14 | 12 | 14 | 11 | 48 | 33 |
| Correlation coefficient, $r^2$ | 0.88 | 0.83 | 0.66 | 0.79 | 0.75 | 0.75 | 0.76 |
| Adjusted $r^2$ | 0.88 | 0.82 | 0.63 | 0.77 | 0.72 | 0.75 | 0.75 |
| RMS residual | 0.36 | 0.30 | 0.39 | 0.22 | 0.23 | 0.48 | 0.41 |
| Probability, P | 0.0001 | <0.0001 | 0.0013 | <0.0001 | 0.0006 | 0.0001 | 0.0001 |

Comparison of model predictions

| | LSER | MCI (overall for MCI models) | LogP | LogS |
|---|---|---|---|---|
| No. of cases tested | 16 | 16 | 16 | 9 |
| $r^2$ for predicted vs measured | 0.44 | 0.90 | 0.95 | ns^b |
| Probability of correlation, P | 0.0053 | 0.0001 | 0.0001 | ns |
| Average factor of error, AFE | 12 | 2.2 | 2.5 | 6.1 |
| Cases with AFE < 2.5 | 50% | 81% | 69% | 44% |

^a Alc, alcohols; Aro, aromatics; Hal, halogenated aliphatics; Alk, alkanes.
^b Statistically not significant.

DISCUSSION

The models are compared below on the basis of statistical validity, applicability and ease of use, and predictive ability.

Statistical Validity

In addition to the basic statistics reported for each model under Methods and Materials, a key factor to be considered is the number of cases per independent variable. In linear regression analysis, the number of cases per independent variable, j, should be high enough to avoid chance correlations (Topliss and Costello, 1972). It is generally accepted that j should be at least 10 and preferably greater than 20. The two QPAR models satisfy this criterion more than adequately. Of the two QSAR models, the molecular connectivity-based model suffers most from this criterion because different congeneric groups of chemicals require different models to explain the variance among the cases. Due to the small number of chemicals covered by each of the equations, their j values range from 11 to 14 (Table 3).


Even though the validity of these models has been documented and justified using a variety of statistical tests (Hall et al., 1996), end users may be reluctant to apply such models unless their predictive ability on a wider range of chemicals is well demonstrated. Although the LSER model given by Eq. (2) covers 44 cases, it needs three independent variables to explain 88% of the variance among the cases and is thus only slightly better than the MCI models, with j = 15.

The balance between number of cases covered, n, and number of independent variables, v, required to explain a reasonable percentage of the variance among the cases is often a subject of debate. Because of the limitations of the resources available for toxicity testing, on one hand, and the information content of the molecular descriptors, on the other, a compromise has to be made. The adjusted $r^2$ is a statistical measure that can compare models with different v and n combinations. Based on the adjusted $r^2$ values calculated in this study for the different models, all of them except the MCI model for aromatic compounds can be seen to be acceptable (Table 3).
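The two quantities discussed here, cases per variable and adjusted r2, can be sketched as follows. The text does not state the formula used for the adjusted r2; the code assumes the standard definition, 1 - (1 - r2)(n - 1)/(n - v - 1), which reproduces most of the Table 3 entries from the rounded r2 values. The function names are hypothetical.

```python
def cases_per_variable(n, v):
    """Number of cases per independent variable, j = n / v."""
    return n / v

def adjusted_r2(r2, n, v):
    """Standard adjusted r^2 (assumed; the formula is not given in the text)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - v - 1)

# LSER model, Eq. (2): n = 44 cases, v = 3 variables
print(round(cases_per_variable(44, 3)))    # 15

# MCI model for alcohols, Eq. (3): r^2 = 0.83, n = 14, v = 1
print(round(adjusted_r2(0.83, 14, 1), 2))  # 0.82
```

The penalty grows as n shrinks or v grows, which is why the adjusted value for the 12-case aromatics model (0.66 falling to 0.63) drops more than that of the larger data sets.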

Another consideration in regression analysis is that intercorrelation between the independent variables should be minimal when the model requires multiple variables. For the LSER model presented here with three independent variables, intercorrelation between them was found to be negligible, with a maximum $r^2$ of $-0.29$ between $V_I/100$ and $\beta_m$ for the chemicals used in developing the model.

As a third consideration, linear regression models have to be statistically significant to ensure that the model is not due to chance correlations. A criterion that is often used to test for significance is the P value. It is generally accepted that P < 0.05 implies borderline significance, P < 0.01 implies significance, and P < 0.005 implies high significance. All the models evaluated in this study were found to be statistically highly significant as indicated by their respective P values (Table 3). In other words, these models encode a systematic variation of toxicity with the respective structural features or properties, and the relationship is not due to pure chance alone.

Applicability and Ease of Use

From the end users’ perspective, the model parameters should be readily available for a wide range of chemicals, error free, and reliable, and the models themselves should be applicable to a wide range of chemicals. Based on the availability of model parameters, the molecular connectivity-based approach proved to be the most convenient because a standard algorithm was available to calculate all the parameters starting from the molecular structures. While a computer program was used in this study to determine the MCIs, the calculations can be manually performed with ease.

Solvatochromic parameters for only 11 of the chemicals could be found in the literature; values for the remaining five chemicals (ID Nos. 1, 2, 10, 12, and 16) had to be estimated using the rules reported in the literature (Hickey and Passino-Reader, 1991). The estimated solvatochromic values could not be verified independently. The rules for estimating the solvatochromic parameters are far from rigid and consistent; their application to "new" chemicals requires considerable chemical insight, intuition, and judgement; and several optional corrections, modifications, and adjustments have to be made as deemed necessary by the user (Hickey and Passino-Reader, 1991).

The log P values used in this QPAR work are calculated rather than experimental. Due to the high uncertainties in the measured log P values and the availability of a simple algorithm for their calculation, the calculated values are recommended to ensure consistent and reproducible results. However, for some chemicals, the calculated values may not yield unique values.

The S values used here are all experimental values found from the literature. Values for S were not readily available for seven of the chemicals assayed. Even for the common chemicals for which experimentally measured data exist, significant discrepancies can be noted; and the experimental errors may be large, particularly for the sparingly soluble ones which are environmentally relevant. However, since toxicity is a solubility-related phenomenon, correlations with aqueous solubility may be of significance in further studies in mechanistic understanding of microbial toxicity.

In terms of applicability, the LSER model has to be rated below the MCI model. For instance, even though the training set for the LSER model had eight chlorinated phenols, its predictions for the three chlorophenols (ID Nos. 13–15) in the training set of this study are not better than those for the MCI model, the training set of which did not include any phenols at all. For the two halogenated aliphatics tested (ID Nos. 3 and 4), the quality of the predictions by the LSER model is inferior to that of the MCI model despite the fact that both models were trained on several such chemicals. It is interesting to note that the MCI model predicted remarkably well for the two chlorinated alcohols (ID Nos. 1 and 2) even though its training set did not contain any. From these results it is deduced that the MCI approach can perform well even for molecules with multiple atomic constituents as long as those constituents are adequately represented individually in the training set. Similar results were found in other MCI-based QSAR studies on aqueous solubility, Henry’s constant, etc. (Nirmalakhandan, 1988).

Predictive Ability

The predictive ability is compared on the basis of two factors: first, the degree of agreement between the predicted and measured IC50 values, and second, the factor of error in the predictions. Considering the reproducibility of IC50 values measured by the respirometric method for microbial cultures, the ease of use of QSAR/QPAR models, and the quality of fit of the state-of-the-art QSAR/QPAR models, it is proposed that an average factor of error, AFE, of 2.5 be considered an acceptable criterion.

FIG. 2. Comparison of predictive factor of error by four models.
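
The AFE criterion can be illustrated with a minimal sketch. It assumes that the factor of error for a chemical is the ratio of predicted to measured IC50, inverted where necessary so that it is never below 1; if the definition used elsewhere in this paper differs, that definition takes precedence. The function names and the IC50 values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def factor_of_error(measured, predicted):
    """Fold difference between a measured and a predicted IC50 (always >= 1).

    Assumed definition: predicted/measured, inverted if below 1.
    """
    if measured <= 0 or predicted <= 0:
        raise ValueError("IC50 values must be positive")
    ratio = predicted / measured
    return ratio if ratio >= 1 else 1 / ratio


def average_factor_of_error(measured, predicted):
    """Arithmetic mean of the per-chemical factors of error (AFE)."""
    return mean(factor_of_error(m, p) for m, p in zip(measured, predicted))


def fraction_within(measured, predicted, threshold=2.5):
    """Fraction of chemicals whose factor of error does not exceed threshold."""
    errors = [factor_of_error(m, p) for m, p in zip(measured, predicted)]
    return sum(e <= threshold for e in errors) / len(errors)


# Hypothetical IC50 values (not data from this study).
measured_ic50 = [10.0, 50.0, 200.0, 8.0]
predicted_ic50 = [20.0, 25.0, 200.0, 40.0]

afe = average_factor_of_error(measured_ic50, predicted_ic50)
share = fraction_within(measured_ic50, predicted_ic50)
print(f"AFE = {afe:.2f}; fraction within 2.5 = {share:.2f}")
```

Taking the ratio in whichever direction is at least 1 treats over- and underprediction symmetrically, so a prediction of half the measured value counts the same as one of twice the measured value.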

The overall quality of the predictions by the four models is illustrated in Fig. 1. Based on the statistically significant (P = 0.0001) correlation between the measured and predicted IC50 values, the log P method ranks best (r² = 0.95), followed closely by the MCI model (r² = 0.90). The LSER model was not satisfactory (r² = 0.44), with a marginally significant (P = 0.0053) correlation, although it covered only nine chemicals. No statistically significant correlation was found between the measured IC50 values and those predicted by the log S model. The breakdown of the latter two models may be due to the inadequacies of the models themselves and/or to the fact that the model parameters (i.e., the solvatochromic parameters and S) themselves may be erroneous. At this point it is not possible to distinguish between the two.
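
The agreement between measured and predicted values can be quantified as in the following minimal sketch, which computes r² and the t statistic (with n - 2 degrees of freedom) for the Pearson correlation. Whether the values were correlated on a linear or logarithmic scale is not stated here, so the sketch simply uses the values as supplied; the function names and the numbers are hypothetical. The significance level P would then be read from the t distribution, which is left out to keep the example dependency-free.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length, n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def correlation_summary(measured, predicted):
    """Return (r squared, t statistic with n - 2 degrees of freedom)."""
    r = pearson_r(measured, predicted)
    n = len(measured)
    r2 = r * r
    if r2 >= 1.0:
        # Perfect correlation: the t statistic is unbounded.
        return 1.0, math.copysign(math.inf, r)
    t = r * math.sqrt((n - 2) / (1.0 - r2))
    return r2, t


# Hypothetical values (not data from this study).
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [2.0, 1.0, 4.0, 3.0, 5.0]
r2, t = correlation_summary(measured, predicted)
print(f"r2 = {r2:.2f}, t = {t:.2f} with {len(measured) - 2} degrees of freedom")
```

Because the t statistic depends on n, a small comparison set such as the nine chemicals covered by the LSER model needs a considerably higher r² to reach the same significance level as a larger set.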

Based on the overall AFE, the MCI models and the log P model can be considered to be good predictors of microbial toxicity; their predictions for 75% of the test chemicals are within a factor of error of 2.5. The predictions of the LSER and log S models for 50% of the test chemicals are above the acceptable factor of error (Table 3). With all four models, the nitroaromatic chemicals (ID Nos. 10–12 and 16) were found to be more toxic than predicted; this is in agreement with the observations of other researchers that these chemicals are known to act reactively. All four models predict poorly for 2,4-dinitrophenol, with AFE > 2.5, again confirming its highly reactive toxic mechanism (Fig. 2).

FIG. 1. Comparison of experimental IC50 values and predicted values. Note: r² for log S model was not significant.

The overall quality of the LSER method is particularly impaired by 2,2,2-trichloroethanol and 2,2-dichloroethanol. Since its predictions for the remaining chemicals are acceptable and comparable to the predictions made by the other models, the estimated solvatochromic values for these two chemicals are suspect. The β values of 0.92 and 0.77, in particular, "appear" to be too high. These values were estimated (by Hickey, e-mail communication, 1997) using the ground rules (Hickey and Passino-Reader, 1991), by adding contributions for two aliphatic carbons, an aliphatic hydroxyl group, and three or two aliphatic chlorines. These values may have to be corrected by allowing for diminishing contributions by the successive additions of chlorines or by using a leveling factor of 0.8 to 0.9. Even with such adjustments, the predictions for these two chemicals are not acceptable. Anomalies such as this and similar difficulties in establishing the solvatochromic parameters have greatly discouraged the application of the LSER approach by a wider range of researchers.
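
The two corrections mentioned above can be sketched as follows. The sketch assumes that the leveling factor multiplies the summed β value, and that "diminishing contributions" means each successive chlorine adds a fixed fraction of the previous one; neither interpretation is spelled out here, and the function names and the decay parameter are hypothetical. The β values 0.92 and 0.77 are those quoted above, assumed to refer to the trichloro and dichloro compounds in the order listed.

```python
def leveled_beta(beta_sum, leveling_factor):
    """Scale an additively estimated beta by a leveling factor in (0, 1]."""
    if not 0 < leveling_factor <= 1:
        raise ValueError("leveling factor must lie in (0, 1]")
    return beta_sum * leveling_factor


def diminishing_sum(increment, count, decay):
    """Total of `count` repeated group contributions, where the k-th
    addition (k = 0, 1, ...) is scaled by decay**k (assumed scheme)."""
    if count < 0 or not 0 < decay <= 1:
        raise ValueError("count must be >= 0 and decay must lie in (0, 1]")
    return sum(increment * decay ** k for k in range(count))


# Beta values quoted in the text; the assignment to compounds is assumed.
estimated = (("2,2,2-trichloroethanol", 0.92), ("2,2-dichloroethanol", 0.77))
for name, beta in estimated:
    low = leveled_beta(beta, 0.8)
    high = leveled_beta(beta, 0.9)
    print(f"{name}: beta {beta:.2f} -> {low:.2f} to {high:.2f}")
```

The per-chlorine increment needed by `diminishing_sum` would have to come from the published estimation rules; it is deliberately not filled in here.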

CONCLUSIONS

Two QSAR models and two QPAR models for predicting toxicity to activated sludge were evaluated on the basis of statistical validity, applicability and ease of use, and predictive ability. The predictive ability was compared using IC50 values measured in this study for a test set of 16 chemicals. While the statistical quality of the LSER model in fitting the training set data was highest, its ability to predict toxicity of new chemicals was found to be fair. Based on the results of this study summarized in Table 3, the molecular connectivity-based models and the log P-based models may be considered reliable and convenient in predicting toxicity to activated sludge. Based on the availability of the model parameters, these two models were found to be convenient. Their utility and limitations need to be verified and documented by testing additional chemicals of diverse molecular structures.

ACKNOWLEDGMENTS

The authors acknowledge the support provided by the U.S. Air Force Office of Scientific Research under Grants F49620-95-1-0340 and F49620-94-1-0366. The help of Dr. Dora Passino-Reader and Dr. Jim Hickey of the National Fisheries Research Center, Great Lakes, U.S. Fish and Wildlife Service, is also appreciated in providing the LSER parameters for chemicals 1, 2, 10, 12, and 16.

REFERENCES

Blum, D. J. W., and Speece, R. E. (1990). Determining chemical toxicity to aquatic species. Environ. Sci. Technol. 24, 284–293.

Borman, S. (1990). New QSAR techniques eyed for environmental assessments. Chem. Eng. News, Feb. 19, pp. 20–25.

Bradbury, S. P. (1994). Predicting modes for toxic action from chemical structure. An overview. SAR QSAR Environ. Res. 2, 89–104.

Clements, R. G., Nabholz, J. V., Johnson, D. E., and Zeeman, M. G. (1993). The use of quantitative structure activity relationships (QSARs) as screening tools in environmental assessment. In Environmental Toxicology and Risk Assessment, 2nd vol., ASTM STP 1216, pp. 555–570. Am. Soc. for Testing and Materials, Philadelphia.

Cramer, R. (1980). A QSAR success story. Chemtech 10, 744–747.

Cronin, M. T. D., and Dearden, J. C. (1995). QSAR in toxicology. 1. Prediction of aquatic toxicity. Quant. Struct. Act. Relationship 14, 1–7.

Cronin, M. T. D., Dearden, J. C., and Dobbs, A. J. (1991). QSAR studies of comparative toxicity in aquatic organisms. Sci. Total Environ. 109/110, 431–439.

Fiedler, H., Hutzinger, O., and Giesy, J. P. (1990). Utility of the QSAR modeling system for predicting the toxicity of substances on the European inventory of existing commercial chemicals. Toxicol. Environ. Chem. 28, 167–188.

Hall, E., Sun, B., Prakash, J., and Nirmalakhandan, N. (1996). Toxicity of organic chemicals and their mixtures to activated sludge microorganisms. J. Environ. Eng. ASCE 122, 424–429.

Hansch, C., Quinlan, J. E., and Lawrence, G. L. (1968). The linear free energy relationship between partition coefficient and aqueous solubility of organic liquids. J. Org. Chem. 33, 347–350.

Hickey, J. P., and Passino-Reader, D. R. (1991). Linear solvation energy relationships: Rules of thumb for estimation of variable values. Environ. Sci. Technol. 25, 1753–1760.

Kamlet, M. H., Abboud, J. L. M., Abraham, M., and Taft, W. (1983). Linear solvation energy relationships. 23. A comprehensive collection of the solvatochromic parameters, π*, α and β, and some methods for simplifying the generalized solvatochromic equation. J. Org. Chem. 48, 2877–2887.

Kamlet, M. J., Doherty, R. M., Abraham, M. H., Carr, P. W., Doherty, R. F., and Taft, R. W. (1987). Linear solvation energy relationships. 41. Important differences between aqueous solubility relationships for aliphatic and aromatic solutes. J. Phys. Chem. 91, 1996–2004.

Kamlet, M. J., Doherty, R. M., Abraham, M. H., Marcus, Y., and Taft, R. W. (1988). Linear solvation energy relationships: An improved equation for correlation and prediction of octanol/water partition coefficients of organic nonelectrolytes. J. Phys. Chem. 92, 5244–5255.

Kier, L. B., and Hall, L. H. (1976). Molecular Connectivity in Chemistry and Drug Design. Academic Press, New York.

Kier, L. B., and Hall, L. H. (1986). Molecular Connectivity in Structure-Activity Analysis. Academic Press, New York.

Leo, A., Jow, P. Y. C., Silipo, C., and Hansch, C. (1975). Calculation of hydrophobic constant (log P) from π and f constants. J. Med. Chem. 18, 865–868.

Nirmalakhandan, N. (1988). Prediction of Aqueous Solubility and Henry’s Constant from Molecular Structure. Ph.D. dissertation, Drexel University, Philadelphia.

Nirmalakhandan, N., and Speece, R. E. (1988). Prediction of aqueous solubility of organic chemicals based on molecular structure. Environ. Sci. Technol. 22, 328–338.

Nirmalakhandan, N., and Speece, R. E. (1989). Prediction of aqueous solubility of organic chemicals: II. Application to PNAs, PCBs, dioxins, etc. Environ. Sci. Technol. 23, 708–713.

Nirmalakhandan, N., Arulgnanendran, V., Mohsin, M., Sun, B., and Cadena, F. (1993). Toxicity of mixtures of organic chemicals to microorganisms. Water Res. 28, 543–551.

Randic, M. (1975). On the characterization of molecular branching. J. Am. Chem. Soc. 97, 6609–6615.

Sun, B., Nirmalakhandan, N., Hall, E., Wang, X. H., and Prakash, J. (1994). Estimating toxicity of organic chemicals to activated sludge microorganisms. J. Environ. Eng. ASCE 120, 1459–1469.

Tang, N. H., Blum, D. J. W., Nirmalakhandan, N., and Speece, R. E. (1992). QSAR parameters for toxicity of organic chemicals to Nitrobacter. J. Environ. Eng. ASCE 118, 17–37.

Topliss, J. G., and Costello, R. J. (1972). Chance correlations in structure–activity studies using multiple regression analyses. J. Med. Chem. 15, 1066–1069.

Trevizo, C., and Nirmalakhandan, N. (1997). Prediction of microbial toxicity of industrial chemicals. Water Sci. Technol., in press.

Volskay, V. T., and Grady, C. P. L. (1988). Toxicity of selected RCRA compounds to activated sludge microorganisms. J. WPCF 60, 1850–1856.

Xu, S., and Nirmalakhandan, N. (1997). Use of QSAR models in predicting joint effects in multicomponent mixtures of organic chemicals. Water Res., in press.

Zeeman, M., Nabholz, J. V., and Clements, R. G. (1993). The development of SAR/QSAR for use under EPA’s Toxic Substances Control Act (TSCA): An introduction. In Environmental Toxicology and Risk Assessment, 2nd vol., ASTM STP 1216, pp. 523–539. Am. Soc. for Testing & Materials, Philadelphia.