
Research Article

Received: 28 May 2010 Revised: 26 November 2010 Accepted: 27 November 2010 Published online in Wiley Online Library: 11 January 2011

(wileyonlinelibrary.com) DOI 10.1002/jctb.2569

Application of neural network prediction model to full-scale anaerobic sludge digestion

Dunyamin Guclu,a∗ Nihat Yılmazb and Umay G. Ozkan-Yucela

Abstract

BACKGROUND: Process modeling is a useful tool for description and prediction of the performance of anaerobic digestion systems under varying operation conditions. The objective of this study was to implement a model to simulate the dynamic behavior of a large-scale anaerobic sewage sludge digestion system. Artificial neural network (ANN) models using algorithms best suited to environmental problems (the Levenberg–Marquardt algorithm and the ‘gradient descent with adaptive learning rate’ back propagation algorithms) were used to model the anaerobic sludge digester of the Ankara Central Wastewater Treatment Plant (ACWTP) using dynamic data.

RESULTS: Based on the relatively low mean square error (MSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) and very high r values, ANN models predicted effluent volatile solid (VS) concentration and methane yield satisfactorily. Effluent VS and methane yields were predicted by the ANN using only conventional parameters such as pH, temperature, flow rate, volatile fatty acids, alkalinity, dry matter and organic matter. The best back propagation algorithm was the gradient descent with adaptive learning rate algorithm in both models. In the training of the neural network, four-fold cross-validation was used for validation of the model for better reliability.

CONCLUSION: The proposed ANN models were shown to be capable of dynamically predicting the VS and CH4 production rates for real system behavior. Only relatively simple monitoring parameters are needed to build the model for this complex anaerobic digestion process. © 2011 Society of Chemical Industry

Keywords: anaerobic sludge digestion modeling; artificial neural network; K-fold cross-validation; large-scale wastewater treatment plant

INTRODUCTION

Anaerobic digestion has been applied widely for sludge treatment, where primary and secondary excess sludge from biological treatment is stabilized under anaerobic conditions. The advantages of the process over aerobic digestion are the minimum excess sludge production and the end-product methane (CH4), resulting in reduced overall treatment cost.1 Despite the stability of the process, inadequate understanding of the biochemical and physical processes involved in anaerobic treatment often results in low CH4 production and the accumulation of volatile fatty acids (VFAs) in the digesters. Furthermore, because of the complexity of the process, the behavior of digesters under changing organic loading rates (OLR) is unpredictable. Several mathematical models have been developed over the last 40 years2 in order to understand the mechanism underlying anaerobic digestion and to overcome operational problems. The most recent model is the International Water Association (IWA) Anaerobic Digestion Model No. 1 (ADM1), which was published in 2002.3 Several successful implementations of the model have been published for sludge stabilization.4–8 The state variables used for the model validation were biogas production, acetate and propionate;4 chemical oxygen demand (COD), ammonia, total VFA, pH and volatile solids removal;5 total solids (TS), volatile solids (VS) and biogas yield;6 and CH4 production rate.7

However, the main drawback to the use of mathematical models is the large number of parameters that should be estimated for model implementation.9 The estimated parameters are generally case specific and difficult to adapt for system modification, since they depend on environmental conditions. Moreover, it is essential to define the biodegradable fraction of the influent stream for accurate digester simulation. However, the organic composition and amount of the influent stream are almost impossible to quantify, especially for a full-scale anaerobic digester. There is, therefore, a need for a fast and simple solution for modelling, especially for full-scale operations. As an alternative, artificial neural networks (ANN) have been used to model biological treatment processes, including anaerobic treatment. The interest in ANN is due to its capability to generalize, its ability for learning and nonlinear data processing, and its tolerance to failure.10 Tay and Zhang11 developed a conceptual neural fuzzy model for three different high-rate anaerobic treatment configurations. Their model predicted the volumetric methane production (VMP), TOC and VFA with high

∗ Correspondence to: Dunyamin Guclu, Selcuk University, Engineering and Architectural Faculty, Environmental Engineering Department, 42031 Campus, Selcuklu, Konya, Turkey. E-mail: [email protected]

a Selcuk University, Department of Environmental Engineering, Campus, Konya, Turkey

b Selcuk University, Department of Electrical and Electronics Engineering, Campus, Konya, Turkey

J Chem Technol Biotechnol 2011; 86: 691–698 www.soci.org © 2011 Society of Chemical Industry


Figure 1. Process flow chart of the ACWTP. Units shown: screens, grit chamber, primary sedimentation tank, aeration tank, final sedimentation tank, primary sludge thickener, excess sludge thickener, digester, digested sludge thickener, sludge dewatering station, container station, gas storage, waste gas burner, power station and sludge deposit. Streams shown: raw sludge, primary sludge, return sludge, excess sludge, digested sludge, supernatant and effluent.

accuracy using OLR, hydraulic loading rate, alkalinity loading rate, and VMP, TOC and VFA prediction of the previous day as the input parameters. Horiuchi et al.12 implemented a three-layered neural network for the prediction of pH and of acetic acid and propionic acid concentrations in an acidogenic reactor at different residence times, using pH and residence times as inputs. Cakmakci13 modeled a full-scale primary sludge digester using pH, VS, flow rate and temperature of the influent as inputs, and satisfactorily predicted effluent VS and CH4 yield using an adaptive neuro-fuzzy inference system (ANFIS).

Reported studies on full-scale applications of ANN modeling to large-scale digesters under varying organic loading conditions are scarce. Therefore, in this study ANN was used to model the full-scale anaerobic sludge digester of the Ankara Central Wastewater Treatment Plant (ACWTP) in Turkey. VS concentration in the effluent of the digester and CH4 production rates were predicted using the ANN model. The prediction performance of the model was evaluated and compared using statistical parameters. The input data for the ACWTP digester were not constant because of the variability of the influent sludge characteristics. The influent sludge was a mixture of primary and secondary sludge (1 : 1). The primary sludge resulted in variable influent composition and organic load due to the diverse characteristics of solids removed in the primary settling tank. This study can be used as a guide for the use of routinely measured parameters such as influent flow rate, temperature (T), alkalinity and influent VS concentration as input data, eliminating the need and cost for measurement of an excessive number of parameters by on-line sensors.

MATERIAL AND METHODS

Treatment process and full-scale digester

The ACWTP is located about 40 km away from the city centre. The plant was designed to treat all domestic and pre-treated industrial wastewaters of Ankara (Turkey) and was based on biological treatment with future denitrification and phosphorus removal processes. The plant was designed to treat the wastewater of about 3 900 000 persons, with a dry weather flow of 765 000 m3 day−1

Table 1. Statistical values of measured input and output parameters

| Variables | Minimum | Mean | Maximum |
|---|---|---|---|
| pH | 7.18 | 7.41 | 7.64 |
| Temperature (◦C) | 34.00 | 35.73 | 37.00 |
| Flow rate of sludge (m3 day−1)∗ | 320.00 | 486.13 | 600.00 |
| VFA (mg L−1) | 132.50 | 341.21 | 638.80 |
| Alkalinity (mg CaCO3 L−1) | 1745.00 | 3338.47 | 4440.00 |
| CH4 (m3 day−1) | 2642 | 4463.91 | 5824 |
| VSeff. (mg L−1) | 7113 | 10 018 | 14 070 |
| VSinf. (mg L−1) | 14 755 | 24 577 | 38 385 |

∗ These values represent the modeled lane.

and a storm weather flow of 1 530 000 m3 day−1. Its capacity will be extended to treat a flow rate of 1 380 000 m3 day−1 (dry weather flow) and 2 750 000 m3 day−1 (peak storm weather flow), which will be sufficient until 2025. Influent with a BOD5 concentration of 300 mg L−1 leaves the plant with a maximum BOD5 of 30 mg L−1 after treatment. The effluent is directly discharged into Ankara Creek. The ACWTP has a non-nitrifying conventional activated sludge system and includes screening and grit removal, primary clarification, surface aeration tanks, final clarification and sludge treatment facilities. The process scheme of the wastewater treatment plant is presented in Fig. 1. The activated sludge system consists of 2.5 parallel lanes with four longitudinal aeration tanks, giving a total of 10. The anaerobic digester was a completely mixed, continuous flow, mesophilic anaerobic reactor without sludge recycle. The digester had a liquid volume of 10 800 m3 and operated at 35 ± 1 ◦C. The influent sludge was a 1 : 1 mixture of primary and secondary sludges. The characteristics of the influent sludge are given in Table 1.

Analytical methods

All samples were analyzed for their dry matter (DM), volatile solids (VS), and total volatile fatty acids (VFA) content. DM and VS


were measured according to Standard Methods.14 Total VFA (as mg L−1 HAc) was determined by the titration procedure given by Anderson and Yang.15 Temperature (T), pH, sludge inflow rate and total gas production were measured on-line with Endress–Hauser (Weil am Rhein, Germany) analyzers in the wastewater treatment plant. The gas composition was analyzed by gas chromatography (GC) (Agilent Technologies 6890 N, Palo Alto, CA, USA) equipped with a thermal conductivity detector (TCD). An HP-Plot Q (Agilent Technologies, USA) capillary column (30.0 m × 530 µm × 40.0 µm) was used for the separation of CH4 and carbon dioxide (CO2). The column temperature was maintained at 45 ◦C for 1 min, and then raised to 65 ◦C at a rate of 10 ◦C min−1. The carrier gas (He) had a flow rate of 3 mL min−1. The injector and detector temperatures were maintained at 100 ◦C and 250 ◦C, respectively.

Input and output variables in the model

Input and output parameters were selected or generated from the parameters commonly used in aerobic wastewater treatment systems as detailed in the literature. On-line and off-line operational data of the ACWTP were used to develop the models.

The stability of anaerobic digestion of waste can be maintained by evaluation of the solids retention time (SRT),16 the waste characteristics in terms of its organic strength and composition, the alkalinity, and the temperature.17 The parameters most commonly used to monitor digester stability are methane and carbon dioxide content of the biogas, gas production rates, pH, and VFA.18

In the case of sludge digestion, the organic strength of municipal sludge may be determined by its VS concentration. The digester was operated without sludge recycling, so that influent and effluent flow rates were equal. Therefore, flow rate was taken as an input parameter instead of SRT. The input parameters other than sludge flow rate were T, pH, VFA, alkalinity and influent VS concentration. The output parameters were daily effluent VS concentration and CH4 production. The statistical characteristics of the measured parameters are summarized in Table 1.

Artificial neural networks

ANNs are mathematical or computational models that try to simulate the structure and/or functional aspects of the neurons of the human brain and their interconnections. They generally consist of a number of simple and highly interconnected neurons organized into layers. ANNs have many structures and architectures.19,20 Multilayered perceptron neural networks (MLPNNs) are the simplest and therefore most commonly used neural network architectures.20

An MLPNN normally has three kinds of layer: an input layer, an output layer and one or more hidden layers. Neurons in the input layer only act as buffers for distributing the input signals $x_i$ to neurons in the hidden layer. Each neuron j in the hidden layer sums up its input signals $x_i$ after weighting them with the strengths of the respective connections $w_{ji}$ from the input layer and computes its output $y_j$ as a function f of the sum:

$$y_j = f\left(\sum_i w_{ji} x_i\right) \qquad (1)$$

where f can be a sigmoid or hyperbolic tangent function. The output of neurons in the output layer is computed similarly.
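As a minimal illustration of Equation (1), the following Python sketch computes the output of a single neuron for a hyperbolic tangent and for a logistic sigmoid activation. The function names and the numerical inputs and weights are hypothetical and chosen only for demonstration; bias terms are omitted, as in Equation (1).

```python
import math

def logistic(net):
    """Logistic sigmoid, one possible choice for f in Equation (1)."""
    return 1.0 / (1.0 + math.exp(-net))

def neuron_output(inputs, weights, f=math.tanh):
    """Equation (1): y_j = f(sum_i w_ji * x_i) for a single neuron j."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return f(net)

# Illustrative (made-up) input signals and connection weights.
x = [0.5, 0.2, 0.8]
w_j = [0.4, -0.6, 0.1]

y_tanh = neuron_output(x, w_j)            # hyperbolic tangent activation
y_sig = neuron_output(x, w_j, logistic)   # sigmoid activation
```

The weighted sum here is 0.16, so the two activations return tanh(0.16) and the logistic function evaluated at 0.16, respectively.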

Training a network consists of adjusting weights of the network using a learning algorithm. The back-propagation with or without momentum (BPM)21 learning algorithm is used in this work

Figure 2. Stepwise model development process. Flow-chart boxes: data collection and preprocessing (identifying model inputs and outputs); model design (network architecture); model training (training the designed network); model testing, K-fold cross-validation and testing (evaluating model performance); adequacy check (yes/no); model execution for prediction.

because it is the most commonly adopted MLPNN training algorithm. It is a gradient descent algorithm and gives the change $\Delta w_{ji}(k)$ in the weight of a connection between neurons i and j as follows:

$$\Delta w_{ji}(k) = \alpha \delta_j x_i + \mu \Delta w_{ji}(k-1) \qquad (2)$$

where $x_i$ is the input, $\alpha$ is the learning coefficient, $\mu$ is the momentum coefficient, and $\delta_j$ is a factor depending on whether neuron j is an output neuron or a hidden neuron. In some training algorithms, gradient descent momentum and an adaptive learning rate are used for updating weight and bias values. For output neurons,

$$\delta_j = \frac{\partial f}{\partial \mathrm{net}_j}\left(y_j^{T} - y_j\right) \qquad (3)$$

where $\mathrm{net}_j \equiv \sum_i x_i w_{ji}$ and $y_j^{T}$ is the target output for neuron j. For hidden neurons,

$$\delta_j = \frac{\partial f}{\partial \mathrm{net}_j}\sum_q w_{qj}\delta_q \qquad (4)$$

As there are no target outputs for hidden neurons in Equation (4), the difference between the target and actual output of a hidden neuron j is replaced by the weighted sum of the $\delta_q$ terms already obtained for neurons q connected to the output of j. Thus, iteratively, beginning with the output layer, the $\delta$ term is computed for all neurons in all layers except the input layer, and weights are then updated according to Equation (2).
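The update rule in Equations (2)–(4) can be sketched in plain Python for a network with one hidden layer. This is an illustrative sketch only, not the authors' MATLAB implementation: the function `train_step`, the tiny 2 : 2 : 1 network, the initial weights and the single training sample are all hypothetical, and bias terms are omitted for brevity. A hyperbolic tangent hidden layer and a logistic output layer are assumed.

```python
import math

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

def train_step(x, target, W_h, W_o, dW_h_prev, dW_o_prev, alpha=0.01, mu=0.9):
    """One back-propagation update with momentum (one hidden layer, no biases).

    W_h[j][i]: weight from input i to hidden neuron j (tanh activation).
    W_o[q][j]: weight from hidden neuron j to output neuron q (logistic activation).
    Returns the updated weights, the weight changes (the momentum terms for the
    next step) and the network output computed before the update.
    """
    # Forward pass, Equation (1).
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_h]
    y = [logsig(sum(w * hj for w, hj in zip(row, h))) for row in W_o]

    # Equation (3): output deltas; the logistic derivative is y * (1 - y).
    delta_o = [yq * (1.0 - yq) * (t - yq) for yq, t in zip(y, target)]
    # Equation (4): hidden deltas; the tanh derivative is 1 - h^2.
    delta_h = [(1.0 - h[j] ** 2) * sum(W_o[q][j] * delta_o[q] for q in range(len(W_o)))
               for j in range(len(W_h))]

    # Equation (2): dw(k) = alpha * delta * input + mu * dw(k - 1).
    dW_o = [[alpha * delta_o[q] * h[j] + mu * dW_o_prev[q][j] for j in range(len(h))]
            for q in range(len(W_o))]
    dW_h = [[alpha * delta_h[j] * x[i] + mu * dW_h_prev[j][i] for i in range(len(x))]
            for j in range(len(W_h))]
    W_o = [[W_o[q][j] + dW_o[q][j] for j in range(len(h))] for q in range(len(W_o))]
    W_h = [[W_h[j][i] + dW_h[j][i] for i in range(len(x))] for j in range(len(W_h))]
    return W_h, W_o, dW_h, dW_o, y

# Hypothetical single training sample and initial weights.
x, target = [0.2, 0.7], [0.9]
W_h = [[0.1, -0.2], [0.3, 0.4]]
W_o = [[0.2, -0.1]]
dW_h = [[0.0, 0.0], [0.0, 0.0]]
dW_o = [[0.0, 0.0]]

errors = []
for _ in range(200):
    W_h, W_o, dW_h, dW_o, y = train_step(x, target, W_h, W_o, dW_h, dW_o)
    errors.append((target[0] - y[0]) ** 2)
```

Repeating the step on the same sample drives the squared error recorded in `errors` downwards, which is the behaviour the delta terms are designed to produce.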

MODEL IMPLEMENTATION

A number of steps were carried out during the model development process. These are shown schematically in Fig. 2.



Training and test phases of ANN

Data were collected for 245 days. Effluent VS and CH4 production rates were used as the ANN outputs. Influent parameters such as Q, T, pH, VFA and alkalinity were used as input parameters to predict effluent VS concentration. In addition to these input parameters, influent VS concentration was used as an input parameter for estimation of CH4 production rates. Consequently, the numbers of input and output nodes were 5 : 1 for the VS model and 6 : 1 for the CH4 model.

All variables were normalized to the range 0–1 before being applied to the neural network and de-normalized afterwards. This was done by determining the maximum and minimum values of each variable over the whole data period and calculating normalized variables using the equation

$$x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (5)$$
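A short Python sketch of Equation (5) and its inverse is given below. The function names are hypothetical; the pH minimum, mean and maximum are taken from Table 1 purely as a worked example.

```python
def normalize(x, x_min, x_max):
    """Equation (5): map x from [x_min, x_max] onto [0, 1]."""
    return (x - x_min) / (x_max - x_min)

def denormalize(x_norm, x_min, x_max):
    """Inverse of Equation (5): map a value in [0, 1] back to original units."""
    return x_norm * (x_max - x_min) + x_min

# pH statistics from Table 1: minimum 7.18, mean 7.41, maximum 7.64.
ph_min, ph_max = 7.18, 7.64
ph_norm = normalize(7.41, ph_min, ph_max)       # approximately 0.5
ph_back = denormalize(ph_norm, ph_min, ph_max)  # approximately 7.41
```

Because the mean pH lies midway between the extremes, it maps to approximately 0.5; de-normalizing recovers the original value.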

An MLP feedforward network was used as the ANN structure. To train the ANN, a gradient descent with momentum and adaptive learning rate back propagation algorithm was used. Gradient descent with momentum and adaptive learning rate backpropagation (traingdx) is a network training function that updates weight and bias values according to gradient descent momentum and an adaptive learning rate. Learning rate and momentum constants were 0.01 and 0.9, respectively. Levenberg–Marquardt back propagation (trainlm) is a network training function that updates weight and bias values according to Levenberg–Marquardt optimization. trainlm is often the fastest backpropagation algorithm and is highly recommended as a first-choice supervised algorithm, although it does require more memory than other algorithms. Default training parameters were used in this study.
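The idea behind an adaptive learning rate can be sketched as follows. This is a simplified illustration, not the MATLAB traingdx code: the function name, the quadratic test function and the factors `lr_inc`, `lr_dec` and `max_perf_inc` are assumptions (the values 1.05, 0.7 and 1.04 are believed to match the toolbox defaults but are not stated in this paper). Only the learning rate 0.01 and momentum 0.9 come from the text. The weight change uses the same form as Equation (2).

```python
def gd_momentum_adaptive(loss, grad, w, lr=0.01, mc=0.9,
                         lr_inc=1.05, lr_dec=0.7, max_perf_inc=1.04, epochs=1):
    """Gradient descent with momentum and an adaptive learning rate (simplified)."""
    dw_prev = [0.0] * len(w)
    perf = loss(w)
    for _ in range(epochs):
        g = grad(w)
        # Gradient step plus momentum term, as in Equation (2).
        dw = [-lr * gi + mc * dpi for gi, dpi in zip(g, dw_prev)]
        w_new = [wi + dwi for wi, dwi in zip(w, dw)]
        perf_new = loss(w_new)
        if perf_new > perf * max_perf_inc:
            # Error grew too much: discard the step, shrink the learning rate
            # and drop the momentum term for the next attempt.
            lr *= lr_dec
            dw_prev = [0.0] * len(w)
            continue
        if perf_new < perf:
            lr *= lr_inc  # error decreased: allow a larger step next time
        w, perf, dw_prev = w_new, perf_new, dw
    return w, lr, perf

def quad_loss(w):
    """Hypothetical test function: sum of squares, minimum at the origin."""
    return sum(wi * wi for wi in w)

def quad_grad(w):
    return [2.0 * wi for wi in w]

# A small step is accepted and the learning rate grows.
w1, lr1, perf1 = gd_momentum_adaptive(quad_loss, quad_grad, [1.0, -2.0])
# An oversized step is rejected and the learning rate shrinks.
w2, lr2, perf2 = gd_momentum_adaptive(quad_loss, quad_grad, [1.0, -2.0], lr=2.0)
```

In the first call the error falls, so the step is kept and the learning rate is multiplied by `lr_inc`; in the second the error would rise far beyond the tolerated factor, so the weights are left unchanged and the learning rate is multiplied by `lr_dec`.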

The ANN structures included one or two hidden layers. The performance of the ANN model was improved by determining the preferred number of hidden layer nodes experimentally.

BP neural networks have become a popular tool for modeling environmental systems.22–24 In this study, two BP algorithms were compared to select the best fitting BP algorithm for the data gathered. For all algorithms, feedforward networks with a tan-sigmoid transfer function were used within the hidden layer(s) and a log-sigmoid function within the output layer. All the programs used in this study were implemented in MATLAB.
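For reference, the two transfer functions can be written out in a few lines of Python. The names `tansig` and `logsig` follow the MATLAB function names; the formulas are the standard definitions and are given here as an illustration rather than as the authors' code.

```python
import math

def tansig(n):
    """Tan-sigmoid: 2 / (1 + exp(-2n)) - 1, equal to tanh(n); output in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0

def logsig(n):
    """Log-sigmoid: 1 / (1 + exp(-n)); output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

hidden_value = tansig(0.3)   # hidden-layer activation for a net input of 0.3
output_value = logsig(0.3)   # output-layer activation for the same net input
```

Since the log-sigmoid output lies strictly between 0 and 1, it is compatible with targets that have been normalized to the 0–1 range, although the extreme values 0 and 1 can only be approached.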

It is crucial that the ANN finds the best network connection weights for the problem. Researchers generally spend a considerable amount of time finding the best network architecture for the problem in question. Since the intention is to make the ANN learn rather than memorize, cross-validation is used for training.25

To train the neural network, k-fold cross-validation was used for better reliability of test results.26 In k-fold cross-validation, the original sample is randomly partitioned into k subsamples. A single subsample is retained as the validation data for testing the model, and the remaining k − 1 subsamples are used as training data. The cross-validation process is then repeated k times (the ‘folds’), with each of the k subsamples used exactly once as the validation data. The average of the k results gives the validation accuracy of the algorithm.27

Three modeling steps (training, validation and testing) were selected for application of the ANN. In order to further evaluate the proposed neural network model, the final model (after training and k-fold cross-validation) was tested on the unseen test data set. In our applications, four-fold cross-validation was used. The data were divided into training, validation, and test subsets, with 80% of the data taken for training and four-fold cross-validation, and 20% for the test.
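The partitioning described above can be sketched in plain Python. The function name `k_fold_indices`, the fixed random seed and the figure of 196 records (80% of the 245 daily records, assuming the 80/20 split was applied to whole days) are illustrative assumptions rather than details reported for the MATLAB implementation.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # random partition of the samples
    folds = [idx[i::k] for i in range(k)]     # k subsamples of near-equal size
    for i in range(k):
        validation = folds[i]                 # one subsample held out
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation

# Assumed: 245 days in total, 196 (80%) for training and cross-validation,
# leaving 49 (20%) for the final test.
n_cv = 196
splits = list(k_fold_indices(n_cv, k=4))
```

With four folds of 196 records, each round trains on 147 records and validates on 49, and every record is used for validation exactly once.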

We examined systems with hidden node counts from 2 to 40 in steps of one in the first hidden layer, and from 2 to 10 in steps of one in the second hidden layer. Epoch numbers ranged from 250 to 20 000 in steps of 250 for gradient descent with adaptive learning rate BP and from 25 to 2000 in steps of 25 for Levenberg–Marquardt.

For determination of the ANN model, more than 100 000 neural networks were trained by searching a variety of parameters such as hidden nodes, iteration numbers and transfer functions. Before each training stage, initial weight matrix values of neural networks were determined by a random function. Therefore, every training process was independent of any other. In every examination, the correlation coefficient and mean square error (MSE) values in the stored database were used as performance criteria. The best network structure was selected according to the performance criteria from the database. The database stored network architectures, training parameters and performance calculations in a tab-delimited text file structure that allows analysis with Microsoft Excel.
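A quick count of the scanned grid is consistent with the figure of more than 100 000 networks. The sketch below assumes that every combination of architecture and epoch number was a separate training run, that both algorithms and both output models were scanned over the full grid, and that cross-validation folds are not counted separately; none of these assumptions is stated explicitly in the text.

```python
hidden1 = range(2, 41)                 # first hidden layer: 2..40, step 1
hidden2 = range(2, 11)                 # second hidden layer: 2..10, step 1
epochs_gda = range(250, 20001, 250)    # gradient descent, adaptive learning rate
epochs_lm = range(25, 2001, 25)        # Levenberg-Marquardt

def count_runs(epoch_grid):
    """Number of (architecture, epoch number) combinations for one algorithm."""
    one_layer = len(hidden1) * len(epoch_grid)
    two_layer = len(hidden1) * len(hidden2) * len(epoch_grid)
    return one_layer + two_layer

per_model = count_runs(epochs_gda) + count_runs(epochs_lm)
total = 2 * per_model                  # VS model and CH4 model
print(total)                           # 124800
```

Under these assumptions the grid contains 124 800 combinations, of the same order as the reported number of trained networks.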

Evaluation of model performance

The performance of each ANN model was evaluated by calculating the mean square error (MSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) between the modeled output and measurements of both the training and testing data sets. The calculation equations are given in Equations (6)–(8). In addition, the test results satisfying the minimum errors were subjected to correlation coefficient determination (r), given in Equation (9). These r values were used as another criterion for determination of optimum model structure.

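$$\mathrm{MSE}\ (\%) = \frac{\sum_{i=1}^{N} (x_i - y_i)^2}{N} \times 100 \qquad (6)$$

$$\mathrm{MAE}\ (\%) = \frac{1}{N}\left(\sum_{i=1}^{N} \left|x_i - y_i\right|\right) \times 100 \qquad (7)$$

$$\mathrm{MAPE}\ (\%) = \frac{1}{N}\left(\sum_{i=1}^{N} \left[\left|\frac{x_i - y_i}{x_i}\right|\right]\right) \times 100 \qquad (8)$$

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}} \qquad (9)$$

where $x_i$ and $y_i$ are the measured and predicted values, $\bar{x}$ and $\bar{y}$ are the averages of the measured and predicted values, respectively, and N is the total number of model outputs.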
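Equations (6)–(9) translate directly into Python. The function names and the four sample values below are hypothetical; the sample values are assumed to be normalized to the 0–1 range, since the text does not state whether the errors were computed on normalized or raw data.

```python
import math

def mse_pct(x, y):
    """Equation (6): mean square error times 100."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x) * 100.0

def mae_pct(x, y):
    """Equation (7): mean absolute error times 100."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x) * 100.0

def mape_pct(x, y):
    """Equation (8): mean absolute percentage error (measured x must be non-zero)."""
    return sum(abs((a - b) / a) for a, b in zip(x, y)) / len(x) * 100.0

def pearson_r(x, y):
    """Equation (9): correlation coefficient between measured and predicted."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    den = math.sqrt(sum((a - x_bar) ** 2 for a in x) * sum((b - y_bar) ** 2 for b in y))
    return num / den

# Made-up normalized measurements (x) and predictions (y).
measured = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]

mse = mse_pct(measured, predicted)     # about 0.25
mae = mae_pct(measured, predicted)     # about 5.0
mape = mape_pct(measured, predicted)   # about 13.02
r = pearson_r(measured, predicted)     # about 0.976
```

Note that MAPE weights errors on small measured values more heavily, which is why it is much larger than MAE for this sample even though every absolute error is the same.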

EXPERIMENTAL RESULTS AND DISCUSSION

ANN models were developed to predict effluent VS concentration and CH4 production rates for the ACWTP. Two different BP algorithms were used, as listed in Table 2, and all sets were trained and tested for the VS and CH4 models. Several models for different input–output sets were trained and tested until the best input parameters and best fitting input structure, architecture and MATLAB Neural Network Toolbox settings for the models were obtained. In the VS model structure, five effective input model variables were selected, described in Table 2, with output variable




Table 2. Structure, architecture and settings of the models best predicting VS and CH4

| Parameters | VS models: Levenberg–Marquardt | VS models: Gradient descent with adaptive learning rate BP | CH4 models: Levenberg–Marquardt | CH4 models: Gradient descent with adaptive learning rate BP |
|---|---|---|---|---|
| Best network architecture with one hidden layer | 5 : 2 : 1, 350 iterations | 5 : 4 : 1, 6750 iterations | 6 : 13 : 1, 875 iterations | 6 : 35 : 1, 3000 iterations |
| Best network architecture with second hidden layer | 5 : 13 : 2 : 1, 25 iterations | 5 : 8 : 2 : 1, 18 250 iterations | 6 : 11 : 3 : 1, 250 iterations | 6 : 11 : 2 : 1, 5000 iterations |
| k-fold cross-validation | 4 | 4 | 4 | 4 |
| Scanned iteration interval | 25–2000, step = 25 | 250–20 000, step = 250 | 25–2000, step = 25 | 250–20 000, step = 250 |
| Scanned hidden nodes in first hidden layer | 2–40, step = 1 | 2–40, step = 1 | 2–40, step = 1 | 2–40, step = 1 |
| Scanned hidden nodes in second hidden layer | 2–10, step = 1 | 2–10, step = 1 | 2–10, step = 1 | 2–10, step = 1 |
| Inputs | Q, T, pH, VFA, alkalinity | Q, T, pH, VFA, alkalinity | Q, T, pH, VFA, alkalinity, influent VS | Q, T, pH, VFA, alkalinity, influent VS |
| Output | Effluent VS | Effluent VS | CH4 | CH4 |

effluent VS concentration. In the second model structure, six input variables were included, with output variable CH4 production rate. When influent VS concentration was used as an input parameter in addition to the five independent parameters, CH4 production rates in the digester were better described.

The numbers of input, hidden and output layer nodes in the optimum Levenberg–Marquardt and gradient descent with adaptive learning rate BP ANN structures were found by experiment to be 5 : 13 : 2 : 1 and 5 : 8 : 2 : 1 for the VS model and 6 : 11 : 3 : 1 and 6 : 11 : 2 : 1 for the CH4 model, respectively. For these hidden nodes, minimum training and test errors were obtained. The optimum ANN model structure obtained is shown in Fig. 3 for the prediction of effluent VS and CH4 production rates.

In this study, the best fitting BP algorithm, minimizing the error between neural network output and observed value, was selected for a given data set. The BP network is one of several neural networks that have been successfully implemented in a wide range of environmental problems for the prediction of target parameters.23,28 The best BP algorithm for the present application, with minimum training error, is the gradient descent with adaptive learning rate algorithm with two hidden layers. The training and test results of the optimum ANN models for effluent VS concentration and CH4 production rates are summarized in Table 3. It was found that the ANN with one hidden layer gave reliable predictions of the effluent VS concentration and CH4 production rates. On the other hand, a second hidden layer improved the results without a large increase in the total number of hidden nodes.

For model training, the MAEs, MSEs and MAPEs between the observed and predicted values were 8.9, 1.12 and 7.07% for the VS model and 9.50, 1.50 and 7.17% for the CH4 model, respectively. For model testing, the MAEs, MSEs and MAPEs between the observed and predicted values were 9.5, 1.20 and 7.06% for the VS model and 9.1, 1.20 and 6.99% for the CH4 model, respectively. Thus, based on the error analysis, the VS and CH4 models gave prediction performance with acceptably high levels of accuracy.

Correlation coefficient (r) analysis was also performed. Figure 4 shows scatter plots of neural network predictions and the corresponding observations for outputs of the gradient descent with adaptive learning rate algorithm. While the r value of effluent VS concentration was 0.89 for testing, the value for CH4 production rates was 0.71. Taking into account the non-linear dependence of the data, the prediction appears to track the observations reasonably well. Perfect agreement is achieved when r = 1.0. The correlation coefficients also show good correspondence between measurement and predictions for the testing of effluent VS and CH4 production rates.

The modeling performance is visualized in Figs 5–8 for the monitoring period. Model prediction results versus observed data for the VS concentration are given in Figs 5 and 6 to reinforce the model performance for training and testing purposes. The trends in the predicted and measured data were in good agreement. ANN models predicted the dynamic behavior of VS concentrations with good accuracy and provided a very good fit to the training data and the testing data (Fig. 6). The ANN model showed that the peak values in the plot of measured concentrations can be well reproduced by the predictions.

For the CH4 model, the simulation performance was as good as for the VS model. As shown in Figs 7 and 8, daily changes in real process values were well predicted, and large fluctuations and peaks were also closely followed by the model (Fig. 8). Visual inspection indicates that the predictions of the ANN models fitted the real values of VS and CH4 production rates well. The proposed neural networks can satisfactorily describe the behavior of the process.

As a result, the predicted effluent VS and digester CH4 production rates matched the observed concentrations based on the relatively low MSE, MAE, MAPE and very high r values, suggesting good prediction performance of the models. When considering the high level of complexity of these biological processes, the broadness of the data range and the computed error values, it is seen that the method successfully predicts target output




Figure 3. Representation of the optimal neural network structure used for VS and methane prediction. Each network comprises an input layer, hidden layer and output layer; the VS network has inputs Q, T, pH, VFA and alkalinity, and the methane network has inputs Q, T, pH, VFA, alkalinity and VSinf.

Table 3. ANN model performance statistics calculated over the whole of training and testing data

| Examination of backpropagation algorithms | Training r | Training MAE | Training MSE | Validation r | Validation MAE | Validation MSE | Testing r | Testing MAE | Testing MSE |
|---|---|---|---|---|---|---|---|---|---|
| VS (Gradient descent with adaptive learning rate backpropagation) with one hidden layer | 0.85 | 10.5 | 1.70 | 0.82 | 12.8 | 2.24 | 0.87 | 12.1 | 2.0 |
| VS (Levenberg–Marquardt) with one hidden layer | 0.84 | 11.7 | 1.93 | 0.80 | 12.0 | 2.21 | 0.84 | 10.5 | 1.76 |
| CH4 (Gradient descent with adaptive learning rate backpropagation) with one hidden layer | 0.88 | 6.5 | 0.79 | 0.70 | 9.28 | 1.81 | 0.69 | 9.7 | 1.74 |
| CH4 (Levenberg–Marquardt) with one hidden layer | 0.75 | 8.9 | 1.50 | 0.62 | 11.0 | 1.76 | 0.63 | 10.2 | 1.64 |
| VS (Gradient descent with adaptive learning rate backpropagation) with second hidden layer | 0.90 | 8.9 | 1.12 | 0.87 | 11.8 | 2.32 | 0.89 | 9.5 | 1.20 |
| VS (Levenberg–Marquardt) with second hidden layer | 0.84 | 10.9 | 2.14 | 0.76 | 13.1 | 2.79 | 0.83 | 11.6 | 1.8 |
| CH4 (Gradient descent with adaptive learning rate backpropagation) with second hidden layer | 0.76 | 9.5 | 1.50 | 0.67 | 9.9 | 1.73 | 0.71 | 9.1 | 1.20 |
| CH4 (Levenberg–Marquardt) with second hidden layer | 0.84 | 8.0 | 1.17 | 0.53 | 11.0 | 2.05 | 0.63 | 11.0 | 2.15 |

parameters. In these ANN models, five independent parameters (sludge flow rate, temperature, pH, VFAs and alkalinity in the influent of the digester) are sufficient to obtain good prediction of effluent VS concentration. Six independent parameters (influent VS concentration in addition to the five independent parameters detailed above) are needed to predict CH4 production rates in the digester.

CONCLUSIONS

The implementation of an ANN approach to predict effluent VS concentration and CH4 production rates in the anaerobic digester of a large-scale wastewater treatment plant, based on daily monitored values for 245 days as input, was investigated. The optimum structure of the network, i.e. number of hidden layers, nodes in the hidden layer, and learning iterations, and the choice

Figure 4. Scatter plots of observed and predicted effluent volatile solids and methane concentrations for test data sets for a gradient descent with adaptive learning rate BP with two hidden layers. Panels: (a) predicted vs observed volatile solids, mg/L (r = 0.89); (b) predicted vs observed methane, m3/d (r = 0.71).




Figure 5. Time plot of observed and predicted effluent volatile solids for all data sets for the gradient descent with adaptive learning rate BP with two hidden layers. Axes: time (days) vs volatile solids (VS), mg/L; series: observed and predicted.

Figure 6. Time plot of best results of randomized observed and predicted effluent volatile solids for training, validation and testing set for the gradient descent with adaptive learning rate BP with two hidden layers. Axes: time (days) vs volatile solids (VS), mg/L; series: observed and predicted, with sections labelled training, validation and testing.

of transfer functions, were optimized based on statistical results (namely MSE, MAE, MAPE and correlation coefficient (r)) before using the neural network. The Levenberg–Marquardt algorithm and gradient descent with adaptive learning rate back propagation algorithm (often used to solve environmental problems) were compared for the present application. The best back propagation algorithm was the gradient descent with adaptive learning rate algorithm with two hidden layers for both models. For training the neural network, four-fold cross-validation was used for better reliability of test results. The proposed ANN models were shown to

Figure 7. Time plot of observed and predicted methane for all data sets for the gradient descent with adaptive learning rate BP with two hidden layers. Axes: time (days) vs methane, m3/d; series: observed and predicted.




Figure 8. Time plot of the best results of randomized observed and predicted methane concentrations for training, validation and testing set for the gradient descent with adaptive learning rate BP with two hidden layers. Axes: time (days) vs methane, m3/d; series: observed and predicted, with sections labelled training, validation and testing.

be capable of dynamically predicting the VS and CH4 production rates of the real system. The proposed models could be applied to dynamically monitor changes in the anaerobic digester. Thus, only relatively simple monitoring parameters are needed to build a model of the highly complex anaerobic digestion process.

ACKNOWLEDGEMENTS

This study was supported by the Selcuk University Research Fund (BAP). The authors would like to thank ASKI, Ankara, Turkey, for their help during the study and for providing ACWTP process data.


