adaptive fuzzy pattern recognition in the anaerobic digestion process


ELSEVIER Pattern Recognition Letters 17 (1996) 651-659

Pattern Recognition Letters

Adaptive fuzzy pattern recognition in the anaerobic digestion process

Stefano Marsili-Libelli a,*, Andreas Müller b,1

a Dept. of Systems and Computers, University of Florence, Via S. Marta 3, 50139 Florence, Italy

b Institute of Biotechnology 2, Jülich Research Centre, D-52425 Jülich, Germany

Received 22 November 1995; revised 15 December 1995

Abstract

This paper addresses the problem of coordinating the operation of several units of a biological wastewater treatment process in order to avoid process failure. The state of the process is monitored through a number of chemical and biological variables and this information is used to infer the plant loading state. After a brief introduction of the plant supervision scheme, the paper describes how the Fuzzy C-Means (FCM) classification algorithm can be used in this application and enhanced with an adaptive capability, enabling the supervisory control system to detect process departure from normal operating conditions.

Keywords: Pattern recognition; Fuzzy sets; Adaptation; Classification; Wastewater treatment

1. Introduction

The process to be controlled is a biological wastewater treatment plant which consists of a cascade of four process units: anaerobic digestion, aerobic treatment, nitrification and denitrification. Though each unit is locally controlled to guarantee consistent operation, high-level supervision is required to harmonize their operation and protect the whole plant by preventing excessive load from being applied. This study was developed in the framework of a European Union research project entitled "Integrated Process Control: development of an integrated control system to optimize biological carbon and nitrogen removal by wastewater treatment plants".

* Corresponding author. E-mail: [email protected].
1 E-mail: [email protected].

The pilot plant has been assembled at the site of the project leader, the Institute of Biotechnology, Jülich Research Centre, Germany. This paper describes the application of a Fuzzy C-Means (FCM) classification algorithm to the detection of potentially dangerous loading situations. This is based on the analysis of raw input sewage characteristics and subsequent classification into safe and unsafe operating conditions. The ensuing paradigm of high-level control actions is described elsewhere (Marsili-Libelli et al., 1994; Müller et al., 1995). Basically, they consist of a coordination between bypassing the anaerobic digester and/or short-term storing of the

0167-8655/96/$12.00 © 1996 Elsevier Science B.V. All rights reserved. PII S0167-8655(96)00030-X

Fig. 1. Overall scheme of plant and supervisory control.

excess influent. Fig. 1 shows the supervision system driven by the fuzzy pattern recognition algorithm, which will be described in this paper. The role of fuzzy pattern recognition is to detect not only the extent of departure from normal operating conditions but also the nature of the disturbance. This is accomplished by fuzzy clustering of input sewage characteristics into a predefined number of behaviour classes, which are represented by their prototypes.

2. Definitions of behaviour classes

A small pilot digester was used in parallel with the main anaerobic digestion stage to detect the quality of the incoming sewage. Given its small dimension (5 litre), the reaction rate is much faster than that of the full-scale digester and thus it has the function of an early warning system. After a long and thorough experimentation at the pilot plant site, as documented in (Müller et al., 1995), two meaningful variables were selected that yield sufficient information to allow discrimination among a sufficient number of possible critical behaviours. The variables and operational states are listed in Table 1. They refer to the anaerobic digestion stage. In fact this is intrinsically the most critical unit, all the more so since it is placed at the front end of the process chain.

Table 1
Detection variables and critical states for the anaerobic digestion

Detection variables: Biogas production [l/h]; Hydrogen in the off gas [ppm]
Operational states: Normal; Toxic; Overload; Inhibition/Underload


3. A fuzzy clustering algorithm

The measurements from the pilot digester are processed by the fuzzy classifier to determine their similarity with the classes defining the possible plant states. The result of this fuzzification is used to activate the appropriate control rules. The algorithm used in this application is composed of two parts: training and update. In the first part an FCM algorithm is used to classify the data into the four behavioural classes listed in Table 1. Then, during on-line operation, the prototypes are updated to follow the time-varying trend of the operational data. These two aspects will be examined in this section, first by recalling the basic properties of the FCM classifier and then its extension to the adaptive case.

3.1. The FCM classification algorithm

The original idea of using fuzzy sets in clustering techniques, as proposed by Bezdek (1981), consists of weighting a similarity measure (usually the point-to-prototype distance) with a measure of uncertainty. Thus, the classical partition functional to be minimized becomes

$$J_u(C, m) = \sum_{i=1}^{C} \sum_{k=1}^{N} (u_{k,i})^m \, d_{k,i}^2, \tag{1}$$

where $C$ is the number of clusters, $N$ is the number of data points, $u_{k,i}$ is the membership of the $k$th point in the $i$th cluster, and $d_{k,i}$ is the Euclidean distance between the $k$th point $x_k$ and the $i$th prototype $v_i$. The prototype is defined as the fuzzy weighted sum of the data points

$$v_i = \frac{\sum_{k=1}^{N} (u_{k,i})^m \, x_k}{\sum_{k=1}^{N} (u_{k,i})^m}, \qquad i = 1, 2, \dots, C. \tag{2}$$

Since the membership $u_{k,i}$ affects the computation of the cluster centers $v_i$, data with a high membership will attract the prototype more than points with a low membership. Furthermore, since each element is bounded, i.e. $u_{k,i} \in (0, 1)$, increasing $m$ results in a fuzzier, less discriminating partition. The cluster centers $v_i$ represent the set of prototypes, i.e. the values of the typical constituent of each cluster, whereas the $u_{k,i}$ component of the membership matrix

$$\{U : u_{k,i} \in (0, 1),\; k = 1, \dots, N;\; i = 1, \dots, C\}$$

denotes the extent to which the point $x_k$ is similar to the $i$th prototype. The exponent $m \in [1, \infty)$ appearing in Eq. (1) determines the incidence of the memberships $u_{k,i}$ on the partition. The minimization of partition functional (1) with the constraint

$$\sum_{i=1}^{C} u_{k,i} = 1 \tag{3}$$

yields the following expression for the membership (Bezdek, 1981):

$$u_{k,i} = \frac{1}{\sum_{j=1}^{C} \left( \dfrac{d_{k,i}}{d_{k,j}} \right)^{2/(m-1)}}. \tag{4}$$

The iterative algorithm consists of alternately computing $U$ through Eq. (4), the prototypes $V$ through Eq. (2) and the distances $\{d_{k,i}\}$, until $U$ stabilizes. This algorithm has been used to classify an initial set of data in order to build a starting knowledge base. However, in order to adapt the initial knowledge, a learning feature is added to the basic clustering scheme through an on-line classifier.
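The alternating iteration described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration of the standard Bezdek scheme, not the authors' implementation: the function names, the default exponent `m = 2`, the stopping tolerance and the initialization from data points are all assumptions, since the paper does not state them.

```python
import numpy as np

def memberships(X, V, m=2.0, eps=1e-12):
    """Eq. (4): membership of each row of X in each prototype (row of V)."""
    # Squared Euclidean distances d_{k,i}^2, shape (N, C)
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, eps)  # guard against a point sitting exactly on a prototype
    # (d_ki / d_kj)^(2/(m-1)) is equivalent to normalizing d2^(-1/(m-1)) over i
    w = d2 ** (-1.0 / (m - 1.0))
    return w / w.sum(axis=1, keepdims=True)

def prototypes(X, U, m=2.0):
    """Eq. (2): prototypes as membership-weighted averages of the data."""
    W = U ** m  # shape (N, C)
    return (W.T @ X) / W.sum(axis=0)[:, None]

def fcm(X, C, m=2.0, tol=1e-6, max_iter=300, V0=None, seed=0):
    """Alternate Eq. (4) and Eq. (2) until the membership matrix U stabilizes."""
    X = np.asarray(X, dtype=float)
    if V0 is None:
        # Assumed initialization: C distinct data points chosen at random
        rng = np.random.default_rng(seed)
        V = X[rng.choice(len(X), size=C, replace=False)]
    else:
        V = np.array(V0, dtype=float)
    U = memberships(X, V, m)
    for _ in range(max_iter):
        V = prototypes(X, U, m)
        U_new = memberships(X, V, m)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V
```

Calling `fcm(X, C=4)` on an `(N, 2)` array of gas flow and hydrogen readings would return the membership matrix and four prototypes. Note that `m` must be strictly greater than 1 in this sketch, because Eq. (4) is undefined at `m = 1`.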

3.2. An adaptive FCM classifier

It may be convenient to use the knowledge already gained in partitioning a given set to classify more data during on-line operation. This knowledge is condensed into the prototypes $\{v_i,\ i = 1, \dots, C\}$ and can be used to classify subsequent data without having to reprocess the whole data set. This can be done in an economical way if a classifier is developed which retains the information extracted from the previous partition. To design a classifier for a new entry $x_{N+1}$ on the basis of the prior knowledge $\{v_i,\ i = 1, \dots, C\}$, Eq. (4) can be used again to determine the memberships of the new point, after the distances between $x_{N+1}$ and each of the prototypes $v_i$ have been determined. Contrary to the previous iterative solution of Eq. (4), now the prototypes $v_i$


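remain unchanged and Eq. (4) is used only once, to obtain the memberships of the new point

$$u_{N+1,i} = \frac{1}{\sum_{j=1}^{C} \left( \dfrac{d_{N+1,i}}{d_{N+1,j}} \right)^{2/(m-1)}}, \qquad i = 1, \dots, C. \tag{5}$$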

Notice that condition (3), which can now be written as

$$\sum_{i=1}^{C} u_{N+1,i} = 1 \tag{6}$$

implicitly holds because of the way in which Eq. (4) was derived. Thus Eq. (5) is the required classifier yielding the memberships $\{u_{N+1,i};\ i = 1, \dots, C\}$ of the new entry $x_{N+1}$ with respect to the set of predetermined prototypes $\{v_i;\ i = 1, \dots, C\}$.
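As a rough illustration of this one-shot classifier, the sketch below evaluates the memberships of a single new measurement against the four prototypes reported in Table 3. The measurement itself, the function name `classify` and the exponent `m = 2` are hypothetical choices made for the example. The paper does not say whether the two variables were rescaled before computing distances, so raw units are used here as an assumption.

```python
import numpy as np

# Prototypes from Table 3: (gas flow [l/h], hydrogen [ppm])
LABELS = ["Inhibition/Underload", "Toxic", "Overload", "Normal"]
V = np.array([[0.6791, 91.0608],
              [1.3423, 215.9553],
              [2.1124, 280.3017],
              [1.7829, 175.9875]])

def classify(x_new, V, m=2.0, eps=1e-12):
    """Memberships of one new point in fixed prototypes (Eq. (4) applied once)."""
    d2 = ((V - np.asarray(x_new, dtype=float)) ** 2).sum(axis=1)
    w = np.maximum(d2, eps) ** (-1.0 / (m - 1.0))
    return w / w.sum()  # normalization makes the memberships sum to one, as in Eq. (6)

# Hypothetical reading, not taken from the paper's data sets
u = classify([1.3, 220.0], V)
print(dict(zip(LABELS, u.round(3))))
```

Because the prototypes are not touched, the cost of classifying each new sample is a handful of distance evaluations, which is what makes the scheme suitable for on-line use.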

Table 2
Characteristics of the anaerobic digester shock data sets

Experiment codename: Characteristics

Earth1b: Although the overload effect is moderate concerning biogas production, there is a medium to strong organic inhibitory effect on the biological system. It is a critical situation because without human intervention the biology acts like a positive closed loop, enforcing the critical state.

Moon1*, Moon2*, Moon3*: Organic overload: the strength of organic overloads (COD) without any inhibition increases with the sequence of these experiments.

Moon5*: This is a reversible inhibition experiment.

Moon6: Similar to Moon5 but with a very strong toxic experiment, although at the end the acidifying bacteria could recover.

Moon8*, Moon9*: Underload experiments: the COD load was completely cut off for a period of about 14 hours.

Moon10: A strong inhibition experiment, but the biogas flow was little affected. The effect of Na2S is roughly the same as in Moon5.

* The asterisk denotes inclusion in the basic data set.

Table 3
Prototype values resulting from the fuzzy clustering procedure

Cluster                 Gas flow (l/h)   Hydrogen (ppm)
Inhibition/Underload    0.6791           91.0608
Toxic                   1.3423           215.9553
Overload                2.1124           280.3017
Normal                  1.7829           175.9875

This idea was already applied to the classification of ecological data (Marsili-Libelli, 1989), where it was assumed that the classification of the new entry $x_{N+1}$ would not affect the position of the prototypes $v_i$. On the other hand, in many cases it may be necessary to include some trend-following capability, for example in a set of time-varying data, thus making the clustering procedure adaptive. This can be done by reflecting the changing membership values into the prototype locations, which in the case of $N + 1$ data points can be written as

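$$[v_i]_{N+1} = \frac{\sum_{k=1}^{N+1} (u_{k,i})^m x_k}{\sum_{k=1}^{N+1} (u_{k,i})^m} = \frac{\sum_{k=1}^{N} (u_{k,i})^m x_k + (u_{N+1,i})^m x_{N+1}}{\sum_{k=1}^{N} (u_{k,i})^m + (u_{N+1,i})^m} = \frac{S(N) + (u_{N+1,i})^m x_{N+1}}{M(N) + (u_{N+1,i})^m}, \tag{7}$$

where

$$S(N) = \sum_{k=1}^{N} (u_{k,i})^m x_k, \qquad M(N) = \sum_{k=1}^{N} (u_{k,i})^m.$$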

Several options are available to decide whether this change improves the partition or destroys the already established patterns. For this application it is assumed that the basic membership matrix $U$ remains unchanged and each new point is classified with respect to the "core" data used during the initial training phase. In this way long-term on-line

Fig. 2. Adaptation of cluster prototypes as a consequence of new data classification. [Axes: X1, X2. Legend: training data (o), new data (+), centroids (x), trajectory of centroid adaptation.]

operation does not result in unlimited growth of the column dimension of matrix U.

The simple example shown in Fig. 2 considers a set of twelve two-dimensional data points together with a two-cluster partition represented by its prototypes. Using the information contained in this partition (i.e. the prototype coordinates and matrix $U$), a further set of random data is classified sequentially and the prototypes are altered according to Eq. (7). Clearly, the prototypes tend to follow the new entries as they are classified. The new data tend to influence the prototype to which they are closest, where their clustering membership is greatest. Conversely, the farthest cluster is least influenced because the new data receive a small membership there and the displacement of its prototype is therefore limited.
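The prototype update of Eq. (7) can be sketched as follows. The code only needs the two sums $S(N)$ and $M(N)$ from the training partition, so the membership matrix never has to grow. The function names and the exponent `m = 2` are assumptions made for illustration, and this is a sketch of the recursion rather than the authors' code.

```python
import numpy as np

def classify(x_new, V, m=2.0, eps=1e-12):
    """Memberships of one new point in the current prototypes (Eq. (4) applied once)."""
    d2 = ((V - np.asarray(x_new, dtype=float)) ** 2).sum(axis=1)
    w = np.maximum(d2, eps) ** (-1.0 / (m - 1.0))
    return w / w.sum()

def core_sums(X, U, m=2.0):
    """S(N) and M(N) of Eq. (7), computed once from the training partition."""
    W = U ** m                       # shape (N, C)
    return W.T @ X, W.sum(axis=0)    # S: (C, n_features), M: (C,)

def adapt(x_new, V, S, M, m=2.0):
    """Classify x_new against V, then move the prototypes according to Eq. (7)."""
    x_new = np.asarray(x_new, dtype=float)
    u = classify(x_new, V, m)
    w = u ** m
    S_new = S + w[:, None] * x_new   # numerator of Eq. (7)
    M_new = M + w                    # denominator of Eq. (7)
    return u, S_new / M_new[:, None], S_new, M_new
```

Feeding `S_new` and `M_new` back into the next call lets the prototypes accumulate a drift over a sequence of new points, whereas reusing the original `S` and `M` keeps every update relative to the core data only. The text does not fully specify which of the two is intended, so the choice is left to the caller here. In either case the displacement of prototype $i$ is proportional to $(u_{N+1,i})^m$, which is why the nearest prototype moves most.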

4. Application to the anaerobic digestion process

In the context of the complex treatment plant of Fig. 1, several recognition problems arise in conjunction with early detection of input loading disturbance. Since anaerobic digestion is usually the front end unit of a complex wastewater treatment plant, the previous recognition algorithm has been applied to this particular process. Furthermore, early detection of incoming overloads or other kinds of disturbances is very important given the inherent process instability.

A comprehensive class of input disturbances is considered, and their detection is based on the flow

Fig. 3. FCM clustering of input loading of the pilot plant. The data used for clustering are shown as dots, whereas the dashed lines represent the 0.9 alpha cut. [Horizontal axis: gas flow (l/h). Labelled clusters: Toxic, Overload, Inhibition/Underload, Normal.]

Fig. 4. The classification of Moon5 data produces no change in prototype positions. [Horizontal axis: gas flow (l/h). Legend: cluster centers (o), classified data (+).]

and composition of the off gas of a scaled-down anaerobic digester fed with the same sewage entering the real plant. In particular, gas flow and hydrogen concentration were assumed to be the basic indicators of an incoming disturbance.

The data used in this analysis were selected from a collection of shock experiments whose characteristics are summarized in Table 2. They include various kinds of shock, from organic overload to toxic to underload or inhibition. Some of them were used to

Fig. 5. Classification of Moon5 data: the toxic state caused by the shock is clearly recognized. [Four panels: Normal, Overload, Toxic, Inhibition. Horizontal axis: time (h).]

Fig. 6. Moon5 shock data shown with the smoothed subset used for classification. [Two panels against time (h); the toxic shock duration and the smoothed samples (o) are marked.]

Fig. 7. Classification of Earth1b data. The overload prototype is the most affected, but the toxic prototype also shows some adaptation. [Horizontal axis: gas flow (l/h). Legend: cluster centers (o), classified data (+).]

Fig. 8. Membership functions resulting from the Earth1b data classification. [Four panels: Normal, Toxic, Overload, Inhibition. Horizontal axis: time (h).]

generate the clusters, and the remaining part was used in the recognition experiments. Some data were used in both instances, to check the basic recognition features of the algorithm.

Clustering the test data with the basic FCM algorithm described in Section 3.1 produced the prototypes of Table 3.

The position of the prototypes in the gas flow/hydrogen plane is shown in Fig. 3, where a 0.9 alpha cut is also depicted.
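For readers unfamiliar with the term, the 0.9 alpha cut of a cluster is the set of points whose membership in that cluster is at least 0.9. The sketch below tests whether a point falls inside any such region, using the Table 3 prototypes. The helper names, the exponent `m = 2` and the use of unscaled variables are assumptions for illustration, so the regions it implies need not match the contours drawn in Fig. 3.

```python
import numpy as np

LABELS = ["Inhibition/Underload", "Toxic", "Overload", "Normal"]
# Prototypes from Table 3: (gas flow [l/h], hydrogen [ppm])
V = np.array([[0.6791, 91.0608],
              [1.3423, 215.9553],
              [2.1124, 280.3017],
              [1.7829, 175.9875]])

def memberships(points, V, m=2.0, eps=1e-12):
    """Eq. (4) evaluated for one or more points against fixed prototypes."""
    P = np.atleast_2d(np.asarray(points, dtype=float))
    d2 = ((P[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    w = np.maximum(d2, eps) ** (-1.0 / (m - 1.0))
    return w / w.sum(axis=1, keepdims=True)

def alpha_cut_label(x, V, alpha=0.9, m=2.0):
    """Return the class whose alpha cut contains x, or None if no class reaches alpha."""
    u = memberships(x, V, m)[0]
    i = int(np.argmax(u))
    return LABELS[i] if u[i] >= alpha else None
```

Since the memberships of a point sum to one, at most one class can reach a threshold above 0.5, so the function returns either a single label or `None` for points in the ambiguous zone between clusters.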

To illustrate the detection capability, both in the adaptive and in the non-adaptive case, some data sets were processed using the prototype values of Table 3. As a first attempt, the data set Moon5 (strong

Fig. 9. Earth1b data. [Two panels against time (h).]

toxic shock) was used. This was included in the generation of the prototypes, so no adaptation is expected. Since the data were affected by a strong short-term variability due to measuring equipment, smoothed samples were fed to the classifier. The results are shown in Figs. 4, 5 and 6.
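The paper does not say how the smoothed samples were obtained. Purely as an illustration of the preprocessing step, the sketch below uses a moving average followed by subsampling; the method, the function name and the `window` and `step` values are all assumptions and should not be read as the authors' procedure.

```python
import numpy as np

def smooth_and_subsample(y, window=5, step=5):
    """Moving-average smoothing followed by subsampling (assumed preprocessing)."""
    y = np.asarray(y, dtype=float)
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the full window fits,
    # so the result has len(y) - window + 1 samples
    smoothed = np.convolve(y, kernel, mode="valid")
    return smoothed[::step]
```

Each retained sample would then be passed to the classifier in place of the raw reading, which damps short-term measurement noise before memberships are computed.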

The data set Earth1b was not included in the initial clustering, and since it contains data quite different from the original training set, it is expected to generate some degree of adaptation. In fact, classification of these data produces an appreciable degree of cluster displacement (Fig. 7). The overload prototype is the most affected, but the toxic prototype is also somewhat modified. This experiment has mixed features of overload and toxic nature. In fact, it represents an organic inhibition state. Owing to the overload, the propionic acid concentration rose and caused a decrease in activity. Although the effect is moderate concerning biogas production, it is a medium to strong organic inhibitory effect on the biological system. It is a critical situation because without human intervention the biology of the microorganisms acts as a positive closed loop, reinforcing the critical state.

Fig. 8 shows that during the initial part of the shock an overload state is detected, whereas in the trailing part the ensuing effect of organic inhibition is classified as a toxic state, since it caused a decrease in biological activity.

As a concluding remark of this section, it should be noticed that adaptation can be a useful feature in classification algorithms designed to work on-line for the detection of changing conditions. In this case some degree of adaptability is required to track data sets which are significantly different from the training set, which would be otherwise wrongly classified. More experimentation is currently under way to complete the testing of the algorithm with a wider data base.

5. Conclusion

The basic FCM algorithm was enhanced with an adaptive capability. This was achieved by designing a classifier to obtain the fuzzy membership of a new data point and allowing the prototypes to be altered according to this new entry. In this way a two-step adaptive algorithm was designed to keep track of changing data patterns. The algorithm was tested on the recognition of input disturbances in the anaerobic digestion process, described either by bicarbonate alkalinity or off gas flow and composition. In either case a fuzzy classification of disturbances was possible and the adaptive detection proved an efficient way to recognize the kind and extent of incoming changes. More experimentation is currently under way to complete the testing of the algorithm with a wider data base.

Acknowledgements

This research was supported by the European Community grant no. EV5V-CT92-0233 "Integrated Process Control: development of an integrated control system to optimize biological carbon and nitrogen removal by wastewater treatment plants".

References

Bezdek, J.C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York.

Marsili-Libelli, S. (1989). Fuzzy clustering of ecological data. Coenoses 2, 95-106.

Marsili-Libelli, S., S. Beni, P. Cianchi and P. Quercioli (1994). High level fuzzy controller for a complex wastewater treatment process. Proc. 8th Forum of Applied Biotechnology, Brugge, 28-30 September 1994, 2071-2079.

Marsili-Libelli, S. and S. Beni (1995). Shock load modelling of the anaerobic digestion process. Ecol. Modelling, in press.

Müller, A., S. Marsili-Libelli, A. Aivasidis, and Ch. Wandrey (1995). Development of an integrated process control system for a multi-staged wastewater treatment plant. Part II: High level fuzzy control. Proc. 9th Forum for Applied Biotechnology, Gent, 25-27 September 1995, 2467-2473.