[lecture notes in computer science] artificial intelligence and computational intelligence volume...

Local Model Networks for the Optimization of a Tablet Production Process*

Benjamin Hartmann1, Oliver Nelles1, Ales Belic2, and Damjana Zupancic–Bozic3

1 University Siegen, Department of Mechanical Engineering, D–57068 Siegen, Germany
{benjamin.hartmann,oliver.nelles}@uni-siegen.de

2 University Ljubljana, Department of Electrical Engineering, SLO–1000 Ljubljana, [email protected]

3 KRKA d.d., SLO–8501 Novo Mesto, [email protected]

Abstract. The calibration of a tablet press machine requires comprehensive experiments and is therefore expensive and time-consuming. In order to optimize the process parameters of a tablet press machine on the basis of measured data, this paper presents a new approach based on local model networks. The goal of the model-based optimization was to improve the quality of the produced tablets, i.e. to reduce the capping occurrence, the variation of the tablet mass, and the variation of the crushing strength. Modeling and optimization of the tablet process parameters show that it is possible to find process settings for the tabletting of non-preprocessed powder such that a sufficient quality of the tablets can be achieved.

1 Introduction

The pharmaceutical industry is increasingly aware of the advantages of implementing a quality-by-design (QbD) principle, including process analytical technology (PAT), in drug development and manufacturing [1], [2], [3]. Although the implementation of QbD in product development and manufacturing inevitably requires large resources, both human and financial, large-scale production can be established in a more cost-effective manner and with improved efficiency and product quality.

Tablets are the most common pharmaceutical dosage form. They are prepared by compressing a dry mixture of powders, consisting of active ingredient and excipients, into solid compacts. The process of tabletting consists of three stages: a) filling, where the powder mixture is filled into the die; b) compaction, where the powder is compressed inside a die by two punches, resulting in plastic and elastic

* This work was supported by the German Research Foundation (DFG), grant NE656/3–1.

H. Deng et al. (Eds.): AICI 2009, LNAI 5855, pp. 241–250, 2009. © Springer-Verlag Berlin Heidelberg 2009


deformation and/or particle fragmentation, and c) ejection, where the tablet is ejected from the die and elastic recovery of the tablet may occur. An intensive elastic recovery can lead to the separation of the upper part of a tablet from the tablet body (capping). The mechanical behavior of powders during tabletting and the quality of tablets depend on the powder characteristics (formulation) and the tabletting parameters on the tablet press machine [4], [5], [6], [7]. In order to assure a high and repeatable product quality, the processes and formulations should be optimized. This is particularly important on high-capacity rotary tablet presses, which can produce a few hundred thousand tablets per hour and are very sensitive to product and process variables. A special challenge is the large-scale production of tablets with a high active drug loading (> 50% of the total mass of a tablet). Active drugs, being organic molecules, very often have inappropriate physical properties that prevent direct compression of powder mixtures, which is the most economical production method (shortest process time, minimum cleaning and validation need, low energy consumption, environmentally friendly). The problem of capping is often observed in the production of tablets with high drug loading. This problem can be solved either by formulation or by process optimization. To find the optimal set of parameters (main compression force, precompression force, compression speed, ...) on the tablet press, it is possible to make a few tablets for each set of parameters; the operator then decides which settings should work best and starts the production. Fine optimization of the machine setup must still be performed because of batch-to-batch variations of raw materials or other variables. During the setup procedure, many of the tablets do not pass the quality control and must hence be discarded, which can be very expensive considering the price of the raw material and the number of trials needed to find optimal settings. To optimize the quality of tablets and to reduce the number of faulty tablets, modeling and simulation procedures can be used, for which a model of the quality parameters with respect to the process parameters must be developed. The raw materials used in tests should cover all the characteristics of raw materials that can be expected to appear in production. With the model, optimal settings of the machine can be found with respect to raw material characteristics without further loss of raw material. The aim of the present study was the optimization of the tabletting process in order to diminish capping occurrence and the variation of tablet mass and crushing strength. Optimization was performed for a product in which active ingredients represent 70% of the tablet weight, on a high-capacity rotary-tablet press, in the standard production environment. Local model networks were used to model the relation between quality and process parameters.

2 Production of Model Tablets

Due to the high drug content and the chosen manufacturing procedure (direct compression), changes in raw material characteristics can result in variable crushing strength of tablets, large tablet mass variation and, most problematic, intense capping problems. If problems in direct tabletting are too severe, dry granulation can be used as an alternative to improve the particle compression characteristics. Dry granulation means the aggregation of smaller particles using force. The compacts are milled and sieved afterwards. The next step is tabletting. Three types of mixtures for tabletting were prepared:

A mixture for direct tabletting (Type: “Direkt” - 1 sample),

B granulate prepared by slugging (precompression on a rotary-tablet press), using different parameters of tabletting speed and compression force (Type: “Briket” - 4 samples),

C granulate prepared by precompressing on a roller compactor using different parameters of compacting speed and force (Type: “Kompakt” - 4 samples).

The composition of all powder mixtures was the same, and milling and sieving of compacts was performed on the same equipment. After dry granulation, the particle size distribution is normally shifted toward bigger particles, and capping occurrence is often decreased due to a change in particle characteristics (flowability and compression characteristics). Nine samples were tableted on a rotary-tablet press using different combinations of the following parameters: compression speed v, precompression force pp, main compression force P. Each of these parameters was set at three levels. Tablets were evaluated according to capping occurrence, crushing strength and tablet mass variability [8].

2.1 Capping Coefficient

The capping occurrence (capping coefficient, CC) was calculated as the ratio between the number of tablets that have a tendency to cap and the number of tablets that have been observed. It was assessed visually on each tablet after a classical tablet hardness test (Erweka). A tablet was considered to have a capping tendency if the upper part of the tablet fell off the tablet body or if a distinct step-like form appeared on the fracture surface of the tablet.
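The ratio defined above can be written down in a few lines. The following is a minimal sketch; the function name `capping_coefficient` and the inspection results are hypothetical and only illustrate the calculation, they are not measured data.

```python
def capping_coefficient(capped_flags):
    """Fraction of inspected tablets that show a capping tendency."""
    flags = list(capped_flags)
    if not flags:
        raise ValueError("at least one inspected tablet is required")
    return sum(1 for f in flags if f) / len(flags)


# Hypothetical visual inspection of ten tablets (True = capping tendency).
observed = [True, False, False, True, False, False, False, False, True, False]
cc = capping_coefficient(observed)
```

With three of ten tablets flagged, the coefficient is the plain ratio of the two counts.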

2.2 Experiments on the Rotary-Tablet Press

For experiments on the rotary-tablet press, three groups of raw materials were used (“Direkt”, “Briket”, “Kompakt”). The types Kompakt and Briket were further divided into four sub-groups depending on the combination of compression force and speed during slugging or roller compaction. For each of these raw materials, several combinations of settings on the rotary-tablet press were used (tabletting speed, main compression force, precompression force). The produced tablets were inspected afterwards with respect to CC, average crushing strength, standard deviation of crushing strength, average tablet mass and standard deviation of tablet mass. For each raw material and parameter setting, 10 tablets were made and analyzed. Altogether, 76 data points were measured.


3 Modeling

To model the relation between process parameters, raw material characteristics, and the quality of the tabletting process (CC, standard deviation of mass, standard deviation of crushing strength), it was not reasonable, considering the aim, to use physical laws that describe the operation of the tabletting machine. The system characteristics depend on many stochastic processes (particle rearrangement and deformation type, friction, bond forming between particles, ...); therefore, a physical model would be too complex for process parameter optimization. On the other hand, a relatively large database is available; hence, it is more appropriate to use data-based computational intelligence methods instead.

3.1 Local Model Networks

To model the system characteristics, a local model network approach [9], [10], [11] was used with two to four inputs, depending on the output. For each output (CC, standard deviation of crushing strength, standard deviation of mass) a separate neuro-fuzzy model was constructed.

The output y of a local model network with p inputs u = [u_1 u_2 ··· u_p]^T can be calculated as the interpolation of M local model outputs y_i, i = 1, ..., M [12],

$$ y = \sum_{i=1}^{M} y_i(u)\,\Phi_i(u) \tag{1} $$

where the Φi(·) are called interpolation, validity, or weighting functions. These validity functions describe the regions where the local models are valid; they describe the contribution of each local model to the output. From the fuzzy logic point of view, (1) realizes a set of M fuzzy rules where the Φi(·) represent the rule premises and the yi are the associated rule consequents. Because a smooth transition (no switching) between the local models is desired here, the validity functions are smooth functions between 0 and 1. For a reasonable interpretation of local model networks it is furthermore necessary that the validity functions form a partition of unity:

$$ \sum_{i=1}^{M} \Phi_i(u) = 1 . \tag{2} $$

Thus, everywhere in the input space the contributions of all local models sum up to 100%. In principle, the local models can be chosen of arbitrary type. If their parameters are to be estimated from data, however, it is extremely beneficial to choose a linearly parameterized model class. The most common choice is polynomials. Polynomials of degree 0 (constants) yield a neuro-fuzzy system with singletons or a normalized radial basis function network. Polynomials of degree 1 (linear) yield local linear model structures, which is by far the most popular choice. As the degree of the polynomials increases, the number of local models required for a certain accuracy decreases. Thus, by increasing the local models’ complexity, at some point a polynomial of high degree with just one local model (M = 1) is obtained, which is in fact equivalent to a global polynomial model (Φ1(·) = 1).

Besides the possibility of transferring parts of mature linear theory to the nonlinear world, local linear models seem to represent a good trade-off between the required number of local models and the complexity of the local models themselves. This paper deals only with local models of linear type:

$$ y_i(u) = w_{i,0} + w_{i,1} u_1 + w_{i,2} u_2 + \ldots + w_{i,p} u_p . \tag{3} $$

However, an extension to higher degree polynomials or other linearly parameterized model classes is straightforward.
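To make Eqs. (1) to (3) concrete, the following sketch evaluates a small local linear model network. The text does not state the functional form of the validity functions; normalized Gaussians are assumed here, and the centers, widths and weights are hypothetical values chosen only for illustration.

```python
import math


def validity_functions(u, centers, sigmas):
    """Normalized Gaussian validity functions (assumed form); they sum to 1 as in Eq. (2)."""
    raw = []
    for c, s in zip(centers, sigmas):
        dist2 = sum(((uj - cj) / sj) ** 2 for uj, cj, sj in zip(u, c, s))
        raw.append(math.exp(-0.5 * dist2))
    total = sum(raw)
    return [r / total for r in raw]


def local_linear_output(u, w):
    """Local linear model y_i(u) = w_0 + w_1 u_1 + ... + w_p u_p, Eq. (3)."""
    return w[0] + sum(wj * uj for wj, uj in zip(w[1:], u))


def lmn_output(u, centers, sigmas, weights):
    """Network output as validity-weighted sum of local model outputs, Eq. (1)."""
    phi = validity_functions(u, centers, sigmas)
    return sum(p * local_linear_output(u, w) for p, w in zip(phi, weights))


# Hypothetical network: M = 2 local models, p = 2 inputs normalized to [0, 1].
centers = [[0.25, 0.5], [0.75, 0.5]]
sigmas = [[0.2, 0.4], [0.2, 0.4]]
weights = [[0.1, 0.5, 0.0], [0.4, -0.2, 0.3]]

u = [0.5, 0.5]
phi = validity_functions(u, centers, sigmas)
y = lmn_output(u, centers, sigmas, weights)
```

At the midpoint between the two centers both local models contribute equally, so the network output is the average of the two local linear predictions.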

3.2 Data Pre-processing

Each raw material batch has a unique composition of particles, since its production cannot be exactly repeated. The raw material characteristics can be characterized by the particle size distribution. The model outputs are a function of the tablet composition, the main compression pressure P, the pre-compression pressure pp, and the tabletting speed v. The tablet composition and the raw material characteristics, respectively, are given by the particle size distribution. The particle sizes were separated into the following 8 subgroups: 0-0.045, 0.045-0.071, 0.071-0.125, 0.125-0.25, 0.25-0.5, 0.5-0.71, 0.71-1.0, and 1.0-1.25 mm. As training data, 9 different tablet compositions were available. Figure 1 shows the correlation between the standard deviation σ of the particle size distribution and the mean values µ. Due to the sparse training data, we assumed that the raw material characteristic is uniquely described by the mean value µ of the particle size distribution. For simplicity, the inputs and outputs were normalized to the interval [0, 1].
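The pre-processing can be sketched as follows. How the mean value µ was computed from the size classes is not stated in the text; the sketch assumes a mass-fraction-weighted average of class midpoints, and the three batches are hypothetical, not measured data. Only the class boundaries are taken from the paragraph above.

```python
# Size-class boundaries in mm, taken from the text.
CLASS_EDGES_MM = [0.0, 0.045, 0.071, 0.125, 0.25, 0.5, 0.71, 1.0, 1.25]


def mean_particle_size(mass_fractions):
    """Mean size from class midpoints weighted by mass fraction (assumed estimator)."""
    if len(mass_fractions) != len(CLASS_EDGES_MM) - 1:
        raise ValueError("expected one fraction per size class")
    midpoints = [(lo + hi) / 2 for lo, hi in zip(CLASS_EDGES_MM, CLASS_EDGES_MM[1:])]
    total = sum(mass_fractions)
    return sum(f * m for f, m in zip(mass_fractions, midpoints)) / total


def min_max_normalize(values):
    """Scale a list of values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant sequence")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical mass fractions for three batches, from fine to coarse.
batches = [
    [0.30, 0.20, 0.20, 0.15, 0.10, 0.03, 0.02, 0.00],
    [0.10, 0.10, 0.20, 0.25, 0.20, 0.10, 0.04, 0.01],
    [0.05, 0.05, 0.10, 0.20, 0.30, 0.15, 0.10, 0.05],
]
mu_raw = [mean_particle_size(b) for b in batches]
mu = min_max_normalize(mu_raw)
```

After normalization the finest batch maps to 0 and the coarsest to 1, which matches the range of µ used for the model inputs.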

3.3 Model Structure

Due to available prior knowledge, it can be assumed that the three different model outputs which represent the tablet quality depend on different sets of input variables. Therefore it is advantageous to build three individual MISO (multiple-input single-output) models, each with the relevant inputs. Using a single MIMO (multiple-input multiple-output) model would discard this knowledge and thus would yield suboptimal results; compare Fig. 2. The task is to model the following three outputs: the capping coefficient CC, the standard deviation of the tablet mass σm, and the standard deviation of the crushing strength σcs.

Fig. 1. Correlation between the standard deviation σ of the particle size distribution and the mean values µ

Fig. 2. Model structure. One model for each output dimension.
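The separation into three MISO models amounts to giving each output its own input subset. The sketch below encodes the subsets as listed in the captions of Figs. 4, 5 and 7; the dictionary layout, the helper `select_inputs` and the operating point are hypothetical and serve only as illustration.

```python
# Inputs per output model, as listed in the captions of Figs. 4, 5 and 7.
MODEL_INPUTS = {
    "sigma_m": ("mu", "P", "pp", "v"),
    "sigma_cs": ("mu", "P", "pp"),
    "CC": ("mu", "P"),
}


def select_inputs(sample, output_name):
    """Return the input vector that the MISO model for `output_name` receives."""
    return [sample[name] for name in MODEL_INPUTS[output_name]]


# Hypothetical normalized operating point (illustrative values only).
sample = {"mu": 0.86, "P": 0.01, "pp": 0.70, "v": 1.0}
u_cc = select_inputs(sample, "CC")
u_sigma_m = select_inputs(sample, "sigma_m")
```

Each model then sees only the variables that prior knowledge marks as relevant, which is the information a single MIMO model would not exploit.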

4 Results

First, the local model networks were designed and validated. Next, the models were used to find the optimal process parameters. Furthermore, we investigated the robustness of the optimization with respect to variations in the training data.

4.1 Modeling

The training of the local model networks was performed with the LOLIMOT algorithm proposed in [9]. One advantage of this training algorithm, among others, is that the training has to be carried out only once to get reliable and reproducible results. For the final training the whole data set was used since, considering the rather small dimensionality of the problem, the nonlinear input-output relationship could be visually monitored in order to check for potential overfitting. The identified relations can be seen in Figs. 4 to 7. For modeling the standard deviation of the mass σm, 3 local models (LMs) were used. The standard deviation of the crushing strength σcs was modeled with 4 LMs and the capping coefficient CC with 3 LMs.

4.2 Optimization

Fig. 3. Optimization of the tablet press parameters

Once the models for the tabletting process parameters have been built, the question arises: how should they be utilized for optimization? Figure 3 illustrates the optimization scheme used. Basically, the manipulated variables that are model inputs are varied by an optimization technique until their optimal values are found. Optimality is measured in terms of a loss function that depends on the tabletting process parameters, i.e. the model outputs. When the operating-point-dependent optimal input values are found, they can be stored in a static map that represents a feedforward controller, which can be realized either as a look-up table (since it has only two inputs) or as a neural network. To convert the multi-objective optimization problem into a single-objective problem, the loss function was formulated as follows:

$$ J(\sigma_m, \sigma_{cs}, CC) = w_1 \sigma_m + w_2 \sigma_{cs} + w_3\,CC \;\longrightarrow\; \min_{P,\,p_p} \quad \text{with } 0 \le P \le 1,\; 0 \le p_p \le 1 . \tag{4} $$

This method is called the weighted-sum method, where the coefficients wi of the linear combination are called weights [13], [14]. All relevant model outputs are directly incorporated, and the loss function is chosen as a weighted sum of the tablet process parameters. Based on expert knowledge, the weights were set to w1 = 0.25, w2 = 0.25 and w3 = 0.5. For optimization the BFGS quasi-Newton algorithm [15] was used. The results are summarized in Table 1. Figure 6 shows the shapes of the loss functions for the three tablet types “Direkt”, “Kompakt” and “Briket”. Figure 8 illustrates the variation of the optimal process parameters if single data points are omitted from the training dataset. Because of the sparse dataset, omitting critical data points can lead to different shapes of the loss function. Therefore outliers can be observed in the boxplots. However, the median values agree satisfactorily with the results generated with the whole dataset (Table 1).
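The weighted-sum loss of Eq. (4) and its minimization over the box 0 ≤ P ≤ 1, 0 ≤ pp ≤ 1 can be sketched as follows. Two things in this sketch are assumptions and not part of the study: the three surrogate functions are hypothetical quadratic stand-ins for the trained local model networks, and an exhaustive grid search is swapped in for the BFGS quasi-Newton algorithm purely to keep the example free of external dependencies. Only the weights are taken from the text.

```python
# Weights from the text: w1 = 0.25, w2 = 0.25, w3 = 0.5.
WEIGHTS = (0.25, 0.25, 0.5)


# Hypothetical smooth surrogates standing in for the three trained models.
def sigma_m_model(P, pp):
    return 0.2 + 0.3 * (P - 0.2) ** 2 + 0.1 * (pp - 0.6) ** 2


def sigma_cs_model(P, pp):
    return 0.3 + 0.2 * (P - 0.1) ** 2 + 0.2 * (pp - 0.8) ** 2


def cc_model(P, pp):
    return 0.1 + 0.5 * P ** 2 + 0.1 * (pp - 0.7) ** 2


def loss(P, pp, weights=WEIGHTS):
    """Weighted-sum loss J of Eq. (4)."""
    w1, w2, w3 = weights
    return (w1 * sigma_m_model(P, pp)
            + w2 * sigma_cs_model(P, pp)
            + w3 * cc_model(P, pp))


def grid_minimize(fn, steps=100):
    """Exhaustive search on a regular grid over the box 0 <= P, pp <= 1."""
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1):
            P, pp = i / steps, j / steps
            value = fn(P, pp)
            if best is None or value < best[0]:
                best = (value, P, pp)
    return best


J_opt, P_opt, pp_opt = grid_minimize(loss)
```

Because the search only visits points inside the box, the constraints of Eq. (4) hold by construction. A gradient-based method such as BFGS would reach the minimum with far fewer model evaluations, which matters once the surrogates are replaced by real models.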


Fig. 4. Model plot that shows the relations between the inputs (mean value of the tablet particle sizes µ, main compression force P, pre-compression force pp, tabletting speed v) and the output (standard deviation of tablet mass σm). The red dots are measured data points.

Fig. 5. Model plot that shows the relations between the inputs (mean value of the tablet particle sizes µ, main compression force P, pre-compression force pp) and the output (standard deviation of the crushing strength σcs)

Fig. 6. Loss functions with respect to the raw material parameter µ = 0, 0.86, 1, main compression force P, pre-compression force pp and tabletting speed v = 1

Fig. 7. Model plot that shows the relations between the inputs (mean value of the tablet particle sizes µ, main compression force P) and the output (capping coefficient CC)

Fig. 8. Boxplots that illustrate the variation of the optimal process parameter values when single data points are omitted

Table 1. Results of the optimization

µ       P       pp
0       0.21    0.69
0.5     0.04    0.73
0.86    0.01    0.70
1       0.03    0.74

5 Conclusion

This paper presents a local model network approach for the model-based optimization of the parameters of a tablet press machine. Several experiments on the tablet press machine with different tablet and process parameters made it possible to create a model that generalizes well. Due to their high flexibility and robustness with respect to sparse and noisy data, local model networks are well suited for this kind of modeling problem. Model plots show the applicability of the selected approach. Modeling and simulation indicate that it is possible to find process settings for the tabletting of non-preprocessed powder such that a sufficient quality of tablets can be achieved.

References

1. Frake, P., Greenhalgh, D., Gierson, S., Hempenstall, J., Rudd, D.: Process control and end-point determination of a fluid bed granulation by application of near infra-red spectroscopy. International Journal of Pharmaceutics 1(151), 75–80 (1997)

2. Informa: PAT - Quality by design and process improvement, Amsterdam (2007)

3. Juran, J.: Juran on Quality by Design. The Free Press, New York (1992)

4. Sebhatu, T., Ahlneck, C., Alderborn, G.: The effect of moisture content on the compression and bond-formation properties of amorphous lactose particles. International Journal of Pharmaceutics 1(146), 101–114 (1997)

5. Rios, M.: Developments in powder flow testing. Pharmaceutical Technology 2(30), 38–49 (2006)

6. Sorensen, A., Sonnergaard, J., Hocgaard, L.: Bulk characterization of pharmaceutical powders by low-pressure compression II: Effect of method settings and particle size. Pharmaceutical Development and Technology 2(11), 235–241 (2006)

7. Zhang, Y., Law, Y., Chakrabarti, S.: Physical properties and compact analysis of commonly used direct compression binders. AAPS Pharm. Sci. Tech. 4(4) (2003)

8. Belic, A., Zupancic, D., Skrjanc, I., Vrecer, F., Karba, R.: Artificial neural networks for optimisation of tablet production

9. Nelles, O.: Axes-oblique partitioning strategies for local model networks. In: International Symposium on Intelligent Control (ISIC), Munich, Germany (October 2006)

10. Hartmann, B., Nelles, O.: On the smoothness in local model networks. In: American Control Conference (ACC), St. Louis, USA (June 2009)

11. Hartmann, B., Nelles, O.: Automatic adjustment of the transition between local models in a hierarchical structure identification algorithm. In: IFAC Symposium on System Identification (SYSID), Budapest, Hungary (August 2009)

12. Nelles, O.: Nonlinear System Identification. Springer, Berlin (2001)

13. Chong, E., Zak, S.: An Introduction to Optimization. John Wiley & Sons, New Jersey (2008)

14. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)

15. Scales, L.: Introduction to Non-Linear Optimization. Computer and Science Series. Macmillan, London (1985)
