

Multiobjective Tabu Search Method Used in Chemistry

T. RUSU,1 V. BULACOVSCHI2

1 P. Poni Institute of Macromolecular Chemistry, Iasi, Romania
2 Gh. Asachi Technical University, Faculty of Chemistry, Iasi, Romania

Received 28 October 2004; accepted 21 June 2005
Published online 12 December 2005 in Wiley InterScience (www.interscience.wiley.com).
DOI 10.1002/qua.20898

ABSTRACT: The use of a combined artificial intelligence method in macromolecular chemistry design is described. The method combines a back-propagation (BP) neural network, modified for two-dimensional input data, with a genetic algorithm extended by a Tabu Search operator that incorporates high-level chemical knowledge: thermodynamic polymer properties. © 2005 Wiley Periodicals, Inc. Int J Quantum Chem 106: 1406–1412, 2006

Key words: artificial intelligence in chemistry; polymer design; imposed properties; Tabu Search method

Introduction

Modern trends in macromolecular chemistry consider artificial intelligence (AI) methods as an alternative in the design of new molecules with imposed physical, chemical, and biological properties. According to Venkatasubramanian et al. [1], computer-aided molecular design (CAMD) requires solutions for two problems: (i) computation of physical, chemical, and biological properties from molecular structure; and (ii) identification of an appropriate molecular structure for the desired/imposed properties.

Venkatasubramanian and colleagues proposed an evolutionary molecular design approach, using genetic algorithms to solve the inverse problem. To apply evolutionary techniques to computer-aided molecular design, two things must be known to some extent: the desired properties and how they relate to the molecular structure. The structure–activity relationship is needed both to determine the necessary properties and to build a molecule that has those properties. This can be described as a structure-to-property stage and a property-to-structure stage. The first stage determines the properties of a molecule from its structure. The second stage builds a structure from the desired properties. Venkatasubramanian extended the initial approach by describing a framework that uses both neural networks and genetic algorithms [2] to address difficulties in both the forward and inverse problems. The neural network methodology for predicting a property addresses the forward problem, while the genetic algorithm component was selected for the inverse problem.

Correspondence to: T. Rusu; e-mail: [email protected]

International Journal of Quantum Chemistry, Vol 106, 1406–1412 (2006)
© 2005 Wiley Periodicals, Inc.


In this study, we present a hybrid algorithm for the inverse problem, built on the genetic algorithm structure with a Tabu Search crossover generator. For the forward problem, we selected a “back-propagation” algorithm, which does not represent any particular network architecture (a multilayered net is generally used) but rather a special learning process. System performance is evaluated by comparing the degree of similarity (DS) obtained with the simple genetic algorithm and with the hybrid algorithm (GA-TS).

Artificial Intelligence Method

Starting from the Venkatasubramanian scheme, we suggest a hybrid method for computer-based macromolecular design (Fig. 1). For the inverse problem, we propose an evolutionary molecular design approach using genetic algorithms combined with a Tabu Search operator (GA-TS). Genetic algorithms (GA) can overcome potential barriers in some cases, but in most cases the GA still becomes trapped in local potential wells. Tabu Search (TS) appears superior to GA in escaping from local minima; however, it converges relatively slowly, especially near the best solutions. Given these complementary merits and shortcomings, a hybrid algorithm combining GA with TS is proposed. The benefit of using the TS operator combined with GA, versus the simple GA approach, is also evaluated in this work.

NEURAL NETWORK APPROACH

Neural network applications to chemistry have been explored in recent years. These include systems for predicting secondary protein structure [3], QSAR parameters to determine biological activities [4], functional group identification from mass and infrared spectra [5], and connection tables for predicting the chemical composition of aromatic substitution reactions [6].

The essential abilities and the flexibility of neural networks are related to the interconnection of individual arithmetic units. Many kinds of networking strategies have been investigated. However, for our work we have considered a “back-propagation” algorithm that does not represent any particular network architecture (a multilayered net is generally used), but is rather a special learning process (Fig. 2). Learning through back-propagation stems from the fact that adjustments to the neural net weights can be calculated on the basis of well-defined equations.

If $w_{ij}(n)$ represents the weight of neuron $j$ before the fit and $w_{ij}(n+1)$ the weight after the fitting procedure, Hebb’s law for learning is

\[ w_{ij}(n+1) = w_{ij}(n) + c\,y_i y_j, \tag{1} \]

where $y_i$ is the output of neuron $i$ (the input for $j$) and $y_j$ is the output of neuron $j$:

\[ w_{ij}(n+1) = w_{ij}(n) + c\,S(y_i)\,S(y_j). \tag{2} \]
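The Hebbian update of Eqs. (1) and (2) can be illustrated with a minimal Python sketch. The function name `hebb_update`, the learning constant `c = 0.1`, and the use of `tanh` as the squashing function S are assumptions made only for illustration; the text does not specify them.

```python
import math

def hebb_update(w, y_i, y_j, c=0.1, squash=None):
    """Return w_ij(n+1) from w_ij(n).

    Without `squash` this is Eq. (1): w + c * y_i * y_j.
    With `squash` (a stand-in for S) it is Eq. (2): w + c * S(y_i) * S(y_j).
    """
    if squash is not None:
        y_i, y_j = squash(y_i), squash(y_j)
    return w + c * y_i * y_j

w_old = 0.5
# Eq. (1): the weight grows when both neurons are active together.
w_plain = hebb_update(w_old, y_i=1.0, y_j=0.8)
# Eq. (2) with an assumed squashing function S = tanh.
w_squashed = hebb_update(w_old, y_i=1.0, y_j=0.8, squash=math.tanh)
```

With a bounded squashing function such as `tanh`, the increment in Eq. (2) is smaller than in Eq. (1) for the same outputs, which limits weight growth.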

If more than one neuron is activated, the network forms local clusters, which depend on the input signals. The equation for the output of $j$ will be

\[ y_j = \sigma\Bigl( I_j + \sum_{k=-K}^{K} c_{j,k}\, y_{j+k} \Bigr), \tag{3} \]

where $I_j$ is the stimulus of $j$, $I_j = \sum_{i=1}^{m} w_{ji} x_i$, and $\sigma$ is a nonlinear function.

FIGURE 1. Hybrid method for computer-based polymer design.

FIGURE 2. Neural Network representation.


The back-propagation algorithm may be used for one- or multilayered networks and is a supervised learning process.
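As a rough illustration of supervised learning by back-propagation, the following Python sketch trains a small network with one hidden layer on a toy one-dimensional data set. Everything here is assumed for illustration (the function name `train_backprop`, the `tanh` hidden units, the layer size, the learning rate, and the toy targets); it is not the modified two-dimensional-input network used in this work.

```python
import math
import random

def train_backprop(samples, hidden=4, lr=0.1, epochs=500, seed=0):
    """Train a 1-input, 1-output net with one tanh hidden layer.

    Returns the mean of 0.5 * error**2 recorded at every epoch.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # input -> hidden
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden -> output
    b2 = 0.0
    n = len(samples)
    losses = []
    for _ in range(epochs):
        g_w1, g_b1, g_w2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden
        g_b2 = 0.0
        loss = 0.0
        for x, target in samples:
            # Forward pass.
            h = [math.tanh(w1[k] * x + b1[k]) for k in range(hidden)]
            y = sum(w2[k] * h[k] for k in range(hidden)) + b2
            err = y - target
            loss += 0.5 * err * err
            # Backward pass: propagate the output error to every weight.
            g_b2 += err
            for k in range(hidden):
                g_w2[k] += err * h[k]
                delta = err * w2[k] * (1.0 - h[k] ** 2)  # error at hidden unit k
                g_w1[k] += delta * x
                g_b1[k] += delta
        losses.append(loss / n)
        # Gradient-descent update with the averaged gradients.
        for k in range(hidden):
            w1[k] -= lr * g_w1[k] / n
            b1[k] -= lr * g_b1[k] / n
            w2[k] -= lr * g_w2[k] / n
        b2 -= lr * g_b2 / n
    return losses

# Toy supervised data (assumed): learn y = x**2 on eleven points in [-1, 1].
samples = [(i / 5.0 - 1.0, (i / 5.0 - 1.0) ** 2) for i in range(11)]
losses = train_backprop(samples)
```

The weight adjustments follow directly from the chain rule, which is the sense in which they "can be calculated on the basis of well-defined equations".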

OVERVIEW OF TABU SEARCH METHOD

The Tabu Search is fairly new [7], and the method is still actively investigated, undergoing continuous evolution and improvement. The Tabu Search proceeds on the supposition that there is no point in accepting a new solution unless doing so avoids a path that has already been investigated. This ensures that new regions of the problem solution space will be investigated, with the goal of avoiding local minima and ultimately finding the desired solution.

To pursue this goal, the method begins by marching to a local minimum. To avoid retracing the steps used, the method records the recent moves in one or more Tabu lists. The original aim of the list was not to prevent a previous move from being repeated, but rather to ensure that it was not reversed. The Tabu lists are historical in nature and form the Tabu Search memory. The role of the memory may change as the algorithm proceeds. At the beginning, one makes a rough examination of the solution space, known as “diversification.” After the candidate locations are identified, the search becomes more focused, to produce locally optimal solutions in an “intensification” process (Fig. 3).

In the first stage, an initial solution is specified or randomly generated at the start of the iterations; then, some moves are generated from the current solution. Each of these moves is evaluated using a neighborhood function, and the moves are ranked in order, with the best move at the head of the list. Moves are considered Tabu if they are not different enough from the solutions in the Tabu list. The best move will be accepted if it is better than the other solutions in the Tabu list.

A neighborhood function is a map that defines, for each solution $x$, a subset of solutions $N(x)$ called a neighborhood. Each solution in $N(x)$ is called a neighbor of $x$:

\[ N : S \to 2^{E}. \tag{4} \]
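The procedure described above (ranking the neighbors of the current solution, rejecting Tabu moves, and keeping the best solution found) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the GA-TS operator of this work: the Tabu list stores recently visited solutions rather than move attributes, a Tabu neighbor is accepted only if it improves on the best solution so far (a common aspiration rule), and the names `tabu_search`, `landscape`, `cost`, and `neighbors` are hypothetical.

```python
from collections import deque

def tabu_search(cost, neighbors, start, tenure=3, iterations=50):
    """Minimise `cost` by moving to the best non-Tabu neighbor each step."""
    current = best = start
    tabu = deque(maxlen=tenure)  # short-term memory of visited solutions
    for _ in range(iterations):
        tabu.append(current)
        # Rank the neighborhood N(current), best move first.
        ranked = sorted(neighbors(current), key=cost)
        for candidate in ranked:
            # Aspiration: a Tabu move is allowed if it beats the best so far.
            if candidate not in tabu or cost(candidate) < cost(best):
                current = candidate
                break
        else:
            break  # every neighbor is Tabu: stop
        if cost(current) < cost(best):
            best = current
    return best

# Toy landscape (assumed): index 1 is a local minimum, index 6 the global one.
landscape = [5, 3, 4, 6, 7, 2, 1, 3, 8]

def cost(i):
    return landscape[i]

def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]

best_index = tabu_search(cost, neighbors, start=0)
```

A plain descent from index 0 would stop at the local minimum (index 1); because the Tabu list forbids stepping back, the search is forced uphill and reaches the global minimum at index 6.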

Evidently, the key to this procedure lies in the management of the Tabu list, i.e., in updating the list by deciding how many and which moves, and/or move attributes, have to be set Tabu in any iteration of the search.

For the GA-TS system used in our study, the appropriate range for the length of the Tabu list was determined empirically. A uniformly varying Tabu list size was used between the lower limit, mSmin, and the upper limit, mSmax. A set of moderate-size experiments was run to determine an appropriate range for the initial polymer samples (Table I).
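A uniformly varying Tabu list size can be sketched as below. The text does not say what quantity the limits mSmin and mSmax scale, so this sketch assumes they are multipliers of a hypothetical base size (`base_size`); the function name `sample_tabu_length` is likewise an assumption.

```python
import random

def sample_tabu_length(base_size, ms_min, ms_max, rng):
    """Draw a Tabu list length uniformly between the two scaled limits."""
    factor = rng.uniform(ms_min, ms_max)
    return max(1, round(base_size * factor))

rng = random.Random(0)
# Range 5 of Table I (mSmin = 1.6, mSmax = 1.8) with an assumed base size of 20.
lengths = [sample_tabu_length(20, 1.6, 1.8, rng) for _ in range(100)]
```

Redrawing the length during the search varies how long a move stays Tabu, which is one way to balance diversification against intensification.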

GENETIC ALGORITHM APPROACH

Genetic algorithm methods are based on the principles of Darwinian models for natural selection and evolution (Fig. 4). Genetic algorithms have essentially two main components. First, there must be a code or structure that represents possible solutions to the problem. This basic code is called a string. Strings are formed from a sequence of characters of finite length, n, composed of binary bits or symbols. Each candidate solution is represented by such a string. The set of solutions, Pj, is referred to as the population of the jth generation. Second, a transition rule must be defined to mimic the biological evolution process. The transition rule consists of a reproductive plan and genetic operators. The reproductive plan is an algorithm that selects the strings of the current population that participate in generating the next population. The genetic operators appropriately modify the structure of the selected strings, in order to produce members of the next generation of solutions. A simple genetic algorithm consists of one reproductive plan, called fitness-proportionate reproduction, and two genetic operators: crossover and mutation [8, 9].
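The components of a simple genetic algorithm named above (fitness-proportionate reproduction, crossover, and mutation on strings) can be sketched in Python as follows. The bit-string encoding, the toy fitness function, the probabilities, and all function names are assumptions for illustration; this is the textbook scheme, not the modified GA-TS operator.

```python
import random

def select(population, fitness, rng):
    """Fitness-proportionate reproduction: sample parents by fitness weight."""
    weights = [fitness(s) for s in population]
    return rng.choices(population, weights=weights, k=len(population))

def crossover(a, b, rng):
    """One-point crossover: swap the tails of two equal-length strings."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(s, pm, rng):
    """Flip each bit independently with probability pm."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < pm else bit for bit in s
    )

def next_generation(population, fitness, pc, pm, rng):
    """Build population P(j+1) from P(j); assumes an even population size."""
    parents = select(population, fitness, rng)
    children = []
    for a, b in zip(parents[0::2], parents[1::2]):
        if rng.random() < pc:
            a, b = crossover(a, b, rng)
        children += [mutate(a, pm, rng), mutate(b, pm, rng)]
    return children

def fitness(s):
    # Toy fitness (assumed): number of 1-bits, offset so no weight is zero.
    return s.count("1") + 1

rng = random.Random(1)
population = ["00000000", "11110000", "10101010", "11111111"]
children = next_generation(population, fitness, pc=0.9, pm=0.05, rng=rng)
```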

FIGURE 3. TS procedure.

For the crossover, we used a crossover operator modified by adding a Tabu Search parameter. The crossover probability (Pc) decreases linearly with the generation number. The chances of performing a point mutation, a crankshaft move mutation, and a three-bead flip mutation are 50%, 25%, and 25%, respectively, while the probability of mutation (Pm) increases linearly. For selection, a roulette wheel selection was used. Our main goal was to discourage high-energy conformations and to search for relatively low-energy candidate conformations. The chance of survival of the individuals is proportional to their energy value. An individual is selected from the previous population with the probability defined as follows:

\[ P(\mathrm{individual}) = \frac{\mathrm{energy}}{\sum_{n=1}^{N} \mathrm{energy}_n}, \tag{5} \]

where $N$ is the population size.

For the two sequences in the copolymer, the target graph (the graph to be matched) and the source graph (from the case base) are denoted by $G_{\mathrm{hydrophilic}} = (V_t, A_t, \mu_t, \nu_t, L_{\mu t}, L_{at})$ and by $G_{\mathrm{hydrophobic}} = (V_s, A_s, \mu_s, \nu_s, L_{\mu s}, L_{as})$, respectively.

The degree of similarity between energy vertices $u \in V_t$ (for the hydrophobic sequence) and $v \in V_s$ (for the hydrophilic sequence) in correspondence $f$ is denoted by $DS_f(u, v)$. If $f(u) \neq v$, then $DS_f(u, v) = 0$; otherwise

\[ DS_f(u, v) = \begin{cases} 1, & \text{if } \mu_t(u) = \mu_s(v) \\ 0, & \text{if } \mu_t(u) \neq \mu_s(v). \end{cases} \tag{6} \]
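The selection probability of Eq. (5) and the vertex-level degree of similarity of Eq. (6) can be written as a short Python sketch. The function names, the dictionary representation of the vertex labels and of the correspondence f, and the sample values are assumptions for illustration. The probability is implemented exactly as Eq. (5) states it (proportional to the energy value), with positive energy values assumed.

```python
def selection_probabilities(energies):
    """Eq. (5): P(individual) = energy / sum of energies over the population."""
    total = sum(energies)
    return [e / total for e in energies]

def degree_of_similarity(labels_t, labels_s, f, u, v):
    """Eq. (6): 1 if f maps u to v and the vertex labels agree, else 0."""
    if f.get(u) != v:
        return 0
    return 1 if labels_t[u] == labels_s[v] else 0

# Hypothetical population of two individuals.
probs = selection_probabilities([1.0, 3.0])

# Hypothetical vertex labels of the target (t) and source (s) graphs.
labels_t = {"u1": "low", "u2": "high"}
labels_s = {"v1": "low", "v2": "low"}
f = {"u1": "v1", "u2": "v2"}  # correspondence between vertices

ds_match = degree_of_similarity(labels_t, labels_s, f, "u1", "v1")
ds_label_mismatch = degree_of_similarity(labels_t, labels_s, f, "u2", "v2")
ds_not_mapped = degree_of_similarity(labels_t, labels_s, f, "u1", "v2")
```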

Experimental Results

The synthesis of azo-ester-containing polydimethylsiloxanes (AEPS) with different molecular weights of the siloxane sequences and different contents of azo groups was carried out according to the procedure described in Ref. [10].

The copolymers were obtained by radical polymerization of methacrylic acid (MAA) in the presence of polydimethylsiloxane azo-ester macroinitiators (AEPS) and ethylene glycol dimethacrylate (EGDMA) (1 mol% vs. MAA) as the cross-linking agent (Fig. 5). The reaction was carried out in toluene (total concentration 25% by weight) under N2, by maintaining the reaction mixture for 20 h at 80°C in sealed ampoules. The synthesis data are summarized in Table II.

The starting parameters are related to a set of copolymers with hydrophilic and hydrophobic sequences, for which the best-fit molecular ratio was designed according to the evaporation speed from the macromolecular network [11, 12] (Fig. 6).

TABLE I. Results of different Tabu list length ranges.

Range   mSmin   mSmax   Span   Avg. solution   Std. solution
1       0.4     0.6     0.2    6.5657e8        2.65823e7
2       0.8     1.2     0.4    6.40407e8       1.53838e7
3       1.6     2.4     0.8    6.3487e8        2.35536e7
4       0.8     1.0     0.2    6.43163e8       1.80529e7
5       1.6     1.8     0.2    6.21366e8       1.45742e7
6       2.2     2.4     0.2    6.51094e8       2.85804e7

FIGURE 4. Genetic algorithm crossover and mutation.

FIGURE 5. Copolymer synthesis.


Three physical parameters were considered in our tests: evaporation speed, columnar water vapor, and columnar liquid water:

\[ g = f_{NN}(T), \tag{7} \]

where $g = \{W, V, L, T_s\}$ is the vector of simultaneously retrieved parameters; $W$ represents evaporation speed, $V$ is columnar water vapor, $L$ is columnar liquid water, and $T$ is the temperature.

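The local exchange coefficient $U_l$ for each capillary section is calculated by using the electric analogy:

\[ U_l = \frac{1}{\dfrac{1}{h_{\mathrm{tube-water}}} + \dfrac{1}{h_{\mathrm{tube}}} + \dfrac{1}{h_{\mathrm{water}}}}. \tag{8} \]

The total exchange coefficient might be written as

\[ U = \frac{1}{N} \sum_{1}^{N} \frac{1}{\dfrac{1}{h_{\mathrm{tube-air}}} + \dfrac{1}{h_{\mathrm{tube}}} + \dfrac{1}{h_{\mathrm{water}}}}. \tag{9} \]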
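Under the electric analogy, the individual exchange terms act like resistances in series, so their reciprocals add. The following Python sketch evaluates the local and total exchange coefficients on that reading of Eqs. (8) and (9); the function names, the generic `h_contact` argument (standing for the tube-water or tube-air term), and the numerical values are assumptions for illustration only.

```python
def local_exchange_coefficient(h_contact, h_tube, h_water):
    """Series combination of three exchange terms for one capillary section."""
    return 1.0 / (1.0 / h_contact + 1.0 / h_tube + 1.0 / h_water)

def total_exchange_coefficient(sections):
    """Average of the local coefficients over the N capillary sections."""
    return sum(local_exchange_coefficient(*s) for s in sections) / len(sections)

# Hypothetical coefficients (arbitrary units) for two capillary sections.
sections = [(3.0, 3.0, 3.0), (6.0, 6.0, 6.0)]
u_first = local_exchange_coefficient(*sections[0])
u_total = total_exchange_coefficient(sections)
```

As with resistances in series, the combined coefficient of a section is always smaller than the smallest individual term.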

The energy assessment includes the energy exchanged through the capillary exchanger, the energy exchanged through the lateral sides of the polymer network ($h_{\mathrm{tube}}$), the energy exchanged by water renewal ($h_{\mathrm{water}}$), and the thermally exchanged energy ($h_{\mathrm{tube-water/air}}$). The energy and flow data, as a function of the molecular ratio in the PDMS-co-PMAA, were used to train the neural network.

Discussion

The results for the GA-TS algorithm were compared with those of the genetic algorithm using a degree of similarity term. One set includes test problems that are randomly produced and divided into 10 groups with different values for DSrecord. Table III compares the performances of the genetic algorithm (DSga) and the genetic algorithm–Tabu Search method (DStabu). The right part of Table III shows the results in terms of DSgap obtained by the GA-TS method. The left part shows the results obtained with the genetic algorithm. Each row represents one test group formed with corresponding values of DSrecord. The figures in the table field represent the number of test problems of a certain group whose DSgap value belongs to the range defined in the column title. The bottom row shows the total number of such test problems.

TABLE II. Synthesis data.

           Initial mixture                 Final mixture
MnPDMS     mol SiO   mol MAA   SiO/MAA     mol SiO reacted   mol MAA reacted   η (%)    SiO/MAA final
1 000      0.0085    0.1353    0.0625      0.0072            0.1353            84.70    0.052
1 000      0.0132    0.1057    0.125       0.0095            0.1057            71.96    0.090
1 000      0.0197    0.0787    0.250       0.0164            0.0787            83.13    0.208
1 000      0.0287    0.0574    0.500       0.0263            0.0574            91.63    0.458
3 000      0.0146    0.1165    0.125       0.0106            0.1165            72.60    0.098
3 000      0.0164    0.6537    0.250       0.0159            0.6537            96.7     0.242
3 000      0.0326    0.0651    0.500       0.0266            0.0651            70.33    0.211
3 000      0.2857    0.2857    1.000       0.2108            0.2857            73.78    0.738
8 000      0.0117    0.0673    0.250       0.0115            0.0673            98.21    0.246
8 000      0.0916    0.0820    0.500       0.0720            0.0820            78.6     0.393
8 000      0.0311    0.0311    1.000       0.0271            0.0311            87.14    0.871
8 000      0.0710    0.0355    2.000       0.0565            0.0355            79.58    1.592

FIGURE 6. Residual water from the six PDMS-co-PMAA networks (Table I) as a function of drying time, after swelling equilibrium has been achieved by keeping them in excess of water.

One can observe better performance of the GA-TS method over the genetic algorithm. The total number of test problems where DStabu is higher than DSrecord is 795, as compared with 372 using the genetic algorithm. The total number of test problems where DStabu is smaller than DSrecord is 3, as compared with 11 using the genetic algorithm. In addition, the GA-TS is fast: the average processing time is only 372 ms.
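The totals quoted above follow from summing the columns of Table III. The short Python check below reproduces them from the per-group counts of that table; the variable and function names are assumptions for illustration. Since DSgap = DSrecord − DS, a negative gap means the method exceeded DSrecord and a positive gap means it fell short.

```python
# Per-group counts from Table III; columns are the DSgap bins
# (-0.1, 0), [0, 0], (0, 0.1), [0.1, 0.4).
ga_rows = [
    (1, 0, 1, 0), (6, 4, 2, 0), (5, 6, 3, 1), (38, 21, 0, 2), (42, 28, 0, 1),
    (54, 40, 0, 1), (88, 83, 0, 0), (76, 117, 0, 0), (55, 172, 0, 0), (7, 146, 0, 0),
]
ts_rows = [
    (2, 0, 0, 0), (12, 0, 0, 0), (15, 0, 0, 0), (61, 0, 0, 0), (68, 1, 2, 0),
    (95, 0, 0, 0), (169, 1, 1, 0), (188, 5, 0, 0), (166, 61, 0, 0), (19, 134, 0, 0),
]

def column_totals(rows):
    """Sum each DSgap bin over the ten test groups."""
    return [sum(column) for column in zip(*rows)]

ga_totals = column_totals(ga_rows)
ts_totals = column_totals(ts_rows)

# Negative gap: the method beat DSrecord. Positive gap: it fell short.
ga_better, ga_worse = ga_totals[0], ga_totals[2] + ga_totals[3]
ts_better, ts_worse = ts_totals[0], ts_totals[2] + ts_totals[3]
```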

Figure 7 shows the results of three test groups in terms of DSgap. Six points on the x-axis are used to represent the six copolymers from the test group. Figure 7 presents the average value of DSgap for the two problems in its subgroup. The performance of the simple Tabu Search for the test problems of groups B and C in the 0.4–0.6 range of DSrecord is slightly less favorable than for the others.

Figure 8 compares the time, in seconds, necessary to generate solutions of different qualities. Three points on the x-axis are used to represent three test groups. The points on the solid curve represent the average running time for generating a solution for one of the test problems in its corresponding group, which renders a DSgap ranging from −0.04 to 0.03. The points on the dotted curve represent the average running time for generating a slightly less favorable solution for one of the test problems in its corresponding group, which renders a DSgap ranging from 0.04 to 0.06. The second experiment illustrates that the simple Tabu Search approach developed so far is capable of generating high-quality solutions that render degree of similarity values better than, or close to, the DSrecord values.

TABLE III. Simulation results of the simple GA compared with the GA–TS method.
Entries are the number of pairs of copolymer samples.

                    DSgap = DSrecord − DSga                      DSgap = DSrecord − DStabu
Range of DSrecord   (−0.1, 0)   [0, 0]   (0, 0.1)   [0.1, 0.4)   (−0.1, 0)   [0, 0]   (0, 0.1)   [0.1, 0.4)
[0, 0.1)            1           0        1          0            2           0        0          0
[0.1, 0.2)          6           4        2          0            12          0        0          0
[0.2, 0.3)          5           6        3          1            15          0        0          0
[0.3, 0.4)          38          21       0          2            61          0        0          0
[0.4, 0.5)          42          28       0          1            68          1        2          0
[0.5, 0.6)          54          40       0          1            95          0        0          0
[0.6, 0.7)          88          83       0          0            169         1        1          0
[0.7, 0.8)          76          117      0          0            188         5        0          0
[0.8, 0.9)          55          172      0          0            166         61       0          0
[0.9, 1.0)          7           146      0          0            19          134      0          0

[0, 1.0)            372         617      6          5            795         202      3          0

FIGURE 7. Performance comparison of the test groups of different sizes.


Conclusion

A Tabu Search approach is presented as an alternative in polymer design. Our preliminary experimental results show the efficiency of the Tabu Search, as this method provides a mechanism for systematically mapping out basins of attraction for local minima, thereby allowing the search to move to neighboring local minima. For problems with a large number of small experiments, Tabu Search is an efficient means to explore the local minima. The properties of Tabu Search (i.e., memory of the search to date) were combined with genetic algorithms and tested. The combination algorithm performed favorably, as compared with the genetic algorithm, in terms of computational efficiency.

References

1. Venkatasubramanian, V.; Chan, K.; Caruthers, J. M. Comput Chem Eng 1994, 18, 833.
2. Venkatasubramanian, V.; Chan, K.; Caruthers, J. M. J Chem Inform Comput Sci 1995, 35, 188.
3. Qian, N.; Sejnowski, T. J. J Mol Biol 1988, 202, 865.
4. Aoyama, T.; Suzuki, Y.; Ichikawa, H. J Med Chem 1990, 33, 905.
5. Robb, E. W.; Munk, M. E. Mikrochim Acta 1990, I, 31.
6. Zupan, J.; Gasteiger, J. Neural Networks for Chemists: A Textbook; VCH: Weinheim, 1993.
7. Glover, F. Comput Oper Res 1986, 5, 533.
8. Goldberg, D. E. Genetic Algorithms in Search, Optimization, and Machine Learning; Addison-Wesley: 1989; ISBN 0-201-15767-5.
9. Jiang, T.; Cui, O.; Shi, G.; Ma, S. J Chem Phys 2003, 119, 4592.
10. Harabagiu, V.; Hamciuc, V.; Giurgiu, D. Makromol Chem 1990, 11, 433.
11. Rusu, T.; Ioan, S.; Buraga, S. C. Eur Polym J 2001, 37, 2005.
12. Rusu, T.; Gogan, O. M. Mol Cryst 2004, 415/418, 155.

FIGURE 8. Time needed to generate solutions of different qualities.
