Efficiency of tabu-search-based conformational search algorithms


Post on 13-Jun-2016






<ul><li><p>Efficiency of Tabu-Search-Based Conformational Search Algorithms</p><p>CHRISTOPH GREBNER, JOHANNES BECKER, SVETLANA STEPANENKO, BERND ENGELS</p><p>Julius-Maximilians-Universität Würzburg, Institut für Physikalische und Theoretische Chemie, Am Hubland, 97074 Würzburg, Germany</p><p>Received 19 January 2011; Revised 10 March 2011; Accepted 10 March 2011</p><p>DOI 10.1002/jcc.21807</p><p>Published online 3 May 2011 in Wiley Online Library (wileyonlinelibrary.com).</p><p>Abstract: Efficient conformational search or sampling approaches play an integral role in molecular modeling, leading to a strong demand for even faster and more reliable conformer search algorithms. This article compares the efficiency of a molecular dynamics method, a simulated annealing method, and the basin hopping (BH) approach (which are widely used in this field) with a previously suggested tabu-search-based approach called gradient only tabu search (GOTS). The study emphasizes the success of the GOTS procedure and, more importantly, shows that an approach which combines BH and GOTS outperforms the single methods in efficiency and speed. We also show that ring structures built by a hydrogen bond are useful as starting points for conformational search investigations of peptides and organic ligands with biological activities, especially in structures that contain multiple rings.</p><p>© 2011 Wiley Periodicals, Inc. 
J Comput Chem 32: 2245–2253, 2011</p><p>Key words: conformational search; global optimization; Tabu search; basin hopping; simulated annealing; Monte Carlo with minimization</p><p>Introduction</p><p>Global optimization algorithms are subjects of current interest in fields ranging from economics to natural science.<sup>1–3</sup> In chemistry, pharmacy, and biology such methods are, for example, needed to determine the properties of molecules possessing many rotatable single bonds.<sup>4–6</sup> Such computations require knowledge of the three-dimensional (3D) structure of the molecule, which is strongly related to the global minimum of its potential energy surface (PES).<sup>7–10</sup> However, often not only the global minimum is populated.<sup>11–13</sup> Further geometrical arrangements are also energetically accessible at room temperature, because rotations around a single bond are low energy processes. Hence, for flexible molecules, the properties are determined by an ensemble of conformers, which all have to be determined for a careful characterization of the molecules.<sup>14–17</sup></p><p>The determination of these energetically accessible conformers is called conformational search or analysis.<sup>12,18</sup> Other well-known conformational search problems include the determination of the equilibration phase for QM/MM computations of biomolecular systems,<sup>19</sup> the computation of the 3D structures of proteins from scratch, and the determination of all possible reaction paths between reactants and products.<sup>20–22</sup></p><p>Conformer search algorithms can be divided into deterministic<sup>23–26</sup> and stochastic procedures.<sup>27–30</sup> The former are only feasible for smaller molecules and determine the conformations by systematic scans of the PES.<sup>23,31</sup> If the number of freely rotatable bonds increases, a so-called combinatorial explosion<sup>18</sup> occurs because all degrees of freedom have to be varied 
simultaneously. To overcome these problems, specialized conformational search algorithms, each with its own strengths and weaknesses, have been developed over the past several years.<sup>12,30,32–34</sup></p><p>Some commonly used techniques for conformational search are, for example, classical molecular dynamics (MD),<sup>28,35</sup> mutually orthogonal Latin squares conformational search techniques,<sup>36</sup> smoothing/deformation search techniques,<sup>37</sup> Monte Carlo (MC),<sup>38</sup> simulated annealing (SA),<sup>39,40</sup> potential flooding,<sup>41</sup> energy leveling,<sup>42</sup> metadynamics,<sup>43</sup> and genetic algorithms.<sup>44</sup> The MC with minimization (MCM) method represents a very successful approach to determine low energy conformations.<sup>45–47</sup> Originally developed by Li and Scheraga,<sup>45</sup> the method was subsequently generalized by Wales and Doye,<sup>48</sup> yielding the so-called basin hopping (BH) approach. In the MCM and BH approaches, each randomly generated structure is optimized, and the resulting minima are used within the MC. This resetting of the geometry before the new perturbation strongly increases the efficiency, as was shown in many examples, for example, Lennard-Jones clusters,<sup>7,9,48</sup> water shells,<sup>49</sup> and peptides.<sup>50–52</sup></p><p>Additional Supporting Information may be found in the online version of this article.</p><p>Contract/grant sponsor: DFG (Deutsche Forschungsgemeinschaft); contract/grant numbers: SFB 630</p><p>Correspondence to: B. Engels. 
e-mail: bernd@chemie.uni-wuerzburg.de</p></li><li><p>Recently, we developed a new approach based on tabu search (TS), a method which has found wide application in energy resource planning, bioinformatics, computer-aided molecular design, pattern classification, mineral exploration, as well as in many industrial application settings,<sup>53</sup> and in quantitative structure–activity relationship.<sup>53,54</sup> TS<sup>55–57</sup> uses an adaptive memory design and represents a metaheuristic method.<sup>58–62</sup> After reaching a local optimum by a series of descent moves, which select the highest evaluation moves from a candidate list, the method provides an escape from this optimum by continuing to choose highest evaluation moves but using tabu restrictions to avoid revisiting solutions previously examined. A common way to implement the tabu restrictions is to use a tabu list (TL), which assigns a tabu status to elements of previously generated solutions. The TS method also monitors the search using frequency memory or other more elaborate forms of memory to determine if the search gets stuck in a given region. 
If this happens, a diversification search (DS) is performed, which guides the search to different and hopefully more promising regions of the search space.</p><p>TS was originally developed for noncontinuous problems and subsequently applied also to solve continuous nonlinear and global optimization problems.<sup>63–67</sup> To adapt the TS to the continuous conformational search problem, we developed several TS-based approaches.<sup>68,69</sup> Within these approaches, the gradient only TS (GOTS) turned out to be most efficient.<sup>69,70</sup> For the minimization step that launches a descent to the next local minimum, GOTS uses a Quasi-Newton method, combined with a steepest descent approach.<sup>71–74</sup> To escape local minima, the GOTS uses grids of function values. An efficient blocking of already visited regions is achieved by using tabu directions and tabu regions in combination with the TL.<sup>56</sup></p><p>This article has two primary goals. The first is to test the efficiency of the GOTS algorithm for conformational search. For this purpose, we perform conformational searches for five molecules of different sizes and compare the efficiency of GOTS with MD, SA, and BH. The analysis shows that for successful applications to larger molecules, the GOTS needs efficient DS strategies. Hence, we used short BH sequences as DS within the GOTS. This combination (GOTS/BH) outperforms both single methods.</p><p>Figure 1. Flowchart of the main algorithm for searching starting structures.</p><p>Grebner et al. Vol. 32, No. 10 Journal of Computational Chemistry</p><p>Journal of Computational Chemistry DOI 10.1002/jcc</p></li><li><p>The second goal of this work is the evaluation of five-, six-, or seven-membered ring structures containing hydrogen bonds as representative starting structures for biologically active smaller peptides or organic ligands. 
As the build-up of such structures is not very time consuming, they may also be useful within a superposition approach to thermodynamics.<sup>75,76</sup></p><p>It is obvious that reasonable starting structures are very helpful for conformational searches for larger molecules because the search space becomes too large for an exhaustive search. An impressive example is given in a review about the Critical Assessment of Techniques for Protein Structure Prediction 2006 (CASP7).<sup>77</sup> Several other examples can be found in the literature.<sup>36,77–85</sup></p><p>This article is organized as follows. We first describe the algorithm that builds up the ring structures (STARTOPT). Then, the efficiency of the GOTS in simulations starting from randomly generated structures is compared with other approaches such as MD, SA, and BH. This part also focuses on the effectiveness of a combination of GOTS and BH. Finally, we investigate the influence of the starting structures containing the above mentioned rings on the efficiency of the various approaches.</p><p>Description of the STARTOPT Algorithm</p><p>Ring structures closed by hydrogen bonds between hydrogen bond donors and acceptors represent good starting structures for the conformational searches because they are often lower in energy than the corresponding ring-open conformations.<sup>11–13,86</sup> The STARTOPT algorithm developed to detect such conformations is depicted in Figure 1. In the first step, the algorithm uses the representation of the molecule in Cartesian coordinates to build up a connection table, which is then used to identify all covalent bonds and all hydrogen bond acceptors and donors of the molecule. The flowchart of the search for possible five-, six-, and seven-membered rings is shown in Figure 2. 
Starting from the first hydrogen bond donor, the algorithm moves atom by atom along the covalent bonds of the molecule and searches for heteroatoms representing the hydrogen bond acceptors. If the ring size becomes larger than seven atoms before an acceptor is found, the loop is left, and the next donor is taken as a starting point. If an acceptor is found and the ring size equals five, six, or seven atoms, then the atom sequence is saved. Already visited atoms are remembered in a Visited-List to avoid circulating in the molecule (e.g., in ring systems).</p><p>After locating all possible ring structures, the Cartesian coordinates of the molecule are reordered for each possible ring. In the new coordinate set, the atoms of a given ring are placed on the first positions in the Cartesian coordinate file because this allows computing the internal coordinates of the rings in Z-matrix notation very easily from the Cartesian coordinates. After generating the internal coordinates, the ring is closed by changing the dihedral angles of the ring to standard values of cyclopentane, -hexane, and -heptane. To ensure proper rotations around a given single bond, for example, of terminal methyl groups, we use main and dependent torsions<sup>87,88</sup> as developed by Echenique and Alonso.<sup>89</sup> To obtain relaxed ring structures, we perform three successive optimizations. In the first one, the ring atoms are fixed, whereas the rest of the molecule is optimized. The reversed scheme is used in the second optimization. Finally, a full optimization is performed. To construct structures that contain several rings, the program is applied several times.</p><p>Description of the Simulations</p><p>To gain insight into the efficiency of the GOTS, we performed conformational searches for molecules with 31–76 atoms (Fig. 3). 
The conformational searches are performed with five different approaches. Simple MD simulations are performed to get a sense of whether the given molecule is so small that its phase space can easily be exhaustively scanned. Hence, these MD simulations do not contain heating and cooling parts. The simulation time was 1 ns (NVT ensemble) with a time step of 1 fs, leading to 1,000,000 steps in total. A snapshot was taken every 10 ps, which was subsequently energy minimized with the Newton-like local optimizer<sup>90</sup> implemented in Tinker.<sup>90–95</sup> In total, 100 optimized structures were obtained.</p><p>Figure 2. Flowchart of the algorithm for searching all possible rings which can be built up by the existing acceptors and donors.</p></li><li><p>Heating and cooling parts are included in the SA approach. Again we used the standard procedure implemented in the Tinker program package,<sup>90–95</sup> that is, the initial temperature is 1000 K, and 100 steps were performed for equilibration. The cooling to 0 K was performed in 1,000,000 steps with a linear decrease in temperature by the factor (current step number)/(total number of steps) for every step. A snapshot was taken every 10 ps and the 100 obtained structures were subsequently optimized. Additionally, we used the MCM or BH approach for global optimization as implemented in Tinker.<sup>90–95</sup></p><p>The results of these approaches were compared with the results of our GOTS search. To enable this comparison, the GOTS was combined with the TINKER program package. The GOTS and basin-hopping approaches seem to complement each other. Steered by the alternating descent and ascent strategy, the GOTS represents a more local approach. 
The BH approach, on the other hand, jumps through the phase space. To test the efficiency of the combination of both approaches, we also performed searches in which the BH approach was used for diversification in the GOTS (GOTS/BH). Such a DS uses 200 BH steps.</p><p>To investigate if ring structures generated by the STARTOPT represent good starting structures for conformational searches, we used different starting points in each simulation. In the first series, we started from structures that were generated within a prior MD simulation, which has a duration of 1 ns with a time step of 1 fs (NVT ensemble). Snapshots were taken every 10 ps. From the 100 structures, we randomly chose 30 starting structures for the subsequent conformation searches. In the second series, we performed STARTOPT once and started the simulations from the resulting structures containing one ring. These simulations are abbreviated by STARTOPT. The last series started from structures containing the maximal number of ring structures of a given molecule. They were obtained by performing STARTOPT repeatedly until no new structures were generated. These simulations are abbreviated by STARTOPT/Mult. For molecules 4 and 5, the number of starting structures turned out to be too large. Hence, only the ones being lowest in energy were used to produce the next generation (see below).</p><p>All computations were performed with the OPLS-AA<sup>96</sup> force field as implemented in the TINKER program package. The coordinates of the best structures can be found in the Supporting Information.</p><p>Results and Discussion</p><p>Table 1 shows the results obtained for the tripeptide Gly-Ala-Ser (1) consisting of 31 atoms. The molecule contains 10 formally freely rotating single bonds, but two bonds are rather rigid amide bonds. 
Table 1 also shows that molecule 1 is too small to represent a reasonable test system.</p><p>Figure 3. Test systems used in this work.</p></li><li><p>Starting from the MD-generated starting structures, only MD and SA do not find this global minimum. However, the energetically lowest minimum found by these simulations lies only 0.9 kcal/mol above the global minimum. This is detected by BH and GOTS in about 80% of the si...</p></li></ul>
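
The ring search described in the STARTOPT section (walk from each hydrogen bond donor along covalent bonds, stop once more than seven atoms are involved, save the atom sequence when an acceptor closes a five-, six-, or seven-membered ring) can be sketched as a depth-limited path search. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function name find_hbond_rings, the adjacency-dictionary form of the connection table, and the convention that ring size is the number of atoms on the covalent path from donor to acceptor (both included) are all hypothetical choices made for this example.

```python
from typing import Dict, List, Set


def find_hbond_rings(
    bonds: Dict[int, List[int]],   # assumed connection table: atom -> bonded atoms
    donors: Set[int],              # atom indices treated as hydrogen bond donors
    acceptors: Set[int],           # atom indices treated as hydrogen bond acceptors
    min_size: int = 5,
    max_size: int = 7,
) -> List[List[int]]:
    """Return donor-to-acceptor atom sequences that could close a 5- to 7-ring."""
    rings: List[List[int]] = []

    def walk(path: List[int], visited: Set[int]) -> None:
        atom = path[-1]
        # Save the sequence if we reached an acceptor at an allowed ring size.
        if len(path) > 1 and atom in acceptors and min_size <= len(path) <= max_size:
            rings.append(list(path))
        # Leave the loop once the ring would exceed the maximum size.
        if len(path) == max_size:
            return
        for neighbour in bonds.get(atom, ()):
            # The visited set plays the role of the Visited-List: it prevents
            # circulating through existing covalent rings.
            if neighbour not in visited:
                visited.add(neighbour)
                path.append(neighbour)
                walk(path, visited)
                path.pop()
                visited.remove(neighbour)

    for donor in sorted(donors):
        walk([donor], {donor})
    return rings


# Toy example: a linear chain of nine atoms, 0-1-2-...-8.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 8] for i in range(9)}
print(find_hbond_rings(chain, donors={0}, acceptors={4, 8}))
# prints [[0, 1, 2, 3, 4]]
```

In the toy chain, atom 4 is reached after five atoms and is saved, while atom 8 would need nine atoms and is never reached because the walk stops at seven. Closing the ring geometrically (setting the dihedral angles and running the three optimizations) is a separate step that this sketch does not cover.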

