
Efficiency of Tabu-Search-Based Conformational Search Algorithms

CHRISTOPH GREBNER, JOHANNES BECKER, SVETLANA STEPANENKO, BERND ENGELS

Julius-Maximilians-Universität Würzburg, Institut für Physikalische und Theoretische Chemie, Am Hubland, 97074 Würzburg, Germany

Received 19 January 2011; Revised 10 March 2011; Accepted 10 March 2011

DOI 10.1002/jcc.21807

Published online 3 May 2011 in Wiley Online Library (wileyonlinelibrary.com).

Abstract: Efficient conformational search or sampling approaches play an integral role in molecular modeling, leading to a strong demand for even faster and more reliable conformer search algorithms. This article compares the efficiency of a molecular dynamics method, a simulated annealing method, and the basin hopping (BH) approach (which are widely used in this field) with a previously suggested tabu-search-based approach called gradient only tabu search (GOTS). The study emphasizes the success of the GOTS procedure and, more importantly, shows that an approach which combines BH and GOTS outperforms the single methods in efficiency and speed. We also show that ring structures built by a hydrogen bond are useful as starting points for conformational search investigations of peptides and organic ligands with biological activities, especially in structures that contain multiple rings.

© 2011 Wiley Periodicals, Inc. J Comput Chem 32: 2245–2253, 2011

Key words: conformational search; global optimization; Tabu search; basin hopping; simulated annealing; Monte Carlo with minimization

Introduction

Global optimization algorithms are subjects of current interest in fields ranging from economics to natural science [1–3]. In chemistry, pharmacy, and biology, such methods are needed, for example, to determine the properties of molecules possessing many rotatable single bonds [4–6]. Such computations require knowledge of the three-dimensional (3D) structure of the molecule, which is strongly related to the global minimum of its potential energy surface (PES) [7–10]. However, often not only the global minimum is populated [11–13]. Further geometrical arrangements are also energetically accessible at room temperature because rotations around a single bond are low-energy processes. Hence, for flexible molecules, the properties are determined by an ensemble of conformers, all of which have to be determined for a careful characterization of the molecules [14–17].
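
The statement that several conformers are populated at room temperature can be made concrete with Boltzmann weights. The following Python sketch is illustrative only: the function name `boltzmann_populations`, the three relative energies, and the assumption of equal degeneracies are choices made here and are not taken from the article.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal mol^-1 K^-1


def boltzmann_populations(rel_energies, temperature=298.15):
    """Fractional populations for conformers with the given relative
    energies (kcal/mol), assuming equal degeneracy for all conformers."""
    weights = [math.exp(-e / (R_KCAL * temperature)) for e in rel_energies]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-conformer ensemble (illustrative energies only).
energies = [0.0, 0.9, 3.0]
populations = boltzmann_populations(energies)
for e, p in zip(energies, populations):
    print(f"dE = {e:.1f} kcal/mol -> population {p:.3f}")
```

A conformer less than 1 kcal/mol above the global minimum still carries a sizable share of the population at room temperature, which is why a search that returns only the single lowest structure can be insufficient.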

The determination of these energetically accessible conformers is called conformational search or analysis [12,18]. Other well-known conformational search problems include the determination of the equilibration phase for QM/MM computations of biomolecular systems [19], the computation of the 3D structures of proteins from scratch, and the determination of all possible reaction paths between reactants and products [20–22].

Conformer search algorithms can be divided into deterministic [23–26] and stochastic procedures [27–30]. The former are feasible only for smaller molecules and determine the conformations by systematic scans of the PES [23,31]. If the number of freely rotatable bonds increases, a so-called combinatorial explosion [18] occurs because all degrees of freedom have to be varied simultaneously. To overcome these problems, specialized conformational search algorithms, each with its own strengths and weaknesses, have been developed over the past several years [12,30,32–34].
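
The combinatorial explosion can be illustrated with a short calculation. The sketch below assumes a systematic scan in which every rotatable bond is sampled on a regular grid; the 30° increment, the chosen bond counts, and the function name `grid_points` are illustrative assumptions, not values prescribed by the article.

```python
def grid_points(n_torsions, step_degrees=30):
    """Number of structures in a full systematic scan when every torsion
    is sampled every `step_degrees` degrees (assumed regular grid)."""
    per_torsion = 360 // step_degrees
    return per_torsion ** n_torsions


# The scan size grows exponentially with the number of rotatable bonds.
for n in (2, 5, 10, 25):
    print(f"{n:2d} rotatable bonds -> {grid_points(n):.3e} grid points")
```

Even with this coarse grid, the number of points grows by a factor of 12 for every additional rotatable bond, which is what makes deterministic scans impractical beyond small molecules.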

Some commonly used techniques for conformational search are, for example, classical molecular dynamics (MD) [28,35], mutually orthogonal Latin squares conformational search techniques [36], smoothing/deformation search techniques [37], Monte Carlo (MC) [38], simulated annealing (SA) [39,40], potential flooding [41], energy leveling [42], metadynamics [43], and genetic algorithms [44].

The MC with minimization (MCM) method represents a very successful approach to determine low-energy conformations [45–47]. Originally developed by Li and Scheraga [45], the method was subsequently generalized by Wales and Doye [48], yielding the so-called basin hopping (BH) approach. In the MCM and BH approaches, each randomly generated structure is optimized, and the resulting minima are used within the MC. This resetting of the geometry before the new perturbation strongly increases the efficiency, as was shown in many examples, for example, Lennard-Jones clusters [7,9,48], water shells [49], and peptides [50–52].
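
The perturb, minimize, and accept cycle of MCM/BH can be sketched on a toy problem. The following Python example is a minimal illustration under stated assumptions: the one-dimensional test function, the crude gradient-descent minimizer, the jump size, and the value of `kT` are all hypothetical choices made here, and the code is not the Tinker implementation used in the article.

```python
import math
import random


def energy(x):
    """Toy one-dimensional surface with several local minima (assumption)."""
    return 0.1 * x * x + math.sin(3.0 * x)


def local_minimize(x, step=0.01, iters=2000):
    """Crude gradient descent with a finite-difference derivative."""
    for _ in range(iters):
        grad = (energy(x + 1e-6) - energy(x - 1e-6)) / 2e-6
        x -= step * grad
    return x


def basin_hopping(x0, n_steps=200, max_jump=2.0, kT=0.5, seed=0):
    rng = random.Random(seed)
    current = local_minimize(x0)          # always work with minima
    best = current
    for _ in range(n_steps):
        # Perturb the current minimum, then relax to the nearest minimum.
        trial = local_minimize(current + rng.uniform(-max_jump, max_jump))
        delta = energy(trial) - energy(current)
        # Metropolis test applied to the energies of the minima.
        if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
            current = trial
            if energy(current) < energy(best):
                best = current
    return best, energy(best)


best_x, best_e = basin_hopping(4.0)
print(f"lowest minimum found: x = {best_x:.3f}, E = {best_e:.3f}")
```

Because every trial point is relaxed before the acceptance test, the random walk effectively moves between basins rather than across the raw surface, which is the resetting of the geometry described above.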

Additional Supporting Information may be found in the online version of this article.

Contract/grant sponsor: DFG (Deutsche Forschungsgemeinschaft); contract/grant numbers: SFB 630

Correspondence to: B. Engels. e-mail: [email protected]


Recently, we developed a new approach based on tabu search (TS), a method that has found wide application in energy resource planning, bioinformatics, computer-aided molecular design, pattern classification, mineral exploration, many industrial application settings [53], and quantitative structure–activity relationships [53,54]. TS [55–57] uses an adaptive memory design and represents a metaheuristic method [58–62]. After reaching a local optimum by a series of descent moves, which select the highest evaluation moves from a candidate list, the method provides an escape from this optimum by continuing to choose highest evaluation moves but using tabu restrictions to avoid revisiting solutions previously examined. A common way to implement the tabu restrictions is to use a tabu list (TL), which assigns a tabu status to elements of previously generated solutions. The TS method also monitors the search using frequency memory or other more elaborate forms of memory to determine whether the search gets stuck in a given region. If this happens, a diversification search (DS) is performed, which guides the search to different and hopefully more promising regions of the search space.

TS was originally developed for noncontinuous problems and was subsequently also applied to continuous nonlinear and global optimization problems [63–67]. To adapt TS to the continuous conformational search problem, we developed several TS-based approaches [68,69]. Among these approaches, the gradient only TS (GOTS) turned out to be the most efficient [69,70]. For the minimization step that launches a descent to the next local minimum, GOTS uses a quasi-Newton method combined with a steepest descent approach [71–74]. To escape local minima, GOTS uses grids of function values. An efficient blocking of already visited regions is achieved by using tabu directions and tabu regions in combination with the TL [56].
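
The basic tabu mechanism (always take the best non-tabu move, even if it goes uphill) can be illustrated with the classic discrete variant. The Python sketch below is not GOTS: it omits gradients, tabu directions, tabu regions, and diversification, and the toy landscape, the tabu list length, and all function names are assumptions made here for illustration.

```python
from collections import deque


def tabu_search(start, energy, neighbors, n_iter=100, tabu_size=10):
    """Minimal discrete tabu search with a fixed-length tabu list."""
    current = best = start
    tabu = deque([start], maxlen=tabu_size)   # recently visited states
    for _ in range(n_iter):
        candidates = [s for s in neighbors(current) if s not in tabu]
        if not candidates:
            break                              # every move is tabu
        # Best admissible move, accepted even if it increases the energy.
        current = min(candidates, key=energy)
        tabu.append(current)
        if energy(current) < energy(best):
            best = current
    return best


# Toy landscape: local minimum at index 3, global minimum at index 15.
landscape = [5, 4, 3, 1, 3, 4, 5, 6, 7, 6, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4]


def energy(i):
    return landscape[i]


def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]


result = tabu_search(0, energy, neighbors)
print(result, landscape[result])  # prints "15 0"
```

A plain descent started at index 0 would stop in the local minimum at index 3; the tabu list forbids stepping back and thereby forces the search over the barrier.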

This article has two primary goals. The first is to test the efficiency of the GOTS algorithm for conformational search. For this purpose, we perform conformational searches for five molecules of different sizes and compare the efficiency of GOTS with MD, SA, and BH. The analysis shows that for successful applications to larger molecules, the GOTS needs efficient DS strategies. Hence, we used short BH sequences as DS within the GOTS. This combination (GOTS/BH) outperforms both single methods.

Figure 1. Flowchart of the main algorithm for the search of starting structures.

2246 Grebner et al. • Vol. 32, No. 10 • Journal of Computational Chemistry

Journal of Computational Chemistry DOI 10.1002/jcc

The second goal of this work is the evaluation of five-, six-, or seven-membered ring structures containing hydrogen bonds as representative starting structures for biologically active smaller peptides or organic ligands. As the build-up of such structures is not very time-consuming, they may also be useful within a superposition approach to thermodynamics [75,76].

Reasonable starting structures are clearly very helpful for conformational searches of larger molecules because the search space becomes too large for an exhaustive search. An impressive example is given in a review of the Critical Assessment of Techniques for Protein Structure Prediction 2006 (CASP7) [77]. Several other examples can be found in the literature [36,77–85].

This article is organized as follows. We first describe the algorithm that builds up the ring structures (STARTOPT). Then, the efficiency of the GOTS in simulations starting from randomly generated structures is compared with that of other approaches such as MD, SA, and BH. This part also focuses on the effectiveness of a combination of GOTS and BH. Finally, we investigate the influence of starting structures containing the above-mentioned rings on the efficiency of the various approaches.

Description of the STARTOPT Algorithm

Ring structures closed by hydrogen bonds between hydrogen bond donors and acceptors represent good starting structures for conformational searches because they are often lower in energy than the corresponding ring-open conformations [11–13,86]. The STARTOPT algorithm developed to detect such conformations is depicted in Figure 1. In the first step, the algorithm uses the representation of the molecule in Cartesian coordinates to build up a connection table, which is then used to identify all covalent bonds and all hydrogen bond acceptors and donors of the molecule. The flowchart of the search for possible five-, six-, and seven-membered rings is shown in Figure 2. Starting from the first hydrogen bond donor, the algorithm moves atom by atom along the covalent bonds of the molecule and searches for heteroatoms representing the hydrogen bond acceptors. If the ring size becomes larger than seven atoms before an acceptor is found, the loop is left, and the next donor is taken as a starting point. If an acceptor is found and the ring size equals five, six, or seven atoms, the atom sequence is saved. Already visited atoms are remembered in a Visited-List to avoid circulating in the molecule (e.g., in ring systems).
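
The ring search described above can be sketched as a depth-limited walk over the connection table. The Python example below is a simplified illustration, not the STARTOPT code: the data layout (a dictionary of bonded neighbors), the convention that the ring size counts all atoms on the path from donor to acceptor, and the use of the current path as the visited list are assumptions made here.

```python
def find_hbond_rings(bonds, donors, acceptors, sizes=(5, 6, 7)):
    """Return atom sequences from a donor to an acceptor whose length
    matches one of the allowed ring sizes.

    bonds     -- dict mapping each atom index to its bonded neighbors
    donors    -- iterable of atoms at which a walk starts
    acceptors -- set of atoms that can close a ring
    """
    max_size = max(sizes)
    rings = []

    def walk(path):
        atom = path[-1]
        if len(path) > 1 and atom in acceptors and len(path) in sizes:
            rings.append(tuple(path))       # save the atom sequence
        if len(path) == max_size:
            return                          # ring would exceed seven atoms
        for nxt in bonds[atom]:
            if nxt not in path:             # visited list: no circulating
                walk(path + [nxt])

    for donor in donors:
        walk([donor])
    return rings


# Hypothetical linear chain of eight atoms: 0-1-2-3-4-5-6-7.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 7] for i in range(8)}
rings = find_hbond_rings(chain, donors=[0], acceptors={2, 5, 7})
print(rings)  # [(0, 1, 2, 3, 4, 5)]
```

In the example, the acceptor at atom 2 is too close to the donor, the acceptor at atom 7 is too far away, and only the six-atom sequence ending at atom 5 is stored.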

After locating all possible ring structures, the Cartesian coordinates of the molecule are reordered for each possible ring. In the new coordinate set, the atoms of a given ring are placed in the first positions of the Cartesian coordinate file because this makes it very easy to compute the internal coordinates of the rings in Z-matrix notation from the Cartesian coordinates. After generating the internal coordinates, the ring is closed by changing the dihedral angles of the ring to the standard values of cyclopentane, cyclohexane, and cycloheptane. To ensure proper rotations around a given single bond, for example, of terminal methyl groups, we use main and dependent torsions [87,88] as developed by Echenique and Alonso [89]. To obtain relaxed ring structures, we perform three subsequent optimizations. In the first one, the ring atoms are fixed, whereas the rest of the molecule is optimized. The reversed scheme is used in the second optimization. Finally, a full optimization is performed. To construct structures that contain several rings, the program is applied several times.

Description of the Simulations

To gain insight into the efficiency of the GOTS, we performed conformational searches for molecules with 31–76 atoms (Fig. 3). The conformational searches were performed with five different approaches. Simple MD simulations were performed to get a feeling for whether a given molecule is so small that its phase space can easily be scanned exhaustively. Hence, these MD simulations do not contain heating and cooling parts. The simulation time was 1 ns (NVT ensemble) with a time step of 1 fs, leading to 1,000,000 steps in total. A snapshot was taken every 10 ps and subsequently energy minimized with the Newton-like local optimizer [90] implemented in Tinker [90–95]. In total, 100 optimized structures were obtained.
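
The step and snapshot counts follow directly from the simulation settings quoted above. The short Python check below only restates that arithmetic; the variable names are chosen here for illustration.

```python
# All times in femtoseconds to keep the arithmetic in integers.
simulation_time_fs = 1_000_000     # 1 ns
time_step_fs = 1                   # 1 fs
snapshot_interval_fs = 10_000      # 10 ps

n_steps = simulation_time_fs // time_step_fs
n_snapshots = simulation_time_fs // snapshot_interval_fs
steps_per_snapshot = snapshot_interval_fs // time_step_fs

print(n_steps, n_snapshots, steps_per_snapshot)  # 1000000 100 10000
```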

Figure 2. Flowchart of the algorithm for searching all possible rings that can be built up by the existing acceptors and donors.


Heating and cooling parts are included in the SA approach. Again, we used the standard procedure implemented in the Tinker program package [90–95]; that is, the initial temperature is 1000 K, and 100 steps were performed for equilibration. The cooling to 0 K was performed in 1,000,000 steps with a linear decrease in temperature by the factor (current step number)/(total number of steps) for every step. A snapshot was taken every 10 ps, and the 100 structures obtained were subsequently optimized. Additionally, we used the MCM or BH approach for global optimization as implemented in Tinker [90–95].
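
The linear cooling schedule can be written down explicitly. The Python sketch below is one reading of the description above (temperature reduced in proportion to the fraction of completed steps); the function name `linear_cooling` and its signature are assumptions made here, and the code is not taken from Tinker.

```python
def linear_cooling(step, total_steps=1_000_000, t_start=1000.0, t_end=0.0):
    """Temperature after `step` cooling steps for a linear schedule."""
    fraction = step / total_steps          # (current step)/(total steps)
    return t_start + (t_end - t_start) * fraction


for step in (0, 250_000, 500_000, 1_000_000):
    print(step, linear_cooling(step))
```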

The results of these approaches were compared with the results of our GOTS search. To enable this comparison, the GOTS was combined with the TINKER program package. The GOTS and basin-hopping approaches seem to complement each other. Steered by the alternating descent and ascent strategy, the GOTS represents a more local approach. The BH approach, on the other hand, jumps through the phase space. To test the efficiency of the combination of both approaches, we also performed searches in which the BH approach was used for diversification in the GOTS (GOTS/BH). Such a DS uses 200 BH steps.

To investigate whether ring structures generated by STARTOPT represent good starting structures for conformational searches, we used different starting points in each simulation. In the first series, we started from structures that were generated in a prior MD simulation with a duration of 1 ns and a time step of 1 fs (NVT ensemble). Snapshots were taken every 10 ps. From the 100 structures, we randomly chose 30 starting structures for the subsequent conformational searches. In the second series, we performed STARTOPT once and started the simulations from the resulting structures containing one ring. These simulations are abbreviated as STARTOPT. The last series started from structures containing the maximal number of ring structures of a given molecule. They were obtained by performing STARTOPT repeatedly until no new structures were generated. These simulations are abbreviated as STARTOPT/Mult. For molecules 4 and 5, the number of starting structures turned out to be too large. Hence, only those lowest in energy were used to produce the next generation (see below).

All computations were performed with the OPLS-AA [96] force field as implemented in the TINKER program package. The coordinates of the best structures can be found in the Supporting Information.

Results and Discussion

Table 1 shows the results obtained for the tripeptide Gly-Ala-Ser (1) consisting of 31 atoms. The molecule contains 10 formally freely rotating single bonds, but two of them are rather rigid amide bonds. Table 1 also shows that molecule 1 is too small to represent a reasonable test system. Starting from the MD-generated starting structures, only MD and SA do not find the global minimum, but the energetically lowest minimum found by these simulations is located only 0.9 kcal mol⁻¹ above it. The global minimum is detected by BH and GOTS in about 80% of the simulations. Note that their combination (GOTS/BH) always detects the global minimum, indicating that the two approaches complement each other quite well. Using structures obtained by a single application of STARTOPT (i.e., containing one ring) seems to improve the MD and SA results. They still do not end in the global minimum, but the structure 0.9 kcal mol⁻¹ higher in energy is reached more often. On the other hand, the success of BH and GOTS decreases if they are started from such single-ring structures because these also include higher-lying structures, which do not seem to represent good starting structures. The path required to reach the minimum seems to be too long for normal GOTS searches. If the simulation is started from energetically more favorable structures, the global minimum is detected more often. This was investigated in more detail for molecule 5 (see below). Even if the simulations are started from structures containing two rings (STARTOPT/Mult), the efficiency of the search is not increased. If, however, short BH sequences are used for the DS of the GOTS, the efficiency is increased to 100%. This indicates that for such small molecules ring structures are not particularly useful.

Figure 3. Test systems used in this work.

Table 1. Results for Molecule (1) Containing 31 Atoms.

Optimization method        Emin(a)   #global(b) (%)   #steps(c)   CPU time(d)
MD                         0.9       17               59          1.2
SA                         0.9       13               10          0.2
BH                         0.0       83               1613        1.3
GOTS                       0.0       80               322         0.3
GOTS/BH                    0.0       100              58          0.1
MD-STARTOPT                0.9       75               3           0.1
SA-STARTOPT                0.9       80               9           0.2
GOTS-STARTOPT              0.0       55               250         0.2
BH-STARTOPT                0.0       60               1877        1.5
GOTS STARTOPT/Mult         0.0       53               164         0.1
GOTS/BH STARTOPT/Mult      0.0       100              258         0.4

(a) Relative energy of the energetically lowest minimum found in the given simulation with respect to the lowest minimum found in all simulations of this molecule (E = −168.4 kcal mol⁻¹). All energies are given in kcal mol⁻¹.

(b) Percentage of runs of the simulation which found this minimum.

(c) Averaged numbers of steps (in case of MC and GOTS) or snapshots (in case of MD and SA) needed to find the minimum depicted in column 1 for the first time. The values average only over those runs in which the lowest energy was actually found.

(d) The corresponding averaged CPU time in minutes.

Table 2. Results for Molecule (5) Containing 76 Atoms.

Optimization method        Emin(a)   #global(b) (%)   #steps(c)   CPU time(d)
MD                         17.8      –                44          4.8
SA                         6.2       –                46          5.1
BH                         0.7       6                3140        13.2
GOTS                       1.2       –                955         2.2
GOTS/BH                    0.0       37               632         4.5
MD/STARTOPT                4.7       –                64          7.0
SA/STARTOPT                5.8       5                57          6.3
BH/STARTOPT                0.7       13               2347        9.9
GOTS/STARTOPT              0.7       –                694         1.6
BH/STARTOPT/Mult           0.7       –                436         2.0
GOTS/STARTOPT/Mult         0.7       8                128         0.3
                           0.0       9                184         0.4
GOTS/BH STARTOPT/Mult      0.0       68               472         3.4

(a) Relative energy of the energetically lowest minimum found in the given simulation with respect to the lowest minimum found in all simulations of this molecule (E = −376.3 kcal mol⁻¹). All energies are given in kcal mol⁻¹.

(b) Percentage of runs of the simulation which found this minimum. If the minimum is found only once, no percentage is given.

(c) Averaged numbers of steps (in case of MC and GOTS) or snapshots (in case of MD and SA) needed to find the minimum depicted in column 1 for the first time. The values average only over those runs in which the lowest energy was actually found.

(d) The corresponding averaged CPU time in minutes.

Figure 4. (a) Energetically lowest conformations of molecule 5 found in the present investigations. (b) Structure SI9 generated by the multiple use of STARTOPT.


Table 2 contains the results for the polypeptide Gly-Lys-Ser-Cys-Pro (5). It consists of 76 atoms and 25 formally freely rotatable single bonds. Two of them are conformationally constrained amide bonds. The structure of the lowest conformer found in the present investigations is depicted in Figure 4. The results given in Table 2 clearly show that molecule 5 represents a useful test case. If the simulations are initiated from the MD-generated starting structures, the subsequent simple MD simulation again fails to reach the global minimum. In this case, however, the structure lowest in energy is located about 17 kcal mol⁻¹ above the global minimum. This shows that the phase space has become too large for simple approaches. SA detects a considerably lower minimum, but it is still about 6 kcal mol⁻¹ higher in energy than the global one. A simple GOTS predicts a minimum that is only 1.2 kcal mol⁻¹ higher in energy than the global minimum; it is detected only once. The efficiency of the BH approach is underlined by the fact that its lowest minimum is only 0.7 kcal mol⁻¹ above the global minimum. This structure is found in 6% of the runs. The best results are obtained by the combination of GOTS and BH, which detects the global minimum in about 37% of the runs.

Table 3. Characterization of the Conformers of Molecule (5), Which Were Obtained by Applying STARTOPT Three Times.

Structure   RMSD(a)   Emin(b)   #semi(c) GOTS   #global(d) GOTS   #global(d) GOTS/BH
F15         43.7      11.8      17              17                90
F5          54.0      15.1      –               –                 40
F7          39.7      11.4      –               –                 80
SA2         36.1      11.4      –               –                 90
SA5         53.1      10.3      –               –                 10
SB1         55.9      11.8      11              –                 100
SI9         36.0      6.4       32              32                80
Total                           9               7                 70

(a) RMSD value giving the difference between the torsional angles.

(b) Relative energy of the structure with respect to the lowest minimum found in all simulations of this molecule (E = −376.3 kcal mol⁻¹). All energies are given in kcal mol⁻¹.

(c) Percentage of runs of the simulation which found the minimum lying 0.7 kcal mol⁻¹ above the global minimum.

(d) Percentage of runs of the simulation which found the global minimum.

Figure 5. Characterization of typical simulation runs starting from structure SI9 (see Table 3 and Fig. 4). (a) Generated minima along a GOTS/BH simulation. (b) Accepted minima along a GOTS/BH simulation. (c) Accepted minima along a BH simulation. All simulations started from structure SI9.


For molecule 5, starting structures containing a single ring seem to be helpful. MD, SA, and GOTS simulations end at lower energies, and the detection rate of the BH runs is increased. In this series, we started from all generated ring structures, including those that are higher in energy.

For molecule 5, STARTOPT could build up structures containing up to three different rings. As the number of possible ring structures increases tremendously from generation to generation, we started the next generation only from the 10 energetically lowest structures. The first generation consisted of 10 different ring structures, which are 30–50 kcal mol⁻¹ above the global minimum. In the second generation, 33 different structures were generated. They are 15–35 kcal mol⁻¹ higher in energy than the global minimum. In the third generation, we obtained 36 different structures that are 7–23 kcal mol⁻¹ higher in energy than the global minimum. The seven energetically lowest structures are analyzed in Table 3. Figure 4 shows structure SI9 as an example. As BH, GOTS, and GOTS/BH contain stochastic elements, that is, each run proceeds differently, 10 runs were performed from each structure. Starting the simulations from the structures given in Table 3, the success of the BH approach does not increase. For GOTS-based approaches, the use of these structures was considerably more helpful. The simple GOTS approach finds the global minimum in 7% of the simulations, while the slightly higher-lying minimum is found in 9% of the tests. The GOTS/BH approach is considerably more successful. It locates the global minimum in 70% of the cases.

A more detailed picture of the quality of the starting structures is given in Table 3. The differences of the various starting structures with respect to the global minimum are characterized by RMSD values computed from the torsion angles

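\[
d_{ij} = \left( \frac{1}{N} \sum_{k=1}^{N} \min\left[ \left( \Theta_k^{(i)} - \Theta_k^{(j)} \right)^2, \left( 2\pi - \Theta_k^{(i)} - \Theta_k^{(j)} \right)^2 \right] \right)^{1/2} \tag{1}
\]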

and the relative energies. Table 3 shows that a normal GOTS only ends in the global minimum if it starts from the structures F15, F8, and SI9. The slightly higher-lying minimum is found if the GOTS starts from structures F15, F8, SB1, and SI9. Unfortunately, the usefulness of a starting structure does not seem to correlate with the relative energy or the RMSD value. SI9 is the starting structure lowest in energy and possesses the smallest RMSD value. A computation starting from this structure is very successful. However, in computations starting from structure SA2, which possesses quite similar energy and RMSD values, neither the global nor the slightly higher-lying minimum is found. The GOTS/BH approach is considerably more successful. It detects the global minimum from all starting structures with quite high success rates, with the exception of structure SA5 (10%). For SB1, F15, SA2, SI9, and F7_2, the rate is 80% or higher. Unfortunately, also in this case, no obvious correlation between success rate and relative energy or RMSD value is recognizable.
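
The torsional RMSD used to characterize the starting structures can be sketched in a few lines of Python. The sketch rests on two interpretations made here rather than stated in the article: the second term inside the minimum is read as the wrap-around difference (period minus the absolute angle difference), and angles are handled in degrees. The function name `torsion_rmsd` is likewise hypothetical.

```python
import math


def torsion_rmsd(angles_i, angles_j, period=360.0):
    """RMSD between two sets of torsion angles, taking the shorter way
    around the circle for every pair of angles."""
    if len(angles_i) != len(angles_j):
        raise ValueError("both structures need the same number of torsions")
    total = 0.0
    for a, b in zip(angles_i, angles_j):
        diff = abs(a - b) % period
        diff = min(diff, period - diff)    # periodic (wrap-around) distance
        total += diff * diff
    return math.sqrt(total / len(angles_i))


# 350 and 10 degrees are only 20 degrees apart on the circle.
print(torsion_rmsd([350.0], [10.0]))  # 20.0
```

Without the periodic minimum, two nearly identical torsions on either side of the 0°/360° boundary would be counted as almost a full turn apart.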

While the use of starting structures generated by the STAR-

TOPT seems to accelerate GOTS and GOTS/BH, no advantage

is seen for the BH itself. The reason for this difference becomes

clear from Figure 5, which analyzes the progress of a typical

conformational search starting from structure SI9. Figure 5a

gives the energies of the minima generated in the GOTS/BH

run. For this approach, stochastic elements enter at two points. First, the BH part used as DS obviously contains such elements. Second, for the GOTS part, stochastic elements enter if the next minimum is higher in energy. Then an MC criterion is used to decide if the new minimum is used as the starting point or if the simulation proceeds from the previous minimum in another direction. To show this influence, Figure 5b gives the minima that were accepted as starting points. It can be concluded that the adjustment yields good convergence. The BH as implemented in TINKER accepts much higher-lying minima as starting points. As a consequence, the starting structure is rapidly destroyed.

Table 4. Results for [Met5]enkephalin (4) Containing 75 Atoms.

Optimization method      Emin (a)   #global (b) (%)   #steps (c)   CPU time (d,e)
MD                       7.2        67                36           4.0
SA                       1.5        20                59           6.5
BH                       0.0        57                2610         19.3
GOTS                     0.0        –                 159          1.2
GOTS/BH                  0.0        87                397          4.5
MD-STARTOPT              2.0        15                49           3.9
SA-STARTOPT              2.0        46                48           5.4
BH-STARTOPT              0.0        42                2417         17.9
GOTS-STARTOPT            0.0        –                 35           0.3
GOTS-STARTOPT/Mult       0.0        4                 172          1.3
GOTS/BH STARTOPT/Mult    0.0        76                464          4.8

(a) Relative energy of the energetically lowest minimum found in the given simulation with respect to the lowest minimum found in all simulations of this molecule (E = −263.5 kcal/mol). All energies are given in kcal/mol.
(b) Percentage of runs of the simulation which found this minimum. If the minimum is found only once no percentage is given.
(c) Averaged numbers of steps (in case of MC and GOTS) or snapshots (in case of MD and SA) needed to find the minimum depicted in column 1 for the first time. The values average only over those runs in which the lowest energy was actually found.
(d) The corresponding averaged CPU time in minutes.

Table 5. Results for EPNP (2) Containing 38 Atoms.

Optimization method      Emin (a)   #global (b) (%)   #steps (c)   CPU time (d,e)
MD                       0.5        90                23           0.7
SA                       0.5        63                38           1.1
BH                       0.0        100               420          0.8
GOTS                     0.0        33                319          0.7
GOTS/BH                  0.0        67                224          0.9
MD-STARTOPT              0.5        92                20           0.6
SA-STARTOPT              0.4        31                53           1.6
BH-STARTOPT              0.0        100               1858         3.7
GOTS-STARTOPT            0.4        15                131          0.3
GOTS-STARTOPT/Mult       0.0        11                201          0.4
GOTS/BH STARTOPT/Mult    0.0        99                192          0.7

(a) Relative energy of the energetically lowest minimum found in the given simulation with respect to the lowest minimum found in all simulations of this molecule (E = −2.8 kcal/mol). All energies are given in kcal/mol.
(b) Percentage of runs of the simulation which found this minimum. If the minimum is found only once no percentage is given.
(c) Averaged numbers of steps (in case of MC and GOTS) or snapshots (in case of MD and SA) needed to find the minimum depicted in column 1 for the first time. The values average only over those runs in which the lowest energy was actually found.
(d) The corresponding averaged CPU time in minutes.
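The MC acceptance step described for the GOTS part (deciding whether a higher-lying minimum becomes the new starting point) can be sketched as follows. This is only a minimal illustration that assumes a standard Metropolis criterion; the function name, the temperature parameter, and the Boltzmann constant in kcal/(mol K) are assumptions introduced here and are not taken from the GOTS implementation.

```python
import math
import random

# Boltzmann constant in kcal/(mol K); assumed unit system for this sketch.
K_B = 0.0019872041


def accept_higher_minimum(e_current, e_new, temperature, rng=random):
    """Metropolis-type test (assumed form, not the authors' exact criterion).

    Returns True if the search should continue from the new minimum,
    False if it should stay at the previous minimum and try another direction.
    Energies are in kcal/mol, temperature in K.
    """
    if e_new <= e_current:
        # Downhill or equal moves are always accepted.
        return True
    # Uphill moves are accepted with Boltzmann probability.
    probability = math.exp(-(e_new - e_current) / (K_B * temperature))
    return rng.random() < probability
```

With such a test, a lower effective temperature makes the search reject more uphill minima, so fewer high-lying minima are accepted as starting points.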

As the molecules get progressively larger, the time factor becomes quite important. In this respect, too, the combination of GOTS and BH has advantages over all other approaches. The number of steps needed to find the minima is often smaller than for BH. As the GOTS part is less expensive than the BH part, the same holds for the CPU time. Note that the time needed for STARTOPT is quite small because the computation of one structure needs only three optimization steps.
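As a worked illustration of this comparison, the short calculation below takes the BH and GOTS/BH rows of Table 4 ([Met5]enkephalin, searches started from MD-generated structures) and forms the ratios of averaged steps and averaged CPU time. The variable names are chosen here for illustration only.

```python
# Averaged values copied from Table 4 (BH and GOTS/BH rows).
bh = {"steps": 2610, "cpu_min": 19.3}
gots_bh = {"steps": 397, "cpu_min": 4.5}

# Ratios larger than 1 mean that BH needs more steps or more CPU time.
step_ratio = bh["steps"] / gots_bh["steps"]
time_ratio = bh["cpu_min"] / gots_bh["cpu_min"]

print(f"steps:    BH / GOTS/BH = {step_ratio:.1f}")
print(f"CPU time: BH / GOTS/BH = {time_ratio:.1f}")
```

Both ratios only compare averages over successful runs, as defined in the table footnotes, so they say nothing about how often each method finds the minimum.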

Molecule 5 is only slightly larger than the neurotransmitter peptide [Met5]enkephalin (molecule 4), which consists of 75 atoms and possesses 20 freely rotatable formal single bonds. In [Met5]enkephalin, four formal single bonds are amide bonds. [Met5]enkephalin was also used in other investigations.46,97,98 The present results, summarized in Table 4, indicate that [Met5]enkephalin is easier to handle than molecule 5. Starting from the structures generated by MD, MD and SA are again not able to find the global minimum, but BH and GOTS/BH detect it with success rates of 57 and 87%, respectively. As seen for molecule 1, the success rates for [Met5]enkephalin also decrease slightly if the search is started from structures built by STARTOPT.

The investigations for molecules 2 and 3 lead to the same conclusions as those for molecules 1, 4, and 5. As expected from their size, the difficulty of finding their global minima lies between that of molecules 1 and 5. The results are given in Tables 5 and 6, but we refrain from further discussion.
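The quantities reported in Tables 4 to 6 follow the definitions in the table footnotes: the relative energy of the lowest minimum, the percentage of runs that found it, and the number of steps averaged only over the successful runs. A minimal sketch of this bookkeeping is given below. The record layout, the function name, the energy tolerance, and the run data in the usage example are hypothetical and serve only to illustrate the definitions.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """One search run (hypothetical record layout, not from the paper)."""
    lowest_energy: float   # lowest minimum found in this run, kcal/mol
    first_hit_step: int    # step or snapshot at which it was first found


def summarize(runs, reference_energy, tol=1e-3):
    """Return (Emin, #global in %, averaged #steps) for one method.

    reference_energy is the lowest minimum found in all simulations of the
    molecule. The percentage is None if only one run found the minimum,
    mirroring the dash used in the tables.
    """
    best = min(r.lowest_energy for r in runs)
    hits = [r for r in runs if abs(r.lowest_energy - best) <= tol]
    e_min = best - reference_energy
    percent = 100.0 * len(hits) / len(runs) if len(hits) > 1 else None
    # Average only over runs in which the lowest energy was actually found.
    mean_steps = sum(r.first_hit_step for r in hits) / len(hits)
    return e_min, percent, mean_steps


# Usage with made-up run data; only the reference energy is from Table 4.
example_runs = [
    RunResult(-263.5, 100),
    RunResult(-263.5, 300),
    RunResult(-260.0, 50),
    RunResult(-261.0, 80),
]
print(summarize(example_runs, reference_energy=-263.5))
```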

Summary and Conclusions

In this article, we compare the efficiency of the tabu-search-based optimizer GOTS for conformational search with other frequently used approaches in this field. The investigation comprises a

simple MD approach, the SA procedure as implemented in the

TINKER program package, and the very efficient BH approach.

The study not only emphasizes the success of the GOTS but

also reveals that an efficient DS strategy is needed for larger

molecules. Short sequences of the BH approach turned out to be

very useful in this respect. Applications of the combination of

GOTS and BH (GOTS/BH) to five molecules ranging from 31

to 76 atoms show that it outperforms the single methods.

Additionally, we investigate five-, six-, and seven-membered ring structures in which one bond is a hydrogen bond to determine whether they are reasonable structures for launching conformational searches for smaller peptides and organic ligands with pharmaceutical activities. For this purpose, the above-mentioned simulation techniques are started both from randomly generated structures and from such ring structures, and their convergence is compared. The study shows that such ring structures are useful for larger molecules, especially if structures containing multiple rings are used. The examination of additional nonlinear TS strategies represents a direction for future research.

Acknowledgments

The authors thank Johannes Kastner (University of Stuttgart) for helpful discussions.

References

1. Goetschalckx, M.; Vidal, C. J.; Dogan, K. Eur J Oper Res 2002,

143, 1.

2. Kolda, T. G.; Lewis, R. M.; Torczon, V. SIAM Rev 2003, 45, 385.

3. Floudas, C. A.; Akrotirianakis, I. G.; Caratzoulas, S.; Meyer, C. A.;

Kallrath, J. Comput Chem Eng 2005, 29, 1185.

4. Musafia, B.; Senderowitz, H. Exp Opin Drug Discov 2010, 5, 943.

5. Halperin, I.; Ma, B. Y.; Wolfson, H.; Nussinov, R. Proteins: Struct

Funct Genet 2002, 47, 409.

6. Foloppe, N.; Chen, I. J. Curr Med Chem 2009, 16, 3381.

7. Wales, D. J.; Doye, J. P. K.; Miller, M. A.; Mortenson, P. N.; Walsh, T. R. In Advances in Chemical Physics, Prigogine, I., Ed.; Vol. 115; John Wiley & Sons Inc: New York, 2000; pp. 1–111.

8. Wales, D. J.; Bogdan, T. V. J Phys Chem B 2006, 110, 20765.

9. Wales, D. J. Phys Biol 2005, 2, S86.

10. Wales, D. J. Int Rev Phys Chem 2006, 25, 237.

11. Becker, O. M.; MacKerell, A. D., Jr.; Roux, B.; Watanabe, M. Computational Biochemistry and Biophysics; Marcel Dekker Inc.: New York, 2001.

12. Leach, A. R. Molecular Modelling—Principle and Application, 2nd

ed.; Pearson Education Limited: Harlow, 2001.

13. Wales, D. J. Energy Landscapes; Cambridge University Press: United Kingdom, 2003.

14. Kannan, S.; Zacharias, M. J Struct Biol 2009, 166, 288.

15. Schlund, S.; Schmuck, C.; Engels, B. Chem Eur J 2007, 13, 6644.

16. Han, R. S.; Leo-Macias, A.; Zerbino, D.; Bastolla, U.; Contreras-Mor-

eira, B.; Ortiz, A. R. Proteins: Struct Funct Bioinf 2008, 71, 175.

17. Lee, K. Int J Mol Sci 2008, 9, 65.

Table 6. Results for E64c (3) Containing 50 Atoms.

Optimization method      Emin (a)   #global (b) (%)   #steps (c)   CPU time (d)
MD                       2.1        63                32           1.6
SA                       0.0        27                47           2.4
BH                       0.0        87                830          1.2
GOTS                     0.0        13                276          0.5
GOTS/BH                  0.0        93                113          0.7
MD-STARTOPT              0.0        –                 45           2.3
SA-STARTOPT              0.0        25                56           2.8
BH-STARTOPT              0.0        92                915          1.3
GOTS-STARTOPT            0.0        –                 365          0.7
GOTS-STARTOPT/Mult       0.0        10                392          0.7
GOTS/BH STARTOPT/Mult    0.0        98                125          0.7

(a) Relative energy of the energetically lowest minimum found in the given simulation with respect to the lowest minimum found in all simulations of this molecule (E = −68.8 kcal/mol). All energies are given in kcal/mol.
(b) Percentage of runs of the simulation which found this minimum. If the minimum is found only once no percentage is given.
(c) Averaged numbers of steps (in case of MC and GOTS) or snapshots (in case of MD and SA) needed to find the minimum depicted in column 1 for the first time. The values average only over those runs in which the lowest energy was actually found.
(d) The corresponding averaged CPU time in minutes.


18. Leach, A. R. In Reviews in Computational Chemistry, Vol. 2; Lipkowitz, K. B.; Boyd, D. B., Eds.; VCH: New York, 1991.

19. Senn, H. M.; Thiel, W. Angew Chem Int Ed Engl 2009, 48, 1198.

20. Engels, B.; Peyerimhoff, S. D. J Phys Chem 1989, 93, 4462.

21. Helten, H.; Schirmeister, T.; Engels, B. J Org Chem 2005, 70, 233.

22. Musch, P. W.; Engels, B. J Am Chem Soc 2001, 123, 5557.

23. Bruccoleri, R. E.; Karplus, M. Biopolymers 1987, 26, 137.

24. Gippert, G. P.; Wright, P. E.; Case, D. A. J Biomol NMR 1998, 11, 241.

25. Sadowski, J.; Bostrom, J. J Chem Inf Model 2006, 46, 2305.

26. Smellie, A.; Stanton, R.; Henne, R.; Teig, S. J Comput Chem 2003,

24, 10.

27. Chandrasekhar, J.; Saunders, M.; Jorgensen, W. L. J Comput Chem

2001, 22, 1646.

28. Chen, J. H.; Im, W.; Brooks, C. L. J Comput Chem 2005, 26, 1565.

29. Glen, R. C.; Payne, A. W. R. J Comput Aided Mol Des 1995, 9,

181.

30. Scheraga, H. A.; Lee, J.; Pillardy, J.; Ye, Y. J.; Liwo, A.; Ripoll, D.

J Global Opt 1999, 15, 235.

31. Goodman, J. M.; Still, W. C. J Comput Chem 1991, 12, 1110.

32. Bohm, G. Biophys Chem 1996, 59, 1.

33. Floudas, C. A.; Klepeis, J. L.; Pardalos, P. M. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 47; American Mathematical Society: New Jersey, 1999.

34. Neumaier, A. SIAM Rev 1997, 39, 407.

35. Li, Z. Q.; Laidig, K. E.; Daggett, V. J Comput Chem 1998, 19, 60.

36. Vengadesan, K.; Gautham, N. Curr Sci 2005, 88, 1759.

37. Kostrowicki, J.; Scheraga, H. A. J Phys Chem 1992, 96, 7442.

38. Chang, G.; Guida, W. C.; Still, W. C. J Am Chem Soc 1989, 111,

4379.

39. Morales, L. B.; Garduno-Juarez, R.; Romero, D. J Biomol Struct Dynamics 1991, 8, 721.

40. Wilson, S. R.; Cui, W.; Moskowitz, J. W.; Schmidt, K. E. J Comput

Chem 1991, 12, 342.

41. Grubmuller, H. Phys Rev E Stat Phys Plasmas Fluids Relat Interdis-

cip Topics 1995, 52, 2893.

42. Huber, T.; Torda, A. E.; van Gunsteren, W. F. J Comput Aided Mol

Des 1994, 8, 695.

43. Iannuzzi, M.; Laio, A.; Parrinello, M. Phys Rev Lett 2003, 90,

238302.

44. Yang, Y. D.; Liu, H. Y. J Comput Chem 2006, 27, 1593.

45. Li, Z. G.; Scheraga, H. A. Biophys J 1987, 51, A232.

46. Li, Z. Q.; Scheraga, H. A. Proc Natl Acad Sci USA 1987, 84, 6611.

47. Wales, D. J.; Scheraga, H. A. Science 1999, 285, 1368.

48. Wales, D. J.; Doye, J. P. K. J Phys Chem A 1997, 101, 5111.

49. Hernandez-Rojas, J.; Breton, J.; Llorente, J. M. G.; Wales, D. J.

J Phys Chem B 2006, 110, 13357.

50. Li, Z. Q.; Scheraga, H. A. Biophys J 1988, 53, A46.

51. Nanias, M.; Chinchio, M.; Oldziej, S.; Czaplewski, C.; Scheraga, H.

A. J Comput Chem 2005, 26, 1472.

52. Oldziej, S.; Czaplewski, C.; Liwo, A.; Chinchio, M.; Nanias, M.;

Vila, J. A.; Khalili, M.; Arnautova, Y. A.; Jagielska, A.; Makowski,

M.; Schafroth, H. D.; Kazmierkiewicz, R.; Ripoll, D. R.; Pillardy, J.;

Saunders, J. A.; Kang, Y. K.; Gibson, K. D.; Scheraga, H. A. Proc

Natl Acad Sci USA 2005, 102, 7547.

53. Baumann, K.; von Korff, M.; Albert, H. J Chemom 2002, 16, 351.

54. Baumann, K.; Albert, H.; von Korff, M. J Chemom 2002, 16, 339.

55. Glover, F. ORSA J Comput 1989, 1, 190.

56. Glover, F. ORSA J Comput 1990, 2, 4.

57. Glover, F.; Laguna, M. Tabu Search; Kluwer Academic Publishers:

MA, USA, 1997.

58. Blum, C.; Roli, A. ACM Computing Surveys 2003, 35, 268.

59. Pardalos, P. M.; Resende, M. G. C. Handbook of Applied Optimiza-

tion; Oxford University Press: Oxford, 2002.

60. Rayward-Smith, V. J.; Osman, I. H.; Reeves, C. R.; Smith, G. D.

Modern Heuristic Search Methods; Wiley: Chichester, 1996.

61. Ribeiro, C. C.; Hansen, P. Essays and Surveys in Metaheuristics;

Kluwer Academic Publishers: Boston, MA, 2002.

62. Gendreau, M.; Potvin, J.-Y. Handbook of Metaheuristics, Vol. 146; Springer International Series in Operations Research & Management Science: New York, 2010.

63. Glover, F. Discrete Appl Math 1994, 49, 231.

64. Glover, F.; Mulvey, J. M.; Hoyland, K. In Meta-Heuristics: Theory

& Applications; Osman, I. H., Kelly, J. P., Eds.; Kluwer Academic

Publishers: Berlin, 1996; pp. 429–448.

65. Hvattum, L. M.; Glover, F. Eur J Oper Res 2009, 195, 31.

66. Ugray, Z.; Lasdon, L.; Plummer, J.; Glover, F.; Kelly, J.; Marti, R.

Informs J Comput 2007, 19, 328.

67. Duarte, A.; Marti, R.; Glover, F.; Gortazar, F. In Annals of Operations Research; Springer Science+Business Media, LLC: Piscataway, New Jersey, 2009.

68. Stepanenko, S.; Engels, B. J Comput Chem 2007, 28, 601.

69. Stepanenko, S.; Engels, B. J Comput Chem 2008, 29, 768.

70. Stepanenko, S.; Engels, B. J Phys Chem A 2009, 113, 11699.

71. Dennis, J. E.; Schnabel, R. B. Numerical Methods for Unconstrained

Optimization; SIAM: Philadelphia, 1996.

72. Gill, P. E.; Murray, W.; Wright, M. H. Practical Optimization; Aca-

demic Press: San Diego, 1981.

73. Nocedal, J.; Wright, S. J. Numerical Optimization; Springer: New

York, 1999.

74. Press, W. H.; Teukolsky, S. A.; Vetterling, W. T.; Flannery, B. P. Numerical Recipes in C++, 2nd ed.; Cambridge University Press: New York, 2002.

75. Bogdan, T. V.; Wales, D. J.; Calvo, F. J Chem Phys 2006, 124, 044102.

76. Kirchner, B. J Chem Phys 2005, 123, 204116.

77. Kryshtafovych, A.; Fidelis, K. Drug Discov Today 2009, 14, 386.

78. Abagyan, R.; Totrov, M.; Kuznetsov, D. J Comput Chem 1994, 15, 488.

79. Baysal, C.; Meirovitch, H. J Chem Phys 1996, 105, 7868.

80. Felts, A. K.; Gallicchio, E.; Chekmarev, D.; Paris, K. A.; Friesner,

R. A.; Levy, R. M. J Chem Theory Comput 2008, 4, 855.

81. Naray-Szabo, G.; Warshel, A. Computational Approaches to Biochemical Reactivity, 1st ed.; Kluwer Academic Publisher: New York, 2002.

82. Qu, X. T.; Swanson, R.; Day, R.; Tsai, J. Curr Protein Pept Sci

2009, 10, 270.

83. Wei, B. Q.; Weaver, L. H.; Ferrari, A. M.; Matthews, B. W.; Shoi-

chet, B. K. J Mol Biol 2004, 337, 1161.

84. Xia, J. C.; Margulis, C. J Biomol NMR 2008, 42, 241.

85. Zhang, Y. Curr Opin Struct Biol 2008, 18, 342.

86. Clayden, J.; Greeves, N.; Warren, S.; Wothers, P. Organic Chemis-

try; Oxford University Press: New York, 2004.

87. Abagyan, R. A.; Mazur, A. K. J Biomol Struct Dynamics 1989, 6, 833.

88. Mazur, A. K.; Abagyan, R. A. J Biomol Struct Dynamics 1989, 6, 815.

89. Echenique, P.; Alonso, J. L. J Comput Chem 2006, 27, 1076.

90. Ponder, J. W.; Richards, F. M. J Comput Chem 1987, 8, 1016.

91. Hodsdon, M. E.; Ponder, J. W.; Cistola, D. P. J Mol Biol 1996, 264,

585–602.

92. Kundrot, C. E.; Ponder, J. W.; Richards, F. M. J Comput Chem

1991, 12, 402.

93. Pappu, R. V.; Hart, R. K.; Ponder, J. W. J Phys Chem B 1998, 102,

9725.

94. Ren, P. Y.; Ponder, J. W. J Comput Chem 2002, 23, 1497.

95. Ren, P. Y.; Ponder, J. W. J Phys Chem B 2003, 107, 5933.

96. Jorgensen, W. L.; Maxwell, D. S.; Tirado-Rives, J. J Am Chem Soc 1996, 118, 11225.

97. Evans, D. A.; Wales, D. J. J Chem Phys 2003, 119, 9947–9955.

98. Lee, J.; Scheraga, H. A.; Rackovsky, S. J Comput Chem 1997, 18,

1222.
