chapter 2_molecular modelling

8/7/2019 Chapter 2_Molecular Modelling


Protocol list

Crystallization of proteins and complexes for X-ray analysis
  Batch crystallization of monoclinic ribonuclease II
  Microbatch crystallization (selection of protein not specified)
  Hanging drop vapour diffusion
  Batch crystallization of lysozyme at acid pH employing NSDB195 non-detergent

Recording and measurement of X-ray diffraction patterns
  Setting up a data collection on the Mar image plate

Preliminary crystal data
  Example 1 of space group determination: for mistletoe lectin MLI
  Example 2 of space group determination: ricin agglutinin, RCA
  Calculation of number of reflections in data set for MLI
  Example 1 for analysis of solvent content (MLI)
  Example 2 for analysis of solvent content (RCA)

Determination of the phases (hkl) for protein crystals
  Application of the AMoRe algorithms to mistletoe lectin MLI
  Application of density modification DM

Protein structure prediction
  Fragment-based modelling using SYBYL COMPOSER

Docking of ligands to proteins
  Docking using AUTODOCK version 3.0

Ab initio (de novo) ligand design
  Setting up an ab initio ligand design procedure using LEAPFROG

The spectroscopic basis of circular dichroism
  Overall strategy for analysis of a CD spectrum

The circular dichroism spectrometer
  What the operator must control in a CD experiment

Monitoring ligand binding by CD spectroscopy
  The three criteria for analysis of simple 1:1 binding

Experimental protocols
  Ligand binding to serum albumin revealed by tryptophan fluorescence quenching and lifetime studies

  Preparation of model membranes labelled with FPE
  Labelling of the outer bilayer leaflet of the plasma membranes of lymphocytes with FPE
  Labelling of erythrocyte membranes with FPE
  Labelling of B12 rat glial cells plasma membranes with FPE
  Preparation of FPE-labelled gp60 vesicles and control liposomes
  Spatial imaging of the cell surface electrostatic potential; use to identify localized binding/membrane insertion interactions

Testing the performance of a stopped-flow instrument
  Testing the mixing efficiency of a stopped-flow instrument
  Measurement of the dead-time of a stopped-flow instrument in the fluorescence mode
  Measurement of the dead-time of a stopped-flow instrument in the absorbance mode

Experimental design
  Procedure for acquisition of transmission spectra
  Post-processing of spectra
  Application of band narrowing techniques

Thermal and solvent manipulation techniques for assessing protein-ligand interactions
  Thermal stressing of proteins

Raman experiments
  Obtaining Raman spectroscopic data on protein-ligand interactions
  Acquisition of spectra
  Preparation of a silver sol by sodium citrate reduction of silver nitrate

Specific information on proteins and their interactions with ligands
  Estimation of hydrogen bond strength

Electrospray ionization
  Nano-flow with a capillary ESI source

Protein-ligand interactions of calmodulin
  ESI mass spectrometry of calmodulin under denaturing conditions
  ESI-MS of calmodulin under buffered conditions
  Solution phase H/D exchange
  Gas phase H/D exchange
  Ion binding to calmodulin
  Chemical and enzymatic fragmentation of proteins

Origin of the EPR spectrum
  Establishing conditions for running EPR experiments

Conventional EPR spectroscopy
  Spin labelling membrane proteins with 4-maleimido-TEMPO
  Spin labelling membrane proteins with 4-isothiocyanato-TEMPO
  Obtaining a spectrum of spin labelled protein
  Dialysis of membranes
  Determining the spin label to protein ratio
  Deconvolution of a two-component spectrum

Saturation transfer EPR
  Calibration of the microwave field
  Recording a ST-EPR spectrum of a spin labelled membrane protein

Protein-ligand interactions
  Examining the effects of ligand on the conventional EPR spectrum
  Time-resolved spectroscopy

Chemical exchange and analysing the NMR spectra for protein-ligand complexes
  Determination of fast and slow exchange in protein-ligand complexes

Screening for ligand interactions
  Screening for weak protein-ligand interactions

Overview of NMR techniques used to study protein-ligand interactions
  Optimizing the NMR spectral line widths in the presence of protein aggregation

Quantification of the adhesion forces of individual protein-ligand complexes
  Immobilization of antibody for AFM force measurements
  Immobilization of proteins to an AFM probe via a PEG linker

Chapter 2

Molecular modelling

Romano T. Kroemer

Department of Chemistry, Queen Mary and Westfield College, University of London, Mile End Road, London E1 4NS, UK.

1 Introduction

Molecular modelling and computational chemistry have come a long way, from the era when they were restricted to a small number of scientists using very specialized and user-unfriendly software and hardware to the present, where we are confronted with a vast number of well-integrated programs and relatively cheap hardware. The recent explosion of the Internet has added significantly to the amount of software available, as it is often accessible directly via the net. Nowadays, many of these programs can also be used by someone without programming expertise, and it is not necessary to have studied theoretical chemistry for years beforehand. However, the program in question should not be treated as a black box either, and the user must have an understanding of the underlying theory and concepts.

This chapter is intended to give a flavour of the molecular modelling techniques used in the context of protein-ligand interactions. It is beyond the scope of this chapter to describe all programs and techniques in detail, and in many cases they will be mentioned only briefly. However, the aim is to provide the reader with an overview of some basic techniques in the area of protein-ligand modelling and to give some example applications using selected programs and techniques. The selection of these examples is rather arbitrary and reflects the author's experience; it is not intended to indicate any degree of superiority over other programs with the same or similar functionality. Reference to other programs will be given wherever possible.

In the context of modelling protein-ligand interactions one can think of a variety of issues, which are outlined in Figure 1. The first point is that for many applications one needs a protein structure to begin with. This structure can come from experimental determination (usually X-ray or NMR). In cases where there is no structure available, one can resort to the techniques of protein structure prediction. Usually, with the (protein) receptor structure in hand, one can start to analyse the structure for potential binding sites and/or the characteristics of these sites. In cases where the structure(s) of the protein(s) are available, but not the geometry of the complex, it is often necessary to elucidate how proteins or small molecule ligands dock: here one tries to fit two molecules together in energetically favourable conformations by applying computational procedures.

A further related problem has attracted considerable attention: the design of a new ligand from scratch (de novo) with the aim of interfering with a specific biological process. In some cases it is necessary to perform precise calculations or predictions of the binding energies in order to distinguish between several possible ligands. Last but not least, there are computational methods for the statistical analysis of binding affinities for a set of ligand molecules. Normally these latter procedures are applied when the receptor structure is not available, but the binding affinities of several different ligands are known.

Before considering how we address these topics in a practical sense, it is worth pointing out that the boundaries between many of the topics (see the headings of Figure 1) are rather diffuse and that there is sometimes considerable overlap between them.

2 Protein structure prediction

The primary source of experimentally determined protein structures is the Brookhaven Protein Data Bank (PDB, www.rcsb.org) (1). At the time of writing the PDB contained approximately 9000 protein structures. This is a considerable number, and the number of experimentally determined structures deposited per year has been increasing significantly. However, the structure of the protein of interest is not always known, and ventures like the human genome project produce many more sequences than solved structures. Therefore, one may need to resort to structure prediction techniques in order to obtain the requisite co-ordinates of the target protein or receptor.

Figure 1 Topics related to the modelling of protein-ligand interactions. The numbers correspond to the sections in this chapter.

Every protein in its native functional state is folded in a specific way, but the mechanism by which a specific fold is adopted is not understood. The correlation between sequence and fold is low, and many different proteins can share similar folding patterns. Thus it has been suggested that some general features of the sequence determine the fold at least partially, and that the number of these folding patterns (or families) is much smaller than the number of proteins (2-4). To date, three distinct categories of protein structure prediction methods exist.

2.1 Ab initio predictions

Here one starts from a sequence and tries to predict the secondary or tertiary structure of a protein. Secondary structure prediction methods rely on statistical analysis of known structures. Perhaps the most widely used algorithms are those of Chou and Fasman (5) and of Garnier et al. (6). More recent methods (7) are reported to exceed 70% accuracy in three-state prediction (helix, sheet, and other) (8). Once secondary structural units of a protein have been assigned, one can then attempt to pack them together in three dimensions, using several evaluation criteria for the different packing modes generated. Such an approach was used successfully, for example, in the case of interleukin-4, before its experimental structure became available (9). The prediction of tertiary structure directly from the protein sequence is certainly the most ambitious task. Methods include semi-exhaustive searches of the main chain dihedral angles for small proteins and generation of the 3D structures from a predicted set of inter-residue contacts (10).
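The three-state accuracy quoted above is usually understood as the fraction of residues whose predicted state matches the observed one. The following is a minimal sketch of that bookkeeping, assuming both assignments are given as equal-length strings over the letters H (helix), E (sheet), and C (other); the function name and the example strings are hypothetical and not taken from any of the cited methods.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose three-state label (H, E, or C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("state strings must have equal length")
    if not observed:
        raise ValueError("state strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 12-residue example: two positions disagree.
predicted = "HHHHCCEEEECC"
observed = "HHHCCCEEEECH"
print(f"three-state accuracy = {q3_accuracy(predicted, observed):.2f}")
```

A real evaluation would average this quantity over many proteins that were not used to derive the prediction statistics.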

2.2 Threading, or fold recognition techniques

These techniques try to assess whether a given sequence is compatible with one of the structures in a database of known folds. They account for the possibility that two proteins may share a similar fold despite having no detectable sequence relationship. Since 1991, when the three-dimensional profiles method was developed (11), a number of fold recognition techniques relying on different scoring functions have emerged (12-17).

2.3 Homology (or comparative) modelling

This is usually the method of choice when there is a clear relationship or homology between the sequence of a target protein and at least one known structure. This technique is based on the assumption that the tertiary structures of two proteins will be similar if their sequences are related, and it is the approach most likely to give accurate results.

First, proteins of known structure whose sequences are related to that of the target have to be identified. This can be achieved by sequence database searching (18, 19). Knowledge of the function or the family of the target protein might prove useful for the identification of homologues. The next task is the alignment of the target sequence with those of the known structures. This is a very important step in the procedure, as serious errors at this stage are very difficult to correct later. If the percentage identity between the compared sequences is high (> 45%), the correct sequence alignment is straightforward. When identity is low (< 25%), alignment becomes difficult. However, knowledge of the family fold (e.g. a four-helix bundle as in helical cytokines, associated with the presence of a hydrophobic core) may allow for a good alignment even in these cases (20).
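To make the identity thresholds concrete, the sketch below computes the percentage identity of two already aligned sequences and sorts the result into the regimes described above. It assumes gaps are written as "-" and that identity is counted over columns where neither sequence has a gap; other conventions for the denominator exist. The function names, the label for the intermediate range, and the example sequences are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over columns without a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    identical = sum(a == b for a, b in columns)
    return 100.0 * identical / len(columns)


def alignment_regime(identity: float) -> str:
    """Map identity to the regimes in the text (45% and 25% thresholds)."""
    if identity > 45.0:
        return "straightforward"
    if identity < 25.0:
        return "difficult"
    return "intermediate"  # assumed label for the range in between


identity = percent_identity("ACDE-FG", "ACDQKFG")
print(f"{identity:.1f}% identity: alignment is {alignment_regime(identity)}")
```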

Two widely used approaches to comparative modelling are discussed below.

Figure 2 Outline of fragment-based and restraint-based comparative modelling.

2.3.1 Fragment-based comparative modelling

In this approach rigid fragments from other proteins are used to assemble the structure of the target protein (21-24). The main steps are:

(a) A common framework (structurally conserved regions, SCRs) (25) is derived from the superimposed known structures (Figure 2).

(b) The side chains of the target protein are projected onto the SCRs, using the sequence alignment generated initially. Side chain conformations can be modelled with respect to the template structures, or, if not applicable, by comparison to rotamer libraries (26, 27).

(c) Variable regions (loops) can be added by employing database searches for suitable loop fragments (21, 28). Other approaches for adding this information include systematic search procedures or methods involving molecular dynamics (MD) or Monte Carlo (MC) simulations.

A program frequently used for fragment-based modelling is COMPOSER, which was originally developed by T. L. Blundell and co-workers (29, 30). It is also incorporated in the software SYBYL (31). The use of COMPOSER is outlined in Protocol 1.

Protocol 1

Fragment-based modelling using SYBYL COMPOSER

Equipment

Unix workstation

SYBYL software including the COMPOSER module

Method

1 Select the sequence (Select Sequence).

2 Identify homologous, structurally known proteins and align their sequences with the target sequence (Find Homologs).

3 Assign seed residues for the start of an iterative 3D-alignment (fitting) procedure of the known structures (Identify Seeds).[a]

4 Perform the 3D-alignment, i.e. superposition of the known structures, and extract the structurally conserved regions (Align Structures).

5 Generate the structurally conserved core of the target protein (Build SCRs).

6 Add the remaining parts of the target protein via database search (Add Loops).

[a] Alternatively one can perform steps 1 and 2 with other software, i.e. searching in other databases and performing sequence alignments with different methods. In this case the user has to provide three files for step 3: a .pir file with the sequence of the target protein, a .homol file indicating the homologous sequences, and a .homlog file containing the alignment of these sequences with the target sequence. The best strategy in this case would be to initiate a standard COMPOSER run first, to modify the files that have been generated by the program accordingly, and to re-start the procedure with these files.

2.3.2 Restraint-based modelling

Predictions are based on restraints (such as inter-atomic distances) derived from homologous protein structures (32-35). The initial two steps are similar to those for fragment-based modelling, i.e. sequence alignment and structural alignment of the known structures. In the next step distance (and other) restraints are derived from the superimposed structures. A technique referred to as distance geometry (36) is then applied to build an ensemble of models that satisfy the input restraints. An outline of the method is given in Figure 2.
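The idea of deriving distance restraints from templates can be illustrated with a small sketch. It is not distance geometry and not the procedure of any cited program: it simply assumes that each template is a list of equivalent C-alpha co-ordinates (one per aligned position), takes the smallest and largest inter-residue distance seen across the templates as lower and upper bounds, widens them by an arbitrary tolerance, and then reports which restraints a candidate model violates. All names and numbers are hypothetical.

```python
from itertools import combinations
from math import dist


def derive_restraints(templates, tolerance=0.5):
    """Lower/upper distance bounds per residue pair from aligned templates."""
    n_residues = len(templates[0])
    restraints = {}
    for i, j in combinations(range(n_residues), 2):
        distances = [dist(t[i], t[j]) for t in templates]
        restraints[(i, j)] = (min(distances) - tolerance,
                              max(distances) + tolerance)
    return restraints


def violated(model, restraints):
    """Residue pairs of the model whose distance falls outside the bounds."""
    return [(i, j) for (i, j), (low, high) in restraints.items()
            if not low <= dist(model[i], model[j]) <= high]


# Two hypothetical three-residue templates (co-ordinates in Angstrom).
template_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
template_2 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.0, 1.5, 0.0)]
restraints = derive_restraints([template_1, template_2])

bent_model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
print("violations in template-like model:", violated(template_1, restraints))
print("violations in bent model:", violated(bent_model, restraints))
```

A model-building procedure would search for co-ordinates that leave this list empty (or minimize a penalty on the violations) rather than merely test a given model.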

An example of this procedure is the program MODELLER, which has been developed by A. Sali and T. L. Blundell (37). Restraints include structural properties at residue positions and relationships between residues. In addition to distance restraints they include solvent accessibility, secondary structure, and hydrogen bonding.

At the end of each structure prediction exercise it is necessary to perform some checks in order to ensure that the structure is free from strain, that there are no clashes between amino acids, and that the chiralities are correct. Energy refinement might be necessary in order to relieve strain and to alleviate slight irregularities in the structure. Verification of the prediction should include information from experiments, such as mutation data, if available.
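One of the checks mentioned above, the search for clashes between amino acids, can be sketched very simply. The example assumes atoms are given as (residue number, atom name, co-ordinates) tuples, ignores pairs within the same or directly neighbouring residues (which are covalently linked), and flags any remaining pair closer than a cutoff. The 2.2 Å cutoff and the data layout are illustrative assumptions, not values from the chapter; dedicated validation programs use atom-type dependent criteria.

```python
from itertools import combinations
from math import dist


def find_clashes(atoms, min_distance=2.2, skip_neighbours=1):
    """Report atom pairs from non-adjacent residues closer than min_distance."""
    clashes = []
    for (res_a, name_a, xyz_a), (res_b, name_b, xyz_b) in combinations(atoms, 2):
        if abs(res_a - res_b) <= skip_neighbours:
            continue  # same or sequence-adjacent residue: bonded contacts expected
        distance = dist(xyz_a, xyz_b)
        if distance < min_distance:
            clashes.append((res_a, name_a, res_b, name_b, round(distance, 2)))
    return clashes


# Hypothetical model fragment: the CB of residue 9 sits on top of residue 1.
model = [
    (1, "CA", (0.0, 0.0, 0.0)),
    (2, "CA", (3.8, 0.0, 0.0)),
    (3, "CA", (7.6, 0.0, 0.0)),
    (9, "CB", (0.5, 1.0, 0.0)),
]
for clash in find_clashes(model):
    print("clash:", clash)
```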

3 Analysis of protein structures

Once a 3D structure has been obtained for the protein of interest, it can be analysed using different computational techniques. The purpose of such an analysis is manifold. If the ligand binding site is not known, it will be important to determine putative binding sites. Several computational procedures have been developed in order to detect clefts or cavities in proteins (38-41). These cavities can then be explored for binding of selected molecules (i.e. docking) or for the design of ligands. In many cases the binding site is known, and one wants to gain information on the properties of the binding site and how a ligand can bind.

The analysis of known interfaces has contributed significantly to our understanding of protein-protein and protein-ligand interfaces. S. Jones and J. M. Thornton found that the properties of the interfaces depend largely on the type of binding partners involved (42): molecules that exist both in complexes and as independent structures tend to form more polar interfaces in the complexes. Small molecules usually bind by docking into a cleft of their binding partner. These and other characteristics then served as a basis for the prediction of putative interfaces (43). Another study has revealed that many protein-protein interfaces are enriched in aromatic and aliphatic amino acids and depleted in certain charged residues (44). From a visual survey of 136 homodimeric proteins T. A. Larsen and co-workers concluded that a significant number of different types of protein-protein interface exist (45). They found that approximately one-third of the interfaces display a defined hydrophobic core, surrounded by a ring of polar interactions. However, 61% of the complexes analysed had interfaces composed of a mixture of hydrophobic patches and polar interactions.

The studies mentioned above indicate that it might be of interest to analyse and display the properties of a binding site in more detail, prior to any docking or design studies. A commonly used technique is to calculate a surface representation of the protein and to map certain properties such as hydrophobicity or electrostatic potential onto it (see, e.g. 46-50). Calculation and graphical examination of the electrostatic potential around a molecule can provide other useful information (51). The location of charged and polar groups in a protein can have a significant influence on the shape of the potential. As an example, this has been demonstrated convincingly for the enzyme trypsin (Plate 5). In this case the electrostatic potential around the molecule was calculated using the finite difference Poisson-Boltzmann method (52). The potential contours reveal how two proteins that both carry net positive charges are able to associate in a complex.

Another procedure for analysing binding sites in proteins is to predict favourable positions for probes or small molecules (53, 54). One of the most popular tools for this is the GRID program by Peter Goodford (55). Here the binding site is embedded in a regular grid. A probe (atom, group, or small molecule such as water) is then placed at the lattice intersections and the interaction energy between the probe and the protein is calculated using an empirical energy function. The resulting energy map can then be analysed for favourable interactions. The program MCSS combines the analysis of the protein binding site with the calculation of energetically favourable orientations of small functional groups (56).
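The grid idea can be sketched in a few lines. The example below is not the GRID energy function: it assumes a generic Lennard-Jones 12-6 term with a single well depth and radius for all atoms, plus a Coulomb term with a constant dielectric, and evaluates the energy of a point probe at every node of a regular lattice. The parameter values, the atom list, and the function names are all illustrative assumptions.

```python
import math
from itertools import product

COULOMB_CONSTANT = 332.06  # kcal Angstrom / (mol e^2)


def probe_energy(point, atoms, probe_charge=0.0, epsilon=0.1, sigma=3.5,
                 dielectric=4.0, r_min=0.5):
    """Lennard-Jones plus Coulomb energy (kcal/mol) of a probe at `point`.

    `atoms` is a list of (x, y, z, partial_charge); distances below r_min
    are clamped to avoid the singularity at zero separation.
    """
    energy = 0.0
    for x, y, z, charge in atoms:
        r = max(math.dist(point, (x, y, z)), r_min)
        sr6 = (sigma / r) ** 6
        energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
        energy += COULOMB_CONSTANT * probe_charge * charge / (dielectric * r)
    return energy


def energy_map(atoms, origin, spacing, shape, **probe):
    """Probe energy at every node of a regular grid, keyed by (i, j, k)."""
    grid = {}
    for i, j, k in product(*(range(n) for n in shape)):
        point = (origin[0] + i * spacing,
                 origin[1] + j * spacing,
                 origin[2] + k * spacing)
        grid[(i, j, k)] = probe_energy(point, atoms, **probe)
    return grid


# Two hypothetical site atoms and a positively charged probe.
site_atoms = [(0.0, 0.0, 0.0, -0.5), (1.5, 0.0, 0.0, 0.3)]
grid = energy_map(site_atoms, origin=(-4.0, -4.0, -4.0), spacing=1.0,
                  shape=(9, 9, 9), probe_charge=1.0)
best_node = min(grid, key=grid.get)
print("most favourable grid node:", best_node)
```

Contouring such a map at a negative energy level is what reveals regions where the probe is predicted to bind favourably.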

In many cases the analyses we have described so far in this chapter lead on directly to the following methods: docking and ab initio design.

4 Docking of ligands to proteins

In molecular docking one attempts to find the preferred mode of binding of ligands in a protein-ligand or protein-protein complex. The first approach of this type was to explore the docking manually using interactive graphics. This, however, did not allow for systematic searching of the possible ligand-receptor orientations. Since then a considerable number of docking methods have been developed. Nowadays docking is usually performed in an automated fashion, and the goodness of fit of the ligand is evaluated by an energy scoring function rather than by eye.

In the docking process both the receptor and the ligand can be treated as rigid structures. This reduces the degrees of freedom in the orientational search significantly. However, in nature the key-lock principle is only valid for a flexible lock and a flexible key. Therefore, in order to find a compromise between computational tractability and accuracy, a number of docking algorithms have been developed where at least the ligand is treated as being flexible. Nevertheless, rigid docking algorithms have also proven remarkably successful (57).

In principle all docking applications include four steps:

Identification and preparation of the receptor site

Preparation of the ligand(s)

Docking the ligand(s)

Evaluation of the docked orientations

Criteria for differentiating between the various docking methods include the way the receptor site is described, the type of ligand (small molecule and/or protein), the docking algorithm, and the evaluation procedure (scoring function). A selection of different docking programs/methods is given in Table 1.

Table 1 Examples of docking programs

Program | Ref. | Site description | Search method | Scoring | Type of ligands | Flexibility
AUTODOCK | 58 | Grid points | MC, GA | FF | Small, proteins | Ligand
DOCK | 59, 60 | Spheres (site points) | Geometric matching | Shape, FF | Small, proteins |
DOCKVISION | 61 | Atoms | MC, GA | FF | Small | Ligand
FLEXX | 62 | Interaction points | Fragment based | LUDI scoring function | Small | Quasiflexible ligand via discrete model
FLOG | 63 | Site points | Geometric filter | FF terms | Small | Quasiflexible ligand via multiconformer database
FTDOCK | 64 | Grid | Systematic | Shape, elec. complementarity | Proteins |
GRAMM | 65 | Function | Systematic | Energy function | Small, proteins |
LIGIN | 66 | Atomic surface | Geometric filter | Steric, Interact, H-bond | Small | Ligand
MULTIDOCK | 67 | Atoms | Side chain mean field optimization, rigid body minimization | FF | Proteins | Both

As described in Section 2, there are three different routes for determining 3D structures of proteins: X-ray crystallography, high-resolution NMR, and homology modelling. The structures coming from these sources have to be sufficiently accurate in order to be of use for the docking experiments. For X-ray structures this implies that their resolution should be in the region of 2 Å or lower and the thermal factors (R-factors) should be less than 30% (68). NMR structure determination normally results in an ensemble of structures. One can use the average of these structures, in which case the rmsd variation between the structures should not exceed 1.5 Å. An approach for using the entire ensemble of structures for docking has been recently reported (69). Structures resulting from homology modelling are likely to have rmsd values higher than 0.7 Å for the Cα atoms (70), and larger errors can be expected for the side chains (71). Nevertheless, protein model structures have been used successfully for docking studies (72, 73).
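The rmsd values used above as accuracy criteria are root-mean-square deviations between equivalent atoms. A minimal sketch follows, assuming the structures have already been superimposed and are given as lists of co-ordinates in the same atom order (no fitting is performed here). The helper that reports the largest pairwise rmsd within an ensemble, and the example co-ordinates, are hypothetical.

```python
from itertools import combinations
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superimposed atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty co-ordinate lists of equal length")
    squared = sum((p - q) ** 2
                  for atom_a, atom_b in zip(coords_a, coords_b)
                  for p, q in zip(atom_a, atom_b))
    return sqrt(squared / len(coords_a))


def max_pairwise_rmsd(models):
    """Largest rmsd between any two members of an ensemble."""
    return max(rmsd(a, b) for a, b in combinations(models, 2))


# Hypothetical three-member ensemble of a two-atom fragment (Angstrom).
ensemble = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)],
    [(0.0, 0.5, 0.0), (3.8, 0.5, 0.0)],
]
spread = max_pairwise_rmsd(ensemble)
print(f"largest pairwise rmsd: {spread:.2f}")
```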

Once a suitable protein structure has been selected for the docking process, one has to determine the region of interest for the docking procedure. Usually this is the active site of an enzyme, or the binding site of a receptor. In cases where the location of this site is not known, it may be identified by graphical inspection or by using an automated method (41, 59, 74). Many docking programs do not use an atomistic description of the binding site but rely on alternative representations. For example, the molecular surface of a protein binding site can be represented as an array of spheres (Figure 3). The centres of these spheres can then be used in the calculation for fitting a ligand atom in the docking process. Another approach would be to represent the binding site by a grid, where the grid points carry information about the interaction energies between probe atoms and the binding site (55).

The ligand(s) have to be considered next. This could involve pre-calculation of different conformers, assignment of rotatable bonds, building of a database of suitable fragments (in order to introduce flexibility), or generation of a grid-representation (discretisation) of the molecule(s). Other tasks may include the evaluation of partial charges for the ligand atoms.

Once the preparation is complete, the docking itself can begin. Again, depending on the program used, a variety of strategies are possible. The docking step can be performed by matching (fitting) ligand atoms to pre-calculated site points, where the latter represent potentially favourable interactions. By matching the ligand atoms to these points, different orientations of the ligand in the protein are generated. Grid or systematic searches fit the ligand into the active site by rotating and translating the ligand in discrete steps. Fragment-based methods can introduce flexibility by docking several ligand fragments in favourable orientations and subsequently reconnecting them. The fragment methods can be used for the evaluation of existing inhibitors or can be applied to so-called ab initio ligand design. Last but not least, kinetic docking methods can fit ligands to receptor sites by exploring the potential energy surface. This is achieved by application of molecular dynamics or Monte Carlo (simulated annealing) procedures. An advantage of these latter methods is that they mimic, to some extent, the actual docking process, i.e. the approach of a ligand molecule to the receptor, followed by binding.

Figure 3 Preparation of the binding site: Representation by spheres.

Docking methods are able to generate a large number of possible receptor-ligand orientations. Some of the methods (e.g. the MD or MC procedures) already include the evaluation step because they contain energetic functions. In other methods the different orientations and conformations generated have to be evaluated in a separate step. The goodness of fit can be determined from a variety of scoring functions, based either on geometric or on energetic criteria. Geometry-based scoring functions attempt to evaluate shape or surface complementarity between the receptor and the ligand(s). Examples include evaluation of a correlation function for the grid representations of the two molecules (64) or counting the number of receptor atoms within a specified distance of every ligand atom (75). For energy-based scoring the property of interest is the free energy of binding. However, accurate calculation of this quantity is often not tractable computationally and simplifications have to be introduced. A common approach is to consider only the enthalpy of binding, calculated by a standard force field such as AMBER (76) or the MMFF94 force field (77). In addition to the standard force field terms (such as Lennard-Jones and Coulomb potentials) many of the docking programs include empirical terms for solvation, hydrophobicity, or hydrogen bonding, with the aim of finding an approximate value for the free energy of binding.
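The simplest geometric criterion mentioned above, counting receptor atoms within a specified distance of every ligand atom, can be written down directly. The sketch below assumes plain co-ordinate lists and an arbitrary 4.5 Å cutoff; it contains no penalty for overlapping atoms, which a practical scoring function would need. The function name, cutoff, and co-ordinates are illustrative assumptions.

```python
from math import dist


def contact_score(ligand, receptor, cutoff=4.5):
    """Number of (ligand atom, receptor atom) pairs closer than `cutoff`."""
    return sum(1 for lig_atom in ligand for rec_atom in receptor
               if dist(lig_atom, rec_atom) <= cutoff)


# Hypothetical receptor site and two ligand poses (Angstrom).
receptor = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (10.0, 10.0, 10.0)]
pose_in_site = [(1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
pose_in_solvent = [(20.0, 20.0, 20.0), (21.0, 20.0, 20.0)]

print("pose in site:   ", contact_score(pose_in_site, receptor))
print("pose in solvent:", contact_score(pose_in_solvent, receptor))
```

Ranking poses by such a count favours orientations that bury the ligand in the site, which is the shape complementarity the geometric scores aim to capture.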

As an example we illustrate the use of the program AUTODOCK (58) in more detail (Protocol 2). AUTODOCK uses a Monte Carlo simulated annealing technique for configurational exploration. The ligand is treated as being flexible by assigning user-specified torsions as rotatable. The degrees of freedom (including internal torsions) of the substrate molecule are then modified randomly by a small amount, starting from the previous configuration. Each of these random configurations is evaluated energetically. If the move is energetically downhill, it is accepted. If the energy has risen compared to the last configuration, the step is accepted only with a probability given by the Boltzmann factor, exp(-ΔE/kT). At high enough temperatures, almost all steps are accepted. At lower temperatures, fewer high energy structures are accepted. This implies that the substrate molecule performs an energy-guided random walk in the space around or in the receptor site. Rapid energy evaluation is achieved by pre-calculating atomic affinity potentials for each atom type in the ligand molecule (55). A probe atom is placed at each point of a rectangular grid around the protein/active site. The interaction energy of the probe atom (typically H, C, O, and N) with the receptor is calculated at each grid point and assigned to it. Steric interactions are calculated using a Lennard-Jones potential (a so-called 12-6 potential); hydrogen bonds are calculated by application of a separate potential (referred to as a 12-10 potential). The electrostatic grid can be evaluated either using a probe charge of +1 and a Coulomb potential or by solving the linearized Poisson-Boltzmann equation (51). Solvent screening can be modelled as well (78).
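The acceptance rule and the cooling schedule described above can be sketched as follows. This is a generic Metropolis simulated annealing loop on a toy energy function, not AUTODOCK's implementation: the state is a plain list of numbers standing in for the ligand's degrees of freedom, energies are assumed to be in kcal/mol, and the step size, starting temperature, cooling factor, and cycle counts are arbitrary illustrative values.

```python
import math
import random

GAS_CONSTANT = 1.987e-3  # kcal/(mol K); assumes energies in kcal/mol


def metropolis_accept(delta_e, temperature, rng):
    """Always accept downhill moves; accept uphill moves with Boltzmann probability."""
    if delta_e <= 0.0:
        return True
    if temperature <= 0.0:
        return False
    return rng.random() < math.exp(-delta_e / (GAS_CONSTANT * temperature))


def anneal(energy, state, max_step=0.5, t_start=600.0, cooling=0.9,
           cycles=30, steps_per_cycle=100, seed=0):
    """Energy-guided random walk with a geometric cooling schedule."""
    rng = random.Random(seed)
    current, e_current = list(state), energy(state)
    best, e_best = list(current), e_current
    temperature = t_start
    for _ in range(cycles):
        for _ in range(steps_per_cycle):
            # Perturb every degree of freedom by a small random amount.
            trial = [x + rng.uniform(-max_step, max_step) for x in current]
            e_trial = energy(trial)
            if metropolis_accept(e_trial - e_current, temperature, rng):
                current, e_current = trial, e_trial
                if e_current < e_best:
                    best, e_best = list(current), e_current
        temperature *= cooling  # lower the temperature after each cycle
    return best, e_best


def toy_energy(state):
    """Hypothetical energy surface with a single minimum at (2, 2)."""
    return sum((x - 2.0) ** 2 for x in state)


start = [10.0, -5.0]
best_state, best_energy = anneal(toy_energy, start)
print("start energy:", toy_energy(start), "best energy found:", best_energy)
```

In a docking program the energy function would be the interpolated grid energy of the ligand, and the state would hold its position, orientation, and torsion angles.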


The way that different ligands are predicted to fit into a receptor then provides a rationale for selecting certain compounds for synthesis. The geometries of docked complexes can also be used as starting points for further studies, such as ligand design (cf. next section) or 3D QSAR (79).

The scanning of entire databases in order to identify putative ligands for a known receptor has been reported as well (73, 80) and will become more important with the advent of large compound libraries generated by combinatorial chemistry.

5 Ab initio (de novo) ligand design

A central problem in current drug design research is to find an inhibitor starting with a known macromolecular binding site. Many procedures have appeared which accomplish this task in an automated fashion. These programs attempt to generate novel ligands from scratch (ab initio) to fit the receptor site. A wide variety of methods for generating hypothetical ligand structures exist. The main differences between the programs are related to the main stages of an ab initio design procedure, which are as follows:

Preparation of the active site/derivation of constraints

Generation of ligand structures

Evaluation of ligand structures

In the following the reader will find a quick run-down of a number of different procedures available. For further information the reader is referred to the preceding chapter of this volume (Chapter 1).

Protocol 2

Docking using AUTODOCK version 3.0

Equipment

UNIX workstation

AUTODOCK software

Method

1 Prepare a file (extension .pdbq) containing the co-ordinates of the receptor (in pdb-format), including polar hydrogens and partial atomic charges.[a]

2 Assign atomic solvation parameters and create a .pdbqs file, which contains co-ordinates, charges, and solvation parameters (mol2topdbqs, addsol).[b]

3 Define ligand torsions and generate the ligand .pdbq file (deftors).

4 Use the receptor .pdbqs and the ligand .pdbq files in order to create the grid parameter file (.gpf) and the docking parameter file (.dpf) (mkgpf3, mkdpf3).

5 Calculate the grid maps (autogrid3).

6 Perform the docking (autodock3).

7 Create a pdb formatted file containing all docked conformations (get-docked).

8 View the results using a molecular modelling program.

[a] A .pdbq file has standard PDB format, with the exception that it contains partial atomic charges in columns 71-76 of the file.

[b] Steps 1 and 2 can be performed simultaneously as follows: Use SYBYL to add hydrogens (Biopolymer, essential_only) and partial charges to the protein structure of interest. Save the structure as a file with the extension .mol2, then use the AUTODOCK mol2topdbqs program to create the .pdbqs file.

5.1 Active site preparation

After identification of an active site (cf. Section 3 in this chapter) it is usually analysed further in order to extract information necessary for setting up an ab initio design procedure. The objectives of such an analysis are to identify particular types of potentially favourable interactions with the receptor (such as hydrogen bonds, hydrophobic or charged interactions) and/or to prepare for efficient algorithms for ligand evaluation. In principle there are two types of approach for this:

1 Potential-based methods use interaction energies calculated by application of a force field in order to derive information about the active site. Representative algorithms of this approach are the programs GRID (55), GREEN (54), and GROUPBUILD (81). In all these methods a grid is defined within the active site. Different types of probe are placed at the lattice intersections and the interaction energy between the probe and the protein is then computed. The force field used in the computation may include Lennard-Jones, Coulomb, and hydrogen bonding terms.

Another method for exploring the interactions with the active site is incorporated in the multiple copy simultaneous search (MCSS) procedure (56). At first the active site is filled with many copies of randomly oriented functional groups. Examples of these functional groups include acetic acid, methyl ammonium, and water. A combination of energy minimization and quenched dynamics procedures is then used to determine energetically favourable positions and orientations of the groups within the active site. The minimization is constrained so that the functional groups see only the protein, and not each other. The resulting output is then a functionality map indicating the sites of favourable interactions with the protein. Similar approaches have been incorporated in other programs (82, 83).

2 Rule-based or knowledge-based procedures apply knowledge about geometrical features of ligand–protein interactions in order to explore the active site. These rules are derived from statistical analyses of known structures of protein–ligand or protein–protein complexes. For example, information about preferred hydrogen bonding geometries is extracted. The active site is then examined and these rules are applied. In LUDI (84), for example, sets of vectors are generated around hydrogen donor or acceptor sites. The orientation of the vectors reflects distances and angles of potential hydrogen bonds at a particular site. Hydrophobic interactions are represented by single points (Figure 4). Other programs use similar approaches (85, 86). The vectors and interaction points then provide constraints for the placement of atoms, fragments, or ligands during the next step, the design procedure.
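The potential-based grid idea described under item 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the force field of GRID, GREEN, or GROUPBUILD: the Lennard-Jones parameters, the dielectric constant, the approximate Coulomb conversion factor, and the single-atom "protein" are all placeholders chosen for the example, and no hydrogen bonding term is included.

```python
import math

COULOMB_CONSTANT = 332.0  # approx. kcal*angstrom/(mol*e^2); assumed unit system


def probe_energy(probe_pos, probe_charge, atoms,
                 epsilon=0.1, sigma=3.5, dielectric=4.0):
    """Lennard-Jones 12-6 plus Coulomb energy of one probe against all atoms.

    atoms is a list of (x, y, z, partial_charge) tuples; epsilon, sigma and
    dielectric are illustrative placeholders, not parameters of any program.
    """
    energy = 0.0
    for x, y, z, charge in atoms:
        r = max(math.dist(probe_pos, (x, y, z)), 0.5)  # clamp to avoid r = 0
        sr6 = (sigma / r) ** 6
        energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
        energy += COULOMB_CONSTANT * probe_charge * charge / (dielectric * r)
    return energy


def scan_grid(origin, counts, spacing, probe_charge, atoms):
    """Return {(i, j, k): energy} for every lattice intersection of the grid."""
    energies = {}
    for i in range(counts[0]):
        for j in range(counts[1]):
            for k in range(counts[2]):
                pos = (origin[0] + i * spacing,
                       origin[1] + j * spacing,
                       origin[2] + k * spacing)
                energies[(i, j, k)] = probe_energy(pos, probe_charge, atoms)
    return energies


# Toy "protein": a single negatively charged atom at the origin.
atoms = [(0.0, 0.0, 0.0, -0.5)]
grid = scan_grid(origin=(1.0, 0.0, 0.0), counts=(4, 1, 1), spacing=1.0,
                 probe_charge=1.0, atoms=atoms)
best_point = min(grid, key=grid.get)
```

Grid points very close to the atom are dominated by the repulsive term, whereas the lowest energy is found where attraction and repulsion balance; contouring such energies over a real active site is what yields the maps of favourable probe positions.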

5.2 Generation of structures

For the actual generation of structures within the active site a multitude of different methods exists. One way to differentiate between the approaches would be to ask the question: Growing or linking? Another criterion for classification would be whether the methods follow a deterministic or a stochastic approach.

The linking approach implies that at first several fragments are placed independently into the active site. In the next step linking elements are sought that connect these fragments into a complete molecule. An example of such a strategy is the program HOOK (87), which starts with the output generated by MCSS. MCSS generates a set of functional group sites making favourable interactions with the protein (56). HOOK then attempts to link these groups together with molecular skeletons taken from a pre-computed database. The results are new molecules containing multiple functional groups (the original fragments) in potentially favourable positions. Other programs employing similar strategies are LUDI (84), NEWLEAD (88), and SPROUT (89). The advantage of these procedures is that the original fragments can be placed in an optimal fashion into the active sites. A potential problem, however, is to find a suitable linker.

In the growing approach a so-called seed fragment is placed into the binding site. Subsequently, additional fragments are added in a stepwise fashion. The program GROW was originally developed for the design of peptides using amino acid building blocks (90). Here the structures grow from an acetyl group placed as the seed in the active site. Subsequent amino acid building blocks are added stepwise via an amide bond linkage. The building blocks are taken from a database of amino acids in various low energy conformations. An advantage of the grow approach is that the resulting structures are more likely to be synthetically accessible. Other variations on this theme are implemented in the programs LEGEND (91) and GROUPBUILD (81). The program LUDI is an example of a method that can build molecules either through linking or through growing (84).

Figure 4 Active site characterization using hydrogen-bond vectors and lipophilic interaction points.

All the algorithms mentioned so far contain deterministic features. Earlier constructs predetermine those that follow in a given design sequence, i.e. later constructs are in response not only to the binding site but also to earlier constructs. A continuous metamorphic/non-serial construction of ligands would favour a more exhaustive structural exploration of regiochemical and configurational space. With these ideas in mind a series of stochastic approaches involving molecular dynamics or Monte Carlo procedures and random changes in the structures have been designed (82, 83, 92, 93). Usually all the programs (deterministic or stochastic) use empirical scoring functions or force fields for the evaluation of the structures generated. Recently a method has been reported which combines a stochastic approach with a semi-empirical quantum mechanical Hamiltonian for the assessment of ligand structures (94). In this latter approach ligands are generated on an atomistic basis rather than on a fragment basis. This permits the construction of higher energy transitional structures that bridge between viable ligand structures. Changes are accomplished by random choice to create, destroy, move, or change a randomly selected atom. A semi-empirical Hamiltonian is used to calculate the internal energy of the ligand and its interaction with the receptor. The feasibility of such an approach has been demonstrated using two examples.
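The stochastic, atom-based scheme outlined above (random create, destroy, move, or change steps followed by an energy evaluation) can be illustrated with a generic Metropolis Monte Carlo loop. The sketch below is an assumption-laden toy, not the published method (94): atoms live on a one-dimensional co-ordinate, the element list is arbitrary, and toy_score is a made-up stand-in for the semi-empirical Hamiltonian.

```python
import math
import random

ELEMENTS = ("C", "N", "O")  # arbitrary illustrative element set


def propose(ligand, rng):
    """Return a copy of the ligand after one random create/destroy/move/change."""
    new = list(ligand)
    move = rng.choice(("create", "destroy", "move", "change"))
    if move == "create" or not new:
        new.append((rng.choice(ELEMENTS), rng.uniform(0.0, 10.0)))
        return new
    i = rng.randrange(len(new))
    element, pos = new[i]
    if move == "destroy":
        del new[i]
    elif move == "move":
        new[i] = (element, pos + rng.gauss(0.0, 0.5))
    else:  # change the element type, keep the position
        new[i] = (rng.choice(ELEMENTS), pos)
    return new


def metropolis(score, steps, temperature, seed=0):
    """Metropolis sampling over ligands; returns the best ligand and its score."""
    rng = random.Random(seed)
    ligand, energy = [], score([])
    best, best_energy = ligand, energy
    for _ in range(steps):
        trial = propose(ligand, rng)
        delta = score(trial) - energy
        # Downhill moves are always kept; uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            ligand, energy = trial, energy + delta
            if energy < best_energy:
                best, best_energy = ligand, energy
    return best, best_energy


def toy_score(ligand):
    """Made-up score: reward an O near 2.0 and an N near 7.0, penalise size."""
    energy = 0.3 * len(ligand)
    for element, pos in ligand:
        if element == "O":
            energy -= 2.0 * math.exp(-(pos - 2.0) ** 2)
        elif element == "N":
            energy -= 2.0 * math.exp(-(pos - 7.0) ** 2)
    return energy


best, best_energy = metropolis(toy_score, steps=2000, temperature=0.5, seed=1)
```

Because uphill moves are occasionally accepted, the walk can pass through higher energy intermediate structures on its way between favourable ligands, which is the property the text highlights for the atomistic approach.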

5.3 Evaluation of structures

There are several reasons why the evaluation of the structures in ab initio ligand design is an essential part of the procedure. First, in the sequential approaches, evaluation provides guidelines for further steps in the design sequence. Secondly, once a number of potential ligands have been generated, it is of interest to produce some kind of ranking of these ligands, in order to decide which of them should be synthesized first. Thirdly, most of the programs are able to generate many thousands of different ligands, due to the number of possible combinations of fragments to form a complete structure. Therefore, a variety of scoring schemes has been devised. Some of these schemes are applied during the design procedure; others are used in a post-processing manner.

Simple scoring schemes rank molecules according to molecular weight, number of atoms, etc. Another way to classify molecules is to consider their synthetic accessibility. In this case the availability of starting materials and the complexity of the molecules can be evaluated in a rule-based fashion (86). Some programs prioritize ligands based on the number of pharmacophoric groups (such as hydrogen bond donors/acceptors, lipophilic sites, etc.) they contain (85).

Other methods apply rule-based scoring functions derived from the analysis of protein–ligand complexes (89, 95, 96). A function of this type is given as follows (95):

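$$\Delta G_{\mathrm{bind}} = 5.4 - 4.7 \sum_{\text{h-bonds}} f(\Delta r, \Delta\theta) - 8.3 \sum_{\text{ionic}} g(\Delta r, \Delta\theta) - 0.17\,A_{\mathrm{lipophilic}} + 1.4\,N_{\mathrm{rotatable}} \qquad [1]$$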

The elements in this function for the estimation of the free energy of binding (ΔGbind) are the number of hydrogen bonds between ligand and receptor (f refers to the geometry of the H-bonds), the number of ionic interactions, the lipophilic contact area (A), and the number of rotatable bonds in the ligand. Some programs incorporate mixed functions containing force-field-like terms and other descriptors, such as rotatable bonds or number of lipophilic contacts (97).
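A worked example may help to show how the terms of Equation 1 combine. The sketch below assumes the sign convention in which hydrogen bonds, ionic interactions, and lipophilic contact lower the estimated free energy while the constant term and rotatable bonds raise it; it also assumes energies in kJ/mol and areas in square ångströms. The geometry factors are passed in as numbers between 0 and 1 (1 for ideal geometry), and the example ligand is invented.

```python
def estimate_binding_free_energy(hbond_geometry, ionic_geometry,
                                 lipophilic_area, n_rotatable):
    """Evaluate the empirical scoring function of Equation 1.

    hbond_geometry and ionic_geometry are lists of geometry factors, one per
    interaction (1.0 = ideal geometry, 0.0 = no contribution). The signs and
    units are assumptions stated in the accompanying text.
    """
    return (5.4
            - 4.7 * sum(hbond_geometry)
            - 8.3 * sum(ionic_geometry)
            - 0.17 * lipophilic_area
            + 1.4 * n_rotatable)


# Invented ligand: three ideal hydrogen bonds, one ideal ionic interaction,
# 100 square angstroms of lipophilic contact, four rotatable bonds.
score = estimate_binding_free_energy([1.0, 1.0, 1.0], [1.0], 100.0, 4)
# 5.4 - 14.1 - 8.3 - 17.0 + 5.6 = -28.4
```

The example also shows why such functions penalize flexible ligands: each additional rotatable bond costs 1.4 units, so roughly three of them offset one ideal hydrogen bond.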

Another example of a de novo ligand design program is the LEAPFROG module in SYBYL (31). It combines a number of ideas incorporated also in the programs considered above. LEAPFROG follows a deterministic approach, by repeatedly making some structural changes, and then either keeping or discarding the results, depending on the evolution of the ligand during the run. As a start it can process two different types of input: a receptor structure (from experiment or modelling) or, alternatively, a pharmacophore model. After selection of the input the program can operate in three different modes (cf. Protocol 3): the OPTIMIZE mode aims at improving existing ligand structures. In the DREAM mode novel ligands are suggested. In the GUIDE mode the user can interfere interactively with the design process. New ligand structures are evaluated mainly on their binding energy relative to their immediate precursor using an approximation of the GRID procedure (55). Synthetic difficulty can be included in the scoring scheme. The user can decide on the trade-off between variety and quality of the output ligand structures. For example, the user may decide to emphasize variety in a first run. Subsequently, distinctively different structures can be chosen from the output in order to be optimized in a second run. At the end of a run the structures are saved in a SYBYL database and are referenced by a molecular spreadsheet containing the LEAPFROG binding energies.


Protocol 3

Setting up an ab initio ligand design procedure using LEAPFROG

Equipment

SYBYL software including the LEAPFROG module
Unix workstation

Method

1 Start LEAPFROG. In the dialog box choose an input structure (cavity molecule).

2 Select the operating mode (Guide, Optimise, Dream, cf. text).

3 Select whether to run the program interactively or in background.

4 If necessary alter other input data such as the fragment database used (Data).

5 Choose either to calculate or to read (from a previous run) the sitepoint/box description.(a)



description of complex formation between two molecules A and B by a thermodynamic cycle (Figure 5). From this thermodynamic cycle it follows for the free energy of association ΔG:

$$\Delta G = \Delta G_{\mathrm{gas}} + \Delta G_{\mathrm{solv.}} \qquad [2]$$

with

$$\Delta G_{\mathrm{solv.}} = \Delta G^{AB}_{\mathrm{solv.}} - \Delta G^{A}_{\mathrm{solv.}} - \Delta G^{B}_{\mathrm{solv.}} \qquad [3]$$

Each of these terms contains different contributions and, grouping them together, one can write alternatively (109):

$$\Delta G = \Delta G_{\mathrm{elec.}} + \Delta G_{\mathrm{cav.}} + \Delta G_{\mathrm{conf.}} + \Delta G_{\mathrm{vdW}} + \Delta G_{\mathrm{rt}} \qquad [4]$$

ΔGelec. contains both the electrostatic solute–solute and the solute–solvent interactions. ΔGcav. refers to the cavitation free energy on binding and can be defined as the free energy required to form a solute-sized cavity in the solvent when all interactions (dispersion and electrostatic) are switched off. ΔGconf. is the loss of side chain conformational entropy on binding (TΔS). The van der Waals energy (ΔGvdW) is often neglected, under the assumption that the van der Waals interactions at the interface of the associated complex are equal to those with the solvent molecules in dissociated form. The last term (ΔGrt) represents the loss of translational and rotational entropy on complex formation. A number of techniques for calculation of all these terms have been reported (52, 112–121).
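The bookkeeping of the thermodynamic cycle can be made explicit with a short sketch. It assumes the usual sign convention in which the solvation contribution is the solvation free energy of the complex minus those of the two separated partners; the numerical values are invented purely to show the arithmetic and do not refer to any real system.

```python
def association_free_energy(dg_gas, dg_solv_a, dg_solv_b, dg_solv_ab):
    """Free energy of association in solution from a thermodynamic cycle.

    dg_gas      free energy of association in the gas phase
    dg_solv_*   solvation free energies of A, B and the complex AB
    All values must share one unit (for example kJ/mol).
    """
    dg_solv = dg_solv_ab - dg_solv_a - dg_solv_b  # net solvation contribution
    return dg_gas + dg_solv


def sum_of_components(elec, cav, conf, vdw, rt):
    """Alternative decomposition into grouped contributions (Equation 4)."""
    return elec + cav + conf + vdw + rt


# Invented numbers: strong gas-phase attraction, partly cancelled by the
# desolvation penalty of burying two well-solvated partners.
dg = association_free_energy(dg_gas=-60.0, dg_solv_a=-40.0,
                             dg_solv_b=-30.0, dg_solv_ab=-55.0)
# dg_solv = -55 + 40 + 30 = +15, so dg = -45
```

The example reproduces a common qualitative finding: a favourable gas-phase interaction is partly offset by the cost of desolvating the binding partners.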


Figure 5 Thermodynamic cycle for complex formation. ΔGsolv. refers to the free energy of solvation. ΔGgas is the free energy of association in the gas phase.

The third method for calculating absolute free energies of binding, referred to as the linear interaction energy (LIE) model, is a hybrid method including elements of the former two. The method was introduced by Åqvist et al. (122) and employs a linear response approximation to calculate absolute binding free energies as follows:

$$\Delta G_{\mathrm{bind}} = \alpha \langle \Delta E_{\mathrm{elec.}} \rangle + \beta \langle \Delta E_{\mathrm{vdW}} \rangle \qquad [5]$$

⟨ΔE⟩ is the energy difference between average contributions from ligand–solvent and protein–ligand interactions for the bound and unbound states, as derived from short simulations. α and β represent empirical parameters derived by fitting energy components from a set of ligands to experimental binding affinities. Recently, extensions/modifications of this method have been reported (123–125).
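To make the LIE estimate concrete, the sketch below averages interaction energies sampled from simulations of the bound and unbound ligand and combines the differences linearly. The function and argument names are hypothetical, the energy samples are invented, and the two coefficients are illustrative values only; in practice they are obtained by fitting to experimental binding affinities as described above.

```python
from statistics import mean


def lie_binding_free_energy(bound_elec, free_elec, bound_vdw, free_vdw,
                            coeff_elec, coeff_vdw):
    """Linear interaction energy estimate of the binding free energy.

    bound_* are ligand-surroundings interaction energies sampled with the
    ligand in the protein, free_* with the ligand in solvent only.
    """
    delta_elec = mean(bound_elec) - mean(free_elec)
    delta_vdw = mean(bound_vdw) - mean(free_vdw)
    return coeff_elec * delta_elec + coeff_vdw * delta_vdw


# Invented energy samples (arbitrary but consistent units).
dg_bind = lie_binding_free_energy(
    bound_elec=[-50.0, -52.0], free_elec=[-40.0, -42.0],
    bound_vdw=[-30.0, -32.0], free_vdw=[-20.0, -22.0],
    coeff_elec=0.5, coeff_vdw=0.16,
)
# 0.5 * (-10) + 0.16 * (-10) = -6.6
```

Only two simulations per ligand are needed (bound and free), which is why the approach is much cheaper than free energy perturbation while still using explicit sampling.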



7 Statistical analysis of a set of protein ligands

Sometimes affinity data for a set of protein ligands are available, but it is not possible to obtain a reliable receptor structure. However, it would be desirable to perform some kind of rational manipulation of the ligands, with the aim of designing improved structures. In this case one has to resort to other methods for modelling potential protein–ligand interactions and for predicting novel ligands with higher binding affinities. Most commonly applied under these circumstances are pharmacophore modelling and QSAR (quantitative structure–activity relationship) techniques.

The functionalities of a molecule that are essential for exhibiting its pharmacological activity (i.e. usually binding to a receptor) are referred to as pharmacophores. The idea behind pharmacophore modelling is to analyse a set of molecules and to extract common features for biologically active compounds. The assumption is that these features make favourable interactions with the receptor. If three-dimensional representations of the molecules are chosen for analysis, the result includes not only the type of functionalities but also their relative orientation or directionality. An excellent overview of pharmacophore modelling and its applications is given in A. K. Ghose's and J. J. Wendoloski's review (126).

QSAR techniques attempt to derive quantitative relationships between the features of a set of ligands and (usually) their binding affinities. Traditional QSAR analyses the influence of steric, hydrophobic, and electrostatic effects of substituents on biological activity. It identifies which of these are the dominant features behind the change in biological properties. Statistical evidence for the validity of the proposed relationships is provided. The descriptors used are not directly related to 3D structure and are extrapolated from one reaction to another.

In 3D QSAR one calculates the descriptive properties directly from the 3D representations of the molecules. Usually these descriptors are calculated in such a way that their 3D distribution is retained in the model itself. The output of a 3D QSAR analysis is often represented as 3D graphics, which are superimposed on the molecules of the data set. This representation makes it easy to recognize which properties of the molecules investigated are beneficial for binding and which ones are detrimental.
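The core of a traditional QSAR analysis, relating a substituent descriptor to activity and reporting statistical support, can be sketched as an ordinary least-squares fit. The example below uses a single descriptor and four invented data points; real analyses use several descriptors, many more compounds, and cross-validation, none of which is shown here.

```python
def fit_line(descriptor, activity):
    """Least-squares fit activity = slope * descriptor + intercept.

    Returns (slope, intercept, r_squared). Both inputs are equally long
    sequences of numbers with non-zero variance.
    """
    n = len(descriptor)
    mean_x = sum(descriptor) / n
    mean_y = sum(activity) / n
    sxx = sum((x - mean_x) ** 2 for x in descriptor)
    syy = sum((y - mean_y) ** 2 for y in activity)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(descriptor, activity))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Invented toy data: a hydrophobicity-like descriptor against an
# activity expressed as log(1/C).
hydrophobicity = [0.0, 1.0, 2.0, 3.0]
log_inverse_conc = [1.0, 3.0, 5.0, 7.0]
slope, intercept, r_squared = fit_line(hydrophobicity, log_inverse_conc)

# Predict the activity of a hypothetical new substituent.
predicted = slope * 1.5 + intercept
```

The squared correlation coefficient is the simplest form of the statistical evidence mentioned above; a model is only useful for prediction if it remains high when compounds are left out of the fit.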

The best known 3D QSAR method is comparative molecular field analysis, or CoMFA (127). In this procedure steric and electrostatic interaction energies between a probe atom and a set of superimposed molecules are calculated at the surrounding points of a predefined grid. The rationale behind this is that the probe atom corresponds to a receptor atom and scans the superimposed molecules for favourable and unfavourable interactions with a putative receptor. The calculated energy values are then correlated with some property of the compounds (usually biological activity). Once a reliable correlation has been found, the QSAR model can be used for predicting the activities of novel compounds.

Plate 6 illustrates how a 3D pharmacophore analysis and 3D QSAR can be combined (128, 129). Recent advances in 3D QSAR and related techniques have been summarized in two volumes of a book series (130) and in a number of papers (131).

8 Concluding remarks

Although many of the computational methods presented in this chapter incorporate various approximations and simplifications, they work remarkably well, as shown by the many examples where experimental results could be reproduced or predicted with good accuracy. The usefulness of these approaches has also been demonstrated by some success stories in the development of new drug molecules (see, e.g. 73, 132, 133). Also, it has become clear that application of these procedures is most useful in combination with experiments, either to obtain confirmation from the latter or to provide guidelines for new ones. In many instances this has proven to be highly synergistic.

The future looks bright for molecular modelling in the area of protein–ligand interactions. New hardware with novel architecture and faster processors and improved generations of software will advance the field significantly. Recent developments such as the advent of combinatorial chemistry will lead to novel computational methods, for example combinatorial docking and virtual screening (134, 135).

Acknowledgements

R. T. K. gratefully acknowledges the help of Martin Parretti for the preparation of Plate 5.

References

1. Bernstein, F. C., Koetzle, T. F., Williams, G. J. B., Meyer, E. F., Brice, M. D., Rodgers, J. R., et al. (1977). J. Mol. Biol., 112, 535.
2. Richardson, J. S. (1977). Adv. Protein Chem., 343, 167.
3. Ptitsyn, O. B. and Finkelstein, A. V. (1980). Q. Rev. Biophys., 13, 339.
4. Chothia, C. (1992). Nature, 357, 543.
5. Chou, P. Y. and Fasman, G. D. (1974). Biochemistry, 13, 222.
6. Garnier, J., Osguthorpe, D. J., and Robson, B. (1978). J. Mol. Biol., 120, 97.
7. Rost, B. and Sander, C. (1993). J. Mol. Biol., 232, 584.
8. Rost, B. and Sander, C. P. (1995). Proteins, 23, 295.
9. Curtis, B. M., Presnell, S. R., Srinivasan, S., Sassenfeld, H., Klinke, R., Jeffery, E., et al. (1991). Proteins, 11, 111.
10. Defay, T. and Cohen, F. E. (1995). Proteins, 23, 431.
11. Bowie, J. U., Luthy, R., and Eisenberg, E. (1991). Science, 253, 164.
12. Godzik, A., Kolinski, A., and Skolnick, J. (1992). J. Mol. Biol., 227, 227.
13. Jones, D. T., Taylor, W. R., and Thornton, J. M. (1992). Nature, 358, 86.
14. Sippl, M. J. and Weitckus, S. (1992). Proteins, 13, 258.
15. Ouzounis, C., Sander, C., Scharf, M., and Schneider, R. (1993). J. Mol. Biol., 232, 805.
16. Bryant, S. H. and Lawrence, C. E. (1993). Proteins, 16, 92.


17. Krogh, A., Brown, M., Mian, I. S., Solander, K., and Haussler, D. (1994). J. Mol. Biol., 235, 1501.
18. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). J. Mol. Biol., 215, 403.
19. Pearson, W. R. and Lipman, D. J. (1988). Proc. Natl. Acad. Sci. USA, 85, 2444.
20. Kroemer, R. T., Doughty, S. W., Robinson, A. J., and Richards, W. G. (1996). Protein Eng., 9, 493.
21. Jones, T. H. and Thirup, S. (1986). EMBO J., 5, 819.
22. Unger, R., Harel, D., Wherland, S., and Sussman, J. L. (1989). Proteins, 5, 355.
23. Claessens, M., VanCutsem, E., Lasters, I., and Wodak, S. (1989). Protein Eng., 4, 335.
24. Levitt, M. (1992). J. Mol. Biol., 226, 507.
25. Sutcliffe, M. J., Haneef, I., Carney, D., and Blundell, T. L. (1987). Protein Eng., 1, 377.
26. Sutcliffe, M. J., Hayes, F. R., and Blundell, T. L. (1987). Protein Eng., 1, 385.
27. Ponder, J. W. and Richards, F. M. (1987). J. Mol. Biol., 193, 775.
28. Moult, J. and James, M. N. (1986). Proteins, 1, 146.
29. Blundell, T. L., Sibanda, B. L., Sternberg, M. J. E., and Thornton, J. M. (1987). Nature, 326, 347.
30. Blundell, T. L., Carney, D., Gardner, S., Hayes, F., Howlin, B., Hubbard, T., et al. (1988). Eur. J. Biochem., 172, 513.
31. SYBYL 6.5, Tripos Inc., St. Louis. http://www.tripos.com.
32. Srinivasan, S., March, C. J., and Sudarsanam, S. (1993). Protein Sci., 2, 277.
33. Brocklehurst, S. M. and Perham, R. N. (1993). Protein Sci., 2, 626.
34. Fujiyoshi-Yoneda, T., Yoneda, S., Kitamura, K., Amisaki, T., Ikeda, K., Inoue, M., et al. (1991). Protein Eng., 4, 443.
35. Friedrichs, M. S., Goldstein, R. A., and Wolynes, B. G. (1991). J. Mol. Biol., 222, 1013.
36. Crippen, G. M. and Havel, T. F. (1988). Distance geometry and molecular conformation. Chemometrics Research Studies Series 15. New York, Wiley.
37. Sali, A. and Blundell, T. L. (1993). J. Mol. Biol., 234, 779.
38. Connolly, M. L. (1992). Biopolymers, 32, 1215.
39. Lewis, R. A. (1989). J. Comput.-Aided Mol. Design, 3, 133.
40. Kleywegt, G. J. and Jones, T. A. (1994). Acta Crystallogr. Sect. D, 50, 178.
41. Hendlich, M., Rippmann, F., and Barnickel, G. (1997). J. Mol. Graphics Model., 15, 359.
42. Jones, S. and Thornton, J. M. (1997). J. Mol. Biol., 272, 121.
43. Jones, S. and Thornton, J. M. (1997). J. Mol. Biol., 272, 133.
44. LoConte, L., Chothia, C., and Janin, J. (1999). J. Mol. Biol., 285, 2177.
45. Larsen, T. A., Olson, A. J., and Goodsell, D. S. (1998). Structure, 6, 421.
46. Wireko, F. C., Kellogg, G. E., and Abraham, D. J. (1991). J. Med. Chem., 34, 758.
47. Kellog, G. E., Semus, S. F., and Abraham, D. J. (1991). J. Comput.-Aided Mol. Design, 5, 545.
48. Nicholls, A., Sharp, K. A., and Honig, B. H. (1991). Proteins, 11, 281.
49. Sanner, M. F., Olson, A. J., and Spehner, J.-C. (1996). Biopolymers, 38, 305.
50. Duncan, B. S., Macke, T. J., and Olson, A. J. (1995). J. Mol. Graphics, 13, 271.
51. Honig, B. H. and Nicholls, A. (1995). Science, 268, 1144.
52. Warwicker, J. and Watson, H. C. (1982). J. Mol. Biol., 157, 671.
53. Boobbyer, D. N. A., Goodford, P. J., McWhinnie, P. M., and Wade, R. C. (1989). J. Med. Chem., 32, 1083.
54. Tomioka, N. and Itai, A. (1994). J. Comput.-Aided Mol. Design, 8, 347.
55. Goodford, P. J. (1985). J. Med. Chem., 28, 849.
56. Miranker, A. and Karplus, M. (1991). Proteins, 11, 29.
57. Shoichet, B. K., Stroud, R. M., Santi, D. V., Kuntz, I. D., and Perry, K. M. (1993). Science, 259, 1445.


58. Morris, G. M., Goodsell, D. S., Halliday, R. S., Huey, R., Hart, W. E., Belew, R. K., et al. (1998). J. Comp. Chem., 19, 1639. http://www.scripps.edu/pub/olson-web/dock/autodock/
59. Kuntz, I. D., Blaney, J. M., Oatley, S. J., Langridge, R., and Ferrin, T. E. (1982). J. Mol. Biol., 161, 269. http://www.cmpharm.ucsf.edu/kuntz/dock.html
60. Oshiro, C. M. and Kuntz, I. D. (1995). J. Comput.-Aided Mol. Design, 9, 113.
61. Hart, T. N., Ness, S. R., and Read, R. J. (1997). Proteins, S1, 205. http://www.dockvision.com/
62. Rarey, M., Kramer, B., Lengauer, T., and Klebe, G. (1996). J. Mol. Biol., 261, 470. http://cartan.gmd.de/FlexX/
63. Miller, M. D., Kearsley, S. K., Underwood, D. J., and Sheridan, R. P. (1994). J. Comput.-Aided Mol. Design, 8, 153.
64. Gabb, H. A., Jackson, R. M., and Sternberg, M. J. E. (1997). J. Mol. Biol., 272, 106. http://www.bmm.icnet.uk/ftdock/ftdock.html
65. Vakser, I. A. (1997). Proteins, S1, 226. http://reco3.musc.edu/gramm/index.html
66. Sobolev, V., Wade, R. C., Vriend, G., and Edelman, M. (1996). Proteins, 25, 120. http://swift.embl-heidelberg.de/ligin/
67. Jackson, R. M., Gabb, H. A., and Sternberg, M. J. E. (1998). J. Mol. Biol., 276, 265. http://www.bmm.icnet.uk/multidock/multidock.html
68. Stroud, R. M. and Fauman, E. B. (1995). Protein Sci., 4, 2392.
69. Knegtel, R., Oshiro, C., and Kuntz, I. (1997). J. Mol. Biol., 266, 424.
70. Srinivasan, N. and Blundell, T. L. (1993). Protein Eng., 6, 501.
71. Sali, A. (1995). Curr. Opin. Biotechnol., 6, 437.
72. Ring, C. S., Sun, E., McKerrow, J. H., Lee, G. K., Rosenthal, P. J., Kuntz, I. D., et al. (1993). Proc. Natl. Acad. Sci. USA, 90, 3583.
73. Li, R., Chen, X., Gong, B., Selzer, P. M., Li, Z., Davidson, E., et al. (1996). Bioorg. Med. Chem., 4, 1421.
74. Liang, J., Edelsbrunner, H., and Woodward, C. (1998). Protein Sci., 7, 1884.
75. Shoichet, B. K., Bodian, D. L., and Kuntz, I. D. (1992). J. Comp. Chem., 13, 380.
76. Cornell, W. D., Cieplak, P., Bayly, C. I., Gould, I. R., Merz, K. M., Ferguson, D. M., et al. (1995). J. Am. Chem. Soc., 117, 5179.
77. Halgren, T. A. (1996). J. Comp. Chem., 17, 490.
78. Mehler, E. L. and Solmajer, T. (1991). Protein Eng., 4, 903.
79. Gamper, A. M., Winger, R. H., Liedl, K. R., Sotriffer, C. A., Varga, J. M., Kroemer, R. T., et al. (1996). J. Med. Chem., 39, 3882.
80. Böhm, H.-J. (1994). J. Comput.-Aided Mol. Design, 8, 623.
81. Rotstein, S. H. and Murcko, M. A. (1993). J. Med. Chem., 36, 1700.
82. Pearlman, D. A. and Murcko, M. A. (1993). J. Comp. Chem., 14, 1184.
83. Bohacek, R. S. and McMartin, C. (1994). J. Am. Chem. Soc., 116, 5560.
84. Böhm, H.-J. (1992). J. Comput.-Aided Mol. Design, 6, 61.
85. Clark, D. E., Frenkel, D., Levy, S. A., Li, J., Murray, C. W., Robson, B., et al. (1995). J. Comput.-Aided Mol. Design, 9, 13.
86. Gillet, V. J., Myatt, G. J., Zsoldos, Z., and Johnson, A. P. (1995). Perspect. Drug Discov. Design, 3, 34.
87. Eisen, M. B., Wiley, D. C., Karplus, M., and Hubbard, R. E. (1994). Proteins, 19, 199.
88. Tshinke, V. and Cohen, N. C. (1993). J. Med. Chem., 36, 3863.
89. Gillet, V. J., Johnson, A. P., Mata, P., Sike, S., and Williams, P. (1993). J. Comput.-Aided Mol. Design, 7, 127.
90. Moon, J. B. and Howe, W. J. (1991). Proteins, 11, 314.
91. Nishibata, Y. and Itai, A. (1991). Tetrahedron, 47, 8985.
92. Pearlman, D. A. and Murcko, M. A. (1996). J. Med. Chem., 39, 1651.
93. Miranker, A. and Karplus, M. (1995). Proteins, 23, 472.


94. Rothman, J. H. and Kroemer, R. T. (1997). J. Mol. Model., 3, 261.
95. Böhm, H.-J. (1994). J. Comput.-Aided Mol. Design, 8, 243.
96. Böhm, H.-J. (1998). J. Comput.-Aided Mol. Design, 12, 309.
97. Head, R. D., Smythe, M. L., Oprea, T. I., Waller, C. L., Green, S. M., and Marshall, G. R. (1996). J. Am. Chem. Soc., 118, 3959.
98. Leach, A. R. (1996). Molecular modelling, principles and applications. Longman, Harlow, UK.
99. Allen, M. P. and Tildesley, D. J. (1987). Computer simulation of liquids. Clarendon Press, Oxford, UK.
100. Tembe, B. L. and McCammon, J. A. (1984). Comput. Chem., 8, 281.
101. Jorgensen, W. L., Buckner, J. K., Boudon, S., and Tirado-Rives, J. (1988). J. Chem. Phys., 89, 3742.
102. Wong, C. and McCammon, J. A. (1986). J. Am. Chem. Soc., 109, 3830.
103. Straatsma, T. P. and McCammon, J. A. (1992). Annu. Rev. Phys. Chem., 43, 407.
104. Kollman, P. (1993). Chem. Rev., 93, 2395.
105. Mitchell, M. J. and McCammon, J. A. (1991). J. Comput. Chem., 12, 271.
106. Tidor, B. and Karplus, M. (1994). J. Mol. Biol., 238, 405.
107. Pranta, J. and Jorgensen, W. L. (1991). Tetrahedron, 47, 2491.
108. Gilson, M. K., Given, J. A., Bush, B. L., and McCammon, J. A. (1997). Biophys. J., 72, 1047.
109. Chothia, C. and Janin, J. (1975). Nature, 256, 705.
110. Searle, M. S., Williams, D. H., and Gerhard, U. (1992). J. Am. Chem. Soc., 114, 10697.
111. Weng, Z., Vajda, S., and DeLisi, C. (1996). Protein Sci., 5, 614.
112. Klapper, I., Hagstrom, R., Fine, R., Sharp, K. A., and Honig, B. (1986). Proteins, 1, 47.
113. Gilson, M. K. and Honig, B. (1988). Proteins, 4, 7.
114. Jackson, R. M. and Sternberg, M. J. E. (1995). J. Mol. Biol., 250, 258.
115. Gilson, M. K., Sharp, K. A., and Honig, B. (1988). J. Comp. Chem., 9, 327.
116. Jackson, R. M. and Sternberg, M. J. E. (1994). Protein Eng., 7, 371.
117. Janin, J. (1995). Proteins, 21, 30.
118. Janin, J. (1995). Prog. Biophys. Mol. Biol., 64, 145.
119. Creamer, T. P. and Rose, G. D. (1992). Proc. Natl. Acad. Sci. USA, 89, 5937.
120. Sternberg, M. J. E. and Chickos, J. S. (1994). Protein Eng., 7, 149.
121. Finkelstein, A. V. and Janin, J. (1989). Protein Eng., 3, 1.
122. Åqvist, J., Medina, C., and Samuelsson, J. E. (1994). Protein Eng., 7, 385.
123. Carlson, H. A. and Jorgensen, W. L. (1995). J. Phys. Chem., 99, 10667.
124. McDonald, N. A., Carlson, H. A., and Jorgensen, W. L. (1997). J. Phys. Org. Chem., 10, 563.
125. Hansson, T., Marelius, J., and Åqvist, J. (1998). J. Comput.-Aided Mol. Design, 12, 27.
126. Ghose, A. K. and Wendoloski, J. J. (1998). Perspect. Drug Discov. Design, 9–11, 253.
127. Cramer, R. D., III, Patterson, D. E., and Bunce, J. D. (1988). J. Am. Chem. Soc., 110, 5959.
128. Kroemer, R. T., Hecht, P., Guessregen, S., and Liedl, K. R. (1998). Perspect. Drug Discov. Design, 12, 41.
129. Kroemer, R. T., Koutsilieri, E., Hecht, P., Liedl, K. R., Riederer, P., and Kornhuber, J. (1998). J. Med. Chem., 41, 393.
130. Kubinyi, H., Folkers, G., and Martin, Y. C. (ed.) (1998). 3D QSAR in drug design, Vols. 2 and 3. KLUWER/ESCOM, Dordrecht, The Netherlands.
131. (1998). Perspect. Drug Discov. Design, Volumes 9–14.
132. Hong, H., Neamati, N., Wang, S., Nicklaus, M. C., Mazumder, A., Zhao, H., et al. (1997). J. Med. Chem., 40, 930.


133. Von Itzstein, M., Wu, W. Y., Kok, G. B., Pegg, M. S., Dyason, J. C., Jin, B., et al. (1993). Nature, 363, 418.
134. Sun, Y., Ewing, T. J. A., Skillman, A. G., and Kuntz, I. D. (1998). J. Comput.-Aided Mol. Design, 12, 597.
135. Bures, M. G. and Martin, Y. C. (1998). Curr. Opin. Chem. Biol., 2, 376.
