computational methods in medicinal chemistry - diva …844370/fulltext01.pdfthese two applications...

66
ACTA UNIVERSITATIS UPSALIENSIS UPPSALA 2015 Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Pharmacy 201 Computational Methods in Medicinal Chemistry Mechanistic Investigations and Virtual Screening Development FREDRIK SVENSSON ISSN 1651-6192 ISBN 978-91-554-9293-9 urn:nbn:se:uu:diva-259443

Upload: hoangnguyet

Post on 24-Apr-2018

230 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

ACTAUNIVERSITATIS

UPSALIENSISUPPSALA

2015

Digital Comprehensive Summaries of Uppsala Dissertationsfrom the Faculty of Pharmacy 201

Computational Methods inMedicinal Chemistry

Mechanistic Investigations and Virtual ScreeningDevelopment

FREDRIK SVENSSON

ISSN 1651-6192ISBN 978-91-554-9293-9urn:nbn:se:uu:diva-259443

Page 2: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

Dissertation presented at Uppsala University to be publicly examined in A1:107a, BMC,Husargatan 3, Uppsala, Friday, 25 September 2015 at 09:15 for the degree of Doctor ofPhilosophy (Faculty of Pharmacy). The examination will be conducted in English. Facultyexaminer: Professor Anna Linusson (Umeå University, Department of Chemistry).

AbstractSvensson, F. 2015. Computational Methods in Medicinal Chemistry. MechanisticInvestigations and Virtual Screening Development. Digital Comprehensive Summaries ofUppsala Dissertations from the Faculty of Pharmacy 201. 65 pp. Uppsala: Acta UniversitatisUpsaliensis. ISBN 978-91-554-9293-9.

Computational methods have become an integral part of drug development and can help bringnew and better drugs to the market faster. The process of predicting the biological activity oflarge compound collections is known as virtual screening, and has been instrumental in thedevelopment of several drugs today in the market. Computational methods can also be used toelucidate the energies associated with chemical reactivity and predict how to improve a syntheticprotocol. These two applications of computational medicinal chemistry is the focus of this thesis.

In the first part of this work, quantum mechanics has been used to probe the energy surface ofpalladium(II)-catalyzed decarboxylative reactions in order to gain a better understating of thesesystems (paper I-III). These studies have mapped the reaction pathways and been able to makeaccurate predictions that were verified experimentally.

The other focus of this work has been to develop virtual screening methodology. Our firststudy in the area (paper IV) investigated if the results from several virtual screening methodscould be combined using data fusion techniques in order to get a more consistent result andbetter performance. The study showed that the results obtained from data fusion were moreconsistent than the results from any single method. The data fusion methods also for severaltarget had a better performance than any of the included single methods.

Next, we developed a dataset suitable for evaluating the performance of virtual screeningmethods when applied to large compound collection as a replacement or complement for highthroughput screening (paper V). This is the first benchmark dataset of its kind.

Finally, a method for using computationally derived reaction coordinates as basis for virtualscreening was developed. The aim was to find inhibitors that resemble key steps in themechanism (paper VI). This initial proof of concept study managed to locate several known andone previously not reported reaction mimetics against insulin regulated amino peptidase.

Keywords: DFT, IRAP, Virtual Screening, Catalysis, Palladium

Fredrik Svensson, Department of Medicinal Chemistry, Organic Pharmaceutical Chemistry,Box 574, Uppsala University, SE-75123 Uppsala, Sweden.

© Fredrik Svensson 2015

ISSN 1651-6192ISBN 978-91-554-9293-9urn:nbn:se:uu:diva-259443 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-259443)

Page 3: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

”A really good theoretical chemist can obtain right answers with wrong

models.” -J. H. Van Vleck

Page 4: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical
Page 5: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I F. Svensson, R. Mane, J. Sävmarker, M. Larhed, C. Sköld,

Theoretical and Experimental Investigation of Palladium(II)-Catalyzed Decarboxylative Addition of Arenecarboxylic Acid to Nitrile, Organometallics 2013, 32, 490-497

II J. Rydfjord, F. Svensson, A. Trejos, P. Sjöberg, C. Sköld, J.

Sävmarker, L. Odell, M. Larhed, Decarboxylative Palladi-um(II)-Catalyzed Synthesis of Aryl Amidines from Aryl Car-boxylic Acids: Development and Mechanistic Investigation, Chem. Eur. J. 2013, 19, 13803-13810

III F. Svensson, A. Fardost, B. Skillinghaug, P. Wakchaure, M.

Wejdemar, M. Larhed, C. Sköld, Mechanistic Investigation of Palladium(II)-Catalyzed Decarboxylative Synthesis of Elec-tron-Rich Styrenes and 1,1-Diarylethenes, Manuscript

IV F. Svensson, A. Karlén, C. Sköld, Virtual Screening Data Fu-

sion Using Both Structure- and Ligand-Based Methods, J. Chem. Inf. Model. 2012, 52, 225-232

V M. Lindh, F. Svensson, W. Schaal, J. Zhang, C. Sköld, P.

Brandt, A. Karlén, Toward a Benchmarking Data Set Able to Evaluate Ligand- and Structure-based Virtual Screening Using Public HTS Data, J. Chem. Inf. Model. 2015, 55, 343-353

VI F. Svensson, K. Engen, T. Lundbäck, M. Larhed, C. Sköld,

Virtual Screening for Transition State Analogue Inhibitors of IRAP Based on Quantum Mechanically Derived Reaction Co-ordinates, J. Chem. Inf. Model., In press, DOI: 10.1021/acs.jcim.5b00359

Reprints were made with permission from the respective publishers.

Page 6: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

List of additional publications

2015 A. Belfrage, J. Gising, F. Svensson, E. Åkerblom, C. Sköld, A. Sandström, Efficient and Selective Palladium-Catalysed C-3 Urea Couplings to 3,5-Dichloro-2(1H)-pyrazinones, Eur. J. Org. Chem., 5, 978-986

2014 S. Borhade, U. Rosenström, J. Sävmarker, T. Lundbäck, A.

Jenmalm-Jensen, K. Sigmundsson, H. Axelsson, F. Svensson, V. Konda, C. Sköld, M. Larhed, M. Hallberg, Inhibition of In-sulin-Regulated Aminopeptidase (IRAP) by Arylsulfona-mides, ChemistryOpen, 3, 256-263

2014 B. Skillinghaug, C. Sköld, J. Rydfjord, F. Svensson, M. Beh-

rends, J. Sävmarker, P. J. R. Sjöberg, M. Larhed, Palladi-um(II)-Catalyzed Desulfitative Synthesis of Aryl Ketones from Sodium Arylsulfinates and Nitriles; Scope, Limitations, and Mechanistic Studies, J. Org. Chem., 79, 12018-12032

2013 J. Rydfjord, F. Svensson, M. Fagrell, J. Sävmarker, M. Thu-

lin, M. Larhed, Temperature measurements with two different IR sensors in a continuous-flow microwave heated system, Beilstein J. Org. Chem., 9, 2079-2087

Page 7: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

Contents

1 Introduction ......................................................................................... 11 1.1 Computational Tools in Drug Discovery ........................................ 11

2 Reactivity and Energy .......................................................................... 12 2.1 Energies of Ligand Binding ............................................................ 12 2.2 Potential Energy Surface ................................................................. 13 2.3 Transition State Theory ................................................................... 14 2.4 Molecular Mechanics ...................................................................... 15 2.5 Quantum Mechanics ....................................................................... 16

2.5.1 Density Functional Theory .................................................... 17 2.5.2 Level of Theory ..................................................................... 17

3 Virtual Screening ................................................................................. 20 3.1 Ligand-Based Methods ................................................................... 20 3.2 Molecular Docking ......................................................................... 20 3.3 Database and Target Preparation .................................................... 21

3.3.1 Water Molecules in Virtual Screening ................................... 21 3.4 The Concept of Chemical Space ..................................................... 22

3.4.1 Principal Component Analysis .............................................. 22 3.5 Benchmark Datasets ........................................................................ 22 3.6 Evaluation Metrics .......................................................................... 23

4 Catalysis ............................................................................................... 25 4.1 Palladium Catalysis ......................................................................... 25

4.1.1 Decarboxylation ..................................................................... 26 4.1.2 Palladium Ligands ................................................................. 26 4.1.3 Density Functional Theory Modeling of Palladium-Catalyzed Reactions ............................................................................. 27

4.2 Enzyme Catalysis ............................................................................ 28 4.2.1 Modeling of Enzyme Reactions ............................................. 28 4.2.2 Cluster Approach ................................................................... 29

5 Aims of the Thesis ............................................................................... 30

6 Decarboxylative Palladium(II)-Catalyzed Reactions (papers I-III) ..... 31 6.1 Synthesis of Aryl Ketones ............................................................... 31 6.2 Synthesis of Aryl Amidines ............................................................ 34 6.3 Styrene Synthesis ............................................................................ 36

Page 8: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

7 Method Consensus in Virtual Screening (paper IV) ............................ 41

8 Evaluation of Virtual Screening Using Publicly Available High Throughput Screening Data (paper V) ................................................. 44

9 Virtual Screening Using Computationally Derived Reaction Coordinates (paper VI) ........................................................................ 48

10 Conclusions ......................................................................................... 54

11 Acknowledgements.............................................................................. 55

12 References ........................................................................................... 56

Page 9: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

Abbreviations

APN aminopeptidase N B3LYP Becke, 3-parameter, Lee-Yang-Parr BEDROC Boltzman-enhanced discrimination of ROC DEKOIS demanding evaluation kits for objective in silico screening DFT density functional theory DUD directory of useful decoys EF enrichment factor ERAP endoplasmic reticulum aminopeptidase ESI-MS electrospray ionization mass spectrometry FF force field fp false positive HTS high throughput screening IRAP insulin-regulated aminopeptidase IRC intrinsic reaction coordinate LB ligand-based MM molecular mechanics MUV maximum unbiased validation (dataset) PCA principal component analysis PES potential energy surface PDB protein data bank QM quantum mechanics QRC quick reaction coordinate ROC receiver operator characteristic SB structure-based tp true positive TS transition state TST transition-state theory VA vinyl acetate VS virtual screening

Page 10: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical
Page 11: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

11

1 Introduction

For thousands of years humans have treated ailments with remedies derived from natural sources such as plants, fungi, insects, and animals. During the nineteenth century it was realized that the therapeutic effects were due to specific molecules present in these sources. Subsequent development of the chemical industry provided new synthetic sources for these molecules, set-ting the stage for the emergence of the pharmaceutical industry. Although this process revolutionized the treatment of many diseases, the discovery of new drugs still happened mainly by chance. Gradually, the newly invented methods led to what we call rational drug design, the leading paradigm of drug development today.1

1.1 Computational Tools in Drug Discovery The rational development of new drugs is a complex process involving a myriad of different disciplines. In nearly all of these, computational tools serve to guide the design decisions, ultimately speeding up and reducing the cost of development.2 In the area of medicinal chemistry, computational methods are applied both to identify new starting points and to guide the further development of promising substances.

Computational tools can also be applied to further our understanding of chemical reactivity and the mechanisms behind chemical reactions. Quantum mechanics (QM) allows for the detailed study of the electronic properties of a molecule and the energies associated with its reactivity.

Thus, computational methods can be used to guide the decision on what compounds to make, and to help decide how best to make these compounds. These two factors are the focus of this thesis.

Page 12: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

12

2 Reactivity and Energy

The fundamentals of thermodynamics dictate that a chemical reaction will be a spontaneous process if it is associated with a decrease in the free energy of the system. Free energy in the context of thermodynamics is defined as the energy available to do work, and for a process carried out at constant tem-perature it can be measured as Gibbs free energy. The change in Gibbs free energy (ΔG) of a process is dependent on the change in enthalpy (ΔH) and entropy (ΔS) as well as the temperature of the system (T), according to: ∆ = ∆ − ∆

The thermodynamic properties of a reaction tell us if it is energetically fa-vorable, but they do not determine the rate of the reaction. How fast a reac-tion will proceed is determined by the activation energy (Ea) of the process. This is expressed by the Arrhenius equation as: = /

where k is the reaction rate coefficient, A is the frequency factor specific to the reaction, R is the gas constant, and T is the temperature.

2.1 Energies of Ligand Binding

Like any process, the binding of a drug molecule to its macromolecular tar-get only proceeds spontaneously if the process is associated with negative free energy. The enthalpy change associated with this binding process de-pends on the interactions occurring between the ligand and the protein com-pared to the interactions of the unbound ligand protein with the solvent. Pos-sible interactions in the ligand–protein complex can be detected by using, for example, X-ray methodology to find the structure. Ideally, these interactions would be associated with an energy term and then summed. Unfortunately, it has been shown that the interaction energies behave in a non-additive fash-ion.3,4

Page 13: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

13

Estimation of the entropy term is less straightforward, as it is dependent on the effects of the binding event on the degrees of freedom of the ligand, the protein, and the solvent molecules.5

The ultimate goal of computational medicinal chemistry could be thought of as the accurate prediction of the free energy of a binding event.

2.2 Potential Energy Surface A function of the atom positions which return the potential energy of the systems could be thought of in terms of a multidimensional potential energy surface (PES). Points on the PES that have a zero gradient, especially mini-ma and first-order saddle points, are of especial interest. The minima points correspond to stable species and a saddle point along the minimum energy path separating two minima represents a transition state (TS). If the system is restricted to two degrees of freedom, such as two bond lengths occurring during a reaction, this can be visualized in a graph. The PES for the palladi-um(II)-catalyzed decarboxylation of benzoic acid is shown in Figure 1.

Figure 1. Calculated PES for the decarboxylation of benzoic acid. Energy is plotted as a function of bond lengths r1 and r2.

The relative energies of the TS saddle point to the connecting minima on either side represent the minimal amounts of energy required for the transi-tions to take place in the forward and backward directions, which affects the rates of these reactions.

The PES is also of interest when considering the conformations of mole-cules. A low energy minimum on the PES for a certain of molecules will be highly populated, with the population density decreasing as the energy in-

Page 14: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

14

creases. The probability distribution of molecules over their possible con-formational states is described by a Boltzmann distribution through:

= /∑ /

where pi is the probability of state i, εi is the energy state i, k is the Boltz-mann constant, T is the temperature, and N is the number of possible states.

Since we are usually interested in the low energy conformations, we need methods for locating these for a given PES. Given an arbitrary starting point on the PES, minimization methods can find the closest local minimum point by following the local energy gradient. Thus, the optimization of a starting geometry can easily be achieved, yielding the conformation associated with the downhill energy minimum from the starting conformation. However, no method of readily locating the global energy minimum exists. In fact, locat-ing the global minimum with certainty usually requires that all possible con-formations are sampled. Several different approaches have been developed for sampling the conformational space of molecules.6,7

Once located, the zero gradient point can be characterized by calculating the Hessian matrix of the potential energy function. Although potentially time consuming, this will give information on the local curvature of the PES. The Hessian matrix can also be used to calculate the force constants of the harmonic vibrations in a molecule.

2.3 Transition State Theory Transition state theory (TST) is closely linked to the Arrhenius equation and the PES. This concept, which was introduced in 1935 by Eyring, includes mechanistic considerations that are not included in the Arrhenius rate law.8 TST links the movement over the PES to the reaction rate constant (k) through:

= κ ℎ ∆ ‡

where κ is the transmission coefficient relative to the probability that the product will form from the activated complex, kB is Boltzman’s constant, h is Planck’s constant, ΔG‡ is the free energy of activation, T is the temperature, and R is the gas constant.

Page 15: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

15

2.4 Molecular Mechanics The application of principles from classical mechanics to describe molecular systems is called molecular mechanics (MM). In MM, an atom is considered as a single particle and bonds between atoms are treated like springs. Through this approach, the potential energy of a molecular system can be calculated using a set of parameterized functions called a force field (FF). The concept of MM was introduced in 1946 by Hill,9 and Westheimer and Mayer.10

The terms included depend on the FF used. Generally, terms describing bond length, torsion, angle, and non-bonded interactions are considered. The parameters are chosen so that the model can reproduce known experimental structures or structures derived from QM calculations. One of the first suc-cessful FFs was MM2, published in 1977 by Allinger.11 In this FF the poten-tial energy is described as:

= + ++ +

where: = 71.94 ( − ) (1 − 2( − )), = 0.021914 ( − ) (1 + 7 ∙ 10 ∙ ( − ) ), = 2.51124 (θ − )( − ) + (l − ) ,

= (2.9 ∙ 10 ∙ − 2.25 ), = (1 + cos ) + (1 − cos 2 ) + (1 + cos 3 ),

ks is a bond-specific constant, l is the bond length, l0 is the ideal bond length, kθ is a constant specific for the atom types making up the angle, θ is a three-atom angle, θ0 is the ideal angle, ksθ is a stretch-bond constant determined by the included atoms, ε is an atom-specific constant, P is the sum of the van der Waal radii divided by the distance between the interaction centers, V is a torsion-type constant, and ω is the torsion angle.

Because MM FF energy calculations are relatively simple, they run very fast on a modern computer and the analytical derivative is easily accessible, allowing for efficient optimization of the conformational geometry. The

Page 16: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

16

speed of MM calculations allows large databases of small molecules to be considered, and large systems to be readily optimized.

However, the standard FF parameterization does not allow chemical reac-tions to be modeled. Even with a set of parameters adapted for modeling a specific reaction, MM would struggle to accurately describe the energy of the process. This is because, in order to capture the full nature of a reaction, the electronic structure must be considered.

2.5 Quantum Mechanics The basis of QM is the wave function (Ψ), which constitutes the most com-plete description of a physical system. When queried with different opera-tors, all observable properties can be derived from the wave function of a system. The operator that return the energy (E) of the system is called the Hamiltonian operator (Ĥ), and the energy can thus be determined from: Ψ = Ψ

This constitutes the time-independent Schrödinger equation.12 QM methods, in contrast to MM methods, are derived from basic physical

laws and are not parameterized. For this reason, QM calculations are some-times called ab initio, meaning “from first principles”. Classic QM methods explicitly consider the electron wave function, thus allowing chemical reac-tions to be accurately modeled. However, this makes the methods dependent on 4N coordinates (N is the number of electrons), 3N spatial coordinates and N spin coordinates. Such systems can only be solved approximately through very time-consuming methods.

A key to make any attempt to solve the Schrödinger equation is the varia-tional principle. This principle states that, for any guessed normalized wave function, the computed energy will be the upper boundary of the true energy at the ground state.

The wave function itself is not observable. However, an N-electron wave function can be related to the electron probability density ρ(r) of the system through:

( ) = … |Ψ( , , , … , , )| …

where r is the spatial coordinates and s is the spin coordinate. Although, strictly speaking, ρ(r) is a probability density, it is usually referred to as the electron density.

Page 17: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

17

2.5.1 Density Functional Theory Density functional theory (DFT) relates the system properties to the electron density function, and depends on only three spatial coordinates. This has made DFT a cost-effective alternative for modeling larger systems.

Although it was recognized early on that using the electron density func-tion could provide a simpler approach than the wave function methods, no proof existed that the electron density function uniquely determined the properties of the system. Hohenberg and Kohn provided the fundamental principles for DFT when they proved that the electron density of a system completely determines the electronic energy.13 In their second theorem, they also showed that any trial electron density satisfying the boundary conditions results in energy higher than or equal to the true ground-state energy; in oth-er words, they showed that the variational principle is also valid in this ap-proach.

Although this marked a theoretical breakthrough, the Hohenberg-Kohn theorems do not give any practical guidance on how to construct the density functional and determine the energy of the studied system. Also, the second theorem is of limited practical use, as it is only valid for the exact density functional, which is not available.

Once the soundness of the method had been proved, however, it did not take long for a practical approach to the density functional to be presented by Kohn and Sham.14

One challenge with the Kohn-Sham method is that it includes an ex-change-correlation functional term for which the expression is unknown. Any attempt to apply the Kohn-Sham method thus has to be accompanied by an estimate of this term, resulting in a multitude of different functional forms.

2.5.2 Level of Theory All DFT calculations in this thesis were carried out using the Jaguar soft-ware15 and the Becke, three-parameter, Lee-Yang-Parr (B3LYP) hybrid functional.16,17 The term hybrid functional means that the correlation term partly consists of an exact exchange term from Hartree-Fock theory.

B3LYP-derived energies have been estimated on the G3/05 QM valida-tion dataset to have a mean absolute deviation of 4.1 kcal mol-1.18 In most applications, these energies are used relative to the energies of similar sys-tems and thus the accuracy is expected to be higher as a result of error can-celing. However, the comparison of some systems, such as charged and neu-tral systems, might be more challenging. In these cases, the results are de-pendent on a suitable solvation model to give accurate results.19

In this thesis, the B3LYP functional has been paired with the LACVP (Los Alamos national laboratory, outermost Core orbitals, Valence only,

Page 18: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

18

Pople) basis set which utilizes the 6-31G20–25 basis set for elements H-Ar and an effective core potential together with a valence basis set for heavier ele-ments.26 The 6-31G basis set is a split-valence double-zeta basis set. This means that it uses one basis function for each core atomic orbital and two basis functions for the outer atomic orbitals. The basis sets can be further expanded by the addition of polarization functions, denoted by * when added to all atoms except for transition metals, hydrogen, and helium, and by ** when also added to hydrogen and helium atoms. The purpose is to expand the available space for the orbitals to accommodate polarization in a certain direction. The orbital radius can be further expanded by the addition of dif-fuse functions (denoted + or ++) to increase the accuracy when considering anions and highly electronegative atoms.

2.5.2.1 Dispersion Electron correlation gives rise to long range attractive forces called disper-sion. These long range electron-electron interactions are not well modeled in most DFT functionals.27 Nevertheless, an accurate description of the disper-sion energy is essential for modeling some systems and has to be accounted for.28

In the papers employing DFT calculations presented in this thesis, the DFT-D3 protocol was used to correct for dispersion.29 Antony et al. have shown that, when modeling protein ligand energies, the accuracy of disper-sion-corrected DFT methods equals that of high-level wave function meth-ods.30

2.5.2.2 Solvation Chemical reactions generally do not take place in an ideal gas phase, as as-sumed in standard QM calculations, but rather in some form of solvent. The solvent molecules constantly interact with the reaction system and alter its properties. The most accurate way to model this would be to include explicit solvent molecules in the QM calculations. This, however, rapidly becomes impractical as the size of the system would drastically increase and the con-formational freedoms would soon become unmanageable. Instead, solvent effects are usually modeled by continuum solvent models that simulate the average effect of the solvent. The solvation models generally apply the sol-vent's dielectric constant (ε) and radius to simulate the electrostatic effect of the solvent on the system.

In this thesis, the Poisson-Boltzman finite continuum model31,32 has been applied in all DFT studies.

2.5.2.3 Free Energy The final energies reported for the DFT studies were derived by adding the electronic energy from the gas phase calculations to the solvation energy, the

Page 19: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

19

calculated dispersion correction, the zero point energy, and the harmonic entropy contribution at the investigated temperature.

2.5.2.4 Interpreting the Model With the help of DFT we can calculate the free-energy profile of a chemical reaction.33 The free-energy profile can be thought of as a schematic intersec-tion of the free-energy surface of the reaction. A free-energy profile of a hypothetical reaction is shown in Figure 2.

Figure 2. Hypothetical free-energy profile of a reaction.

The first thing to note is that we are only interested in relative energies and, thus, we choose to denote the starting point as zero energy. From the energy profile we can then determine the thermodynamic properties of the reaction (endergonic or exergonic) by comparing the relative energies of the reactant and the product. We can also calculate the amount of free energy that is re-quired for the reaction to take place by comparing the free energy of the highest energy TS to the lowest preceding point (reactant to the second TS, 80 kJ mol-1). This free-energy requirement in turn is proportional to the rate of the reaction.

Page 20: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

20

3 Virtual Screening

Virtual screening (VS) is the in silico process of querying a large collection of molecules for those that are predicted to have biological activity on a macromolecular target of interest. VS can be divided into structure-based (SB) and ligand-based (LB) methods where the SB methods utilize infor-mation on the macromolecular target and the LB methods use information on known active compounds. Both SB and LB methods can then be further subdivided depending on their complexity.

VS is well recognized today and several examples of successful applica-tions of VS are present in the literature.34,35 VS has also played an instrumen-tal role in the development of several drugs now in the clinic.36

3.1 Ligand-Based Methods LB methods are based on the assumption that similar molecules have similar biological activity.37 This statement appears simple enough at first, but the concept of similarity is not easily defined and it can be measured in any property of a molecule. The simplest approaches only consider similarity in easily calculated molecular properties, such as molecular weight or the num-ber of hydrogen-bond acceptors. These kinds of methods are generally re-ferred to as 1D methods. 2D methods, on the other hand, calculate similari-ties between the fingerprints derived from the 2D structures of the mole-cules. The most advanced methods consider the three-dimensional shape of the molecule and are accordingly called 3D methods.38

Instead of considering the similarities between entire molecules, a set of key features thought to be responsible for the biological activity can be de-fined. This set of key features is referred to as a pharmacophore.

3.2 Molecular Docking The most commonly used SB method is molecular docking. Docking aims to predict the binding mode of small molecules to a macromolecular target, usually a protein. The predicted poses are then scored using a scoring func-tion in order to rank the compounds by predicted activity. To achieve this, the docking process must tackle several challenges. The input molecules

Page 21: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

21

need to be conformationally sampled, fitted in to the binding site of the mac-romolecular target, and the energies of the formed interactions must be esti-mated to generate a score.39

Performance of SB VS requires the protein structure to be derived, usual-ly by X-ray crystallography of a protein crystal. The Protein Data Bank (PDB) is a valuable source of protein structures which contains more than 100 000 structures that have been deposited by researchers around the world.40 The electron densities of many structures are also available through the Uppsala Electron-Density Server.41

Although the availability of large numbers of crystal structures has been of immense importance to understanding protein–ligand interactions and the development of new drugs, careful evaluation should accompany the use of protein crystal data.42 When evaluating the quality of a crystal structure for use in SB VS, it is essential to review the actual electron density map in or-der to make sure that the ligand and the binding site are well defined. Also, the relevance of the crystalized conformation has to be evaluated and the possible effects of flexibility assessed.

3.3 Database and Target Preparation The input library of chemical substances and the protein target must be pre-pared for VS.43 Depending on the requirements of the VS method, this can include different considerations. Steps for preparing the compound library include the generation of reasonable 3D structures of the molecules, the ap-propriate tautomers, the protonation states, and the stereoisomers.

For SB techniques, the protein target also requires the correct tautomeric- and protonation states to be assigned to the binding-site amino acid residues. Since the protein is often considered to be rigid during docking calculations it is necessary to also consider the conformations of the binding-site resi-dues. Reasonable conformations of the heavy atoms are usually readily available from the crystal structure input, although they might need revision to be suitable for the specific docking problem at hand. The hydrogens, however, often constitute a challenging task since they are not visible in most X-ray-derived structures and must therefore be added by the program. This means that the conformation of each individual hydrogen has to be es-timated.

3.3.1 Water Molecules in Virtual Screening All of the above protein properties can be influenced by the water molecules in the protein surroundings.44 Water molecules also influence the binding of ligands to the protein through solvation/desolvation effects, bridging interac-tions, and entropy contributions.

Page 22: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

22

When preparing a protein for VS, the water present in the active site re-quires special consideration. For simplicity, water is often removed but, for some inhibitors, the presence of water molecules might be essential in order to predict the correct binding pose.45 Overall, the displacement of water from the right areas of the active site cavity is essential to achieve favorable bind-ing free energy.46,47

3.4 The Concept of Chemical Space All of the molecules that can possibly be made constitute the chemical space; it has been estimated that for small organic molecules this consists of about 1060 molecules.48 Thus, for all practical applications, chemical space can be considered infinite.

Despite this, many molecules of interest have been identified, for example in the form of drugs. Is this because the entire chemical space is filled with molecules that are suitable as drugs? Probably not. The history of success probably stems from the fact that in our pursuit of new drugs we generally consider only a small part of chemical space. Here, concepts such as “drug-likeness” become important. By imitating natural products and drugs that are already known, we can focus our efforts in a subspace that is more likely to contain molecules of interest.49 A famous example is Lipinski’s rule of five; this is a set of rules derived from known drugs and is intended to character-ize the part of chemical space associated with solubility and permeability for oral drugs.50

3.4.1 Principal Component Analysis Chemical space can be thought of as multidimensional space, where each dimension is a molecular property. By measuring or calculating these prop-erties we can describe molecules with respect to each other. In order to visu-alize this many-dimensional space, methods that reduce the dimensionality of the data, such as principal component analysis (PCA), are useful.

PCA can map the original features of the data to a new set of uncorrelated features called principal components.51 This is done in such a way that the first principal component accounts for the most variability.

3.5 Benchmark Datasets Evaluating the success of VS and comparing the results between methods is not a trivial undertaking.52 It can be argued that the best validation is the prospective application of VS methods to new targets. However, this is both time consuming and costly. A more efficient way to evaluate VS methods is

Page 23: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

23

through a test set with known active compounds pooled with other mole-cules, referred to as decoys. Such a dataset can be used to estimate the ability of different algorithms to retrieve active compounds. A number of such benchmark sets are available.53 The most well-known are: the directory of useful decoys (DUD) and the enhanced version (DUD-E),54,55 the maximum unbiased validation (MUV) dataset,56 and the demanding evaluation kits for objective in silico screening (DEKOIS).57,58

Many benchmark datasets are compiled using compounds from publicly available databases for small molecules such as the ZINC,59,60 BindingDB,61 and PubChem62 databases. Often the data are supplemented using com-pounds from the literature for the specific target in question.

Since benchmark datasets are based on sets of known compounds, there is an inherited risk of bias. Large chemical collections tend to sample only certain regions of chemical space, such as molecules that are easy to make or those that fall into the regions surrounding known inhibitors.63 Therefore, strictly speaking, the benchmark datasets only test performance in the chem-ical and biological space that they cover.

The composition of benchmark datasets has led to the introduction of the concepts of artificial enrichment and analog bias. A difference in physico-chemical properties between active and inactive molecules can lead to artifi-cial enrichment, i.e. enrichment due to the dissimilarity between active and inactive compounds rather than due to the efficacy of the VS method evalu-ated.64 Similarly, if all active compounds are in the same structure class this could lead to an “all or nothing” outcome. Too much similarity among the active compounds is generally referred to as analog bias.65

3.6 Evaluation Metrics Several metrics have been proposed to measure the success of VS. The en-richment factor (EF) is a commonly used metric and is defined as:

= /( )/

where tp is the number of true positives, fp is the number of false positives, A is the total number of active compounds, and N is the total number of both active and inactive compounds. The EF can be interpreted as comparing the fraction of active compounds in the top part of the data set to the fraction of active compounds in the whole dataset.

The results generated by VS are often visualized using a receiver operator characteristic (ROC) curve where the tp rate is plotted against the fp rate. The area under the ROC curve can be used as an evaluation metric.

Page 24: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

24

These commonly used methods do not account for the position of the ac-tive compound in the hit-list; they only indicate whether it is within the de-fined cutoff. This is called the “early recognition problem” and several other metrics have been proposed that take this into consideration.66 One such metric is Boltzman-enhanced discrimination of ROC (BEDROC), which is defined as:

= ∑ // ∙ ∙ sinhcosh − cosh − + 11 − ( )

where n is the number of active compounds, N is the total number of com-pounds, ri is the rank of the ith active compound, and α is a tuning parame-ter. However, these more complicated metrics often lack the simple inter-pretability of the standard measurements.

Page 25: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

25

4 Catalysis

The term catalysis was introduced by the Swedish scientist Jacob Berzelius.67 A catalyst facilitates a reaction by providing an alternate route from reactant to product that has lower activation energy than the non-catalyzed pathway. As such, the catalyst will change the rate of the reaction.

This thesis deals with both homogeneous catalysis, where the catalyst is dissolved in the same phase as the reactant, and biological catalysis by en-zymes.

4.1 Palladium Catalysis Palladium is a late transition metal with properties well suited for chemical catalysis. It favors the 0 and +2 oxidation states and has a relatively small energy barrier between these. This makes palladium suitable for oxida-tion/reduction types of reactions. Palladium has ten d-electrons and prefers coordination with four ligands, resulting in an 18-electron, coordinatively saturated, tetrahedral complex. The eight-electron palladium(II) instead fa-vors a four-ligand, 16-electron, low-spin, square-planar conformation.

Perhaps one of the most well-known examples of palladium catalysis is the Mizoroki-Heck reaction where an aryl halide or vinyl halide reacts with an alkene to form a new carbon-carbon bond. This reaction was reported independently by Mizoroki68 in 1971 and Heck69 in 1972. The Mizoroki-Heck reaction starts with oxidative addition followed by a migratory inser-tion where the alkene is inserts into the metal-ligand bond. This is followed by a β-hydride elimination and subsequent regeneration of palladium(0) by reductive elimination.

The Mizoroki-Heck reaction is classified as a palladium(0)-catalyzed re-action since each catalytic cycle is started in that oxidation state. The first palladium(II)-catalyzed reaction was reported as early as 1968 by Heck70 but, until recently, palladium(0)-catalyzed reactions have been the main fo-cus of palladium chemistry. This is largely because many palladium(II)-catalyzed reactions result in the reduction of palladium. In order to restore the catalyst for the next catalytic cycle, an oxidant has to be included in stoi-chiometric quantities.

Page 26: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

26

4.1.1 Decarboxylation Benzoic acids, which are cheap, non-toxic, stable precursors to aryl-palladium(II), are a common reactant in palladium-catalyzed, carbon-carbon bond formations. The first metal-mediated decarboxylations were reported as early as 1930,71 but the use of single metal palladium catalysis was only reported recently, by Myers.72–74

The greatest limitation of palladium-catalyzed decarboxylative reactions is the need for ortho substituents on the benzoic acids in order for the reac-tions to be productive.75 Theoretical investigations have suggested that the carboxylic acid functionality has to be orthogonal to the plane of the aro-matic system to allow for decarboxylation (Figure 3). This conformation breaks the conjugation between the acid and the aromatic ring and is there-fore energetically disfavored. Ortho substituents break the planar system by steric interactions, thus reducing the free-energy requirement for the orthog-onal conformation.76

Figure 3. Top and side views of the optimized geometry (B3LYP/LACVP*) of a 2,6-dimethoxybenzoic acid palladium complex preceding decarboxylation. The acid functionality is orthogonal to the aromatic plane. Distances are measured in Ång-ström.

4.1.2 Palladium Ligands A palladium ligand is any compound capable of coordinating with palladi-um. Many synthetic protocols employ compounds specifically designed for this purpose and, although they do not directly take part in the reaction, they can be crucial to the efficacy of the catalyst. Ligating palladium can have several beneficial effects; it can retard the aggregation of metallic palladi-um(0), alter the electron density of palladium and, using steric effects, influ-ence the neighboring coordination sites.

Page 27: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

27

Palladium ligands can be grouped according to the type of atom coordi-nating palladium and the number of coordinations they make. The number of non-contiguous interactions made by the ligand to the metal center is called its denticity. A ligand with one atom coordinating the metal is called mono-dentate, and a ligand where two atoms coordinate is called bidentate. The denticity is denoted by the symbol κ.

The palladium-catalyzed reactions investigated in this thesis all employed bidentate nitrogen-based ligands. Some common nitrogen-based ligands are shown in Table 1.

Table 1. Common nitrogen-based bidentate palladium ligands.

Structure Chemical name 2,2'-bipyridine

6-methyl-2,2'-bipyridine

6,6´-dimethyl-2,2'-bipyridine

1,10-phenantroline

2,9-dimethyl-1,10-phenantroline

2,9-dimethyl-4,7-diphenyl-1,10-phenanthroline

1,1'-biisoquinoline

4.1.3 Density Functional Theory Modeling of Palladium-Catalyzed Reactions

The modeling of palladium-catalyzed reactions requires specific mention. One of the advantages of modeling metal-catalyzed reactions is that the con-formational freedom of the reactants is limited by the fact that they are coor-dinated to a metal center. However, the modeling done for this thesis includ-ed some approximations and assumptions to further reduce the complexity and computational time:

• Only complexes involving one palladium atom were considered.

Page 28: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

28

• The bidentate nitrogen ligands were always bound to palladium in a κ2 fashion.

• The process of ligand exchange was not modeled but was, in most cases, assumed to be diffusion-controlled and to have a free-energy requirement of about 20 kJ mol-1.77

4.1.3.1 Quick Reaction Coordinates When a TS has been identified, it has to be verified that it connects to the supposed reactant and product. This is done in order to make sure that it is the TS of interest. This can be achieved by intrinsic reaction coordinate (IRC) calculations, which follow the reaction coordinate downhill on the PES along the minimal energy pathway in both forward and backward direc-tions.78,79 IRC calculations are generally very accurate but are carried out at a high computational cost. Studies of organic reactivity generally do not re-quire that level of accuracy and quick reaction coordinate (QRC) calcula-tions can be applied.80 QRC calculations take one short step in each direction along the TS normal mode and the derived structures are then minimized. If the minimization proceeds smoothly, one can be reasonably sure it has ar-rived at the connecting energy minimum.

4.2 Enzyme Catalysis Enzymes are protein catalysts that are essential to many biochemical trans-formations. The traditional viewpoint is that enzymes catalyze reactions by stabilizing the TS of the reaction and, thus, lowering the activation energy.81 More recent views also include a dynamic feature of the protein–ligand complex to facilitate barrier crossing.82

Because of their essential role in most processes in the body, enzymes are important drug targets; about one third of the currently marketed drugs target enzymes.83 A better understanding of enzyme reactivity is therefore im-portant in the context of drug discovery.

4.2.1 Modeling of Enzyme Reactions Theoretical modeling of enzyme reactions using QM methods has become a powerful tool for investigating enzyme reactivity and deriving the most probable reaction mechanisms.84,85 The two main methods are QM/MM and the cluster approach. In the QM/MM approach, the entire enzyme is included in the model but only the core is investigated using QM techniques while the surrounding protein is investigated using MM methods.86 The enzyme mod-eling conducted for this thesis was carried out using the cluster approach, as described below.

Page 29: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

29

4.2.2 Cluster Approach The cluster approach to enzyme modeling allows the study of biochemical problems using solely QM techniques. In this approach the amino acid resi-dues in the vicinity of the active site are extracted to construct a model that is investigated using only QM methods. In order to account for the steric and electrostatic effects of the surrounding protein structure, the coordinates of selected atoms are fixed during optimization and the entire complex is treat-ed using a continuum solvation model.87,88

A range of values has been used for the dielectric constant (ε) when mod-eling proteins, but a value of four is the most common.89 For the cluster ap-proach, the effect of the solvation model decreases when the model system increases in size and, for systems larger than 200 atoms, the relative energy is nearly the same regardless of the chosen dielectric constant.90–92

The first step in this approach is to select the model substrate, amino acid residues, cofactors, metal ions, and water molecules that will be included in the model. This is often done by selecting as many of the amino acids sur-rounding the catalytic site as possible with regard to the model size. With today’s computational power, it is reasonable to consider models of up to 300 atoms. Naturally, care must be taken to include all residues that take part in the reaction or stabilize the generated intermediates. Thus, the more in-formation that is available on the reaction the better are the chances of con-structing an accurate model.

The cluster approach is associated with some limitations, according to the way the model system is constructed. Since only a small part of the enzyme is included in the model, long range interactions are not modeled.93 Also, the geometry optimization has to be carefully monitored to ensure that no artifi-cial geometry changes occur.94

Page 30: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

30

5 Aims of the Thesis

This thesis aspires to further the field in computational medicinal chemistry by improving the methods used to decide which molecules to make, and by theoretically investigating how best to make them. The first goal was pur-sued by method development in the area of VS and the second by DFT-based mechanistic investigations of reactions. More specifically, the aims of this thesis were: • to elucidate reaction pathways for palladium(II)-catalyzed decarboxyla-

tive reactions by means of DFT;

• to develop an improved VS protocol based on SB and LB method con-sensus;

• to compile a dataset suitable for the evaluation of both LB and SB VS methods when applied in a high throughput screening (HTS) scenario; and

• to develop a new VS methodology using computationally predicted TS and intermediate structures of enzymatic transformations to identify TS and intermediate mimetics.

Page 31: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

31

6 Decarboxylative Palladium(II)-Catalyzed Reactions (papers I-III)

For this part of the thesis, three reactions involving palladium(II)-catalyzed decarboxylation (the synthesis of aryl ketones, aryl amidines, and styrene) were investigated using DFT.

6.1 Synthesis of Aryl Ketones Lindh et al. developed a protocol for the synthesis of aryl ketones through palladium(II)-catalyzed decarboxylative addition of benzoic acids to acetoni-trile (Scheme 1).95 They also investigated the mechanism using electrospray ionization mass spectrometry (ESI-MS).

Our goal was to expand their mechanistic study results by means of DFT calculations, and to explain the experimentally observed outcomes for the palladium ligands employed during the ligand screening process.

Scheme 1. General reaction scheme for the decarboxylative palladium(II)-catalyzed synthesis of aryl ketones.

The palladium-catalyzed decarboxylation of aryl carboxylic acids has previ-ously been the focus of theoretical studies.96,97 However, the systems used in those studies were quite different from the one we used as they did not em-ploy a bidentate nitrogen ligand.

We set out to model the reaction path using 6-methyl-2,2'-bipyridine as the palladium ligand, 2,6-dimethoxybenzoic acid as the precursor to aryl-palladium, and acetonitrile as the carbopalladation partner and solvent for the reaction. The calculated free-energy profile for decarboxylation is shown in Figure 4.

Page 32: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

32

Figure 4. Calculated free-energy profile for decarboxylation. Reprinted with per-mission from Svensson et al.98

The lowest energy complex identified prior to decarboxylation was the neu-tral complex II where palladium coordinates 2,6-dimethoxybenzoate and acetate. This complex also marks the starting point of the catalytic cycle. The complex immediately preceding decarboxylation (IV) can be formed from II by releasing the acetate and rotating the benzoate. From there, the decarboxylation can occur through TS-I to form the product complex V. Release of CO2 and coordination with another benzoic acid then generates the lowest energy intermediate (VII) between decarboxylation and the sub-sequent carbopalladation. Overall, the decarboxylation process had a calcu-lated free-energy requirement of 110 kJ mol-1.

Next, the carbopalladation step was investigated (Figure 5). The calcula-tions indicated carbopalladation as the rate-determining step of the reaction with a calculated free-energy requirement of 132 kJ mol-1 (VII to TS-II). Following the carbopalladation, coordination with benzoate and then ligand exchange between the formed ketimine and acetate reformed the starting point of the catalytic cycle (II). Hydrolysis of the product yielded the desired aryl ketone and further lowered the free energy of the system.

Page 33: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

33

Figure 5. Calculated free-energy pathway for carbopalladation and product for-mation. Adapted with permission from Svensson et al.98

The free-energy profiles of all the nitrogen-based ligands that had been used during the ligand screening were investigated to better understand the re-ported experimental outcomes for the different palladium ligands. The com-putationally investigated ligands are depicted in Table 1. A clear trend was observed where the more sterically hindered ligands, such as 6,6´-dimethyl-2,2'-bipyridine, had a lower free-energy requirement for decarboxylation than the less hindered ligands. This could be due to that the lowest energy complex prior to decarboxylation (II) was sterically crowded and was desta-bilized by the methyl groups on these ligands.

One of the main findings of this theoretical study was that the lowest en-ergy complex prior to decarboxylation was a neutral complex (II) and, in order to proceed to decarboxylation, the system had to undergo a charge separation to yield the charged pre-decarboxylative complex (IV). The same pattern is true for the carbopalladation, with the neutral complex VII as the lowest point prior to the charged carbopalladation TS (TS-II). This indicates that the solvent played an important role in the reaction and that the free-energy requirement can be lowered if the solvent better stabilizes the charged species. In fact, DFT calculations applying a water solvation model showed that the free-energy requirements were reduced with these settings. This prompted us to try a series of experimental setups with increasing water content while monitoring the rate and yield of the reaction.

Page 34: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

34

The experiments showed an increased reaction rate for systems with high-er concentrations of water in the solvent system. However, the formation of the proto-decarboxylation product 1,3-dimethoxybenzene also increased with the concentration of water and, thus, the systems with a high water con-centration did not generate yields as high as those in the previously investi-gated systems. Although the addition of water did not improve the protocol, the increased reaction rate supports the computational findings.

6.2 Synthesis of Aryl Amidines The palladium-catalyzed synthesis of aryl amidines from aryl trifluorobo-rates has previously been reported by Sävmarker et al.99 Since the decarbox-ylative synthesis of aryl ketones proved to be a rewarding reaction, we chose to also investigate the related reaction furnishing aryl amidines using cyan-amides (Scheme 2). The mechanism was investigated using DFT calcula-tions and ESI-MS in conjunction with the synthetic results.

Scheme 2. The general scheme for palladium-catalyzed aryl amidine synthesis.

The free-energy profile for three benzoic acids (Scheme 3) was investigated; we used benzoic acids that were more (2,4,6-trimethoxy benzoic acid, 1a) or less (2,3,6-trimethoxy benzoic acid, 1c) electron-rich than 2,6-dimethoxy benzoic acid (1b), which had been our focus in the theoretical studies of decarboxylation.

Scheme 3. Structures of the theoretically investigated benzoic acids.

For 1b, the overall free-energy profile for decarboxylation was naturally nearly identical to that previously calculated for the aryl ketone pathway, with small differences stemming from the application of different solvent parameters. Parameters suitable for N-methyl-2-pyrrolidone were used for this study. The following carbopalladation was mechanistically highly simi-lar to the nitrile counterpart but less energetically demanding. The calculated free energy for the two reaction steps is shown in Table 2.

Page 35: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

35

Table 2. Calculated free energy for the decarboxylation and carbopalladation steps.

Substrate ΔG‡ decarboxylation (kJ mol-1) ΔG‡ carbopalladation (kJ mol-1)

1a 110.3 110.91b 111.1 117,91c 138.5 149.1

Scheme 4. Proposed catalytic cycle based on DFT calculations and ESI-MS for the palladium(II)-catalyzed decarboxylative synthesis of aryl amidines. The calculated free-energy requirement for the two main reaction steps for substrate 1a is indicated in the figure.

The investigation showed that the more electron-poor substrate 1c had sub-stantially higher free-energy requirements for both reaction steps. The lower free-energy requirement of the carbopalladation process placed it close in energy to the decarboxylation process, in particular for the more electron-rich substrates. The carbopalladation was still predicted to be the rate-

Page 36: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

36

determining step but the difference was within the expected margins of error for the DFT method.

The ESI-MS investigation detected several of the complexes that the cal-culations had predicted would form during the reaction. A putative catalytic cycle based on the theoretical investigation and the ESI-MS results is shown in Scheme 4.

6.3 Styrene Synthesis For the final palladium(II)-catalyzed reaction covered in this thesis, we set out to create a decarboxylative protocol for the synthesis of styrenes (Scheme 5). Also in this case, protocols employing alternate routes to aryl-palladium were known.100–102

Scheme 5. Schematic for the palladium(II)-catalyzed decarboxylative styrene syn-thesis with the main by-products shown.

It proved difficult to develop a general protocol for the synthesis of styrene because of the formation of by-products. At first, when the experiments were conducted with an excess of vinyl acetate (VA), some reaction systems yielded large quantities of transvinylated by-product and, when the amounts of VA were reduced, the diarylated product was formed instead of the de-sired styrene. In order to better understand the mechanisms behind this, a DFT investigation was initiated.

Page 37: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

37

Figure 6. Calculated free-energy pathway for the formation of styrene.

The reaction was modeled using 2,6-dimethoxybenzoic acid as the aryl-palladium precursor and palladium(II)-acetate as the palladium source. In order to facilitate the calculations we opted to use 2,2'-bipyridine as the pal-ladium ligand. Although this ligand was not the most favorable of the ap-plied ligands, it was shown to be productive.

We started by mapping the reaction pathway for the formation of the de-sired styrene (Figure 6). The calculations indicated that the rate-determining step was the decarboxylation (117 kJ mol-1, XI to TS-III) and the overall reaction was found to be exergonic. The results also indicated that, after decarboxylation, α-arylation was preferred over β-arylation with free-energy requirements of 98.5 kJ mol-1 and 115 kJ mol-1, respectively. This indicates that styryl acetate formation, resulting from β-arylation, was unlikely to be one of the main pathways in this reaction. This was also supported by the experimental outcome, where only a small amount of styryl acetate was de-tected. With this pathway established, we proceeded to the by-products, and the calculated free-energy pathway for the transvinylation is shown in Fig-ure 7.

Page 38: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

38

Figure 7. Calculated free-energy pathway for the transvinylation process. The calcu-lated free-energy requirement for decarboxylation is shown (in red) for comparison.

The transvinylation process turned out to have a calculated free-energy re-quirement very similar to that of decarboxylation (124 kJ mol-1, XI to TS-VII). Thus, it is not surprising that a large amount of the by-product was formed when the reaction was carried out using high concentrations of VA.

Our first investigation of decarboxylation revealed that the more sterically hindered ligands favored decarboxylation. The ligand screen conducted in this project supported this conclusion, as the sterically hindered ligands clearly resulted in less of the transvinylation by-product than the unhindered ligands, indicating that the free-energy requirement for decarboxylation in those cases was lower than that for transvinylation.

The free-energy pathway leading to the diarylated by-products, stilbene and 1,1-diarylethene (Figure 8), was also investigated. These calculations showed, somewhat surprisingly, that the reaction path leading to the sterical-ly congested 1,1-diarylethene product was the energetically favored path-way. The free-energy requirement for this transformation was found to be 104 kJ mol-1 (XV to TS-IXa). This is lower than the free energy required for decarboxylation and similar to the free energy required for the first arylation. This indicated that it would be difficult to avoid a second arylation in these protocols.

Page 39: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

39

Figure 8. Calculated free-energy profile for the formation of the styrene product in black and the 1,1-diarylethene in red.

The key reaction steps were also investigated using 2,9-dimethyl-1,10-phenantroline as the palladium ligand. These results indicated that the for-mation of the transvinyl product was energetically disfavored when employ-ing this ligand. This is consistent with the experimental outcome where only small amounts of the transvinylation product were observed for systems employing 2,9-dimethyl-1,10-phenantroline. Further, the calculations indi-cated that the sterically congested 1,1-diarylethene product was favored over the linear counterpart for this ligand also.

Based on the theoretical results, a synthetic effort was initiated with the aim of developing a protocol for the selective synthesis of 1,1-diarylethenes. After optimization, the protocol using 2,6-dimethoxybenzoic acid yielded 74% of the desired product with excellent selectivity over the corresponding stilbene (Scheme 6). The protocol was, with some success, extended to 2,4,6-trimethoxybenzoic acid and 2,4-dimethoxybenzoic acid, reaching 29% and 37% yields, respectively. Unfortunately, the protocol gave very low yields for other substrates.

Page 40: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

40

Scheme 6. Reaction conditions for the selective synthesis of 1,1-diarylethene.

Overall, the calculations confirmed the experimental findings that it might be difficult to produce a general protocol for the selective synthesis of styrene. This is because the two possible side reactions, transvinylation and diaryla-tion, have free-energy requirements that are very similar to or lower than those of the main reaction and, thus, will be difficult to circumvent. Further, the situation was made more difficult by the fact that a high concentration of VA, which would have been one way of retarding the diaryl formation, leads to high yields of the transvinylation by-product while, in contrast, a low con-centration of VA leads to problems with diarylation.

Page 41: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

41

7 Method Consensus in Virtual Screening (paper IV)

Combining the signals from several sources has the potential to increase their accuracy.103 This concept is generally referred to as data fusion or sig-nal consensus. In the field of VS, the concept of consensus scoring for dock-ing has been known for a long time.104 The work of Willet has shown that these concepts are applicable for fusing the results from different methods in LB VS.105

Consider a collection of VS methods. If the methods agree better when ranking active compounds than when ranking inactive compounds, then the combination of these methods should yield better results for active com-pounds. If, instead, the applied methods find different active compounds, then the best use of the data would be to select the top compounds from each method.

Besides improving the result over that of any individual signal, another possible beneficial result of data fusion is that it might reduce variation in performance. Since no VS method is known to consistently give the best result, and the performance often varies depending on the target and com-pound dataset, it is very difficult to know which method to apply when faced with a new target. If data fusion can function as an averaging effect between methods it could be used to remove this uncertainty, although at a higher computational cost.

LB and SB methods rely on quite different principles but are both known to give good results in VS. We therefore set out to investigate whether the results of VS could be improved by applying data fusion to the results of both LB and SB methods, and which data fusion scheme would be the most successful. We chose VS methods to represent the current state of the art: docking using Glide,106,107 shape screening using ROCS,108 pharmacophore screening using Phase,109,110 and electrostatic shape screening with EON.111

Five data fusion schemes, all previously reported in the literature, were applied: sum rank, rank vote, sum score, Pareto ranking, and parallel selec-tion.105,112–115 The algorithms are further described in Table 3. These schemes are convenient as they do not require any prior knowledge of the target or dataset and no training set needs to be applied. The only variable that has to be decided upon prior to screening is the number of structures that should

Page 42: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

42

receive votes from the rank vote method. This would typically be set based on the number of hits desired rather than the properties of the dataset.

Table 3. The data fusion algorithms applied in this study. Adapted with permission from Svensson et al.116

Name Algorithm Sum rank The ranks from the single method rank lists are summed.Rank vote Each single method gives a vote to the 250 highest-ranked compounds.

The hit lists are then combined and the compounds are sorted first on the number of votes and then on the sum score.

Sum score All scores are divided by the best score from that method. This relative score is then summed from all the single methods.

Pareto ranking A compound is ranked based on how many other compounds are better in all single methods. Ties are broken using sum rank.

Parallel selection Compounds are selected in turn from the top of the single method hit lists until the desired number of compounds is reached. If a compound has already been selected, the next compound from the current hit list is chosen instead.

We chose 16 datasets to evaluate the data fusion schemes, six developed by Jacobson and Karlén117 and later modified by Mutas et al.,118 and ten from the DUD. This subset of DUD datasets has previously been identified to be the most suitable for evaluation of LB methods since it contains more differ-ent scaffolds among the active compounds than the average DUD dataset.119

Among the single methods applied in this study, docking and shape screening performed best and electrostatic shape performed worst. It is, however, important to note that no effort was made to optimize the usage of the applied methods; they were applied with the recommended general set-tings. Nonetheless, each method performed best for at least one dataset, il-lustrating how difficult the choice of method is for a new target.

The results of the study are shown in Table 4. In order to compare the EF between methods, we calculated the relative EF (the EF divided by the max-imum possible EF).

Table 4. The average relative EF across all the datasets for the investigated single methods and data fusion algorithms, shown for three top fractions of the datasets. The numbers in bold indicate the best performance for that database fraction.

Method 100 compounds 1% 10%

Glide 0.34 0.44 0.66Phase 0.33 0.34 0.46ROCS 0.43 0.46 0.55EON 0.33 0.37 0.53Sum rank 0.38 0.43 0.64Rank vote 0.45 0.52 0.71Sum score 0.43 0.50 0.69Pareto ranking 0.40 0.51 0.73 Parallel selection 0.44 0.53 0.73

Page 43: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

43

When the results were reviewed it became clear that data fusion was indeed able to improve the outcome of VS. Not only did the data fusion have an averaging effect, meaning that the fused results were more consistent than any of the single methods, but data fusion recovered more active compounds than any of the single methods for some targets. Also, on average across all the datasets, data fusion recovered more active compounds than any single method alone. Among the applied data fusion algorithms, parallel selection performed best, indicating that for these datasets and methods the hit lists are at least to some extent complementary.

The results for one of the targets are shown in Figure 9. The differences between the different data fusion algorithms were clearly smaller than the differences between the different VS methods. This was a general trend across the tested targets. This is beneficial for data fusion since it can serve to reduce the impact of the method selection. If the spread between data fu-sion algorithms had been as wide as for the individual methods we would just have replaced one problem of method selection for another.

Figure 9. ROC curves for one of the investigated targets (ERα). The left plot shows the individual VS methods and the right plot shows the result from the five data fusion algorithms. Reprinted with permission from Svensson et al.116

Following our study, another investigation of data fusion using disparate methods has also shown improvements in the VS results.120 That study also introduced a new fusion method that had the best performance. In this fusion method, called Z2, the combined score was derived from the mean of the normalized scores for the two best individual methods for each compound. This method relies on the same basic principle as the parallel selection method; if a compound is scored sufficiently highly by any or by a few methods then that is sufficient to include the compound in the fused hit list. These results further strengthen the view that the hit lists, when applying principally different VS methods, are likely to be complementary.

Page 44: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

44

8 Evaluation of Virtual Screening Using Publicly Available High Throughput Screening Data (paper V)

One possible application of VS is as a replacement or complement to HTS at the start of a drug discovery project.121 The aim is then to identify new mo-lecular starting points for medicinal chemistry development by finding lig-and binders in large collections of compounds. This scenario is in many ways different from what is tested when VS protocols are evaluated using benchmark datasets such as the DUD. One of the biggest differences is that the active compounds in most benchmark datasets are collected from the literature and, thus, represent already optimized binders. This makes the affinity of the active compounds too high compared to what would be ex-pected to be found in an HTS library and, if whole series of molecules are included, they might also be highly similar. Furthermore, the ratio of active to inactive compounds is usually different from what would be expected in HTS, with too few inactive compounds per active compound. Since the ex-isting benchmark datasets are not sufficient to evaluate the success of VS in an HTS scenario, we set out to create a benchmark dataset that could be used to evaluate both SB and LB methods.

The most obvious way to evaluate how well VS would fare in an HTS scenario was to test the VS protocols on data already evaluated in high-throughput screens. The PubChem BioAssay database62 is an excellent source of publicly available HTS data that has been used before to create the MUV dataset.56 The MUV dataset, however, was constructed mainly to evaluate ligand-based methods and the aim is not to resemble a high-throughput screen.

The problem with using HTS data for retrospective evaluation is the gen-erally poor quality of the data generated.122 Figure 10 shows the percent inhibition measured for a library of almost 300 000 compounds against VHR1 (PubChem AID 1654). As can be seen clearly from the graph, the spread in the measured inhibition is quite high, indicating low confidence in the acquired inhibition values.

Page 45: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

45

Figure 10. Reported % inhibition at 13.3 μM for PubChem HTS id 1654.

Another complication when using data from PubChem is the lack of stand-ardization for how the data are annotated and the assay procedures de-scribed. This makes the curation of the data a manual and time-consuming job.

Issues regarding data quality as well as several other aspects need to be considered before using HTS data for VS evaluation. We stipulated that, in order to be useful as an evaluation set, at least 1000 compounds should have been screened, with a minimum of twenty actives found. The bioassays also needed to be linked to at least one protein crystal structure in the PDB, bind-ing a drug-like molecule. In order to improve confidence in the data, we only used HTS data where the active compounds had been confirmed in a dose-response-type secondary assay. This addressed the most acute problem of false positives in the HTS data but did not take into account the issue of false negatives. However, due to the large number of inactive compounds, false negatives should only have had a minor impact on the evaluation metrics used for VS, unless present in excessive amounts. At first, we also excluded all proteins with a co-factor binding site but since this excluded target clas-ses such as kinases, which are well characterized regarding ligand binding, we selectively included targets where binding in alternative sites was deemed unlikely.

With the inclusion criteria set, we used Python scripts to mine PubChem for assays that fulfilled these conditions.

Following the automated filters, the collected data were manually re-viewed and the details of each bioassay were assessed. All data from whole-cell assays were removed as the mode of action of potential inhibitors was

Page 46: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

46

considered too uncertain. To our surprise, from the thousands of screens deposited in PubChem, only seven datasets fulfilled our selection criteria, indicating that the quality of the publicly available HTS data was far from optimal. The seven datasets and the number of included compounds are listed in Table 5. The number of inactive molecules to active molecules, on average almost 2 500 inactive for every active, clearly distinguished these datasets from previous benchmark datasets.

Table 5. Number of active and inactive compounds in the compiled datasets.

Target # active # inactive

MMP13 19 64 814DUSP3 52 289 236

PTPN22 28 292 058

EPHX2 44 90 262

CTDSP1 369 337 634

MAPK10 57 59 405

CDK5 23 334 339

The seven datasets were characterized using physicochemical descriptors and the enrichment from 1D similarity to the crystal ligands was calculated. In previous benchmark datasets, much effort has gone towards eliminating artificial enrichment (enrichment due to differences in 1D descriptors be-tween the active and inactive molecules). While it is necessary to be aware of such differences in the dataset in order to carry out sound evaluations, Bender and Glen have suggested that, instead of removing inactive mole-cules that are too different, the results from more advanced methods can be compared with the 1D similarity results.123 Since this strategy allows the dataset to be more true to the original high-throughput screen, all inactive compounds were included. Any evaluation using these datasets must then of course be taken in context with the performance of 1D methods.

We also investigated the analog bias (are the active compounds too simi-lar?) by calculating the number of Bemis-Murcko scaffolds124 in each da-taset. The results showed that the active compounds in general were quite diverse, with fewer than two structures sharing the same scaffold on average.

In order to evaluate the suitability of the datasets for method validation, we also conducted VS using standard methods. To represent the limited in-formation at the start of a drug discovery project, we chose only to use the information from the crystal structure and the co-crystalized ligand in the VS. The VSs were only moderately successful. This was probably because of the diverse structures of the active compounds and the often quite differ-ent structure of the crystalized ligand that was used as query in the LB tech-niques.

Page 47: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

47

Keeping the dissimilar inactive compounds in the dataset also gives some advantage to the evaluation. The inactive molecules will cover much more of the physicochemical space and, thus, methods that are biased towards big or charged molecules, for example, will be penalized in the comparison. It is not possible to detect that kind of bias using a property balanced dataset such as the DUD. Indeed, when we evaluated the results of the VS, it became clear that docking, in many cases, showed a preference for very large mole-cules, as shown in Figure 11.

Figure 11. PCA plots showing the distribution of inactive compounds as a contour plot, the active compounds as black dots, the crystal ligand as a red triangle, and the one hundred best-scored compounds from docking as yellow squares. Size is a major factor when determining the first principal component. Reprinted with permission from Lindh et al.125

In conclusion, the presented datasets can be a valuable complement to al-ready widely used benchmark datasets by allowing for a more accurate eval-uation of how VS methods fare in an HTS-like scenario. The large number of inactive compounds as well as their wide spread in physicochemical space will also allow for the detection of some method biases.

Hopefully, the amount of publicly available high-quality HTS data will continue to grow, allowing the dataset to be expanded in the future.

Page 48: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

48

9 Virtual Screening Using Computationally Derived Reaction Coordinates (paper VI)

Enzymes catalyze a variety of reactions and are essential for life as we know it. The capacity of an enzyme to catalyze a reaction stems from its ability to stabilize the TS of the reaction and, thus, to lower the required free energy of the transition. TS mimetic inhibitors resemble the TS and can therefore take advantage of the same stabilizing effect in their binding. This results in the potential for TS mimetics to be extraordinarily potent inhibitors.126

It is in the nature of TSs that their structure cannot be experimentally de-termined by methods such as X-ray crystallography. They can, however, be modeled computationally using various approaches, the most common being QM/MM methods86 and the so-called cluster approach.87,88

Research from the laboratory of Schramm has shown that it is possible to use computationally derived TSs as a basis for inhibitor design and that the designed inhibitors can be extremely potent.127,128 We envisioned that instead of designing inhibitors, computed TS or high-energy intermediate structures might be used for VS in order to query compound collections for TS mimetic inhibitors.

We chose insulin-regulated aminopeptidase (IRAP) as the target for our efforts. IRAP belongs to the M1 family of aminopeptidases and has been indicated as a promising target for cognitive enhancement.129 A schematic mechanism based on studies of the related enzyme aminopeptidase N (APN) and endoplasmic reticulum aminopeptidase (ERAP) is shown in Scheme 7.

Scheme 7. Schematic mechanism for the M1 family of aminopeptidases. Reprinted with permission from Svensson et al.130

Page 49: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

49

Several TS or high-energy-intermediate mimetic inhibitors of enzymes in the M1 family are known; among these, bestatin and amastatin have been con-firmed to also be active against IRAP.131

When the project started, the crystal structure of IRAP was not available and we chose to use the crystal structure of the closely related enzyme APN as the basis for modeling. The active site of APN has a very high sequence similarity to its IRAP counterpart. Starting from the APN crystal 4FYR,132 the residues closest to the active site were extracted. The extracted active site contained eleven amino acid side chains; of these, two differed between APN and IRAP. The APN residues Gln213 and Val385 are the equivalents of the IRAP residues Glu295 and Ile461, respectively, and these were modi-fied to reflect the IRAP structure. A model substrate and the water molecule necessary for the reaction were added to the active site. The resulting model, consisting of 227 atoms, was used to model the reaction by means of DFT calculations and the cluster approach.

During the course of our work, two crystal structures for IRAP were re-leased (PDB ID 4PJ6 and 4P8Q).133 We evaluated these and found that they did not provide a significant advantage over the APN-based model as the parts included in our active site model were highly similar between the two enzymes and the quality of the released IRAP structures was in some ways inadequate. Firstly, the resolution of the released structures was quite poor (2.96 Å and 3.02 Å) compared to that of the APN structure (1.91 Å). The IRAP structures also held no substrate or inhibitor. Instead, a single amino acid residue was modeled in the active site. The structures were also in an intermediate position between the open and closed conformations know from ERAP.134 A schematic picture of the open and closed conformations of ERAP1 is shown in Figure 12. It has been suggested that the peptide sub-strate C-terminal in ERAP binds to a positively charged binding site in the open conformation of the enzyme. The ligand binding prompts the enzyme to adopt the closed formation. If the substrate is too large, the enzyme cannot close, and if it is too small it will not reach the N-terminal binding site where the catalytic cleavage takes place. Thus, ERAP only effectively cleaves pep-tides of a certain length.135

Page 50: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

50

Figure 12. Schematic picture of the open (left, PDB ID 3MDJ136) and closed (right, PDB ID 2YDO134) conformations of ERAP1.

The biological significance of this half open conformation in the IRAP crys-tals is not known. However, in this conformation, Tyr549, which is located in the active site and known from other M1 family peptidases to stabilize the reaction intermediate,137 does not have any interaction with the amino acid crystalized in the substrate position on the active site. The Gly-Ala-Met-Glu-Asn amino acid loops that line the active site are also displaced compared to the APN structure. This could be because of the enzyme conformation or it could reflect an actual difference between the two enzymes. To resolve this uncertainty, more and better crystal structures of IRAP are required.

The modeled reaction path (Figure 13), based on the APN active site model, starts with the nucleophilic attack of the water molecule on the sub-strate carbonyl to form the gem-diol intermediate. From there, the hydrogen bonds to and from the formed gem-diol have to shift to prepare for the pro-tonation of the former amide nitrogen. From here, protonation takes place and subsequently the peptide is cleaved. The first hydrogen bond shift was found to have the highest free-energy requirement in our model (58.5 kJ mol-

1). Overall, the calculated reaction pathway follows the literature mechanism for the M1 family aminopeptidases and has reasonable associated energies.

Page 51: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

51

IRAP-I IRAP-TS-I IRAP-II

IRAP-TS-II IRAP-III IRAP-TS-III

IRAP-IV IRAP-TS-IV IRAP-V

IRAP-TS-V IRAP-VI

Figure 13. Schematic view of the calculated intermediates and TSs of the modeled reaction. Reprinted with permission from Svensson et al.130

Once the computed TS and intermediate structures were determined, we set out to use them for VS. We chose to use the structures IRAP-II and IRAP-

Page 52: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

52

V since they represent the two main steps of the reaction. These intermedi-ates were very similar to the preceding TS structure, and we expected that the difference was not of any major significance when used for docking.

The compound library was first docked and then submitted to a pharma-cophore filter. This was done in order to only keep the structures that were predicted to bind in a similar way to the reaction intermediate. The pharma-cophore model was built from the substrate in structure IRAP-II. The allow-ance for the pharmacophore filter was quite big and the chosen features were present throughout the reaction; therefore, the choice of structure on which the pharmacophore was based should have been of lesser importance. The compounds used for screening were from the eMolecules database, which at the time consisted of around 5.6 million compounds. Before VS, we filtered off any structure with a molecular weight over 600 g mol-1 and prepared the structures for VS with LigPrep.

The VS yielded several interesting compounds, including known ami-nopeptidase TS/high-energy intermediate mimetics. The ranking of these compounds was dramatically improved by the application of the pharmaco-phore model, indicating that some form of binding mode filter is beneficial. The VS protocol was also excellent at reproducing the experimentally de-termined binding mode of bestatin, Figure 14.

Figure 14. The conformation of bestatin resulting from the VS protocol (pink car-bons) overlaid on the crystal structure of bestatin in APN (yellow carbons, PDB ID 4FKK138). Reprinted with permission from Svensson et al.130

Seven compounds were purchased and submitted for biological testing with respect to their ability to inhibit IRAP. Among these, ami-no(phenyl)methylphosphonic acid (2, Scheme 8) was confirmed as an IRAP inhibitor with an IC50 of 230 (± 7) μM. We also tested amastatin as a positive control and it registered an IC50 of 26 (± 0.6) μM in our assay.

Page 53: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

53

Scheme 8. Structure of the identified IRAP inhibitor.

α-Aminophosphonic acids are known to inhibit APN but have not previously been shown to inhibit IRAP.139 Although, the IRAP inhibition is not surpris-ing it further confirms the usefulness of this VS strategy.

Although 2 was a close match in pharmacophore features to the TS, the measured affinity was not very high. This might have been the result of the small size of the inhibitor, especially compared to the natural substrate of IRAP. It is also possible that the electron density and charge distribution were not sufficiently similar to those of the TS to achieve a tighter binding.

Overall, this study confirms that it is possible to use calculated reaction coordinates as the basis for VS in order to enrich compounds that can act as TS/high-energy intermediate mimetics.

Page 54: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

54

10 Conclusions

In summary, the work presented in this thesis has resulted in the following new insights:

• The theoretical studies increased the understanding of palladium(II)-

catalyzed decarboxylation reactions. The studies not only explained the experimental outcomes, but also inspired the pursuit of improved proto-cols.

• Data fusion of results obtained with LB and SB methodologies improved the VS outcomes.

• A new VS benchmark dataset that can be used to evaluate the perfor-mance of VS techniques was compiled, evaluated, and made generally available.

• A new method for identifying TS and high-energy intermediates through VS using DFT-derived reaction coordinates was developed and applied. The presented protocol was validated through the identification of known TS mimetics.

Page 55: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

55

11 Acknowledgements

The work presented in this thesis would not have been possible without the help, support, and inspiration provided by others. I wish to express my sin-cere gratitude to the following people: My supervisor Associate Professor Christian Sköld and my co-supervisor Professor Anders Karlén, for the best possible support. I have truly enjoyed working with you. Professor Mats Larhed, the palladium group, the TB group, the IRAP group, and the CADD group for fruitful scientific discussions. All my colleagues at the department, past and present, for making our work-place such a great environment for both research and socializing. Professor Fahmi Himo at Stockholm University and all members of his group for generously sharing their knowledge of enzymatic reaction model-ing and the cluster approach. Apotekarsocieteten for financial support allowing parts of this work to be presented at international conferences. My always loving and supportive family. My partner in life, Susanne “sus” Bornelöv, for simply being the best.

Page 56: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

56

12 References

(1) Jones, A. W. Early Drug Discovery and the Rise of Pharmaceutical Chemistry. Drug Test. Anal. 2011, 3, 337–344.

(2) Jorgensen, W. L. The Many Roles of Computation in Drug Discovery. Science 2004, 303, 1813–1818.

(3) Mark, A. E.; van Gunsteren, W. F. Decomposition of the Free Energy of a System in Terms of Specific Interactions: Implications for Theoretical and Experimental Studies. J. Mol. Biol. 1994, 240, 167–176.

(4) Dill, K. A. Additivity Principles in Biochemistry. J. Biol. Chem. 1997, 272, 701–704.

(5) Ajay; Murcko, M. A. Computational Methods to Predict Binding Free Energy in Ligand-Receptor Complexes. J. Med. Chem. 1995, 38, 4953–4967.

(6) Christen, M.; van Gunsteren, W. F. On Searching In, Sampling Of, and Dynamically Moving through Conformational Space of Biomolecular Systems: A Review. J. Comput. Chem. 2008, 29, 157–166.

(7) Saunders, M.; Houk, K. N.; Wu, Y. D.; Still, W. C.; Lipton, M.; Chang, G.; Guida, W. C. Conformations of Cycloheptadecane. A Comparison of Methods for Conformational Searching. J. Am. Chem. Soc. 1990, 112, 1419–1427.

(8) Eyring, H. The Activated Complex in Chemical Reactions. J. Chem. Phys. 1935, 3.

(9) Hill, T. L. On Steric Effects. J. Chem. Phys. 1946, 14. (10) Westheimer, F. H.; Mayer, J. E. The Theory of the Racemization of

Optically Active Derivatives of Diphenyl. J. Chem. Phys. 1946, 14. (11) Allinger, N. L. Conformational Analysis. 130. MM2. A Hydrocarbon Force

Field Utilizing V1 and V2 Torsional Terms. J. Am. Chem. Soc. 1977, 99, 8127–8134.

(12) Schrödinger, E. An Undulatory Theory of the Mechanics of Atoms and Molecules. Phys. Rev. 1926, 28, 1049–1070.

(13) Hohenberg, P.; Kohn, W. Inhomogeneous Electron Gas. Phys. Rev. 1964, 136, B864–B871.

(14) Kohn, W.; Sham, L. J. Self-Consistent Equations Including Exchange and Correlation Effects. Phys. Rev. 1965, 140, A1133–A1138.

(15) Bochevarov, A. D.; Harder, E.; Hughes, T. F.; Greenwood, J. R.; Braden, D. A.; Philipp, D. M.; Rinaldo, D.; Halls, M. D.; Zhang, J.; Friesner, R. A. Jaguar: A High-Performance Quantum Chemistry Software Program with Strengths in Life and Materials Sciences. Int. J. Quantum Chem. 2013, 113, 2110–2142.

(16) Becke, A. D. A New Mixing of Hartree–Fock and Local Density-Functional Theories. J. Chem. Phys. 1993, 98, 1372.

Page 57: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

57

(17) Lee, C.; Yang, W.; Parr, R. G. Development of the Colle-Salvetti Correlation-Energy Formula into a Functional of the Electron Density. Phys. Rev. B 1988, 37, 785–789.

(18) Curtiss, L. A.; Redfern, P. C.; Raghavachari, K. Assessment of Gaussian-3 and Density-Functional Theories on the G3/05 Test Set of Experimental Energies. J. Chem. Phys. 2005, 123, 124107.

(19) Ahlquist, M.; Kozuch, S.; Shaik, S.; Tanner, D.; Norrby, P.-O. On the Performance of Continuum Solvation Models for the Solvation Energy of Small Anions. Organometallics 2006, 25, 45–47.

(20) Ditchfield, R.; Hehre, W. J.; Pople, J. A. Self-Consistent Molecular-Orbital Methods. IX. An Extended Gaussian-Type Basis for Molecular-Orbital Studies of Organic Molecules. J. Chem. Phys. 1971, 54, 724.

(21) Hehre, W. J.; Pople, J. A. Self-Consistent Molecular Orbital Methods. XIII. An Extended Gaussian-Type Basis for Boron. J. Chem. Phys. 1972, 56, 4233.

(22) Binkley, J. S.; Pople, J. A. Self-Consistent Molecular Orbital Methods. XIX. Split-Valence Gaussian-Type Basis Sets for Beryllium. J. Chem. Phys. 1977, 66, 879.

(23) Hariharan, P. C.; Pople, J. A. The Influence of Polarization Functions on Molecular Orbital Hydrogenation Energies. Theor. Chim. Acta 1973, 28, 213–222.

(24) Francl, M. M.; Pietro, W. J.; Hehre, W. J.; Binkley, J. S.; Gordon, M. S.; DeFrees, D. J.; Pople, J. A. Self-Consistent Molecular Orbital Methods. XXIII. A Polarization-Type Basis Set for Second-Row Elements. J. Chem. Phys. 1982, 77, 3654.

(25) Hehre, W. J.; Ditchfield, R.; Pople, J. A. Self—Consistent Molecular Orbital Methods. XII. Further Extensions of Gaussian—Type Basis Sets for Use in Molecular Orbital Studies of Organic Molecules. J. Chem. Phys. 1972, 56, 2257.

(26) Hay, P. J.; Wadt, W. R. Ab Initio Effective Core Potentials for Molecular Calculations. Potentials for K to Au Including the Outermost Core Orbitals. J. Chem. Phys. 1985, 82, 299.

(27) Grimme, S. Density Functional Theory with London Dispersion Corrections. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2011, 1, 211–228.

(28) Ahlquist, M. S. G.; Norrby, P.-O. Dispersion and Back-Donation Gives Tetracoordinate [Pd(PPh3)4]. Angew. Chemie Int. Ed. 2011, 50, 11794–11797.

(29) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. A Consistent and Accurate Ab Initio Parametrization of Density Functional Dispersion Correction (DFT-D) for the 94 Elements H-Pu. J. Chem. Phys. 2010, 132, 154104–154119.

(30) Antony, J.; Grimme, S.; Liakos, D. G.; Neese, F. Protein–Ligand Interaction Energies with Dispersion Corrected Density Functional Theory and High-Level Wave Function Based Methods. J. Phys. Chem. A 2011, 115, 11210–11220.

Page 58: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

58

(31) Tannor, D. J.; Marten, B.; Murphy, R.; Friesner, R. A.; Sitkoff, D.; Nicholls, A.; Honig, B.; Ringnalda, M.; Goddard, W. A. Accurate First Principles Calculation of Molecular Charge Distributions and Solvation Energies from Ab Initio Quantum Mechanics and Continuum Dielectric Theory. J. Am. Chem. Soc. 1994, 116, 11875–11882.

(32) Marten, B.; Kim, K.; Cortis, C.; Friesner, R. A.; Murphy, R. B.; Ringnalda, M. N.; Sitkoff, D.; Honig, B. New Model for Calculation of Solvation Free Energies:  Correction of Self-Consistent Reaction Field Continuum Dielectric Theory for Short-Range Hydrogen-Bonding Effects. J. Phys. Chem. 1996, 100, 11775–11788.

(33) Hratchian, H. P.; Schlegel, H. B. Finding Minima, Transition States, and Following Reaction Pathways on Ab Initio Potential Energy Surfaces. In Theory and Applications of Computational Chemistry; Elsevier: Amsterdam, 2005; pp. 195–249.

(34) Ripphausen, P.; Nisius, B.; Peltason, L.; Bajorath, J. Quo Vadis, Virtual Screening? A Comprehensive Survey of Prospective Applications. J. Med. Chem. 2010, 53, 8461–8467.

(35) Matter, H.; Sotriffer, C. Applications and Success Stories in Virtual Screening. In Virtual Screening; Wiley-VCH Verlag GmbH & Co. KGaA, 2011; pp. 319–358.

(36) Clark, D. E. What Has Virtual Screening Ever Done for Drug Discovery? Expert Opin. Drug Discov. 2008, 3, 841–851.

(37) Concepts and Applications of Molecular Similarity; Johanson, M. A.; Maggiora, G. M., Eds.; John Wiley & Sons: New York, 1990.

(38) Tresadern, G.; Bemporad, D.; Howe, T. A Comparison of Ligand Based Virtual Screening Methods and Application to Corticotropin Releasing Factor 1 Receptor. J. Mol. Graph. Model. 2009, 27, 860–870.

(39) Meng, X.-Y.; Zhang, H.-X.; Mezei, M.; Cui, M. Molecular Docking: A Powerful Approach for Structure-Based Drug Discovery. Curr. Comput. Aided. Drug Des. 2011, 7, 146–157.

(40) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissing, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242.

(41) Kleywegt, G. J.; Harris, M. R.; Zou, J. Y.; Taylor, T. C.; Wählby, A.; Jones, T. A. The Uppsala Electron-Density Server. Acta Crystallogr. D. Biol. Crystallogr. 2004, 60, 2240–2249.

(42) Davis, A. M.; St-Gallay, S. A.; Kleywegt, G. J. Limitations and Lessons in the Use of X-Ray Structural Information in Drug Design. Drug Discov. Today 2008, 13, 831–841.

(43) Madhavi Sastry, G.; Adzhigirey, M.; Day, T.; Annabhimoju, R.; Sherman, W. Protein and Ligand Preparation: Parameters, Protocols, and Influence on Virtual Screening Enrichments. J. Comput. Aided. Mol. Des. 2013, 27, 221–234.

(44) Kirchmair, J.; Spitzer, G. M.; Liedl, K. R. Consideration of Water and Solvation Effects in Virtual Screening. In Virtual Screening; Wiley-VCH Verlag GmbH & Co. KGaA, 2011; pp. 263–289.

Page 59: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

59

(45) Lenselink, E. B.; Beuming, T.; Sherman, W.; van Vlijmen, H. W. T.; IJzerman, A. P. Selecting an Optimal Number of Binding Site Waters To Improve Virtual Screening Enrichments Against the Adenosine A2A Receptor. J. Chem. Inf. Model. 2014, 54, 1737–1746.

(46) Abel, R.; Young, T.; Farid, R.; Berne, B. J.; Friesner, R. A. The Role of the Active Site Solvent in the Thermodynamics of Factor Xa-Ligand Binding. J. Am. Chem. Soc. 2008, 130, 2817–2831.

(47) Young, T.; Abel, R.; Kim, B.; Berne, B. J.; Friesner, R. A. Motifs for Molecular Recognition Exploiting Hydrophobic Enclosure in Protein–ligand Binding. Proc. Natl. Acad. Sci. 2007, 104 , 808–813.

(48) Kirkpatrick, P.; Ellis, C. Chemical Space. Nature 2004, 432, 823. (49) Lipinski, C.; Hopkins, A. Navigating Chemical Space for Biology and

Medicine. Nature 2004, 432, 855–861. (50) Lipinski, C. A.; Lombardo, F.; Dominy, B. W.; Feeney, P. J. Experimental

and Computational Approaches to Estimate Solubility and Permeability in Drug Discovery and Development Settings. Adv. Drug Deliv. Rev. 1997, 23, 3–25.

(51) Pearson, K. LIII. On Lines and Planes of Closest Fit to Systems of Points in Space. Philos. Mag. Ser. 6 1901, 2, 559–572.

(52) Jain, A. N.; Nicholls, A. Recommendations for Evaluation of Computational Methods. J. Comput. Aided. Mol. Des. 2008, 22, 133–139.

(53) Xia, J.; Tilahun, E. L.; Reid, T.-E.; Zhang, L.; Wang, X. S. Benchmarking Methods and Data Sets for Ligand Enrichment Assessment in Virtual Screening. Methods 2015, 71, 146–157.

(54) Huang, N.; Scoichet, B. K.; Irwin, J. J. Benchmarking Sets for Molecular Docking. J. Med. Chem. 2006, 49, 6789–6801.

(55) Mysinger, M. M.; Carchia, M.; Irwin, J. J.; Shoichet, B. K. Directory of Useful Decoys, Enhanced (DUD-E): Better Ligands and Decoys for Better Benchmarking. J. Med. Chem. 2012, 55, 6582–6594.

(56) Rohrer, S. G.; Baumann, K. Maximum Unbiased Validation (MUV) Data Sets for Virtual Screening Based on PubChem Bioactivity Data. J. Chem. Inf. Model. 2009, 49, 169–184.

(57) Vogel, S. M.; Bauer, M. R.; Boeckler, F. M. DEKOIS: Demanding Evaluation Kits for Objective in Silico Screening--a Versatile Tool for Benchmarking Docking Programs and Scoring Functions. J. Chem. Inf. Model. 2011, 51, 2650–2665.

(58) Bauer, M. R.; Ibrahim, T. M.; Vogel, S. M.; Boeckler, F. M. Evaluation and Optimization of Virtual Screening Workflows with DEKOIS 2.0 – A Public Library of Challenging Docking Benchmark Sets. J. Chem. Inf. Model. 2013, 53, 1447–1462.

(59) Irwin, J. J.; Shoichet, B. K. ZINC − A Free Database of Commercially Available Compounds for Virtual Screening. J. Chem. Inf. Model. 2005, 45, 177–182.

(60) Irwin, J. J.; Sterling, T.; Mysinger, M. M.; Bolstad, E. S.; Coleman, R. G. ZINC: A Free Tool to Discover Chemistry for Biology. J. Chem. Inf. Model. 2012, 52, 1757–1768.

(61) Liu, T.; Lin, Y.; Wen, X.; Jorissen, R. N.; Gilson, M. K. BindingDB: A Web-Accessible Database of Experimentally Determined Protein–ligand Binding Affinities. Nucleic Acids Res. 2007, 35, D198–D201.

Page 60: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

60

(62) Wang, Y.; Xiao, J.; Suzek, T. O.; Zhang, J.; Wang, J.; Zhou, Z.; Han, L.; Karapetyan, K.; Dracheva, S.; Shoemaker, B. A.; Bolton, E.; Gindulyte, A.; Bryant, S. H. PubChem’s BioAssay Database. Nucleic Acids Res. 2012, 40, D400–D412.

(63) Irwin, J. J. Community Benchmarks for Virtual Screening. J. Comput. Aided. Mol. Des. 2008, 22, 193–199.

(64) Verdonk, M. L.; Berdini, V.; Hartshorn, M. J.; Mooij, W. T. M.; Murray, C. W.; Taylor, R. D.; Watson, P. Virtual Screening Using Protein-Ligand Docking: Avoiding Artificial Enrichment. J. Chem. Inf. Comput. Sci. 2004, 44, 793–806.

(65) Good, A. C.; Hermsmeier, M. a.; Hindle, S. a. Measuring CAMD Technique Performance: A Virtual Screening Case Study in the Design of Validation Experiments. J. Comput. Aided. Mol. Des. 2004, 18, 529–536.

(66) Truchon, J.-F.; Bayly, C. I. Evaluating Virtual Screening Methods:  Good and Bad Metrics for the “Early Recognition” Problem. J. Chem. Inf. Model. 2007, 47, 488–508.

(67) Berzelius, J. Årsberättelsen Om Framsteg I Fysik Och Kemi; Royal Swedish Academy of Sciences, 1835.

(68) Mizoroki, T.; Mori, K.; Ozaki, A. Arylation of Olefin with Aryl Iodide Catalyzed by Palladium. Bull. Chem. Soc. Jpn. 1971, 44, 581.

(69) Heck, R. F.; Nolley, J. P. Palladium-Catalyzed Vinylic Hydrogen Substitution Reactions with Aryl, Benzyl, and Styryl Halides. J. Org. Chem. 1972, 37, 2320–2322.

(70) Heck, R. F. Acylation, Methylation, and Carboxyalkylation of Olefins by Group VIII Metal Derivatives. J. Am. Chem. Soc. 1968, 90, 5518–5526.

(71) Shepard, A. F.; Winslow, N. R.; Johnson, J. R. THE SIMPLE HALOGEN DERIVATIVES OF FURAN. J. Am. Chem. Soc. 1930, 52, 2083–2090.

(72) Tanaka, D.; Myers, A. G. Heck-Type Arylation of 2-Cycloalken-1-Ones with Arylpalladium Intermediates Formed by Decarboxylative Palladation and by Aryl Iodide Insertion. Org. Lett. 2004, 6, 433–436.

(73) Tanaka, D.; Romeril, S. P.; Myers, A. G. On the Mechanism of the Palladium(II)-Catalyzed Decarboxylative Olefination of Arene Carboxylic Acids. Crystallographic Characterization of Non-Phosphine Palladium(II) Intermediates and Observation of Their Stepwise Transformation in Heck-like Processes. J. Am. Chem. Soc. 2005, 127, 10323–10333.

(74) Myers, A. G.; Tanaka, D.; Mannion, M. R. Development of a Decarboxylative Palladation Reaction and Its Use in a Heck-Type Olefination of Arene Carboxylates. J. Am. Chem. Soc. 2002, 124, 11250–11251.

(75) Dickstein, J. S.; Mulrooney, C. A.; O’Brien, E. M.; Morgan, B. J.; Kozlowski, M. C. Development of a Catalytic Aromatic Decarboxylation Reaction. Org. Lett. 2007, 9, 2441–2444.

(76) Skillinghaug, B.; Sköld, C.; Rydfjord, J.; Svensson, F.; Behrends, M.; Sävmarker, J.; Sjöberg, P. J. R.; Larhed, M. Palladium(II)-Catalyzed Desulfitative Synthesis of Aryl Ketones from Sodium Arylsulfinates and Nitriles: Scope, Limitations, and Mechanistic Studies. J. Org. Chem. 2014, 79, 12018–12032.

Page 61: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

61

(77) McMullin, C. L.; Jover, J.; Harvey, J. N.; Fey, N. Accurate Modelling of Pd(0) + PhX Oxidative Addition Kinetics. Dalt. Trans. 2010, 39, 10833–10836.

(78) Fukui, K. The Path of Chemical Reactions - the IRC Approach. Acc. Chem. Res. 1981, 14, 363–368.

(79) Müller, K. Reaction Paths on Multidimensional Energy Hypersurfaces. Angew. Chemie Int. Ed. 1980, 19, 1–13.

(80) Goodman, J. M.; Silva, M. A. QRC: A Rapid Method for Connecting Transition Structures to Reactants in the Computational Analysis of Organic Reactivity. Tetrahedron Lett. 2003, 44, 8233–8236.

(81) Lienhard, G. E. Enzymatic Catalysis and Transition-State Theory. Science 1973, 180, 149–154.

(82) Schwartz, S. D.; Schramm, V. L. Enzymatic Transition States and Dynamic Motion in Barrier Crossing. Nat. Chem. Biol. 2009, 5, 551–558.

(83) Imming, P.; Sinning, C.; Meyer, A. Drugs, Their Targets and the Nature and Number of Drug Targets. Nat. Rev. Drug Discov. 2006, 5, 821–834.

(84) Blomberg, M. R. A.; Borowski, T.; Himo, F.; Liao, R.-Z.; Siegbahn, P. E. M. Quantum Chemical Studies of Mechanisms for Metalloenzymes. Chem. Rev. 2014, 114, 3601–3658.

(85) Carvalho, A. T. P.; Barrozo, A.; Doron, D.; Kilshtain, A. V.; Major, D. T.; Kamerlin, S. C. L. Challenges in Computational Studies of Enzyme Structure, Function and Dynamics. J. Mol. Graph. Model. 2014, 54, 62–79.

(86) Senn, H. M.; Thiel, W. QM/MM Studies of Enzymes. Curr. Opin. Chem. Biol. 2007, 11, 182–187.

(87) Siegbahn, P. E. M.; Himo, F. The Quantum Chemical Cluster Approach for Modeling Enzyme Reactions. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2011, 1, 323–336.

(88) Siegbahn, P. M.; Himo, F. Recent Developments of the Quantum Chemical Cluster Approach for Modeling Enzyme Reactions. J. Biol. Inorg. Chem. 2009, 14, 643–651.

(89) Li, L.; Li, C.; Zhang, Z.; Alexov, E. On the Dielectric “Constant” of Proteins: Smooth Dielectric Function for Macromolecular Modeling and Its Implementation in DelPhi. J. Chem. Theory Comput. 2013, 9, 2126–2136.

(90) Sevastik, R.; Himo, F. Quantum Chemical Modeling of Enzymatic Reactions: The Case of 4-Oxalocrotonate Tautomerase. Bioorg. Chem. 2007, 35, 444–457.

(91) Hopmann, K. H.; Himo, F. Quantum Chemical Modeling of the Dehalogenation Reaction of Haloalcohol Dehalogenase. J. Chem. Theory Comput. 2008, 4, 1129–1137.

(92) Georgieva, P.; Himo, F. Quantum Chemical Modeling of Enzymatic Reactions: The Case of Histone Lysine Methyltransferase. J. Comput. Chem. 2010, 31, 1707–1714.

(93) Hu, L.; Eliasson, J.; Heimdal, J.; Ryde, U. Do Quantum Mechanical Energies Calculated for Small Models of Protein-Active Sites Converge?†. J. Phys. Chem. A 2009, 113, 11793–11800.

(94) Sumner, S.; Söderhjelm, P.; Ryde, U. Effect of Geometry Optimizations on QM-Cluster and QM/MM Studies of Reaction Energies in Proteins. J. Chem. Theory Comput. 2013, 9, 4205–4214.

Page 62: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

62

(95) Lindh, J.; Sjöberg, P. J. R.; Larhed, M. Synthesis of Aryl Ketones by palladium(II)-Catalyzed Decarboxylative Addition of Benzoic Acids to Nitriles. Angew. Chemie Int. Ed. 2010, 49, 7733–7737.

(96) Xue, L.; Su, W.; Lin, Z. A DFT Study on the Pd-Mediated Decarboxylation Process of Aryl Carboxylic Acids. Dalt. Trans. 2010, 39, 9815–9822.

(97) Zhang, S.-L.; Fu, Y.; Shang, R.; Guo, Q.-X.; Liu, L. Theoretical Analysis of Factors Controlling Pd-Catalyzed Decarboxylative Coupling of Carboxylic Acids with Olefins. J. Am. Chem. Soc. 2009, 132, 638–646.

(98) Svensson, F.; Mane, R. S.; Sävmarker, J.; Larhed, M.; Sköld, C. Theoretical and Experimental Investigation of Palladium(II)-Catalyzed Decarboxylative Addition of Arenecarboxylic Acid to Nitrile. Organometallics 2013, 32, 490–497.

(99) Sävmarker, J.; Rydfjord, J.; Gising, J.; Odell, L. R.; Larhed, M. Direct Palladium(II)-Catalyzed Synthesis of Arylamidines from Aryltrifluoroborates. Org. Lett. 2012, 14, 2394–2397.

(100) Kormos, C. M.; Leadbeater, N. E. Preparation of Nonsymmetrically Substituted Stilbenes in a One-Pot Two-Step Heck Strategy Using Ethene As a Reagent. J. Org. Chem. 2008, 73, 3854–3858.

(101) Cabri, W.; Candiani, I.; Bedeschi, A.; Santi, R. Palladium-Catalyzed Arylation of Unsymmetrical Olefins. Bidentate Phosphine Ligand Controlled Regioselectivity. J. Org. Chem. 1992, 57, 3558–3563.

(102) Arai, I.; Daves, G. D. Palladium-Catalyzed Phenylation of Enol Ethers and Acetates. J. Org. Chem. 1979, 44, 21–23.

(103) Mitchell, H. B. Multi-Sensor Data Fusion; Springer-Verlag: Berlin, 2007. (104) Charifson, P. S.; Corkery, J. J.; Murcko, M. A.; Walters, W. P. Consensus

Scoring:  A Method for Obtaining Improved Hit Rates from Docking Databases of Three-Dimensional Structures into Proteins. J. Med. Chem. 1999, 42, 5100–5109.

(105) Willet, P. Enhancing the Effectivness of Ligand-Based Virtual Screening Using Data Fusion. QSAR Comb. Sci. 2006, 25, 1143–1152.

(106) Friesner, R. A.; Banks, J. L.; Murphy, R. B.; Halgren, T. A.; Klicic, J. J.; Mainz, D. T.; Repasky, M. P.; Knoll, E. H.; Shelly, M.; Perry, J. K.; Shaw, D. E.; Francis, P.; Shenkin, P. S. Glide: A New Approach for Rapid, Accurate Docking and Scoring. 1. Method and Assessment of Docking Accuracy. J. Med. Chem. 2004, 47, 1739–1749.

(107) Halgren, T. A.; Murphy, R. B.; Friesner, R. A.; Beard, H. S.; Frye, L. L.; Pollard, W. T.; Banks, J. L. Glide: A New Approach for Rapid, Accurate Docking and Scoring. 2. Enrichment Factors in Database Screening. J. Med. Chem. 2004, 47, 1750–1759.

(108) Hawkins, P. C. D.; Skillman, A. G.; Nicholls, A. Comparison of Shape-Matching and Docking as Virtual Screening Tools. J. Med. Chem. 2007, 50, 74–82.

(109) Dixon, S. L.; Smondyrev, A. M.; Knoll, E. H.; Rao, S. N.; Shaw, D. E.; Freisner, R. A. PHASE: A New Engine for Pharmacophore Perception, 3D QSAR Model Development, and 3D Databases Screening: 1. Methodology and Preliminary Results. J. Comput. Aided. Mol. Des. 2006, 20, 647–671.

(110) Dixon, S. L.; Smondyrev, A. M.; Rao, S. N. PHASE: A Novel Approach to Pharmacophore Modeling and 3D Database Searching. Chem. Biol. Drug Des. 2006, 67, 370–372.

Page 63: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

63

(111) Muchmore, S. W.; Souers, A. J.; Akritopoulou-Zanze, I. The Use of Three-Dimensional Shape and Electrostatic Similarity Searching in the Identification of a Melanin-Concentrating Hormone Receptor 1 Antagonist. Chem. Biol. Drug Des. 2006, 67, 174–176.

(112) Tan, L.; Geppert, H.; Sisay, M. T.; Gütschow, M.; Bajorath, J. Integrating Structure- and Ligand-Based Virtual Screening: Comparison of Individual, Parallel, and Fused Molecular Docking and Similarity Search Calculations on Multiple Targets. ChemMedChem 2008, 1566–1571.

(113) Baber, J. C.; Shirley, W. A.; Gao, Y.; Feher, M. The Use of Consensus Scoring in Ligand-Based Virtual Screening. J. Chem. Inf. Model. 2006, 46, 244–288.

(114) Cross, S.; Baroni, M.; Carosati, E.; Benedetti, P.; Clementi, S. FLAP: GRID Molecular Interaction Fileds in Virtual Screening. Validation Using the DUD Data Set. J. Chem. Inf. Model. 2010, 50, 1442–1450.

(115) Wang, R.; Wang, S. How Does Consensus Scoring Work for Virtual Library Screening? An Idealized Comuter Experiment. J. Chem. Inf. Comput. Sci. 2001, 41, 1422–1426.

(116) Svensson, F.; Karlén, A.; Sköld, C. Virtual Screening Data Fusion Using Both Structure- and Ligand-Based Methods. J. Chem. Inf. Model. 2012, 52, 225–232.

(117) Jacobsson, M.; Karlén, A. Ligand Bias of Scoring Functions in Structure-Based Virtual Screening. J. Chem. Inf. Model. 2006, 46, 1334–1343.

(118) Muthas, D.; Sabnis, Y. A.; Lundborg, M.; Karlén, A. Is It Possible to Increse Hit Rates in Structure-Based Virtual Screening by Pharmacophore Filtering? An Investigation of the Advantages and Pitfalls of Post-Filtering. J. Mol. Graph. Model. 2008, 26, 1237–1251.

(119) Cheeseright, T. J.; Mackey, M. D.; Melville, J. L.; Vinter, J. G. FielsScreen: Virtual Screening Using Molecular Fields. Application to the DUD Data Set. J. Chem. Inf. Model. 2008, 48, 2108–2117.

(120) Sastry, G. M.; Inakollu, V. S. S.; Sherman, W. Boosting Virtual Screening Enrichments with Data Fusion: Coalescing Hits from Two-Dimensional Fingerprints, Shape, and Docking. J. Chem. Inf. Model. 2013, 53, 1531–1542.

(121) Subramaniam, S.; Mehrotra, M.; Gupta, D. Virtual High Throughput Screening (vHTS) - A Perspective. Bioinformation 2008, 3, 14–17.

(122) Blucher, A. S.; McWeeney, S. K. Challenges in Secondary Analysis of High Throughput Screening Data. Pac. Symp. Biocomput. 2014, 114–124.

(123) Bender, A.; Glen, R. C. A Discussion of Measures of Enrichment in Virtual Screening:  Comparing the Information Content of Descriptors with Increasing Levels of Sophistication. J. Chem. Inf. Model. 2005, 45, 1369–1375.

(124) Bemis, G. W.; Murcko, M. A. The Properties of Known Drugs. 1. Molecular Frameworks. J. Med. Chem. 1996, 39, 2887–2893.

(125) Lindh, M.; Svensson, F.; Schaal, W.; Zhang, J.; Sköld, C.; Brandt, P.; Karlén, A. Toward a Benchmarking Data Set Able to Evaluate Ligand- and Structure-Based Virtual Screening Using Public HTS Data. J. Chem. Inf. Model. 2015, 55, 343–353.

Page 64: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

64

(126) Karolina Gluza; Kafarski, P. Transition State Analogues of Enzymatic Reactions as Potential Drugs. In Drug Discovery; El-Shemy, H., Ed.; InTech: Rijeka, Croatia, 2013.

(127) Schramm, V. L. Enzymatic Transition States, Transition-State Analogs, Dynamics, Thermodynamics, and Lifetimes. Annu. Rev. Biochem. 2011, 80, 703–732.

(128) Schramm, V. L. Transition States, Analogues, and Drug Development. ACS Chem. Biol. 2012, 8, 71–81.

(129) Andersson, H.; Hallberg, M. Discovery of Inhibitors of Insulin-Regulated Aminopeptidase as Cognitive Enhancers. Int. J. Hypertens. 2012, 2012, Article ID 789671.

(130) Svensson, F.; Engen, K.; Lundbäck, T.; Larhed, M.; Sköld, C. Virtual Screening for Transition State Analogue Inhibitors of IRAP Based on Quantum Mechanically Derived Reaction Coordinates. J. Chem. Inf. Model. 2015. DOI: 10.1021/acs.jcim.5b00359

(131) Matsumoto, H.; Rogi, T.; Yamashiro, K.; Kodama, S.; Tsuruoka, N.; Hattori, A.; Takio, K.; Mizutani, S.; Tsujimoto, M. Characterization of a Recombinant Soluble Form of Human Placental Leucine Aminopeptidase/oxytocinase Expressed in Chinese Hamster Ovary Cells. Eur. J. Biochem. 2000, 267, 46–52.

(132) Wong, A. H. M.; Zhou, D.; Rini, J. M. The X-Ray Crystal Structure of Human Aminopeptidase N Reveals a Novel Dimer and the Basis for Peptide Processing. J. Biol. Chem. 2012, 287, 36804–36813.

(133) Hermans, S. J.; Ascher, D. B.; Hancock, N. C.; Holien, J. K.; Michell, B. J.; Yeen Chai, S.; Morton, C. J.; Parker, M. W. Crystal Structure of Human Insulin-Regulated Aminopeptidase with Specificity for Cyclic Peptides. Protein Sci. 2015, 24, 190–199.

(134) Kochan, G.; Krojer, T.; Harvey, D.; Fischer, R.; Chen, L.; Vollmar, M.; von Delft, F.; Kavanagh, K. L.; Brown, M. A.; Bowness, P.; Wordsworth, P.; Kessler, B. M.; Oppermann, U. Crystal Structures of the Endoplasmic Reticulum Aminopeptidase-1 (ERAP1) Reveal the Molecular Basis for N-Terminal Peptide Trimming. Proc. Natl. Acad. Sci. 2011, 108, 7745–7750.

(135) Gandhi, A.; Lakshminarasimhan, D.; Sun, Y.; Guo, H.-C. Structural Insights into the Molecular Ruler Mechanism of the Endoplasmic Reticulum Aminopeptidase ERAP1. Sci. Rep. 2011, 1.

(136) Nguyen, T. T.; Chang, S.-C.; Evnouchidou, I.; York, I. A.; Zikos, C.; Rock, K. L.; Goldberg, A. L.; Stratikos, E.; Stern, L. J. Structural Basis for Antigenic Peptide Precursor Processing by the Endoplasmic Reticulum Aminopeptidase ERAP1. Nat. Struct. Mol. Biol. 2011, 18, 604–613.

(137) McGowan, S.; Porter, C. J.; Lowther, J.; Stack, C. M.; Golding, S. J.; Skinner-Adams, T. S.; Trenholme, K. R.; Teuscher, F.; Donnelly, S. M.; Grembecka, J.; Mucha, A.; Kafarski, P.; DeGori, R.; Buckle, A. M.; Gardiner, D. L.; Whisstock, J. C.; Dalton, J. P. Structural Basis for the Inhibition of the Essential Plasmodium Falciparum M1 Neutral Aminopeptidase. Proc. Natl. Acad. Sci. 2009, 106, 2537–2542.

(138) Chen, L.; Lin, Y.-L.; Peng, G.; Li, F. Structural Basis for Multifunctional Roles of Mammalian Aminopeptidase N. Proc. Natl. Acad. Sci. 2012, 109, 17966–17971.

Page 65: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

65

(139) Grzywa, R.; Sokol, A. M.; Sieńczyk, M.; Radziszewicz, M.; Kościołek, B.; Carty, M. P.; Oleksyszyn, J. New Aromatic Monoesters of Α-Aminoaralkylphosphonic Acids as Inhibitors of Aminopeptidase N/CD13. Bioorg. Med. Chem. 2010, 18, 2930–2936.

Page 66: Computational Methods in Medicinal Chemistry - DiVA …844370/FULLTEXT01.pdfThese two applications of computational medicinal chemistry is the focus of this thesis. ... Organic Pharmaceutical

Acta Universitatis UpsaliensisDigital Comprehensive Summaries of Uppsala Dissertationsfrom the Faculty of Pharmacy 201

Editor: The Dean of the Faculty of Pharmacy

A doctoral dissertation from the Faculty of Pharmacy, UppsalaUniversity, is usually a summary of a number of papers. A fewcopies of the complete dissertation are kept at major Swedishresearch libraries, while the summary alone is distributedinternationally through the series Digital ComprehensiveSummaries of Uppsala Dissertations from the Faculty ofPharmacy. (Prior to January, 2005, the series was publishedunder the title “Comprehensive Summaries of UppsalaDissertations from the Faculty of Pharmacy”.)

Distribution: publications.uu.seurn:nbn:se:uu:diva-259443

ACTAUNIVERSITATIS

UPSALIENSISUPPSALA

2015