Molecular Design: One step back and two paths forward

Peter W Kenny

Some things that are hurting Pharma

• Having to exploit targets that are less well-linked to human disease

• Inability to predict idiosyncratic toxicity

• Inability to measure free (unbound) physiological concentrations of drug for remote targets (e.g. intracellular or within blood brain barrier)

Dans la merde: http://fbdd-lit.blogspot.com/2011/09/dans-la-merde.html

Keep an eye out for creative data analysis

Preparation of data sets

[Figure: preparation of data sets A and B. Labels: points plotted at constant increment; equal numbers of points for each value of x; Normally-distributed noise added]

Data set A: Fit median value of Y to X

r2 = 0.99; RMSE = 0.36

An example of this approach to plotting data can be seen in Leeson & Springthorpe, The influence of drug-like concepts on decision-making in medicinal chemistry. Nat. Rev. Drug Discov. 2007, 6, 881-890.

Data set B: Use value of X to split into three equally-sized groups (Low, Medium, High) and show mean and associated confidence interval for each

An example of this approach to analysing data can be seen in: Gleeson, Generation of a Set of Simple, Interpretable ADMET Rules of Thumb. J. Med. Chem. 2008, 51, 817-834.

What data set A really looks like

Fit to original data: N=11000; r2 = 0.09; RMSE = 9.95

Fit to transformed data: N=11; r2 = 0.99; RMSE = 0.36

[Figure: percentile plot of the original data showing the 10%, 25%, 50%, 75% and 90% percentiles (see Colclough et al, BMC 2008, 16, 6611-6616), with a residual plot for the fit to the original data]
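A minimal sketch of how fitting medians rather than individual points can inflate a correlation. The slope, noise level, seed and function names are assumptions chosen only to resemble the set-up described here (11 values of X with 1000 points each); they are not taken from the original analysis.

```python
import numpy as np


def fit_stats(x, y):
    """Return (r2, rmse) for an ordinary least-squares line y = a*x + b."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    rmse = np.sqrt(np.mean(residuals ** 2))
    return float(r2), float(rmse)


def make_dataset(n_levels=11, n_per_level=1000, slope=1.0, noise_sd=10.0, seed=0):
    """Assumed set-up: equal numbers of points at each X, plus Normal noise."""
    rng = np.random.default_rng(seed)
    x = np.repeat(np.arange(n_levels, dtype=float), n_per_level)
    y = slope * x + rng.normal(0.0, noise_sd, size=x.size)
    return x, y


def medians_by_level(x, y):
    """Collapse the data to one median Y value per distinct X value."""
    levels = np.unique(x)
    medians = np.array([np.median(y[x == level]) for level in levels])
    return levels, medians


x, y = make_dataset()
levels, medians = medians_by_level(x, y)

r2_raw, rmse_raw = fit_stats(x, y)
r2_med, rmse_med = fit_stats(levels, medians)

print(f"original data:    N={x.size}, r2={r2_raw:.2f}, RMSE={rmse_raw:.2f}")
print(f"transformed data: N={levels.size}, r2={r2_med:.2f}, RMSE={rmse_med:.2f}")
```

Under these assumed settings the fit to the individual points has a small r2, while the fit to the 11 medians has r2 close to 1, because taking medians discards almost all of the variation in Y.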

What data set B really looks like

Fit to original data: N=10000; r2 = 0.08; RMSE = 10.0

[Figure: mean values of Y for the Low, Medium and High groups, with (barely visible) confidence intervals shown alongside standard deviations; residual plot for fit to original data against x]
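A sketch of why confidence intervals on group means can look tight even when the underlying data are very noisy. The data, the three-way split by rank of X and the Normal-approximation interval (1.96 × SE) are assumptions made for illustration, not the settings used for data set B.

```python
import numpy as np


def binned_summary(x, y, n_bins=3):
    """Split points into equally-sized groups by rank of x.

    Returns one (mean, sd, ci_half_width) tuple per group, where the
    half-width is a Normal-approximation 95% interval for the mean.
    """
    order = np.argsort(x)
    rows = []
    for idx in np.array_split(order, n_bins):
        values = y[idx]
        mean = float(values.mean())
        sd = float(values.std(ddof=1))
        ci_half = 1.96 * sd / np.sqrt(values.size)
        rows.append((mean, sd, float(ci_half)))
    return rows


# Assumed data: X at constant increment, Y = X plus Normal noise (SD = 10).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 10000)
y = x + rng.normal(0.0, 10.0, size=x.size)

summary = binned_summary(x, y)
for label, (mean, sd, ci_half) in zip(("Low", "Medium", "High"), summary):
    print(f"{label:6s} mean={mean:6.2f}  SD={sd:5.2f}  95% CI = +/-{ci_half:.2f}")
```

The interval on each mean shrinks with the square root of the group size, so with thousands of points per group it says little about how well X predicts Y for an individual compound; the standard deviation is the more relevant measure of that.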

Octanol was the first mistake...

Introduction to partition coefficients

[Figure labels: Lipophilic & half ionised; Hydrophilic]

An alternative view of the Rule of 5

[Figure labels: Polarity; NClogP ≤ 5; Acc ≤ 10; Don ≤ 5]

Does octanol/water ‘see’ hydrogen bond donors?

[Figure: structures annotated with the values --0.06, -0.23, -0.24, --1.01, -0.66 and --1.05]

Sangster lab database of octanol/water partition coefficients: http://logkow.cisti.nrc.ca/logkow/index.jsp

Octanol/water is not the only partitioning system

[Figure: three structures with their octanol/water and alkane/water logP values]

logPoct = 2.1; logPalk = 1.9; ΔlogP = 0.2

logPoct = 1.5; logPalk = -0.8; ΔlogP = 2.3

logPoct = 2.5; logPalk = -1.8; ΔlogP = 4.3

Differences in octanol/water and alkane/water logP values reflect hydrogen bonding between solute and octanol

Toulmin et al, J. Med. Chem. 2008, 51, 3720-3730
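The difference between the two logP values is simply logPoct minus logPalk. A small sketch of that arithmetic using the three pairs of values quoted on this slide; the function name is illustrative.

```python
def delta_log_p(log_p_oct, log_p_alk):
    """Difference between octanol/water and alkane/water logP."""
    return log_p_oct - log_p_alk


# (logPoct, logPalk) pairs as quoted on the slide
pairs = [(2.1, 1.9), (1.5, -0.8), (2.5, -1.8)]

# Round to one decimal place to match the precision of the quoted values
deltas = [round(delta_log_p(oct_value, alk_value), 1) for oct_value, alk_value in pairs]

for (oct_value, alk_value), delta in zip(pairs, deltas):
    print(f"logPoct={oct_value:5.1f}  logPalk={alk_value:5.1f}  difference={delta:4.1f}")
```

A larger difference means the solute is favoured much more strongly by octanol than by an alkane, which is the behaviour the slide attributes to hydrogen bonding between solute and octanol.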

Polar Surface Area is not predictive of hydrogen bond strength

ΔlogP = 0.5; PSA/Å² = 48

ΔlogP = 4.3; PSA/Å² = 22

Measured values of ΔlogP

[Figure: structures annotated with measured ΔlogP values of 1.0, 1.1, 0.8, 1.3, 1.7, 0.8, 1.5, 1.6 and 1.1]

Prediction of contribution of acceptors to ΔlogP

[Figure: two plots of ΔlogP (corrected) against Vmin/(Hartree/electron); panels: N or ether O; Carbonyl O]

ΔlogP = ΔlogP0 × exp(-k × Vmin)
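A sketch of how the exponential relationship above could be evaluated and fitted. The parameter values and Vmin values are hypothetical placeholders, not fitted values from the cited work, and the log-linear least-squares fit is one possible approach rather than the method actually used.

```python
import numpy as np


def predict_delta_log_p(v_min, delta_log_p0, k):
    """Evaluate the exponential model: delta_log_p0 * exp(-k * v_min)."""
    return delta_log_p0 * np.exp(-k * np.asarray(v_min, dtype=float))


def fit_parameters(v_min, delta_log_p):
    """Estimate (delta_log_p0, k) by a straight-line fit of ln(value) against v_min.

    Taking logs of the model gives ln(value) = ln(delta_log_p0) - k * v_min.
    """
    slope, intercept = np.polyfit(v_min, np.log(delta_log_p), 1)
    return float(np.exp(intercept)), float(-slope)


# Hypothetical parameters and electrostatic potential minima (Hartree/electron)
true_d0, true_k = 0.1, 30.0
v_min = np.array([-0.10, -0.09, -0.08, -0.07, -0.06])

# Noise-free synthetic values generated from the model itself
observed = predict_delta_log_p(v_min, true_d0, true_k)

fit_d0, fit_k = fit_parameters(v_min, observed)
print(f"recovered prefactor = {fit_d0:.3f}, recovered k = {fit_k:.1f}")
```

Because Vmin is negative for a hydrogen bond acceptor, a positive k makes the predicted contribution grow as the potential minimum becomes more negative.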

Lipophilicity/polarity of Morphine & Heroin

Morphine: logPoct = 0.89; predicted logPalk = -4.2; PSA/Å² = 53

Heroin: logPoct = 1.58; predicted logPalk = -1.4; PSA/Å² = 65


Prediction of blood/brain partitioning

[Figure: three plots of log(Cbrain/Cblood) against logPoct, logPhxd and ΔlogP]

R2 = 0.66; RMSE = 0.54

R2 = 0.82; RMSE = 0.39

R2 = 0.88; RMSE = 0.32


Alkane/water partition coefficients: Where next?

Difficulties in measuring logPalk: many compounds poorly soluble in alkanes; self-association masks polarity

General access to logPalk likely to require predictive models for some time

Carefully measure logPalk for structurally diverse compounds

Solvation models: logPalk easier to measure than ΔG(g→aq)

Another way to look at SAR

(Descriptor-based) QSAR/QSPR: Some questions

• How valid is methodology (especially for validation) when distribution of compounds in training/test space is highly non-uniform?

• Are models predicting activity or locating neighbours?

• Are ‘global’ models ensembles of local models?

• How well do the methods handle ‘activity cliffs’?

• How should we account for sizes of descriptor pools when comparing models?

Measures of Diversity & Coverage

2-Dimensional representation of chemical space is used here to illustrate concepts of diversity and coverage. Stars indicate compounds selected to sample this region of chemical space. In this representation, similar compounds are close together.

Neighborhoods and library design

Examples of relationships between structures

Tanimoto coefficient (foyfi) for structures is 0.90

Ester is methyl-substituted acid; amides are ‘reversed’
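For reference, a sketch of how a Tanimoto coefficient is computed from two fingerprints represented as sets of "on" bits. The bit sets below are hypothetical and chosen only to give a value of 0.90; they are not foyfi fingerprints for any real structures.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen here for two empty fingerprints
        return 0.0
    return len(a & b) / union


# Hypothetical fingerprints: 18 bits in common, 20 bits set in total
fingerprint_1 = set(range(0, 19))   # bits 0..18
fingerprint_2 = set(range(1, 20))   # bits 1..19

similarity = tanimoto(fingerprint_1, fingerprint_2)
print(f"Tanimoto coefficient = {similarity:.2f}")
```

A high coefficient says that two structures share most fingerprint features; it does not say what the structural relationship between them is, which is the point of the examples on this slide.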

Leatherface molecular editor

From chain saw to Matched Molecular Pairs

c-[A;!R]
bnd 1 2

c-Br
cul 2
hyd 1 1

[nX2]1c([OH])cccc1
hyd 1 1
hyd 3 -1
bnd 2 3 2

Kenny & Sadowski, Structure modification in chemical databases. Methods and Principles in Medicinal Chemistry (Chemoinformatics in Drug Discovery) 2005, 23, 271-285.

Glycogen Phosphorylase inhibitors: Series comparison

ΔpIC50        ΔlogFu        ΔlogS
0.38 (0.06)   -0.30 (0.06)  -0.29 (0.13)
0.21 (0.06)   0.13 (0.04)   0.20 (0.09)
0.29 (0.07)   -0.42 (0.08)  -0.62 (0.13)

Standard errors in mean values shown in parentheses; see Birch et al, BMCL 2009, 19, 850-853

Effect of bioisosteric replacement on plasma protein binding

Date of Analysis   N    ΔlogFu   SE     SD     %Increase
2003               7    -0.64    0.09   0.23   0
2008               12   -0.60    0.06   0.20   0

Mining the PPB database for carboxylate/tetrazole pairs suggested that bioisosteric replacement would lead to a decrease in Fu, so the tetrazoles were not synthesised.

Birch et al, BMCL 2009, 19, 850-853

Effect of amide N-methylation on aqueous solubility is dependent on substructural context

Amide                       N     ΔlogS   SE     SD     %Increase
Acyclic (aliphatic amine)   109   0.59    0.07   0.71   76
Cyclic                      9     0.18    0.15   0.47   44
Benzanilides                9     1.49    0.25   0.76   100

Birch et al, BMCL 2009, 19, 850-853
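A sketch of how the columns in a matched molecular pair table of this kind could be computed from the per-pair differences. The per-pair values are hypothetical, and reading %Increase as the percentage of pairs in which the property goes up is an assumption; the final lines only check that the quoted SE for the acyclic amides agrees with SD divided by the square root of N.

```python
import math
import statistics


def summarise_pairs(deltas):
    """Summarise per-pair property differences for one structural transformation."""
    n = len(deltas)
    mean = statistics.mean(deltas)
    sd = statistics.stdev(deltas)             # sample standard deviation
    se = sd / math.sqrt(n)                    # standard error of the mean
    pct_increase = 100.0 * sum(d > 0 for d in deltas) / n
    return {"N": n, "mean": mean, "SE": se, "SD": sd, "pct_increase": pct_increase}


# Hypothetical differences in logS for five matched pairs
example_deltas = [0.5, 1.0, 1.5, -0.5, 2.0]
summary = summarise_pairs(example_deltas)
print(summary)

# Consistency check against the quoted row for acyclic amides: SD = 0.71, N = 109
se_acyclic = round(0.71 / math.sqrt(109), 2)
print(f"SE implied by SD and N for acyclic amides = {se_acyclic}")
```

Reporting SD alongside SE matters here: SE describes how well the mean effect is known, while SD describes how much the effect varies from one pair to the next.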

Relationships between structures

• Discover new bioisosteres

• Prediction of activity & properties: direct prediction (e.g. look up substituent effects) or indirect prediction (e.g. apply correction to existing model)

• Recognise extreme data: bad measurement or interesting effect?

Conclusions

• Data can be massaged and correlations can be enhanced but it won’t extract us from ‘la merde’

• There is life beyond octanol/water if we choose to look for it

• Even molecules can have meaningful relationships

Selected references

• Seiler (1974) Interconversion of lipophilicities from hydrocarbon/water systems into the octanol/water system. Eur. J. Med. Chem. 9, 473-479.

• Toulmin, Wood & Kenny (2008) Toward Prediction of Alkane/Water Partition Coefficients. J. Med. Chem. 51, 3720-3730. http://dx.doi.org/10.1021/jm701549s

• Kenny & Sadowski (2005) Structure modification in chemical databases. Methods and Principles in Medicinal Chemistry 23 (Chemoinformatics in Drug Discovery), 271-285. http://dx.doi.org/10.1002/3527603743.ch11

• Leach et al (2006) Matched Molecular Pairs as a Guide in the Optimization of Pharmaceutical Properties; a Study of Aqueous Solubility, Plasma Protein Binding and Oral Exposure. J. Med. Chem. 49, 6672-6682. http://dx.doi.org/10.1021/jm0605233

• Birch et al (2009) Matched molecular pair analysis of activity and properties of glycogen phosphorylase inhibitors. Bioorg. Med. Chem. Lett. 19, 850-853. http://dx.doi.org/10.1016/j.bmcl.2008.12.003

• Wassermann, Wawer & Bajorath (2010) Activity Landscape Representations for Structure−Activity Relationship Analysis. J. Med. Chem. 53, 8209-8223. http://dx.doi.org/10.1021/jm100933w
