revising the topliss decision tree

Revising the Topliss decision tree …based on 30 years of medicinal chemistry literature. Noel O’Boyle and Roger Sayle, NextMove Software; Jonas Boström, AstraZeneca. 248th ACS National Meeting, Aug 2014

Upload: nextmove-software

Post on 10-May-2015

DESCRIPTION

Presented at the 248th ACS National Meeting, San Francisco, MEDI Division

TRANSCRIPT

Page 1: Revising the Topliss Decision Tree

Revising the Topliss decision tree

…based on 30 years of medicinal chemistry literature

Noel O’Boyle and Roger Sayle NextMove Software

Jonas Boström AstraZeneca

248th ACS National Meeting Aug 2014

Page 2: Revising the Topliss Decision Tree

http://www.acsmedchem.org/topliss.html

Page 3: Revising the Topliss Decision Tree

Topliss Tree for Substituted Phenyl

Topliss, J. G. Utilization of Operational Schemes for Analog Synthesis in Drug Design. J. Med. Chem. 1972, 15, 1006–1011.

Page 4: Revising the Topliss Decision Tree

Features of the Topliss Tree

• Maximize the chances of synthesizing the most potent compound in the series as soon as possible

• Based on inferring Hansch structure-activity relationship from relative potencies of R groups

– Electronic (σ), hydrophobic (π), steric (Es)

• General scheme

– for any target

– for any scaffold

Page 5: Revising the Topliss Decision Tree

ChEMBL Bioactivity database

• July 2008 - ChEMBL established with Wellcome Trust grant

– John Overington, EMBL-EBI

• Open access source of bioactivity data abstracted from the literature

– Chemical structures, activity values, activity type, assay description, journal article name, target

– www.ebi.ac.uk/chembl/

Gaulton et al. Nucleic Acids Res. 2012, 40, D1100

Page 6: Revising the Topliss Decision Tree

ChEMBL Bioactivity database

• ChEMBL 19 – July 2014

– 57k papers

• 94% from Bioorg. Med. Chem. Lett., J. Med. Chem., J. Nat. Prod., Bioorg. Med. Chem., Eur. J. Med. Chem., Antimicrob. Agents Chemother., Med. Chem. Res.

– 1.4 million compounds with 12 million activities

– 1.1 million assays against 10k targets

Page 7: Revising the Topliss Decision Tree

[Figure: Number of articles extracted from a particular year. x-axis: Year (1977 to 2012); y-axis: Count (0 to 5000).]

Page 8: Revising the Topliss Decision Tree

Matched (Molecular) series

• Recent concept in cheminformatics (*)

– … not so recent in medicinal chemistry

• Series of structural analogs

– same scaffold

– different R groups at a single position

* “Matching molecular series” introduced by Wawer and Bajorath J. Med. Chem. 2011, 54, 2944

Page 9: Revising the Topliss Decision Tree

Matched Series of length 3

[Cl, F, NH2]

Page 10: Revising the Topliss Decision Tree

Matched Series of length 3

[4-Cl-Ph, 4-F-Ph, 4-NH2-Ph]

Page 11: Revising the Topliss Decision Tree

Ordered Matched Series

[4-Cl-Ph > 4-F-Ph > 4-NH2-Ph]

3.5

2.1

1.6

pIC50
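
As a rough illustration of how an ordered matched series like the one above could be assembled, the sketch below groups records by scaffold and sorts each group by decreasing pIC50. The record layout, the `scaffold-1` label and the function name are hypothetical, and pairing the three pIC50 values with the R groups in the order shown is an assumption. Scaffold and R group are given directly here; no fragmentation of real molecules is attempted.

```python
from collections import defaultdict

# Hypothetical records: (scaffold, R group, pIC50). A real pipeline would
# derive scaffold and R group by fragmenting molecules; here they are given.
records = [
    ("scaffold-1", "4-F-Ph", 2.1),
    ("scaffold-1", "4-NH2-Ph", 1.6),
    ("scaffold-1", "4-Cl-Ph", 3.5),
]


def ordered_matched_series(records, min_length=2):
    """Group records by scaffold and order each group by decreasing pIC50."""
    by_scaffold = defaultdict(list)
    for scaffold, r_group, pic50 in records:
        by_scaffold[scaffold].append((r_group, pic50))

    series = {}
    for scaffold, members in by_scaffold.items():
        if len(members) >= min_length:
            members.sort(key=lambda m: m[1], reverse=True)
            series[scaffold] = [r_group for r_group, _ in members]
    return series


series = ordered_matched_series(records)
print(" > ".join(series["scaffold-1"]))
```

The `min_length` parameter is an illustrative way to discard scaffolds with too few analogs to form a series.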

Page 12: Revising the Topliss Decision Tree

[Figure: Matched series in ChEMBL19 IC50 binding assays. x-axis: Series length (2 to 50); y-axis: Frequency (log scale, 1 to 1,000,000).]

Length 2: 240,967 212,494
Length 3: 59,753 52,666
Length 4: 27,779 24,306
Length 5: 15,892 13,834
Length 6: 10,619 9,203

Method described in O’Boyle, Boström, Sayle, Gill. Using Matched Molecular Series as a Predictive Tool To Optimize Biological Activity. J. Med. Chem. 2014, 57, 2704.

(ChEMBL16)

Page 13: Revising the Topliss Decision Tree

[Figure: R Groups sorted by frequency. y-axis: Frequency (0 to 9000).]

Analysis of the 16268 matched series containing at least 4 substituted phenyls

Page 14: Revising the Topliss Decision Tree

Find R Groups that increase activity

Query: A > B

Matching series:
A > B > C
C > A > B
D > A > B > C
D > A > C > B
E > D > A > B
…

R Group   Observations   Obs that increase activity   % that increase activity
D         3              3                            100
E         1              1                            100
C         4              1                            25
…         …              …

O’Boyle, Boström, Sayle, Gill. Using Matched Molecular Series as a Predictive Tool To Optimize Biological Activity. J. Med. Chem. 2014, 57, 2704.
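
The counting behind this slide can be sketched in a few lines. This is a toy illustration using only the five series listed above, not the Matsy implementation. The function name and data layout are assumptions, and "increases activity" is taken to mean "ranks above the most active group in the query".

```python
from collections import Counter

# The five ordered series listed on the slide, most active R group first.
database = [
    ["A", "B", "C"],
    ["C", "A", "B"],
    ["D", "A", "B", "C"],
    ["D", "A", "C", "B"],
    ["E", "D", "A", "B"],
]


def tally_improvements(query, database):
    """For each R group outside the query, count how often it is observed
    and how often it is more active than the best group in the query."""
    observations, improvements = Counter(), Counter()
    for series in database:
        if not all(group in series for group in query):
            continue
        positions = [series.index(group) for group in query]
        if positions != sorted(positions):
            continue  # this series orders the query groups differently
        best = positions[0]
        for rank, group in enumerate(series):
            if group in query:
                continue
            observations[group] += 1
            if rank < best:
                improvements[group] += 1
    return {
        group: (observations[group], improvements[group],
                100 * improvements[group] // observations[group])
        for group in observations
    }


table = tally_improvements(["A", "B"], database)
for group, row in sorted(table.items(), key=lambda kv: (-kv[1][2], kv[0])):
    print(group, *row)
```

For the five series shown, the tally gives the same observation counts and percentages as the table on the slide.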

Page 15: Revising the Topliss Decision Tree

Example

Page 16: Revising the Topliss Decision Tree

Example II

Page 17: Revising the Topliss Decision Tree

Topliss Decision Tree

Page 18: Revising the Topliss Decision Tree

Topliss Decision Tree

Page 19: Revising the Topliss Decision Tree

Topliss Decision Tree

Page 20: Revising the Topliss Decision Tree

Topliss Decision Tree

(1st if lower cutoff)

Page 21: Revising the Topliss Decision Tree

Topliss Decision Tree

Page 22: Revising the Topliss Decision Tree

Topliss Decision Tree

(20th)

Page 23: Revising the Topliss Decision Tree

Topliss Decision Tree

Page 24: Revising the Topliss Decision Tree

Topliss Decision Tree

(21st) “Assuming that the –σ effect is the most probable explanation…”

Page 25: Revising the Topliss Decision Tree

Topliss Decision Tree

Page 26: Revising the Topliss Decision Tree

Topliss Decision Tree

Page 27: Revising the Topliss Decision Tree

Topliss Decision Tree

Page 28: Revising the Topliss Decision Tree

Topliss Decision Tree

*

Page 29: Revising the Topliss Decision Tree

Matsy Decision Tree

Page 30: Revising the Topliss Decision Tree

Matsy Decision Tree (Take II)

Page 31: Revising the Topliss Decision Tree

Target specific subsets

4-Cl > H

Everything Kinases Class A GPCRs

Page 32: Revising the Topliss Decision Tree

Account for Lipophilic Efficiency

• ΔLiPE = ΔpIC50 – ΔLogP

• The “%>” value is based on the number of times a particular R group has greater pIC50

– i.e. ΔpIC50 > 0

• Redefine it to only include cases where the increase in pIC50 was larger than any increase in LogP

– i.e. ΔpIC50 > 0 and ΔLiPE > 0
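
A minimal sketch of the two definitions of "%>" follows. The (ΔpIC50, ΔLogP) values are made up for illustration and the function names are hypothetical. Keeping all observations in the denominator under the stricter definition is also an assumption.

```python
def delta_lipe(delta_pic50, delta_logp):
    """ΔLiPE = ΔpIC50 - ΔLogP, as defined on the slide."""
    return delta_pic50 - delta_logp


def percent_better(pairs, require_lipe_gain=False):
    """Percentage of (ΔpIC50, ΔLogP) observations counted as improvements.

    By default an observation counts when ΔpIC50 > 0. With
    require_lipe_gain=True it must also have ΔLiPE > 0.
    """
    if not pairs:
        return 0.0
    hits = 0
    for d_pic50, d_logp in pairs:
        better = d_pic50 > 0
        if require_lipe_gain:
            better = better and delta_lipe(d_pic50, d_logp) > 0
        hits += better
    return 100.0 * hits / len(pairs)


# Made-up observations for one R-group replacement: (ΔpIC50, ΔLogP).
observations = [(0.75, 0.5), (0.5, 0.75), (-0.25, 0.5), (1.0, 0.5)]

print(percent_better(observations))
print(percent_better(observations, require_lipe_gain=True))
```

In this toy data the second observation gains potency but gains more LogP, so it counts under the original definition and not under the stricter one.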

Page 33: Revising the Topliss Decision Tree

4-Cl > H

ΔLiPE > 0

Page 34: Revising the Topliss Decision Tree

ΔLiPE > 0

3,4-diCl > 4-Cl > H

Page 35: Revising the Topliss Decision Tree

Data-driven approach

• Not limited to the two trees in the Topliss paper

• All predictions backed by experimental data

– Can drill down into the data and look at targets and scaffolds

– Can restrict the experimental data used to particular targets, or use in-house data rather than ChEMBL

• Does not explain why, only that it happens

Page 36: Revising the Topliss Decision Tree

Conclusions

• In the main, the Topliss Tree is supported by published data

– Largest difference is recommendation of 4-OMe rather than 4-OH

– Suggestion of 4-CF3 is also problematic

• We have generated the corresponding ‘Matsy Tree’ derived from experimental data

Page 37: Revising the Topliss Decision Tree

drag-and-drop interface to Matsy

Page 38: Revising the Topliss Decision Tree

Revising the Topliss decision tree …based on 30 years of medicinal chemistry literature

Using Matched Molecular Series as a Predictive Tool To Optimize Biological Activity. J. Med. Chem. 2014, 57, 2704.

Want to hear more? Poster COMP 394, Tuesday 6:00-8:00pm, Marriott Marquis

Interested in an evaluation copy of Matsy? Come by our booth
