in silico virtual screening for drug...
TRANSCRIPT
In silico virtual screening for drug discovery
Jean-Philippe [email protected]
Mines ParisTech - INSERM - Institut Curie
"Mathematics and Industry" workshop, IHES, Bures-sur-Yvette,France, April 23-25, 2009
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 1 / 23
Drug discovery
A long, expensive and risky processOn average 15 years and $800 millionsHigh attrition rate: for 10,000 molecules tested, 10 make it toclinicals, 1 to the market.>70% of the costs are wasted on failures
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 2 / 23
Computational approaches
The use of computers and computational methods permeates allaspects of drug discovery today, in particular for:
Target identificationStructure prediction, virtual screening (docking)Prediction of drug-likeliness of compounds
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 3 / 23
Collaboration with the industry
PracticallyDirect conctracting (leveraged through Institut Carnot M.I.N.E.S.)European projectsPôles de compétitivité
FeaturesLarge pharmaceutical or small biotech companiesShared PI or subcontractingCultural shock math/info vs bio/chemistry
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 4 / 23
Virtual screening and predictive models
ObjectiveBuild models to predict biochemical properties Y of small moleculesfrom their structures X , using a training set of (X , Y ) pairs.
Structures X
C15H14ClN3O3
Properties Ybinding to a therapeutic target,pharmacokinetics (ADME),toxicity...
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 5 / 23
Example : ligand-Based Virtual Screening
inactive
active
active
active
inactive
inactive
NCI AIDS screen results (from http://cactus.nci.nih.gov).Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 6 / 23
Formalization
The problemGiven a set of training instances (x1, y1), . . . , (xn, yn), where xi ’sare graphs and yi ’s are continuous or discrete variables of interest,Estimate a function
y = f (x)
where x is any graph to be labeled.This is a classical regression or pattern recognition problem overthe set of graphs.
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 7 / 23
Classical approaches
Two steps1 Map each molecule to a vector of fixed dimension using molecular
descriptorsGlobal properties of the molecules (mass, logP...)2D and 3D descriptors (substructures, fragments, ....)
2 Apply an algorithm for regression or pattern recognition.PLS, ANN, ...
Example: 2D structural keys
O
N
O
O
OO
N N N
O O
O
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 9 / 23
Which descriptors?
O
N
O
O
OO
N N N
O O
O
DifficultiesMany descriptors are needed to characterize various features (inparticular for 2D and 3D descriptors)But too many descriptors are harmful for memory storage,computation speed, statistical estimation
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 10 / 23
Kernels
DefinitionLet Φ(x) = (Φ1(x), . . . ,Φp(x)) be a vector representation of themolecule xThe kernel between two molecules is defined by:
K (x , x ′) = Φ(x)>Φ(x ′) =
p∑i=1
Φi(x)Φi(x ′) .
φX H
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 11 / 23
The kernel trick
φX H
The trick1 Any positive definite function is a valid kernel (i.e., inner product
after mapping the molecules to some Hilbert space), e.g.,
K (x , x ′) = exp(−γ‖ x − x ′ ‖2
).
2 Many linear algorithms for regression or pattern recognition canbe expressed only in terms of kernels.
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 12 / 23
Example: Support Vector Machine
φ
minimizen∑
i=1
n∑j=1
αiαjK (xi , xj)−n∑
i=1
αi
subject to 0 ≤ αi ≤ C, i = 1, . . . , n ,n∑
i=1
αiyi = 0 .
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 13 / 23
Making kernels for molecules
Strategy 1: use well-known molecular descriptors to representmolecules m as vectors Φ(m), and then use kernels for vectors,e.g.:
K (m1, m2) = Φ(m1)>Φ(m2).
Strategy 2: invent new kernels to do things you can not do withstrategy 1, such as using an infinite number of descriptors. We willnow see two examples of this strategy, extending 2D and 3Dmolecular descriptors.
φX H
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 14 / 23
Example: 2D fragment kernel
. . . . . .C NCCON
C C NO C
CO C
CC CCN C
NO CC C C
CC CC C C
CN CC C C
N
O
O
O N
O
C. . . . . . . . .
φd(x) is the vector of counts of all fragments of length d :
φ1(x) = ( #(C),#(O),#(N), ...)>
φ2(x) = ( #(C-C),#(C=O),#(C-N), ...)> etc...
The 2D fragment kernel is defined, for λ < 1, by
Kfragment(x , x ′) =∞∑
d=1
r(λ)φd(x)>φd(x ′) .
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 15 / 23
2D kernel computation trick
Kfragment can be computed efficiently for various weights r(λ)although the feature space has infinite dimension.Rephrase the kernel computation as that as counting the numberof walks on a graph (the product graph)
G1 x G2
c
d e43
2
1 1b 2a 1d
1a 2b
3c
4c
2d
3e
4e
G1 G2
a b
The infinite counting can be factorized
λA + λ2A2 + λ3A3 + . . . = (I − λA)−1 − I .
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 16 / 23
Experiments
MUTAG datasetaromatic/hetero-aromatic compoundshigh mutagenic activity /no mutagenic activity, assayed inSalmonella typhimurium.188 compouunds: 125 + / 63 -
Results10-fold cross-validation accuracy
Method AccuracyProgol1 81.4%2D kernel 91.2%
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 17 / 23
Example: 2D subtree kernel
.
.
.
.
.
.
.
.
.N
N
C
CO
C
.
.
. C
O
C
N
C
N O
C
N CN C C
N
N
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 18 / 23
2D Subtree vs fragment kernels (Mahé and V, 2007)
7072
7476
7880
AU
CWalksSubtrees
CC
RF
−C
EM
HL−
60(T
B)
K−
562
MO
LT−
4R
PM
I−82
26 SR
A54
9/A
TC
CE
KV
XH
OP
−62
HO
P−
92N
CI−
H22
6N
CI−
H23
NC
I−H
322M
NC
I−H
460
NC
I−H
522
CO
LO_2
05H
CC
−29
98H
CT
−11
6H
CT
−15
HT
29K
M12
SW
−62
0S
F−
268
SF
−29
5S
F−
539
SN
B−
19S
NB
−75
U25
1LO
X_I
MV
IM
ALM
E−
3M M14
SK
−M
EL−
2S
K−
ME
L−28
SK
−M
EL−
5U
AC
C−
257
UA
CC
−62
IGR
−O
V1
OV
CA
R−
3O
VC
AR
−4
OV
CA
R−
5O
VC
AR
−8
SK
−O
V−
378
6−0
A49
8A
CH
NC
AK
I−1
RX
F_3
93S
N12
CT
K−
10U
O−
31P
C−
3D
U−
145
MC
F7
NC
I/AD
R−
RE
SM
DA
−M
B−
231/
AT
CC
HS
_578
TM
DA
−M
B−
435
MD
A−
NB
T−
549
T−
47D
Screening of inhibitors for 60 cancer cell lines (from Mahé and V.,2008)
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 19 / 23
Example: 3D pharmacophore kernel (Mahé et al.,2005)
O
O
2
d1
d3
d
O
O
2
d1
d3
d
K (x , y) =∑
px∈P(x)
∑py∈P(y)
exp (−γd (px , py )) .
Results (accuracy)Kernel BZR COX DHFR ER2D (Tanimoto) 71.2 63.0 76.9 77.13D fingerprint 75.4 67.0 76.9 78.63D not discretized 76.4 69.8 81.9 79.8
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 20 / 23
Challenges in ligand-based virtual screening
C15H14ClN3O3
How to embed molecules in a Hilbert space through a positivedefinite function K (x , x ′) in a computationally efficient way (e.g.,complete graph kernels are at least as hard as the graphisomorphism problem)?How to embed flexible 3D structures?How to extend machine learning algorithms to non-positivedefinite functions (e.g., edit distance between graphs)?
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 21 / 23
Towards chemogenomics
The problemSimilar targets bind similar ligandsInstead of focusing on each target individually, can we screen thebiological space (target families) vs the chemical space (ligands)?Mathematically, learn f (target , ligand) ∈ {bind , notbind} : this isan instance of operator inference.
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 22 / 23
Conclusion
Computational approaches have the potential to increase theglobal R&D productivity in pharmaceutical industry, in particular tospeed up the identification of targets, hits, and decrease theattrition rate.Collaboration with the industry is needed to access largedatabases of molecules and impact drug discovery.Statistics and machine learning are natural approaches forligand-based virtual screening approachesMolecules are complex objects, which can be represented asformula, structures, shapes. Applying state-of-the-art machinelearning methods to these representations requires mathematicaland computational developments.
Jean-Philippe Vert (ParisTech) In silico virtual screening IHES 23 / 23