125:583 Molecular Modeling I
Prof. William Welsh
November 2, 2006
Norman H. Edelman Professor in Bioinformatics
Department of Pharmacology
Robert Wood Johnson Medical School
University of Medicine & Dentistry of New Jersey (UMDNJ)
& Director, The UMDNJ Informatics Institute
675 Hoes Lane
Piscataway, NJ 08854
Applying the Drug Discovery Paradigm to Biomaterials
Bill Welsh
Robert Wood Johnson Medical School & Informatics Institute
University of Medicine & Dentistry of New Jersey (USA)
Some Advanced Medical Applications of Implant Materials
Tissue Engineering
- requires degradable (bioactive) materials as temporary scaffolds for tissue remodeling
- requires materials that elicit controllable and predictable cellular responses
Implantable Drug Delivery Systems and Degradable Temporary Support Devices
- require fine-tuning of multiple sets of properties
We don’t have the right materials
The material base of the medical device industry is outmoded
- The industry currently relies on industrial plastics from the 1940s and 1950s
- Very few degradable biomaterials are available
The lack of degradable biomaterials that elicit predictable and controllable cell and tissue responses is a “bottleneck” in bringing tissue-engineering-based therapies into the clinic
A New Approach: Combinatorial Chemistry in Materials Design
Model: Drug Discovery
- Very large libraries
- Very specific bioassays looking for one particular bioactivity
- Searching for a needle in a haystack
Outcome
- Dramatic acceleration of the pace at which lead compounds can be identified
Elements of a Biomaterials “Combi” Approach
• Parallel synthesis of a larger number of polymers
• Rapid screening assays for the characterization of bio-relevant material properties
- e.g., protein surface adsorption, cell growth, gene expression in cells
• Data mining, computational design and modeling
Reduced cost and risk, leading to greater willingness of industry to consider the commercialization of new biomaterials for specific applications
Screening for Fibrinogen Adsorption
Major surface protein to initiate coagulation and inflammation
Blood cells bind to fibrinogen
Level of fibrinogen adsorption is commonly used as a blood compatibility indicator
The Modern Drug Discovery Paradigm: Rational Drug Design
genes -> proteins -> small molecules -> drug candidates
Combinatorial Chemistry (CombiChem)
Parallel Chemical Synthesis: small-molecule libraries
[Figure: chemical structures of small-molecule library members built from components A, B, C, D]
Parallel Chemical Synthesis: polymer libraries
[Figure: chemical structures of polymer library members]
Focal Areas
Surrogate Molecular Modeling to Accelerate Polymer Design and Optimization
• Virtual Combinatorial Chemistry: Compressing Large Polymer Libraries into Representative Subsets
• Quantitative Structure-Performance Relationship (QSPR) Models: Predicting Cell-Material Interactions from the Polymer’s Chemical Structure
Atomistic Molecular Modeling to Explore Polymer Properties and Polymer-Protein Interactions
• Molecular Simulations of Water Transport Through Polymers
• Scoring Functions to Study Polymer-Protein Interactions
Quantitative Structure-Performance Relationship (QSPR) Models
• Find correlations between chemical structure and performance
• Predict complex polymer performance characteristics from simple structure and material properties
Quantitative Structure-Performance Relationship (QSPR) Models
Set of Polymers -> In vitro/In vivo Data (Y) and Molecular Descriptors (Xi)
QSPR: Y = f(Xi)
Uses of the model: Interpretation, Prediction
Types of Molecular Descriptors
[Figure: chemical structure of an example polymer repeat unit]
Topological
- 2-D structural formula (Kier & Hall indices)
Electrostatic
- Charge distribution (partial charges, H-bond donors/acceptors)
Geometric
- 3-D structure of molecule (I, SA, Molecular Volume)
Quantum-chemical
- Molecular orbital structure (HOMO-LUMO energies, dipole moment)
Extract and Tabulate Descriptors
Polymer Data Set for Quantitative Structure-Performance Relationship (QSPR) Models
                      Descriptors (Xi)
POLYMER   NMA "Y"     Mol. Vol. (Å³)   LogP    Hydrophilicity
1         32          420              3.31    0.14
2         52          332              3.92    0.11
3         2           498              4.57    0.07
4         75          467              2.93    0.16
5         16          359              3.68    0.12
etc.      etc.        etc.             etc.    etc.
Building QSAR Models
Y = f(Xi), where Y is the observed property or activity and Xi are the molecular descriptors
Simple (Univariate) Linear Regression (Hammett, 1939)
pKi = a0 + a1 (Mol Voli)
Multiple Linear Regression (MLR) (Hansch, 1969)
pKi = a0 + a1 (Mol Voli) + a2 (logPi) + a3 (i) + ...
Partial Least-Squares (PLS) Regression (Wold, et al. 1984)
pKi = a0 + a1 (PC1) + a2 (PC2) + a3 (PC3) + ...
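The multiple linear regression form above can be illustrated with a short least-squares fit. This is a minimal sketch rather than the method used in the course: the descriptor rows, the generating coefficients, and the function names (`fit_mlr`, `predict`, `solve_linear_system`) are all hypothetical and chosen only so that the fit can be checked against known values.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(descriptors, y):
    """Least-squares fit of y = a0 + a1*x1 + a2*x2 + ... via the normal equations."""
    rows = [[1.0] + list(x) for x in descriptors]  # leading 1.0 is the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, x):
    """Evaluate the fitted model for one descriptor vector."""
    return coeffs[0] + sum(a * xi for a, xi in zip(coeffs[1:], x))


# Hypothetical data: (molecular volume, logP) rows, with responses generated
# from the made-up model y = 2 + 0.1*V - 5*logP so the fit can be verified.
X = [(420.0, 3.31), (332.0, 3.92), (498.0, 4.57),
     (467.0, 2.93), (359.0, 3.68), (400.0, 3.00)]
Y = [2.0 + 0.1 * v - 5.0 * lp for v, lp in X]

coeffs = fit_mlr(X, Y)
print([round(c, 4) for c in coeffs])
```

With real descriptor tables the columns are often correlated, which is one reason the PLS form projects the descriptors onto a few components before regressing.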
Predicting Activities of Untested Compounds
Untested polymer -> extract descriptors (V, logP, ...) -> validated QSPR model -> predicted activity of untested polymer
Validated QSPR model: Yi = 0.52 (Vi) + 0.27 (logPi) - 0.38 (i)
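Applying a validated model to an untested polymer is a single evaluation of the fitted equation. The sketch below uses the coefficients shown on the slide; the symbol of the third descriptor did not survive in the transcript, so it appears here under the placeholder name `x3`, and the input values are hypothetical (assumed to be on whatever scale the model was fitted with).

```python
def predict_activity(v, logp, x3):
    """Evaluate Y = 0.52*V + 0.27*logP - 0.38*x3 for one polymer.

    x3 is a placeholder name for the third descriptor, whose symbol is
    missing from the transcript.
    """
    return 0.52 * v + 0.27 * logp - 0.38 * x3


# Hypothetical descriptor values for an untested polymer.
example = predict_activity(1.0, 1.0, 1.0)
print(round(example, 2))
```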
Artificial Neural Network (ANN)
Input: any measured parameter or observation
Hidden Layer: a set of weighted linear regressions or other functions
Output: prediction of the model
The ANN needs a training set of data to determine the optimum values of the weighting functions in the hidden layer that lead to the closest match between an experimentally determined outcome and the prediction of the model. Thereafter, the ANN can make empirical predictions of the outcome when presented with similar data sets.
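The training step described above can be sketched with a very small network. This is an illustrative sketch only: the layer size, tanh activation, learning rate, and the training data are assumptions made for the example, not details taken from the lecture.

```python
import math
import random


def train_ann(samples, targets, n_hidden=3, epochs=2000, lr=0.1, seed=0):
    """Train a one-hidden-layer network by stochastic gradient descent."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Each hidden unit stores its input weights followed by a bias.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    # Output unit: one weight per hidden unit followed by a bias.
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(ws[:-1], x)) + ws[-1])
                  for ws in w_hidden]
        output = sum(w * h for w, h in zip(w_out[:-1], hidden)) + w_out[-1]
        return hidden, output

    def mse():
        return sum((forward(x)[1] - t) ** 2
                   for x, t in zip(samples, targets)) / len(samples)

    error_before = mse()
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            hidden, output = forward(x)
            err = output - t  # derivative of 0.5 * (output - t)^2
            for j in range(n_hidden):
                # Hidden-unit gradient uses the output weight before it is updated.
                grad_hidden = err * w_out[j] * (1.0 - hidden[j] ** 2)
                w_out[j] -= lr * err * hidden[j]
                for i in range(n_in):
                    w_hidden[j][i] -= lr * grad_hidden * x[i]
                w_hidden[j][-1] -= lr * grad_hidden
            w_out[-1] -= lr * err
    return (lambda x: forward(x)[1]), error_before, mse()


# Hypothetical training set: two inputs and a nonlinear outcome.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
targets = [x1 * x2 for x1, x2 in samples]

predict, error_before, error_after = train_ann(samples, targets)
print(error_before, error_after)
```

The returned `predict` function plays the role of the trained network: it can be called on new input vectors that resemble the training data.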
Combinatorial Polymer Libraries
[Figure: polymer repeat unit built from a diphenol component (pendent chain R, n = 1, 2) and a diacid component (HO2C-Y-CO2H)]
Alcohols providing the pendent chain R: Isopropanol, Benzyl Alcohol, Butanol, Hexanol, iso-Butanol, 2-(2-Ethoxyethoxy)ethanol, sec-Butanol, Ethanol, Methanol, Dodecanol, Octanol
Diacids providing the backbone Y: 3-Methyl-Adipic Acid, Diglycolic Acid, Glutaric Acid, Sebacic Acid, Adipic Acid, Suberic Acid, Dioxaoctanedioic Acid, Succinic Acid
Combinatorial Explosion!!!
[Chart: size of library vs. number of Y or R variants. 5 -> 25, 10 -> 100, 20 -> 400, 50 -> 2500]
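The library sizes in the chart follow from pairing every Y variant with every R variant, so the count grows as the product of the two. A minimal sketch, with hypothetical component labels and function names:

```python
from itertools import product


def library_size(n_y, n_r):
    """Number of distinct polymers from n_y backbone (Y) and n_r pendent-chain (R) variants."""
    return n_y * n_r


def enumerate_library(y_variants, r_variants):
    """List every pairing as a code of the form 'R_Y'."""
    return [f"{r}_{y}" for r, y in product(r_variants, y_variants)]


# Equal numbers of Y and R variants, as on the chart's horizontal axis.
sizes = {n: library_size(n, n) for n in (5, 10, 20, 50)}
print(sizes)

# Hypothetical labels, for illustration only.
demo = enumerate_library(["Y1", "Y2"], ["R1", "R2", "R3"])
print(len(demo), demo[:3])
```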
Deploy Rational Drug Design Approaches to Biomaterials Design
Generate Virtual Combinatorial Libraries
• Compress large polymer libraries into representative subsets
Build Computational Models for these Subsets
• Predict bioresponse to the polymers based only on the polymer’s molecular structure
• Make predictions for the entire polymer library and beyond
[Figure: cluster representatives -> Synthesis -> Biol. testing -> QSPR model -> predicted value. Descriptor-space panels: Good diversity (dipole, molecular volume, rotatable bonds); Poor diversity (double bonds, moment of inertia, density)]
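One common way to pick a diverse, representative subset from a descriptor space is MaxMin selection, which repeatedly adds the candidate farthest from everything already chosen. The slides do not say which selection or clustering method was used, so this is an assumed stand-in; the points are hypothetical, and in practice the descriptors would be scaled to comparable ranges first.

```python
import math


def maxmin_select(points, k, first=0):
    """Pick k indices so that each new pick is as far as possible from those already chosen."""
    selected = [first]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(selected) < k:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected


# Hypothetical two-descriptor coordinates: two tight groups plus one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
subset = maxmin_select(points, k=3)
print(subset)
```

The selected indices cover each region of the space once, rather than sampling one dense region repeatedly.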
From Models to Rational Design and Synthesis
1. From QSPR models, select those descriptors and their values that are associated with the optimal performance property
2. Synthesize known polymers within cluster
3. Design and synthesize new scaffolds within cluster
Computational Procedure
• Calculate molecular descriptors for each polymer
• Generate QSPR models
• Compare predicted vs. experimental Normalized Metabolic Activity (NMA)
• Identify key descriptors associated with NMA
• Predict NMA values for untested polymers
List of Molecular Descriptors
Set of 62 polyarylates & their calculated descriptors
FUNCTIONAL GROUPS
- Number of primary C (sp3)
- Number of secondary C (sp3)
- Number of tertiary C (sp3)
- Number of unsubstituted aromatic C (sp2)
- Number of substituted aromatic C (sp2)
- Number and position of branches in pendant chain
- Number of ethers (aliphatic)
- Number of H-bond acceptor atoms (N, O, F)
EMPIRICAL DESCRIPTORS
- Unsaturation index
- Hydrophilic factor
- Aromatic ratio
MOLECULAR PROPERTIES
- Molar refractivity
- Polar surface area
- Octanol-water partition coefficient (logP)
[Figure: PLS model, predicted vs. experimental Normalized Metabolic Activity; R2 = 0.75, Rcv2 = 0.55]
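The gap between R2 and the cross-validated Rcv2 comes from predicting each polymer with a model that never saw it. The sketch below shows one way to compute both, using leave-one-out cross-validation and a single-descriptor line fit for brevity. The slide does not state which cross-validation scheme was used, and the data and function names here are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a0 + a1*x; returns (a0, a1)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(observed, predicted):
    """1 - SS_res / SS_tot, with SS_tot taken about the mean of the observed values."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def loo_predictions(xs, ys):
    """Predict each point from a model fitted to all the other points."""
    preds = []
    for i in range(len(xs)):
        a0, a1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a0 + a1 * xs[i])
    return preds


# Hypothetical descriptor values and measured responses.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]

a0, a1 = fit_line(xs, ys)
r2 = r_squared(ys, [a0 + a1 * x for x in xs])
r2_cv = r_squared(ys, loo_predictions(xs, ys))
print(round(r2, 3), round(r2_cv, 3))
```

The cross-validated value is never higher than the fitted one for a least-squares model, which is why it is the more conservative measure of predictive ability.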
Loadings: Decompose PCs into Constituent Molecular Descriptors
[Figure: bars from 0% to 100% for PC1 through PC5 showing the contribution of each descriptor: MLOGP, PSA, MR, ARR, Hy, Ui, nHAcc, nROR, nOHs, nOHp, nCaR, nCaH, nCt, nCs, nCp, nBRs, nBRp]
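The idea of decomposing a component into descriptor contributions can be sketched with ordinary principal component analysis, where the squared loadings of a unit-length component sum to 100%. This is an assumed stand-in: the components on the slide come from a PLS model, whose latent variables are built differently, and the descriptor columns below are hypothetical.

```python
import math


def standardize(columns):
    """Scale each descriptor column to zero mean and unit sample variance."""
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        out.append([(v - mean) / sd for v in col])
    return out


def first_pc_loadings(columns, iterations=500):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    z = standardize(columns)
    n, p = len(z[0]), len(z)
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(p)]
            for i in range(p)]
    # Uneven start vector, to avoid starting exactly on a non-leading eigenvector.
    vec = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec


def percent_contributions(loadings):
    """Squared loadings of a unit vector, expressed as percentages."""
    return [100.0 * v * v for v in loadings]


# Hypothetical descriptor columns: the first two are strongly correlated.
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 8.0, 9.9],
    [1.0, -1.0, 0.0, 1.0, -1.0],
]
contributions = percent_contributions(first_pc_loadings(columns))
print([round(c, 1) for c in contributions])
```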
Key Descriptors Associated With NMA
Hydrophilic factor: # hydrophilic groups
Octanol-water partition coefficient logP
Number of secondary C (sp3)
Molar refractivity
SIMPLIFY the model
[Figure: loadings of PC1 through PC5 (0% to 100%), shown for the full descriptor set and for a reduced set: MLOGP, PSA, MR, nOHs, nCt]
Predicted NMA for Untested Polyarylates
Polymer code   Predicted NMA
DTiB_AA        40.9
HTH_AA         69.7
HTH_GLA        59.5
HTH_MAA        33.7
DTiB_DGA       55.0
THE_DGA        82.6
Values shown in parentheses on the slide: (62.6±11.9), (41.4±7.9), (63.7±12.3), (53.2±10.1), (67.1±12.7), (101.5±19.3)
[Figure: chemical structures of the six polymers]
Kholodovych V, Smith JR, Knight D, Abramson S, Kohn J, Welsh WJ. Polymer, 2004, 45, 7367-7379
[Figure: panels labeled FIBRINOGEN ADSORPTION and FRLF NMA; axes labeled Y and R]
Summary & Conclusions
• Computational molecular modeling represents a powerful tool for accelerating optimal biomaterial design
• QSPR models are useful for predicting, and interpreting, biomaterials' performance properties
• QSPR-based approaches are complementary to atomistic simulation models (Knight, Latour, Welsh)
Relevant Papers
Smith JR, Knight D, Kohn J, Rasheed K, Weber N, Kholodovych V, Welsh WJ. Using Surrogate Modeling in the Prediction of Fibrinogen Adsorption onto Polymer Surfaces. Journal of Chemical Information and Computer Sciences 44(3): 1088-1097 (2004)
Kholodovych V, Smith JR, Knight D, Abramson S, Kohn J, Welsh WJ. Accurate Predictions of Cellular Response Using QSPR: A Feasibility Test of Rational Design of Polymeric Biomaterials. Polymer 45(22): 7367-7379 (2004)
Smith JR, Kholodovych V, Knight D, Kohn J, Welsh WJ. Predicting Fibrinogen Adsorption to Polymeric Surfaces In Silico: A Combined Method Approach. Polymer 46: 4296 (2005) (Paper assigned for reading)
Smith JR, Knight D, Kohn J, Kholodovych V, Welsh WJ. Using Surrogate Modeling in the Analysis of Bioresponse Data from Combinatorial Libraries of Polymers. QSAR & Combinatorial Science (submitted)