Java Solutions for Cheminformatics
June 2006
Conformer generation
May 2006
The “modeling” team at ELTE (Eötvös Loránd University)
Ödön Farkas
– General leadership
– Geometry optimization
– Fragment fuse
– Search involving geometry constraints, etc.
Imre Jákli
– Molecular dynamics (MD)
– Database connection
Adrián Kalászi
– Molecular mechanics
– Drug design tools (3D pharmacophore model)
– Conformer search via MD
Gábor Imre
– 3D builder scheduling
– Fragment-atom fuse (v2)
– Minkowski-based build
– Debug tools
Students: Krisztina Szölgyén, László Antall
Conformer generation / basic concepts
• Conformers are locally stable structures of a molecule.
– Conformers are often called “rotamers”; however, rings may also have different conformers that are not rotamers.
• Intermediate structures that arise during molecular motion are conformations and should not be considered conformers.
• The lowest-energy conformer can be identified with certainty only if all conformers are known.
• The distribution of conformers can be approximated from the calculated conformational energies.
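The last point above can be illustrated with Boltzmann weighting. This is a minimal sketch, not Marvin code: the class name, the method name and the example energies are hypothetical, and it assumes energies in kcal/mol, ignores degeneracy and entropy, and treats every conformer as a single state.

```java
final class BoltzmannPopulations {

    /** Gas constant in kcal/(mol*K). */
    static final double R_KCAL = 0.0019872041;

    /**
     * Returns the fractional population of each conformer at the given
     * temperature (K), from relative conformational energies in kcal/mol.
     */
    static double[] populations(double[] energiesKcal, double temperatureK) {
        // Shift by the minimum energy so the exponentials stay well scaled.
        double min = Double.POSITIVE_INFINITY;
        for (double e : energiesKcal) min = Math.min(min, e);

        double[] weights = new double[energiesKcal.length];
        double sum = 0.0;
        for (int i = 0; i < energiesKcal.length; i++) {
            weights[i] = Math.exp(-(energiesKcal[i] - min) / (R_KCAL * temperatureK));
            sum += weights[i];
        }
        // Normalize so the populations add up to 1.
        for (int i = 0; i < weights.length; i++) weights[i] /= sum;
        return weights;
    }

    public static void main(String[] args) {
        double[] energies = {0.0, 0.5, 2.0}; // hypothetical conformer energies
        double[] p = populations(energies, 298.15);
        for (int i = 0; i < p.length; i++) {
            System.out.printf("conformer %d: E = %.2f kcal/mol, population = %.3f%n",
                    i, energies[i], p[i]);
        }
    }
}
```

Because the weights depend only on energy differences, a missing low-energy conformer distorts every population, which is why the completeness of the conformer set matters.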
Goal of conformer generation
• Generating valid 3D molecular structures
• Finding multiple structures for flexible molecules
History of conformer generation in Marvin
• First approach based on a generalized Minkowski metric
G. Imre, G. Veress, A. Volford, Ö. Farkas, “Molecules from the Minkowski space: an approach to building 3D molecular structures”, J. Mol. Struct. (Theochem) 666-667, 51 (2003)
• Because of problems with chirality and long computation times, we introduced an atom-by-atom fuse method
G. Imre, Ö. Farkas, “3D Structure Prediction and Conformational Analysis”, 7th ICCS, June 5-9, 2005, Noordwijkerhout, The Netherlands
• Scheduling is important
• Faster and more reliable process
• Frequent use of geometry optimization may slow down the process
• Current version is based on fusing fragments
Key algorithms used or developed for conformer generation
Quaternion fit (JQuatFit)
• Based on the work of Hamilton
• http://en.wikipedia.org/wiki/Quaternion
• Fits two molecular structures with a non-iterative, linear-scaling, extremely fast method
• Used to fit the common atoms when fusing fragments
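The slides do not show how JQuatFit works internally, so the following is only a sketch of one well-known quaternion superposition scheme (Horn's formulation), with hypothetical class and method names. Building the 4x4 matrix is a single linear pass over the atoms; for brevity its dominant eigenvector is found here by repeated matrix squaring, a swapped-in numerical shortcut rather than the closed-form solution a production fit would likely use.

```java
final class QuaternionFitSketch {

    /** Centroid of an N x 3 coordinate array. */
    static double[] centroid(double[][] pts) {
        double[] c = new double[3];
        for (double[] p : pts)
            for (int k = 0; k < 3; k++) c[k] += p[k] / pts.length;
        return c;
    }

    /** Unit quaternion (w, x, y, z) of the rotation that best maps centered 'mov' onto centered 'ref'. */
    static double[] bestQuaternion(double[][] mov, double[][] ref) {
        double[] cm = centroid(mov), cr = centroid(ref);
        double[][] s = new double[3][3]; // cross-covariance: s[a][b] = sum of mov_a * ref_b
        for (int i = 0; i < mov.length; i++)
            for (int a = 0; a < 3; a++)
                for (int b = 0; b < 3; b++)
                    s[a][b] += (mov[i][a] - cm[a]) * (ref[i][b] - cr[b]);
        // Horn's symmetric 4x4 matrix; its top eigenvector is the optimal quaternion.
        double[][] n = {
            { s[0][0] + s[1][1] + s[2][2], s[1][2] - s[2][1], s[2][0] - s[0][2], s[0][1] - s[1][0] },
            { s[1][2] - s[2][1], s[0][0] - s[1][1] - s[2][2], s[0][1] + s[1][0], s[2][0] + s[0][2] },
            { s[2][0] - s[0][2], s[0][1] + s[1][0], -s[0][0] + s[1][1] - s[2][2], s[1][2] + s[2][1] },
            { s[0][1] - s[1][0], s[2][0] + s[0][2], s[1][2] + s[2][1], -s[0][0] - s[1][1] + s[2][2] }
        };
        // Shift by the Frobenius norm so that all eigenvalues become non-negative.
        double shift = 0;
        for (double[] row : n) for (double v : row) shift += v * v;
        shift = Math.sqrt(shift);
        if (shift == 0) return new double[] {1, 0, 0, 0}; // nothing to fit: identity rotation
        for (int i = 0; i < 4; i++) n[i][i] += shift;
        // Repeated squaring drives the matrix toward the projector onto the top eigenvector.
        for (int iter = 0; iter < 40; iter++) {
            double[][] sq = new double[4][4];
            double trace = 0;
            for (int i = 0; i < 4; i++)
                for (int j = 0; j < 4; j++)
                    for (int k = 0; k < 4; k++) sq[i][j] += n[i][k] * n[k][j];
            for (int i = 0; i < 4; i++) trace += sq[i][i];
            for (int i = 0; i < 4; i++)
                for (int j = 0; j < 4; j++) n[i][j] = sq[i][j] / trace;
        }
        // Any non-zero column of the projector is parallel to the eigenvector; take the largest.
        int best = 0;
        for (int i = 1; i < 4; i++) if (n[i][i] > n[best][best]) best = i;
        double norm = 0;
        for (int i = 0; i < 4; i++) norm += n[i][best] * n[i][best];
        norm = Math.sqrt(norm);
        double[] q = new double[4];
        for (int i = 0; i < 4; i++) q[i] = n[i][best] / norm;
        return q;
    }

    /** Rotates point p by the unit quaternion q = (w, x, y, z). */
    static double[] rotate(double[] q, double[] p) {
        double w = q[0], x = q[1], y = q[2], z = q[3];
        return new double[] {
            (w*w + x*x - y*y - z*z) * p[0] + 2 * (x*y - w*z) * p[1] + 2 * (x*z + w*y) * p[2],
            2 * (x*y + w*z) * p[0] + (w*w - x*x + y*y - z*z) * p[1] + 2 * (y*z - w*x) * p[2],
            2 * (x*z - w*y) * p[0] + 2 * (y*z + w*x) * p[1] + (w*w - x*x - y*y + z*z) * p[2]
        };
    }

    /** Returns a copy of 'mov' rotated and translated onto 'ref'. */
    static double[][] superpose(double[][] mov, double[][] ref) {
        double[] q = bestQuaternion(mov, ref);
        double[] cm = centroid(mov), cr = centroid(ref);
        double[][] out = new double[mov.length][];
        for (int i = 0; i < mov.length; i++) {
            double[] r = rotate(q, new double[] { mov[i][0] - cm[0], mov[i][1] - cm[1], mov[i][2] - cm[2] });
            out[i] = new double[] { r[0] + cr[0], r[1] + cr[1], r[2] + cr[2] };
        }
        return out;
    }

    /** Root-mean-square deviation between two equally ordered coordinate sets. */
    static double rmsd(double[][] a, double[][] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++)
            for (int k = 0; k < 3; k++) sum += (a[i][k] - b[i][k]) * (a[i][k] - b[i][k]);
        return Math.sqrt(sum / a.length);
    }

    public static void main(String[] args) {
        double[][] ref = { {0, 0, 0}, {1.5, 0, 0}, {2.0, 1.4, 0}, {3.4, 1.6, 0.9}, {0.2, -1.1, 1.3} };
        double[][] mov = new double[ref.length][];
        for (int i = 0; i < ref.length; i++) // rotate 90 degrees about z, then translate
            mov[i] = new double[] { -ref[i][1] + 5, ref[i][0] - 2, ref[i][2] + 1 };
        System.out.println("RMSD before fit: " + rmsd(mov, ref));
        System.out.println("RMSD after fit:  " + rmsd(superpose(mov, ref), ref));
    }
}
```

When fusing fragments, only the atoms common to both fragments would be passed in as `mov` and `ref`; the resulting rotation and translation are then applied to the whole fragment.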
Substructure3DSearch
• Based on the substructure search implemented by ChemAxon
• Simplified for fast exact match (using graph invariant)
• Extended with
– geometry matching (using quatfit) to separate conformers
– high/low priority matching for selecting suitable fuse positions
– geometry-constrained topological matching for fragment re-use
• Can quickly distinguish conformers, with an optional diversity limit
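How a diversity limit prunes near-duplicate conformers can be sketched as follows. This is not the Substructure3DSearch implementation: the slides describe geometry matching via quatfit, whereas this sketch substitutes a superposition-free comparison of interatomic distance matrices, which is simpler but cannot tell mirror-image conformers apart. All names and the limit value are hypothetical, and atoms are assumed to be in the same order in every conformer.

```java
import java.util.ArrayList;
import java.util.List;

final class DiversityFilterSketch {

    /** Euclidean distance between two 3D points. */
    static double dist(double[] p, double[] q) {
        double dx = p[0] - q[0], dy = p[1] - q[1], dz = p[2] - q[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    /** RMS difference between the interatomic distance matrices of two conformers. */
    static double distanceRmsd(double[][] a, double[][] b) {
        double sum = 0;
        int pairs = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                double d = dist(a[i], a[j]) - dist(b[i], b[j]);
                sum += d * d;
                pairs++;
            }
        }
        return pairs == 0 ? 0 : Math.sqrt(sum / pairs);
    }

    /** Keeps a conformer only if it differs from every kept conformer by at least 'limit'. */
    static List<double[][]> filter(List<double[][]> conformers, double limit) {
        List<double[][]> kept = new ArrayList<>();
        for (double[][] candidate : conformers) {
            boolean isNew = true;
            for (double[][] k : kept) {
                if (distanceRmsd(candidate, k) < limit) {
                    isNew = false;
                    break;
                }
            }
            if (isNew) kept.add(candidate);
        }
        return kept;
    }

    public static void main(String[] args) {
        double[][] a = { {0, 0, 0}, {1.5, 0, 0}, {2.2, 1.3, 0}, {3.7, 1.3, 0} };   // planar chain
        double[][] aShifted = { {1, 1, 1}, {2.5, 1, 1}, {3.2, 2.3, 1}, {4.7, 2.3, 1} }; // same shape, translated
        double[][] b = { {0, 0, 0}, {1.5, 0, 0}, {2.2, 1.3, 0}, {2.2, 1.9, 1.4} }; // end atom twisted out of plane
        List<double[][]> all = new ArrayList<>();
        all.add(a);
        all.add(aShifted);
        all.add(b);
        System.out.println("kept " + filter(all, 0.1).size() + " of " + all.size());
    }
}
```

A larger limit keeps fewer, more distinct conformers; a limit of zero keeps everything.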
Conformer tools in the GUI MSketch/MView
Draw a molecule
Adjust Clean/3D mode
• Fast build: old algorithm, no hydrogens
• Fine build: new algorithm, automatically adds hydrogens
• Build or optimize: build only for non-3D structures
• Optimize: just optimize
Press Ctrl-3 to process
Pressing F7 switches to 3D rotation mode to change the viewpoint
Previously Ctrl-F generated conformers; now it only displays them if they are available
The new Conformer plugin is recommended for conformer generation
Conformer tools in the GUI MSketch/MView calculator plugins
The conformer plugin allows easy access to the most important options:
• Output as molecule array or storage in a single molecule
• Variable optimization criteria
• Multiple or single conformer
• Maximum conformer count
• Time limit for the process
• “Hyperfine” mode for thorough checking of conformers
• H-bond visualization
• Access to old algorithm
The conformers can also be stored as a property of the molecule (available in mrv, sdf)
• A single molecule appears as the result, and “Ctrl-F” displays the stored individual conformers
• The conformer to display can be selected
• The selected conformer should be confirmed.
The stored conformers will then appear when “Ctrl-F” is pressed.
Molecular dynamics in the GUI MSketch/MView calculator plugins
The flexibility of the molecule can be studied via molecular dynamics.
Command line conformer tools (cxcalc): conformers & leconformers
Usage: cxcalc [general options] [input files/strings] conformers [conformers options] [input files/strings]
conformers options:
  -h, --help
      this help message
  -f, --format <output format>
      should be a 3D format (default: sdf)
  -m, --maxconformers <maximum number of conformers to be generated>
      (default: 100)
  -s, --saveconfdesc [true|false]
      if true a single conformer is saved with a property containing conformer information (default: false)
  -e, --hyperfine [true|false]
      if true hyperfine option is set (default: false)
  -o, --oldalg [true|false]
      if true old (before Marvin 4.1) algorithm is used for calculation (default: false)
  -y, --prehydrogenize [true|false]
      if true prehydrogenize is done before calculation, if false calculation is done without hydrogens (available only with old algorithm, default: false)
  -l, --timelimit <timelimit for calculation in sec>
      (default: 900)
  -O, --optimization [0|1|2|3]
      conformer generation optimization limit (default: 1)
# cxcalc conformers -m 250 -s true test.sdf
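For batch use from Java, the same command line can be assembled programmatically. This is a sketch under stated assumptions: the class and method names are hypothetical, only the -m, -s and -l options listed in the help text above are used, and running the command assumes that cxcalc is on the PATH.

```java
import java.util.ArrayList;
import java.util.List;

final class CxcalcConformersCommand {

    /** Builds the argument list for a "cxcalc conformers" call. */
    static List<String> build(String inputFile, int maxConformers,
                              boolean saveConfDesc, int timeLimitSec) {
        List<String> cmd = new ArrayList<>();
        cmd.add("cxcalc");
        cmd.add("conformers");
        cmd.add("-m");
        cmd.add(String.valueOf(maxConformers)); // maximum number of conformers
        cmd.add("-s");
        cmd.add(String.valueOf(saveConfDesc));  // store conformers as a property of one molecule
        cmd.add("-l");
        cmd.add(String.valueOf(timeLimitSec));  // time limit in seconds
        cmd.add(inputFile);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("test.sdf", 250, true, 900);
        System.out.println(String.join(" ", cmd));
        // To execute (requires cxcalc on the PATH):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.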
Command line molecular dynamics tools (cxcalc): moldyn
Usage: cxcalc [general options] [input files/strings] moldyn [moldyn options] [input files/strings]
moldyn options:
  -h, --help
      this help message
  -x, --forcefield [dreiding]
      forcefield used for calculation (default: dreiding)
  -i, --integrator [positionverlet|velocityverlet|leapfrog]
      integrator type used for calculation (default: velocityverlet)
  -n, --stepno <number of simulation steps>
      (default: 1000)
  -m, --steptime <time between steps in femtoseconds>
      (default: 0.1)
  -T, --temperature <temperature in Kelvin>
      (default: 300 K)
  -j, --trajectorytype [mol|sdf]
      type of output
      mol: series of mol frames
      sdf: series of sdf frames
      (default: sdf)
Example: cxcalc moldyn test.mol
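The default integrator named in the options, velocity Verlet, can be illustrated on a toy system. The sketch below is not the moldyn implementation: it integrates a one-dimensional harmonic oscillator in reduced units rather than a molecule under the Dreiding force field, and all names are hypothetical.

```java
final class VelocityVerletSketch {

    /** Force of a 1D harmonic spring: F = -k * x. */
    static double force(double k, double x) {
        return -k * x;
    }

    /** Total energy: kinetic plus potential. */
    static double energy(double x, double v, double mass, double k) {
        return 0.5 * mass * v * v + 0.5 * k * x * x;
    }

    /** Advances (x, v) by 'steps' velocity Verlet steps of length dt and returns {x, v}. */
    static double[] integrate(double x, double v, double mass, double k, double dt, int steps) {
        double a = force(k, x) / mass;
        for (int s = 0; s < steps; s++) {
            x += v * dt + 0.5 * a * dt * dt;    // position update with the current acceleration
            double aNew = force(k, x) / mass;   // acceleration at the new position
            v += 0.5 * (a + aNew) * dt;         // velocity update with the averaged acceleration
            a = aNew;
        }
        return new double[] { x, v };
    }

    public static void main(String[] args) {
        double mass = 1.0, k = 1.0, dt = 0.01;
        int steps = 1000; // same count as the moldyn default, here in reduced time units
        double e0 = energy(1.0, 0.0, mass, k);
        double[] state = integrate(1.0, 0.0, mass, k, dt, steps);
        double e1 = energy(state[0], state[1], mass, k);
        System.out.println("x = " + state[0] + ", v = " + state[1]);
        System.out.println("energy drift = " + (e1 - e0));
    }
}
```

The scheme needs only one force evaluation per step and keeps the energy error bounded, which is why a small step time matters more than a high-order method for stable trajectories.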
Conformer tools API
// imports needed by this fragment
import chemaxon.formats.MolImporter;
import chemaxon.marvin.calculations.ConformerPlugin;
import chemaxon.struc.Molecule;

// read input molecule
MolImporter mi = new MolImporter("test.mol");
Molecule mol = mi.read();
mi.close();

// create plugin
ConformerPlugin plugin = new ConformerPlugin();

// set target molecule
plugin.setInputMolecule(mol);

// set parameters for calculation
plugin.setMaxNumberOfConformers(400);
plugin.setTimelimit(900);

// run the calculation
plugin.run();

// get results
Molecule[] conformers = plugin.getConformers();
int conformerCount = plugin.getConformerCount();
for (int i = 0; i < conformerCount; ++i) {
    Molecule m = conformers[i]; // same as m = plugin.getConformer(i);
    // do something with the conformer ...
}

// do something with the results ...
3D structure generation capabilities: comparison
Corina vs. Marvin on three example structures (figures)
• Example 1: 15.2 s / “Much faster…”
• Example 2: 5.9 s / “Much faster…”
• Example 3: 5.1 s / “Much faster…”
Result statistics: NCI 250K database (August 2000)
• 1st round
– Current method with a 120 s time limit
– Conversion rate: 99.92% (193 of 250251 failed)
– Average time: 0.65 s/molecule
• 2nd round
– Old method on the 193 previously failing structures
– Overall conversion rate: 99.994% (13 failed)
Under development: what to expect in the near future
• 100% conversion rate for valid, medium-sized structures
• Optional conformer diversity limit
• Server version
• Carrying built-up fragments over to subsequent processes
• Store and use fragment database
• Further speedup
• MMFF94 force field
Acknowledgements