BioSolveIT: FlexS reference manual


Version 2.1 Reference Manual

Christian Lemmen, Matthias Rarey, Bernd Kramer, Thomas Lengauer, Markus Lilienthal, Frank Sonnenburg, Marc Zimmermann


Contents


I Introduction

1 About FlexS
    1.1 Additional copyright notes
    1.2 What’s new in this version?
    1.3 How to read this guide

2 Installation
    2.1 Preconditions
    2.2 Quick Installation for Interactive and Batch Mode Usage
    2.3 Parts of FlexS
    2.4 (Linux/Unix only) Defining the root directory
    2.5 Starting FlexS: a first simple test
    2.6 Basic visualization
    2.7 External programs and data
        2.7.1 Graphics
        2.7.2 Torsion angles
        2.7.3 Flexible ring systems
        2.7.4 Parallel script execution

3 Licensing
    3.1 Obtaining License Keys
    3.2 Quick Start with Evaluation Licenses
    3.3 License Scheme for Regular Licenses
        3.3.1 License Server Installation
        3.3.2 Configuring the License Information in FlexS
        3.3.3 The License Format
        3.3.4 Entering the License File Name Directly in the FlexS Configuration File
        3.3.5 Run the FlexLM license server
        3.3.6 BioSolveIT License Scheme for HP-UXia64/SunOS/SGI-Irix Platforms

II User Guide

4 Getting started: a tutorial introduction
    4.1 Single ligand superposition
        4.1.1 A general overview and getting help
        4.1.2 The reference ligand
        4.1.3 The test ligand
        4.1.4 Superpositioning
        4.1.5 Virtual screening
    4.2 Superpositioning using combinatorial libraries
        4.2.1 The reference ligand
        4.2.2 The combinatorial library
        4.2.3 Combinatorial alignment

5 Working with FlexS
    5.1 Configuring FlexS
        5.1.1 @ROOTDIR: Defining the root directory
        5.1.2 @DIRECTORIES: Defining directory paths
        5.1.3 @STATIC_DATA: Defining paths to static data files
        5.1.4 @PROGRAMS: Defining paths to external programs
        5.1.5 @FLAGS: Defining control flags
        5.1.6 @ID_STRINGS: Defining control strings
        5.1.7 @PARALLEL: Defining a Parallel Virtual Machine
        5.1.8 @ALIASES: Defining aliases for commands
        5.1.9 General remarks
    5.2 Starting FlexS
        5.2.1 Interactive mode
        5.2.2 Arguments for batch processing (-a)
        5.2.3 Batch mode (-b)
        5.2.4 Specifying an alternative configuration file (-c)
        5.2.5 Specifying the execution directory (-d)
        5.2.6 Help for command line options (-h, ?)
        5.2.7 Output the processor id or system ID (-i)
        5.2.8 Logging the FlexS session (-l)
        5.2.9 Nice value (-n)
        5.2.10 Redirecting output (-o, -om)
        5.2.11 RIF alignment option (-ri)
        5.2.12 RIF alignment output option (-ro)
        5.2.13 Interface options (-s)
        5.2.14 Version information (-v)
    5.3 The FlexS shell
    5.4 Errors and warnings
    5.5 Preparing the input data
        5.5.1 The test ligand
        5.5.2 The reference ligand

6 Menus and commands
    6.1 Menus
    6.2 Global commands
        6.2.1 Quitting FlexS (QUIT)
        6.2.2 Returning to the main menu (MAIN)
        6.2.3 Returning to the parent menu (END)
        6.2.4 Online help (HELP)
        6.2.5 Viewing the User Guide (MANUAL)
        6.2.6 Short online help (?)
        6.2.7 Editing the configuration file (EDITCFG)
        6.2.8 Listing environment variable settings (LIST)
        6.2.9 Changing values of environment variables (SET)
        6.2.10 Selecting the output destination for superposition results (SELOUTP)
        6.2.11 Sending a command to FlexV (TOFLEXV)
        6.2.12 Sending FlexS graphic objects to the visualizer (DISPLAY)
        6.2.13 Erasing a graphics object (ERASE)
        6.2.14 Executing shell commands (! and EXEC)
        6.2.15 Executing internal unit tests (UNITTESTS)
    6.3 Commands in the root menu
        6.3.1 Deleting everything (DELALL)
        6.3.2 Executing a batch file (SCRIPT)
    6.4 Working with test ligands (TEST_LIG submenu)
        6.4.1 Reading (READ)
        6.4.2 Setting up the initialization procedure (SELINIT)
        6.4.3 Reading reference coordinates (READREF)
        6.4.4 Assigning reference coordinates by subgraph matching (MAPREF)
        6.4.5 Setting reference coordinates (SETREF)
        6.4.6 Selecting the base fragments (SELBAS)
        6.4.7 Selecting (SELECT)
        6.4.8 Outputting the most important information about a test ligand (INFO)
        6.4.9 Editing (EDIT)
        6.4.10 Writing (WRITE)
        6.4.11 Deleting (DELETE)
        6.4.12 Volume computation (VOLUME)
        6.4.13 Setting administration defaults for drawings (SELADM)
        6.4.14 Setting default values for drawing the test ligand (SELGRA)
        6.4.15 Selecting the coloring mode (SELCOL)
        6.4.16 Labeling the test ligand (SELLAB)
        6.4.17 Drawing the test ligand (DRAW)
        6.4.18 Drawing the test ligand at multiple positions sequentially (MDRAW)
        6.4.19 Drawing multiple test ligands (SDRAW)
        6.4.20 Listing the graphic items (GRAINF)
        6.4.21 Setting charges (SETC)
        6.4.22 Reading charges (READC)
        6.4.23 Writing charges (WRITEFC)
        6.4.24 Writing Gaussians (WRITEG)
        6.4.25 Checking SMARTS™ patterns and subgraph occurrence (SMARTS)
        6.4.26 Superpositioning of multiple test ligands by matching (MATCH)
        6.4.27 Switching coordinate types (SWITCHTYPE)
        6.4.28 Manually transforming test ligand coordinates (TRANSFORM)
        6.4.29 Writing the atom coordinates of type ’opt’ (WRITEOPT)
        6.4.30 *Working with test ligand conformations (TEST_LIG/CONFORM submenu)
    6.5 Working with reference ligands (REF_LIG submenu)
        6.5.1 Reading (READ)
        6.5.2 Selecting (SELECT)
        6.5.3 Deleting (DELETE)
        6.5.4 Setting up the initialization procedure (SELINIT)
        6.5.5 *Building the reference ligand triangle hash table (TRIHASH)
        6.5.6 Setting administration defaults for drawings (SELADM)
        6.5.7 Setting default values for drawing the reference ligand (SELGRA)
        6.5.8 Selecting the coloring mode (SELCOL)
        6.5.9 Labeling the reference ligand (SELLAB)
        6.5.10 Drawing the reference ligand (DRAW)
        6.5.11 Listing the graphic items (GRAINF)
        6.5.12 Setting charges (SETC)
        6.5.13 Reading charges (READC)
        6.5.14 Deleting Gaussians (DELETEG)
        6.5.15 Merging Gaussians (MERGEG)
        6.5.16 Writing Gaussians (WRITEG)
        6.5.17 Reading Gaussians (READG)
    6.6 *Changing the static data (DATABASE submenu)
        6.6.1 Editing the static data files (EDIT)
        6.6.2 Performing a cascading reload (RELOADDAT)
        6.6.3 Saving the graphic settings (SAVEGC)
        6.6.4 Decrypting static data files (DECRYPT)
    6.7 Superpositioning (SUPERPOS submenu)
        6.7.1 Placing base fragments (PLACEBAS)
        6.7.2 Fragment-based screening (SCREEN)
        6.7.3 Building up the complex (COMPLEX)
        6.7.4 Local flexible/rigid-body postoptimization of placements (POPT)
        6.7.5 Interactive selection of solutions (SELECT)
        6.7.6 Clustering solutions (CLUSTER)
        6.7.7 Writing placements in pdf format (WRITE)
        6.7.8 Reading placements in pdf format (READ)
        6.7.9 Deleting a placement (DELETE)
        6.7.10 Sorting the list of placements (SORT)
        6.7.11 Outputting the most important quantities of a superposition result (INFO)
        6.7.12 Listing solutions (LISTSOL)
        6.7.13 Listing solutions sorted by RMSD (LISTRMS)
        6.7.14 Listing the matches of all solutions (LISTMAT)
        6.7.15 Listing all solutions and matches (LISTALL)
        6.7.16 Listing one solution and the corresponding matches (LISTONE)
        6.7.17 Performing specific queries on solutions and matches (QUERY)
        6.7.18 Performing a specific query a second time (QHIST)
        6.7.19 Writing solutions in a table (PRINTSOL)
        6.7.20 Setting administration defaults for drawings (SELADM)
        6.7.21 Setting default values for drawing placements (SELGRA)
        6.7.22 Selecting the coloring mode (SELCOL)
        6.7.23 Labeling placements (SELLAB)
        6.7.24 Drawing placements (DRAW)
        6.7.25 Drawing multiple placements (MDRAW)
        6.7.26 Listing the graphic items (GRAINF)
        6.7.27 *Special commands to evaluate placements (EVALUATE)
    6.8 *Using resulting information files (RIF)
        6.8.1 General overview
        6.8.2 Superpositioning of RIF molecule pairs (ALIGN)
        6.8.3 Switch parameters for optimization in ALIGN (SWITCHP)
    6.9 *Gaussian overlap optimization (OPTPARAM)
        6.9.1 General overview of flexible superposition
        6.9.2 Switching energy parameters for flexible optimization (SWITCHP)
        6.9.3 Stop criteria for flexible optimization (STOPCRT)
        6.9.4 Molecule transformation for flexible superposition (TRANSMOD)
        6.9.5 Superpositioning of flexible test ligands (FSUPER)
        6.9.6 Selecting a special optimization algorithm for flexible superpositioning without a reference ligand (SETALGO)
        6.9.7 Setting optimization parameters (SETPAR)

7 Additional modules for FlexS
    7.1 Parallel Virtual Machine (PVM submenu)
        7.1.1 Preliminaries
        7.1.2 Starting PVM
        7.1.3 Configuring PVM
        7.1.4 Executing parallel batch files
        7.1.5 Aborting and recovering
        7.1.6 Killing a single work process
        7.1.7 Working with parallel FlexS
        7.1.8 Working with PVM
    7.2 Alignment of combinatorial libraries
        7.2.1 Handling combinatorial libraries (CLIB submenu)
        7.2.2 Setting up the initialization procedure (SELINIT)
        7.2.3 Alignment combinatorial libraries (CSUPER submenu)

8 Troubleshooting
    8.1 Installation and Licensing Problems
        8.1.1 Libraries missing?
        8.1.2 Ubuntu / Debian Distributions
        8.1.3 Token not numeric error under Linux
        8.1.4 Insufficient memory under Windows
        8.1.5 Windows Vista
    8.2 Problems at Runtime of FlexS

9 Getting Help
    9.1 Support


III Technical Reference

10 Files and file formats
    10.1 Molecular input file formats
    10.2 Overview of filename extensions
    10.3 *Batch files
        10.3.1 Variables
        10.3.2 Script parameter lists
        10.3.3 Loops: FOR_EACH/END_FOR, WHILE or FOREVER
        10.3.4 Branches: IF/ELSE/ENDIF
        10.3.5 One-of-n selection: SELINP
        10.3.6 Special script command: SETVAR
        10.3.7 Special batch file command: INPUT
        10.3.8 Special script command: INCR
        10.3.9 Special batch file command: OUTPUT and OUTERR
        10.3.10 Special batch file command: TIMER
        10.3.11 Special batch file command: PROCSIZE
        10.3.12 Special batch file command: WAIT
    10.4 Defining program parameters (flexs_settings.dat)
        10.4.1 Overlap
        10.4.2 Gaussian description
        10.4.3 Generation of conformations
        10.4.4 Selecting the base fragment (superposition algorithm, phase 1)
        10.4.5 Placing the base fragment (superposition algorithm, phase 2)
        10.4.6 Building up the complex (superposition algorithm, phase 3)
        10.4.7 Applying filter functions
        10.4.8 Combinatorial alignment parameters
        10.4.9 Miscellaneous
    10.5 *Chemical parameters (chempar.dat)
        10.5.1 Van der Waals radii
        10.5.2 Bond length of heavy atoms
        10.5.3 Atom types
        10.5.4 Valence states
    10.6 *Interaction types and compatibilities (contype_sp.dat)
    10.7 *Interaction geometries (geometry_sp.dat)
        10.7.1 Associating interaction geometries with molecular groups
        10.7.2 Defining interaction geometries
        10.7.3 Computing the scoring contributions of matched interaction groups
    10.8 *Assigning data to the ligands: the subgraph data files
        10.8.1 Defining groups of atoms
        10.8.2 Defining subgraphs
    10.9 *Ligand interaction groups (contact_sp.dat)
    10.10 *Ligand torsion database (torsion_standard.dat)
        10.10.1 Constraining amides to planarity
        10.10.2 Fixing torsional angles at specified values: a sample case
    10.11 *Ligand formal charges (fcharges.dat)
    10.12 *Automatic correction of localized systems (delocalized.dat)
    10.13 SMARTS™ support
        10.13.1 Atomic primitives
        10.13.2 Ring perception
        10.13.3 Aromaticity perception and hybridization states
        10.13.4 Implicit hydrogens, valences and formal charges
        10.13.5 Bond primitives
        10.13.6 Logical operators
        10.13.7 Recursive SMARTS™
        10.13.8 Branches
    10.14 Defining subgraphs using SMARTS™
    10.15 Using templates, vector bindings
    10.16 Transforming molecules via SMARTS™
        10.16.1 Formal charges and hydrogens
        10.16.2 Atom type assignment
    10.17 Structure correction and atom type assignment
    10.18 Transformation rules (transform.dat)
    10.19 *Ligand Gaussian representation (gaussian.dat)
    10.20 *Graphics (graphic_sp.dat)
        10.20.1 Colors
        10.20.2 Switches
        10.20.3 Scalars
        10.20.4 Lists
        10.20.5 Color modes
        10.20.6 Defining atom colors (@atom-colors)
        10.20.7 Defining colors for contact types (@contact-colors)
        10.20.8 Setting test ligand graphic defaults (@test-ligand-defaults)
        10.20.9 Setting reference ligand graphic defaults (@reference-ligand-defaults)
        10.20.10 Setting superposition graphic defaults (@superposition_defaults)
        10.20.11 Combilib Graphics Defaults
    10.21 *Optimization/Rigid-body superposition/Scoring parameter file (optpar.dat)
        10.21.1 Scoring parameters
        10.21.2 Flexible optimization parameter
        10.21.3 Rigid-body superposition parameter
        10.21.4 RIF optimization parameter sets

11 *Program interfaces
    11.1 Interface to PYTHON
    11.2 Interface to WHATIF
    11.3 Interface to SCA
    11.4 Interface to CORINA
    11.5 Interface to CONFORT
    11.6 The FlexV graphical interface

IV Appendix

A Sample configuration file

B Examples of script files
    B.1 Example I: Flexibly superpose a pair of ligands (1stTest.bat)
    B.2 Example II: Flexibly superpose a set of ligand pairs sequentially (flexs.bat)
    B.3 RigFit example I: Rigid-body superpose a set of ligands all against all (rigfit.bat)
    B.4 RigFit example II: Rigid-body screen a fragment against a set of DB ligands (screen.bat)
    B.5 Examples for the alignment of combinatorial libraries
    B.6 Examples and results of the postoptimization
    B.7 Examples and results of the special postoptimization
    B.8 Examples and results of the flexible superposition

Index

Bibliography

I Introduction

1 About FlexS

FlexS is a computer program for predicting ligand superpositions. For a given pair of ligands, FlexS predicts the conformation and orientation of one of the ligands relative to the other one. In this first version of FlexS, the reference ligand is assumed to be rigid. Thus, it should be given in a conformation that is similar to the bound state. The superposition algorithm in FlexS works with very little manual intervention. Nevertheless, in some cases additional information about the ligands or even the superposition is known. You can integrate this knowledge into the computations with FlexS by performing single steps manually. FlexS has therefore been designed for interactive work on ligand pairs as well as for larger sets of ligands.

Before you start working with FlexS, we would like to remind you that FlexS is software under steady and ongoing development. We test the program with a continuously growing set of reference ligands and test ligands, but we are sure that FlexS is not ’error-free’.

To understand and interpret the results produced with FlexS, it is necessary to know something about the underlying models and algorithms. This topic is not covered in this User Guide; we refer to the following literature [10, 9, 12, 11, 13, 7, 2, 3].

FlexS originates from research as part of the RELIWE¹ and RELIMO² projects at the German National Research Center for Information Technology (GMD), Institute for Algorithms and Scientific Computing (SCAI). At this point, we would like to thank Gerhard Klebe (University of Marburg, Germany) and Thomas Mietzner (BASF AG, Ludwigshafen, Germany) for their support in the development of the underlying ideas of FlexS. We would also like to express our thanks to the first users of FlexS. They have reported numerous errors, have shared their experiences with the user interface, and, above all, have motivated us to make FlexS better. We especially want to mention Sven Grüneberg and Manfred Hendlich (University of Marburg, Germany), Gerhard Barnickel (Merck KGaA, Darmstadt, Germany), Jens Sadowski (BASF AG, Ludwigshafen, Germany), and Hans Briem (Boehringer Ingelheim, Ingelheim, Germany). Further development of the FlexS system is being undertaken at BioSolveIT GmbH.

¹ RELIWE is a German acronym for ’Prediction of receptor-ligand interactions’. The project is funded by the German Federal Ministry for Education, Science, Research and Technology (BMBF) under grant no. 01 IB 302 A.

² RELIMO is the German acronym for ’Receptor modeling and the design of combinatorial libraries’. This project was partially funded by the German Federal Ministry for Education, Science, Research and Technology (BMBF) under grant no. 0311 620.

1.1 Additional copyright notes

The following software/data is used in/with FlexS:


• Software basis: Copyright © 2001 by Fraunhofer Gesellschaft (FhI-SCAI)

• getline library: Copyright © 1993 by Chris Thewalt

• PVM library version 3.4: Parallel Virtual Machine System. University of Tennessee, Knoxville TN. Oak Ridge National Laboratory, Oak Ridge TN. Emory University, Atlanta GA. Authors: J. J. Dongarra, G. E. Fagg, G. A. Geist, J. A. Kohl, R. J. Manchek, P. Mucci, P. M. Papadopoulos, S. L. Scott, and V. S. Sunderam. © 1997 All Rights Reserved³

• SMARTS™ may be a registered trademark of Daylight Chemical Information Systems.

• The torsion angle data (torsion_standard.dat/torsion_fine.dat) is derived from the Cambridge Structural Database. The copyright © of these files is shared by GMD – Forschungszentrum Informationstechnik GmbH, the Cambridge Crystallographic Data Center (CCDC), and BASF AG, Ludwigshafen.

1.2 What’s new in this version?

The major changes since FlexS version 1.6 are summarized in a file called changes.flexs, which is part of the package. All modifications are categorized in one of seven classes: bug fixes, computing, menus and user interface, file formats, graphics, misc., and documentation. At the end of the file, a to-do list follows. The list summarizes requests which could not be considered so far.

1.3 How to read this guide

Some section or subsection titles are marked with an asterisk. These sections are either of less importance or very technical and hard to understand. We advise you to read these sections after you have some experience with FlexS.

We have used the following styles or fonts to highlight specific parts of the text. The most important style is the environment of examples, as follows:

Example

This is an example

³ PVM copyright notice: Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted provided that the above copyright notice appear in all copies and that both the copyright notice and this permission notice appear in supporting documentation.

Neither the Institutions (Emory University, Oak Ridge National Laboratory, and University of Tennessee) nor the Authors make any representations about the suitability of this software for any purpose. This software is provided ‘as is’ without express or implied warranty.

PVM version 3 was funded in part by the U.S. Department of Energy, the National Science Foundation and the State of Tennessee.

The descriptions of commands and global parameters of FlexS have a special list structure, which is self-explanatory. In the text, we use the following fonts: this is a command, this is a <parameter>, and this is a filename, a path, or a program. Exceptions to this are the programs CORINA, CONFORT and FlexS itself.

A syntax description looks like this:

command <parameter> ...

Parameters which occur only in special cases or which are optional are set in square brackets: [<optional parameter>]. If the line ends with a \-character, the command line is continued in the next line. Note that in FlexS itself it is not possible to escape a carriage return character by using a \-character.


2 Installation

2.1 Preconditions

At this time, FlexS is available under Linux and Windows. We have tested several flavors of Linux distributions. Please contact us in case you need other operating systems to be supported.

• Required:

– the executable

– a license; please see below (p. 21) for more details or visit http://www.biosolveit.de/license

(Linux only:)

– glibc in version 2.2 or higher.

– OpenGL libraries installed; see the troubleshooting section (p. 141) in case you encounter problems here.

• Highly desirable:

– a ring conformer generation tool such as corina from Molecular Networks to enable FlexS to also align ligands containing flexible ring systems.

– any minimization tool to superpose only low-energy conformations of ligands. (This can be achieved using corina as well.)

Please contact [email protected] in case you do lack a tool. More information about corina in combination with FlexS can be obtained from our FAQ & Knowledge Base at http://www.biosolveit.de/faq/questions/77/.

2.2 Quick Installation for Interactive and Batch Mode Usage

Installation is easy:

1. • Windows: Start the installer package. It should be fairly self-explanatory.


• Linux: Unpack the tar ball in one place (tar xvzf <file>). flexs is an executable. If it does not have the ‘x’ flag, then please set it with chmod +x flexs.

2. Start FlexS by calling flexs (Windows) or ./flexs (Unix/Linux) at the console (preferred)¹, or by double-clicking the executable’s icon on your operating system’s desktop. FlexS will appear. Go to the troubleshooting section if you encounter problems here (p. 141).

2.3 Parts of FlexS

After the standard installation procedure of FlexS, you should have the following files and directories:

Filename                 Description
config_sp.dat            FlexS configuration file
doc/                     Directory containing help information, e.g. the User Guide
example/                 Various example input files
flexs                    Link to the executable
flexs-2.1.3-<OSNAME>     FlexS executable
flexv                    FlexV executable
install.readme           Readme file containing information about the installation process
predict/                 Default output directory for results
static_data/             Encrypted static data files required by the software at run time
tmp/                     A directory used for writing all sorts of temporary files which are deleted on quitting the program

Linux/Unix: flexs-2.1.3-<OSNAME> is an executable. If it does not have the ’x’ flag, set itwith chmod +x flexs-2.1.3-<OSNAME>.

2.4 (Linux/Unix only) Defining the root directory

If you are running FlexS on a Linux or Unix system, the configuration file config_sp.dat must be present in each directory where FlexS will be run (or see section 5.1 below about how to define an alternative location for a configuration file).

The second essential entry required in the config_sp.dat after the license information is the root directory, a line starting with @ROOTDIR. All paths specified later in the file are relative to this path except those starting with / or ./. In the @DIRECTORIES section, you can define default paths to various data locations. The @STATIC_DATA section contains paths and filenames of the static data files of FlexS and the @PROGRAMS section contains paths and filenames of executables.

¹ For the time being, we recommend you start the tool from an operating system console because operating system warnings/errors and some FlexS warnings/errors may only be visible at a persistent console:

• Linux: Change to the directory in which FlexS resides, and type ./flexs

• Windows: Use Start -> Run, type cmd into the dialog, and press Return; use cd <directory> to change directories and/or cd .. to move one directory higher to navigate to FlexS’s installation directory. Once you are in there, type flexs to start FlexS.


For a simple test (see 2.5) to get FlexS running in the installation directory, you can enter the path of the installation directory after the keyword @ROOTDIR and just leave all the rest as it is. Later, you can customize the configuration of FlexS for different projects or individually for each user; detailed configuration is covered later in 5.1. We recommend that you already include the paths for RCGENERATOR and 3DGENERATOR at this time.
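The path rule stated in this section (entries are relative to @ROOTDIR unless they start with / or ./) can be illustrated with a small Python sketch. This is only a model of the rule as described in the text, not FlexS code; the function name resolve_config_path and the example directories are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_config_path(rootdir: str, entry: str, cwd: str) -> str:
    """Resolve a config_sp.dat path entry as section 2.4 describes it.

    Assumption: this mirrors the documented rule only; the real FlexS
    resolution logic may differ in details.
    """
    if entry.startswith("/"):
        # Absolute paths are used as given.
        return str(PurePosixPath(entry))
    if entry.startswith("./"):
        # ./ is relative to the directory FlexS is started in.
        return str(PurePosixPath(cwd) / entry[2:])
    # Everything else is relative to the @ROOTDIR entry.
    return str(PurePosixPath(rootdir) / entry)


if __name__ == "__main__":
    root = "/home/user/BioSolveIt/Flexs"   # example @ROOTDIR value
    work = "/home/user/project"            # hypothetical working directory
    for entry in ("static_data/optpar.dat", "./", "/opt/corina/corina"):
        print(entry, "->", resolve_config_path(root, entry, work))
```

With such a helper, an entry like predict/ would end up below the installation directory, while ./ stays in the directory where FlexS was started.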

2.5 Starting FlexS: a first simple test

Windows After making the adjustments for licensing (as described in section 3.3), double-click on the desktop icon for FlexS or start flexs.bat, which is located in the installation directory.

Linux/Unix Change to the directory where FlexS was unpacked. After making the adjustments for licensing and defining the root directory (as described in sections 3.3 and 2.4), enter the command ./flexs at the prompt.

After printing a startup message, FlexS will read the configuration file and the static data files. You should then see the FlexS prompt: FLEXS>. Type quit y to terminate FlexS.

2.6 Basic visualization

BioSolveIT offers a basic visualization tool, FlexV, for use with FlexS, which is already built into the FlexS package.

Type ./flexv to start the visualizer. The FlexV main window should then appear. Now you can quit FlexV again by pressing the ‘Quit’ button in the lower-right corner.

No license key is needed for FlexV.

When started the first time, FlexV generates a .flexv file in your home directory. This file will be used to store visualization preferences and should therefore be separate for each user (see FlexV manual for details).

2.7 External programs and data

Some features of FlexS are based on external data and software. Although FlexS can be used as a stand-alone program, we advise you to make the following facilities available to FlexS.

2.7.1 Graphics

FlexS has no internal graphics. For visualization, FlexS must be coupled with an external program. Currently, interfaces to the following software are provided:

FlexV is an in-house visualization tool based on OpenGL. FlexV supports all graphic features of FlexS.


2.7.2 Torsion angles

The static data file torsion_standard.dat contains energetically favorable torsion angles for specific molecular fragments. The static data file torsion_fine.dat contains 10 degree energy grids for torsion angles of specific molecular fragments (see Section 10.10).

The torsion data in this software package have been derived by Gerhard Klebe [8] from the Cambridge Structural Database (CSD) licensed by the Cambridge Crystallographic Data Centre (CCDC). The torsion data are under copyright of GMD, BASF AG, and CCDC. An end-user license for the torsion data files is included in the FlexS software license.

2.7.3 Flexible ring systems

The conformations of flexible ring systems can be computed by the 3D structure generator CORINA [4, 16]. Your CORINA version is suitable for use with FlexS if the driver option ‘flexx’ is available (add the CORINA executable to your $path variable, then type ‘corina -h d’ to check). CORINA or CORINA-F can be obtained from Molecular Networks GmbH (see http://www.mol-net.de for detailed information). The current version of FlexS contains a built-in CORINA library. Alternatively, the program CONFORT can be used. (Please contact Tripos Inc. for more information on this.) If no ring conformation generator is available, the flag RING_MODE must be set to 0 (see section 5.1, Configuring FlexS).

2.7.4 Parallel script execution

FlexS contains a scheduling algorithm for parallel execution of scripts on workstation clusters. The underlying communication library is PVM² (Parallel Virtual Machine) [17]. In order to run FlexS scripts in parallel, you need a PVM installation on your platform and a FlexS-PVM executable. The FlexS-PVM executable has a PVM marker and a PVM copyright note in its header. More information about FlexS-PVM can be found in section 7.1.

² PVM can be obtained from http://www.epm.ornl.gov/pvm


3 Licensing

Our software is license key protected. Please be aware that you cannot run FlexS under any circumstances without a valid license.

3.1 Obtaining License Keys

If you do not already have a license, please use the form at

http://www.biosolveit.de/license

to request a license for our software or to request further information about the licensing procedure. Alternatively you may mail us at

mailto:[email protected].

To obtain a valid license you must determine the “BioSolveIT Host ID” of your computer. To do this you can pick one of these possibilities:

a) Complete the installation of FlexS and start the command-line version without a license. FlexS will in this case fail to start and print the ID information on the screen.

b) Use the small ID generation tool ‘FlexIDgen’. You can download this from

http://www.biosolveit.de/download.

(Linux: if this executable does not have the ‘x’ flag, set it with chmod +x flexidgen). Start this executable to find the ID information for your host or system.

c) Finally, from the console you can start the tool with the flag flexs -i to see the ID.

Send the output to us. After you receive your license keys from us, you are ready to proceed with the next stage of the installation.

3.2 Quick Start with Evaluation Licenses

For a single computer on which the license file is installed, you do not need any additional software (flexlm) of the kind described below for more complex scenarios.

After you receive your license keys from us, you will need to edit config_sp.dat. After the keyword @license_files, enter the path and name of the license files containing the license keys. For details see subsection 3.3.4.


Alternatively, you can use the environment variable BIOSOLVE_LICENSE_FILE. This is also possible with Windows operating systems:

Start -> Control Panel -> System -> Advanced -> Environment Variables

Set an environment variable called BIOSOLVE_LICENSE_FILE to a directory of your choice and restart FlexS.

3.3 License Scheme for Regular Licenses

When using FlexS (after the evaluation period), you have to set up a license server.

• Select a machine as a license server (Windows or Linux). This may be a computing machine where you are running FlexS, but it does not have to be.

• Determine your server’s so-called BioSolveIT Host ID by executing the small ID generation tool FlexIDgen or flexs -i.

• Get a license file from [email protected] which suits your BioSolveIT Host ID.

• Install & start a little piece of software called flexlm which we deliver to you.

• Let flexlm find your license file, see below.

3.3.1 License Server Installation

Download the "BioSolveIT FlexLM" tool from the BioSolveIT homepage. There are versions for several platforms available. The package includes all files necessary to install and run a FlexLM license server.

3.3.1.1 MS-Windows

1. Unzip the zip package

2. Start LMTOOLS.EXE

3. Select the "Config Services" tab and enter the service name, e.g. "FlexS License Server".

4. Use the Browse buttons to select lmgrd.exe, the license file that you obtained from BioSolveIT, and a log file name

5. Check the "Use Services" checkbox.

6. Check the "Start Server at Power Up" checkbox to restart the server automatically after a reboot

7. Click "Save Service"


8. Select the Start/Stop/Reread tab

9. Select "Start Server"

10. Select the "Service/License File" tab and make sure that "Configuration using Services"is selected.

3.3.1.2 Linux

1. Unzip the downloaded package and run the command lmgrd -c /path/to/your/licensefiles

2. Start the license server in the directory where you unpacked the zip package. If this doesn’t work, make sure the lmgrd and BIOSOLVE files have executable rights set. If not, run the commands chmod 775 lmgrd and chmod 775 BIOSOLVE in the directory where you unpacked lmgrd and BIOSOLVE from the zip.

(In case you already have a running Acresso Flexnet Publisher (= FlexLM) infrastructure, you only have to add the file BIOSOLVE to the directory where your system administrator stores other FlexLM vendor daemons. Also, consult your system administrator to automatically start the license server every time your server machine is booted.)

3.3.2 Configuring the License Information in FlexS

Finally, you need to let FlexS know where your license server is located.

You can specify the license server directly in the section @license_files of the file config_sp.dat by adding the line :@myserver (;@myserver on Windows) if your server is called myserver. Alternatively, you can set the environment variable BIOSOLVE_LICENSE_FILE to @myserver.

3.3.3 The License Format

For experts: Each line of a license file includes the name of the licensed tool or module, and, among other information, the version number and the expiration date of the license.

Example

INCREMENT FlexS_base BIOSOLVE 2.0 28-jun-2011 uncounted \
    HOSTID=COMPOSITE=803C749ABB01 SIGN="0078 AD79 1445 A900 \
    906E 00A4 AB51 F600 AFC8 4394 9356 3C87 EBD2 4A7B E88D"

INCREMENT FlexS_Special BIOSOLVE 2.0 28-jun-2011 uncounted \
    HOSTID=COMPOSITE=803C749ABB01 SIGN="0089 BE66 7456 A340 \
    816F 1145 A671 F666 EF48 5396 3456 4587 E452 4456 2887"

This license information should be saved on your system in a text file with the suffix .lic.
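To make the layout of such a license line more tangible, the following Python sketch reads the leading fields of an INCREMENT line (tool name, vendor, version, expiration date) after joining lines continued with a backslash. It is an illustration only and not part of FlexS or FlexLM; the function names are hypothetical, and the field positions are an assumption taken from the example above.

```python
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}


def join_continuations(text):
    """Merge physical lines ending with a backslash into logical lines."""
    logical, current = [], ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("\\"):
            current += line[:-1].rstrip() + " "
        else:
            current += line
            if current.strip():
                logical.append(current.strip())
            current = ""
    if current.strip():
        logical.append(current.strip())
    return logical


def parse_increment(line):
    """Extract the leading fields of one INCREMENT line.

    Assumed field order (from the example in the text):
    INCREMENT <feature> <vendor> <version> <expiry dd-mmm-yyyy> ...
    """
    fields = line.split()
    if len(fields) < 5 or fields[0] != "INCREMENT":
        raise ValueError("not an INCREMENT line: " + line)
    day, month, year = fields[4].split("-")
    return {
        "feature": fields[1],
        "vendor": fields[2],
        "version": fields[3],
        "expires": date(int(year), MONTHS[month.lower()], int(day)),
    }


SAMPLE = r"""
INCREMENT FlexS_base BIOSOLVE 2.0 28-jun-2011 uncounted \
    HOSTID=COMPOSITE=803C749ABB01 SIGN="0078 AD79 1445 A900 \
    906E 00A4 AB51 F600 AFC8 4394 9356 3C87 EBD2 4A7B E88D"
"""

if __name__ == "__main__":
    for entry in map(parse_increment, join_continuations(SAMPLE)):
        print(entry["feature"], entry["version"], entry["expires"].isoformat())
```

A check such as entry["expires"] < date.today() could then be used to spot expired keys before starting a long run; the license server remains the authority on validity.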

3.3.4 Entering the License File Name Directly in the FlexS Configuration File

Open the config_sp.dat file in a text editor. On the next line after the keyword @LICENSE_FILES enter the complete path and filename of the license file(s) containing the license keys (if there is more than one file, enter each path and filename on a new line). If you are working on Windows, make sure that this file remains a plain text file. Do not save it in any other format (for example rtf or doc) and make sure your text editor does not add the suffix .txt to the end of the name config_sp.dat. Also note that if a path or file name contains whitespace, you must enclose the entire path and name in quotes, for example:

Example

@LICENSE_FILES
"C:\Documents and Settings\user\My Licenses\biosolveit.lic"

3.3.5 Run the FlexLM license server

Please install the FlexLM license server as described in Section 3.3.1.

When using a license server, you must inform FlexS of the server’s name. To do this, you can specify the license server directly in the section @LICENSE_FILES of the file config_sp.dat by adding the line

Windows ;@myserver

Linux :@myserver

for a server called myserver. For Windows this might look like

Example

@LICENSE_FILES
;@myserver

Alternatively, you can set the environment variable BIOSOLVE_LICENSE_FILE to @myserver.
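The two conventions described in sections 3.3.4 and 3.3.5 (quoting paths that contain whitespace, and prefixing the server name with ; on Windows or : on Linux) can be summarized in a short Python sketch that renders an @LICENSE_FILES section. The helper names are hypothetical, and whether file entries and server entries may be mixed in one section is an assumption made only for this illustration.

```python
def quote_if_needed(path: str) -> str:
    """Enclose a path in double quotes if it contains whitespace."""
    return f'"{path}"' if any(ch.isspace() for ch in path) else path


def server_entry(server: str, windows: bool) -> str:
    """Build the server line: ;@name on Windows, :@name on Linux."""
    return (";" if windows else ":") + "@" + server


def license_section(files=(), servers=(), windows=False) -> str:
    """Render the lines of an @LICENSE_FILES section (illustrative only)."""
    lines = ["@LICENSE_FILES"]
    lines += [quote_if_needed(path) for path in files]
    lines += [server_entry(name, windows) for name in servers]
    return "\n".join(lines)


if __name__ == "__main__":
    # Windows machine pointing at a license server called myserver.
    print(license_section(servers=["myserver"], windows=True))
    print()
    # Windows machine using a local license file with spaces in its path.
    print(license_section(
        files=[r"C:\Documents and Settings\user\My Licenses\biosolveit.lic"],
        windows=True))
```

The output of such a helper would still have to be pasted into config_sp.dat with a plain text editor, as described above.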

3.3.6 BioSolveIT License Scheme for HP-UX ia64/SunOS/SGI-Irix Platforms

FlexS is no longer available for these operating systems. Please consult us [email protected] if this causes trouble for you.


II

USER GUIDE


4 Getting started: a tutorial introduction

4.1 Single ligand superposition

In this section we offer users new to the program an easy entry to the first standard procedures. Such an introduction can only give a rough overview of the program. Nevertheless, we hope it is helpful for a quick start with FlexS. We expect that you are familiar with basic operating system commands.

It is advisable to create a working directory for this tutorial so that the original data stays safe. For that reason, please copy all files from the directory example/Tutorial to the new one. You should now have:

config_sp.dat
1dhf_min_h.mol2
4dfr_kpl_h.mol2
4dfr_min_h.mol2

To keep this entry simple, some variables in the config_sp.dat used for this tutorial are set to the current directory ./. You will use different settings in config_sp.dat later during daily usage.

LIGAND ./
COMBILIB ./
PREDICT ./
HELP doc/
TEMP ./
SCRIPT ./

Nevertheless, as a first mandatory step, please edit config_sp.dat as follows: open config_sp.dat with your favorite text editor. The @ROOTDIR entry should point to your FlexS installation directory, for example:

...
# ROOT DIRECTORY
# --------------------------------------------------------------------------
# DESCRIPTION:
# FlexS needs the definition of some paths and files, for example where
# it should search for test ligands, reference ligands, etc. To make the
# definition of these paths easier, you can define a root directory.
# Each subsequent path definition starting with a directory name or
# ../ is then relative to this directory. Note that ./ is relative
# to the current directory.
# REMARKS:
# The root directory should be an absolute path.
#---------------------------------------------------------------------------
@ROOTDIR /home/user/BioSolveIt/Flexs

# --------------------------------------------------------------------------
@DIRECTORIES
# --------------------------------------------------------------------------
...

This is necessary because otherwise FlexS cannot load the files it requires at run time. For detailed information about the configuration options, see chapter Configuring FlexS (section 5.1) on page 47. The configuration file is now prepared. When FlexS starts, it should be ready for work, and the command prompt waits for your input.

...
>> OPTPAR = ’/home/user/BioSolveIt/Flexs/static_data/optpar.dat’ loaded.

Process time used: 0.17 s. Current process size: 40904 kB.
FLEXS>

For this tutorial, we will load two ligands, one to be the reference and the other to be the molecule that is tested for similarity. It is also possible to check multiple ligands stored in a ligand library, but this procedure is described later on.

So, our general procedure will look like this:

1. A general overview

2. The reference ligand

Load a (minimized) ligand as a mol2 file

3. The test ligand

Load the ligand that will be tested for similarity

Fragment the ligand and create a base fragment

Load the reference coordinates

4. Do the superposition:

Base placement

Extend ligand fragments

5. Virtual screening

Load and fragment a new test ligand

Load reference ligands as multi-mol2 file

Screening fragments

Now we are ready to begin the superposition.


4.1.1 A general overview and getting help

The introduction put you in a position to start FlexS. This section gives you a jump start for working with the program. To get an overview of the available commands and menus, press Return at the prompt; FlexS then shows a list of all commands which are available.

FLEXS>

>> Global: MAIN END QUIT !
           EDITCFG LIST SET SELOUTP HELP
           MANUAL ? EXEC TOFLEXV DISPLAY
           ERASE

>> SubMenus: DATABASE REF_LIG TEST_LIG SUPERPOS
             PVM OPTPARAM CLIB CSUPER

>> Commands: SCRIPT DELALL

FLEXS> database
FLEXS/DATABASE> end
FLEXS>

The list of commands is divided into sections. First comes a list of the global commands, which can be executed at any time while the program is running. Second, all submenus that can be selected from the current level are listed. You change into these menus to work on the ligands or to run calculations. The list of available commands at the end changes depending on the menu you are in. Some menus have submenus of their own, which are listed under SubMenus if any exist. If there are other menus at the same level, they are listed in a section called Parmenus. At the end of the Return output, all commands available in the current menu are listed. To change into a menu, type the name of the menu and press Return. To go up one level, type END.

To execute a command, type the name of the command followed by the required parameters, separated by spaces. It is also possible to type only the name of the command and press Return; FlexS will then ask for all necessary parameter values.

While working with FlexS, you may wonder what a command does or how to use it. For example, suppose we would like information about the command INFO in the superposition menu. Type HELP INFO in the superposition menu to find out what the command does and which parameters it requires.

FLEXS> superpos
FLEXS/SUPERPOS> help info

Outputting the most important quantities of a superposition result (INFO)
\ index I/O!table output \ index placement!lists

[Syntax:] INFO <TABLE FORMAT> <OUTPUT TABLE>
[Description:] Displays the main characteristics of the superposition
result, such as number of solutions, highest ranking score, etc. \ ,
on the screen. If <TABLE FORMAT> equals ’t’, the result is output in a
table, otherwise (’l’) the output is printed on one line. The single-
line option (’l’) is very useful to summarize a superposition run over
large data sets. All single lines start with reference ligand name and
test ligand name.

FLEXS/SUPERPOS>

There is another way to get quick help. Instead of the HELP command you can type a question mark ?. This gives you a description of the command in a single line.

FLEXS/SUPERPOS> ? info
>> List the most important quantities of a superposition result
FLEXS/SUPERPOS>

Now you are well prepared for doing a first calculation.

4.1.2 The reference ligand

For a simple superposition it is necessary to load a molecule as a reference. The reference ligands are managed in their own submenu. So, please change to the FlexS reference ligand submenu:

FLEXS> ref_lig
FLEXS/REF_LIG>

Next, we read in the reference ligand structure as a mol2 file. The mol2 format is an ASCII-based free format created by the company Tripos. It stores the 3D structure and is also rich in chemical information. A molecule entry is composed of different records. The record identifiers look like @<TRIPOS>ATOM, @<TRIPOS>BOND ...

# Name: 1dhf_kpl_h
# Creating user name: lemmen
# Creation time: Fri Apr 19 15:04:13 1996

# Modifying user name: lemmen
# Modification time: Fri Apr 19 15:05:14 1996

@<TRIPOS>MOLECULE
1dhf_kpl_h
51 53 1 0 0
SMALL
GASTEIGER

@<TRIPOS>ATOM
1 N1 4.0018 4.7879 -16.2937 N.2 1 <1> -0.2011
2 C2 4.7449 4.0485 -17.1299 C.2 1 <1> 0.1859
3 NA2 6.0218 4.1075 -17.2696 N.pl3 1 <1> -0.3307
4 N3 4.1223 3.0217 -17.8519 N.pl3 1 <1> -0.2529
5 C4 2.7913 2.7619 -17.8084 C.2 1 <1> 0.2719
6 OH4 2.3328 1.8207 -18.4759 O.2 1 <1> -0.2680
7 C4A 2.0033 3.6117 -16.8978 C.2 1 <1> 0.1654
8 N5 0.6673 3.3811 -16.7953 N.2 1 <1> -0.2448
...
@<TRIPOS>BOND
1 1 2 2
2 1 12 1
3 2 3 1
4 2 4 1
...

The lines beginning with a hash are comment lines, which are ignored by programs reading the file. They are not required by the file format. The most commonly used records are @<TRIPOS>MOLECULE, @<TRIPOS>ATOM, and @<TRIPOS>BOND.

The @<TRIPOS>MOLECULE record consists of up to six data lines. The first line is the name of the molecule; it need not be the same as the name of the mol2 file. The second line gives the number of atoms, bonds, substructures, features and sets associated with the molecule. The third line gives the type of the molecule, for example whether it is a small molecule, a saccharide or a protein. The fourth line contains the type of charges associated with the molecule. FlexS relies on the charges being set in the atom record. The fifth and sixth lines contain the internal SYBYL status bits and comments on the molecule. Neither is mandatory.

The @<TRIPOS>ATOM record is composed of data lines, each holding all information needed to rebuild one atom of the molecule. The first entry in the line is the atom ID, followed by the atom name. The next three entries are the Cartesian x, y, z coordinates. The next one is the SYBYL atom type of the atom. In the example above, the 1 and the <1> are the ID number and the name of the substructure containing the atom. The last entry in a data line is the charge associated with the atom.

Analogous to the @<TRIPOS>ATOM record, the @<TRIPOS>BOND record consists of data lines, each holding all information needed to rebuild one bond of the molecule. The first entry in each line is the bond ID. The second and third entries are the atom IDs of the atoms connected by this bond. The last entry is the bond type: 1 = single, 2 = double, 3 = triple and ar = aromatic.

Further record types exist for this file format. For more information visit http://www.tripos.com/custResources/mol2Files/.

When you load the reference ligand, you may get warnings or errors depending on your settings. Most of the warnings indicate that some data for calculating the conformation is not available, e.g. an empty list of torsion angles at a bond. If you get an error, for example because a required tool is not executable, it will have a significant influence on the result.

To read in a molecule, you must give the path of the ligand file. You can specify the path either relative to the directory defined by the variable LIGAND in config_sp.dat or as an absolute path to the file. In this tutorial we work in a single directory, so the variable is set to the current one and we do not give a path to the files we use.

FLEXS/REF_LIG> read 1dhf_min_h.mol2
>> Ligand ’1dhf_min_h’ loaded from file ’./1dhf_min_h.mol2’.
>> Initialization of molecule
>> Number of ring systems identified: 2
>> Number of ligand components: 11

Total number of interacting atoms : 16
Total number of interaction points : 88
Current process size: 42412 kB

FLEXS/REF_LIG> draw
FLEXS/REF_LIG> display

The DRAW command generates a drawing of the reference ligand in GDF format, but the drawing is not displayed automatically. To see the ligand in a graphical interface, use the DISPLAY command, which starts the external viewer. In the viewer window you can see the structure of the loaded reference ligand. For a short introduction to the FlexV viewer see section 4.1.2.1 below. With the commands SELGRA (section 6.5.7 on page 89) and SELCOL (section 6.5.8 on page 90) you can change the graphics presettings. To see the changes, type DRAW and DISPLAY again.

Figure 4.1: The FlexV main canvas and the main features of the GUI.

4.1.2.1 A short introduction to FlexV

FlexV is the default viewer of our program family. It is integrated in the program packages, so you do not need a separate license key for it. When you execute the DISPLAY command, FlexV starts and presents its main window. At the top of the window is the menu bar. Frequently used functions are available as a row of buttons between the menu bar and the scene window. Below the scene window, on the right, is the Quit button, which terminates FlexV. The main window of FlexV is shown in figure 4.1.

The Files menu in the menu bar contains all functions for handling files, such as loading or writing a GDF file. It also includes basic program commands.

The Panels menu controls the three sub-panels of FlexV: the Z-Control sub-panel for controlling zoom, Z-rotation, and Z-clipping; the Object control for switching graphic objects on and off; and the Render Options for changing rendering options.

The Pick Function menu contains functions to handle and identify the molecule in the scene window.

The View Control menu comprises all functions that change the view of the scene.

Clicking the buttons with the left mouse button has the following effects:

Diskette Loading GDF file

Camera Writing GDF file

Hand Opening/closing the object-control panel

Magnifier Opening/closing the Z-control panel


Icosahedron Opening/closing the render-options panel

Eye Look at/center all visible objects

Atom Opening the molecule-display-mode form

Pen Reset all object labels

Compasses Switch to previous pick function

Use the right mouse button to rotate your molecule inside the scene window and the left mouse button to put labels on selected atoms of the ligand.

To keep this tutorial short we give only a brief overview of FlexV. We refer the interested reader to the FlexV User Guide for more information about the many functions of FlexV.

The next step is loading the test ligand, the molecule that will be tested.

4.1.3 The test ligand

To load the molecule to be tested, change to the test ligand submenu. The test ligand is also read into the workspace as a mol2 file. In this tutorial we also load reference coordinates, to which the RMSD calculation after the superposition refers. The reference coordinates let us compare the coordinates of the aligned test ligand calculated by FlexS with previously calculated data, so you can judge how well FlexS reproduces a calculation. Loading reference coordinates is not mandatory. If we did not load them, the RMSD would refer to the best-superimposed placement of the test ligand. Please type:

FLEXS/REF_LIG> test_lig
FLEXS/TEST_LIG> read 4dfr_kpl_h.mol2
>> Ligand ’4dfr_kpl’ loaded from file ’/usr/flex/example/lig/dhfr/4dfr_kpl.mol2’.
>> Initialization of molecule
>> Number of ring systems identified: 2
>> WARNING: Difference of 6.501deg in bond angle comparison (tolerance 5.042deg):
   O1|27 -- CT|26 -- CA| 25    O2|28 -- CT|26 -- CA|25
>> WARNING: Difference of 6.510deg in bond angle comparison (tolerance 5.042deg):
   OE1|32 -- CD|31 -- CG| 30    OE2|33 -- CD|31 -- CG|30
>> Potential number of conformations: 1.881170e+08
>> Number of ligand components: 11
>> Preparing ligand 1,5-interaction lists
   N5(2) C6(2) C9(3) N10(2) CM(3) type 4
   C6(2) C9(3) N10(2) C14(2) C13(2) type 6
   C6(2) C9(3) N10(2) C14(2) C15(2) type 6
   C7(2) C6(2) C9(3) N10(2) CM(3) type 4
   C11(2) C(2) N(2) CA(3) CT(2) type 6
   C11(2) C(2) N(2) CA(3) CB(3) type 6
   C(2) N(2) CA(3) CB(3) CG(3) type 4
Found 7 chains of 1,5-repulsion atoms.
Total number of interacting atoms : 14
Total number of interaction points : 83
Current process size: 43024 kB

FLEXS/TEST_LIG> draw fix
FLEXS/TEST_LIG> display

The molecule is now shown in FlexV. The option fix in the DRAW command means that the coordinates are taken from the test ligand input file. You can change the graphics presets with the same commands as in the reference ligand submenu, SELGRA (see section 6.4.14 on page 81) and SELCOL (see section 6.4.15 on page 81).

To carry out structural superpositions the test molecule must be separated into fragments. The command SELBAS fragments the molecule and defines the base fragment.

FLEXS/TEST_LIG> selbas a f
>> Base fragment selection
>> Automatic base selection
----------------------------------------------------
>> Fragmentation 0
No.|Connect| Connected by bond    | Nof. | # IA level |Components
   |  to   |                      | Conf.|  3  2  1   |
-----------------------------------------------------------------------
  0|   -   | --                   |    72|  8  2  0   | 0, 1, 2,
  1|   0   | C14|19 -<06>- N10|14 |     1|  0  1  0   | 3,
  2|   1   | C|22 -<12>- C11|16   |     1|  1  1  0   | 4,
  3|   2   | N|24 -<03>- C|22     |     1|  1  0  0   | 5,
  4|   3   | CA|25 -<06>- N|24    |     1|  0  0  0   | 6,
  5|   4   | CT|26 -<06>- CA|25   |     1|  2  0  0   | 7,
  6|   4   | CB|29 -<04>- CA|25   |     1|  0  0  0   | 8,
  7|   6   | CG|30 -<07>- CB|29   |     1|  0  0  0   | 9,
  8|   7   | CD|31 -<12>- CG|30   |     1|  2  0  0   | 10,

>> Fragmentation 1
...
FLEXS/TEST_LIG>

The parameters a and f select automatic base selection for flexible fitting. For further information see section 6.4.6 on page 77. The reference coordinates are also read in as a mol2 file. Here we load the test ligand in a minimized conformation so that the superimposed ligand can be compared with it.

FLEXS/TEST_LIG> readref 4dfr_min.mol2
>> Set reference coordinates by separate mol2 file
>> Ligand reference coordinates loaded from file ’/usr/flex/dhfr/4dfr_min.mol2’.

Current process size: 43188 kB
FLEXS/TEST_LIG>

Now we have all components to do the superpositioning.
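The RMSD against the reference coordinates mentioned in this section can be illustrated with a few lines of Python. This is a generic root-mean-square deviation over matched atom positions, written as a sketch; it assumes both coordinate lists contain the same atoms in the same order and says nothing about which atoms FlexS actually includes or whether it accounts for molecular symmetry. The function name rmsd and the sample coordinates are hypothetical.

```python
import math

# Generic RMSD sketch (assumption: both lists hold the same atoms in the
# same order; this is not taken from the FlexS implementation).

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical two-atom example: the placement is shifted by 1 A along z.
placed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(placed, reference))  # 1.0
```

A value of zero means that the placement coincides with the reference coordinates atom by atom.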

4.1.4 Superpositioning

To carry out the superposition, change to the superposition submenu. The first step in superposing the test ligand on the reference ligand is to calculate the placement of the base fragment on the reference ligand.

FLEXS/TEST_LIG> superpos
FLEXS/SUPERPOS> placebas 3
>> No triangle hash table, generate one ...
>> Building edge hash table
---------------------------------------------------------------------------
distance intervall: 0.800 - 9.000
number of interaction groups : 10
Type tree generated.
remaining dots: 0 (pairs: 3025) Number of dot pairs stored: 3112

>> Base placement
     |     | IA- | IA- |     |       | #Triangles/ | #after vector | #after preclash | #after     | #after final | #after | #after restriction
Frag | Alg | Lev | Atm | act | #Conf | #solutions  | test/delete   | test/delete     | clustering | clash test   | filter | of solution numbers
     |     |     |     |     |       |             | dup sol       | bad sol         |            |              |        | = FINAL
-----------------------------------------------------------------------------------------------------------------------------------------------
   0 |   3 |   0 |   6 |   6 |    68 |      132532 |        125664 |            9484 |       2001 |         1836 |   1836 |               1000
   1 |   3 |   0 |   5 |   4 |   105 |       41104 |         34703 |            7485 |       2001 |         1378 |   1378 |               1000
   2 |   3 |   0 |   3 |   3 |   144 |       36752 |         30652 |            9275 |       2003 |          979 |    979 |                979
   3 |   3 |   0 |   3 |   1 |    12 |           0 |             0 |               0 |          0 |            0 |      0 |                  0
   3 |  os |   0 |   0 |   0 |    12 |         522 |           258 |             246 |        246 |          240 |    240 |                240

>> Applying filters
Solutions discarded by filters : 0

>> Total number of base placements: 3219

Process time used: 34.57 s. Current process size: 45592 kB
FLEXS/SUPERPOS>
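The relation between the filter counts and the reported total can be checked by hand. The following Python sketch assumes that the last column of the table is simply the count after filtering, capped at 1000 per row, and that the total is the sum of the capped values; this reading is inferred from the numbers in the output above and is not a statement about the FlexS internals. The variable names are hypothetical.

```python
# Worked check of the base placement totals (numbers copied from the output
# above; the capping rule is an assumption inferred from those numbers).

LIMIT = 1000  # apparent per-row restriction of solution numbers

# (fragment, algorithm, number of solutions after the filter step)
after_filter = [
    ("0", "3", 1836),
    ("1", "3", 1378),
    ("2", "3", 979),
    ("3", "3", 0),
    ("3", "os", 240),
]

final_counts = [min(count, LIMIT) for _, _, count in after_filter]
total = sum(final_counts)

print(final_counts)  # [1000, 1000, 979, 0, 240]
print(total)         # 3219, the reported total number of base placements
```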

The parameter 3 selects the standard algorithm based on triangle hashing techniques. For further information see section 6.7.1 on page 94. The output table lists the number of triangles enumerated per fragment. This number is then reduced by vector tests, clash tests, and clustering. The final number of solutions per fragment is restricted to 1000. Now the base fragment is placed on the reference ligand. The next step is to build up the test ligand fragment by fragment. With the command used here you can choose whether all missing fragments of the test ligand are added or only selected fragments with particular chemical properties.

FLEXS/SUPERPOS> complex all
>> Complex Build-Up (k-greedy)
---------------------------------------------------------------------------
Energy threshold : 0.000 kJ/mol
Relative energy threshold : 20000000000000.000 kJ/mol
Number of solutions (k) : 500
---------------------------------------------------------------------------

>> Fragment 1
---------------------------------------------------------------------------
expanding solution 3219
# of complete placements : 0
# of expanded conformations : 13178
# of below overlapthreshold conformations: 7184
# of filtered placements : 0
# placements before clustering : 750
# resulting dock entries : 416
energy cutoff (k best solution) : -510.357
current confidence level : 1.0000
superpos entry with  | no. |est. S. |total S.| rms |#M | overlap (perc.)
----------------------------------------------------------------------
min. estimated score|    1| -551.86| -122.16| 4.39|  6| 73.70
min. score          |    1| -551.86| -122.16| 4.39|  6| 73.70
min. rms            |  371| -510.72|  -81.02| 2.35|  5| 72.00
no solution with rms < 1.5A

>> Fragment 2
...
FLEXS/SUPERPOS>


The parameter all completes the test ligand using all fragments. The superpositioning is now finished.

To see the results in a table use the LISTSOL command. For clarity it makes sense to limit the output. Type listsol 30 to get the best thirty solutions.

FLEXS/SUPERPOS> listsol 30
SELECTED SOLUTIONS: 1dhf_min_h -- 4dfr_kpl_h
+---+-------+-------+-------+-------+-----+-----+------+-----+------+-------+----+
|No.|Total  |Match- |Ovl.-  |Est.   |Norm.|RMSD-|Simil.|Match|Norm. |Ovl.   |Frag|
|   |Score  |Score  |Score  |Score  |Score|Value|Index |     |Volume|Volume |No. |
+---+-------+-------+-------+-------+-----+-----+------+-----+------+-------+----+
|  1|-795.77|-55.471|-740.30|-928.02|0.772|3.780| 2.491|   12|  0.81|1269.40|   2|
|  2|-795.73|-55.470|-740.25|-927.98|0.772|3.780| 2.491|   12|  0.81|1269.32|   2|
|  3|-742.27|-52.706|-689.56|-874.52|0.728|3.825| 2.575|   10|  0.74|1169.33|   2|
|  4|-742.22|-52.700|-689.51|-874.47|0.728|3.825| 2.574|   10|  0.74|1169.27|   2|
|  5|-725.81|-51.539|-674.27|-858.07|0.714|3.755| 2.507|   11|  0.73|1148.02|   2|
...
FLEXS/SUPERPOS>

The table is sorted by total score. All values in the RMSD column relate to the reference coordinates that we read in in the test ligand menu above. Without reference coordinates the RMSD values are still calculated, but relative to the placement with the best total score. In that case the first entry in the RMSD column would always be zero. For the meaning of each column see the LISTSOL command in section 6.7.12 on page 98 of this user guide.

Instead of showing the solutions of the alignment on screen, FlexS can write the output to a file. To do so use the command SELOUTP. It is a global command, which means it can be executed at any time. FlexS will ask you for a file name, whether to append to or overwrite the file if it already exists, and whether to merge automatically after parallel script execution. The last option applies to the PVM module (section 7.1) and has no effect in this tutorial. The file is written to the directory defined in the variable PREDICT in config_sp.dat and has the .log format.

FLEXS/SUPERPOS> seloutp
Query output target <’screen’ or filename> [screen] : listsol_output
Append to or overwrite file ’listsol_output’ <a,o>? [o] : o
Auto-merge in parallel script execution [y] : y

FLEXS/SUPERPOS>
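Once LISTSOL output has been redirected to a file, it can be post-processed outside FlexS. The following Python sketch assumes that the file contains the same pipe-delimited table that is shown on screen, with twelve columns in the order of the table above; the exact layout of the .log file is not specified in this tutorial, so treat this as an assumption. The function name parse_listsol and the column keys are hypothetical.

```python
# Sketch: read LISTSOL-style table rows (assumed pipe-delimited, 12 columns).
# The column keys below are invented names for the headings shown on screen.

COLUMNS = ["no", "total_score", "match_score", "ovl_score", "est_score",
           "norm_score", "rmsd", "simil_index", "match",
           "norm_volume", "ovl_volume", "frag_no"]


def parse_listsol(text):
    """Return one dict per solution row; all values are stored as floats."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # prompts, title line, and +---+ separator lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != len(COLUMNS) or not cells[0].isdigit():
            continue  # header rows carry no solution number
        rows.append(dict(zip(COLUMNS, (float(cell) for cell in cells))))
    return rows


# Three rows copied from the table above.
SAMPLE = """\
|No.|Total  |Match- |Ovl.-  |Est.   |Norm.|RMSD-|Simil.|Match|Norm. |Ovl.   |Frag|
|   |Score  |Score  |Score  |Score  |Score|Value|Index |     |Volume|Volume |No. |
|  1|-795.77|-55.471|-740.30|-928.02|0.772|3.780| 2.491|   12|  0.81|1269.40|   2|
|  3|-742.27|-52.706|-689.56|-874.52|0.728|3.825| 2.575|   10|  0.74|1169.33|   2|
|  5|-725.81|-51.539|-674.27|-858.07|0.714|3.755| 2.507|   11|  0.73|1148.02|   2|
"""

rows = parse_listsol(SAMPLE)
closest = min(rows, key=lambda row: row["rmsd"])
print(int(closest["no"]), closest["rmsd"])  # 5 3.755
```

Note that the solution with the best total score is not necessarily the one closest to the reference coordinates: among these three rows, solution 5 has the lowest RMSD.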

You can also visualise the superpositions using FlexV. The MDRAW command generates drawings of a selected set of placements:

FLEXS/SUPERPOS> mdraw 1-30
>> superpos entry 1 drawn to /home/user/Biosolve_it/bin_flexs/tmp/flexv_tmp_6.gdf
>> superpos entry 2 drawn to /home/user/Biosolve_it/bin_flexs/tmp/flexv_tmp_6.gdf
...
FLEXS/SUPERPOS> display
...
FLEXS/SUPERPOS>

Use the eye button to center the first entry of the table above. This is the best solution FlexS found with the parameters used. To see the other solutions click on the object-control button (see figure 4.1). With the slider in the window that opens (see figure 4.2) you can choose an entry.

Figure 4.2: The FlexV object control panel.

Finally, we show you how to store the calculated coordinates of the placements in a separate file. The computed coordinates belong to the test molecule, so the required command WRITE is located in the test ligand menu.

FLEXS/SUPERPOS> test_lig
FLEXS/TEST_LIG> write
>> Default is MOL2 format. Use explicit suffix .pdb / .mol if
   the ligand should be written in PDB or MOL format.
Filename (default directory: ./) [ligand] : testcoord
append to any existing file (otherwise overwrite) <y,n> [y] : n
create a multi-ligand file <y,n> [n] : y
Placement selection (o)pt , (f)ix , (r)ef , or placement (c)oords [c] :
Placement selection <1-552> : all

>> alignment entry: 1 written to testcoord
>> alignment entry: 2 written to testcoord
>> alignment entry: 3 written to testcoord
...

FlexS asks you for the required parameters. First you must give a file name. FlexS stores the file in the directory specified in the variable LIGAND in config_sp.dat. To store the file in another directory, give the path relative to the default one. If you create a multi-ligand file, all placements are saved in one file; otherwise FlexS generates a set of files whose names include the placement number. To save the placement coordinates, the parameter Placement selection must be set to c.
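If you later need the placements of a multi-ligand file as individual molecules, the file can be split at the molecule records. The following Python sketch assumes that a multi-ligand mol2 file is a plain concatenation of molecules, each starting with a @<TRIPOS>MOLECULE line; this is the usual mol2 convention, but it has not been verified here against files written by FlexS. The function name split_multi_mol2 and the sample content are hypothetical.

```python
# Sketch: split a multi-mol2 text into one block per molecule.
# Assumption: every molecule starts with a @<TRIPOS>MOLECULE record line.

def split_multi_mol2(text):
    """Return a list of mol2 blocks, one per molecule record."""
    marker = "@<TRIPOS>MOLECULE"
    blocks = []
    current = None
    for line in text.splitlines():
        if line.strip().upper().startswith(marker):
            current = [line]          # a new molecule begins
            blocks.append(current)
        elif current is not None:
            current.append(line)      # lines before the first record are dropped
    return ["\n".join(block) + "\n" for block in blocks]


# Hypothetical two-molecule file content, invented for this sketch.
SAMPLE = """\
# written by a hypothetical tool
@<TRIPOS>MOLECULE
placement_1
@<TRIPOS>ATOM
1 C1 0.0 0.0 0.0 C.3 1 <1> 0.0
@<TRIPOS>MOLECULE
placement_2
@<TRIPOS>ATOM
1 C1 1.0 0.0 0.0 C.3 1 <1> 0.0
"""

blocks = split_multi_mol2(SAMPLE)
names = [block.splitlines()[1] for block in blocks]
print(names)  # ['placement_1', 'placement_2']
```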

4.1.5 Virtual screening

Besides aligning two ligands you can also perform a fragment-based screening. In this case you select the base fragment(s) from a loaded test ligand and align these fragments against a ligand database, which lets you test many molecules in a single step. The database is read in as the reference ligand.

For an optimal run it can help to prepare FlexS for its next task. To do so use the command SET together with the name of the flag to be edited. To avoid unnecessary computational overhead you must set the flag <superposition_mode> to 2.

FLEXS> set superposition_mode
New flag value [ 0] : 2
>> Data may be inconsistent (RELOAD?).
FLEXS>

For additional information about the control flags in FlexS see section 5.1.5 on page 50 of this user guide.

The next step in the screening is to read in the ligands. In this example we use 1cbx as the test ligand. As described in the section above, the molecule must be separated into fragments. In this case, however, we choose the parameter s for screening purposes instead of f for flexible fitting.

FLEXS/TEST_LIG> read 1cbx_kpl.mol2

>> Ligand ’1cbx_kpl’ loaded from file ’/usr/local/flexs/tutorial/1cbx_kpl.mol2’.
>> Initialization of molecule
>> Number of ring systems identified: 1
>> WARNING: Difference of 6.600deg in bond angle comparison (tolerance 5.042deg):
   O1|12 -- C1|1 -- C2| 2    O2|13 -- C1|1 -- C2|2
>> Potential number of conformations: 1.555200e+04
>> Number of ligand components: 6
>> Preparing ligand 1,5-interaction lists
   C2(3) CA(3) CB(3) CG(2) CD1(2) type 4
   C2(3) CA(3) CB(3) CG(2) CD2(2) type 4
Found 2 chains of 1,5-repulsion atoms.
Total number of interacting atoms : 5
Total number of interaction points : 0
Current process size: 42228 kB

FLEXS/TEST_LIG> selbas a s
>> Base fragment selection
>> Automatic frag selection
----------------------------------------------------
>> Fragmentation 0
No.|Connect| Connected by bond    | Nof. | # IA level |Components
   |  to   |                      | Conf.|  3  2  1   |
-----------------------------------------------------------------------
  0|   -   | --                   |    36|  2  0  0   | 0, 1, 2,
  1|   0   | C|4 -<12>- CA|3      |     1|  2  0  0   | 3,
  2|   0   | CB|5 -<03>- CA|3     |     1|  0  0  0   | 4,
  3|   2   | CG|6 -<12>- CB|5     |     1|  0  1  0   | 5,

>> Fragmentation 1
No.|Connect| Connected by bond    | Nof. | # IA level |Components
   |  to   |                      | Conf.|  3  2  1   |
-----------------------------------------------------------------------
  0|   -   | --                   |    36|  2  0  0   | 2, 3, 4,
  1|   0   | C2|2 -<03>- CA|3     |     1|  0  0  0   | 1,
  2|   1   | C1|1 -<12>- C2|2     |     1|  2  0  0   | 0,
  3|   0   | CG|6 -<12>- CB|5     |     1|  0  1  0   | 5,

Current process size: 42248 kB
FLEXS/TEST_LIG>

As the reference ligand we use a multi-mol2 file as input. In practice you will use a multi-mol2 file exported from a database.

FLEXS/REF_LIG> read DBligands.mol2
>> List of molecules in DBligands.mol2
   1: 1cbx_min_h
   2: 2ctc_kpl_h
   3: 3cpa_kpl_h
   4: 6cpa_kpl_h
   5: 7cpa_kpl_h

Molecule(s) to read: <all,1,..,5> [all] : all
5 (of 5) molecules loaded from file /user/Flexs/tutorial/DBligands.mol2.
Process time used: 0.25 s. Current process size: 45500 kB

FLEXS/REF_LIG>

Now we are ready to do the superposition. Change to the superposition menu and execute the command SCREEN. You will be asked to select the fragments for screening and to set the resolution.

FLEXS/REF_LIG> superpos
FLEXS/SUPERPOS> screen

The base fragment of which fragmentation(s) do you want to use <0-1> :
Set resolution <2.000000,10.000000> [2.000000] :

>> new value set.
>> current number of laue vectors: 16

>> Ligand ’1cbx_kpl’ loaded from file ’/user/Flexs/tmp/rigfit_DB_in_1.mol2’.
>> Initialization of molecule
>> Number of ring systems identified: 0
>> Number of ligand components: 3
>> screening fragment no. 0 of fragmentation 0 against 5 reference ligands currently loaded

structures: | remaining:5 | 0%

>> Ligand ’1cbx_kpl’ loaded from file ’/user/Flexs/tmp/rigfit_DB_in_1.mol2’.
>> Initialization of molecule
>> Number of ring systems identified: 0
>> Number of ligand components: 4
>> screening fragment no. 0 of fragmentation 1 against 5 reference ligands currently loaded

structures: | remaining:5 | 0%

      Score        | ref-lig-name   Mol-Id sybyl name | test-lig-name Mol-Id sybyl name |
 frag 0    frag 1  |                                  |                                 |
-172.926  -155.925 | DBligands_0000 1cbx_kpl_h        | 1cbx_kpl_0000 1cbx_kpl          |
-161.354  -180.387 | DBligands_0001 2ctc_kpl_h        | 1cbx_kpl_0000 1cbx_kpl          |
-174.083  -182.252 | DBligands_0002 3cpa_kpl_h        | 1cbx_kpl_0000 1cbx_kpl          |
-167.932  -141.673 | DBligands_0003 6cpa_kpl_h        | 1cbx_kpl_0000 1cbx_kpl          |
-172.094  -166.476 | DBligands_0004 7cpa_kpl_h        | 1cbx_kpl_0000 1cbx_kpl          |

Process time used: 1.09 s.
FLEXS/SUPERPOS>

The output table is composed of one line for each reference ligand and one column for each screening fragment. Each table entry provides the optimum FlexS score of the respective placement of a screening fragment on top of a reference ligand.
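A screening table of this kind is typically used to rank the database ligands. The following Python sketch takes the scores from the output above and orders the reference ligands by their best fragment score. It assumes that a more negative score is better, in line with the LISTSOL table earlier in this tutorial, where the best solutions carry the most negative total scores; the variable names are hypothetical.

```python
# Sketch: rank reference ligands by their best screening fragment score.
# Assumption: more negative scores indicate better placements.

# reference ligand name -> (score of frag 0, score of frag 1), from the table
scores = {
    "1cbx_kpl_h": (-172.926, -155.925),
    "2ctc_kpl_h": (-161.354, -180.387),
    "3cpa_kpl_h": (-174.083, -182.252),
    "6cpa_kpl_h": (-167.932, -141.673),
    "7cpa_kpl_h": (-172.094, -166.476),
}

# Sort ascending by the better (lower) of the two fragment scores.
ranking = sorted(scores, key=lambda name: min(scores[name]))

for name in ranking:
    print(f"{name}  {min(scores[name]):9.3f}")
# 3cpa_kpl_h   -182.252
# 2ctc_kpl_h   -180.387
# 1cbx_kpl_h   -172.926
# 7cpa_kpl_h   -172.094
# 6cpa_kpl_h   -167.932
```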


Now you are prepared for efficient work. Thank you for doing the tutorial.

4.2 Superpositioning using combinatorial libraries

In the first section of this tutorial we superposed only one molecule on the reference ligand. For many practical applications, however, it is inefficient to align single molecules manually one after the other. For that reason FlexS offers superposition of combinatorial libraries. This application is called FlexSc and is an extension of FlexS, so check whether your FlexS program package already includes FlexSc. Start FlexS in the directory you created for the first section of this tutorial. If you skipped the first part of the tutorial, create a working directory and copy all files from the directory example/Tutorial to it, then edit the config_sp.dat file as described in the first part.

If FlexSc is available, [CSUPER] appears in the header of the started program.

user@linux:~/BioSolveIt/Flexs/working_directory> flexs
______________________________________________________________________________

                        F l e x S
Copyright               Prediction of Ligand Superposition

BioSolveIT GmbH         Version: 2.0.0 (06.05.10)
An der Ziegelei 79      Modules: [CORINA_F] [DECRYPT] [CSUPER]
53757 St. Augustin
Germany                 Authors: C.Lemmen, M.Rarey, T.Lengauer,
                                 C.Hiller, B.Kramer, M.Lilienthal,
                                 F.Sonnenburg, M.Zimmermann

www.biosolveit.de       Contact: [email protected]
______________________________________________________________________________

Additional copyright notes:
Software-basis: (C) 2001 by Fraunhofer Gesellschaft (FhI-SCAI)
Getline library: (C) 1993 by Chris Thewalt
PVM library: (C) 1997 by University of Tennessee, Knoxville TN
Torsion angle data: (C) by GMD SCAI, CCDC, BASF AG

>> Running on phi (Linux 2.6.25.20-0.7-pae) with 4 processors.
>> FlexS configuration file ’/home/Flexs/config_sp.dat’ loaded.
>> FlexS_base license check (BioSolveIT keys): succeeded.
>> Licensed modules: FlexS [CORINA_F] [DECRYPT] [CSUPER]
>> SETTINGS = ’/home/Flexs/static_data/flexs_settings.dat’ loaded.
>> CHEMPAR = ’/home/Flexs/static_data/chempar.dat’ loaded.
...
FLEXS>

If FlexSc is missing, download a new version of the FlexS program package or ask your administrator for FlexSc. An additional license key is necessary to use FlexSc. You are also welcome to send us an email for help ([email protected]).

Aligning a combinatorial library is similar to aligning a single molecule. We again have to read in a reference molecule and the molecule fragments to be tested. The to-do overview for this part of the tutorial therefore resembles the one in the first part:


1. The reference ligand

Load a ligand as a mol2 file

2. The combinatorial library

Load a core that will be tested for similarity

Load the additional R-groups of the core

Create a subset of the set of R-groups

3. Combinatorial alignment

Selection of molecules for base placement

Align the core instances on top of the reference ligand

Add all active instances of an R-group to the core placements

Now, we are ready to start.

4.2.1 The reference ligand

The procedure for loading the reference ligand is the same as for a single molecule. So we change to the reference ligand submenu and load the ligand.

FLEXS> ref_lig
FLEXS/REF_LIG> read 1dhf_min_h.mol2
>> Ligand ’1dhf_min_h’ loaded from file ’./1dhf_min_h.mol2’.
>> Initialization of molecule
>> Number of ring systems identified: 2
>> Number of ligand components: 11

Total number of interacting atoms : 16
Total number of interaction points : 88
Current process size: 42412 kB

FLEXS/REF_LIG>

Instead of the ref_lig command you can also use its short form rl to change to the submenu. See the section @ALIASES at the end of config_sp.dat for a list of available abbreviations.

If you want to visualise the molecule, use the commands draw and display as described in the reference ligand section of the single ligand superposition.

4.2.2 The combinatorial library

A combinatorial library as used in FlexSc consists of a core and up to nine additional R-groups. R-group number 0 is reserved for the core; the other R-groups are numbered from 1 to 9. Every R-group is stored in a single multi-mol2 file which contains the R-group instances. For additional information about R-groups and the ways to store and use them in files see section 7.2.1 of the user guide.

So first we change to the combinatorial library submenu and then read in the core and three different R-groups.

FLEXS> clib
FLEXS/CLIB> read clib/CORE_dhfr.mol2 R

1 (of 1) molecules loaded from file ./clib/CORE_dhfr.mol2.
R-groups found: R3 R4 R5
Molecule 0 : 24->R3 25->R4 26->R5

FLEXS/CLIB> read clib/R_3.mol2 R 3 X

23 (of 23) molecules loaded from file ./clib/R_3.mol2.
23 molecules loaded for R-group 3.
No additional R-groups found.

>> The type of link atom in the Parent-R-group 0 is set to: Du
   because the corresponding atom-type in R-group 3 is not unique.

>> The type of link atom in R-group 3 is set to : C.ar
   because the corresponding atom-type in Parent-R-group 0 is always ’C.ar’.
Current process size: 42676 kB

FLEXS/CLIB> read clib/R_4.mol2 R 4 X
...
FLEXS/CLIB> read clib/R_5.mol2 R 5 X
...
FLEXS/CLIB>

When reading in the core and the R-groups we prefix the file name with the path clib/ after the READ command. This path is relative to the directory set in @DIRECTORIES in config_sp.dat. If you want to load the combinatorial library files for this tutorial without specifying the path, change the COMBILIB entry from the current directory ./ to the subdirectory ./clib/. You can also enter the complete path to this directory.

The parameter R is the name of the connecting atom. Each instance must have the same connecting atom. Compared with the core, the R-group calls take two further parameters: the number (3, 4, or 5 in the calls above) is the R-group number, and X is the name of the X-atom. The loaded R-groups are identified by either of them.

After you execute the read command in the clib submenu for the first time, the combinatorial library has the open status. In this state the library is not yet ready for alignment calculations. You first have to finish the library load procedure with the CLOSE command. So remember this command and type:

FLEXS/CLIB> close
>> Closing combilib:
   Combilib is closed.
>> Initiating combilib data assignment
>> Combilib data assignment
   Step 1: Assignment for R-group 0 (1 mols):
...

>> Combilib data assignment finished, combilib ready for superposition.

Process time used: 0.27 s. Current process size: 43248 kB
FLEXS/CLIB>

You can check the status of your loaded library by executing the INFO command. The library now has the READY status.

FLEXS/CLIB> info
>> Combinatorial library :
Library load status : ready
# R-groups : 3
Core (R-group 0) : 1
   file: /home/BioSolveIt/FlexS/working_directory/clib/CORE_dhfr.mol2
R-group 3 located at core: 23
   file: /home/BioSolveIt/FlexS/working_directory/clib/R_3.mol2
R-group 4 located at core: 24
   file: /home/BioSolveIt/FlexS/working_directory/clib/R_4.mol2
R-group 5 located at core: 7
   file: /home/BioSolveIt/FlexS/working_directory/clib/R_5.mol2
Total number of instances : 55 molecules.
Total library size : 3864 molecules.

FLEXS/CLIB>
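The two totals reported by INFO follow directly from the instance counts. The Python sketch below reproduces them under the assumption that the number of instances is the sum over the core and all R-groups, and that the library size is the number of possible combinations, i.e. the product of the counts; the counts are copied from the output above and the variable names are hypothetical.

```python
from math import prod

# Instance counts per R-group as reported by INFO (R-group 0 is the core).
instances = {0: 1, 3: 23, 4: 24, 5: 7}

# Assumption: every molecule file entry counts once towards the instances.
total_instances = sum(instances.values())

# Assumption: every combination of one instance per R-group is one molecule.
library_size = prod(instances.values())

print(total_instances)  # 55
print(library_size)     # 3864
```

The product grows quickly: a fourth R-group with 20 instances would already raise the library size to 77280 molecules, which is why the SELECT command for restricting the active instances can be useful.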

Now we have all molecules needed for the superposition.

Before we change to the next submenu we show you another command, with which you can reduce the set of R-group instances to a subset without deleting the library and reloading it with another set of R-groups. The selected subset is used in all following calculations until a new subset is defined.

In this tutorial we use the SELECT command without reducing the set of R-group instances.

FLEXS/CLIB> select 3
Active molecules <0-22> : 0-22

FLEXS/CLIB>

Now, let us start the superposition.

4.2.3 Combinatorial alignment

The tools for combinatorial superposition are located in the csuper submenu. Before starting the alignment it makes sense to think about the expected output. During the tutorial we mostly view the output on screen. The other possibility is to write the output to a file. The output of the combinatorial alignment can be written in two different file formats: pdf and mol2. We showed you the mol2 format at the beginning of the tutorial. Do not confuse our pdf format with the Portable Document Format; the two are unrelated. The FlexS pdf format has record entries similar to those of the mol2 format and contains the whole set of placements. In this example we write to a pdf file. This requires setting the <STORE_PLACEMENT_MODE> flag to -1.

FLEXS> set store_placement_mode
New parameter value (integer) [0] : -1

>> Data may be inconsistent (RELOAD?).
FLEXS>

For a combinatorial alignment you must select the instances that are used for base placement. You can select the total number and the maximum number for the core and each R-group.

FLEXS/CLIB> csuper
FLEXS/CSUPER> selins % 1 0 0 0
>> Selection of molecules for base placement
Id | Name | Score
---------------------------------------------------
Core: 1 (all)
R-3 : 0
R-4 : 0
R-5 : 0

FLEXS/CSUPER>

The percent sign as the first parameter means that the maximum possible number of instances is selected in total. The next step is quite similar to the superposition described above. The command PLACEC comprises the selection of base fragments, the placement of base fragments, and the incremental construction. The same modes can be used as for single ligand superposition, see section 6.7.1 on page 94. Here we use the triangle algorithm again.

FLEXS/CSUPER> placec 3
>> No triangle hash table, generate one ...
>> Building edge hash table
----------------------------------------------------------------------
distance intervall: 0.800 - 9.000
number of interaction groups : 10
Type tree generated.
remaining dots: 0 (pairs: 3025) Number of dot pairs stored: 3112

>> Core molecule 0 : r68
Base selection : 1 fragment(s)
Base placement : 1000 placements
Complex build up : 546 placements

>> Total number of core placements: 546
Process time used: 24.57 s. Current process size: 45328 kB

FLEXS/CSUPER>

Use the command LISTP to see the calculated solutions. The first parameter of the command selects the mode, which depends on the calculation; c is needed for core placements. The other parameters are the number of placements, whether only active molecules are listed, and the sort criterion: 2 sorts by energy, 1 sorts by instance number.

FLEXS/CSUPER> listp c % y 2
>> Core | # Pl | Placement scores
----------------------------------------------------------------------
      0 |  546 | -370.748 -369.369 -368.542 -366.991 -364.572

FLEXS/CSUPER>

In the table you see the core instance number, the total number of core placements, and the five best placement scores.

In daily usage it is often necessary to store the core placements. Use the WRITEC command for this. The command requires the id of the core to be saved and a file name. The file is written in a FlexS pdf format. The default directory is specified in the entry predict in config_sp.dat.

FLEXS/CSUPER> writec
Core molecule id <0,0> [0] :
Placement base pdf-filename [csuper] : first_core_placement

FLEXS/CSUPER>

Now let us, analogous to the single molecule superposition, build up possible ligands by adding the active instances of an R-group to the core placements. The EXTENDR command can handle only one R-group per execution, so its first parameter selects the R-group to be used by its number. The second parameter lets you store the solutions in a file; in that case a file name has to be given.

FLEXS/CSUPER> extendr 3 y core_x
>> Mol-No || Core | R-3 | #Pl | Score | Time | Molecule name

1 || 0 | 0 | 274 | -400.833 | 0.94s | ./core_x_C-000_R3-000
2 || 0 | 1 | 348 | -404.777 | 1.45s | ./core_x_C-000_R3-001
3 || 0 | 2 | 251 | -500.516 | 5.03s | ./core_x_C-000_R3-002
...
22 || 0 | 21 | 217 | -576.475 | 4.98s | ./core_x_C-000_R3-021
23 || 0 | 22 | 410 | -370.748 | 0.56s | ./core_x_C-000_R3-022

Process time used: 56.83 s. Current process size: 45896 kB
FLEXS/CSUPER>
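If the screen output of EXTENDR has been captured in a file, the table can be post-processed outside FlexS. The following Python sketch is not part of FlexS; it assumes the row layout shown in the transcript above (fields separated by || and |) and assumes that a more negative score is the better one, as suggested by the energy-sorted listings in this tutorial. The function and key names are hypothetical.

```python
from typing import Dict, List

# Rows copied from the EXTENDR transcript above (the elided rows are omitted).
SAMPLE = """\
>> Mol-No || Core | R-3 | #Pl | Score | Time | Molecule name
1 || 0 | 0 | 274 | -400.833 | 0.94s | ./core_x_C-000_R3-000
2 || 0 | 1 | 348 | -404.777 | 1.45s | ./core_x_C-000_R3-001
3 || 0 | 2 | 251 | -500.516 | 5.03s | ./core_x_C-000_R3-002
22 || 0 | 21 | 217 | -576.475 | 4.98s | ./core_x_C-000_R3-021
23 || 0 | 22 | 410 | -370.748 | 0.56s | ./core_x_C-000_R3-022
"""


def parse_extendr_rows(text: str) -> List[Dict[str, object]]:
    """Turn the data rows of an EXTENDR table into dictionaries."""
    rows = []
    for line in text.splitlines():
        # Skip the header line (starts with '>>') and anything without '||'.
        if "||" not in line or line.lstrip().startswith(">>"):
            continue
        mol_no, rest = line.split("||", 1)
        core, r_inst, n_pl, score, _time, name = [f.strip() for f in rest.split("|")]
        rows.append({
            "mol_no": int(mol_no),
            "core": int(core),
            "r_instance": int(r_inst),
            "placements": int(n_pl),
            "score": float(score),
            "name": name,
        })
    return rows


rows = parse_extendr_rows(SAMPLE)
# Assumption: the most negative score is the best one.
best = min(rows, key=lambda r: r["score"])
print(best["mol_no"], best["score"], best["name"])
```

Within FlexS itself, the LISTP command remains the intended way to inspect the sorted placements.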

To see the solutions on screen we use the LISTP command again, but now with s as the first parameter. The table output is sorted by energy.

FLEXS/CSUPER> listp
>> Calculated placements:

c= cores (generated by PLACEC)
s= core plus single R-group (generated by PLACEC + EXTENDR)
Placement type selection (c,s,m,a) [c] : s
Core molecule list <0-0> [0] :
R-Group list <1-9> [3] : 3
List active molecules only [y] : y
Number of placements <1,5> [5] : 5
Sort criterium (1= Core, 2= R-Group, 3= Molecule, 4= Energy) <1,4> [4]

>> Core | R-Grp | Molec | # Pl | Placement scores
----------------------------------------------------------------------
0 | 3 | 5 | 215 | -584.406 -581.111 -576.398 -573.719
0 | 3 | 21 | 217 | -576.475 -569.612 -569.081 -565.688
...
0 | 3 | 22 | 410 | -370.748 -369.369 -368.542 -364.572

FLEXS/CSUPER>

The combinatorial library module has many more capabilities than described so far. Nevertheless, you should now have a useful idea of how to work with combinatorial libraries. You will find a comprehensive description of all available commands in section 7.2.3 on page 134.

Thank you for doing this tutorial.


5 Working with FlexS

5.1 Configuring FlexS

When FlexS is started, it first tries to read the config_sp.dat file from the current directory. If the file is not present, FlexS checks whether an environment variable FLEXS_HOME is defined. If so, FlexS searches for the config_sp.dat file in the directory specified by FLEXS_HOME.

The config_sp.dat file allows you to define an individual working environment for FlexS by assigning values to FlexS-specific environment variables. There are environment variables for directories, the names of the static data files, external programs and FlexS program flags. The ~-character can be used as an abbreviation for home directories in filenames provided that $HOME is defined. The config_sp.dat file is divided into several records, each of which assigns values to a group of environment variables. These are described below in separate sections. A complete sample configuration file is given in Appendix A.
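The lookup order for config_sp.dat described at the start of this section can be sketched in a few lines of Python. This is only an illustration of the documented order (current directory first, then FLEXS_HOME), not FlexS code; the function name find_config is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

CONFIG_NAME = "config_sp.dat"


def find_config(cwd: Path, env: Mapping[str, str]) -> Optional[Path]:
    """Return the configuration file selected by the documented lookup order."""
    # 1. The current (startup) directory is tried first.
    local = cwd / CONFIG_NAME
    if local.is_file():
        return local
    # 2. Otherwise the directory named by FLEXS_HOME is tried, if it is defined.
    home = env.get("FLEXS_HOME")
    if home:
        candidate = Path(home) / CONFIG_NAME
        if candidate.is_file():
            return candidate
    # Nothing found; what FlexS does in this case is not covered by this sketch.
    return None


if __name__ == "__main__":
    print(find_config(Path.cwd(), os.environ))
```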

5.1.1 @ROOTDIR: Defining the root directory

In this record you specify a directory, relative to which the directories and static data files (specified in the following two records) are located. All paths specified later in the file are relative to this path except those starting with / or ./.

5.1.2 @DIRECTORIES: Defining directory paths

Here is a complete list of the directory environment variables used by FlexS:

LIGAND contains the path for the directory where FlexS looks for ligand files (MOL or MOL2 format) and writes any modified ligand files (MOL or MOL2 format). This directory is for read and write access. For format descriptions, see [19, 1].

COMBILIB contains the path for the directory where FlexS looks for combinatorial library input files. This path is only used if the CSUPER module (FlexSc) is active.

HELP contains the path for the directory where FlexS looks for the ASCII files for the online help system. This directory is for read access only.

SCRIPT contains the path for the directory where FlexS looks for batch files.

PREDICT contains the path for the directory where FlexS writes the test ligand output files.

TEMP contains the path for the directory where FlexS reads and writes temporary files.


PVM_TEMP contains the path for the directory where FlexS-PVM reads and writes temporary files, like the batch file for a PVM run. If PVM_TEMP is not set, it defaults to the value of TEMP.

5.1.3 @STATIC_DATA: Defining paths to static data files

A set of so-called static data files is loaded by FlexS during start-up. The names of these files are defined in the following environment variables. Each environment variable contains the path and filename of one static data file. Details of the files and their formats can be found in section 10.

Please note: Some of your static data files may be encrypted for legal or other reasons. Under certain circumstances you are able to decrypt the respective sections. Please refer to section 6.6.4 on page 94 for how to accomplish this.

In the following we use 'static data file xyz.dat' to refer to the actual value of the environment variable XYZ. At the same time these are the names of the corresponding sample files. For example, with 'static data file contype_sp.dat' we actually mean the value of the environment variable CONTYPE. contype_sp.dat is the value of the environment variable CONTYPE in the sample configuration file (config_sp.dat) distributed with the package, and you will find a file named contype_sp.dat in the static_data directory. This naming convention will be used throughout this guide. In principle, you are free to name the respective files as you like.

SETTINGS Program parameters

All program parameters controlling FlexS are defined here.

CHEMPAR General chemical parameters

In chempar, global chemical parameters like van der Waals radii are specified.

GEOMETRY Interaction geometries

Here a library of different interaction geometries is defined. Interaction geometries are parts of spherical surfaces on which the interaction center of the countergroup must be placed. Interaction geometries will be combined with interaction types defined in the contype_sp file.

FCHARGES Formal charges for ligand fragments

The static data file fcharges defines (formal) charges for molecular fragments. These charges are mapped onto the ligand if the flag <ASSIGN_FORMAL_CHARGES> is set to 1.

The rules in transform.dat are able to set formal charges as well. Whether the rules in fcharges.dat or transform.dat are used depends on the flag <USE_TL_TRANSFORMS> (for test ligands) and <USE_RL_TRANSFORMS> (for reference ligands).

DELOC Changing localized into delocalized systems

The static data file delocd defines small fragments which are automatically converted from localized to delocalized systems and vice versa if the flag ASSIGN_DELOCALIZED is set to 1.


Please be aware that the above mechanism triggers the traditional (de-)localization procedure. From version 2.0 there is an alternative route to delocalized systems, namely by applying the rules from transform.dat. It depends on the setting of the flag <USE_TL_TRANSFORMS> whether DELOC or TRANSFORM is used.

CONTACT Interacting groups in the test ligand

Small molecular fragments which can form one of the interactions (from contype_sp) with a specific geometry (from geometry_sp) are defined in the static data file contact.

TORSION Torsion angle patterns for rotatable single bonds

This file contains mainly the torsion angle database of the conformation generation program MIMUMBA in our own subgraph file format. torsion_standard.dat contains the torsion angle database of the old MIMUMBA model and torsion_fine.dat contains the torsion angle database of the new MIMUMBA model without preselection of angles (see controlling flag TORSION_MODE in subsection 5.1.5).

CONTYPE Interaction types and compatibilities

The static data file contype_sp contains the definition of the interaction types handled by FlexS. The interaction types are divided into groups and countergroups to specify which interactions may occur.

GAUSSIAN Gaussian representation

In gaussian, the Gaussian representations of physico-chemical properties based on molecular fragments are defined. It is possible to exclude atoms from being represented and in this way to get a sparse representation of the molecule by the Gaussians.

OPTPAR Default optimization and scoring parameters

This file contains the parameters for the different scoring contributions, the optimization parameters and stop criteria for flexible superposition/postoptimization, and the parameters of RigFit.

GRAPHIC Default setting for the graphic commands

graphic contains the default setting for the graphic commands in FlexS.

TRANSFORM Transformation of loaded molecules

The static data file transform.dat contains a set of (mostly SMARTS™-based) rules. These are applied during the loading of ligands. In a highly configurable way, they correct misassignments, determine/correct atom types, check aromaticity and much more. Please refer to sections 10.1 and 10.13 for more details, and to the file itself.

5.1.4 @PROGRAMS: Defining paths to external programs

The following environment variables define the location of external programs called by FlexS. The variables must contain the path and the program name itself (see Appendix A for an example):

EDITOR contains the call for the editor you want to work with (some FlexS commands are for editing FlexS files 'online').


PDF_VIEWER contains your PDF viewer application (PDF: Adobe Portable Document Format).

FLEXS contains the path and filename of the FlexS executable. This is only needed for installing FlexS and for the PVM module.

FLEXV contains the path and name of the FlexV executable.

RCGENERATOR contains the path and name of the ring conformation generator as well as the list of parameters. If CORINA is used, the list of parameters must contain the driver option -d flexx. For additional parameters, see the CORINA program description.

3DGENERATOR contains the path and name of a program to clean up molecule structures or to read SMILES files. If CORINA (cf. page 205) is used, the list of parameters must contain the option -i t=sdf -o t=sdf.

CONFGENERATOR contains the path and name of an external conformation generator for the command ALIGN (see 6.8.2 and section 6.8.1.5 for more details).

PCGENERATOR contains the path and name of an external partial charge generator. The input and output molecule files have to be in MOL2 format. The generator should not change the order of the molecule's atoms.

5.1.5 @FLAGS: Defining control flags

The output of FlexS and the choice of some computational methods can be controlled by the following environment variables, which are used as flags. These variables can be assigned numeric values, where '0' means 'false' or 'off' and '1' means 'true' or 'on'. The values of the flags can be changed on the fly during a FlexS run (see LIST and SET commands), except the so-called runtime-constant flags. These are indicated with a †.

VERBOSITY This is an integer value (greater than or equal to 0) rather than a flag. It determines how 'verbose' FlexS is:

Level  Meaning
0      Silent, only warnings and error messages and direct results
1      User messages, displaying the program flow
2      Important statistical output
3      Runtime information (if the output is directed into a file, the
       verbosity level should be less than 3)
5      Tables for interactions, conformations, etc.
9      All except control output of read and write commands
10     All

Error messages and warnings are output independently of the verbosity level on standard error (stderr).

USER_MODE† Depending on the mode, menus and commands are enabled or disabled. This makes FlexS easier to use for beginners. Available values are:

Mode  Meaning
0     Off, complete superposition menu not present
1     Basic, only essential menus and commands are enabled
2     Standard, basic plus additional commands that are more complicated
      to use
3     Advanced, standard plus additional menus and commands mainly used
      for analysing ligands and results
4     Special, all menus and commands are enabled

PRINT_TIMES If equals 1, the processor time used to execute a command is printed after each command.

PRINT_SIZE If equals 1, the current process size is printed after each command.

SIZE_LIMIT After each (non-global) command, FlexS checks its own memory requirement. If this is higher than <SIZE_LIMIT> and <SIZE_LIMIT> is greater than zero, FlexS automatically terminates. <SIZE_LIMIT> is measured in kilobytes. Memory checking is currently only available under the Solaris OS.

RIGID_TORSIONS
0: Flexible bonds are treated as flexible.
1: Flexible bonds will "snap" into the closest torsion angle on a 30 degree grid.
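As a small illustration of what "snapping" to a 30 degree grid means, the following Python sketch rounds a torsion angle to the nearest grid point. It is not taken from FlexS: the function name, the handling of exact midpoints (rounded upwards) and the reporting of angles in the range 0 to 360 degrees are assumptions made for this example.

```python
import math


def snap_torsion(angle_deg: float, step: float = 30.0) -> float:
    """Return the grid angle closest to angle_deg, reported in [0, 360)."""
    # floor(x + 0.5) rounds to the nearest integer and sends exact
    # midpoints upwards (an arbitrary choice made for this sketch).
    snapped = math.floor(angle_deg / step + 0.5) * step
    return snapped % 360.0


for angle in (44.0, 46.0, 350.0, -20.0):
    print(angle, "->", snap_torsion(angle))
```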

RING_MODE If equals 0, rings will be considered to be rigid; consequently, the loaded conformation will be taken into account. If equals 1, ring conformations are computed by CORINA, and if equals 2, conformations are computed by CONFORT. Finally, if equals 3, the conformations are computed by the built-in CORINA, and if equals 4, conformations are computed by MOE. The default value is 3.

TORSION_MODE† This flag controls how FlexS handles torsional degrees of freedom. There are a few models, the choice of which depends on this setting and an associated data file. For the meaning of the respective settings, please have a closer look at Fig. 5.1. Please bear in mind that there is an interdependency of TORSION_MODE and the file to which the value of the TORSION variable points: if the variable TORSION points to static_data/torsion_standard.dat, the flag TORSION_MODE must be 0. Otherwise, if it points to static_data/torsion_fine.dat, the flag TORSION_MODE can in principle be either 0, 1 or 2. It will usually make most sense though if it is set to 1 (see below). If TORSION_MODE equals 2, the parameter TORSION_MAX_ENERGY (see section 10.4.3) has no effect on any ligand loaded.

Since this is a bit complicated, let us illustrate the situation by means of a figure (Fig. 5.1).

• At setting TORSION_MODE=0, the file torsion_standard.dat or torsion_fine.dat determines all torsional energies. If multiple subgraphs match, the respective energies are "merged", i.e., the best energy below the threshold TORSION_MAX_ENERGY results in a test point (a point which is set during ligand build-up).

• If TORSION_MODE equals 1, the file torsion_fine.dat is applied; multiple matching energies are added up and averaged, basically forming the basis for a histogram to which the MIMUMBA torsional model is subsequently applied.

• If TORSION_MODE equals 2, a 10 degree grid is applied instead of the MIMUMBA model, forming the basis for the test points.

If no subgraphs match at all, then the 'fallback solution' is applied, corresponding to the execution of algorithms requiring SECONDARY_TORSION_MODE to be set (see below).

[Figure 5.1 shows, for each value of TORSION_MODE, the file to which the value of TORSION has to point and the meaning of the setting, together with plots of energy against torsion angle in which the resulting test point coordinates are marked with x:

  0  torsion_standard.dat OR torsion_fine.dat: the file is taken 'as is'; grid points are test points; energetically favoured points below the threshold TORSION_MAX_ENERGY are chosen.
  1  torsion_fine.dat: if multiple subgraphs match, energies are added up and averaged. Then an interpolation is performed, and minima and plateaus are test points according to the MIMUMBA model.
  2  torsion_fine.dat: summing up and averaging as above, then a 10° grid is generated; points on the grid are test points.

  If no test points are found: enter SECONDARY_TORSION_MODE.]

Figure 5.1: The effects and meanings of the parameters TORSION_MODE and TORSION_MAX_ENERGY.

SECONDARY_TORSION_MODE This parameter controls the program behavior in cases for which it was not possible to create any test points with the standard procedures (controlled by TORSION_MODE, see above). If equals 0, a default grid is applied for torsional energies (see above). If equals 1 and no torsion angles for a rotatable bond are defined in the torsion database, the torsional potential is calculated from a force field (see Figure 5.2).

[Figure 5.2 shows the two values of SECONDARY_TORSION_MODE, together with plots of energy against torsion angle in which the resulting test points are marked with x:

  0  A default grid at 30° width is applied.
  1  The shape of the torsional energy is sampled with selected terms from the Sybyl force field; a threshold (TORSION_MINIMA_CUTOFF) is applied; all points on a 5° grid which are below the threshold are test points (x).]

Figure 5.2: The effects and meanings of the parameters SECONDARY_TORSION_MODE and TORSION_MINIMA_CUTOFF.
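The selection of test points for SECONDARY_TORSION_MODE equal to 1 (all points on a 5° grid whose energy lies below a threshold) can be illustrated with a short Python sketch. It is only an illustration: the energy function toy_energy is an invented three-fold cosine potential and not the Sybyl force field term used by FlexS, the function names are hypothetical, and whether FlexS interprets TORSION_MINIMA_CUTOFF as an absolute value or relative to the energy minimum is not stated here, so an absolute cutoff is assumed.

```python
import math
from typing import Callable, List


def toy_energy(angle_deg: float) -> float:
    """Invented three-fold torsional potential with minima at 0, 120 and 240 degrees."""
    return 1.0 - math.cos(math.radians(3.0 * angle_deg))


def grid_test_points(energy: Callable[[float], float],
                     cutoff: float,
                     step: int = 5) -> List[int]:
    """Return all grid angles (in degrees) whose energy is below the cutoff."""
    return [angle for angle in range(0, 360, step) if energy(angle) < cutoff]


# Assumption: the cutoff is compared with the absolute energy value.
points = grid_test_points(toy_energy, cutoff=0.3)
print(points)
```

With this toy potential the selected points cluster around the three minima, which corresponds to the groups of x marks in the figure.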


SUPERPOSITION_MODE

Mode  Meaning
0     Flexible ligand superpositioning according to the FlexS method
1     Rigid-body ligand superpositioning according to the RigFit method
2     Database screening with RigFit

KEEP_RCGEN_FILES If equals 1, the normally temporary files for communication with the ring conformation generator are not deleted.

MOL_NAME Defines the molecule name which is written to output mol files:

Mode  Meaning
1     Mol2 molecule name (from input file)
2     Mol2 molecule name + infile number (multi-mol file) or
      mol2 molecule name + solution number (placement solution)
3     Output filename + infile number (multi-mol file) or
      output filename + solution number (placement solution)

OPTIMIZE Mode of the local flexible postoptimization of placements (see section 6.7.4):

Mode  Meaning
0     Use the similarity (overlap) index (see 6.9.1) as goal function
1     Use the superpos-energy as goal function

KEEP_3D_GEN_FILES If equals 1, all files of the 3DGENERATOR program are kept and their filenames are output on the screen. Otherwise the in/out/trace files are deleted.

SDF_MOL_ID_TYPE Determines the field from which the molecule ID in an SDF file is taken:

Mode  Meaning
0     First line of header block
1     Property line with the name given by SDF_MOL_ID_STRING
2     Property line starting/ending with <ID.. / <id.. / ..ID> / ..id>
3     Take the SDF_MOL_ID_NUMth field in the data section

SDF_MOL_ID_NUM Determines the field from which the molecule ID in an SDF file is taken, if SDF_MOL_ID_TYPE is set to 3. If equals x, the molecule ID is read from the xth property line of type '> <..>'.
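To make the SDF ID settings more concrete, here is a minimal Python sketch that extracts a molecule ID from one SDF record for the modes 0, 1 and 3 described above. It is not the FlexS parser: the function names are hypothetical, the counting of property lines from 1 is an assumption, and mode 2 is left out because its matching rule is only outlined in the table.

```python
from typing import List, Tuple

# A minimal SDF record used only for this example.
RECORD = "\n".join([
    "aspirin",
    "  example",
    "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",
    "M  END",
    "> <ID>",
    "MOL-17",
    "",
    "> <SOURCE>",
    "vendor",
    "",
    "$$$$",
])


def data_items(lines: List[str]) -> List[Tuple[str, str]]:
    """Collect (name, first value line) pairs from property lines of type '> <..>'."""
    items = []
    for i, line in enumerate(lines):
        if not line.startswith(">"):
            continue
        start = line.find("<")
        end = line.find(">", start + 1)
        if start == -1 or end == -1:
            continue
        value = lines[i + 1].strip() if i + 1 < len(lines) else ""
        items.append((line[start + 1:end], value))
    return items


def mol_id(record: str, id_type: int = 0, id_string: str = "ID", id_num: int = 1) -> str:
    """Return the molecule ID for SDF_MOL_ID_TYPE 0, 1 or 3."""
    lines = record.splitlines()
    if id_type == 0:
        return lines[0].strip()          # first line of the header block
    items = data_items(lines)
    if id_type == 1:
        for name, value in items:        # property named like SDF_MOL_ID_STRING
            if name == id_string:
                return value
        raise KeyError(id_string)
    if id_type == 3:
        return items[id_num - 1][1]      # assumption: property lines counted from 1
    raise ValueError("mode not covered by this sketch")


print(mol_id(RECORD, 0), mol_id(RECORD, 1), mol_id(RECORD, 3, id_num=2))
```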

USE_PVM_FEATURE This enables or disables the use of the Parallel Virtual Machine. If equals 1, the execution of a parallel batch file is available. Otherwise it is not, and batch scripts will be executed sequentially. For more details, please refer to section 7.1.4.

USE_CONF_GEN This enables or disables the use of a conformation generator for the command ALIGN (see 6.8.2).

Level  Meaning
0      The conformation generator is switched off
1      Using the internal conformation generator ([6]) to produce various
       conformations (excluding the given conformation of the template)
2      Using the internal conformation generator to produce various
       conformations (including the given conformation of the template)
3      Using an external conformation generator to load various
       conformations (excluding the given conformation of the template)
4      Using an external conformation generator to load various
       conformations (including the given conformation of the template)

For more details see section 6.8.1.5. In order to use an external conformation generator, the generator must be defined in the configuration file config_sp.dat. For more details see CONFGENERATOR in section 5.1.4.

EXT_CONF_GEN_FORMAT The flag defines the input format for the external conformation generator. If the flag is set to '0', the input for the generator is the name of the template molecule. This case is used if a database with various conformations for molecules with unique names is used as external generator. If the flag is set to '1', the input is a MOL2 file with the current conformation of the template molecule. In both cases FlexS expects a (multi-) MOL2 file with various conformations of the template molecule, which will be loaded.

ASSIGN_EXT_PARTIAL_CHARGE This enables or disables the use of an external partial charge generator for the molecule initialization. In order to use an external partial charge generator, the generator must be defined in the configuration file config_sp.dat. For more details see PCGENERATOR in section 5.1.4.

USE_TL_TRANSFORMS New initialization routine in FlexS. If equals 1, the rules defined in transform.dat are applied upon loading a test ligand in the order defined in the @TRANSFORM section of config_sp.dat. The transform mechanism replaces and extends functionalities controlled by the flags TEST_ATOM_TYPES, ASSIGN_FORMAL_CHARGES, ASSIGN_DELOCALIZED and ADD_HYDROGENS. These are kept only for backwards compatibility and will be removed in one of the later versions of FlexS. For a comparison of the new and old initialization routines, please also refer to Table 5.1.

USE_RL_TRANSFORMS If equals 1, the rules defined in transform.dat are applied upon loading a reference ligand in the order defined in the @TRANSFORM section of config_sp.dat. Otherwise only the assignment of formal charges according to fcharges.dat is used.

5.1.5.1 Hidden constant flags

The flags described in this section are outdated since FlexS 2.0. You should no longer use these flags; they will be removed completely in one of the next versions!

Nevertheless, these parameters are still available in FlexS as HIDDEN CONSTANT flags, i.e., they are still known by FlexS, but they are no longer part of the default config_sp.dat. Instead, the default value is used for all of these flags. However, you can still change the setting by adding the respective flag to config_sp.dat. These parameters can no longer be changed using the SET command.

Rules used by the two initialization schemes:

USE_TL_TRANSFORMS = 0 (old initialization scheme): formal charges: fcharges.dat; delocalization: delocalized.dat; protonation and atom type assignments: hard coded
USE_TL_TRANSFORMS = 1 (new initialization scheme): all functions: transform.dat

Effect(s) on the init process:

Flag name             | Value | USE_TL_TRANSFORMS = 0                                 | USE_TL_TRANSFORMS = 1
----------------------------------------------------------------------------------------------------------------------------
-                     | 0     | -                                                     | pre-process (level 2), aromaticity (level 3)
ADD_HYDROGENS         | 0     | no effects                                            | no effects from level 4 (protonation)
ASSIGN_FORMAL_CHARGES | 0     | no effects                                            | no effects from level 5 (formal charges)
ASSIGN_DELOCALIZED    | 0     | no effects                                            | no effects from level 6 (delocalization)
TEST_ATOM_TYPES       | 0     | no effects                                            | no effects from level 10 (atom types)
-                     | 1     | -                                                     | pre-process (level 2), aromaticity (level 3)
ADD_HYDROGENS         | 1     | protons added according to old protonation scheme    | 4. protonation according to transform.dat
ASSIGN_FORMAL_CHARGES | 1     | charges added according to fcharges.dat              | 5. formal charges according to transform.dat
ASSIGN_DELOCALIZED    | 1     | delocalization according to delocalized.dat          | 6. delocalization according to transform.dat
TEST_ATOM_TYPES       | 1     | atom types only checked, not corrected               | 10. atom types checked according to transform.dat
TEST_LIG/SELINIT      |       | overrides setting in config.dat                       | overrides setting in config.dat

Table 5.1: Overview of new ligand initialization and manipulation functions: In Release 2 of FlexS you can decide between two ligand initialization routines: You can use the old one as you know it by activating ADD_HYDROGENS, ASSIGN_FORMAL_CHARGES, ASSIGN_DELOCALIZED, and TEST_ATOM_TYPES in the config_sp.dat (not recommended). If you are sure that you want the old routine, you have to make sure that USE_TL_TRANSFORMS in config_sp.dat is set to 0. If USE_TL_TRANSFORMS is set to 1 (default), the new initialization routine as specified in transform.dat will be used. New compared to pre-Release 2 versions, this incorporates a pre-processor that finds common mistakes in molecule files and localizes the structures for further processing (2.), and an aromaticity assignment routine (3.). You can also use the command SELINIT in the TEST_LIG menu to set the initialization flags, which will override all flags specified in the config_sp.dat.

This is the old documentation for these parameters:

ASSIGN_FORMAL_CHARGES If equals 1, the ligand formal charges are defined automatically by applying the fcharges static data file. Otherwise, the charge entries in the input file are used.

ASSIGN_DELOCALIZED If equals 1, the ligand formal charges are automatically delocalized in order to increase intermolecular symmetry. For example, the preferred representation of a carboxylate group consists of two oxygens attached via single bonds to an sp2 carbon, both oxygens having a formal charge of -0.5. FlexS automatically detects localized systems as defined in the static data file delocalized.dat and converts them into a delocalized form.

TEST_ATOM_TYPES If equals 1, FlexS checks the Sybyl atom types contained in a ligand mol2 file and compares them to its internal standards. Warnings are output if differences occur.

ADD_HYDROGENS If equals 1, missing hydrogens are automatically added to the ligand (protonates acids, keeps amines deprotonated).

5.1.6 @ID_STRINGS: Defining control strings

The following optional environment string is for parsing of SD files (SDF) only. The values cannot be changed at runtime of FlexS!

SDF_MOL_ID_STRING The field names that determine the molecule ID can be defined here. If this is not defined, "ID" is assumed.

5.1.7 @PARALLEL: Defining a Parallel Virtual Machine

FlexS is able to execute scripts in parallel. The parallelization is based on the PVM library [17].

During startup, FlexS searches for the PVM (Parallel Virtual Machine) daemon and outputs a message indicating whether the daemon was found or not. If the daemon is not found or if errors occur during initialization of PVM, FlexS executes scripts sequentially. Otherwise, the first outermost loop of the current script is executed in parallel (see section 7.1 for further information on parallel script execution).

With the @PARALLEL section, a list of hosts for parallel execution can be defined. Each line contains the definition of one host. The line consists of the host name, the maximum number of FlexS processes for this host, and optionally a nice value for processes on this host. A low nice value gives the parallel script execution on this host a high priority; a high nice value gives it a low priority.
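The structure of a host line (host name, maximum number of processes, optional nice value) can be illustrated with a short Python sketch. This is not FlexS code: whitespace as field separator is an assumption, the host names are invented, and lines starting with # are simply skipped here although FlexS also uses such lines for directives.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Host:
    name: str
    max_processes: int
    nice: Optional[int] = None   # optional third field


def parse_hosts(lines: Iterable[str]) -> List[Host]:
    """Parse host definitions of the form '<host> <max processes> [<nice>]'."""
    hosts = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # empty lines and comment lines
        parts = line.split()              # assumption: whitespace-separated
        if len(parts) not in (2, 3):
            raise ValueError(f"unexpected host line: {raw!r}")
        nice = int(parts[2]) if len(parts) == 3 else None
        hosts.append(Host(parts[0], int(parts[1]), nice))
    return hosts


# Hypothetical host names, for illustration only.
example = ["node01 4", "node02 2 10", "", "# spare machine"]
for host in parse_hosts(example):
    print(host)
```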

5.1.8 @ALIASES: Defining aliases for commands

Aliases are short names for commands and submenu names (see menus and commands section). Such aliases can be defined in this record. You can choose several aliases for each command.

Commands and their arguments can be joined to form one alias. For example, "SET|verbosity VERB" defines a new alias "VERB" for the command SET. It will be executed as "SET verbosity", employing "verbosity" as the first argument to SET.
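How such an alias definition expands can be sketched in Python. The sketch follows the single example given above ("SET|verbosity VERB") and is not the FlexS implementation; the function names are hypothetical, and it is assumed that the alias name is the last whitespace-separated token and that alias lookup ignores upper and lower case.

```python
from typing import Dict, Tuple


def parse_alias(definition: str) -> Tuple[str, str]:
    """Split 'COMMAND|arg ALIAS' into (alias name, expanded command text)."""
    expansion, alias = definition.rsplit(None, 1)   # alias is the last token
    return alias.upper(), expansion.replace("|", " ")


def expand(line: str, aliases: Dict[str, str]) -> str:
    """Replace a leading alias by its expansion and keep the remaining arguments."""
    head, _, rest = line.strip().partition(" ")
    target = aliases.get(head.upper())
    if target is None:
        return line.strip()                         # not an alias, unchanged
    return f"{target} {rest}".strip()


aliases = dict([parse_alias("SET|verbosity VERB")])
print(expand("VERB 3", aliases))    # SET verbosity 3
print(expand("listp c", aliases))   # listp c
```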

5.1.9 General remarks

A complete example of a configuration file is given in Appendix A. Lines beginning with a #-character are comment lines or directives. Comment lines and empty lines are skipped. The root directory specified in the @ROOTDIR record should be an absolute path. All paths specified in the @DIRECTORIES, @STATIC_DATA or @PROGRAMS records are relative to the specified root directory unless they begin with a /-character or ./-characters. In the example (Appendix A), the GAUSSIAN static data file will be read from the file static_data/gaussian.dat, the ligand input files will be read from directory example/lig, and the temporary files will be written to the tmp directory.

If FlexS is available for different hardware platforms, you can define hardware-specific parts in the config file in a #system xx #end_system block. Currently, we distinguish between the 'Linux', 'Linuxx86_64', 'Windows' and the 'Darwin' (MacOSX) operating systems. Version information can be added, for example 'Linux_6.2' (see example in Appendix A).
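The path rule stated above (paths are relative to the root directory unless they begin with / or ./) can be written down as a short Python sketch. It only illustrates the rule and is not FlexS code; the function name resolve and the root directory /opt/flexs are hypothetical, and the ~ abbreviation for home directories is not handled.

```python
import posixpath


def resolve(root: str, path: str) -> str:
    """Apply the root directory to a path from the configuration file."""
    # Paths starting with '/' or './' are used as given.
    if path.startswith("/") or path.startswith("./"):
        return path
    # All other paths are taken relative to the @ROOTDIR entry.
    return posixpath.join(root, path)


root = "/opt/flexs"   # hypothetical @ROOTDIR value
for entry in ("static_data/gaussian.dat", "example/lig", "/scratch/tmp", "./tmp"):
    print(entry, "->", resolve(root, entry))
```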

5.2 Starting FlexS

5.2.1 Interactive mode

To start FlexS in interactive mode you must enter flexs from the operating system shell. You are then transferred to the FlexS shell, i.e. you will see the FlexS prompt on the screen:

FLEXS>

Historically, command line options and filenames are linked with a colon, for example -l:<logfile>. Because the filename extension mechanism does not work with this syntax, FlexS also allows blanks as a separator, i.e. -l <logfile>.
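The equivalence of the two notations can be illustrated with a small Python sketch that brings both forms into the same (option, value) representation. It is not the FlexS argument parser; the function name is hypothetical, and the set of options that take a value is inferred from the subsections of this chapter.

```python
from typing import List, Optional, Tuple

# Options that are followed by a value according to sections 5.2.2 to 5.2.9.
VALUE_OPTIONS = {"-a", "-b", "-c", "-d", "-l", "-n"}


def normalize(argv: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Accept both '-l:<logfile>' and '-l <logfile>' and return (option, value) pairs."""
    result = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        opt, sep, value = arg.partition(":")
        if sep and opt in VALUE_OPTIONS:
            result.append((opt, value))            # colon notation
        elif arg in VALUE_OPTIONS and i + 1 < len(argv):
            result.append((arg, argv[i + 1]))      # blank as separator
            i += 1
        else:
            result.append((arg, None))             # option without value, e.g. -h
        i += 1
    return result


print(normalize(["-l:session", "-b", "screen"]))
```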

5.2.2 Arguments for batch processing (-a)

If FlexS is started in batch mode (see -b, below), you can define an argument string for the batch program. The format of the argument string is explained in section 10.3.

5.2.3 Batch mode (-b)

For users experienced with batch files (see file formats section 10.3) it may sometimes be desirable to start FlexS in batch mode. One advantage of this mode is that you can redirect the screen output of FlexS into a file. To start FlexS in batch mode, type flexs -b:<filename>. FlexS will then execute the batch file filename.bat. If FlexS is started with the -b option, it never waits for a keypress and terminates whenever an error occurs.


5.2.4 Specifying an alternative configuration file (-c)

When started, FlexS normally tries to read config_sp.dat from the current (startup) directory or from the directory specified by the environment variable FLEXS_HOME. It is possible to tell FlexS to use another configuration file. To do this, start FlexS by typing flexs -c:<filename>, and FlexS will then use filename.dat as its configuration file.

5.2.5 Specifying the execution directory (-d)

In order to execute FlexS in an alternative directory, FlexS can be called with option -d:<execute dir>.

5.2.6 Help for command line options (-h, ?)

Type flexs -h to get a short help text about the command line options.

5.2.7 Output the processor id or system ID (-i)

Type flexs -i to output the processor or system ID of the machine it is running on.

5.2.8 Logging the FlexS session (-l)

If FlexS is started with the -l:<logfile> option, all commands executed are written with their parameters into a log file named logfile stored in the current directory.

5.2.9 Nice value (-n)

The FlexS session can be started with a specific nice value given after the -n option. The nice value affects how favorably the process is scheduled. The common range is at least from -20 (resulting in the most favorable scheduling) through 19 (the least favorable). The nice value is only useful on Linux.

5.2.10 Redirecting output (-o, -om)

By default, FlexS sends all text output to stdout and error messages to stderr. Starting FlexS with flexs -o:<outputfile> causes text output to be redirected to outputfile and the error messages to be redirected to outputfile.err. The output of stdout and stderr can be merged using the parameter -om instead of -o.

5.2.11 RIF alignment option (-ri)

If FlexS is started with the option -ri <rif file> only one command is executed: ALIGN (see section 6.8.2). The command line option -ri is equivalent to the following batch script:

RIF
ALIGN <rif file> 0.0 y n 0
END


The optimization parameter set which is used during the command execution is specified in the static data file optpar.dat (see section 10.21.4).
If the command line option -ro is additionally used, the results are written to the specified file.

5.2.12 RIF alignment output option (-ro)

The command line option -ro <output file> can only be used together with the option -ri <rif file>. If this option is given, the results are written to <output file>.

5.2.13 Interface options (-s)

The option -s is an interface option that controls the behavior of FlexS when it is invoked by calling programs. It should therefore not be used as a command line option.

5.2.14 Version information (-v)

Type flexs -v to get detailed information about the FlexS version you are using.
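As an illustration of how the options of this section combine, the following Python sketch assembles a flexs command line using the colon syntax. The function name and the file names are hypothetical. The sketch covers only -b, -c, -l, -o and -om, and it assumes that -om takes its file name in the same way as -o.

```python
def flexs_argv(batch=None, config=None, logfile=None,
               output=None, merge_streams=False):
    """Assemble an argument vector for the flexs executable."""
    argv = ["flexs"]
    if batch is not None:
        argv.append(f"-b:{batch}")      # batch mode, section 5.2.3
    if config is not None:
        argv.append(f"-c:{config}")     # alternative configuration file
    if logfile is not None:
        argv.append(f"-l:{logfile}")    # session log
    if output is not None:
        flag = "-om" if merge_streams else "-o"
        argv.append(f"{flag}:{output}") # redirect (and optionally merge) output
    return argv


# Hypothetical file names, for illustration only.
print(" ".join(flexs_argv(batch="screen_run", output="run.out",
                          merge_streams=True)))
```

Such a list could be handed to a process launcher; whether a given combination of options is meaningful still has to be checked against the descriptions above.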

5.3 The FlexS shell

When you see the FlexS prompt on the screen, you can work with the FlexS shell. The FlexS shell is menu-driven, and the menus are hierarchically organized in a tree structure. In each menu you have specific valid commands (called menu commands). You can execute these commands by typing their names. Entering a submenu's name brings you to the submenu; entering END returns you to the parent menu. You can also directly go to a menu available in a parent menu by typing its name. The FlexS prompt will always reflect the name and location of the current menu. There are some commands which are valid for all menus. These are called global commands.

You can get a list of all global commands, menu commands and submenu names which are valid in a given menu by pressing the RETURN key after the prompt.

Since FlexS internally converts command and menu names to uppercase letters, you can type the commands and menu names in lowercase letters or uppercase letters (or both, if you want). This is not true for the command arguments, which are explained now.

Many commands take a list of arguments. You may type this list directly after the command name, with whitespace separating single arguments. If there is an argument which contains whitespace itself, it must be enclosed in double quotes. Otherwise this single argument would be interpreted as multiple arguments. If the number of arguments you specify in the argument list directly after the command name is less than the number of expected arguments, then you will be prompted to supply the missing arguments. For some commands (e.g. QUERY), some arguments may be an empty string. This type of argument is represented by "" (two double quotes) in the argument list.

Typing a ^-character when an argument is required cancels execution of the current command.

You can obtain a short online help text about commands and submenus by typing ? followed by a topic. Topics are all currently valid command and submenu names.


A full description of all commands and menus can be found in the following section. Typing help followed by a topic displays the text segment about the topic from this User Guide on the screen.
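The argument rules of the FlexS shell (case-insensitive command names, whitespace-separated arguments, double quotes around arguments containing whitespace, and "" for an empty argument) can be mimicked with Python's standard shlex module. This is only a sketch of the described behavior, not the FlexS parser. Note that shlex additionally interprets single quotes and backslashes, which this manual does not describe. The function name and the sample arguments are hypothetical.

```python
import shlex


def parse_shell_line(line: str):
    """Split a FlexS shell input line into (COMMAND, [arguments])."""
    tokens = shlex.split(line)            # honours "double quoted" arguments
    if not tokens:
        return None, []                   # bare RETURN: nothing to execute
    return tokens[0].upper(), tokens[1:]  # command names are case-insensitive


print(parse_shell_line('read "my ligand.mol2" 1'))
print(parse_shell_line('query "" x'))
```

Arguments keep their case, in line with the remark that only command and menu names are converted to uppercase.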

5.4 Errors and warnings

FlexS divides atypical situations into three categories: warnings, errors, and fatal errors. A warning is issued if the situation can be handled by FlexS but will probably have a significant influence on the result. In the case of an error, something went wrong and data is not available. The current command will be aborted in most cases. A fatal error is produced in cases where FlexS has to terminate immediately.

5.5 Preparing the input data

In this section we explain how the input data for FlexS should be prepared.

5.5.1 The test ligand

The test ligand must be given in the SYBYL MOL2 format. Since Release 2.0 MDL SDF and MDL MOL are accepted as well. For a description of these file formats we refer to the SYBYL manuals [19] and MDL manual [18].

For an overview of the default settings and the intercorrelation of the important configuration flags of the configuration file in this context, please refer to Table 10.1.

5.5.1.1 Atom and bond types

Correct atom and bond types are important for FlexS because they are used to map physico-chemical information like torsion angles and interaction groups onto the test ligand. If your test ligand file has been converted automatically from another file format, check the types carefully. The following hints may help you to assign the correct types:

• FlexS makes no distinction between *.ar and *.2 atom types. In principle, *.ar types should only be used in aromatic ring systems. In contrast, there is a big difference between bond types ar and 2. The bond type ar should only be used in aromatic ring systems.

• Setting the right type for nitrogen atoms is the most difficult part of type assignment. The atom type N.am and the bond type am should only be used in amide groups. FlexS distinguishes between N.3 and N.pl3 atoms. In contrast to N.pl3, N.3 atoms have a lone pair and can accept hydrogen bonds. In heterocycles, the types N.ar/N.2 and N.pl3 define where the hydrogens will be attached later and thus where hydrogen bond acceptors and donors lie.

5.5.1.2 Atomic charges

FlexS expects partial charges at the molecules. These charges are used to compute overlap volumes of the respective Gaussians and to determine the hydrophobicity of an atom. In addition FlexS assigns formal charges to the atoms which are used to discriminate between charged and non-charged hydrogen bonds and to detect interaction partners for salt bridges. Here are some examples, often occurring in organic compounds:

-CO2       -0.5   (each oxygen)
-NH3       +1.0   (nitrogen)
-C(NH2)2   +0.5   (each nitrogen)
-SO3       -0.33  (each oxygen)
-PO2-      -0.5   (each oxygen)
-PO3       -0.66  (each oxygen)

FlexS has an extended mechanism for automatic preparation of molecules, based on SMARTS(TM) and SMILES notation. The transformation level 5 is able to convert localized definitions into delocalized ones. This setting is highly recommended if automatic structure generation is used.

If the automatic assignment of formal charges is not used, the charges from the input file are used as the formal charges. In addition, two commands, SETC and READC, allow you to set charges in the formal charges entry and to read charges from a separate mol2 file, respectively.

5.5.1.3 The remaining steps

The test ligand molecule should contain hydrogens. If the atom types are assigned correctly, this step can be performed automatically by SYBYL.

Because bond lengths and angles in the test ligand molecule are taken from the input structure, the molecule should be energy minimized. Non-minimized structures can cause geometry errors in FlexS.

5.5.1.4 Fixing parts of the test ligand structure

In some applications it makes sense to fix a specific torsion angle or ring conformation. This can be done with the @<TRIPOS>SET RTI (Record Type Indicator) in the mol2 input file. For more general applications, such as constraining all amides to planarity, please refer to the sections on subgraph specification and torsional data specification (sections 10.10 and 10.10.1).

Syntax: @<TRIPOS>SET
<set_name> <set_type> <obj_type> <sub_type> <status> <comment>
<num_members> <member> <member> . . .

The first data line following the @<TRIPOS>SET line consists of several parameters including the set name, set type, object type, the set subtype, status and a user comment. The second data line contains the number of set members followed by a list of set members. For more details on these refer to the SYBYL documentation [19]. To fix a torsion angle or ring at the conformation in the input file, these parameters must have the following values.

set_name The set name must read FIXTORSION or FIXRING. The name can contain more characters after these keywords in order to distinguish different sets.


set_type The set type must be STATIC.

obj_type The object type must be BONDS for FIXTORSION or ATOMS for FIXRING.

sub_type The subtype of the set should be <user>.

status The status must be ****.

comment The comment can be any comment from the user.

num_members The number of set members.

member For a FIXTORSION each member is the bond ID of the bond which corresponds to the torsion angle. For a FIXRING each member is the atom ID of an atom contained in the ring system.

The following example fixes the torsion angles at bonds 5 and 7 and the ring conformation of the ring system containing atom 12:

Example

@<TRIPOS>SET
FIXTORSION STATIC BONDS <user> **** fix torsions at bonds 5 and 7
2 5 7
FIXRING STATIC ATOMS <user> **** fix ring containing atom 12
1 12

Note: It is still possible to fix torsion angles and ring conformations using the old @<TRIPOS>COMMENT RTI method.
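When many ligands have to be prepared, the set records described in this section can be generated by a script. The following Python sketch writes FIXTORSION and FIXRING records in the layout shown in the example. The function names are hypothetical, and the resulting text still has to be placed into the mol2 file by the user.

```python
def fix_set_record(kind, members, comment=""):
    """Return the two data lines of a FIXTORSION or FIXRING set."""
    # FIXTORSION refers to bond IDs, FIXRING to atom IDs.
    obj_type = {"FIXTORSION": "BONDS", "FIXRING": "ATOMS"}[kind]
    header = f"{kind} STATIC {obj_type} <user> **** {comment}".rstrip()
    data = " ".join(str(m) for m in [len(members), *members])
    return f"{header}\n{data}"


def tripos_set_block(records):
    """Join several set records under one @<TRIPOS>SET indicator."""
    return "\n".join(["@<TRIPOS>SET", *records])


print(tripos_set_block([
    fix_set_record("FIXTORSION", [5, 7], "fix torsions at bonds 5 and 7"),
    fix_set_record("FIXRING", [12], "fix ring containing atom 12"),
]))
```

Run as written, the sketch reproduces the example block of this section.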

5.5.1.5 Preparing a reference coordinate file

You can compare the relative orientations predicted by FlexS on the fly with a reference position of the test ligand on top of the reference ligand. This can be useful for testing the predictive power of FlexS on specific reference ligand molecules or for comparing the predicted binding modes with those generated manually.

The reference structure file must be in SYBYL MOL2 format and can be read with the READREF command. The numbering scheme of the atoms as well as the atom types must be identical to those in the previously read test ligand input file.

Alternatively, a reference structure can be assigned with the MAPREF command. In this case the assignment is done via subgraph matching such that the atom numbering is of less importance. The SYBYL atom types must be identical between the reference and the input molecule in order to find a hit. Bond type comparison is optional. In the case where more than one hit of the reference structure is found, the arbitrary first mapping is used. In the case where the reference structure is used for base selection or placement, multiple mappings are evaluated.

5.5.2 The reference ligand

The reference ligand must be given in the SYBYL MOL2 format. Since Release 2.0 MDL SDF and MDL MOL are accepted as well. For a description of these file formats we refer to the SYBYL manuals [19] and MDL manual [18].


For the reference ligand a rigid structure similar to the receptor-bound conformation is assumed. Regarding atom types, bond types and partial charges the reference ligand is handled equivalently to the test ligand. Most features like READ, WRITE, DELETE, etc. are equivalent to the test ligand menu.


6 Menus and commands

6.1 Menus

This is the hierarchical menu structure of FlexS:

FLEXS---+--DATABASE
        |
        +--REF_LIG
        |
        +--TEST_LIG----+--CONFORM
        |
        +--SUPERPOS----+--EVALUATE
        |
        +--RIF
        |
        +--OPTPARAM
        |
        +--PVM (module PVM)
        |
        +--[CLIB] (module CombiLib)
        |
        +--[CSUPER] (module CombiLib)

Typing the submenu name brings you to the submenu, typing END returns you to the parent menu. You can type commands and menu names in uppercase or lowercase letters.

6.2 Global commands

In contrast to menu commands, global commands are available in every menu. As with menu names, no distinction is made between uppercase and lowercase letters.

6.2.1 Quitting FlexS (QUIT)

Syntax: QUIT
Description: Quits FlexS after clearing memory.

6.2.2 Returning to the main menu (MAIN)

Syntax: MAIN
Description: Returns to the main menu.

6.2.3 Returning to the parent menu (END)

Syntax: END
Description: Returns to the parent menu. Has the same effect as QUIT when entered in the main menu.

6.2.4 Online help (HELP)

Syntax: HELP <topic>
Description: Displays the description of the specified <topic> on the screen. Possible topics are all valid commands or menus available in the current menu. The text is taken directly from this LaTeX source by a simple parser, therefore the text might not be formatted perfectly in every case.

6.2.5 Viewing the User Guide (MANUAL)

Syntax: MANUAL
Description: Displays the User Guide for FlexS. Starts your local PDF viewer with flexs_ug.pdf (PDF: Adobe Portable Document Format). The viewer application is adjustable in config_sp.dat, entry PDF_VIEWER in section @PROGRAMS. The default viewer is acroread, which is available for free at http://www.adobe.com/products/acrobat/readstep2.html.

6.2.6 Short online help (?)

Syntax: ? <topic>
Description: Displays a one line (very short) help text about the specified <topic> on the screen. Possible topics are all valid commands or menus available in the current menu.

6.2.7 Editing the configuration file (EDITCFG)

Syntax: EDITCFG
Description: Calls the editor defined in config_sp.dat with the config_sp.dat file, which may then be edited. If any paths to static data files were changed, a (cascading) reload of the databases is performed and the user is reminded that the reference ligand and/or test ligand data may be inconsistent. It is up to the user to check whether the reference ligand/test ligand must be reloaded too. If a runtime-constant flag changes its value, FlexS is in an inconsistent state. Restart FlexS as soon as possible in this case.

6.2.8 Listing environment variable settings (LIST)

Syntax: LIST <topic>
Description: Displays a list of environment variables and their current values on the screen. <topic> can be

flg   Program flags, see section 5.1.5. Flags which must be constant during an application are marked.
dir   Program directories (paths), see section 5.1.2
exe   Executables called by FlexS, see section 5.1.4
db    Static data files, see section 5.1.3
par   Program parameters, see section 10.5
mol   Filenames of currently loaded files
all   All of the above.

6.2.9 Changing values of environment variables (SET)

Syntax: SET <variable name> <value>
Description: Sets the variable <variable name> to value <value>. Valid strings for <variable name> are the names of environment variables for non-constant flags, directories and program parameters, see sections 5.1.5, 5.1.2, 10.5.

Example

SET verbosity 3
SET ligand /home/goofy/new_drugs/

6.2.10 Selecting the output destination for superposition results (SELOUTP)

Syntax: SELOUTP <destination> [<append>] [<pvm merge>]
Description: Directs the output generated by the LIST, LISTALL, LISTSOL, LISTMAT, QUERY, OVERLAP, SCREEN and INFO commands (see sections 6.4, 6.7) to <destination>. If <destination> equals the string 'screen', then output is directed to the screen, otherwise it is directed into a file. The name of the file is <destination>.log if <destination> does not contain any suffix, otherwise the name is <destination>.
This file will be located in the directory stored in the PREDICT environment variable (see config_sp.dat). If <destination> is a file, you can decide with <append> whether the output is to be appended to the existing file (<append> = 'a') or the file is to be overwritten (<append> = 'o').
In FlexS-PVM, output files created by SELOUTP and the commands listed above are automatically merged after parallel script execution. This feature can be switched off by setting <pvm merge> to 'no'.
If <pvm merge> is set to 'no', then the built-in batch variable $(PVM_ID) is automatically appended to the filename (for $(PVM_ID), please refer to the PVM section on page 125).


Important note: For filename usage and file merging within scripts, please refer to the PVM section on page 125.
Note: The SELOUTP command must be used before a command whose output is to be redirected (e.g. LIST, LISTALL, LISTSOL, etc). Afterwards the output stream can be set back to the screen again. Here is an example:

Example

SUPERPOS
SELOUTP table_listsol.log a y
LISTSOL 30
SELOUTP screen
END
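The file naming rule of SELOUTP can be summarized in a few lines of Python. The sketch below is an illustration under assumptions, not FlexS code: the function name and the directory are hypothetical, the comparison with 'screen' is assumed to be case-sensitive, and the $(PVM_ID) suffix used in FlexS-PVM is not modeled.

```python
import posixpath


def seloutp_target(destination, predict_dir):
    """Return the file SELOUTP would write to, or None for screen output."""
    if destination == "screen":
        return None
    _, suffix = posixpath.splitext(destination)
    # Without a suffix, ".log" is appended; otherwise the name is kept.
    name = destination if suffix else destination + ".log"
    return posixpath.join(predict_dir, name)


predict = "/data/predict"  # hypothetical value of the PREDICT variable
print(seloutp_target("table_listsol", predict))
print(seloutp_target("table_listsol.log", predict))
print(seloutp_target("screen", predict))
```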

6.2.11 Sending a command to FlexV (TOFLEXV)

Syntax: TOFLEXV <command>
Description: You can send a command string to FlexV with TOFLEXV.
A command consists of a single character (the first character of the string sent) followed by an argument. In the case of only one argument, the argument can be written directly after the command without a separating blank. If there is more than one argument, the whole command has to be put in quotation marks, like this: "<command> [<first_arg> [<second_arg> ...]]".
Currently, the following commands are implemented:


Com.  Argument
      Meaning

.
      Does nothing; can be used for checking the communication line.

b
      BREAK, terminates the communication link to the application program without terminating.

c     [x y z]
      CENTER, sets the center of rotation to the centroid of all currently visible objects. If x, y, z are given, the center of rotation is set to the given coordinates.

d     o_id
      DELETE, deletes the contents of the graphic object o_id, which can be 0 to 255 or the string "all".

i     file1 [o_id [mode]] [matchlist_file file2 [show|skip]]
      IMPORT, imports the mol2 file file1.
      o_id is optional. If just one o_id is given, FlexV imports the mol2 file into this slot. If the slot is not free, it will be overwritten without any warnings. Using mode "a" after it, you can append the new mol2 to an occupied slot as a new slider object.
      It is also possible to give a range of object ids, like "2,3,4,7", "2-4,7". In this case, FlexV tries to import the mol2 file file1 into the next free slot. If there is no free slot, FlexV breaks with a warning. Using mode "a" for append, FlexV tries to append the new mol2 file to the last used slot in the given o_id range. If the last used graphical object is not within the given range, FlexV breaks with a warning.
      Giving a matchlist and a second mol2 file file2 (optional), FlexV draws lines between the atoms in the first and the second given mol2 files for each matching entry in the matching file. The matchlist file consists of lines with the following format:
      <atom_id_1> <atom_id_2> <label> <energy>
      <atom_id_1>: integer, atom ID in file1,
      <atom_id_2>: integer, atom ID in file2,
      <label>: string, label of the line to draw,
      <energy>: double, base for the color of the line.
      Giving the argument "skip" skips drawing of the second mol2 file file2; "show" (default) also draws the second one.

g     file [o_id]
      GET, loads the gdf file.
      o_id is optional. If not given, the objects are written to the slots that are saved in the gdf file. If just one value is given, FlexV loads the gdf file file into the given o_id. An existing graphical object will be overwritten without any warnings.
      It is also possible to give a range of object ids, like "2,3,4,7", "2-4,7". In this case, FlexV tries to load the gdf file file into the next free slot. If there is no free slot, FlexV breaks with a warning.

l
      LOOKAT, initiates the look-at function.

m     mode [o_id]
      Changes the current molecule display mode. Possible values are "lines", "ballssticks", "sticks", "spheres" and "smartsticks".
      o_id is optional. If it is left out or set to "all" or "*", all slots are changed. Otherwise only the given slots are affected.
      It is also possible to give a range of object ids, like "2,3,4,7", "2-4,7".

p
      Switches to pharm mode and opens the pharm control panel.

r     action
      ROCK, switches the rock mode on (action is "on") or off (action is "off").

s     o_id action
      SWITCH, switches graphic objects on or off. o_id is the number of the graphic object, action is the string "on" or "off", "-" (decrement visible instance id), "+" (increment visible instance id), or the instance id directly. The whole call has to be put in quotation marks.

x
      EXIT, quits FlexV.
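The matchlist file used by the IMPORT command has a simple four-column layout. The following Python sketch reads such a file into records. It is an illustration only: the class and function names are hypothetical, the sample data is made up, and the sketch assumes whitespace-separated fields with a label that contains no blanks.

```python
from dataclasses import dataclass


@dataclass
class MatchLine:
    atom_id_1: int  # atom ID in file1
    atom_id_2: int  # atom ID in file2
    label: str      # label of the line to draw
    energy: float   # base for the color of the line


def parse_matchlist(text: str) -> list[MatchLine]:
    """Parse lines of the form '<atom_id_1> <atom_id_2> <label> <energy>'."""
    matches = []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue                    # skip blank lines (assumption)
        a1, a2, label, energy = fields  # raises ValueError on malformed lines
        matches.append(MatchLine(int(a1), int(a2), label, float(energy)))
    return matches


sample = "3 17 hbond -4.7\n8 21 aro -0.7\n"  # made-up data
for match in parse_matchlist(sample):
    print(match)
```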

6.2.12 Sending FlexS graphic objects to the visualizer (DISPLAY)

Syntax: DISPLAY
Description: The command causes a switch to FlexV and displays the objects previously drawn with DRAW.
Requirements: DRAW must be performed first.

6.2.13 Erasing a graphics object (ERASE)

Syntax: ERASE <from graphics object number> <to graphics object number>
Description: Deletes the objects which are drawn to the graphic display in graphics objects <from graphics object number> to <to graphics object number> by FlexS during the next execution of DISPLAY.

6.2.14 Executing shell commands (! and EXEC)

Syntax: ! <unix command>
        EXEC <line no> <unix command>
Description: If a user input starts with the !-character, the complete string (without this first !-character) is passed as a command to the operating system which will try to execute it.
In contrast to !, EXEC reads the output printed by the unix command to stdout and stores the output line <line no> in a built-in variable named $(UNIX_OUTP) which can be accessed later on in a FlexS script.
Note that the <unix command> goes through the parameter processing unit of FlexS. Therefore, if called in a script, all variables starting with $ are exchanged by their values.
Important notes: Note that no shell-specific expansions can be performed (e.g. the ~-character cannot be expanded to the home directory name).
Shell commands are disabled in the WWW interface.

Example

!ls -a
!cp dummy.c /home/usr/snoopy/test/

6.2.15 Executing internal unit tests (UNITTESTS)

Syntax: UNITTESTS
Description: This command runs a couple of internal self-tests for FlexS. It is mainly used for internal quality checks. It prints a dot for each test executed and summarizes the results. The final result of this command call should always be 'OK'. Please report to [email protected] if this is not the case.

6.3 Commands in the root menu

6.3.1 Deleting everything (DELALL)

Syntax: DELALL
Description: DELALL deletes everything in FlexS's main memory except static data. Thus it summarizes the delete commands in the submenus TEST_LIG, REF_LIG and SUPERPOS.

6.3.2 Executing a batch file (SCRIPT)

Syntax: SCRIPT <filename> [<parameter list> <keep variables>]
Description: Executes the batch <filename>. <parameter list> is a list of batch variables with predefined values, which must be separated by ';'. If <keep variables> is answered yes, the list of batch variables is not reset, i.e. variables and their values from previously executed batch files are present during the execution of the batch file. See section 10.3 for an explanation of the batch language.

6.4 Working with test ligands (TEST_LIG submenu)

6.4.1 Reading (READ)

Syntax: READ <filename> [<molecule ID>]
Description: Reads a test ligand file <filename> into FlexS's workspace. The FlexS native file format for ligands is the SYBYL MOL2 format [19]. Other allowed file formats are MDL SDF [18] and MDL MOL. The file must have the corresponding extension .mol2, .sdf or .mol.
The default directory for this command is the path specified in the LIGAND (config_sp.dat) entry.
If the file is a multi-mol2 file, <molecule ID> can be either a single number, a list of numbers separated by blanks or ',', a list of intervals of the form 'a-b', or 'all'. Each selected molecule from the multi-file is loaded into FlexS's workspace. Note that even if multiple molecules are loaded, only one compound is ever active at a time (see SELECT).
The following operations are initiated in this order:

1. File read-in

2. Identification of ring systems

3. Ring conformer generation (e.g. by CORINA)

4. Molecule initialization (see below)

5. Stereo descriptor and atom equivalence class computation

6. Torsion angle analysis for all acyclic single bonds

7. Interaction type and interaction geometry assignment

8. Assignment of the Gaussian representation to the molecule

The molecule initialization comprises a preprocessing step, formal charge assignment, an aromaticity analysis and much more.
The initialization can be fully configured with the rules defined in the transform.dat static data file. To use this mechanism, the flag USE_TL_TRANSFORMS must be set to 1. Please refer to section 10.18 for the details.
Note: If you use a verbosity level of 5 or higher, FlexS lists an overview of components in its output. The respective table contains statistical information about the number of interaction geometries of a certain interaction level (#IA level) within each component. Please note that for every atom only one contact type (the one with the highest interaction level) is counted. For example, if you have a carbon atom with the contact types "phenyl_ring" (level 2), "phenyl_center" (level 2), and "aro" (level 1), only one interaction of level 2 is counted. Furthermore FlexS lists an overview of the Gaussian atom representation.
Important notes: FlexS requires partial charges specified in the molecule file. These charges are used to compute overlap volumes of the respective Gaussians and to determine the hydrophobicity of an atom (see subsection 5.5.1.2).
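The forms accepted for <molecule ID> (single numbers, lists separated by blanks or commas, intervals 'a-b', or 'all') can be expanded as in the following Python sketch. It only illustrates the notation and is not the FlexS parser. The function name is hypothetical, and the string 'all' is returned unchanged because the number of molecules in the file is not known to the sketch.

```python
import re


def parse_molecule_ids(spec: str):
    """Expand a selection such as '1,3-5 8' into a sorted list of IDs."""
    if spec.strip().lower() == "all":
        return "all"  # caller selects every molecule in the file
    ids = set()
    for item in re.split(r"[,\s]+", spec.strip()):
        if not item:
            continue
        if "-" in item:
            first, last = (int(x) for x in item.split("-", 1))
            ids.update(range(first, last + 1))  # interval 'a-b', inclusive
        else:
            ids.add(int(item))
    return sorted(ids)


print(parse_molecule_ids("1,3-5 8"))  # [1, 3, 4, 5, 8]
```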

6.4.2 Setting up the initialization procedure (SELINIT)

Syntax: SELINIT [<list of levels>]
Description: Selects the steps that are applied to the test ligand in the initialization process. If USE_TL_TRANSFORMS (see 5.1.5) is set to 1 the transformation rules of transform.dat are used. Otherwise, the corresponding old initialisation functions are used.
Initially, the transformation levels applied during initialization are set in the transform.dat file by setting a switch ON or OFF. To adjust this for special purposes or to perform only selected initialization steps, like protonation or assignment of formal charges, the levels can be selected by this command. There are two ways of calling this command:

Interactively: When no levels are specified as parameters to the command, FlexS asks for each level individually whether it should be enabled (ON) or disabled (OFF). See the example below.

By parameters: The numbers of the levels that should be enabled or disabled can be specified as parameters to the command. An exclamation mark ("!") before the level number means it should be disabled, otherwise just the presence of the number in the list means that it will be enabled. An asterisk means all levels. In addition to using the number of a level, there are several labels for the initialization procedures themselves, regardless of the number they have. Currently there are seven procedures available:

Label  Action                       Default level number
P      PDB import                   1
L      Localization of bonds        2
A      Aromatic systems             3
H      Protonation                  4
F      Formal charges               5
D      Delocalization               6
T      Atom type check/assignment   10


Example

FLEXS/LIGAND> selinit
Level 1: Correct valences and bonds in PDB structures [OFF] : OFF
Level 2: Preprocess molecule [ON] : OFF
Level 3: Aromaticity check [ON] : OFF
Level 4: Assign default protonation [ON] : ON
...
Level 10: Assign atom types [ON] : ON

Example

FLEXS/LIGAND> selinit !* 10 # enables atom type detection, only
FLEXS/LIGAND> selinit !T H # atom type check off, prot. check on
FLEXS/LIGAND> selinit !5 10 # disable level 5 but enable level 10
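How a parameter list such as "!* 10" is read can be sketched in Python. This is an illustration under assumptions, not the FlexS implementation: the names are hypothetical, only the seven default levels from the table are modeled, labels are mapped to their default level numbers, and the parameters are assumed to be applied from left to right.

```python
# Labels and their default level numbers, taken from the table above.
DEFAULT_LEVEL = {"P": 1, "L": 2, "A": 3, "H": 4, "F": 5, "D": 6, "T": 10}


def apply_selinit(tokens, state):
    """Return a copy of state (level -> enabled) after applying tokens."""
    state = dict(state)
    for token in tokens:
        enable = not token.startswith("!")  # "!" means disable
        name = token.lstrip("!")
        if name == "*":
            levels = list(state)            # every known level
        elif name.upper() in DEFAULT_LEVEL:
            levels = [DEFAULT_LEVEL[name.upper()]]
        else:
            levels = [int(name)]
        for level in levels:
            state[level] = enable
    return state


start = {1: False, 2: True, 3: True, 4: True, 5: True, 6: True, 10: True}
print(apply_selinit("!* 10".split(), start))  # only level 10 stays enabled
```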

6.4.3 Reading reference coordinates (READREF)

Syntax: READREF <filename> <ignore hydrogen>

Description: To compare test ligand placements proposed by FlexS with other placements or reference coordinates (computation of RMSDs), a reference coordinate set can be loaded from the file <filename> with the READREF command. The file must be in MOL2 file format [19], and the numbering of the atoms must be the same as in the test ligand file loaded with the READ command. The READREF command can only be performed after the READ command. Finally, if <ignore hydrogen> is answered yes, hydrogen atoms are ignored during loading.

Note that READREF can be executed before as well as after a superposition computation. If it is executed after a superposition computation, the RMSD is recomputed automatically.

Requirements: READ must be performed first.
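The requirement of identical atom numbering follows from the way a fixed-order RMSD is formed: atom i of the placement is compared with atom i of the reference. The following Python sketch shows this under stated assumptions. The function name, the coordinate lists and the element list are hypothetical, and the code is not taken from FlexS.

```python
import math


def rmsd(placement, reference, elements=None, ignore_hydrogen=False):
    """Fixed-order RMSD between two coordinate lists with identical atom numbering."""
    if len(placement) != len(reference):
        raise ValueError("coordinate sets must contain the same number of atoms")
    squared = []
    for i, (p, r) in enumerate(zip(placement, reference)):
        # Optionally skip hydrogens (requires the element of every atom).
        if ignore_hydrogen and elements is not None and elements[i] == "H":
            continue
        squared.append(sum((a - b) ** 2 for a, b in zip(p, r)))
    if not squared:
        raise ValueError("no atoms left to compare")
    return math.sqrt(sum(squared) / len(squared))


# Two atoms; only the second one (a hydrogen) deviates from the reference.
placed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(placed, ref))
print(rmsd(placed, ref, elements=["C", "H"], ignore_hydrogen=True))
```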

6.4.4 Assigning reference coordinates by subgraph matching (MAPREF)

Syntax: MAPREF <filename> <bond check> <atom check> <ignore hydrogen>

Description: The mol2 file <filename> is loaded and reference coordinates are assigned on the basis of a subgraph matching. The matching process can be controlled with two flags: if <bond check> is answered yes, the matching algorithm enforces exact matching of bond types; otherwise, bond types are ignored. If <atom check> is answered yes, exact matching of SYBYL atom types is required; otherwise only the element must match. Finally, if <ignore hydrogen> is answered yes, hydrogen atoms are neglected during loading.

Executing MAPREF has two effects. During execution, the reference molecule is mapped to the previously loaded molecule and reference coordinates are assigned. If multiple matchings are found, the first one encountered is used. The subgraph together with the coordinates is then stored internally and used further during base selection and placement (see the SELBAS and PLACEBAS commands).

<filename> can be a multiple mol2 file. In this case, each molecule instance is used to form a coordinate set. It is required, however, that the molecules themselves (atom types, bond types, atom ordering, etc.) are identical. For the assignment of reference coordinates, only the first molecule contained in the file is used. In PLACEBAS, however, the manual placement is performed for each coordinate set loaded.

Requirements: READ must be performed first.

Important notes: In the mapping process, only mappings with compatible stereochemistry at 4-bonded atoms are created.

6.4.5 Setting reference coordinates (SETREF)

Syntax: SETREF <ignore hydrogen> [<placement ID>]

Description: If the superposition predictions are to be compared with the coordinates in the test ligand file loaded with the READ command, the reference coordinates can be set with the SETREF command. With the parameter <ignore hydrogen> you decide whether hydrogen atoms are taken into account in the comparison.

If placements have already been computed, SETREF can be used to compare the placements with one specific placement by setting <placement ID> to the number of that placement.

Requirements: READ must be performed first.

6.4.6 Selecting the base fragments (SELBAS)

Syntax: SELBAS <mode1> [<mode2>|<base atom list>|<smarts>]

Description: Defines the base fragment of the test ligand. The following modes are available:

automatic (a) In automatic mode, a set of base fragments is automatically selected based on an internal scoring scheme. If <mode2> = f, the selection is used for flexible fitting (SUPERPOS/PLACEBAS). If <mode2> = s, the selection is used for screening purposes (SUPERPOS/SCREEN). All previous selections are overwritten.

manual (m) In manual mode, a base fragment is manually defined; previous definitions of base fragments are overwritten.
You receive a list of all test ligand atom names, preceded by the number of the atom in the test ligand input file. You are asked for a set of atom numbers defining the base fragment in the following way:
All atoms lying on a path between these atoms, extended by all atoms which are attached rigidly to them, form the base fragment. The selection is a list of integers or integer ranges (format a-b) separated by ’,’ or blanks, or the keyword ’all’.

manual append (p) In this mode, a base fragment is manually selected and added to the list of already selected base fragments.

reference (r) In reference mode, base fragments are selected via a reference structure loaded previously with the MAPREF command. If multiple mappings to the reference structure are possible, up to the maximum number of base fragment mappings are used. Be aware that, due to the limit on the number of allowed conformations, reference structures that are too large may be rejected.

smarts (s) In this mode, a base fragment is selected by the SMARTS™ expression <smarts>. A list of allowed SMARTS™ expressions is given in section 10.13. All previous selections are overwritten.

After the base fragments are defined, the complete fragmentation is calculated and the order in which fragments are added is determined. The order depends on several features of the fragments, such as the kind of interactions which can be performed and the number of fragments which still have to be added to complete this part of the ligand.

Requirements: A test ligand must be loaded with the TEST_LIG/READ command. For reference mode, a reference structure must have been loaded with the TEST_LIG/MAPREF command.

Important notes: The selection of the base fragment is an important phase and the results vary substantially for different base fragments. A manual selection should always be preferred if you have specific knowledge about the reference and test ligand at hand.

Base fragments should have the following features:

• only a reasonable set of discrete conformations (up to five hundred)
• enough interacting groups which are able to be connected to the respective reference ligand groups
• large and specific enough to be placed reasonably on top of the reference ligand.

If the number of discrete conformations is too large, FlexS prohibits the base selection and aborts the command.

Example

selbas a f       # automatically select for flexible fitting
selbas a s       # automatically select for fragment-based screening
selbas p "3 6"   # manually select atoms 3 and 6
selbas m all     # manually select all atoms

The first two examples perform an automatic selection, for flexible fitting and for screening respectively; in the third example, all atoms lying between atom numbers 3 and 6 in the molecular graph are selected as the base fragment and appended to the list of previously selected base fragments; in the fourth example, the whole test ligand is selected as the base fragment (previous selections are overwritten).
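The path rule of the manual modes can be pictured with a short Python sketch. It is only an illustration under simplifying assumptions: the molecular graph is a hypothetical adjacency dictionary, a single shortest path is used between each pair of selected atoms, and the extension by rigidly attached atoms described above is left out. It does not reproduce the FlexS implementation.

```python
from collections import deque
from itertools import combinations


def shortest_path(bonds, start, goal):
    """Breadth-first search over an adjacency dict; returns one shortest path."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if atom == goal:
            # Walk back from the goal to the start.
            path = []
            while atom is not None:
                path.append(atom)
                atom = previous[atom]
            return path[::-1]
        for neighbour in bonds[atom]:
            if neighbour not in previous:
                previous[neighbour] = atom
                queue.append(neighbour)
    return []


def atoms_between(bonds, selected):
    """Union of the selected atoms and the atoms on paths connecting them."""
    atoms = set(selected)
    for a, b in combinations(selected, 2):
        atoms.update(shortest_path(bonds, a, b))
    return sorted(atoms)


# Hypothetical chain 1-2-3-4-5-6 with a substituent 7 on atom 4.
bonds = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5, 7], 5: [4, 6], 6: [5], 7: [4]}
print(atoms_between(bonds, [3, 6]))
```

For the selection "3 6" the sketch yields atoms 3, 4, 5 and 6. Atom 7 would only be added by the rigid-attachment extension, which is not modeled here.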

6.4.7 Selecting (SELECT)

Syntax: SELECT [<?>] <molecule ID>

Description: Selects a test ligand from FlexS’s workspace and prepares the internal data structures for processing. The parameter <molecule ID> gives the number of the molecule to be activated. The optional parameter <?> causes the display of a list of structures contained in FlexS’s workspace.

Important notes: An empty workspace causes an error. The parameter <molecule ID> must be greater than or equal to 1 and smaller than or equal to the number of molecules currently loaded in FlexS’s workspace.

6.4.8 Outputting the most important information about a test ligand (INFO)

Syntax: INFO

Description: Displays the main characteristics of a test ligand, such as the potential number of conformations and the number of interactions, on the screen. Three error flags indicate whether problems have occurred during initialization of the test ligand data structure. A (<rc_gen> error) indicates that the ring conformation program has had problems in generating the ring conformations of the molecule. A (geom error) occurs if the geometry of the test ligand looks strange (try an energy minimization in this case). A (conf error) occurs if no conformation can be constructed (again, try an energy minimization before loading the test ligand). The INFO command is extremely useful for summarizing errors in a test ligand data set.

If a base fragment is selected (TEST_LIG/SELBAS) and reference coordinates are loaded (TEST_LIG/READREF), RMSDs between the reference coordinates and the most similar conformation of the test ligand in FlexS’s discrete conformational space are computed (see also TEST_LIG/CONFORM/MINCONF).

6.4.9 Editing (EDIT)

Syntax: EDIT

Description: Calls the editor together with the test ligand file currently in FlexS’s main memory.

Important notes: There is no automatic reload after editing.

6.4.10 Writing (WRITE)

Syntax: WRITE <filename> <append file> <multi file> <coordinate system> [<placement selection>]

Description: Writes a set of test ligand placements in the file <filename>. The default format is MOL2 [19]. Alternative formats can be selected by the appropriate filename extension. The rules explained at the beginning of section 10 apply to the filename. Possible formats are MDL SD [18] (extension .sdf), MOL [19] (extension .mol) or PDB [1] (extension .pdb). If <append file> is set to ’y’, all placements are appended to an already existing file named in <filename>. If <multi file> is set to ’y’, all placements are written in one file. Otherwise a set of files with filenames <filename>_<placement number> is generated.

The ligand can be written in different FlexS-specific coordinate systems: ’o’: from the superimposing of multiple test ligands, ’f’: fixed coordinates from the loaded mol2 file, ’r’: reference coordinates (see 6.4.4) and ’c’: coordinates of several placements. ’o’, ’f’ and ’r’ write one coordinate set, in contrast to the option ’c’, where the user has to define a set. <placement selection> specifies the placements to write. It can be either a single number, a list of numbers separated by blanks or ’,’, a list of intervals of the form ’a-b’, or ’all’, or ’q’. If set to ’q’, the result list from the last query command (submenu SUPERPOS, commands QUERY, LISTSOL, LISTRMS, etc.) is used. If <filename> is a mol2 file, several score values for each placement are printed as a comment line (FLEXS_SCORE).

Note that if you are writing in one file, the order of the placements is ascending and does not correspond to the order of the entries in the entered selection. The placements are written in the same order as in the query list only if the last query result list is used.

The default directory for this command is the path specified in the PREDICT entry of (config_sp.dat).
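Two of the rules above, format selection by filename extension and the ordering of placements in a single file, can be summarized in a few lines of Python. This is a sketch under assumptions, not FlexS code: the function names are hypothetical, and exact, case-sensitive extension matching is assumed because the manual does not say otherwise.

```python
import os

# Extensions named in the WRITE description; any other name falls back to MOL2.
FORMAT_BY_EXTENSION = {".sdf": "MDL SD", ".mol": "MOL", ".pdb": "PDB"}


def output_format(filename):
    """Guess the output format from the filename extension."""
    extension = os.path.splitext(filename)[1]
    return FORMAT_BY_EXTENSION.get(extension, "MOL2")


def write_order(selection, from_query_list=False):
    """Order in which placements end up in a single multi file."""
    # Ascending order, unless the last query result list ('q') is used.
    return list(selection) if from_query_list else sorted(selection)


print(output_format("result.sdf"), output_format("result.mol2"))
print(write_order([7, 2, 5]), write_order([7, 2, 5], from_query_list=True))
```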

6.4.11 Deleting (DELETE)

Syntax: DELETE <molecule ID selection>

Description: Removes one or more test ligands from FlexS’s workspace. <molecule ID selection> specifies the structures to be deleted. It can be either a single number, a list of numbers separated by blanks or ’,’, a list of intervals of the form ’a-b’, or ’all’. All data associated with the test ligands, such as conformational sets and placements, is removed automatically as well.
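The same selection syntax (single numbers, lists, intervals ’a-b’, or ’all’) recurs in several commands of this submenu. A minimal Python sketch of how such a selection string could be expanded is given below. The function parse_selection and its range check are assumptions for illustration and do not describe the FlexS parser.

```python
import re


def parse_selection(text, count):
    """Expand a selection such as '1-3, 7 9' or 'all' into sorted IDs in 1..count."""
    text = text.strip()
    if text.lower() == "all":
        return list(range(1, count + 1))
    ids = set()
    # Entries are separated by commas and/or blanks.
    for token in re.split(r"[,\s]+", text):
        if not token:
            continue
        if "-" in token:
            first, last = (int(part) for part in token.split("-", 1))
            ids.update(range(first, last + 1))
        else:
            ids.add(int(token))
    invalid = [i for i in ids if not 1 <= i <= count]
    if invalid:
        raise ValueError(f"IDs out of range: {sorted(invalid)}")
    return sorted(ids)


print(parse_selection("1-3, 7 9", 10))
print(parse_selection("all", 4))
```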

6.4.12 Volume computation (VOLUME)

Syntax: VOLUME <table format>

Description: Computes the van der Waals volume of the test ligand and the Gaussian volumes of the different qualities. The different qualities are described in section 10.4.2. If <table format> is set to ’y’, the result is output as a table; otherwise it is printed on one line.

6.4.13 Setting administration defaults for drawings (SELADM)

Syntax: SELADM <ref. coords> <graphics object number> <temp file> <append graphic files>

Description: With SELADM you can specify the graphics object numbers used for drawing test ligands, and you can determine whether the graphics files are temporary with self-generated names or specified in each graphic command.

<ref. coords> If set to ’y’ or ’1’, the following modifications of the graphic settings concern the reference coordinates display of the test ligand.

<graphics object number> If set to (0–255), the drawings concerning ligands will be displayed in graphics object <graphics object number> (see FlexV manual). If you select ’0’, you will be asked for a range of graphics object numbers. Subsequent DRAW commands use these mol object numbers in a first-drawn-first-overwrite manner.

<temp file> If set to ’y’, the drawings are written in temporary files and removed after quitting FlexS. Otherwise you will be asked for a filename at the end of each DRAW command (see below).

<append graphic files> If set to ’y’, the drawings will be appended to an already existing file. Otherwise the file will be overwritten.


6.4.14 Setting default values for drawing the test ligand (SELGRA)

Syntax: SELGRA <ref. coords> <mol display mode> <hydro> <interact geoms> <interact points> <all co types> [<co type selection>] <all components> [<component selection>] <surf> <gauss>

Description: With SELGRA you can set specific default values for drawing test ligands.

<ref. coords> If set to ’y’ or ’1’, the following modifications of the graphic settings concern the reference coordinates display of the test ligand.

<mol display mode> Specifies the default appearance of molecules. The display modes are lines ’1’, sticks ’2’, balls & sticks ’3’, spheres ’4’.

<hydro> If set to ’y’ or ’1’, hydrogens are shown. If set to ’2’, only hydrogens bonded to hetero atoms (i.e., only hydrogens that are bonded to non-carbon atoms) are shown.

<interact geoms> If set to ’y’ or ’1’, the interaction geometries including main directions are shown.

<interact points> If set to ’y’, the interaction geometries are shown as discrete interaction points.

<all co types> If set to ’y’, interaction geometries of all contact types are drawn. Otherwise, a set of contact types must be entered in <co type selection>. The selection is a list of integers or integer ranges (format a-b) separated by ’,’ or blanks.

<all components> If set to ’y’, atoms of all components are drawn. Otherwise, a set of component numbers must be entered in <component selection>.

<surf> Specifies the kind of surface to draw. Basically the surfaces are molecular surfaces. In lines mode, only lines connecting bounding atoms of reentrant (saddle and concave) patches of the surface are shown. In triangles mode, the concave patches are displayed as triangles. In Connolly mode, the molecular surface is drawn.

<gauss> If set to ’y’ or ’1’, an iso-contour surface of the Gaussians of the different qualities is (selectively) drawn. The different qualities are described in section 10.4.2.

Requirements: Depending on the graphical interface, the surface modes are limited. In VRML mode, both lines mode and triangles mode are available. With FlexV, all surface modes can be shown.

Important notes: The Connolly surface is rendered from its analytically calculated patches. This enables selection of the level of curvature approximation but makes the rendering much more complicated. Therefore a few percent of the patches are rendered incorrectly (we will try to reduce this rate). In addition, there is currently no cusp trimming.

6.4.15 Selecting the coloring mode (SELCOL)

Syntax: SELCOL <ref. coords> <test ligand color mode selection> <interact geoms color mode selection> <main dirs color mode selection> <surface color mode selection> <Gaussian color mode selection>

Description: With SELCOL you can set the color modes for the molecule, main directions and interaction geometries. Valid modes are listed below. Depending on the color mode chosen, you will be asked for specific color values. All color modes and the respective color definitions can be found in 10.20. If <ref. coords> is set to ’y’ or ’1’, the modifications of the graphic settings concern the reference coordinates display of the test ligand.

invisible : (test ligand, main dirs, interact geoms, surface) Nothing is drawn.

atom : (test ligand, surface) The bonds are drawn half-half in the colors of the atoms. The atoms are colored according to their element.

unique : (test ligand, main dirs, interact geoms, surface) Everything is drawn in one user-defined color.

fragment : (bond) The test ligand is bi-colored to visualize the fragmentation; the base fragment has an extra color.

energy : (molecule) The molecule is colored depending on the score that it achieves during superposition.

contact type : (main dirs, interact geoms) Main directions or interaction geometries are drawn in the colors of the contact types.

surfpatch : (surface) The surface is colored according to the surface patch type (concave, saddle, convex).

surf-atom : (surface) Convex patches are colored by atom type; reentrant (saddle and concave) patches are colored in one user-defined color.

There are three possible ways to specify a color. Inside FlexS, RGB values are used. You can specify the RGB value directly by typing three or four values (lucent coloring) from the interval [0.0,1.0] separated by blanks or /, or indirectly either by the name of a color or by an integer from the interval [1,360]. The integer represents an angle in a color circle. All color names are defined in the static data file graphic_sp.dat. You can add color definitions if you like (see also 10.20). Note that if you are typing a single line command or writing a batch file and the color definition contains blank characters, you must enclose it in double quotes.

The following example shows different ways of selecting a green color. The dots represent previous and subsequent parameters.

Example

selcol .... "dark green" ....
selcol .... green ....
selcol .... trans green ....
selcol .... "0.0 0.8 0.1" ....
selcol .... 0.0/0.8/0.1 ....
selcol .... 220 ...
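The three kinds of color specification can be told apart mechanically, as the following Python sketch suggests. It is an illustration only: the function classify_color is hypothetical, the tuple known_names merely stands in for the names defined in graphic_sp.dat, and no attempt is made to convert a color-circle angle to RGB because the manual does not define that mapping here.

```python
def classify_color(spec, known_names=("green", "dark green")):
    """Classify a color specification as ('rgb', values), ('angle', n) or ('name', text)."""
    parts = spec.replace("/", " ").split()
    try:
        numbers = [float(p) for p in parts]
    except ValueError:
        numbers = None  # not purely numeric, so treat it as a color name
    if numbers is not None:
        # Three values: RGB; four values: RGB plus a lucency value.
        if len(numbers) in (3, 4) and all(0.0 <= v <= 1.0 for v in numbers):
            return ("rgb", tuple(numbers))
        # A single integer from [1, 360] is an angle in the color circle.
        if len(numbers) == 1 and parts[0].isdigit() and 1 <= int(parts[0]) <= 360:
            return ("angle", int(parts[0]))
        raise ValueError(f"not a valid color value: {spec!r}")
    if spec in known_names:
        return ("name", spec)
    raise ValueError(f"unknown color name: {spec!r}")


for example in ("dark green", "0.0 0.8 0.1", "0.0/0.8/0.1", "220"):
    print(example, "->", classify_color(example))
```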

6.4.16 Labeling the test ligand (SELLAB)

Syntax: SELLAB <ref. coords> <atom name> <infile number> <sybyl type> <fragment number>

Description: Test ligand molecules are labeled with text information.

<ref. coords> If set to ’y’ or ’1’, the following modifications of the graphic settings concern the reference coordinates display of the test ligand.

<atom name> If set to ’y’ or ’1’, the short element name of the atom is shown.

<infile number> If set to ’y’ or ’1’, the number of the atom in the input file is shown.

<sybyl type> If set to ’y’ or ’1’, the sybyl type string is shown.

<fragment number> If set to ’y’ or ’1’, the number of the atom fragment is shown.

Important notes: VRML 1.0 is (to our knowledge) not able to handle 2D text in 3D scenes, which makes labels look somewhat odd when visualized with VRML browsers.

6.4.17 Drawing the test ligand (DRAW)

Syntax: DRAW [fix | opt | ref | rms rms-limit rank-limit | <placement number>] [<graphics object number>] [<quality> <level>] [<filename>]

Description: DRAW generates a drawing of the test ligand. If <placement number> is set to fix, coordinates are taken from the test ligand input file. If <placement number> is set to opt, coordinates of type ’opt’ are taken. If <placement number> is set to ref, the reference coordinates (with their specific graphic settings) are taken. If <placement number> is set to rms, the highest ranking solution with RMSD < rms-limit among the first rank-limit solutions is taken. If rms-limit is set to 0.0, the best possible RMSD is taken. If rank-limit is set to 0, all solutions are considered. If <placement number> is a number greater than zero, coordinates are taken from the placement <placement number>. If the drawing is not stored in temporary files (see TEST_LIG/SELADM), it will be stored in the file <filename>. If the drawing of Gaussian iso-contour surfaces has been selected, you will be asked for the kind of Gaussian quality to be drawn and for the level of the iso-contour.

Important notes: Drawings are not displayed automatically. Use DISPLAY to output the drawing on the graphics device.
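The rms option combines two limits, which the following Python sketch spells out. It is a reading of the description above, not FlexS code: the function pick_by_rms is hypothetical, and "best possible RMSD" is interpreted here as the lowest RMSD among the solutions considered.

```python
def pick_by_rms(rmsds, rms_limit, rank_limit):
    """`rmsds` lists the RMSD of each solution in rank order; returns a 1-based rank or None."""
    # rank_limit 0 means: consider all solutions.
    considered = rmsds if rank_limit == 0 else rmsds[:rank_limit]
    if not considered:
        return None
    if rms_limit == 0.0:
        # Take the solution with the lowest RMSD among those considered.
        return min(range(len(considered)), key=considered.__getitem__) + 1
    # Otherwise take the highest ranking solution below the RMSD limit.
    for rank, value in enumerate(considered, start=1):
        if value < rms_limit:
            return rank
    return None


ranked_rmsds = [3.2, 1.8, 0.9, 2.5]  # hypothetical RMSDs of solutions 1..4
print(pick_by_rms(ranked_rmsds, 2.0, 0))
print(pick_by_rms(ranked_rmsds, 0.0, 0))
print(pick_by_rms(ranked_rmsds, 2.0, 1))
```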

6.4.18 Drawing the test ligand at multiple positions sequentially (MDRAW)

Syntax: MDRAW <placement selection> [<filename>]

Description: Generates drawings of the test ligand at positions taken from a selection of placements. The selection is a list of integers or integer ranges (format a-b) representing placement IDs separated by ’,’ or blanks.

For each drawing, MDRAW works like the DRAW command.

MDRAW produces different results depending on the capabilities of the visualization tool.

VRML The drawings are always displayed simultaneously.

FlexV The drawings are attached to the object 0 slider in FlexV, overwriting any previous contents in each case. You can choose between the different drawings by moving this slider (see FlexV manual).

Important notes: Drawings are not displayed automatically. Depending on the graphical interface, use DISPLAY to output the drawings to FlexV or the VRML browser.


6.4.19 Drawing multiple test ligands (SDRAW)

Syntax: SDRAW <test_lig selection> <coord type>

Description: Generates drawings of test ligands selected in <test_lig selection>. The selection is either a single number, a list of numbers separated by ’,’, a list of intervals of the form ’a-b’, or ’all’.

If <coord type> is set to fix, coordinates will be taken from the test ligand input file. If <coord type> is set to opt, the coordinates of type ’opt’ will be taken.

Important notes: Drawings are not displayed automatically. Depending on the graphical interface, use DISPLAY to output the drawings to FlexV or the VRML browser.

6.4.20 Listing the graphic items (GRAINF)

Syntax: GRAINF <ref. coords>

Description: Outputs a list of all current graphic settings for the test ligand (either for reference or read-in/placements).

6.4.21 Setting charges (SETC)

Syntax: SETC

Description: Sets the charges to the formal charges.

6.4.22 Reading charges (READC)

Syntax: READC <filename>

Description: Reads charges from a separate mol2 file.

6.4.23 Writing charges (WRITEFC)

Syntax: WRITEFC <filename> <append file> <multi file> <placement selection>

Description: Same as WRITE except that formal charges are used to fill the charges entry.

6.4.24 Writing Gaussians (WRITEG)

Syntax: WRITEG <filename> <filetype> <placement selection> [<gausstype>]

Description: Writes Gaussians to an X-plor, mol, or mol2 file named <filename>. <filetype> may be either 0 (X-plor), 1 (MOL), or 2 (MOL2). Default extensions are .xpl (X-plor), .gauss (MOL), and .gauss2 (MOL2). The type of Gaussians (see 10.4.2) is given by <gausstype>. The grid spacing is 0.25 Å for X-plor files. In the case of the MOL2 file format, <gausstype> can be either a single number, a list of numbers separated by blanks or ’,’, a list of intervals of the form ’a-b’, or ’all’. Note that in this case everything is written to a single multi-mol2 file. <placement selection> specifies the placement to write.


6.4.25 Checking SMARTS™ patterns and subgraph occurrence (SMARTS)

Syntax: SMARTS <smarts pattern>

Description: Checks whether a substructure defined by a given SMARTS™ pattern can be found in a ligand or not. The matched atoms are output to the screen. Additionally, the batch variable $(SMARTS_MATCH) contains the number of occurrences of the substructure, or <no_match> if the substructure was not found. Set the VERBOSITY to 10 to get more information about the substructure generated from the SMARTS™ pattern.

6.4.26 Superpositioning of multiple test ligands by matching (MATCH)

Syntax: MATCH [<transform>] <tuple> [<first> <second> <matching>] [<list>]

Description: Calling MATCH the first time causes two test ligands that are loaded in FlexS’ workspace to be superposed according to a user-defined matching of atoms. With any subsequent call of MATCH, the transformation resulting from the last computed matching can be applied to a list of additional test ligands. However, the user can always enforce the superposition of test ligands using a new matching.

If MATCH is called for the first time, <transform> is not an option and <tuple> must equal ’y’, meaning that two test ligands are to be superimposed. <first> is the ID of a ligand in FlexS’ workspace which is to be superposed onto the <second> ligand. <matching> defines a correspondence list of atoms in these ligands that is used for an RMS-fit. Transformed coordinates are stored as ’opt’ coordinates of the first ligand, and the transformation matrix is stored for subsequent calls of MATCH.

If MATCH has been called before, <transform> may equal ’y’, in which case a <list> of test ligands must be provided. <list> can be either a single number, a list of numbers separated by blanks or ’,’, a list of intervals of the form ’a-b’, or ’all’. The transformation resulting from the previous use of MATCH is applied to all the specified test ligands and the transformed coordinates are stored in the ’opt’ coordinates field. If <transform> equals ’n’, <tuple> must equal ’y’ and <first>, <second>, and <matching> must be provided (cf. above).

6.4.27 Switching coordinate types (SWITCHTYPE)

Syntax: SWITCHTYPE <switchtype>

Description: <switchtype> may be either 1 (opt→fix), 2 (placement→opt), or 3 (fix→opt).

1. Saves coordinates of type ’opt’ to coordinates of type ’fix’.

2. Saves the coordinates of one placement to coordinates of type ’opt’.

3. Saves coordinates of type ’fix’ to coordinates of type ’opt’.

Important notes: The target set of coordinates is overwritten. ’opt’ coordinates are the basis of operation of flexible postoptimization.


6.4.28 Manually transforming test ligand coordinates (TRANSFORM)

Syntax: TRANSFORM <coord type> [<placement>] <translation> [<x-comp> <y-comp> <z-comp>] <global rotation> [<axis> <angle>] <local rotation> [<selection> <angle>]

Description: Transforms test ligand coordinates. <coord type> may be either 1 (fix), 2 (opt), or 3 (placement). If <coord type> equals 3, coordinates are taken from a specific <placement>.

If <translation> equals ’y’, the components of the translation vector (<x-comp> <y-comp> <z-comp>) are requested. If <global rotation> equals ’y’, a rotation axis and a rotation angle must be provided. <axis> may be either 0 (x-axis), 1 (y-axis), or 2 (z-axis). If <local rotation> equals ’y’, a selection of rotatable bonds must be specified. <selection> can be either a single number, a list of numbers separated by ’,’, a list of intervals of the form ’a-b’, or ’all’. For each bond a rotation angle must be specified. The transformed test ligand coordinates are stored as type ’opt’ coordinates.

Transformation: $x_{\mathrm{new}} = R_g\bigl(R_l(x) - s_m\bigr) + s_m - z \quad \forall x \in \mathbb{R}^3$

where $R_g \in SO(3)$ is the matrix of global rotation, $z \in \mathbb{R}^3$ is the translation vector, $R_l$ is the operator of local rotations, and $s_m$ is the center of gravity of the atoms of the test ligand.
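The transformation can be traced numerically with a short Python sketch. Several points are assumptions and not taken from FlexS: local rotations are omitted (the local rotation operator is the identity), the center of gravity is computed as the unweighted mean of the atom positions, the angle is given in degrees, and rotations follow the right-hand rule.

```python
import math


def rotation_matrix(axis, angle_deg):
    """Rotation about the x (0), y (1) or z (2) axis, angle in degrees (assumed unit)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    if axis == 0:
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == 1:
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    if axis == 2:
        return [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    raise ValueError("axis must be 0, 1 or 2")


def transform(coords, axis, angle_deg, z):
    """Apply x_new = R_g (x - s_m) + s_m - z to every atom position."""
    n = len(coords)
    # Center of gravity, taken here as the unweighted centroid (assumption).
    s_m = [sum(p[k] for p in coords) / n for k in range(3)]
    R = rotation_matrix(axis, angle_deg)
    result = []
    for p in coords:
        d = [p[k] - s_m[k] for k in range(3)]
        rotated = [sum(R[i][k] * d[k] for k in range(3)) for i in range(3)]
        result.append(tuple(rotated[i] + s_m[i] - z[i] for i in range(3)))
    return result


atoms = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
print(transform(atoms, axis=2, angle_deg=90.0, z=(0.0, 0.0, 0.0)))
```

Rotating about the center of gravity rather than the origin keeps the ligand in place while it turns; the translation vector is then subtracted, as in the formula above.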

6.4.29 Writing the atom coordinates of type ’opt’ (WRITEOPT)

Syntax: WRITEOPT <test_lig> <filename> <f_charges>

Description: Writes the coordinates of type ’opt’ (see 6.4.28) for a chosen test ligand in the file <filename>. The default format is MOL2 [19]. An alternative format can be selected by means of the appropriate filename extension. The possible alternative format is MOL [19] (extension .mol). <test_lig> is the number of the test ligand. If <f_charges> is set to ’y’, formal charges are used to fill the charges entry. The default directory for this command is the path specified in the TEMP entry of (config_sp.dat).

6.4.30 *Working with test ligand conformations (TEST_LIG/CONFORM submenu)

There are some commands for analyzing the conformational set of a test ligand. They may be useful for understanding differences between FlexS’ conformational model and X-ray crystallographic data.

6.4.30.1 Finding a conformation with minimal RMSD or minimal energy to the reference conformation (MINCONF)

Syntax: MINCONF <fragmentation> <by rms>

Description: With the MINCONF command, you can search for the most similar conformation in the discrete conformational set of the test ligand compared with the reference coordinates, or for a conformation with minimal energy. The conformational set is defined by the ring conformations produced by the ring conformation generator and by discrete torsion angles assigned in FlexS from the torsion database (torsion_standard.dat/torsion_fine.dat). With <fragmentation> you select the fragmentation used for the minimization.

Requirements: SELBAS and SETREF or READREF must be performed first.

Important notes: The algorithm used here is heuristic, so there is no guarantee that the proposed conformation is really the one with the lowest RMSD from the reference coordinates or with the lowest energy.

6.4.30.2 Writing one specific conformation (WRITONE)

Syntax: WRITONE <fragmentation> <superpose> <conf. string> <filename>

Description: Writes one specific conformation into a file. The conformation is specified by the conformation string, which defines the conformation of each fragment sequentially. Because this description is based on the internal representation of conformations, you must take the conformation string from FlexS’s output, for example from the command MINCONF or from the placement tables (see section 6.7.12). If <superpose> equals ’y’, the defined conformation is superposed on the reference coordinates. You can also write out partial test ligand conformations by terminating the conformation string with -1. The default directory for the output is PREDICT.

Requirements: SELBAS must be performed first. If <superpose> equals ’y’, reference coordinates must be loaded (READREF).

6.4.30.3 Writing a set of random conformations (WRITRAND)

Syntax: WRITRAND <filename> <number of conf.>

Description: Writes a set of approximately <number of conf.> randomly selected conformations in a set of mol2 files <filename>. The default directory for the output is PREDICT.

Requirements: SELBAS must be performed first.

Important notes: The actual number of written conformations can differ from <number of conf.> if internal clashes occur.

6.4.30.4 Calculating the RMSD from read-in to reference coordinates (FIXRMSD)

Syntax: FIXRMSD
Description: Calculates the fixed order and variable order RMSD from the read-in (fix) to the reference (ref) coordinates.
Requirements: A test ligand (TEST_LIG/READ) and reference coordinates (TEST_LIG/READREF) must be loaded.

6.4.30.5 Enumerating all sterically possible conformations (ENUMALL)

Syntax: ENUMALL <fragmentation>
Description: Enumerates all test ligand conformations without internal clashes.
Requirements: SELBAS must be performed first.
Important notes: This command can be very time consuming.


6.5 Working with reference ligands (REF_LIG submenu)

6.5.1 Reading (READ)

Syntax: READ <filename>
Description: Reads a reference ligand file <filename> into FlexS’s workspace. The FlexS native file format for ligands is the SYBYL MOL2 format [19]. Other allowed file formats are MDL SDF [18] and MDL MOL. The file must have the corresponding extension .mol2, .sdf or .mol.
The default directory for this command is the path specified in the LIGAND (config_sp.dat) entry.
If the file is a multi-mol2 file, <molecule ID> can be either a single number, a list of numbers separated by blanks or ’,’, a list of intervals of the form ’a-b’, or ’all’. Each selected molecule from the multi-file is loaded into FlexS’s workspace. Note that even if multiple molecules are loaded, only one compound is ever active at a time (see SELECT).
The following operations are initiated in this order:

1. File read-in

2. Identification of ring systems

3. Molecule initialization (see below)

4. Stereo descriptor and atom equivalence class computation

5. Interaction type and interaction geometry assignment

6. Assignment of the Gaussian representation to the molecule

The molecule initialization comprises a preprocessing step, formal charge assignment, an aromaticity analysis and much more.
Important notes: FlexS requires partial charges specified in the molecule file. These charges are used to compute overlap volumes of the respective Gaussians and to determine the hydrophobicity of an atom (see subsection 5.5.1.2).
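The selection syntax described above (single number, list separated by blanks or ',', intervals 'a-b', or 'all') can be illustrated with a small sketch. This is not FlexS code; the function name `parse_selection`, its range check against the number of available molecules, and the handling of duplicates are assumptions made for illustration only.

```python
def parse_selection(text, n_molecules):
    """Expand a selection such as '1,3 5-7' or 'all' into a sorted list of IDs.

    Hypothetical helper: mirrors the selection grammar described in the
    manual, but is not taken from FlexS itself.
    """
    text = text.strip()
    if text.lower() == "all":
        return list(range(1, n_molecules + 1))
    ids = set()
    # Blanks and commas are both accepted as separators.
    for token in text.replace(",", " ").split():
        if "-" in token:
            start, end = token.split("-", 1)
            lo, hi = int(start), int(end)
            if lo > hi:
                raise ValueError(f"empty interval: {token}")
            ids.update(range(lo, hi + 1))
        else:
            ids.add(int(token))
    # Assumed range check: IDs run from 1 to the number of molecules.
    for i in ids:
        if not 1 <= i <= n_molecules:
            raise ValueError(f"molecule ID {i} out of range 1..{n_molecules}")
    return sorted(ids)
```

For instance, `parse_selection("1,3 5-7", 10)` expands to the IDs 1, 3, 5, 6 and 7 under these assumptions.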

6.5.2 Selecting (SELECT)

Syntax: SELECT [<?>] <molecule ID>
Description: Selects a reference ligand from FlexS’s workspace and prepares the internal data structures for processing. The parameter <molecule ID> gives the number of the molecule to be activated. The optional parameter <?> causes the display of a list of structures contained in FlexS’s workspace.
Important notes: An empty FlexS workspace causes an error. The parameter <molecule ID> must be greater than or equal to 1 and smaller than or equal to the number of molecules currently loaded in FlexS’s workspace.

6.5.3 Deleting (DELETE)

Syntax: DELETE <molecule ID selection>
Description: Removes one or more reference ligands from FlexS’s workspace. <molecule ID selection> specifies the structures to be removed. It can be either a single number, a list of numbers separated by blanks or ’,’, a list of intervals of the form ’a-b’, or ’all’. All data associated with the reference ligands, such as the triangle hash table and superposition predictions, is removed automatically too.

6.5.4 Setting up the initialization procedure (SELINIT)

Syntax: SELINIT [<list of levels>]
Description: Selects the steps that are applied to the reference ligand in the initialization process. If USE_RL_TRANSFORMS (see 5.1.5) is set to 1, the transformation rules of transform.dat are used. Otherwise, only the assignment of formal charges is used during the initialization.
Initially, only the transformation level for the assignment of formal charges is active. To adjust this for special purposes or to perform only selected initialization steps, like protonation or assignment of formal charges, the levels can be selected by this command. For more details see TEST_LIG/SELINIT (6.4.2).

6.5.5 *Building the reference ligand triangle hash table (TRIHASH)

Syntax: TRIHASH
Description: For the second phase of the superposition algorithm, a triangle hash table must be generated. If the hash table is not available, it will be automatically generated by the superposition algorithm. Manual usage of this command is therefore not necessary under standard conditions.

6.5.6 Setting administration defaults for drawings (SELADM)

Syntax: SELADM <graphics object number> <append graphic files> <temp file>
Description: With SELADM you can specify the graphics object numbers used for drawing the reference ligand and you can determine whether the graphics files are temporary with self-generated names or specified in each graphic command.

<graphics object number> If set to (0–255), the drawings for the reference ligand will be displayed in graphics object <graphics object number> (see FlexV manual). If you select 0, you will be asked for a range of graphics object numbers. Subsequent DRAW commands use these graphics object numbers in a first-in-first-overwrite manner.

<temp file> If set to ’y’, the drawings are written in temporary files and removed after quitting FlexS. Otherwise you will be asked for a filename at the end of each DRAW command (see below).

<append graphic files> If set to ’y’, the drawings will be appended to an already existing file. Otherwise the file will be overwritten.

6.5.7 Setting default values for drawing the reference ligand (SELGRA)

Syntax: SELGRA <mol display mode> <hydro> <interact geoms> <interact points> <all co types> [<co type selection>] <surf> <gauss>
Description: With SELGRA you can set specific default values for drawing reference ligands.

<mol display mode> Specifies the default appearance of molecules. The display modes are lines ’1’, sticks ’2’, balls & sticks ’3’, spheres ’4’.

<hydro> If set to ’y’ or ’1’, hydrogens are shown. If set to ’2’, only hydrogens bonded to hetero atoms (i.e., only hydrogens that are bonded to non-carbon atoms) are shown.

<interact geoms> If set to ’y’ or ’1’, the interaction geometries including main directions are shown. Main directions can be excluded from a drawing by setting their color mode to INVISIBLE.

<interact points> If set to ’y’ or ’1’, the discrete interaction points are drawn.

<all co types> If set to ’y’ or ’1’, interaction geometries of all contact types are drawn. Otherwise a set of contact types must be entered in <co type selection>. The selection is a list of integers or integer ranges (format a-b) representing contact type numbers separated by ’,’ or blanks.

<surf> Specifies the kind of surface to draw. Basically the surfaces are molecular surfaces. In lines mode, only lines connecting bounding atoms of reentrant (saddle and concave) patches of the surface are shown. In triangles mode, the concave patches are displayed as triangles. In Connolly mode, the molecular surface is drawn.

<gauss> If set to ’y’ or ’1’, an iso-contour surface of the Gaussians of the different qualities is (selectively) drawn. The different qualities are described in section 10.4.2.

Requirements: Depending on the graphical interface, the surface modes are limited. In VRML mode, both lines mode and triangles mode are available. With FlexV, all surface modes can be shown.
Important notes: The Connolly surface is rendered by its analytically calculated patches. This enables selection of the level of curvature approximation but makes the rendering much more complicated. Therefore a few percent of the patches are rendered incorrectly (we will try to reduce this rate). In addition, there is currently only pairwise cusp trimming.

6.5.8 Selecting the coloring mode (SELCOL)

Identical to TEST_LIG/SELCOL. See page 81.

6.5.9 Labeling the reference ligand (SELLAB)

Syntax: SELLAB <atom name> <infile number>
Description: Reference ligand molecules are labeled with text information.

atom name: If set to ’y’ or ’1’, the short element name of the atom is shown.
infile number: If set to ’y’ or ’1’, the number of the atom in the input file is shown.

6.5.10 Drawing the reference ligand (DRAW)

Syntax: DRAW [<filename>]


Description: DRAW generates a drawing of the reference ligand. If the drawing is not stored in temporary files (see REF_LIG/SELADM), it will be stored in the file <filename>.
Important notes: Drawings are not displayed automatically. Use DISPLAY to output the drawing on the graphics device. If the drawing of Gaussian iso-contour surfaces has been selected, you will be asked for the kind of Gaussian quality to be drawn and for the level of the iso-contour.

6.5.11 Listing the graphic items (GRAINF)

Syntax: GRAINF
Description: Outputs a list of all current graphic settings for the reference ligand.

6.5.12 Setting charges (SETC)

Syntax: SETC
Description: Sets charges with formal charges.

6.5.13 Reading charges (READC)

Syntax: READC <filename>
Description: Reads charges from a separate mol2 file.

6.5.14 Deleting Gaussians (DELETEG)

Syntax: DELETEG <gausstype> <distance>
Description: Deletes Gaussians of type <gausstype> that are further apart from every other Gaussian of that type than <distance>. <gausstype> can be either a single number, a list of numbers separated by blanks or ’,’, a list of intervals of the form ’a-b’, or ’all’. For convenience a table is provided that allows you to estimate how many Gaussians will be deleted at a certain distance threshold. Note that with <distance>=0.0 all Gaussians of the selected type are deleted.
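The deletion criterion can be sketched as follows. This is a minimal illustration and not the FlexS implementation: the function name `isolated_gaussians` is hypothetical, the centers are plain coordinate tuples, and a strict "greater than" comparison is assumed for "further apart".

```python
import math

def isolated_gaussians(centers, distance):
    """Return indices of centers farther than `distance` from every other center.

    Hypothetical helper illustrating the DELETEG criterion for the
    Gaussians of one type; all inputs are made up for illustration.
    """
    isolated = []
    for i, ci in enumerate(centers):
        others = (cj for j, cj in enumerate(centers) if j != i)
        # A Gaussian is selected only if ALL other centers are too far away.
        if all(math.dist(ci, cj) > distance for cj in others):
            isolated.append(i)
    return isolated
```

With distinct centers and a threshold of 0.0 every Gaussian satisfies the criterion, which matches the remark that <distance>=0.0 deletes all Gaussians of the selected type.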

6.5.15 Merging Gaussians (MERGEG)

Syntax: MERGEG <gausstype> <distance>
Description: Merges Gaussians of type <gausstype> that fulfill

∀n ∈ neighbor(k) : |neighbor(n)| = 1 ∨ (∀s ∈ neighbor(n) : s = k ∨ s ∈ neighbor(k))

with neighboring Gaussians being no further apart than <distance>. <gausstype> can be either a single number, a list of numbers separated by blanks or ’,’, a list of intervals of the form ’a-b’, or ’all’.
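The quantified condition for a Gaussian k can be evaluated directly once the neighbor sets are known. The sketch below is an illustration under assumptions: the names `neighbor_sets` and `fulfills_merge_condition` are hypothetical, a Gaussian is assumed not to be its own neighbor, and how FlexS actually combines the qualifying Gaussians is not covered.

```python
import math

def neighbor_sets(centers, distance):
    """Map each index to the set of other indices no further away than `distance`."""
    return {
        i: {j for j, cj in enumerate(centers)
            if j != i and math.dist(ci, cj) <= distance}
        for i, ci in enumerate(centers)
    }

def fulfills_merge_condition(k, neighbors):
    """Check: for every neighbor n of k, either n has exactly one neighbor,
    or every neighbor s of n is k itself or also a neighbor of k."""
    return all(
        len(neighbors[n]) == 1
        or all(s == k or s in neighbors[k] for s in neighbors[n])
        for n in neighbors[k]
    )
```

For three centers on a line with spacing 1.0 and a threshold of 1.5, only the middle Gaussian fulfills the condition: each outer Gaussian has the middle one as a neighbor, and the middle one has a further neighbor that the outer Gaussian does not share.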

6.5.16 Writing Gaussians (WRITEG)

Syntax: WRITEG <filename> <filetype> [<gausstype>]


Description: Writes Gaussians to an X-plor, mol, or mol2 file named <filename>. <filetype> may be either 0 (X-plor), 1 (MOL), or 2 (MOL2). Default extensions are .fofc (X-plor), .gauss (MOL), .gauss2 (MOL2). The type of the Gaussians (see 10.4.2) is given by <gausstype>. The grid spacing is 0.25 Å for X-plor files. In the case of the MOL2 file format <gausstype> can be either a single number, a list of numbers separated by blanks or ’,’, a list of intervals of the form ’a-b’, or ’all’. Note that all selected types are then written to a single multi-mol2 file. The default directory for the output is specified in the PREDICT entry of config_sp.dat.

6.5.17 Reading Gaussians (READG)

Syntax: READG <filename> <filetype> <overwrite> [<gausstype>] [<use_neg>] [<mean_shift>] [<threshold>]
Description: Reads Gaussians from a separate X-plor, mol, or mol2 file named <filename>. <filetype> may be either 0 (X-plor), 1 (MOL), or 2 (MOL2). <overwrite> determines whether the Gaussians should overwrite already loaded ones. The type of the Gaussians (see 10.4.2) is given by <gausstype>.

For the syntax of X-plor files, see the manual of the X-plor program (http://xplor.csb.yale.edu/xplor-info/). The default extension of X-plor files is .fofc. X-plor files define a grid of density values that is to be transformed into a set of Gaussians internally. The remaining parameters specify whether negative density values are considered or neglected (<use_neg>), a shift of the mean of the density values (<mean_shift>), and a threshold, to be given in σ-units, under which densities are ignored (<threshold>).
Mol and mol2 files have the usual SYBYL syntax [19] and by default the extension .gauss and .gauss2, respectively. A coordinate entry in these files defines the location of the center of a Gaussian and the corresponding charge entry defines its volume. Note that a Gaussian $a e^{-b x^2}$ has the volume $a(\pi/b)^{3/2}$ (see DEFAULT_GAUSS_WIDTH for the definition of b). The name of the dummy atom defining a Gaussian is required to be Stce.
Multi-mol2 files are allowed and the type of the Gaussians may be specified in comment lines starting with the record type indicator @<TRIPOS>COMMENT. The syntax is Flexs Gaussians of property <gausstype>. By default the type E_DENSITY is assumed. The following example defines three Gaussians of the type CHARGE (partial charge).


Example

# File example.gauss created by FLEXS
# Creation time: 08-Dec-98 16:21:23
# GAUSSIANS OF PROPERTY CHARGE
@<TRIPOS>MOLECULE
example.gauss
24 0
@<TRIPOS>ATOM
1 G 0.3843 2.5122 -7.5688 Stce 1 lig 0.062
2 G 0.2793 -2.9642 0.0718 Stce 1 lig -0.550
3 G 2.3723 2.9332 0.1578 Stce 1 lig -0.550
@<TRIPOS>BOND
@<TRIPOS>COMMENT
Flexs Gaussians of property CHARGE
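The relation between the charge entry (volume) and the Gaussian prefactor can be made concrete with a short sketch. It assumes an isotropic three-dimensional Gaussian a·exp(−b·r²), whose integral over space is a(π/b)^(3/2). The function names and the value of b used in any call are assumptions for illustration; the actual width is governed by DEFAULT_GAUSS_WIDTH and is not reproduced here.

```python
import math

def gaussian_volume(a, b):
    """Volume (space integral) of a * exp(-b * r^2): a * (pi / b)^(3/2)."""
    return a * (math.pi / b) ** 1.5

def amplitude_from_volume(volume, b):
    """Invert the volume formula to obtain the prefactor a for a given width b."""
    return volume * (b / math.pi) ** 1.5

def parse_gaussian_atom_line(line):
    """Read one atom record of a Gaussian file as (center, volume).

    Hypothetical parser: expects the nine whitespace-separated fields
    id, name, x, y, z, type, substructure id, substructure name, charge.
    """
    fields = line.split()
    if len(fields) != 9 or fields[5] != "Stce":
        raise ValueError("not a Gaussian dummy atom record")
    center = tuple(float(v) for v in fields[2:5])
    volume = float(fields[8])  # the charge entry carries the volume
    return center, volume
```

Applied to the first atom record of the example, the parser returns the center (0.3843, 2.5122, -7.5688) and the volume 0.062; `amplitude_from_volume` then gives the prefactor for whatever width b is in effect.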

6.6 *Changing the static data (DATABASE submenu)

6.6.1 Editing the static data files (EDIT)

Syntax: EDIT <static data filename>
Description: Calls the editor defined in config_sp.dat with the file corresponding to <static data filename>. Valid strings for <static data filename> are the environment variables for static data files as listed in section 5.1.3.

6.6.2 Performing a cascading reload (RELOADDAT)

Syntax: RELOADDAT <static data filename>
Description: Reloads the static data file <static data filename>. Performs, if necessary, a cascading reload. The user will receive a warning that the test ligand or reference ligand data may have become inconsistent. Valid strings for <static data filename> are the environment variables for static data files as listed in section 5.1.3 and the string all.

Example

RELOADDAT all
RELOADDAT geometry

The first example reloads all static data files. The second example first reloads the static data file stored in the GEOMETRY environment variable (in our convention: geometry_sp.dat). Since the static data file contact_sp.dat uses definitions from geometry_sp.dat, it is reloaded too (cascading reload).
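The idea of a cascading reload can be sketched as a traversal of a dependency map. The sketch is illustrative only: the map `USES` contains just the one dependency named in the text (contact_sp.dat uses geometry_sp.dat), the function name is hypothetical, and the real dependency structure of the static data files is not documented here.

```python
from collections import deque

# Hypothetical dependency map: file -> files whose definitions it uses.
# Only the geometry/contact relation is taken from the text above.
USES = {
    "contact_sp.dat": ["geometry_sp.dat"],
    "geometry_sp.dat": [],
}

def cascading_reload_order(changed, uses):
    """Return `changed` followed by every file that directly or indirectly uses it."""
    # Invert the map: for each file, which files depend on it?
    dependents = {}
    for name, deps in uses.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)
    order, seen, queue = [], {changed}, deque([changed])
    while queue:
        current = queue.popleft()
        order.append(current)
        for user in dependents.get(current, []):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return order
```

Reloading geometry_sp.dat in this toy map also schedules contact_sp.dat, whereas reloading contact_sp.dat alone triggers nothing further.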

6.6.3 Saving the graphic settings (SAVEGC)

Syntax: SAVEGC <file>
Description: SAVEGC writes the graphic settings of test ligand, reference ligand and superposition to <file> (default filename is graphic_sp.dat).


6.6.4 Decrypting static data files (DECRYPT)

Syntax: DECRYPT <directory>
Description: DECRYPT decrypts the static data files so that the user can modify them. It always creates a copy in <directory>.
Important notes: This command is only available with a full license; it is not available with an evaluation license. In order to use the modified static data files you must modify config_sp.dat.

6.7 Superpositioning (SUPERPOS submenu)

The superposition algorithm of FlexS consists of three phases: selecting a set of base fragments, placing the base fragments on top of the reference ligand, and building up the complex incrementally, beginning at the base fragments. Two of the three phases are associated with the following two commands.

6.7.1 Placing base fragments (PLACEBAS)

Syntax: PLACEBAS <mode> [<matching_list> <continue>] [<continue>]
Description: Places the base fragment of each fragmentation on top of the reference ligand. <mode> selects the algorithm used:

m Manual base fragment placing is performed. Here, the test ligand is superposed on top of the reference coordinates, which must be loaded or set by the READREF, MAPREF or SETREF command. If MAPREF is used to assign reference coordinates, all valid mappings onto the subgraph read by MAPREF are used in turn for manual base placement.

p Same as manual base placement, except that a set of small variations around the precise placement position is used additionally.

2 Manual base fragment placing is performed. Here, the test ligand is superposed on matching atoms of the reference ligand. You must select at least three atoms of the base fragment and corresponding atoms of the reference ligand (the matching). If <continue> equals ’y’ or 1, the algorithm minimizes the rms of the matching for each conformation of the base fragment. Then the matching algorithm creates a placement for every conformation with smallest rms. But if you select only a small number of matching atoms, the algorithm can create a placement for every conformation, because the coordinates of the chosen base fragment atoms are equal in all conformations.

3 The standard algorithm based on triangle hashing techniques is used. A placement is found only if at least three interactions between the reference ligand and the base fragment occur. If only a few placements are found or the base fragment has only two active interactions, FlexS automatically tries to find base placements with the line algorithm (2), too.

o An algorithm based on a rigorous optimization technique starting from a few sample points (RigFit) places the base fragment. A warning is reported if the number of conformers of the base fragment exceeds a certain limit. In such cases the runtime is expected to be comparatively high and the user is asked if the placement should proceed (default is no).

If either of the placement modes 3 and o generates only a few base placements, the system automatically starts the respective alternative procedure. Parameters controlling the base placement are listed in the flexs_settings.dat static data file. Parameters controlling the RigFit placement are listed in the static data file optpar.dat.
Requirements: Reference ligand and test ligand must be loaded (READ) and a base fragment must be defined (SELBAS). For running in manual mode you must have specified reference coordinates with the SETREF or READREF command.
Important notes: The triangle algorithm can be quite time-consuming and memory-intensive under specific conditions such as very symmetric interaction patterns. Therefore, the number of examined triangles (<MAX_NOF_Q_TRI>) and the number of clusters generated (<MAX_NOF_CLUSTER>) are limited.

6.7.2 Fragment-based screening (SCREEN)

Syntax: SCREEN <fragmentation> <resolution>
Description: Rigid-body superposes the screening fragment no. 0 (i.e., the base fragment) of the selected fragmentations onto all reference ligands loaded into FlexS’s workspace using the RigFit method. Tabular output is generated, providing one line for each reference ligand and one column for each screening fragment. Each table entry provides the optimum FlexS score of the respective placement of a screening fragment on top of a reference ligand.
For <resolution> see the controlling parameter of the RigFit method <laue_radius> in section 10.21.3.
Requirements: Reference ligand and test ligand must be loaded (TEST_LIG/READ) and a screening fragment must be defined (TEST_LIG/SELBAS).
Important notes: In order to avoid unnecessary computational overhead, <SUPERPOSITION_MODE> should be set to 2 for this operation (see 5.1.5).

6.7.3 Building up the complex (COMPLEX)

Syntax: COMPLEX [<build up>]
Description: Builds up the test ligand on top of the reference ligand. This is done incrementally, beginning with the placed base fragments. You can stop the build-up process after any fragment by setting <build up> to the corresponding fragment number. If <build up> equals ’all’ or is not defined, the whole test ligand will be placed. Parameters controlling the base placement are listed in the flexs_settings.dat static data file.
Requirements: Reference and test ligand must be loaded (TEST_LIG/READ), the base fragments must be defined and placed (PLACEBAS).
Important notes: The base placement is destroyed during the complex building phase. It is not possible to restore the base placements after the complex building phase is performed.


6.7.4 Local flexible/rigid-body postoptimization of placements (POPT)

Syntax: POPT <flexible> <placements>
Description: Postoptimization of placements. This can be the base placement, a partial placement in the complex construction procedure, or the complete alignment prediction. The reference ligand is included as a rigid template for the postoptimization. An ’all-on-one’ goal function is optimized (see eq. (6.5) in section 6.9.1).

If <flexible> equals ’y’, the goal function according to the flag ’OPTIMIZE’ (see p. 55) is used for optimization.
If <flexible> equals ’n’, the Gaussian overlap volume of rigid placements will be optimized. The rigid-body optimization is performed using the RigFit method. Parameters controlling the flexible postoptimization for the RigFit placement are listed in the static data file optpar.dat (see section 10.21).
<placements> is the selection of placements to be optimized. It can either be a single number, a list of numbers separated by ’,’, a list of intervals of the form ’a-b’, or ’all’.
Properties of the new/optimized placement (such as RMS value, overlap volume, etc.) are updated. For an application example, see Appendix B.6.

Requirements: Placements must have been computed.
Important notes: Flexible postoptimization: If the flag TORSION_MODE is not set to 2 (see section 5.1.5 and Appendix A), the torsion energy flag (see section 6.9.2) is turned off automatically.
Placements are kept in the original sequence. After optimization the placements are no longer guaranteed to be sorted by score. The user may sort the list again using the SORT command (see section 6.7.10).

6.7.5 Interactive selection of solutions (SELECT)

Syntax: SELECT <placement selection>
Description: Enables the interactive selection of a restricted set of partial placements. The selection is either a single number, a list of numbers separated by ’,’, a list of intervals of the form ’a-b’, or ’all’.
Requirements: A superposition must have been computed.
Important notes: Once the SELECT command is finished, deleted placements can only be recovered by re-computation. After the operation, the placements are in the same order; the numbering of the placements is reset to values from 1 to the number of selected placements.

6.7.6 Clustering solutions (CLUSTER)

Syntax: CLUSTER <max rms> <max angle dev.> <max length dev.>
Description: Clusters the placements proposed with a complete linkage cluster algorithm. The distance between two placements is defined to be the RMSD between them. All placements in a remaining cluster have an RMSD below <max rms>. If there are vectors to fragments not yet placed, two placements can only be clustered if the following conditions are met for all pairs of vectors:

1. The distance of the endpoints of the vectors must be less than or equal to <max length dev.>.

2. The enclosing angle between the vectors must be less than or equal to <max angle dev.>.

After the clustering, only the energetically highest placement of each cluster remains in the set of solutions.
Requirements: A superposition must have been computed.
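Complete linkage clustering with an RMSD threshold can be sketched as follows. The sketch makes several assumptions: the pairwise distance is supplied as a function `dist(i, j)`, the vector conditions on angle and length deviation are ignored, and "energetically highest" is read as "highest score". The function names are hypothetical and the code is not taken from FlexS.

```python
def complete_linkage_clusters(n, dist, max_rms):
    """Cluster items 0..n-1 so that all pairs inside a cluster are closer than max_rms.

    Repeatedly merges the two clusters with the smallest complete-linkage
    distance (the largest pairwise distance between their members) as long
    as that distance stays below max_rms.
    """
    clusters = [[i] for i in range(n)]
    while True:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(dist(i, j) for i in clusters[a] for j in clusters[b])
                if d < max_rms and (best is None or d < best[0]):
                    best = (d, a, b)
        if best is None:
            return clusters
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

def representatives(clusters, scores):
    """Keep one placement per cluster: the one with the highest score (assumed)."""
    return sorted(max(c, key=lambda i: scores[i]) for c in clusters)
```

As a toy usage, one-dimensional positions can stand in for placements, with `dist = lambda i, j: abs(pos[i] - pos[j])` playing the role of the pairwise RMSD.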

6.7.7 Writing placements in pdf format (WRITE)

Syntax: WRITE <filename> <code transformations>
Description: Writes a placement in a FlexS-specific file format (.pdf format) to disk. The default directory for this command is the path specified in the entry PREDICT (config_sp.dat). The pdf format is based on ASCII and can therefore be read and edited with standard tools. Because small changes in transformations can result in different solutions, transformation information should be coded by setting <code transformations> to ’y’.
Important notes: Coding works only on machines with specific floating-point representations. Thus, it may be the case that coding cannot be used on your hardware platform. Be careful when reading and writing coded pdf files on different machines.

6.7.8 Reading placements in pdf format (READ)

Syntax: READ <filename>
Description: Reads a placement in a FlexS-specific file format (.pdf format) from the file <filename>. The default directory for this command is the path specified in the entry PREDICT (config_sp.dat).
Important notes: The placement information is based on the reference and test ligand molecules. Therefore the reference/test ligand files in FlexS’s main memory before executing the READ command must be equal to the files which were in main memory during the generation of the placements. Otherwise, FlexS ends up in an inconsistent state which is not detected in every case.

6.7.9 Deleting a placement (DELETE)

Syntax: DELETE
Description: Deletes all placement predictions from FlexS’s workspace.
Important notes: The base placement is destroyed during the complex building phase. It is not possible to restore the base placements after the complex building phase is performed.

6.7.10 Sorting the list of placements (SORT)

Syntax: SORT <property>
Description: Sorts the list of placements by the specified <property>. This is normally necessary after optimizing the placements (SUPERPOS/POPT). Possible properties are: <E_TOTAL> and <RMSD>.

6.7.11 Outputting the most important quantities of a superposition result (INFO)

Syntax: INFO <output format>
Description: Displays the main characteristics of the superposition result, such as number of solutions, highest ranking score, etc. on the screen. If <output format> equals ’t’, the result is output in a table, otherwise (’l’) the output is printed on one line. The single-line option (’l’) is very useful to summarize a superposition run over large data sets. All single lines start with reference ligand name and test ligand name. If <output format> equals ’s’, the output of a RigFit run will be printed on one line; this option is only available if the VERBOSITY level is 4 or greater.

6.7.12 Listing solutions (LISTSOL)

Syntax: LISTSOL <table length>
Description: Displays a table of <table length> solutions of the superposition on the screen. The table has the following columns. The column identifiers are shown in parentheses:

No. (SOL_NO) The number of the solution.

Total Score (E_TOTAL) Total score of the superposition solution.

Match Score (E_MATCH) Contribution of the matched interacting groups.

Overlap Score (E_OVL) Contribution of the Gaussian overlap volume.

Estimated Score (E_EST) Estimation of the sum of match and overlap score for the whole test ligand.

Normalized Score (E_NORM) The total score divided by the score achieved by superpositioning the test ligand onto itself.

RMSD Value (RMSD) RMSD of coordinates from reference coordinates. If there are no reference coordinates, this column contains the RMSD from the highest ranking solution.

Similarity Index (SIM_IDX) Measure of similarity between solution coordinates and reference coordinates. If there are no reference coordinates, all entries of this column are −1.0. The similarity index score is similar to the RMSD value given in the previous column. The RMSD value, however, is strictly restricted to the RMSD between the 2 coordinates assigned to one atom: the solution coordinates and the reference coordinates. The similarity index rather corresponds to a "fuzzy" similarity measure: it is based on an RMSD between the generated coordinates of an atom and the reference coordinates of the nearest atom of the same SYBYL atom type. For example, in docking, a symmetric molecule docked back to front will have a bad (high) RMSD but can still achieve a good (low) similarity index.

#Match (NOF_MATCH) Number of matches.

Norm. Volume (NORM_VOL) V.d.W. overlap volume normed by the v.d.W. volume of the test ligand.

Ovl. Volume (OVL_VOL) V.d.W. overlap volume of test ligand and reference ligand.

Fragmentation No. (FRAG_NO) Number of the fragmentation used for this prediction.

Conf. String (CONF_STR) String displaying the conformation of the test ligand (internal notation) (in DEBUG_MODE only).

Sol. String (SOL_NR_STR) String displaying the rank of the solution after each build-up step (in DEBUG_MODE only).
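The difference between the RMSD value and the similarity index can be illustrated with a small sketch. It follows the verbal description only: the exact formula used by FlexS is not given in this section, so the function `similarity_index` below is an assumption, as are the atom representation (type, coordinates) and the toy molecule.

```python
import math

def rmsd(solution, reference):
    """Fixed-order RMSD: atom i of the solution is compared with atom i of the reference."""
    total = sum(math.dist(s, r) ** 2 for (_, s), (_, r) in zip(solution, reference))
    return math.sqrt(total / len(solution))

def similarity_index(solution, reference):
    """RMSD-like measure using, for each atom, the nearest reference atom of the same type.

    Assumed reading of the description; not the FlexS formula.
    """
    total = 0.0
    for atom_type, s in solution:
        nearest = min(math.dist(s, r) for t, r in reference if t == atom_type)
        total += nearest ** 2
    return math.sqrt(total / len(solution))

# Toy symmetric molecule (made-up coordinates) and the same molecule placed back to front.
reference = [("O.3", (-1.0, 0.0, 0.0)), ("C.3", (0.0, 0.0, 0.0)), ("O.3", (1.0, 0.0, 0.0))]
flipped = [("O.3", (1.0, 0.0, 0.0)), ("C.3", (0.0, 0.0, 0.0)), ("O.3", (-1.0, 0.0, 0.0))]
```

For the flipped placement the fixed-order RMSD is clearly non-zero because the two oxygen atoms are swapped, while the type-based measure is zero because each atom coincides with a reference atom of the same type.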

6.7.13 Listing solutions sorted by RMSD (LISTRMS)

Syntax: LISTRMS <table length>
Description: Displays a table of <table length> solutions of the placement prediction on the screen, sorted by RMSD. The table has the same columns as listed above.

6.7.14 Listing the matches of all solutions (LISTMAT)

Syntax: LISTMAT <table length>
Description: Displays a table of all matches of the placement prediction on the screen. The table has the following columns:

No. (SOL_NO) The number of the corresponding solution.
TLig Atom (TLIG_ATOM) Test ligand atom name of the match.
TLig ANo. (TLIG_ATOM_NO) Test ligand atom number of the match.
Test-Lig IA-Type (TLIG_IA-Type) Test ligand interaction type.
Rlig Atom (RLIG_ATOM) Reference ligand atom name of the match.
RLig ANo. (RLIG_ATOM_NO) Reference ligand atom number of the match.
Ref-Lig IA-Type (RLIG_IA-Type) Reference ligand interaction type.
Opt. Energy (E_OPT) Optimal score (without geometry penalties) of the match.
Chg. (CHG) Product of formal charges of the interacting atoms.
Chg. fact. (CHG_FAC) Charge factor for the interaction.
LDev. (LDEV) Length deviation.
LDev. fact. (LDEV_FAC) Length deviation factor.
ADevL (ADEVL) Angle deviation on test ligand site.
ADevL fact. (ADEVL_FAC) Angle deviation factor on test ligand site.
ADevR (ADEVR) Angle deviation on reference ligand site.
ADevR fact. (ADEVR_FAC) Angle deviation factor on reference ligand site.
Res. Engy. (E_RES) Resulting match score (optimal score multiplied by the charge factor and rescaled by the deviation factors).

6.7.15 Listing all solutions and matches (LISTALL)

Syntax: LISTALL <table length>
Description: Displays a table of <table length> solutions and all matches of the superposition on the screen. For a description of the table columns, see the two sections above.


6.7.16 Listing one solution and the corresponding matches (LISTONE)

Syntax: LISTONE <solution number>
Description: Lists solution <solution number> and the corresponding matches on the screen. For a description of the table columns, see the two sections above.

6.7.17 Performing specific queries on solutions and matches (QUERY)

In many real cases, the number of solutions and matches is very large. It is possible to select specific information from the solutions and matches tables. There are three ways of selecting or rearranging the table information:

SELECT specific columns FROM the table(s).

Select specific rows of the table(s) WHERE a certain condition applies.

Output the information selected in this way SORTed BY some criteria.

An SQL-like language is provided for the user to tell FlexS what information to display.

Syntax: QUERY <field list> <table list> [<condition>] [<order list>] <table length>
Description: <field list> is a list of field names, separated by commas. A field name is an identifier for a column of one of the tables. A list of valid strings for field names is output when you enter QUERY without arguments (see sections 6.7.12 and 6.7.14 for a list of valid strings). In the resulting output, only the table columns represented by <field list> are listed. An asterisk * for <field list> is valid and represents the complete list of field names.
<table list> is a list of the names of the tables you want to see, separated by commas. Valid table names are solutions and matches. An asterisk * for <table list> is also valid and stands for solutions, matches (or, equivalently, matches, solutions). <table list> must contain all tables whose columns have been selected by <field list>.
<condition> is optional and selects rows of the tables (whereas <field list> selects columns). An atomic condition is a field name followed by an arithmetic operator (=, >=, <=, >, < or !=) or the contains-string operator ([]), followed by an appropriate constant. String constants must be enclosed in single quotes. Atomic conditions can be connected by the binary Boolean operators and and or. Conditions can also be nested with brackets ’(’, ’)’.
The underlying semantics for conditions containing an and operator for different combinations of tables are different. If you have selected the solutions or the matches table separately.
<order list> is also optional and describes the order in which the selected rows are to be displayed on the screen. An <order list> is a list of order specifications, separated by commas. An order specification consists of one of the strings ascending or descending, followed by a field name. The field name must be an element of <field list>. The string ascending or descending is optional. If it is missing, ascending is assumed.


Example

QUERY "sol_no, e_total, e_match, e_lipo, e_rot" solutions
QUERY * solutions "sol_no > 20 and (e_total<-10.0 or nof_match > 3)" ""
QUERY * * "" "descending e_total, ascending nof_match, descending e_res"

The first query shows five columns (solution number and four energy values) of the complete solutions table.
The second query shows all columns of the solutions table, but only those solutions (rows) whose number is greater than 20 and whose total energy is less than -10.0 or whose number of matches is greater than 3.
The third query shows the complete solutions and matches tables, but reordered: the solutions are sorted by decreasing total energy, and those with equal total energy are sorted by ascending number of matches. The matches of one solution are sorted by decreasing resulting energy.
In example 2, the pair of double quotes "" stands for the omitted optional parameter <order list>; in example 3, it stands for the omitted optional parameter <condition>.
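The select, filter and sort semantics of the second and third example can be mimicked outside FlexS. The following Python sketch only illustrates those semantics and is not FlexS code: the function `query`, the row dictionaries and all numeric values are invented for this illustration.

```python
# Hypothetical rows standing in for the solutions table; the values are invented.
solutions = [
    {"sol_no": 5,  "e_total": -14.2, "nof_match": 4},
    {"sol_no": 21, "e_total": -12.5, "nof_match": 2},
    {"sol_no": 22, "e_total": -8.0,  "nof_match": 5},
    {"sol_no": 23, "e_total": -6.1,  "nof_match": 1},
    {"sol_no": 24, "e_total": -12.5, "nof_match": 1},
]


def query(rows, fields=None, where=None, order=None):
    """Select columns, filter rows and sort them, in the spirit of QUERY.

    order is a list of (direction, field) pairs, most significant pair first.
    """
    selected = [r for r in rows if where is None or where(r)]
    # Python's sort is stable, so sorting by the least significant key first
    # and by the most significant key last yields the combined ordering.
    for direction, field in reversed(order or []):
        selected.sort(key=lambda r: r[field], reverse=(direction == "descending"))
    if fields is None:  # plays the role of the asterisk *
        return selected
    return [{f: r[f] for f in fields} for r in selected]


# Condition of the second example, ordering of the third example.
result = query(
    solutions,
    where=lambda r: r["sol_no"] > 20 and (r["e_total"] < -10.0 or r["nof_match"] > 3),
    order=[("descending", "e_total"), ("ascending", "nof_match")],
)
print([r["sol_no"] for r in result])
```

Solution 23 is dropped because it satisfies neither energy nor match-count condition, and solutions 24 and 21 share the same total energy, so their order is decided by the number of matches.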

6.7.18 Performing a specific query a second time (QHIST)

Syntax: QHIST <query no.>
Description: With QHIST you can perform a previous query again. After typing the command, you will receive a list of the last ten query commands. You can choose one of them by its number <query no.>.
Requirements: A query must have been previously performed using the command QUERY.

6.7.19 Writing solutions in a table (PRINTSOL)

Syntax: PRINTSOL <filename> <table length> <separator> <fillchar> <append> [<pvm merge>]
Description: Writes a table of <table length> solutions of the superposition to a file. The name of the file is <filename>.log if <filename> does not contain a suffix; otherwise the name is <filename>.
The table has the following columns. The first and second columns contain the ligand name and the index of the placement solution. The next columns are the columns of the LISTSOL table. In addition, the table contains one column for each interaction geometry. These columns contain the Res. Engy. value of the LISTMAT table if the geometry is matched. Columns are separated by <separator>. If a solution has no result for a match column, the column contains <fillchar>. If <append> is ’n’, <filename>.log is created and its first row is a header row. Otherwise the table is appended to <filename>.log without a header row. If the reference ligand does not change, the table has the same format for all test ligands because each row has the same columns. The default directory for this command is the path specified in the PREDICT entry.
In FlexS-PVM, <filename>.log is automatically merged after parallel script execution. This feature can be switched off by setting <pvm merge> to ’no’.


If <pvm merge> is set to ’no’, the built-in batch variable $(PVM_ID) is automatically appended to the filename.
Notes: <filename> can be used by a spreadsheet program like EXCEL.
Important note: For filename usage and file merging within scripts, please refer to the PVM section on page 125.
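Because the PRINTSOL table is a plain separator-delimited text file, it can also be post-processed with a script instead of a spreadsheet. The following Python sketch assumes a file written with a header row (<append> set to ’n’), ’;’ as <separator> and ’-’ as <fillchar>. The function name `read_printsol`, the column names and the sample values are invented for this illustration; the real column names are those in the header row that FlexS writes.

```python
import csv
import io


def read_printsol(text, separator=";", fillchar="-"):
    """Parse a PRINTSOL table that starts with a header row.

    Cells equal to the fill character are returned as None.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=separator)
    rows = []
    for raw in reader:
        row = {}
        for name, value in raw.items():
            value = (value or "").strip()
            row[name.strip()] = None if value == fillchar else value
        rows.append(row)
    return rows


# Invented sample content, not actual FlexS output.
sample = (
    "LIGAND;SOL_NO;E_TOTAL;MATCH_A;MATCH_B\n"
    "lig1;1;-21.3;-4.7;-\n"
    "lig1;2;-19.8;-;-3.1\n"
)
table = read_printsol(sample)
print(table[0])
```

Only cells that consist of the fill character alone are treated as empty, so negative energies such as -4.7 are kept even when ’-’ is used as <fillchar>.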

6.7.20 Setting administration defaults for drawings (SELADM)

Syntax: SELADM <graphics object number> <temp file> <append graphic files>
Description: With SELADM you can specify the graphics object numbers used for drawing placements and you can determine whether the graphics files are temporary with self-generated names or specified in each graphic command.

<graphics object number> If set to a number (0–255), the drawings for the placements will be displayed in graphics object <graphics object number> (see FlexV manual). If you select 0, you will be asked for a range of graphics object numbers. Subsequent DRAW commands use these graphics object numbers in a first-in-first-overwrite manner.

<temp file> If set to ’y’ or ’1’, the drawings are written to temporary files and removed after quitting FlexS. Otherwise you will be asked for a filename at the end of each DRAW command (see below).

<append graphic files> If set to ’y’, the drawings will be appended to an already existing file. Otherwise the file will be overwritten.

6.7.21 Setting default values for drawing placements (SELGRA)

Syntax: SELGRA <include test_lig> <include ref_lig> <all co types> [<co type selection>]
Description: With SELGRA, you can set specific default values for drawings of placements (superposition results). A placement is drawn as a set of dashed lines connecting the interacting groups. The test ligand in the predicted conformation and at the predicted position can be drawn separately or can be included in the drawing of the placement:

<include test_lig> If set to ’y’ or ’1’, the test ligand is included in the drawing.
<include ref_lig> If set to ’y’ or ’1’, the reference ligand is included in the drawing.
<all co types> If set to ’y’ or ’1’, interaction geometries of all contact types are drawn. Otherwise a set of contact types must be specified in <co type selection>. The selection is a list of integers or integer ranges (format a-b) separated by ’,’ or blanks.
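The selection format (integers or ranges a-b, separated by ’,’ or blanks) is also used for placement selections. As an illustration of how such a string can be read, here is a small Python sketch. It is not the FlexS parser; the function name `parse_selection` is invented, and treating ranges as inclusive and merging duplicates are assumptions.

```python
import re


def parse_selection(text):
    """Expand a selection such as '1-3, 7 9' into a sorted list of integers.

    Assumptions: ranges a-b include both ends, duplicates are merged.
    """
    selected = set()
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        if "-" in token:
            first, last = token.split("-", 1)
            selected.update(range(int(first), int(last) + 1))
        else:
            selected.add(int(token))
    return sorted(selected)


print(parse_selection("1-3, 7 9"))
```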

6.7.22 Selecting the coloring mode (SELCOL)

Syntax: SELCOL <color mode selection>
Description: With SELCOL you can set the color mode. Valid modes for placements are listed below. Depending on the color mode chosen, you will be asked for specific color values.



invisible : Interactions are not drawn.
unique : All interactions are drawn in one user-defined color.
contact type : The interaction lines are colored according to the interaction (or contact) type (on the reference ligand side) of the interaction.
opt-energy : The interactions are colored according to their contribution to the total free energy, assuming that the interactions have optimal geometries. You are asked for the number of colors, the energy range and the color range. The energy (opt-energy) values are linearly mapped to colors.
energy : Like opt-energy, but includes the geometric scaling factors.
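A linear mapping of energy values to a fixed number of colors, as used by the opt-energy and energy modes, could look like the following Python sketch. The manual does not specify the exact binning; the function name `energy_to_color_index`, the equal-width bins and the clamping of out-of-range energies are assumptions made for this illustration.

```python
def energy_to_color_index(energy, e_min, e_max, n_colors):
    """Map an energy linearly onto a color index in 0 .. n_colors - 1."""
    if e_max <= e_min or n_colors < 1:
        raise ValueError("need e_min < e_max and at least one color")
    # Assumption: energies outside the range take the first or last color.
    clamped = min(max(energy, e_min), e_max)
    fraction = (clamped - e_min) / (e_max - e_min)
    # The upper end of the range falls into the last bin.
    return min(int(fraction * n_colors), n_colors - 1)


for e in (-10.0, -5.0, 0.0, 3.0):
    print(e, energy_to_color_index(e, -10.0, 0.0, 5))
```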

6.7.23 Labeling placements (SELLAB)

Syntax: SELLAB <ia type> <energy> <opt energy>
Description: Reference ligand/test ligand interactions are labeled with text information.

ia type : If set to ’y’ or ’1’, the interaction type is shown.
energy : If set to ’y’ or ’1’, the energy value of the interaction is shown.
opt energy : If set to ’y’ or ’1’, the optimal energy value of the interaction type is shown.

6.7.24 Drawing placements (DRAW)

Syntax: DRAW <placement number> [<filename>]
Description: DRAW generates a drawing of the placement <placement number>. If the drawing is not stored in temporary files (see TEST_LIG/SELADM), it will be stored in the file <filename>.
Important notes: Drawings are not displayed automatically. Use DISPLAY to output the drawing on the graphics device.

6.7.25 Drawing multiple placements (MDRAW)

Syntax: MDRAW <placement selection> [<directory> <filename>]
Description: Generates drawings of a selected set of placements. The selection is a list of integers or integer ranges (format a-b) representing placement IDs separated by ’,’ or blanks. For each drawing, MDRAW works like the DRAW command. MDRAW produces different results depending on the capabilities of the visualization tool.

VRML The drawings are always displayed simultaneously.
FlexV The drawings are attached to the object 0 slider in FlexV, overwriting any previous contents in each case. You can choose between the different drawings by moving this slider (see FlexV manual).

Important notes: Drawings are not displayed automatically. Depending on the graphical interface, use DISPLAY to output the drawings to FlexV or the VRML browser.


6.7.26 Listing the graphic items (GRAINF)

Syntax: GRAINF
Description: Outputs a list of all current graphic settings for drawing superposition results.

6.7.27 *Special commands to evaluate placements (EVALUATE)

This menu contains a collection of commands for analyzing the results of your superposition run or of a loaded orientation (X-ray data, for example).

6.7.27.1 Computing and scoring the contacts found (SCORE)

Syntax: SCORE <table format> <draw>
Description: All interactions between reference and test ligand are computed on the basis of the fix-coordinates loaded with the test ligand (TEST_LIG/READ).

<table format> Yes/no answer:
yes Return a table, which is described in section 6.7.15 (SUPERPOS/LISTALL).
no Return the scoring and matching information in a single-line format. The first and second columns of the line contain the ligand name and an index. The next columns are the columns of the LISTSOL table. The line also contains one column for each kind of match. These columns contain the Res. Engy. value of the LISTMAT table. Columns are separated by ’;’. If the ligand has no result for a match column, the column contains ’ ’.

<draw> Yes/no answer:
yes Automatically generate a drawing of the test ligand, the reference ligand and the interaction information etc. in the form of the SUPERPOS/DRAW command. The graphics must be visualized using the DISPLAY command.
no Do not generate a drawing.

The placement together with the interactions cannot be stored.
Requirements: Reference and test ligand must be loaded.
Important notes: Note that the fix-coordinates, NOT the ref-coordinates, are taken into account here.

6.7.27.2 Computing the contacts found (MATCH_LS)

Syntax: MATCH_LS <draw>
Description: All interactions between reference and test ligand are computed on the basis of the fix-coordinates loaded with the test ligand (TEST_LIG/READ). The output table is described in section 6.7.15 (SUPERPOS/LISTALL). The placement together with the interactions cannot be stored. However, they can be visualized by setting <draw> to yes.
Requirements: Reference and test ligand must be loaded. A base fragment must be selected.
Important notes: Note that the fix-coordinates, NOT the ref-coordinates, are taken into account here.


6.7.27.3 Evaluating the overlap of a placement

Syntax: OVERLAP
Description: Evaluates the fraction of the test ligand volume that overlaps with the reference ligand volume, considering van der Waals spheres. Additionally, the number of test ligand atoms with centers located outside the reference ligand volume is provided. A single line of output is generated.
Requirements: Reference and test ligand must be loaded.
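The second quantity, the number of test ligand atoms whose centers lie outside the reference ligand volume, can be illustrated with a short Python sketch. It treats the reference ligand volume as the union of spheres around its atoms; the function name `atoms_outside`, the coordinates and the radii are invented, and the sketch does not reproduce the FlexS implementation or its van der Waals radii.

```python
import math


def atoms_outside(test_centers, ref_atoms):
    """Count test ligand atom centers lying outside every reference sphere.

    test_centers: list of (x, y, z) tuples.
    ref_atoms: list of ((x, y, z), radius) tuples.
    """
    outside = 0
    for center in test_centers:
        inside_any = any(
            math.dist(center, ref_center) <= radius
            for ref_center, radius in ref_atoms
        )
        if not inside_any:
            outside += 1
    return outside


# Invented coordinates (in Å) and radii, for illustration only.
ref_atoms = [((0.0, 0.0, 0.0), 1.7), ((1.5, 0.0, 0.0), 1.7)]
test_centers = [(0.5, 0.0, 0.0), (3.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(atoms_outside(test_centers, ref_atoms))
```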

6.7.27.4 Computing an RMSD histogram (RMSHIST)

Syntax: RMSHIST
Description: Computes all pairwise RMSDs between the computed placements. The RMSDs are counted in bins of 1.0 Å width.
Requirements: A superposition must have been computed.
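A pairwise RMSD histogram of this kind can be sketched in Python as follows. The sketch assumes that all placements list the same atoms in the same order and that the RMSD is taken directly between the coordinates, without any refitting; the function names and the coordinates are invented for this illustration.

```python
import math
from collections import Counter
from itertools import combinations


def rmsd(a, b):
    """RMSD between two placements given as equally ordered coordinate lists."""
    if len(a) != len(b) or not a:
        raise ValueError("placements must have the same, non-zero atom count")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(total / len(a))


def rmsd_histogram(placements, bin_width=1.0):
    """Count all pairwise RMSDs; bin k covers [k * width, (k + 1) * width)."""
    bins = Counter()
    for a, b in combinations(placements, 2):
        bins[int(rmsd(a, b) // bin_width)] += 1
    return dict(sorted(bins.items()))


# Three invented two-atom placements (coordinates in Å).
placements = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.0, 0.0, 2.5), (1.0, 0.0, 2.5)],
]
print(rmsd_histogram(placements))
```

For n placements the histogram contains n(n-1)/2 counts in total, one per unordered pair.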

6.7.27.5 Evaluating the score of an alignment (ALIGN)

Syntax: ALIGN <filename> <table format> <draw>
Description: Searches for interactions and computes an energy estimate for the ligand placed on a given set of coordinates. The coordinates are read from a mol2 (mol or sdf) file named <filename>.
The placement together with the formed interactions cannot be stored. It can be visualized by setting <draw> to yes. If <table format> is set to ’y’, the result is output as a table; otherwise the output is printed in one line.

Requirements: Reference ligand must be loaded.


6.8 *Using resulting information files (RIF)

Result information files (RIF) are externally generated files (e.g. by FTrees) that define a mapping between substructures of molecules, which is used to guide the alignment of these molecules. A RIF contains a list of pairs of molecules with a given similarity and matching information (e.g. FTrees node mappings). The following sections give details on how to generate flexible alignments with minimal distance between the given match points.

6.8.1 General overview

6.8.1.1 Resulting information file (RIF)

The input for the command ALIGN (6.8.2) is a result information file (RIF), which looks, for example, like this:

Example

@LIF_FILE
dhfr_kpl_h.lif

@LIF_FILE
dhfr_min_h.lif

@SIMILARITIES_AND_MATCHES
# <name1> <name2> <similarity> <matching nodes>
"1dhf_kpl_h" "4dfr_min_h" 0.9335 "9|11 12|10 11|9 10|8 8|7 7|6 13|13 0,3|0 1|1 5,4|3 2|2,4 6|5"
"4dfr_kpl_h" "1dhf_min_h" 0.9335 "10|12 9|11 8|10 11|9 7|8 6|7 13|13 0|0,3 1|1 3|5,4 2,4|2 5|6"

Each @LIF_FILE entry contains the name of a library information file (LIF). For more details about LIF and RIF files, see the FTrees Userguide [15].
Each line of the @SIMILARITIES_AND_MATCHES block contains the names of the compared molecules, the similarity between them, and a list of the pairwise matching Feature Tree nodes. These pairwise matching nodes are used as constraints for the flexible superposition. In the following text, the pairwise matching nodes are called match points.

Important Note: The first entry in each line of the @SIMILARITIES_AND_MATCHES block is taken as the reference ligand and the second entry is taken as the test ligand in FlexS.
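To make the structure of a data line in the @SIMILARITIES_AND_MATCHES block concrete, the following Python sketch splits one line into its four entries and the match list into pairs of node number groups. It is an illustration derived from the example above, not the parser used by FlexS or FTrees; the function name `parse_match_line` is invented.

```python
import shlex


def parse_match_line(line):
    """Split one data line of the @SIMILARITIES_AND_MATCHES block.

    Returns (name1, name2, similarity, pairs); each pair holds the node
    numbers on the name1 side and on the name2 side of the '|'.
    """
    name1, name2, similarity, matches = shlex.split(line)
    pairs = []
    for pair in matches.split():
        left, right = pair.split("|")
        pairs.append((
            tuple(int(i) for i in left.split(",")),
            tuple(int(i) for i in right.split(",")),
        ))
    return name1, name2, float(similarity), pairs


line = ('"1dhf_kpl_h" "4dfr_min_h" 0.9335 '
        '"9|11 12|10 11|9 10|8 8|7 7|6 13|13 0,3|0 1|1 5,4|3 2|2,4 6|5"')
reference, test, similarity, pairs = parse_match_line(line)
print(reference, test, similarity, pairs[7])
```

The pair `0,3|0` shows that several nodes on one side can be matched to a single node on the other side.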

6.8.1.2 Using Manually Defined Mappings

It is also possible to use manually defined mappings to guide the FlexS alignment. You can define nodes of substructures in the same way as it is done for the Feature Tree nodes. There is, however, an even simpler way to specify the mappings: if no nodes of substructures are defined, FlexS assumes that the mappings defined in a RIF refer to atom numbers instead of node numbers. This makes it easy to specify an atom-based mapping, as the following small example shows.
The standard entry @MOLECULE_INFO contains information about the molecule used. The first column contains the name of the corresponding molecule file. The next column specifies whether the file is a multi-conformation file. The next columns contain the molecule name, the seek position in the molecule file, a check sum for the molecule read and the node information (see FTrees Userguide [15]).
To define an atom-based mapping, we simply omit the check sum and the node information and write only the file name, whether the file contains multiple coordinates, the molecule name and the seek position, which is 0 for a single-entry file. In this case we use two such entries, one for the reference molecule and one for the test ligand. Multi-entry files can be used here as well, but then the seek position has to be set.

Example

@MOLECULE_INFO
# <file><MConf><name><seek><molecule check sum><node atoms>
1UHO.mol2 NO "1UHO.A" 0

@MOLECULE_INFO
# <file><MConf><name><seek><molecule check sum><node atoms>
1UHO_hit.mol2 NO "1UHO_hit" 0

@SIMILARITIES_AND_MATCHES
# <name1> <name2> <similarity> <matching nodes>
"1UHO.A" "1UHO_hit" 0.8079 "9,10,12,14,15,17|22,24,26,28,30,31 18|33 25|34 44|16 49|23 "

The line in the @SIMILARITIES_AND_MATCHES block now contains a list of the pairwise matching atoms. FlexS uses the center of mass of each set of atoms as a match point for the alignment. In this case the center of mass of the atoms 9,10,12,14,15,17 of the reference ligand is mapped onto the center of mass of the atoms 22,24,26,28,30,31 of the test ligand, atom 18 of the reference ligand is mapped onto atom 33 of the test ligand, and so on.
This manually defined RIF can now be used with the ALIGN command (see 6.8.2).
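How a match point follows from a set of atom numbers can be sketched in Python as a mass-weighted mean position. The function name `center_of_mass`, the coordinates and the masses are invented for this illustration; which atomic masses FlexS actually uses is not stated here.

```python
def center_of_mass(atom_numbers, coords, masses):
    """Mass-weighted mean position of the atoms listed in atom_numbers.

    coords and masses are dictionaries keyed by atom number.
    """
    total = sum(masses[n] for n in atom_numbers)
    return tuple(
        sum(masses[n] * coords[n][axis] for n in atom_numbers) / total
        for axis in range(3)
    )


# Invented two-atom example: a carbon-like and a nitrogen-like atom.
coords = {9: (0.0, 0.0, 0.0), 10: (2.0, 0.0, 0.0)}
equal_masses = {9: 12.0, 10: 12.0}
print(center_of_mass([9, 10], coords, equal_masses))
print(center_of_mass([9], coords, equal_masses))
```

For a set with a single atom, such as 18|33 in the example above, the match point is simply the position of that atom.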

6.8.1.3 Flexible alignment with minimal distance between the match points

To generate a good start position, a coarse alignment is performed by rigidly superimposing the given match points. The resulting conformation is then flexibly relaxed on the reference ligand by optimizing the following energy function:

$$E_{t,r} = i_O\, S_{t,r} + i_{LJP}\, P_t + i_D\, \mathrm{Dist}(MP_t, MP_r) \tag{6.1}$$

$E_{t,r}$ is the overall energy function for the flexible alignment of the test ligand conformation $t$ to the reference ligand conformation $r$.
$S_{t,r}$ is the Gaussian overlap index (see eq. (6.2) in section 6.9.1).
$P_t$ is the Lennard-Jones potential of the test ligand conformation $t$, and $\mathrm{Dist}$ is a distance function for the given match points $MP_t$ of the test ligand conformation and $MP_r$ of the reference ligand conformation. $i_O, i_{LJP}, i_D \le 0$ are parameters of the energy function.
A 4-step optimization method is used for the flexible relaxation of the conformations:

step 1 : The parameters $i_O$ and $i_{LJP}$ of equation (6.1) are set to 0. All conformations are flexibly relaxed.

step 2 : Same parameters as in step 1, but only the selected conformations with the minimal match point distance are optimized further.

step 3 : Now the given values for $i_O$ and $i_{LJP}$ are used, but only the Gaussians of the electron density are used in $S_{t,r}$.

step 4 : Now the full potential of the energy function in eq. (6.1) is used.

The goal of the cheap optimizations (steps 1 to 3) is to approximate a good start position for the last step, which is expensive and precise.
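The way the four steps switch terms of eq. (6.1) on and off can be summarized in a short Python sketch. It only evaluates the weighted sum for term values that are assumed to be computed elsewhere; the function names and the numeric values are invented, and the restriction of the overlap index to electron density Gaussians in step 3 is not modeled.

```python
def step_weights(step, i_o, i_ljp, i_d):
    """Weights (i_O, i_LJP, i_D) of eq. (6.1) in effect during a given step."""
    if step in (1, 2):
        return 0.0, 0.0, i_d   # only the match point distance term is active
    if step in (3, 4):
        return i_o, i_ljp, i_d  # all three terms are active
    raise ValueError("step must be 1, 2, 3 or 4")


def energy(weights, overlap, lj_potential, match_point_distance):
    """Evaluate eq. (6.1) for precomputed term values."""
    w_o, w_ljp, w_d = weights
    return w_o * overlap + w_ljp * lj_potential + w_d * match_point_distance


# Invented parameter and term values, for illustration only.
params = (-1.0, -0.5, -2.0)
for step in (1, 2, 3, 4):
    print(step, energy(step_weights(step, *params), 0.5, 2.0, 0.25))
```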

6.8.1.4 Filter for aligned solutions

A filter is implemented to discard conformations that are self-overlapping or strained after the 4-step optimization (6.8.1.3). The filter uses the Lennard-Jones potential of the energy function in eq. (6.1). Each conformation whose Lennard-Jones potential score is greater than the value of the parameter RIF_LJPOT_FILTER is skipped. The potential score of a solution is added to the output of the ALIGN command (see 6.8.2).

6.8.1.5 Conformation generator

In addition, it is possible to generate a set of different starting conformations for the test ligand, either by an internal or by an external conformation generator.
If the flag USE_CONF_GEN is set to

0 : the conformation generator is switched off;

1 : use the internal generator, exclude the given conformation of the template molecule;

2 : use the internal generator, include the given conformation of the template molecule;

3 : use the external generator, exclude the given conformation of the template molecule;

4 : use the external generator, include the given conformation of the template molecule;
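The five flag values encode two independent choices: which generator is used and whether the given conformation is kept. The following Python sketch spells this out as a lookup table; the function name `decode_use_conf_gen` and the returned labels are invented for this illustration.

```python
def decode_use_conf_gen(flag):
    """Return (generator, include_given_conformation) for a USE_CONF_GEN value."""
    table = {
        0: (None, True),         # generator off: only the given conformation
        1: ("internal", False),
        2: ("internal", True),
        3: ("external", False),
        4: ("external", True),
    }
    if flag not in table:
        raise ValueError("USE_CONF_GEN must be 0, 1, 2, 3 or 4")
    return table[flag]


for flag in range(5):
    print(flag, decode_use_conf_gen(flag))
```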

If USE_CONF_GEN is set to ’1’ or ’2’, the internal conformation generator ([6]) produces various conformations (excluding or including the given conformation of the template) for the test ligand.
If USE_CONF_GEN is set to ’3’ or ’4’, the interface to an external conformation generator is used to create various conformations (excluding or including the given conformation of the template) for the test ligand.
To use an external conformation generator, the generator must be defined in the configuration file config_sp.dat. The following example shows a definition in which the executable Corina is used as CONFGENERATOR:

Example

@PROGRAMS
#system Linux
CONFGENERATOR ./bin/corina -d stergen,rc,mc=5 -i t=mol2,dummies -o t=mol2
#end_system


The flag EXT_CONF_GEN_FORMAT defines the input format for the external conformation generator. If the flag is set to ’0’, the input for the generator is the name of the template molecule. This case applies if a database with various conformations for molecules with unique names is used as the external generator. If the flag is set to ’1’, the input is a MOL2 file with the current conformation of the template molecule. In both cases FlexS expects a (multi-) MOL2 file with various conformations of the template molecule, which will be loaded.

6.8.2 Superpositioning of RIF molecule pairs (ALIGN)

Syntax: ALIGN <rif file> <similarity threshold> <fscore> <write_mol> [<mol_type> <incl_ref_lig>] <write_gdf> <add comment> [<comment>]
Description: Reads pairs of molecules from the given result information file (RIF) <rif file> (see 6.8.1.1) and performs a (flexible) alignment with the constraint of minimal distance between the given match points from <rif file>.

Only pairs of molecules with a similarity value greater than or equal to <similarity threshold> are loaded and processed in FlexS.

If the flag USE_CONF_GEN in config_sp.dat is not set to ’0’, a conformation generator is used to produce various conformations of the test ligand (see section 6.8.1.5 for more details). Otherwise, only the given conformation of the test ligand is used.

For each molecule pair from the RIF, each conformation of the test ligand is combined with each conformation of the reference ligand (if multiple conformations are given in the RIF). For each conformation pair, a coarse alignment is performed by RMSD minimization of the given match points from the RIF.
A 4-step optimization method is used for the flexible alignment of conformation pairs. The energy function in eq. (6.1) (see section 6.8.1.3) is optimized in order to flexibly align the conformation pairs with the constraint of distance minimization for the given match points.
In the first step each conformation pair is flexibly relaxed. In the next three steps only the best solutions are optimized further.

After the optimization, the results for the optimized conformation with the best Gaussian overlap index are printed to the screen:

Example

SOL 4dfr_kpl_h 4dfr_min_h_010 1.000 0.496 0.0076 469.0292 21.0927 55.2935 0.0004 0.000 0.000 -894.561 0.783 -17.246
SOL 4dfr_kpl_h 1dhf_min_h_017 0.933 0.611 0.0077 652.5390 13.5481 6.0112 0.0007 0.000 0.000 -811.948 0.710 -24.699
SOL 1dhf_kpl_h 4dfr_min_h_015 0.933 0.583 0.0065 629.5437 11.2017 1.1233 0.0007 0.000 0.000 -718.497 0.667 -21.956
SOL 1dhf_kpl_h 1dhf_min_h_006 1.000 0.717 0.0083 711.0212 21.0801 57.1823 0.0008 0.000 0.000 -904.542 0.839 -20.970

Each line starts with the keyword SOL. The next two columns contain the names of the molecules from the RIF. The fourth column gives the similarity index of the molecule pair from the RIF. The fifth column contains the Gaussian overlap index of the best pair of conformations.


The next seven columns contain the Hodgkin indices for Gaussians of different properties: electron density, partial charge, hydrophobicity, hydrogen bonding donor, hydrogen bonding acceptor, hydrogen bonding donor geometry and hydrogen bonding acceptor geometry (see section 10.21.1).

If <fscore> is set to ’y’, the standard FlexS score is computed for the best conformation pair, and the thirteenth and fourteenth columns contain the FlexS score and the normalized FlexS score. The normalized FlexS score is the FlexS score divided by the FlexS score of the reference ligand aligned to itself.

The next column contains the Lennard-Jones potential of the energy function in eq. (6.1) for the aligned test ligand conformation.
If <add comment> is set to ’y’, an additional column with <comment> is printed.
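For post-processing, a SOL line can be split into named fields following the column description above. The Python sketch below assumes lines written with <fscore> set to ’y’ and without a comment column, which gives 15 columns as in the example output. The names `Sol` and `parse_sol` and the field names are invented, and taking the first molecule name as the reference ligand follows the RIF convention and is an assumption for the output.

```python
from typing import NamedTuple, Tuple


class Sol(NamedTuple):
    reference: str
    test: str
    similarity: float
    overlap: float
    hodgkin: Tuple[float, ...]  # seven property-specific Hodgkin indices
    flexs_score: float
    flexs_score_norm: float
    lj_potential: float


def parse_sol(line):
    """Parse one SOL line written with <fscore> = 'y' and no comment column."""
    fields = line.split()
    if len(fields) != 15 or fields[0] != "SOL":
        raise ValueError("not a 15-column SOL line")
    numbers = [float(x) for x in fields[3:]]
    return Sol(fields[1], fields[2], numbers[0], numbers[1],
               tuple(numbers[2:9]), numbers[9], numbers[10], numbers[11])


sol = parse_sol(
    "SOL 4dfr_kpl_h 4dfr_min_h_010 1.000 0.496 0.0076 469.0292 21.0927 "
    "55.2935 0.0004 0.000 0.000 -894.561 0.783 -17.246"
)
print(sol.test, sol.overlap, sol.lj_potential)
```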

If <write_mol> equals ’y’, the best solution according to the Gaussian overlap index is written to a molecule file. <mol_type> may be either 1 (MOL2) or 0 (SDF) and specifies the molecule file format. If <incl_ref_lig> equals ’y’, the reference ligand conformation is written to the molecule file too. Otherwise only the aligned pose of the test ligand is written.

<write_gdf> specifies which solutions are written to a graphic description file (gdf), which can be reviewed with FlexV. It may be

0 : no solutions will be written;

1 : only the best solutions will be written (according to the Gaussian overlap index).

The Gaussian overlap parameters of the similarity index and the parameters controlling the flexible optimization are listed in the data file optpar.dat (see sections 10.21.1 and 10.21.2).
Important notes: The optimization algorithm can be quite time-consuming under certain conditions. Therefore, the maximum number of iterations should be limited (see command SWITCHP) and the minimum gradient criterion should not be too small (see section 6.9.3).

6.8.3 Switch parameters for optimization in ALIGN (SWITCHP)

Syntax: SWITCHP <mode> [<nof sol> <max_it_step1> <max_it_step2> <max_it_step3> <max_it_step4>]
Description: Switches the parameters for the 4-step optimization method in RIF.
Three default parameter sets are defined for the 4-step optimization method: for the screening mode, the interactive mode and the precise mode. The parameter <mode> switches between the default parameter sets.


Example

>> Screening mode:
Number of best solutions, which are used for the further optimizations: 1
Number of iteration steps for the 1.step optimization: 0
Number of iteration steps for the 2.step optimization: 50
Number of iteration steps for the 3.step optimization: 20
Number of iteration steps for the 4.step optimization: 10

>> Interactive mode:
Number of best solutions, which are used for the further optimizations: 1
Number of iteration steps for the 1.step optimization: 3 * sqrt(Number of fragments)
Number of iteration steps for the 2.step optimization: 50
Number of iteration steps for the 3.step optimization: 20
Number of iteration steps for the 4.step optimization: 30

>> Precise mode:
Number of best solutions, which are used for the further optimizations: 3
Number of iteration steps for the 1.step optimization: 3 * sqrt(Number of fragments)
Number of iteration steps for the 2.step optimization: 50
Number of iteration steps for the 3.step optimization: 50
Number of iteration steps for the 4.step optimization: 50

>> Current parameter values:
Number of best solutions, which are used for the further optimizations: 1
Number of iteration steps for the 1.step optimization: 3 * sqrt(Number of fragments)
Number of iteration steps for the 2.step optimization: 50
Number of iteration steps for the 3.step optimization: 20
Number of iteration steps for the 4.step optimization: 30
Select mode (screen (0), interactive (1), precise (2) or user defined (3)) <0,3> [1] :

If <mode> is set to ’3’, other parameter values can be defined.
After the first step of the 4-step optimization method (see section 6.8.1.3), only the <nof sol> conformations with the minimal match point distance are optimized further in the next steps.
The other parameter values define the maximum number of iterations for the optimization. If a parameter is set to 0, the corresponding optimization step is switched off.

<max_it_step1> : maximum number of iterations in step 1
<max_it_step2> : maximum number of iterations in step 2
<max_it_step3> : maximum number of iterations in step 3
<max_it_step4> : maximum number of iterations in step 4

The value for <max_it_step1> can be negative. In that case the maximum number of iterations is set to the absolute value of <max_it_step1> multiplied by the square root of the number of fragments of the test ligand.
Notes: The default values for the three parameter sets are specified in the static data file optpar.dat (see section 10.21.4).
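The rule for negative <max_it_step1> values can be written as a small Python function. The function name `max_iterations_step1` is invented, and since the manual does not say how a non-integer result is rounded, the sketch returns the unrounded value.

```python
import math


def max_iterations_step1(max_it_step1, n_fragments):
    """Iteration limit for step 1; negative values scale with sqrt(fragments).

    Returns a float because the rounding rule is not specified here.
    A result of 0 means that step 1 is switched off.
    """
    if max_it_step1 >= 0:
        return float(max_it_step1)
    return abs(max_it_step1) * math.sqrt(n_fragments)


print(max_iterations_step1(-3, 16))
print(max_iterations_step1(50, 16))
```

Under this rule, a value of -3 yields the limit "3 * sqrt(Number of fragments)" that the interactive and precise modes report in the listing above.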


6.9 *Gaussian overlap optimization (OPTPARAM)

The following sections give details of the parameters for flexible superposition and postoptimization of placements. Section 6.9.7 provides parameters for the rigid-body superposition tool RigFit.

6.9.1 General overview of flexible superposition:

The general Hodgkin index for $f, g \in L^2(\mathbb{R}^3)$:

$$H(f, g) := \frac{2 \int_{\mathbb{R}^3} f(x)\, g(x)\, dx}{\int_{\mathbb{R}^3} f(x)^2\, dx + \int_{\mathbb{R}^3} g(x)^2\, dx}$$

Similarity (Gaussian overlap) index (superposition to a reference ligand):

$$\forall\, t:\quad S_{t,r} = \sum_q \lambda_q\, H_q^{t,r}, \qquad \lambda_q \ge 0, \quad \sum_q \lambda_q = 1 \tag{6.2}$$

$S_{t,r}$ is the similarity index between the test ligand $t$ and the reference ligand $r$. $H_q^{t,r}$ is the Hodgkin index for Gaussians of property $q$ (see section 10.21.1).

Overlap index (mutual superposition without a reference ligand):

$$S = \frac{2}{n(n-1)} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} S^{t_i,t_j} \tag{6.3}$$

$S$ is the scaling overlap index between all test ligands. $S^{t_i,t_j}$ is the similarity index between the test ligand $t_i$ and the test ligand $t_j$. $n$ is the number of test ligands.

$$S^{t_i} = \frac{1}{n-1} \sum_{\substack{j=1 \\ j \ne i}}^{n} S^{t_i,t_j} \tag{6.4}$$

$S^{t_i}$ is the overlap index between the test ligand $t_i$ and the other test ligands.
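The two averages in eq. (6.3) and (6.4) can be illustrated with a short sketch. It assumes that the pairwise similarity indices are already available as a symmetric table; the function name and the sample values are made up for the example.

```python
def overlap_indices(sim):
    """Return (overall index, per-ligand indices) from pairwise similarities.

    sim[i][j] is the similarity index between test ligands i and j;
    the table is assumed to be symmetric, the diagonal is ignored.
    """
    n = len(sim)
    if n < 2:
        raise ValueError("at least two test ligands are required")
    pair_sum = sum(sim[i][j] for i in range(n - 1) for j in range(i + 1, n))
    overall = 2.0 * pair_sum / (n * (n - 1))          # eq. (6.3)
    per_ligand = [
        sum(sim[i][j] for j in range(n) if j != i) / (n - 1)  # eq. (6.4)
        for i in range(n)
    ]
    return overall, per_ligand


# Hypothetical similarity values for three test ligands.
sim = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.4],
    [0.6, 0.4, 1.0],
]
overall, per_ligand = overlap_indices(sim)
print(overall, per_ligand)
```

The overall index is the mean over all unordered pairs, so it also equals the mean of the per-ligand indices.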

Energy function of superposition to a reference ligand:

$$\forall\, t: \quad E^{t,r} = \lambda_O\, S^{t,r} + \lambda_{LJP}\, P^{t} + \lambda_T\, T^{t} \tag{6.5}$$

$E^{t,r}$ is the energy function of the flexible superposition of the test ligand $t$ to the reference ligand $r$. $S^{t,r}$ is the similarity index (see eq. (6.2)). $P^{t}$ is the Lennard-Jones potential of the test ligand $t$. $T^{t}$ is the scaling rating function of the torsion energy of the test ligand $t$. $\lambda_O, \lambda_{LJP}, \lambda_T \le 0$ are parameters of the energy function.

Energy function of mutual superposition without a reference ligand:

$$E = \lambda_O\, S + \lambda_{LJP}\, \frac{1}{n} \sum_{j=1}^{n} P^{t_j} + \lambda_T\, \frac{1}{n} \sum_{j=1}^{n} T^{t_j} \tag{6.6}$$

$E$ is the energy function of the flexible and mutual superposition of a set of test ligands. $S$ is the scaling overlap index (see eq. (6.3)). $P^{t_j}$ is the Lennard-Jones potential of the test ligand $t_j$. $T^{t_j}$ is the scaling rating function of the torsion energy of the test ligand $t_j$. $\lambda_O, \lambda_{LJP}, \lambda_T \le 0$ are parameters of the energy function.
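To make the structure of the mutual energy function concrete, the following sketch combines an overlap index with per-ligand Lennard-Jones and torsion terms as a weighted sum, averaging the per-ligand terms over all test ligands. The function name, argument names, and numbers are hypothetical; the actual weights are those listed in optpar.dat.

```python
def mutual_energy(overlap, lj_terms, torsion_terms, lam_o, lam_ljp, lam_t):
    """Weighted sum of the overlap index and the averaged per-ligand terms.

    overlap:        scaling overlap index of all test ligands
    lj_terms:       one Lennard-Jones value per test ligand
    torsion_terms:  one torsion rating value per test ligand
    lam_*:          weights of the three contributions
    """
    n = len(lj_terms)
    if n == 0 or len(torsion_terms) != n:
        raise ValueError("need one Lennard-Jones and one torsion value per ligand")
    return (lam_o * overlap
            + lam_ljp * sum(lj_terms) / n
            + lam_t * sum(torsion_terms) / n)


# Made-up values for three test ligands.
print(mutual_energy(0.6, [0.1, 0.2, 0.3], [0.0, 0.5, 1.0], -1.0, -0.5, -0.25))
```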


6.9.2 Switching energy parameters for flexible optimization (SWITCHP)

Syntax: SWITCHP <energy-parameter> [<overlap> <lj_potential> <torsions_eng>] <overlap-parameter> [<e_density> <charge> <hydrophobic> <h_bond_don> <h_bond_acc> <h_bond_don_geom> <h_bond_acc_geom>]
Description: The parameters of the energy function for flexible optimization (see eq. (6.5) and (6.6) in section 6.9.1) and the overlap parameters for the similarity index (see eq. (6.2) in section 6.9.1) can be switched on or off.
If <energy-parameter> equals ’y’, the user may switch on or off the similarity index <overlap> (default: on), the Lennard-Jones potential <lj_potential> (default: on), and the torsion energy function <torsions_eng> (default: off).
If <overlap-parameter> equals ’y’, the user may switch on or off individual contributions to the similarity index: <e_density> (default: on), <charge> (default: on), <hydrophobic> (default: on), <h_bond_don> (default: on), <h_bond_acc> (default: on), <h_bond_don_geom> (default: off), and <h_bond_acc_geom> (default: off). Use 1 for on and 0 for off.
Important notes: The torsion energy is relevant only if TORSION_MODE equals 2 (see section 5.1.5 and Appendix A).
Notes: See section 10.21.1 for the energy and overlap parameters.

6.9.3 Stop criteria for flexible optimization (STOPCRT)

Syntax: STOPCRT <gradient> <step> [<steptol>] <energy> [<energytol>] <max_it_step>
Description: Different stop criteria for flexible optimization using the Gaussian overlap index as goal function (see section 6.9.1 for a general overview and sections 6.7.4 and 6.9.5 for related commands). Default values of the stop criteria are specified in the static data file optpar.dat (see section 10.21.2).

The first criterion is a minimum gradient $\varepsilon_g$:

$$\|\nabla E(x^i)\|_\infty < \varepsilon_g, \qquad \varepsilon_g \in [10^{-15}; 10^{-3}]$$

A value for $\varepsilon_g$ (=<gradient>) must be entered as a floating-point number.

The second criterion is a minimum step size $\mathrm{macheps}^{\varepsilon_s}$:

$$\max_{k=1,\dots,n} \frac{|x^{i+1}_k - x^i_k|}{\max\{|x^i_k|, 1.0\}} < \mathrm{macheps}^{\varepsilon_s}, \qquad \varepsilon_s \in [0.0; 1.0]$$

macheps is the machine accuracy of the computer. If <step> equals 1, $\varepsilon_s$ (=<steptol>) must be specified as a floating-point number.

The third criterion is a minimum energy change $\mathrm{macheps}^{\varepsilon_e}$:

$$|E(x^{i+1}) - E(x^i)| < \mathrm{macheps}^{\varepsilon_e}, \qquad \varepsilon_e \in [0.0; 1.0]$$

If <energy> equals 1, $\varepsilon_e$ (=<energytol>) must be specified as a floating-point number.


The last stop criterion is an upper bound for the number of iterations of the optimization. If <max_it_step> equals 0, the number of iteration steps is unrestricted. Otherwise at most <max_it_step> iterations will be performed.
Notes: If <step>, <energy> and <max_it_step> are set to 0, the gradient criterion is the only stop criterion for the flexible optimization. The gradient criterion should therefore not be too small; in the worst case the optimization algorithm never stops. It is highly advisable to limit the number of optimization steps.
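The interplay of the four stop criteria can be sketched as a single check that is evaluated after every iteration. This is an illustration only: the function and argument names are hypothetical, the step and energy thresholds are read as macheps raised to the power of the tolerance (an assumption about the notation), and double-precision machine accuracy is used for macheps.

```python
import sys

MACHEPS = sys.float_info.epsilon  # machine accuracy for double precision


def stop_reason(grad, x_old, x_new, e_old, e_new, iteration,
                eps_g, eps_s=None, eps_e=None, max_it=0):
    """Return the name of the first satisfied stop criterion, or None.

    eps_s / eps_e set to None model <step> / <energy> being switched off;
    max_it == 0 means that the number of iterations is unrestricted.
    """
    # Gradient criterion: infinity norm of the gradient below eps_g.
    if max(abs(g) for g in grad) < eps_g:
        return "gradient"
    # Step criterion: largest relative change of a variable.
    if eps_s is not None:
        rel_step = max(abs(xn - xo) / max(abs(xo), 1.0)
                       for xo, xn in zip(x_old, x_new))
        if rel_step < MACHEPS ** eps_s:
            return "step"
    # Energy criterion: change of the goal function between two iterations.
    if eps_e is not None and abs(e_new - e_old) < MACHEPS ** eps_e:
        return "energy"
    # Iteration limit.
    if max_it > 0 and iteration >= max_it:
        return "max_it"
    return None


# With only the gradient criterion active, nothing else ends the loop.
print(stop_reason([1.0], [0.0], [1.0], 0.0, 1.0, 1000, eps_g=1e-5))
```

The last call shows the situation described in the notes: with the step, energy and iteration criteria switched off, a run whose gradient never falls below the threshold is never stopped.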

6.9.4 Molecule transformation for flexible superposition (TRANSMOD)

Syntax: TRANSMOD <transformation>
Description: <transformation> specifies the molecule transformation. The flexible postoptimization perturbs the conformation of a molecule in order to maximize the Gaussian overlap index by finding a molecule transformation, which is defined for a test ligand as:

$$a_i^{new} = R_g(x)\big(R_l(x)(a_i^{old}) - c_g(x)\big) + c_g(x) + t(x) \qquad \forall\, a_i^{old}$$

and for a placement as:

$$a_i^{new} = R_l(x)\big(R_g(x)(a_i^{old} - c_g(x)) + c_g(x) + t(x)\big) \qquad \forall\, a_i^{old}$$

$R_g(x) \in SO_3$ is the matrix of global rotation, $t(x) \in \mathbb{R}^3$ is the translation vector, $R_l(x)(\cdot)$ is the operator of local rotations, and $c_g(x)$ is the center of the global rotation. $x$ is the optimization variable. Local rotations are torsion angle transformations. The $a_i^{old}$ are the atom coordinates before the transformation, and the $a_i^{new}$ are the coordinates after it. The molecule transformation defaults to 7, allowing global and local rotations as well as translation (see Example).
Example

Molecule transformation by courtesy of:
(No transformation (0):)
Global rotation (1):
Translation (2):
Global rotation and translation (3):
Local rotation (4):
Local rotation and translation (5):
Global and local rotation (6):
Global and local rotation and translation (7):
Global rotation center and global and local rotation and translation (8):

Transformation <1,8> [7] :

Notes: <transformation> may be equal to 0 only in the special cases described in section 6.9.5.
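For scripts that generate FlexS input, it can help to keep the transformation codes in a small lookup table. The sketch below simply transcribes the menu shown in the example; the table and helper names are hypothetical. Note that the codes are not bit flags (5 is local rotation plus translation, 6 is global plus local rotation), so a table is safer than arithmetic on the code.

```python
# Transformation codes of TRANSMOD, transcribed from the menu above.
TRANSFORMATIONS = {
    0: frozenset(),
    1: frozenset({"global rotation"}),
    2: frozenset({"translation"}),
    3: frozenset({"global rotation", "translation"}),
    4: frozenset({"local rotation"}),
    5: frozenset({"local rotation", "translation"}),
    6: frozenset({"global rotation", "local rotation"}),
    7: frozenset({"global rotation", "local rotation", "translation"}),
    8: frozenset({"global rotation center", "global rotation",
                  "local rotation", "translation"}),
}


def allows_local_rotation(code: int) -> bool:
    """True if the transformation type changes torsion angles."""
    return "local rotation" in TRANSFORMATIONS[code]


# Every code greater than 3 includes local rotations.
print([code for code in TRANSFORMATIONS if allows_local_rotation(code)])
```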

6.9.5 Superpositioning of flexible test ligands (FSUPER)

Syntax: FSUPER <reference> [<ref_lig> [<coord_type>]] <test_ligands> [<coord_type> <transformation> [<trans>] [<freeze_bonds> [<bonds>]]]+


Description: Superposition of multiple flexible test ligands. If <reference> equals ’y’, a rigid reference is taken for the superposition. In this case an ’all-on-one’ goal function is optimized (see eq. (6.5) in section 6.9.1). <ref_lig> specifies the reference. If <ref_lig> equals ’0’, the reference ligand is taken. Otherwise the test ligand with number <ref_lig> is taken as rigid reference. <coord_type> may either be 1 (fix) or 2 (opt) and specifies the coordinates taken in the case of a test ligand as reference.
If <reference> equals ’n’, an ’all-on-all’ goal function is optimized (see eq. (6.6) in section 6.9.1). All ligands are treated flexibly and are mutually overlaid.

<test_ligands> can either be a single number, a list of numbers separated by ’,’, a list of intervals of the form ’a-b’, or ’all’. For each test ligand the <coord_type>, <transformation> and optionally <freeze_bonds> may be specified.
<coord_type> may either be 1 (fix) or 2 (opt) and determines which set of coordinates is taken. If <transformation> equals ’y’, a type of molecule transformation <trans> for the ligands must be provided. Otherwise the default type of molecule transformation is taken (global and local transformation, type no. 7, see section 6.9.4). If the type of the selected molecule transformation is a number >3, specific rotatable single bonds may be frozen. If a selection of rotatable single bonds is to be frozen during the optimization, <freeze_bonds> must be ’y’. <bonds> can either be a single number, a list of numbers separated by ’,’, a list of intervals of the form ’a-b’, ’all’ or ’ˆ’ for no bonds, and specifies the selection of frozen bonds.
The Gaussian overlap parameters of the similarity index and the parameters controlling the flexible optimization are listed in the data file optpar.dat (see sections 10.21.1 and 10.21.2). The optimized coordinates of the test ligands will be written to the coord type ’opt’.
See Appendix B.8 for an example and a legend of the results.

Important notes: The optimization algorithm can be quite time-consuming under specific conditions, especially the ’all-on-all’ superposition without a specific reference ligand. Therefore, the maximum number of iterations should be limited and the minimum gradient criterion should not be too small (see section 6.9.3).
The torsion energy in the energy function for the flexible optimization (see eq. (6.5) and (6.6) in section 6.9.1) is only effective if the parameter of torsion energy is turned on (see section 6.9.2) and TORSION_MODE is set to 2 (see section 5.1.5 and Appendix A).
Notes: If the user has selected no reference ligand and only a single test ligand, the procedure stops with a warning.
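The selection syntax used for <test_ligands> and <bonds> (single numbers, comma-separated lists, intervals ’a-b’, ’all’, and for bonds the no-selection marker) can be expanded with a few lines of code, for example when preparing batch files. The sketch below is not part of FlexS; the function name is hypothetical, numbering is assumed to start at 1, and the plain ASCII character ^ stands in for the no-bonds marker.

```python
def parse_selection(text: str, count: int) -> list:
    """Expand a selection such as '1,3-5' into a sorted list of numbers.

    count is the number of available items (test ligands or bonds);
    items are assumed to be numbered from 1 to count.
    """
    text = text.strip()
    if text == "all":
        return list(range(1, count + 1))
    if text == "^":  # marker for an empty selection (bonds only)
        return []
    selected = set()
    for item in text.split(","):
        item = item.strip()
        if "-" in item:
            first, last = (int(part) for part in item.split("-", 1))
        else:
            first = last = int(item)
        if not 1 <= first <= last <= count:
            raise ValueError(f"selection {item!r} is out of range 1..{count}")
        selected.update(range(first, last + 1))
    return sorted(selected)


print(parse_selection("1,3-5", 6))
```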

6.9.6 Selecting a special optimization algorithm for flexible superpositioning without a reference ligand (SETALGO)

Syntax: SETALGO <algorithms> <center> [<value>]
Description: One of two algorithms for superpositioning without a reference ligand must be selected. The algorithms are a Jacobi and a special Gauss-Seidel procedure (see Figure 6.1). The Jacobi procedure (<algorithms>=1) moves all ligands simultaneously in each optimization step. The Gauss-Seidel procedure (<algorithms>=2) moves selected ligands separately in each optimization step. The Jacobi algorithm is generally faster than the Gauss-Seidel procedure. However, the quality of the results of the Gauss-Seidel algorithm is usually much better.

[Figure 6.1: One optimization step of the Jacobi and the Gauss-Seidel algorithm. Left panel: Jacobi algorithm, simultaneous movement of Mol A, Mol B and Mol C. Right panel: special Gauss-Seidel algorithm, separate movement of Mol A, Mol B and Mol C; labels: optimal molecule in this step, first move, second move, third move.]

A special stop criterion for superpositioning without a reference ligand must be selected (<center>). The stop criterion for the centers of gravity of the molecules is:

$$|c^{new} - c^{old}| < \varepsilon_c, \qquad \varepsilon_c \in [10^{-10}; 10^{-1}]$$

$$c^{old(new)} = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \| c_i^{old(new)} - c_j^{old(new)} \|_2, \qquad c_i^{old(new)} = \frac{1}{n_i} \sum_{l=1}^{n_i} a_{i,l}^{old(new)}$$

where $a_{i,l}^{old(new)}$ are the atom coordinates of ligand $i$ before (after) the optimization. Here $n$ is the number of ligands and $n_i$ is the number of atoms in ligand $i$. $\varepsilon_c$ (=<value>) must be specified as a floating-point number.
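The center-of-gravity stop criterion can be illustrated as follows. The sketch computes the unweighted centroid of every ligand, sums the Euclidean distances between all pairs of centroids, and compares this sum before and after an optimization step. The function names are hypothetical, and reading the norm as the Euclidean norm is an assumption.

```python
import math


def centroid(atoms):
    """Unweighted center of gravity of a list of (x, y, z) coordinates."""
    n = len(atoms)
    return tuple(sum(atom[k] for atom in atoms) / n for k in range(3))


def centroid_spread(ligands):
    """Sum of the distances between the centroids of all ligand pairs."""
    centers = [centroid(atoms) for atoms in ligands]
    return sum(math.dist(centers[i], centers[j])
               for i in range(len(centers) - 1)
               for j in range(i + 1, len(centers)))


def centers_converged(old_ligands, new_ligands, eps_c):
    """True if the spread changed by less than eps_c in this step."""
    return abs(centroid_spread(new_ligands) - centroid_spread(old_ligands)) < eps_c


# Two small made-up ligands; the second one moves by 0.5 in x.
old = [[(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [(4.0, 0.0, 0.0)]]
new = [[(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [(4.5, 0.0, 0.0)]]
print(centers_converged(old, new, 1e-3))
```

Because only the sum of pairwise distances is compared, a common translation or rotation of all ligands leaves the criterion unchanged.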

The following command (SETPAR) adjusts important parameters of the rigid-body superpositioning tool RigFit, which is based on Fourier methods. This method is also enabled via SUPERPOS/PLACEBAS. The parameters are described in section 10.21.3.
A short description of the method is given here in order to explain the significance of the following command and parameters. The scoring function of the method considers Gaussian overlap volumes between the test ligand and the reference ligand. The overlap volumes are approximated by Fourier series calculations. In order to optimize the rotational and translational orientation of the test ligand separately, the Patterson function is used in Fourier space. For each solution resulting from the Patterson function optimization, translation optimizations from different start translations are performed. Thus a resulting solution is identified by the number of the rotation solution and the number of the translation solution.


6.9.7 Setting optimization parameters (SETPAR)

Syntax: SETPAR <parameter name> <value>
Description: Sets a parameter of the RigFit method. The parameters are described in section 10.21.3:

• <rotation_filter_tolerance>
• <baseplace_filter_tolerance>
• <glob_opt_filter_tolerance>
• <laue_radius>
• <start_rotation_sample_step>
• <translation_sample_method>

Requirements: None


7 Additional modules for FlexS

7.1 Parallel Virtual Machine (PVM submenu)

The PVM module of FlexS allows you to execute a script on a Parallel Virtual Machine (PVM). The PVM submenu is only present if the PVM interface is available and the PVM module is activated in FlexS. The PVM module is a standard module of FlexS; no additional license is necessary.
It is much easier to set up a parallel calculation if you understand what is going on behind the scenes. The basic setup is as follows:
A ’master’ FlexS process runs on your workstation. This instance of FlexS reads a prepared script which it can split into jobs (it does this by extracting the iterations of a loop). The master process then starts slave FlexS processes on remote machines via PVM and a remote login. These slave processes then receive jobs from the master process. When a slave finishes one job, it may receive another. When all jobs have been sent, the master waits for them to finish and then tidies up and ends the calculation.

7.1.1 Preliminaries

There are several important points to get right in order to get parallel computing working with FlexS. Here is a list of things to prepare:

• PVM must be installed on all machines that will run master or slave processes.

• You must be able to execute a remote login to all machines that will run processes without having to enter a password. Make sure that your .rhosts or equivalent file is correctly defined.

• The environment variables PVM_ROOT, PVM_ARCH and PVM_RSH must be set. These must point to the PVM installation, your platform architecture name and your remote login type. Often the script $PVM_ROOT/lib/pvmgetarch can be used to find your platform architecture. For example, the variables may look like this:

    PVM_ROOT=/software/pvm/pvm
    PVM_ARCH=LINUX
    PVM_RSH=/usr/bin/ssh

• You should have the following set in your path:


    $PVM_ROOT/lib
    $PVM_ROOT/lib/$PVM_ARCH
    $PVM_ROOT/bin/$PVM_ARCH

• All the machines must be able to access the same file system (NFS) for the data files.

• All the machines must be able to access the same FlexS executable. This executable, or a link to a central executable, must be found in either $PVM_ROOT/bin/$PVM_ARCH or <home directory>/pvm3/bin/$PVM_ARCH. For example:

/home/user/pvm3/bin/LINUX/flexs -> /install/software/BioSolveIT/FLEXS/bin/flexs

• The executable name must match the setting of the program flag FLEXS (see section 5.1).

See the PVM manual for further information on parallel computing with PVM [5].
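Part of this checklist can be verified automatically before a parallel run is started. The following sketch checks only the environment variables and the three path entries named above; the function name is hypothetical, and the remaining items (password-free remote login, shared file system, location of the executable) are not covered.

```python
import os


def check_pvm_environment(env):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    for name in ("PVM_ROOT", "PVM_ARCH", "PVM_RSH"):
        if not env.get(name):
            problems.append(f"{name} is not set")
    root, arch = env.get("PVM_ROOT"), env.get("PVM_ARCH")
    if root and arch:
        path_dirs = env.get("PATH", "").split(os.pathsep)
        for directory in (f"{root}/lib",
                          f"{root}/lib/{arch}",
                          f"{root}/bin/{arch}"):
            if directory not in path_dirs:
                problems.append(f"{directory} is missing from PATH")
    return problems


if __name__ == "__main__":
    # Check the environment of the current shell.
    for problem in check_pvm_environment(os.environ):
        print(problem)
```

Since slave processes are started on remote machines, the same check would have to pass on every host that takes part in the calculation.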

7.1.2 Starting PVM

If no PVM daemon is running, typing ’pvm’ on the console starts the daemon. Alternatively, use the FlexS command TOPVM (section 7.1.3.4) to start the PVM daemon. A message during FlexS start-up states whether a PVM daemon was detected or not. If the PVM environment variables PVM_VMID and/or PVM_TMP are used, FlexS uses these environment variables, too. For this purpose the environment variables PVM_ROOT and PVM_RSH must be set on all machines that will run master or slave processes.

7.1.3 Configuring PVM

The parallel machine is configured by FlexS itself. The FlexS configuration file config_sp.dat contains a section headed @PARALLEL. This section consists of a list of host names, each followed by a number specifying the maximum number of FlexS processes allowed on this host and an optional nice value for all processes on this host (see section 5.1.7). FlexS only uses the specified hosts regardless of the configuration of PVM. The configuration of PVM can be modified or viewed with the following commands.

7.1.3.1 Outputting the PVM configuration (INFO)

Syntax: INFO
Description: Generates a list of all hosts with the corresponding maximum number of FlexS processes and outputs it on the screen. If the PVM daemon is not running, a status message about the PVM daemon is output.

7.1.3.2 Modifying the PVM configuration (ADD)

Syntax: ADD <host name> <max processes> <nice val>


Description: Adds a new host to the PVM configuration. If a host named <host name> is already contained in the host list, the maximum number of processes is changed to <max processes>. If the maximum number of processes is set to 0, the host remains unused during a parallel computation. Finally, a nice value can be defined for FlexS processes on this host.
Important notes: ADD modifies only the internal list of hosts and does not actually add the host to the PVM. This is done during start-up of a parallel execution. Therefore error messages about the availability of a host appear during start-up and not after adding a host.

7.1.3.3 Removing a host from the PVM configuration (REMOVE)

Syntax: REMOVE <host name>
Description: Removes a host from the PVM configuration.
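The behavior of the internal host list under ADD and REMOVE can be modelled with a small table. This is a sketch of the semantics described above, not of the FlexS implementation: the class and method names are hypothetical, and overwriting the nice value when an existing host is added again is an assumption.

```python
class HostList:
    """Minimal model of the internal host list edited by ADD and REMOVE."""

    def __init__(self):
        self.hosts = {}  # host name -> (max processes, nice value)

    def add(self, host, max_processes, nice=0):
        """Add a host, or change the settings of a host already listed."""
        self.hosts[host] = (max_processes, nice)

    def remove(self, host):
        """Remove a host; unknown hosts are ignored in this sketch."""
        self.hosts.pop(host, None)

    def usable(self):
        """Hosts that may run work processes (maximum greater than zero)."""
        return [name for name, (limit, _) in self.hosts.items() if limit > 0]


hosts = HostList()
hosts.add("nodeA", 2)
hosts.add("nodeB", 4, nice=10)
hosts.add("nodeA", 0)   # nodeA stays listed but remains unused
print(hosts.usable())
```

As with ADD itself, nothing in this model contacts the hosts; availability is only checked when a parallel execution starts.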

7.1.3.4 Calling the PVM console (TOPVM)

Syntax: TOPVM
Description: Calls the PVM console. If no PVM daemon is running, a daemon will be started. See the PVM manual for a list of console commands. Typing ’halt’ terminates the daemon, while typing ’quit’ does not. Both commands terminate the console and return to FlexS.
Important notes: Typing ’reset’ kills all processes running under the control of the PVM daemon. This may result in temporary files which are not deleted by FlexS.

7.1.4 Executing parallel batch files

The execution of a parallel batch file is initiated by the SCRIPT command or by the command line option -b. If all of the following conditions hold, the batch file is executed in parallel:

• The FlexS program is a FlexS-PVM executable.

• The PVM daemon is running on the machine on which the FlexS process is started.

• The control flag USE_PVM_FEATURE is set to 1 (see section 5.1.5).

• The current FlexS process is not a work process.

• The FlexS configuration (see section 5.1) contains a list of hosts with a maximum number of processes greater than zero, or a list of hosts is defined interactively with the PVM/ADD command.

• The batch file contains a FOR_EACH loop.

• The batch file does not contain any of the following commands:

– batch file/script commands: INPUT, SELINP, WAIT

– flexs commands: EDITCFG, DISPLAY, TOFLEXV

– commands within the PVM submenu


– commands within the DATABASE submenu

– the command for changing the environment variable PREDICT (namely SETPREDICT)

If one of the conditions does not hold, the batch file is executed sequentially. An easy way to control whether a batch file is executed in parallel or not is to start the PVM daemon in advance.
With parallel execution, the current FlexS process becomes the master process, also called the scheduler. The master process initiates work processes, schedules tasks to the work processes, and collects and merges the resulting output files.
In a first step, the batch file is subdivided into three sections; the break points are the FOR_EACH and END_FOR statements of the outermost loop, which define the section to be parallelized. The part before the FOR_EACH statement is called the init batch file, the part between the FOR_EACH and END_FOR statements is called the loop batch file, and the remaining part is called the post-processing batch file.
The parallel execution of nested loops within FlexS scripts is supported in cases where the FOR_EACH and END_FOR statements directly follow one another (!) in a script. In the case of nested loops the loop batch file adjusts to the section between the innermost FOR_EACH and END_FOR statements.
The outermost loop must lie on the top level of FlexS’s menu structure, that is, structures such as

    LIGAND
    FOR_EACH ...
    ...
    END_FOR
    END

are forbidden, while

    LIGAND
    ...
    END
    FOR_EACH ...
    ...
    END_FOR

would be allowed.
The master then initiates FlexS work processes on all hosts according to the list of hosts defined in the configuration file or interactively. Each work process executes the init batch file. At this point, the master writes the hosts on which a FlexS process is started to stdout and starts the communication protocol.
After initiating the communication protocol, the master process sends the loop batch file with the corresponding input data to a work process whenever one of them is idle. If a work process is killed for any reason, the master automatically initiates a new work process on the corresponding host and starts the next task on it. When all loop iterations have been executed by the work processes, the master sends the post-processing batch file to them and terminates the processes.
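The subdivision of a batch file into init, loop and post-processing parts can be sketched as follows. The sketch only looks at the first word of every line and handles the basic case of one outermost FOR_EACH loop; the special treatment of directly nested loops and the check for commands that prevent parallel execution are not modelled. The function name is hypothetical.

```python
def split_batch(lines):
    """Split script lines at the outermost FOR_EACH / END_FOR pair.

    Returns (init, loop, post) as three lists of lines, or None if the
    script contains no complete top-level loop. The FOR_EACH and END_FOR
    lines themselves are not part of any of the three lists.
    """
    depth = 0
    start = end = None
    for index, line in enumerate(lines):
        words = line.split()
        keyword = words[0] if words else ""
        if keyword == "FOR_EACH":
            if depth == 0 and start is None:
                start = index
            depth += 1
        elif keyword == "END_FOR":
            depth -= 1
            if depth == 0 and end is None:
                end = index
    if start is None or end is None:
        return None
    return lines[:start], lines[start + 1:end], lines[end + 1:]


script = ["...", "FOR_EACH ...", "...", "END_FOR", "..."]
print(split_batch(script))
```

A script for which the function returns None has no loop to distribute and would be executed sequentially.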


The output of the work processes is collected in files named pvm_outp_<tid>_<no>. Output generated with LIST* or INFO commands preceded by SELOUTP with <pvm merge> switched on is collected in temporary files and merged afterwards by the master into a single file. These files do not differ from those generated by a sequential execution of the batch script.
In addition, the master generates a log file named pvm_flexs-run_<no>. This file contains a summary of all script execution events, such as work process creation (including the task ID of the process), output merging events, and commission execution events (including the host where the corresponding loop instance was executed and the output of the last INFO command contained in the loop). The log file contains information in chronological order and is used for recovering partial calculations.

7.1.5 Aborting and recovering

The recommended way to start parallel execution is with the SCRIPT command from inside FlexS (instead of a command line start with the -b option). The advantage is that a brief communication protocol is output in the current window so that you always see how far processing has progressed. In this mode, the calculation can be aborted by pressing any key and then typing abort as soon as the prompt appears. The scheduler stops sending new commissions and waits until all processes are idle (roll-out). It will then merge the output created so far and terminate, so that FlexS switches back to interactive mode. Note that this can take a while depending on the typical runtime of the individual computing tasks.
During this roll-out phase, another keypress followed by abort causes an immediate termination without waiting for the results of running work processes. To immediately stop the parallel execution, open a PVM console (by typing pvm in a different shell) and type reset.
If the parallel execution was aborted, the output files are already merged. If the scheduler was interrupted for an immediate termination, the output files are not merged yet. In that case, merging can be done offline with the OFMERGE command.

7.1.5.1 Offline merging of PVM output files (OFMERGE)

Syntax: OFMERGE <pvm log file>
Description: OFMERGE merges the output files from a parallel script execution. Under standard conditions, output files are automatically merged. If the scheduler is immediately terminated, the output files created up to this point can be manually merged with OFMERGE afterwards.
Requirements: A valid log file and all temporary files from a previously aborted calculation must be available.

7.1.5.2 Recovering an aborted or terminated parallel script execution (RECOVER)

Syntax: RECOVER <pvm log file>
Description: If a parallel script execution was aborted or immediately terminated, FlexS is able to recover the calculation from the log file. RECOVER will analyze the log file and create a list of already calculated commissions. The script must then be restarted using the SCRIPT command. FlexS will skip the already calculated commissions and continue its calculation, writing into the same log file.


Requirements: A valid log file and all temporary files from a previously aborted calculation must be available.

7.1.6 Killing a single work process

If a parallel execution was started with the SCRIPT command from inside FlexS, a single work process can be terminated by pressing any key and then typing kill as soon as the prompt appears. FlexS then asks for the work process to be terminated and sends a termination instruction to this work process.

7.1.7 Working with parallel FlexS

It is advisable to start a parallel FlexS execution with an empty PREDICT directory. All files except the local files containing the batch files (located in the PVM_TEMP directory) are written into this directory (unless there are absolute filenames in the batch file). The PREDICT directory must be modified in the FlexS configuration. Changing the PREDICT directory interactively causes confusion between the master and the work processes.
The directories used during the FlexS run must be available and unique for all hosts. In a standard installation, this is sometimes not the case for TEMP, which can be a local directory. Make sure that all FlexS processes access the same directories.
When a work process terminates during execution, a new work process is automatically started by the master process. This process will have a new task ID, and in order to avoid failure of the final merging of the output files, temporary files of the old work process are renamed to the task ID of the new process. The new work process executes the init batch file before doing any computation. If the init batch file overwrites an output file, data generated by the terminated process will be lost. This can be avoided by using SELOUTP in append mode only.

7.1.8 Working with PVM

Working with PVM also has some pitfalls. The following list explains some of them:

• Problems occur if your start-up file .cshrc or .profile produces output on stdout. Make sure that this is not the case.

• Hosts which are part of the Parallel Virtual Machine must be available for the user. In particular, it must be possible to log in or execute an rsh command. Make sure that your .rhosts file is defined appropriately.

• PVM expects the user program to be located at $(PVM_ROOT)/bin/$(PVM_ARCH). It is best to generate a link from there to the FlexS executable named flexs.

• As mentioned previously (see section 7.1.1), it is crucial to employ the same FlexS executable for both master and work nodes.

• If you kill a FlexS master process, the PVM daemons on the client machines are not terminated correctly and may still be running. This may cause problems if you start a new job in parallel. Thus, we recommend (a) always interrupting a parallel computation with abort (see section 7.1.5) and (b) if you had to kill FlexS, also killing all PVM daemons and removing the temporary PVM lock files on all client machines manually before starting another job, i.e., execute “killall pvmd3; rm -f /tmp/pvm*” on all client machines.

Finally, for PVM usage there is a constraint on filenames if you want to employ them within parallel scripts. One scenario might be that you would like to do a parameterization study and want to associate the filenames with the parameters, e.g.: SELOUTP output_for_$(a_parameter)
In that case the file merging process at the end of PVM-parallelized jobs will work correctly only if:

• the respective final filename(s) remain either constant OR

• only contain built-in, constant variables OR

• the variables are set during the startup of the script.

If a filename itself contains variables set during the script (even outside the parallelized loop), file merging will most probably not work properly due to operating system behavior.

In order to allow each slave process to write its own files, FlexS-PVM has a built-in batch variable $(PVM_ID), which is built from the environment variable $PVM_VMID (see section 7.1.2), if it is set, and the PVM task ID of the slave process:

    $(PVM_ID) - _TEST_VMID_2883585, where $PVM_VMID is set to TEST_VMID
    $(PVM_ID) - _2883585, where $PVM_VMID is not set

The batch variable is automatically appended to the names of files that will be merged at the end of PVM-parallelized jobs.
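The way the value of $(PVM_ID) is composed can be reproduced from the two examples above. The helper below is only an illustration for scripts that need to predict the resulting names; the function name is hypothetical.

```python
def pvm_id(task_id, vmid=None):
    """Compose the value of $(PVM_ID) from $PVM_VMID and the PVM task ID."""
    if vmid:
        return f"_{vmid}_{task_id}"
    return f"_{task_id}"


print(pvm_id(2883585, "TEST_VMID"))  # $PVM_VMID is set to TEST_VMID
print(pvm_id(2883585))               # $PVM_VMID is not set
```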


7.2 Alignment of combinatorial libraries

FlexSc is an extension of FlexS which allows a more efficient alignment of combinatorial libraries. Since a combinatorial library is built from only a few building blocks with a well-defined way of connecting them, information from previously aligned molecules can be reused to speed up the alignment process. FlexSc is more a toolbox of algorithms for combinatorial alignment than a single method. The reason for this design is that information about the library is often available which makes one or the other alignment scheme more appropriate. For a detailed description of the data structures underlying FlexSc we refer to the scientific literature.
In order to use FlexSc an additional license key is necessary.
FlexSc is a combinatorial alignment tool; it is not a combinatorial library design tool. The difference is that FlexSc expects a combinatorial library, including all building blocks as well as the rules for how to connect them, as input, while a combinatorial library design tool starts with fragments and creates a combinatorial library as output. FlexSc can be useful to prioritize among several libraries for screening or for determining preferable R-groups.
The user interface of FlexSc contains two new menus, CLIB and CSUPER. The CLIB menu contains all commands for handling combinatorial libraries and therefore replaces the TEST_LIG menu of the classical alignment. The CSUPER menu contains the commands for combinatorial alignment.
The following batch variables (see section 10.3.1) are available:

$(CORE_ID) contains the index of the current core group.

$(NOF_CORE_INST) contains the number of instances for the current core group.

$(NOF_RGROUPS) contains the number of R-groups (without the core group).

7.2.1 Handling combinatorial libraries (CLIB submenu)

For FlexSc, a combinatorial library consists of a core and up to nine additional R-groups, usually numbered from 1 to 9. The core is a synonym for R-group number 0; there is no difference in principle between the core and the R-groups. The alternative fragments for an R-group, called R-group instances, must be stored in a single file (see section 7.2.1.1). If the file has the (multi-)mol2 format, the connecting atoms of the R-group instances are defined by their atom names. Each R-group instance (except core instances) must have exactly one atom with a unique atom name (usually ’X’), called the X-atom. In addition, each core or R-group instance can have several so-called R-atoms, which connect to other R-groups. R-atoms have to be marked with a unique atom name (usually ’R’) followed by the number (1 – 9) of the connecting R-group. X-atoms and R-atoms are required to be terminal atoms, i.e. they have exactly one bond. The unique neighbor is called the X-neighbor or R-neighbor atom. In order to connect two R-group instances, the corresponding X-atom and R-atom are removed and a bond is formed between the X-neighbor and the R-neighbor atom; the bond length is determined by the X-atom – X-neighbor atom distance.
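The connection rule can be illustrated with a small Python sketch. It is not FlexSc code: the fragment representation (a dict with a bond list and a coordinate table) and the "p:"/"c:" name prefixes are assumptions made for the illustration, and only the topology and the new bond length are computed, not the transformed coordinates.

```python
import math

def neighbor(bonds, atom):
    """Return the single bonding partner of a terminal atom."""
    partners = [b for a, b in bonds if a == atom] + [a for a, b in bonds if b == atom]
    if len(partners) != 1:
        raise ValueError(f"{atom} must be a terminal atom with exactly one bond")
    return partners[0]

def connect(parent, child, rgroup_no, r_id="R", x_id="X"):
    """Join child to parent at the R-atom that names rgroup_no.

    Returns the merged bond list and the length of the new bond.
    """
    r_atom = f"{r_id}{rgroup_no}"
    r_nb = neighbor(parent["bonds"], r_atom)   # R-neighbor atom
    x_nb = neighbor(child["bonds"], x_id)      # X-neighbor atom
    # The bond length is taken from the X-atom to X-neighbor distance.
    length = math.dist(child["coords"][x_id], child["coords"][x_nb])
    # Drop the R-atom and the X-atom; prefix names to keep them distinct.
    bonds = [("p:" + a, "p:" + b) for a, b in parent["bonds"] if r_atom not in (a, b)]
    bonds += [("c:" + a, "c:" + b) for a, b in child["bonds"] if x_id not in (a, b)]
    bonds.append(("p:" + r_nb, "c:" + x_nb))
    return bonds, length

# Hypothetical fragments: a core with one R-atom and an R-group instance.
core = {"bonds": [("C1", "R1"), ("C1", "C2")], "coords": {}}
amine = {"bonds": [("X", "N1"), ("N1", "H1")],
         "coords": {"X": (0.0, 0.0, 0.0), "N1": (1.5, 0.0, 0.0), "H1": (2.0, 1.0, 0.0)}}
merged_bonds, new_bond_length = connect(core, amine, 1)
```

In this example the R-atom R1 and the X-atom are removed and C1 of the core is bonded to N1 of the R-group instance.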

7.2.1.1 Loading R-groups (READ)

With the command READ, a combinatorial library can be loaded as a set of multi-mol2 files, one for each R-group:


Syntax (mol2): READ <rgroup file> <r id> [<rgroup no> <x id>]

Description: (multi-)mol2 files: Reads an R-group or the core into FlexSc main memory. All instances of the R-group must be stored in a single multi-mol2 file <rgroup file>. All connecting atoms to additional R-groups (R-atoms) must have the unique atom name <r id> followed by a number 1 – 9. Each instance must have the same connecting atoms. The first R-group to load must always be R0 (the core). Subsequently loaded R-groups are identified either by <rgroup no> or by the number in the X-atom name. Each instance must have exactly one X-atom named <x id>, optionally followed by a number. READ reads in the molecules and outputs the list of R-groups found. Note that the R-atom type is compared with the X-atom neighbor type. If the X-atom neighbor type is unique, the R-atom type is set to it; otherwise it is set to Du (dummy type). The same atom type correction is done for the X-atom by analyzing the R-atom neighbors.

Note: Instances that contain only linker atoms are skipped during reading.

Important note (mol2): After the first execution of READ, the combinatorial library has the OPEN status. After reading all R-groups, CLOSE must be executed.
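The naming convention for connecting atoms can be checked before loading a file. The following Python sketch only looks at atom names; the function name and the return format are made up for this illustration and do not correspond to any FlexSc routine.

```python
import re

def connecting_atoms(atom_names, r_id="R", x_id="X"):
    """Split atom names into R-atoms and X-atoms by the naming convention.

    Returns a dict {R-group number: atom name} and a list of
    (atom name, number or None) tuples for the X-atoms.
    """
    r_pattern = re.compile(rf"{re.escape(r_id)}([1-9])$")   # e.g. R1 .. R9
    x_pattern = re.compile(rf"{re.escape(x_id)}(\d*)$")     # e.g. X or X3
    r_atoms, x_atoms = {}, []
    for name in atom_names:
        match = r_pattern.match(name)
        if match:
            r_atoms[int(match.group(1))] = name
            continue
        match = x_pattern.match(name)
        if match:
            number = int(match.group(1)) if match.group(1) else None
            x_atoms.append((name, number))
    return r_atoms, x_atoms

def check_instance(atom_names, is_core, r_id="R", x_id="X"):
    """Raise if a non-core instance does not have exactly one X-atom."""
    r_atoms, x_atoms = connecting_atoms(atom_names, r_id, x_id)
    if not is_core and len(x_atoms) != 1:
        raise ValueError("each R-group instance needs exactly one X-atom")
    return r_atoms, x_atoms

# Hypothetical atom names of one instance of R-group 3.
found_r, found_x = check_instance(["C1", "R1", "R2", "X3", "H1"], is_core=False)
```

Running the same check over every instance of a file also makes it easy to verify that all instances carry the same set of R-atoms.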

7.2.1.2 Finishing R-group loading (CLOSE)

Syntax: CLOSE

Description: Finishes the library load procedure. All open R-groups are substituted by hydrogens and physico-chemical data is assigned to each instance. After executing CLOSE, the library is assigned the READY status and can be used for alignment calculations.

Requirements: The core and all R-groups must have been loaded with READ before.

7.2.2 Setting up the initialization procedure (SELINIT)

Syntax: SELINIT [<list of levels>]

Description: Adjusts the state of steps in the initialization process. The command calls TEST_LIG/SELINIT (see 6.4.2 for more details of the command).

7.2.2.1 Loading reference coordinates (READREF)

Syntax: READREF <rgroup no> <ref file> <ignore hydrogen>

Description: Reads reference coordinates for R-group <rgroup no> from a multi-mol2 file <ref file>. Reference coordinates are used for manual fragment placement and RMSD calculations. There must be the same number of molecules in <ref file> as there are instances in the R-group <rgroup no>. The i-th molecule in <ref file> is mapped onto the i-th instance of R-group <rgroup no>. For each molecule-instance pair, the assignment of atoms is based solely on the atom numbering. Finally, if <ignore hydrogen> is answered yes, hydrogen atoms are neglected during loading.

Requirements: A library must be loaded and have READY status.


7.2.2.2 Setting reference coordinates (SETREF)

Syntax: SETREF <rgroup no> <ignore hydrogen>

Description: Copies the loaded coordinates to the reference coordinates for all instances of R-group <rgroup no>. Reference coordinates are used for manual fragment placement and RMSD calculations. With the parameter <ignore hydrogen> you can decide whether hydrogen atoms are taken into account in the comparison.

7.2.2.3 Assigning reference coordinates by subgraph matching (MAPREF)

Syntax: MAPREF <rgroup no> <ref file> <check bonds> <check atom types> <ignore hydrogen>

Description: Loads the molecule from <ref file> as a reference subgraph for R-group <rgroup no>. Reference coordinates are assigned to each instance of R-group <rgroup no> based on subgraph matching. If multiple matchings are found, the first matching found is used. The SYBYL atom types of atoms of the input molecule and of atoms in the subgraph must be identical in order to be matched. If <check bonds> is answered yes, bond types must also match. If <ignore hydrogen> is answered yes, hydrogens are excluded while loading the reference structure. The subgraph together with the coordinates is stored internally and used further during base selection and placement (see SELBAS and the placement routines PLACEC, PLACER, PLACESEQ).

Requirements: A library must be loaded and have READY status.

Important notes: Although only the first mapping is used for the initial assignment of reference coordinates, multiple mappings are processed later in the placement routines.

7.2.2.4 Deleting a combinatorial library (DELETE)

Syntax: DELETE

Description: Deletes a combinatorial library from FlexSc main memory.

7.2.2.5 Outputting combinatorial library information (INFO)

Syntax: INFO

Description: INFO outputs summary information about the currently loaded library on the screen. Besides the load status, the number of instances and the input file are listed for each R-group. Finally, the total number of instances and the total library size (number of library molecules) are output.

7.2.2.6 Subselecting R-group instances (SELECT)

Syntax: SELECT <rgroup no> <instance list> <select mode>

Description: With SELECT, the set of R-group instances can be reduced to a subset. In all following calculations, only the subset is considered. <rgroup no> is the number of the R-group whose instance set should be reduced. <instance list> defines the set of remaining instances. <instance list> is a list of integer ranges (format a-b) separated by commas or blanks (if separated by blanks, the list must be enclosed in quotation marks). Depending on <select mode>, previous selections are overwritten (“o”, the default) or extended (“e”).

Requirements: A library must be loaded and have READY status.
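As an illustration of the <instance list> format, the following Python sketch expands such a list into instance numbers and applies the two select modes. It is not part of FlexSc; accepting a single integer in addition to a range a-b is an assumption of this sketch, as are the function names.

```python
def parse_instance_list(text):
    """Expand a list such as '1-3,7 9-10' into sorted instance numbers."""
    selected = set()
    # Commas and blanks are both treated as separators.
    for token in text.replace(",", " ").split():
        low, dash, high = token.partition("-")
        first = int(low)
        last = int(high) if dash else first   # single integer: assumption
        if last < first:
            raise ValueError(f"invalid range: {token}")
        selected.update(range(first, last + 1))
    return sorted(selected)

def apply_selection(previous, new, mode="o"):
    """Combine a previous selection with a new one.

    mode 'o' overwrites the previous selection, mode 'e' extends it.
    """
    if mode == "o":
        return sorted(set(new))
    if mode == "e":
        return sorted(set(previous) | set(new))
    raise ValueError("select mode must be 'o' or 'e'")

active = parse_instance_list("1-3,7 9-10")
extended = apply_selection(active, parse_instance_list("20-21"), mode="e")
```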

7.2.2.7 Outputting R-group information (RGROUP)

Syntax: RGROUP <rgroup no>

Description: RGROUP outputs the R-group file and the list of all instances with the internal number, a ’+’ indicating the active (subselected) instances, and the molecule name of the instance.

7.2.2.8 Switching the core (SWITCH)

Syntax: SWITCH <rgroup no>

Description: During the alignment calculations, the core plays a special role. When the library is loaded, the first R-group loaded is always R-group 0, which is then the core. For the subsequent calculations, however, each R-group can play the role of the core. With SWITCH, the R-group <rgroup no> is defined to be the core.

Requirements: A library must be loaded and have READY status.

Important notes: Switching the core is possible only before starting the alignment calculations and not during the calculations.

7.2.2.9 Extend a core instance (EXTENDCORE)

Syntax: EXTENDCORE <core id> [<rgroup no> [<inst no>] ...]

Description: With the command EXTENDCORE a single core instance can be expanded with chosen R-group instances. The extended core is specified by the instance number of the core <core id> and a sequence of <rgroup no> and <inst no>.

Requirements: A library must be loaded and have READY status. If a core was expanded previously, it must be reset first using RESETCORE.

Important notes: If an extended core is defined, only the extended core can be chosen; that is, the other commands automatically take the extended core and do not ask for a core instance. Some commands are not available if a core instance has been extended, such as SWITCH, ENUM, READREF, SETREF, MAPREF, PLACER, PLACESEQ and SELECTR. To reset the extended core, use the command RESETCORE.

7.2.2.10 Release an extended core (RELEXTCORE)

Syntax: RELEXTCORE

Description: Releases an extended core. All R-groups that currently expand the core to an extended core are removed.

Requirements: A core must first have been extended with EXTENDCORE.


7.2.2.11 Resetting the combinatorial library (RESETCORE)

Syntax: RESETCORE

Description: The combinatorial library is reset to its initial status.

7.2.2.12 Extracting a molecule from the library (EXTRACT)

Syntax: EXTRACT <core id> [<rgroup no> <inst no>]

Description: With the commands EXTRACT, EXTEND, and RELEASE, molecules can be extracted from the combinatorial library and subjected to individual molecule calculations in the same way as if they were loaded as a single molecule in the TEST_LIG menu. All commands from TEST_LIG and SUPERPOS can be applied to them.

The first parameter <core id> defines the instance of the core to be used. Additional R-groups can be added. For each R-group, the R-group number <rgroup no> and the instance <inst no> must be specified. Note that R-groups can only be added in an order such that the resulting molecule is connected. To terminate the selection, −1 should be provided as the final <rgroup no>.

The molecule is built by linking the corresponding R-groups and adding some physico-chemical information at the newly formed bond; the molecule is not copied. Therefore it is necessary to release the molecule to the library before another library molecule is extracted.

Requirements: A library must be loaded and have READY status. If a library molecule was extracted previously, it must be released first using RELEASE.

7.2.2.13 Releasing R-groups from an extracted molecule (RELEASE)

Syntax: RELEASE <nof rgroups>

Description: RELEASE releases the last <nof rgroups> R-groups added to the currently extracted library molecule. The combinatorial library molecule extraction works like a stack. Therefore, only the most recently added fragments can be removed, in reverse order of addition. The default value for <nof rgroups> is the total number of R-groups (including the core), so that the complete molecule can be released by accepting the default. A library molecule has to be released completely before a new one can be extracted.

Requirements: A molecule must have been extracted from the library.

7.2.2.14 Extending an extracted molecule (EXTEND)

Syntax: EXTEND [<rgroup no> <inst no>]

Description: EXTEND adds additional R-groups to a partially extracted library molecule. This command works exactly like EXTRACT except that at least a core instance must have been extracted before.

Requirements: EXTRACT must have been previously performed.
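The stack behavior of EXTRACT, EXTEND and RELEASE can be pictured with a small Python model. This is only a sketch of the bookkeeping described above, not FlexSc code: the class and method names are invented, the core is assumed to be R-group 0, and the default of release() is simplified to "everything currently extracted".

```python
class ExtractedMolecule:
    """Toy model of the extraction stack of (rgroup_no, inst_no) pairs."""

    def __init__(self):
        self.stack = []

    def extract(self, core_id, *rgroups):
        # A previously extracted molecule must be released completely first.
        if self.stack:
            raise RuntimeError("release the current molecule before extracting")
        self.stack.append((0, core_id))   # core assumed to be R-group 0
        self.extend(*rgroups)

    def extend(self, *rgroups):
        # At least a core instance must have been extracted before.
        if not self.stack:
            raise RuntimeError("EXTRACT must be performed first")
        self.stack.extend(rgroups)

    def release(self, nof_rgroups=None):
        # Only the most recently added fragments can be removed.
        count = len(self.stack) if nof_rgroups is None else nof_rgroups
        count = min(count, len(self.stack))
        return [self.stack.pop() for _ in range(count)]

molecule = ExtractedMolecule()
molecule.extract(4, (1, 2), (2, 7))      # core instance 4, R1 instance 2, R2 instance 7
last_released = molecule.release(1)      # removes only the R2 instance
remaining = list(molecule.stack)
all_released = molecule.release()        # removes the rest, core included
```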

7.2.2.15 Enumerating a combinatorial library (ENUM)

Syntax: ENUM [<write to mol2 (y/n)> <filename>]

Description: ENUM enumerates the combinatorial library currently loaded, starting from the currently extracted molecule. The molecules are only constructed and released, without any further computations. This function is useful for testing the molecule construction routine before using it in a more time-consuming alignment calculation. The generated molecules can optionally be written to a multi-mol2 file.

Requirements: EXTRACT has to be performed first.

7.2.2.16 Outputting information about a library molecule (MINFO)

Syntax: MINFO <extracted> [<rgroup no> <inst no>]

Description: MINFO outputs detailed information about a library molecule. If <extracted> is answered yes, information is given for the extracted molecule. Otherwise an instance is selected by the <rgroup no> and the instance id <inst no>.

Requirements: A library must be loaded and have READY status.

7.2.2.17 Setting administration defaults for drawings (SELADM)

Syntax: SELADM <graphics object number> <temp file> <append>

Description: With SELADM you can specify the graphics object numbers used for drawing the library and you can determine whether the graphic files are temporary with self-generated names or specified in each graphic command.

<graphics object number> If set to (1–255), the drawings for the library molecules will be displayed in graphics object <graphics object number> (see FlexV manual). If you select "slot no. taken from R-group no. <0>", the <graphics object number> will match the chosen <rgroup no> (see 7.2.2.21). That is, for example, an R-group with number 5 will be in slot 5.

<temp file> If set to ’y’, the drawings are written to temporary files and removed after quitting FlexS. Otherwise you will be asked for a filename at the end of each DRAW command (see below).

<append> If set to ’y’, previous drawings are not overwritten. Instead, the current drawing is appended to the previous one in the graphic file.

7.2.2.18 Setting default values for drawing the library (SELGRA)

Syntax: SELGRA <mol display mode> <hydro> <interact geoms> <interact points> <all contact types> [<contact type selection>] <surf> <R connect> <X connect>

Description: With SELGRA you can set specific default values for drawing combinatorial libraries.

<mol display mode> Specifies the default appearance of molecules. The display modes are lines ’1’, sticks ’2’, balls & sticks ’3’, spheres ’4’.

<hydro> If set to ’y’ or ’1’, hydrogens are shown. If set to ’2’, only hydrogens bonded to hetero atoms (i.e., only hydrogens that are bonded to non-carbon atoms) are shown.

<interact geoms> If set to ’y’ or ’1’, the interaction geometries including main directions are shown.

<interact points> If set to ’y’, the interaction geometries are shown as discrete interaction points.

<all contact types> If set to ’y’, interaction geometries of all contact types are drawn. Otherwise, a set of contact types must be entered in <contact type selection>. The selection is a list of integers or integer ranges (format a-b) separated by ’,’ or blanks.

<surf> Specifies the kind of surface to draw. Basically the surfaces are molecular surfaces. In lines mode, only lines connecting bounding atoms of reentrant (saddle and concave) patches of the surface are shown. In triangles mode, the concave patches are displayed as triangles. In connolly mode, the molecular surface is drawn. In general hydrogen atoms are neglected, and single atom instances are not considered either. This will often lead to warnings.

<R connect> If set to ’y’ or ’1’, vectors are drawn at the positions where further R-groups can be added (R-atoms).

<X connect> If set to ’y’ or ’1’, a vector is drawn at the position where the current R-group is connected to the parent R-group or core (X-atom).

Important notes: The Connolly surface is rendered from its analytically calculated patches. This enables selection of the level of curvature approximation but makes the rendering much more complicated. Therefore a few percent of the patches are rendered incorrectly (we will try to reduce this rate). In addition, there is currently only pairwise cusp trimming.

7.2.2.19 Selecting the coloring mode (SELCOL)

Syntax: SELCOL <library color mode selection> <interact geoms color mode selection> <main dirs color mode selection> <surface color mode selection>

Description: With SELCOL you can set the color modes for the library molecule as well as main directions, interaction geometries, and surfaces belonging to it. Valid modes are listed below. Depending on the color mode chosen, you will be asked for specific color values. All color modes and the respective color definitions can be found in 10.20.

invisible : (ligand, main dirs, interact geoms, surface) Nothing is drawn.

atom : (ligand, surface) The bonds are drawn half-half in the colors of the atoms. The atoms are colored according to their element.

unique : (ligand, main dirs, interact geoms, surface) Everything is drawn in one user-defined color.

fragment : (bond) The ligand is bi-colored to visualize the fragmentation, the base fragment has an extra color.

contact type : (main dirs, interact geoms) Main directions or interaction geometries are drawn in the colors of the contact types.

cen-distance : (surface) The surface is colored according to the distance to the center of the molecule.

surfpatch : (surface) The surface is colored according to the surface patch type (saddle, concave, convex).

surf-atom : (surface) Convex patches are colored by atom type, reentrant (saddle and concave) patches are colored in one user-defined color.

energy : Used with superposition, the interactions are colored according to their score.

There are three possible ways to specify a color. Inside FlexS, RGB values are used. You can specify the RGB value directly by typing three values, or four values for translucent coloring, from the interval [0.0,1.0] separated by blanks or /. Alternatively, you can specify it indirectly, either by the name of a color or by an integer from the interval [1,360]. The integer represents an angle in a color circle. All color names are defined in the static data file graphic_sp.dat. You can add color definitions if you like (see also 10.20). Note that if you are typing a single-line command or writing a script and the color definition contains blank characters, you must enclose it in double quotes.
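The three ways of specifying a color can be sketched in Python as follows. This is an illustration only: the name table stands in for graphic_sp.dat and its entries are invented, and the mapping of the angle onto a fully saturated HSV hue is an assumption, since the color circle actually used by FlexS is not described here.

```python
import colorsys

# Hypothetical stand-in for the color names defined in graphic_sp.dat.
NAMED_COLORS = {"red": (1.0, 0.0, 0.0), "light blue": (0.5, 0.5, 1.0)}

def parse_color(spec, named=NAMED_COLORS):
    """Turn a color specification into an RGB or RGBA tuple."""
    key = spec.strip()
    if key in named:                      # color given by name
        return named[key]
    tokens = key.replace("/", " ").split()
    if len(tokens) == 1 and tokens[0].isdigit():
        angle = int(tokens[0])            # angle in a color circle
        if not 1 <= angle <= 360:
            raise ValueError("angle must be in [1,360]")
        # Assumption: the angle is an HSV hue at full saturation and value.
        return colorsys.hsv_to_rgb((angle % 360) / 360.0, 1.0, 1.0)
    if len(tokens) in (3, 4):             # direct RGB or RGB plus translucency
        values = tuple(float(token) for token in tokens)
        if not all(0.0 <= value <= 1.0 for value in values):
            raise ValueError("color values must be in [0.0,1.0]")
        return values
    raise ValueError(f"cannot interpret color: {spec!r}")

direct = parse_color("0.2/0.4/0.6")
translucent = parse_color("1 0 0 0.5")
by_name = parse_color("light blue")
```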

7.2.2.20 Labeling the library (SELLAB)

Syntax: SELLAB <X name> <X id> <atom name> <infile number> <sybyl type> <fragment number>

Description: SELLAB defines the label string in combinatorial library drawings.

<X name> If set to ’y’ or ’1’, the instance name is written at the X-atom.

<X id> If set to ’y’ or ’1’, the instance ID is written at the X-atom.

<atom name> If set to ’y’ or ’1’, the short element name of the atom is shown.

<infile number> If set to ’y’ or ’1’, the number of the atom in the input file is shown.

<sybyl type> If set to ’y’ or ’1’, the SYBYL type string is shown.

<fragment number> If set to ’y’ or ’1’, the number of the atom fragment is shown.

7.2.2.21 Drawing the combinatorial library (DRAW)

Syntax: DRAW <rgroup no> <active only> <all active> <transform> [<parent r id>] [<selection>] [<mul draw directory> <filename>]

Description: DRAW draws an R-group in the currently active graphic output format based on the CLIB graphic context. The first parameter defines the R-group to be drawn. The set of instances can then be restricted to the active ones by answering <active only> with ’y’ or 1 and then further to a subselection by answering <all active> with ’n’ or 0. If <all active> is set to ’n’ or 0, <selection> specifies the list of instances to draw (a list of integer ranges in the format a-b, separated by ’,’ or blanks).

The R-group molecules can be transformed relative to a specified instance of the parent R-group by answering <transform> with ’y’ or 1, followed by the instance of the parent R-group <parent r id> to which the transformation should be performed.

Finally, the <mul draw directory> and the <filename> of the graphic output file can be specified if no temporary files are used (this is specified in SELADM).

Requirements: A library must be loaded and have READY status.


7.2.2.22 Listing the graphic settings (GRAINF)

Syntax: GRAINF

Description: Outputs a list of all current graphic settings (the graphic context) for the combinatorial library.

7.2.3 Aligning combinatorial libraries (CSUPER submenu)

In a combinatorial alignment run a huge amount of placement information can be generated in a short time. It is therefore necessary to store this information in secondary memory and to control the amount of information stored.

Internally, FlexSc only keeps the scores achieved for the k highest ranking placements. The number of scores kept for each library molecule can be controlled by the parameter <SCORE_TABLE_SIZE>.

During the calculation, placement files can be written. The number and content of the files is controlled by three parameters: <STORE_PLACEMENT_THRESHOLD>, <STORE_PLACEMENT_MODE> and <MAX_NOF_FILE_WRITE>:

<STORE_PLACEMENT_THRESHOLD> defines a threshold such that only molecules having a score below this threshold are stored in placement files.

<STORE_PLACEMENT_MODE> defines the file format and the number of placements stored. If set to ’-1’, a pdf file containing the whole set of placements is stored. If set to a value k > 0, a multi-mol2 file is written containing the first k placements. If set to ’0’, no files are created.

<MAX_NOF_FILE_WRITE> controls the number of files created. If set to ’-1’, a file is created for each molecule (according to the settings of <STORE_PLACEMENT_THRESHOLD> and <STORE_PLACEMENT_MODE>). If set to ’0’, all placements are written to a single multi-mol2 file (provided that <STORE_PLACEMENT_MODE> > 0). If set to k > 0, FlexSc keeps only the placement files of the k highest scoring library molecules.

<KEEP_ALL_SCORES_ACHIEVED> If set to ’1’, FlexSc keeps the scores achieved for all placements (only for the placement solutions of EXTENDMR and PLACESEQ).

For further information about the combinatorial alignment parameters see also section 10.4.
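How the three storage parameters interact can be summarized in a short Python sketch. It merely restates the rules given above as a decision function; the function name and the returned descriptions are invented, and the combination of <MAX_NOF_FILE_WRITE> = 0 with a pdf mode is rejected because the text does not describe it.

```python
def placement_output(score, threshold, mode, max_files):
    """Describe what is written for one library molecule, or None if nothing.

    threshold -- STORE_PLACEMENT_THRESHOLD
    mode      -- STORE_PLACEMENT_MODE (-1, 0 or k > 0)
    max_files -- MAX_NOF_FILE_WRITE (-1, 0 or k > 0)
    """
    if mode < -1 or max_files < -1:
        raise ValueError("parameter values below -1 are not defined")
    # Mode 0 writes nothing; only scores below the threshold are stored.
    if mode == 0 or score >= threshold:
        return None
    if mode == -1:
        content = "pdf, all placements"
    else:
        content = f"multi-mol2, first {mode} placements"
    if max_files == -1:
        files = "one file per molecule"
    elif max_files == 0:
        if mode <= 0:
            raise ValueError("a single file requires STORE_PLACEMENT_MODE > 0")
        files = "single multi-mol2 file"
    else:
        files = f"files of the {max_files} highest scoring molecules only"
    return content, files

stored = placement_output(-20.0, -10.0, 3, -1)
skipped = placement_output(-5.0, -10.0, 3, -1)   # score not below threshold
```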

If a file is created for a single library molecule (either a mol2 or a pdf file), the filename is constructed in the following way. The first part of the name is the constant string <file name>, which is a parameter of various placement commands (see below). The second part describes the library molecule: there is a string for each R-group contained in the library molecule having the format [C|R<rgroup no>]-<instance id>. These strings are concatenated with the separator ’_’ in the order in which the R-groups are added by the construction algorithm. If <file name> is a multi-mol2 file, several score values for each placement are printed as a comment line (FLEXS_SCORE).
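The naming scheme can be reproduced with a few lines of Python, for example to locate the file of a particular library molecule afterwards. This is a sketch under assumptions: ’C’ is taken to denote the core, the first part and the molecule description are assumed to be joined with the same ’_’ separator, and any file extension is left out because it is not specified here.

```python
def placement_filename(file_name, parts, core_no=0):
    """Build the per-molecule file name.

    parts   -- (rgroup_no, instance_id) pairs in construction order
    core_no -- number of the R-group currently acting as the core
    """
    labels = []
    for rgroup_no, instance_id in parts:
        # Format [C|R<rgroup no>]-<instance id>
        prefix = "C" if rgroup_no == core_no else f"R{rgroup_no}"
        labels.append(f"{prefix}-{instance_id}")
    # Assumption: '_' also separates <file name> from the first label.
    return "_".join([file_name] + labels)

example_name = placement_filename("run", [(0, 12), (1, 3), (2, 7)])
```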

7.2.3.1 Selecting base instances (SELINS)

Syntax: SELINS <nof base mol> <nof in core> [<nof in rgroup> ...]

Description: SELINS selects the instances which are used for base placement. The user can only control the number of selected instances, not the selection process itself. First, the total number of instances <nof base mol> can be selected, then the maximum number of instances for the core and each R-group can be specified.

For the combinatorial alignment algorithms currently available, a partial selection with SELINS is of little use: in order to get a result for all library molecules, you have to select all instances of the core or of the R-group (for R-group placement). SELINS is basically implemented for other combinatorial alignment algorithms which are under development. Note that SELINS has to be performed before any placement routine.

Requirements: A library must be loaded and have READY status. Previously calculated placements must be deleted.

7.2.3.2 Placing the core instances (PLACEC)

Syntax: PLACEC <mode>

Description: PLACEC aligns the core instances on top of the reference ligand using FlexS’ incremental construction algorithm. All three phases (selection of base fragments, placing base fragments, incremental construction) are performed automatically. <mode> controls the placement algorithm for base placement. The same modes are allowed as for single molecule superposition: manual (m), perturbate (p), matching (2), triangle (3), optimize (o) (see SUPERPOS/PLACEBAS 6.7.1 for details).

Requirements: A library must be loaded and have READY status. Previously calculated placements must be deleted and base instances must be selected using SELINS.

7.2.3.3 Local flexible/rigid-body postoptimization of core placements (POPTC)

Syntax: POPTC <flexible> <core id> [<placements>]+

Description: Postoptimization of core placements. The reference ligand is included as a rigid template for the postoptimization. An ’all-on-one’ goal function is optimized.

<core id> is a selection of core instances. It can either be a single number, a list of numbers separated by ’,’, a list of intervals of the form ’a-b’, or ’all’.

If <flexible> equals ’y’, the goal function according to the flag ’OPTIMIZE’ (see p. 55) is used for optimization.

If <flexible> equals ’n’, the Gaussian overlap volume of rigid placements will be optimized. The rigid-body optimization is performed using the RigFit method.

Parameters controlling the flexible postoptimization for the RigFit placement are listed in the static data file optpar.dat (see section 10.21).

For each selected core instance with calculated placements, <placements> is the selection of placements to be optimized. It can either be a single number, a list of numbers separated by ’,’, a list of intervals of the form ’a-b’, or ’all’.

Properties of the new/optimized placement (such as RMS value, overlap volume, etc.) are updated.

Requirements: Core placements must have been computed.

Important notes: Flexible postoptimization: If TORSION_MODE is not set to 2 (see 5.1.5 and Appendix A), the torsion energy flag (see 6.9.2) is turned off automatically.

7.2.3.4 Writing core placements in pdf format (WRITEC)

Syntax: WRITEC <core id> <base file name>

Description: Writes a set of core placements to disk in a FlexS-specific file format (.pdf format). The default directory for this command is the path specified in the entry PREDICT (config_sp.dat). The pdf format is based on ASCII and can therefore be read and edited with standard tools.

7.2.3.5 Reading core placements in pdf format (READC)

Syntax: READC <core id> <filename>

Description: Reads a set of placements for core instance <core id> in a FlexS-specific file format (.pdf format) from the file <filename>. The default directory for this command is the path specified in the entry PREDICT (config_sp.dat).

Important notes: The placement information is based on the reference ligand and core instance. Thus, the reference ligand and the core instance files in FlexS’s main memory before executing the READC command must be the same as the files that were in main memory when the core placements were generated. Otherwise FlexS ends up in an inconsistent state, which is not detected in every case.

7.2.3.6 Placing R-group instances (PLACER)

Syntax: PLACER <rgroup no> <mode> <store> [<file name>]

Description: PLACER aligns all instances of R-group <rgroup no> on top of the reference ligand using the base placement algorithm <mode> (see SUPERPOS/PLACEBAS for a description of the modes). If <store> is answered ’y’, placements are stored in the file <file name> with respect to the settings in SETTINGS (see section 10.4) (<STORE_PLACEMENT_THRESHOLD>, <STORE_PLACEMENT_MODE> and <MAX_NOF_FILE_WRITE>).

7.2.3.7 Sequential alignment of the library (PLACESEQ)

Syntax: PLACESEQ <continue> [<sequence>] <rgroup no> [<inst no>] [<rgroup no> [<inst no>] ...] <mode> <store> [<file name>]

Description: PLACESEQ performs a sequential placement of all library molecules. The molecules are created sequentially and then aligned on top of the reference ligand. No information from previous placements is reused. The alignment results are therefore identical to the results of a standard calculation of each individual library molecule. The total amount of computing time is of course also the same.

You can restart aborted computations or start from a given sequence. For both of these scenarios, the <continue> question has to be answered with ’y’. If you answer ’n’ here, FlexS will directly ask you for the R-group(s) and compute all combinations between them.

In the ’restart’ scenario (i.e., you answered ’y’ to <continue>), FlexS internally checks whether an aborted sequence is still available. If so, it will propose the latest sequence you computed and ask (<sequence>, ’y’ or ’n’) whether you want to continue with the one following it. If not, it will ask for a start sequence and continue the calculation from the sequence following this start sequence. The start sequence has to be specified as a sequence of <rgroup no> and <inst no>.

<rgroup no> defines which R-groups should be added. <inst no> defines the instance of R-group <rgroup no>. In order to restrict the R-group instances, use CLIB/SELECT in advance. The following parameters are the same as in PLACER. <mode> defines the placement mode for the base placement (see SUPERPOS/PLACEBAS for a description of modes). If <store> is answered ’y’, placements are stored in the file <file name> with respect to the settings in SETTINGS (<STORE_PLACEMENT_THRESHOLD>, <STORE_PLACEMENT_MODE> and <MAX_NOF_FILE_WRITE>).

Requirements: A library must be loaded and have READY status. Previously calculated placements must be deleted and base instances must be selected using SELINS.

Notes: The calculation can be aborted by pressing any key and entering abort as soon as the prompt appears. Before the prompt appears, the complex build-up of the current sequence will be finished.
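The sequential enumeration and the restart "from the sequence following the start sequence" can be sketched in Python. The enumeration order used here (the last R-group varies fastest) is an assumption of the sketch; the order actually used by FlexSc is not stated above. Function and variable names are invented.

```python
from itertools import product

def sequences(instances, order, after=None):
    """Yield library molecules as tuples of (rgroup_no, inst_no) pairs.

    instances -- dict mapping R-group number to its active instance ids
    order     -- R-group numbers in the order in which they are added
    after     -- optional start sequence; enumeration resumes with the
                 sequence that follows it
    """
    started = after is None
    for combination in product(*(instances[rgroup] for rgroup in order)):
        sequence = tuple(zip(order, combination))
        if started:
            yield sequence
        elif sequence == after:
            started = True   # resume with the next sequence

# Hypothetical library: two core instances, three instances of R-group 1.
library = {0: [1, 2], 1: [1, 2, 3]}
full_run = list(sequences(library, [0, 1]))
restarted = list(sequences(library, [0, 1], after=((0, 1), (1, 3))))
```

Since no placement information is shared between molecules, the number of individual alignments equals the number of enumerated sequences, here 2 x 3 = 6 for the full run.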

7.2.3.8 Extending core placements by single R-groups (EXTENDR)

Syntax: EXTENDR <rgroup no> <store> [<file name>]

Description: EXTENDR adds all active instances of R-group <rgroup no> to all core placements using the incremental construction algorithm. If <store> is answered ’y’, placements are stored in the file <file name> according to the settings in SETTINGS (<STORE_PLACEMENT_THRESHOLD>, <STORE_PLACEMENT_MODE> and <MAX_NOF_FILE_WRITE>).

Requirements: A library must be loaded and have READY status. Previously calculated placements must be deleted and base instances must be selected and placed using SELINS and PLACEC.

7.2.3.9 Selecting R-group instances by score (SELECTR)

Syntax: SELECTR <rgroup list> <nof active>
Description: SELECTR divides the set of instances for R-groups in <rgroup list> into active and inactive ones. The selection criterion is the score of a previous alignment run. <nof active> is the number of instances which should be defined as active afterwards. If <nof active> is terminated with a percent sign, the value is taken to be the percentage that should be active.
Requirements: A library must be loaded and have READY status. Previously calculated placements must be deleted and R-group instances must be placed using either SELINS and PLACER or SELINS, PLACEC and EXTENDR.


7.2.3.10 Extending core placements by multiple R-groups (EXTENDMR)

Syntax: EXTENDMR <continue> [<sequence>] <rgroup no> [<inst no>] [<rgroup no> [<inst no>] ...] <store> [<file name>] <compare> [<cmp file>]
Description: EXTENDMR adds multiple R-groups to already created core placements in a recursive fashion (recursive combinatorial alignment, see [14] for a description of the algorithm).
You can restart aborted computations or start from a given sequence. For both of these scenarios, the <continue> question has to be answered with ’y’. If you answer ’n’ here, FlexS will directly ask you for the R-group(s) and compute all combinations between them.
If <continue> equals ’y’, an aborted EXTENDMR calculation can be continued. If an EXTENDMR calculation was aborted in the current FlexS session and <sequence> equals ’y’, the aborted calculation will be continued. Otherwise a sequence of <rgroup no> and <inst no> must be specified. EXTENDMR starts the calculation with this sequence.
If <continue> equals ’n’, only the <rgroup no> must be specified.
<rgroup no> defines the list and order of R-groups to be added. <inst no> defines the instance of R-group <rgroup>. The list of R-groups can be terminated with -1. The R-groups must be selected in an order such that the core and all selected R-groups form a connected molecule. Note that only active instances are placed in the alignment procedure (after loading, all instances are active; the set of actives can then be limited with CLIB/SELECT or CSUPER/SELECTR).
If <store> is answered ’y’, placements are stored in the file(s) <file name> according to the settings in SETTINGS (<STORE_PLACEMENT_THRESHOLD>, <STORE_PLACEMENT_MODE> and <MAX_NOF_FILE_WRITE>). If <compare> is answered ’y’, the placements created for each library molecule are compared with the placement in <cmp file>_<mol id>, where <mol id> is created according to the filename rules described above.
Requirements: A library must be loaded and have READY status. Previously calculated placements must be deleted and core instances must be placed using SELINS and PLACEC.
Notes: The calculation can be aborted by pressing any key and then entering abort as soon as the prompt appears. Before the prompt appears, the complex build up of the current sequence will be finished.

7.2.3.11 Deleting placement information (DELETE)

Syntax: DELETE
Description: DELETE deletes all placement information in main memory. This single command applies to all kinds of placement results. Note that files created during a combinatorial alignment run are not deleted.

7.2.3.12 List placement results (LISTP)

Syntax: LISTP <mode> [<active> <fill gaps> <nof placements> <sort>][<nof_sol>]


Description: LISTP creates a table of alignment solutions, one row for each aligned library molecule. Depending on the performed calculations, one of four modes must be selected with <mode>: c (core placements), s (single R-group placements), m (multiple R-group placements) or a (all multiple R-group placements, only if <KEEP_ALL_SCORES_ACHIEVED> is set to ’1’).
If <mode> is set to ’a’, the placements will be sorted by energy and the best <nof_sol> placement scores will be listed in the table, one row for each placement. Otherwise the table can be restricted to active molecules by answering <active> with ’y’. If placement information is missing for some library molecules, the table can be completed (filled up with 0’s) by answering <fill gaps> with ’y’. The number of placement scores to be printed is controlled by <nof placements>, and finally <sort> defines the sort criterion, which is either 0 (by R-group instance numbers) or 1 (by energy).

7.2.3.13 Extracting a library molecule and loading placement data (EXTRACT)

Syntax: EXTRACT <mode> <core id> [<r id>...]
Description: EXTRACT creates a library molecule and loads the corresponding placement data. The library molecule is specified by the instance numbers for the core <core id> and the R-groups <r id>. <mode> defines which kind of placement data should be loaded. Allowed values are c (core placements), r (single R-group placements) and m (multiple R-group placements).
Requirements: A combinatorial alignment must have been performed and placement data must have been written to a pdf file. Make sure that <STORE_PLACEMENT_MODE> is set to -1 before the calculation and the <store> parameter of the corresponding alignment command is answered ’y’.

7.2.3.14 Releasing a library molecule and deleting placement data (RELEASE)

Syntax: RELEASE
Description: RELEASE destroys placements previously loaded with EXTRACT and releases the extracted library molecule back to the library. RELEASE is automatically executed between EXTRACT commands.

7.2.3.15 Writing multiple R-group placements to a file (WRITESOL)

Syntax: WRITESOL <basename> <append> <nof_sol>
Description: WRITESOL writes the best <nof_sol> multiple R-group placements to a mol2 file called <basename>. The placement scores will be sorted by energy. If <append> is set to ’y’, <basename> is a multi mol2 file. Otherwise the placement data will be written to single mol2 files. The names of these files consist of <basename> and a well-defined string which describes the library molecule. WRITESOL reads the data of the best <nof_sol> placements from the pdf files¹.
Requirements: <KEEP_ALL_SCORES_ACHIEVED> must have been set to ’1’ and a combinatorial alignment must have been performed with EXTENDMR or PLACESEQ. The placement data must have been written to a pdf file. Make sure that <STORE_PLACEMENT_MODE> is set to -1 before the calculation and the <store> parameter of the corresponding alignment command is answered ’y’.

¹ In rare cases, for numerical reasons, it may happen that you cannot read in a generated pdf file. We’d kindly ask you to report such problems to [email protected] if possible.

7.2.3.16 Extracting the library molecule with the best alignment score and loading the corresponding placement data (EXTRACTTOP)

Syntax: EXTRACTTOP
Description: EXTRACTTOP creates the library molecule with the best alignment-energy score and loads the corresponding placement data².
Requirements: <KEEP_ALL_SCORES_ACHIEVED> must have been set to ’1’ and a combinatorial alignment must have been performed with EXTENDMR or PLACESEQ. The placement data must have been written to a pdf file. Make sure that <STORE_PLACEMENT_MODE> is set to -1 before the calculation and the <store> parameter of the corresponding alignment command is answered ’y’.

² In rare cases, for numerical reasons, it may happen that you cannot read in a generated pdf file. We’d kindly ask you to report such problems to [email protected] if possible.


8 Troubleshooting

8.1 Installation and Licensing Problems

8.1.1 Libraries missing?

We only use standard shared libraries in FlexS. In FlexV, one of the following libraries may be missing:

OpenGL: libGL is missing. In this case, OpenGL is not installed on this machine. Please contact your system vendor or administrator.

8.1.2 Ubuntu / Debian Distributions

In some Ubuntu and Debian distributions, for example Ubuntu 7.0, the symbolic link libGL.so pointing to libGL.so.1 is missing.
To fix this, change to the /usr/lib directory and issue the following command:

ln -s libGL.so.1 ./libGL.so

The file libGL.so.1 can also carry a higher version number, so a command such as
ln -s libGL.so.2 ./libGL.so
may be the appropriate one.
If the problems persist, please apply the same procedure to the files libGLX.so and libGLU.so, and restart your X server using CTRL+ALT+BACKSPACE.

8.1.3 Token not numeric error under Linux

If you are running a Linux system and get the error message ’Token not numeric’ while reading data, this is caused by the language support contained in the C library (floating-point numbers are expected to contain ’,’ instead of ’.’ in some languages). Unset the $LANG variable before running FlexS to circumvent language support.

8.1.4 Insufficient memory under Windows

In rare cases it may occur that FlexS does not start under Windows but shells out to the system. This is usually due to the memory management of the operating system. In all cases we observed, this could be resolved with the following procedure:

1. Create a c:\config.sys file if it does not exist. (You may need to change C: to another drive name pointing to your boot hard disk.)


2. Enter something like:
shell=c:\windows\command.com c:\windows /e:2048 /p
or
shell=c:\windows\system32\command.com c:\windows /e:2048 /p
respectively.

3. Reboot your machine.

The problem should be resolved. Please email us if you still encounter problems.

8.1.5 Windows Vista

FlexS has been tested under Vista. You will need to enable the so-called “XP compatibility mode” using the context menu (right mouse click) of the executable’s icon in the installation folder, and then restart FlexS.

8.2 Problems at Runtime of FlexS

If you encounter problems such as

>> DATA ERROR: unexpected token

[occurred in line 21 of file C:\Documents and Settings\user\.flexs\flexs_gfx.cfg]

>> ERROR: font_app medium-r

>> ERROR: ^

>> expecting one of the following tokens:

it may be that there are files left over from the beta stage. In this case, please remove the files in C:\Documents and Settings\user\.flexs\ (Windows) or /home/user/.flexs/ (Linux). By user, we refer to your login name.
After restarting FlexS, you may have to re-configure the license file path and the specification of the ring conformer generator (corina, ...) in the Global Settings dialog.


9 Getting Help

9.1 Support

Commercial customers usually have full support from us. But we will certainly also try to help academic customers who do not usually have support contracts with us. In either case, please check our FAQ & Knowledge Base at http://www.biosolveit.de/faq for answers. In case this does not help, please visit http://www.biosolveit.de/support. Finally, we are certainly available through email at [email protected].


III

TECHNICAL REFERENCE


10 Files and file formats

FlexS deals with two kinds of files. The one kind are static data files. FlexS knows where to read these files, because their paths have been defined in the configuration file config_sp.dat (see there). The other kind are input/output files; in this case FlexS knows only the standard directories where to read/write the files (they have also been defined in config_sp.dat).
How does FlexS locate files? For the static data files, FlexS looks only in the defined path, and nowhere else. For the second kind of files, FlexS performs the following procedure: if the given filename begins with a slash character (/), the filename is taken to be an absolute path, and FlexS first tries to open the file there. If instead the given filename begins with ./, FlexS considers it to be relative to the current directory, and FlexS first tries to open the file in that path relative to the current directory.
In both cases, if FlexS fails to open a file in the absolute or relative path, or if neither of the two cases applies (that is, the given filename neither begins with / nor with ./), FlexS tries to open the file in the predefined directory (if a predefined directory exists for the kind of file in question).
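The lookup order for input/output files described above can be sketched in a few lines of Python. This is only an illustration of the documented order, not FlexS code: the function names are hypothetical, and the way a name starting with / or ./ is combined with the predefined directory on fallback (here the prefix is dropped before joining) is an assumption the text does not settle.

```python
import os
import posixpath


def candidate_paths(filename, default_dir=None):
    """List the locations to try for an input/output file, in order."""
    candidates = []
    name = filename
    if filename.startswith("/"):
        candidates.append(filename)      # absolute path is tried first
        name = filename[1:]
    elif filename.startswith("./"):
        candidates.append(filename)      # relative to the current directory
        name = filename[2:]
    if default_dir is not None:
        # Fallback: the predefined directory for this kind of file.
        # Assumption: the leading "/" or "./" is dropped before joining.
        candidates.append(posixpath.join(default_dir, name))
    return candidates


def locate(filename, default_dir=None, exists=os.path.isfile):
    """Return the first candidate that exists, or None if none does."""
    for path in candidate_paths(filename, default_dir):
        if exists(path):
            return path
    return None
```

For a plain name such as lig.mol2 the only candidate is the predefined directory, which matches the rule that names beginning with neither / nor ./ are looked up there directly.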

10.1 Molecular input file formats

Because different input file formats exist for molecular input, the respective information has to be treated differently. For example, SYBYL atom types are not employed in the well-known SD file format, and three-dimensional coordinates are not necessarily specified for ligand data given as SMILES. Matters may become especially hard to understand when it comes to the subtle differences in chemistry which must be or need not be applied.
Some file formats do not usually come with 3D coordinates, which is why the program 3DGENERATOR for automatic generation of 3D coordinates has to be specified for these formats (cf. page 205). Similarly, because FlexS relies on SYBYL atom types, we must determine such types for cases where they are not available. This is activated by selecting transformation level 10 (see SELINIT, section 6.4.2), which is forced (!) for the respective file formats, again depending on what type of initialization (traditional or new) you have chosen (cf. above).
To assist you with the configuration for ligand initialization and associated files, we list the settings and forced actions for the most important file types in Table 10.1 in case you want to change any of the values.


Format     Forced Flag Settings    Note
SDF        SELINIT !* 10           —
MDL MOL    SELINIT !* 10           —
MOL2       —                       —

Table 10.1: Overview of the default ligand initialization settings and intercorrelation of flags for different ligand file formats. Please consider possible constraints of the 3D coordinates generator (cf. page 205).

10.2 Overview of filename extensions

FlexS uses files with the following extensions:

.bat FlexS batch files. Default directory: Current.

.pdf Placement description files in a FlexS-specific format. Describes a whole set of placements for a reference/test ligand superposition.

.dat Static data files in a FlexS-specific format, or configuration file (config_sp.dat). Default directory: Current for config_sp.dat, none for static data files.

.log FlexS results in the form of tables. All result browsing commands such as QUERY can write their output into .log files (see SELOUTP command (section 6.2.10)).

.mol MDL’s molecular input format [18]. Default directory: Value of environment variable LIGAND (see config_sp.dat).

.mol2 Ligand files in the SYBYL MOL2 format [19]. Default directory: Value of environment variable LIGAND (see config_sp.dat).

.sdf MDL’s molecular input format [18]. Default directory: Value of environment variable LIGAND (see config_sp.dat).

.gdf Graphic description files in a FlexS/FlexV-specific format (see FlexV manual). Default directory: Value of environment variable TEMP (see config_sp.dat).

10.3 *Batch files

Roughly speaking, a batch file consists of a sequence of FlexS commands together with their arguments (one command plus argument list per line). Nevertheless, there are two important features which give batch files more expressive power than simple sequences of commands. The first feature is the use of variables, the second and more important is the use of loops. An extended example of a batch file is given in Appendix B.1 and B.2.
Batch files are started by entering the command SCRIPT in the main menu (FLEXS) or by starting FlexS in batch mode (see section 5.2).
A simple way to generate a batch file is to start FlexS in log mode (option -l, see 5.2). Then all the commands you type are stored in a log file, which in turn can be used as a batch file. Note that batch filenames are required to have the extension .bat.


The SCRIPT command itself is not allowed in batch files, so ’nested’ batch files are not possible.

10.3.1 Variables

Variables in scripts are either $(alphanum-string) or $0, $1, . . ., $9. The latter format was the only one until FlexS 1.6 and is kept for compatibility reasons. Variables can replace any argument of a command but not a command itself. They can either be initialized by a loop (see below), by interactive user commands INPUT or SELINP, or by the assignment statement SETVAR (see also below).
Some variables are predefined within FlexS and can be used in scripts in a read-only manner. Currently, these are:

• The time the program was called $(START_TIME), the format is a single string, e.g., 20031017_1107 for 11:07h, October the 17th 2003.

• The best score of all placements $(BEST_SCORE)

• The best RMSD of all placements $(BEST_RMSD)

• All directories defined in the configuration file, for example $(LIGAND)

• All flags defined in the configuration file, for example $(RING_MODE)

• All database filenames defined in the configuration file, for example $(TORSION)

• All executables defined in the configuration file, for example $(RCGENERATOR)

• All integer and double parameters defined in the SETTINGS database file, for example $(CLASH_FACTOR)

• The name of the currently loaded test ligand $(TLIG_NAME) and the currently loaded reference ligand $(RLIG_NAME)

• The number of currently loaded test ligands $(NOF_TLIG) and the number of currently loaded reference ligands $(NOF_RLIG)

• The number of generated placements $(NOF_PLACEMENTS).

• The number of components of the currently loaded ligand $(NOF_COMPONENTS).

• The index of the current core group $(CORE_ID).

• The number of instances of the current core group $(NOF_CORE_INST).

• The number of R-groups (without the core group) $(NOF_RGROUPS).

• The torsion status of the currently loaded test ligand $(TORSION_STATUS):
value -1: the status is not defined.
value 0: torsion angles around the bonds applied according to either torsion_standard.dat or torsion_fine.dat (see section 10.10).
value 1: torsion angles around the bonds applied according to either torsion_standard.dat or torsion_fine.dat; if not found, the torsional profile is calculated (see section 10.10).
value 2: no torsion angles found in either torsion_standard.dat or torsion_fine.dat for at least one rotatable bond. The program applies a 30 degree grid with arbitrary reference atoms.

• $(PVM_ID) is set to ’$PVM_VMID_<pvm task id>’ in FlexS-PVM (please refer to the PVM section on page 125).

If a variable is used in a script the first time for reading rather than for writing, it is set to an empty string.
If the string assigned to a variable contains blanks, it must be written in double quotes to avoid breaking the string into separate tokens. A variable is replaced in a string even if the string is quoted. Variable replacement can be avoided by writing the variable in single quotes. An empty string can be written as "".

Example

setvar $(number) 5
setvar $(ligand) lig$(number)
output "The value of '$(ligand)' is: $(ligand)."

prints: The value of $(ligand) is: lig5.
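The replacement rules illustrated by this example can be mimicked with a small Python sketch. It is an approximation under stated assumptions, not the FlexS parser: the function name expand is hypothetical, variable names are assumed to consist of letters, digits and underscores (the predefined names above contain underscores), and a single-quoted span is assumed to be copied verbatim with its quotes dropped, as the printed output of the example suggests.

```python
import re

# A single-quoted span, a $(name) variable, or a legacy $0..$9 variable.
_TOKEN = re.compile(r"'([^']*)'|\$\(([A-Za-z0-9_]+)\)|\$([0-9])")


def expand(text, variables):
    """Replace script variables in text; single quotes suppress replacement."""
    def repl(match):
        if match.group(1) is not None:
            # Single-quoted: keep the content as-is, drop the quotes.
            return match.group(1)
        name = match.group(2) or match.group(3)
        # A variable read before it was ever written expands to "".
        return variables.get(name, "")
    return _TOKEN.sub(repl, text)


variables = {"number": "5"}
variables["ligand"] = expand("lig$(number)", variables)
print(expand("The value of '$(ligand)' is: $(ligand).", variables))
```

Running the sketch reproduces the line shown in the example above. Double quotes play no role here because they only group tokens; replacement still happens inside them.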

10.3.2 Script parameter lists

A script can be executed with a parameter list. The parameter list consists of a list of script variables with assigned values. Before execution of the first script command, the variables from the parameter list will be initialized with the given values. A parameter list can be entered with the command line option -a or as the second parameter of the SCRIPT command. Because the parameter list itself must be handled as a command argument, there are some syntactic rules:

• Each entry in the parameter list is separated by a ;-character; the parameter list entry itself is not allowed to contain a ’;’.

• The variable name must be followed directly by an =-character, the string between =and either ; or the end of string is interpreted as the value for the preceding variable.

The following example shows a FlexS call in batch mode with two parameters. The script 1stTest is given in Appendix A. The script loads a reference ligand file and a test ligand file, whose names are passed in the parameter list, computes the superposition, and writes the ten highest ranking solutions in a multi-mol2 file.

Example

flexs -b 1stTest -a '$(reflig)=dhfr/1dhf_kpl_h;$(testlig)=dhfr/4dfr_min_h'
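The two syntactic rules for parameter lists amount to splitting on ';' and then on the first '='. The following Python sketch shows this for the argument used in the call above. The function name is hypothetical, and skipping empty entries (for instance after a trailing ';') is an assumption, since the manual does not say how FlexS treats them.

```python
def parse_parameter_list(arg):
    """Split a parameter list of the form 'name=value;name=value' into a dict."""
    params = {}
    for entry in arg.split(";"):
        if not entry:
            continue  # assumption: empty entries are ignored
        # The variable name is followed directly by '='; everything up to
        # the next ';' (or the end of the string) is the value.
        name, sep, value = entry.partition("=")
        if not sep:
            raise ValueError("missing '=' in parameter list entry: " + entry)
        params[name] = value
    return params


params = parse_parameter_list("$(reflig)=dhfr/1dhf_kpl_h;$(testlig)=dhfr/4dfr_min_h")
for name, value in params.items():
    print(name, "->", value)
```

Splitting on the first '=' only means that a value may itself contain '=', while a ';' inside a value is impossible by construction, in line with the first rule.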


10.3.3 Loops: FOR_EACH/END_FOR, WHILE or FOREVER

A loop is a sequence of commands, braced by a pair of FOR_EACH and END_FOR instructions. Loops can be nested.

Syntax: FOR_EACH <n1> [<n2> [...]] IN "<filename>" [FROMTO <from> <to>]
<sequence of commands>
END_FOR

FOR_EACH <n1> FROMTO <from> <to>
<sequence of commands>
END_FOR

WHILE <condition>
<sequence of commands>
END_FOR

Description: In the first variant, FlexS opens the specified file <filename>. If <filename> is an absolute path (first character is a slash /), it is taken as it is. If it is a relative path (first character anything other than a slash), it is taken relative to the current directory.
After opening the file, FlexS reads a line at a time (comment lines and empty lines will be skipped). FlexS expects to find at least as many tokens (a token is any sequence of non-whitespace characters or a string enclosed in "") in that line as there are loop variables in the FOR_EACH line. FlexS assigns the tokens from left to right to the respective loop variables, and then it executes the <sequence of commands> between FOR_EACH and END_FOR. Optionally, a subsection of the file can also be specified with FROMTO. <from> and <to> define the lines to be used (omitting empty lines and comment lines). An example is given at the end of the command description.
In the second variant, the loop variable successively gets the values <from>, <from>+1, . . ., <to>; see also the example at the end of this description.
Finally, WHILE allows the definition of conditional loops. See the next section on branches for how to define a condition.
When END_FOR is encountered, FlexS jumps to the corresponding FOR_EACH and reads the next line from the specified file. This process is repeated until no more lines are found. After that, execution continues following END_FOR. <n1>, <n2>, ... are script variables following the rules summarized in section 10.3.1.
If the first command of a batch file is FOREVER, the whole batch file is executed repeatedly. This is (until now) only used for software demonstrations.
The commands BREAK and CONTINUE may be used to alter the sequence of commands executed in a loop. On BREAK the execution jumps to after the END_FOR command of the innermost loop containing this BREAK command. On CONTINUE the execution jumps to the FOR_EACH command of the innermost loop containing this CONTINUE command.


Example

FOR_EACH $(var1) $(var2) IN "variable.list"
...
END_FOR

The variables $(var1) and $(var2) are taken from the first two columns in the file variable.list:

variable.list:
# molecule file ; nof compounds ; ...
testset.mol2 1000 ...
testset2.sdf 3000 ...
...

Example

FOR_EACH $(count) FROMTO 2 10
...
END_FOR
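How the file variant of FOR_EACH turns a list file into loop-variable values can be sketched in Python as follows. This is a reading of the description above, not FlexS code: the function name is hypothetical, comment lines are assumed to start with '#' (as in the variable.list example), and FROMTO is assumed to count the remaining lines from 1, inclusive at both ends.

```python
import re

# A token is a string enclosed in "" or a run of non-whitespace characters.
_TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def loop_rows(lines, n_vars, from_to=None):
    """Return one list of n_vars tokens per usable line of a loop file."""
    rows = []
    index = 0  # counts only lines that are neither empty nor comments
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        index += 1
        if from_to is not None and not (from_to[0] <= index <= from_to[1]):
            continue
        tokens = [m.group(1) if m.group(1) is not None else m.group(2)
                  for m in _TOKEN.finditer(stripped)]
        if len(tokens) < n_vars:
            raise ValueError("too few tokens in line: " + stripped)
        rows.append(tokens[:n_vars])  # tokens are assigned left to right
    return rows


listing = [
    "# molecule file ; nof compounds",
    "testset.mol2 1000",
    "",
    "testset2.sdf 3000",
]
for var1, var2 in loop_rows(listing, 2):
    print(var1, var2)
```

Extra tokens on a line are simply ignored, which matches the requirement of "at least as many tokens" as loop variables.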

10.3.4 Branches: IF/ELSE/ENDIF

Testing operators enable the sequence of commands executed in a batch file to be altered.

Syntax: IF <condition>
<1st sequence of commands>
[ELSE
<2nd sequence of commands>]
ENDIF
Description: If <condition> evaluates as TRUE, the 1st sequence of commands is executed. If not, the 2nd sequence of commands is executed instead. <condition> allows variables or constants to be tested using a variety of operators. The operator decides whether the operands are interpreted as values or strings. The numerical operators <, <= (≤), == (=), => (≥), >, and != (≠) may be used for comparison. The string operators eq and [] (contains operator) may also be used.
Valid conditions are e.g.:

<condition>     1st seq. executed if              2nd seq. executed if            Comment
$(val) < 5      $(val) < 5                        $(val) ≥ 5                      Numerical comparison
$(val) != 5     $(val) ≠ 5                        $(val) = 5                      Numerical comparison
$(str) eq no    $(str) = ‘no’                     $(str) ≠ ‘no’                   String comparison
$(str) [] no    $(str) contains the string ‘no’   $(str) does not contain ‘no’    String comparison
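The point that the operator, not the operand, decides between numerical and string interpretation can be made concrete with a short Python sketch. The function name evaluate is hypothetical and the sketch assumes that variables have already been replaced by their values; the operator spellings are those listed above.

```python
import operator

# Numerical operators: both operands are converted to numbers first.
NUMERIC = {
    "<": operator.lt, "<=": operator.le, "==": operator.eq,
    "=>": operator.ge, ">": operator.gt, "!=": operator.ne,
}
# String operators: operands are compared as text.
STRING = {
    "eq": operator.eq,
    "[]": lambda text, part: part in text,  # "contains" operator
}


def evaluate(left, op, right):
    """Evaluate '<left> <op> <right>' for already expanded operands."""
    if op in NUMERIC:
        return NUMERIC[op](float(left), float(right))
    if op in STRING:
        return STRING[op](left, right)
    raise ValueError("unknown operator: " + op)


print(evaluate("10", "<", "9"))      # numerical comparison of 10 and 9
print(evaluate("yes", "[]", "no"))   # does 'yes' contain 'no'?
```

Comparing "10" and "9" as numbers gives a different answer than comparing them as strings would, which is why the choice of operator matters.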

10.3.5 One-of-n selection: SELINP

Syntax: SELINP <n1> [<n2> [...]] IN "<filename>" <no>
Description: The SELINP command can be used instead of a loop. Instead of all rows, only row <no> is used to initialize the variables and no iteration takes place. The meaning of <n1>, etc. is the same as for FOR_EACH.
If <no> is missing, the user will be prompted for it interactively (assuming that the batch file is started with the SCRIPT command). SELINP can therefore be used to start a batch file with alternative parameter settings.

10.3.6 Special script command: SETVAR

Syntax: SETVAR <x> <value>
Description: Assigns the string <value> to the batch variable <x>.

10.3.7 Special batch file command: INPUT

Syntax: INPUT <n> <info text>
Description: Displays the <info text> on the screen and reads a string from the keyboard (standard input). The string is then assigned to batch variable <n>.

10.3.8 Special script command: INCR

Syntax: INCR <var> <inc val>
Description: Increments the value of <var> by <inc val>. If <inc val> and the former value of <var> are integers, the output will also be an integer; otherwise <var> will contain the string of a floating-point number.
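Since script variables hold strings, the integer versus floating-point behaviour of INCR can be pictured with the following Python sketch. It is an interpretation of the description, not FlexS code: the function name is hypothetical, and the exact text FlexS produces for floating-point results (number of digits, notation) is not documented here, so Python's own formatting stands in for it.

```python
def incr(value, inc_val):
    """Return the new string value of a variable after INCR."""
    try:
        # Both operands are integers: the result stays an integer.
        return str(int(value) + int(inc_val))
    except ValueError:
        # Otherwise the result is the string of a floating-point number.
        return str(float(value) + float(inc_val))


print(incr("5", "2"))
print(incr("5", "0.5"))
```

A counter incremented by whole numbers therefore stays usable wherever an integer argument is expected.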

10.3.9 Special batch file command: OUTPUT and OUTERR

Syntax: OUTPUT <info text> [<info text> ...]
Description: Displays all <info text> parameters on the screen. <info text> can be a string enclosed in "" or a variable. OUTERR outputs <info text> to standard error instead of standard output.

10.3.10 Special batch file command: TIMER

Syntax: TIMER <command>
Description: Sends a <command> to the internal timer. <command> is either start or stop. start resets the internal timer and starts counting, stop stops counting and outputs the elapsed time.

10.3.11 Special batch file command: PROCSIZE

Syntax: PROCSIZE
Description: Displays the current program size on the screen. The used memory of the machine is output in specified units.

10.3.12 Special batch file command: WAIT

Syntax: WAIT [<delay>]
Description: The WAIT command is designed for demonstrating FlexS. It delays execution for <delay> seconds. Without the optional delay parameter, execution is halted until the user presses the RETURN key.
During the time delay of the wait command, the complete batch file can be aborted by pressing ’∧’. The wait command itself can be aborted by pressing the spacebar and you can go into stop mode (wait until keypress) by pressing ’s’.

10.4 Defining program parameters (flexs_settings.dat)

Although there are default values for all parameters, each one should be defined. The parameters and their meaning are described in the following section.

10.4.1 Overlap

Name (type): <UNITED_ATOM_CORRECTION> (floating-point)
Description: Hydrogens are not explicitly considered in overlap tests. Instead, the united atom radii model is used. For each hydrogen attached to a heavy atom, the van der Waals radius of this atom is incremented by <UNITED_ATOM_CORRECTION>.
Default value: 0.1 Å
Reasonable range: 0.0 – 0.2
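As a worked example of the united atom correction, the effective radius of a heavy atom is its van der Waals radius plus the correction once per attached hydrogen. The Python sketch below uses the default of 0.1 Å; the function name is hypothetical, and the 1.7 Å carbon radius is an illustrative value only, not necessarily the one FlexS uses.

```python
def united_atom_radius(vdw_radius, n_hydrogens, correction=0.1):
    """Effective radius (in Å) of a heavy atom in the united atom model."""
    return vdw_radius + n_hydrogens * correction


# Illustrative methyl carbon: assumed 1.7 Å radius, three attached hydrogens.
print(round(united_atom_radius(1.7, 3), 2))
```

With these assumed numbers a methyl carbon is treated as a sphere of 2.0 Å, while a carbon without hydrogens keeps its plain radius.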

Name (type): <DOT_OVERLAP_VOL> (floating-point)
Description: Maximum admissible overlap volume between an atom placed on an interaction point and a reference ligand atom. This overlap test is used to reject interaction points before the superposition computation.
Default value: 2.5 Å³
Reasonable range: 0.0 Å³ – 100.0 Å³

Name (type): <CHELAT_MATCH_DIST> (floating-point)
Description: Determines the minimum distance between matched interaction points that is required for two (or more) interactions with one interaction partner (bifurcated bonds).
Default value: 1.6 Å
Reasonable range: 0.0 – 10.0 Å

Name (type): <VOLUME_OVERLAP_TH> (floating-point)
Description: Determines a threshold value for the minimum v.d.W. overlap volume which is required for every placement. This value is normalized by the maximum possible overlap volume (identity).
Default value: 0.6
Reasonable range: 0.0 – 0.9

10.4.2 Gaussian description

Name (type): <DEFAULT_GAUSS_WIDTH_QUAL> (floating-point)
Description: Determines the default width that is taken for all the Gaussians representing the chemical property <QUAL> of the ligands. <QUAL> is one of the following property identifiers: ED (electron density), CHG (partial atomic charge), HYD (hydrophobicity), DON (H-bonding donor potential), ACC (H-bonding acceptor potential), DON_GEOM (H-bonding donor geometry), ACC_GEOM (H-bonding acceptor geometry). The specification -1 means that no default value is applied and the widths are taken from the input data.
Default value: 1.2 Å
Reasonable range: 0.3 – 2.0 Å

Name (type): <UNITED_GAUSSIANS> (integer)
Description: 0: one Gaussian per atom; 1: two or more Gaussians with centers closer than 1.2 Å from each other are merged if they have the same sign.
Default value: 1
Reasonable range: 0 or 1

Name (type): <MAX_NOF_GAUSS> (integer)Description: Maximum number of Gaussians per property allowed for a singlemolecule.Default value: 1000Reasonable range: 0 – 10000

10.4.3 Generation of conformations

Name (type): <CLASH_FACTOR> (floating-point)
Description: A conformation where two test ligand atoms are closer together than <CLASH_FACTOR> times the sum of their van der Waals radii is removed from the conformational set. Several atom pairs are excluded from this test: atoms with two or fewer bonds between them, atoms belonging to the same ring system, and hydrogen atoms.
Default value: 0.6 Å
Reasonable range: 0.0 – 1.0 Å
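The clash test described for <CLASH_FACTOR> can be sketched as follows. This is only an illustration under stated assumptions: the function name, the atom representation and the `excluded_pairs` argument are hypothetical and not part of FlexS; the exclusion rules (bond separation, ring membership, hydrogens) are assumed to have been resolved by the caller.

```python
from itertools import combinations
from math import dist


def has_clash(atoms, clash_factor=0.6, excluded_pairs=frozenset()):
    """Return True if any tested atom pair is closer than
    clash_factor * (sum of the two van der Waals radii).

    atoms          -- list of ((x, y, z), vdw_radius) tuples (hypothetical layout)
    excluded_pairs -- set of frozenset({i, j}) index pairs skipped by the test
    """
    for (i, (pos_i, r_i)), (j, (pos_j, r_j)) in combinations(enumerate(atoms), 2):
        if frozenset((i, j)) in excluded_pairs:
            continue  # pair is exempt from the test
        if dist(pos_i, pos_j) < clash_factor * (r_i + r_j):
            return True
    return False


# Two carbons (radius 1.7 Å) at 1.5 Å distance: threshold is 0.6 * 3.4 = 2.04 Å.
close_pair = [((0.0, 0.0, 0.0), 1.7), ((1.5, 0.0, 0.0), 1.7)]
far_pair = [((0.0, 0.0, 0.0), 1.7), ((3.0, 0.0, 0.0), 1.7)]
```

A conformation for which `has_clash` returns True would be dropped from the conformational set in this reading of the description.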

Name (type): <RCGEN_MAX_ENERGY> (integer)
Description: Defines the maximum conformation energy for ring system conformations generated by the external ring conformation generator.
Default value: 20 kJ/mol
Reasonable range: See description of the ring conformation generator.

Name (type): <CORINA_MAX_ENERGY> (integer)
Description: Defines the maximum conformation energy for ring system conformations generated by CORINA.
Default value: 20 kJ/mol
Reasonable range: See description of CORINA.

Name (type): <TORSION_MAX_ENERGY> (integer)
Description: Defines the maximum conformation energy allowed for torsion angles.
Default value: 20 kJ/mol
Reasonable range: See description of the MIMUMBA program.
Important note: Changes to TORSION_MAX_ENERGY have no effect on ligands that are already loaded. If TORSION_MODE equals 2, this parameter has no effect on any ligand that is loaded.

Name (type): <EQUIVCLASS_A_EPS> (floating-point)
Description: Defines the angular tolerance for the detection of periodicities smaller than 2π at torsion angles.
Default value: 0.088 rad
Reasonable range: 0.0 – π rad

Name (type): <EQUIVCLASS_D_EPS> (floating-point)
Description: Defines the distance tolerance (difference in bond lengths) for the detection of periodicities smaller than 2π at torsion angles.
Default value: 0.15 Å
Reasonable range: 0.0 – 1.0 Å

Name (type): <TORSION_MINIMA_CUTOFF>
Description:
Default value: 30 %
Reasonable range:

10.4.4 Selecting the base fragment (superposition algorithm, phase 1)

Name (type): <NOF_FRAGMENTATIONS> (integer)
Description: Determines the number of different base fragments and thus the number of different fragmentations selected in the automatic base selection step.
Default value: 4
Reasonable range: 1 – 6

Name (type): <MAX_NOF_COMP_TO_ALIGN> (integer)
Description: Maximum number of components allowed per molecule. Superposing a molecule with more components is time-consuming, and the chance of a correct prediction is extremely low. FlexS therefore aborts the superposition of molecules with more components.
Default value: 40
Reasonable range: 30 – 50

10.4.5 Placing the base fragment (superposition algorithm, phase 2)

Name (type): <QUERY_TRI_TOL> (floating-point)
Description: Determines the tolerance which is allowed for the lengths of the sides of a query triangle up to which the respective triangles are merged.
Default value: 0.2 Å
Reasonable range: 0.0 – 0.4 Å

Name (type): <MAX_NOF_Q_TRI> (integer)
Description: Determines the maximum number of query triangles considered during base placement.
Default value: 10000
Reasonable range: ≥ 1000

Name (type): <MAX_NOF_TRI_PLACE> (integer)
Description: Determines the maximum number of triangle placements attempted.
Default value: 400000
Reasonable range: ≥ 100000

Name (type): <MAX_NOF_CLUSTER> (integer)
Description: Determines the maximum number of clusters generated during base placement. If this threshold is exceeded, base placement is aborted.
Default value: 2000
Reasonable range: ≥ 500

Name (type): <TRIANGLE_BUCKET_SIZE> (floating-point)
Description: Determines the width of the buckets in the triangle hashing data structure. The tolerance for each triangle edge in the matching procedure is half of this value. Be careful with changing this parameter! It is strongly correlated to the point densities defined in the delta entries in the static data file geometry_sp.dat.
Default value: 0.9 Å
Reasonable range: 0.2 – 2.0 Å

Name (type): <LINE_LEN_TOL_FACTR> (floating-point)
Description: Determines the factor of additional tolerance for discretization of geometries as opposed to ligand alignment.
Default value: 1.4 Å
Reasonable range: 1.4 – 3.0 Å

Name (type): <MIN_EDGE_LENGTH> (floating-point)
Description: Defines the minimum length of a triangle edge which is stored in the triangle hash table.
Default value: 0.8 Å
Reasonable range: 0.0 – <MAX_EDGE_LENGTH> Å

Name (type): <MAX_EDGE_LENGTH> (floating-point)
Description: Defines the maximum length of a triangle edge which is stored in the triangle hash table.
Default value: 10.0 Å
Reasonable range: ≥ <MIN_EDGE_LENGTH>

Name (type): <TRIANGLE_CLUSTER_RMS> (floating-point)
Description: Defines the maximum RMS distance between placed test ligands (placed by matched triangles) such that they can be put into the same cluster.
Default value: 1.0 Å
Reasonable range: 0.8 – 3.0 Å

Name (type): <TRIMATCH_D_EPS> (floating-point)
Description: Defines the maximum admissible tolerance in the distance between the centers of matched interaction groups during triangle matching.
Default value: 0.55 Å
Reasonable range: ≥ <DELTA_BUCKET> / 2

Name (type): <TRIMATCH_A_EPS> (floating-point)
Description: Defines the maximum admissible tolerance angle on the test ligand's side for matched interaction groups during triangle matching.
Default value: 0.1 rad
Reasonable range: 0.0 – 2π rad

10.4.6 Building up the complex (superposition algorithm, phase 3)

Name (type): <ADDMATCH1_D_EPS> (floating-point)
Description: Defines the maximum admissible tolerance in the distance between the centers of matched interaction groups during the search for further matches if the re-placement of the test ligand failed.
Default value: 0.8 Å
Reasonable range: ≥ <TRIMATCH_D_EPS>

Name (type): <ADDMATCH2_D_EPS> (floating-point)
Description: Defines the maximum admissible tolerance in the distance between the centers of matched interaction groups during the search for further matches. Thus <ADDMATCH2_D_EPS> determines the radius within which FlexS searches for further matches.
Default value: 2.0 Å
Reasonable range: ≥ <ADDMATCH1_D_EPS>

Name (type): <ADDMATCH_A_EPS> (floating-point)
Description: Defines the maximum admissible tolerance angle on the test ligand's side for matched interaction groups during complex construction.
Default value: 0.35 rad
Reasonable range: ≤ π rad

Name (type): <MAX_ENERGY> (floating-point)
Description: Defines the energy threshold for partial solutions during the complex building. All partial placements with energy greater than <MAX_ENERGY> are omitted.
Default value: 0.0 kJ/mol
Reasonable range: −∞ – +∞ kJ/mol

Name (type): <REL_MAX_ENERGY> (floating-point)
Description: Like <MAX_ENERGY>, but the energy threshold is computed relative to the best partial placement found in each iteration.
Default value: 2000.0 kJ/mol
Reasonable range: ≥ 0.0 kJ/mol

Name (type): <SOLUTIONS_PER_IT> (integer)
Description: The maximum number of partial placements which will be taken into the next building step.
Default value: 500
Reasonable range: ≥ 0

Name (type): <SOLUTIONS_PER_FRAG> (integer)
Description: In addition to the number of solutions defined in <SOLUTIONS_PER_IT>, a minimum number of solutions for each fragmentation will be taken into the next iteration. This number is defined in <SOLUTIONS_PER_FRAG>.
Default value: 50
Reasonable range: 0 – <SOLUTIONS_PER_IT>

Name (type): <SOLUTION_CLUSTER_RMS> (floating-point)
Description: After each build-up step, a clustering follows. Solutions will be clustered if three conditions are fulfilled. First, the RMS distance between the two placements must be less than or equal to <SOLUTION_CLUSTER_RMS>.
Default value: 0.7 Å
Reasonable range: ≥ 0 Å

Name (type): <SOLUTION_CLUSTER_ANGLE> (floating-point)
Description: To avoid clustering of solutions which differ only in the vectors to which the subsequent fragments are connected, two conditions must be fulfilled for each such pair of vectors. Firstly, the angle between the vectors must be less than or equal to <SOLUTION_CLUSTER_ANGLE>.
Default value: 0.25 rad
Reasonable range: 0 – 2π rad

Name (type): <SOLUTION_CLUSTER_DIST> (floating-point)
Description: Secondly, the distance between the endpoints of the vectors must be less than or equal to <SOLUTION_CLUSTER_DIST> (see above for a full explanation of the parameter).
Default value: 1.0 Å
Reasonable range: ≥ 0 Å

10.4.7 Applying filter functions

Name (type): <FILTER_1_5_REPULSION> (integer)
Description: The repulsion filter checks the superposition solutions and may erase some of them. If <FILTER_1_5_REPULSION> equals 1, the filter is applied.
Default value: 0
Reasonable range: 0 or 1

10.4.8 Combinatorial alignment parameters

The following parameters are used by FlexSc for combinatorial library alignment only. See section 7.2.3 for a description of FlexSc.

Name (type): <SCORE_TABLE_SIZE> (integer)
Description: During a combinatorial alignment run, FlexSc internally keeps only the scores achieved for the highest ranking placements per library molecule. <SCORE_TABLE_SIZE> controls the number of scores stored. If the scores achieved for all placements should be kept, set the flag <KEEP_ALL_SCORES_ACHIEVED> to '1'.
Default value: 5
Reasonable range: 1 – 5

Name (type): <STORE_PLACEMENT_THRESHOLD> (floating-point)
Description: During combinatorial superposition, placement files are created only if the score of the highest ranking placement is below <STORE_PLACEMENT_THRESHOLD>.
Default value: 0.0
Reasonable range: ≤ 0.0

Name (type): <STORE_PLACEMENT_MODE> (integer)
Description: <STORE_PLACEMENT_MODE> defines the file format and the number of placements stored for each library molecule in a combinatorial alignment run. If set to '-1', a pdf file containing the whole set of placements is stored. If set to a value k > 0, a multi-mol2 file is written containing the k highest ranking placements. If set to '0', no files will be created.
Default value: 0
Reasonable range: -1 – 500

Name (type): <MAX_NOF_FILE_WRITE> (integer)
Description: <MAX_NOF_FILE_WRITE> controls the number of files created. If set to '-1', a file is created for each molecule (in accordance with the settings of <STORE_PLACEMENT_THRESHOLD> and <STORE_PLACEMENT_MODE>). If set to '0', all placements are written to a single multi-mol2 file (provided that <STORE_PLACEMENT_MODE> > 0). If set to k > 0, FlexSc keeps only the placement files of the k highest scoring library molecules.
Default value: 0
Reasonable range: ≥ -1

Name (type): <KEEP_ALL_SCORES_ACHIEVED> (integer)
Description: <KEEP_ALL_SCORES_ACHIEVED> controls the storage of placement solutions during combinatorial alignment. If set to '1', FlexSc keeps the scores achieved for all placements (only for the placement solutions of EXTENDMR and PLACESEQ, see section 7.2.3). If <KEEP_ALL_SCORES_ACHIEVED> is set to '0', FlexSc keeps the number of highest ranking scores that is defined by the flag <SCORE_TABLE_SIZE>. If EXTRACTTOP or WRITESOL is used (see section 7.2.3), <KEEP_ALL_SCORES_ACHIEVED> has to be set to '1'.
Default value: 0
Reasonable range: 0 or 1

10.4.9 Miscellaneous

Name (type): <MAX_NOF_RLIG> (integer)
Description: Maximum number of reference ligands to be handled simultaneously in FlexS's workspace.
Default value: 8192
Reasonable range: 1 – 32768

Name (type): <MAX_NOF_TLIG> (integer)
Description: Maximum number of test ligands to be handled simultaneously in FlexS's workspace.
Default value: 8192
Reasonable range: 1 – 32768

Name (type): <RESTRICTED_RMS_MODE> (integer)
Description: RMSD calculation restricted to: 0: no restriction, 1: atom centers inside reference ligand volume, or 2: atoms intersecting reference ligand volume.
Default value: 0
Reasonable range: 0 – 2

Name (type): <PROBE_RADIUS> (floating-point)
Description: Determines the probe radius for surface computation. A reference ligand atom is a surface atom if a sphere of radius <PROBE_RADIUS> can be moved from infinity to the surface of the van der Waals sphere of this atom without colliding with any reference ligand atom (molecular surface).
Default value: 1.4 Å
Reasonable range: 0.8 – 3.0 Å

Name (type): <RIF_LJPOT_FILTER> (integer)
Description: Filter value for the RIF alignment (see 6.8.1.4): if the Lennard-Jones potential of an aligned solution is greater than <RIF_LJPOT_FILTER>, the solution will be skipped.
Default value: 1000
Reasonable range: 100 – 50000

10.5 *Chemical parameters (chempar.dat)

The chempar static data file contains global chemical knowledge such as van der Waals radii, bond lengths to hydrogens, etc.

10.5.1 Van der Waals radii

After the record identifier @atom_vdw_radius, the van der Waals radii are defined for all atom types known in FlexS.

@atom_vdw_radius
hydrogen 1.2
carbon 1.7
:
:

FlexS uses the united atom radii model. This means hydrogens are not directly involved in overlap tests. The van der Waals radius of a carbon atom is increased by <UNITED_ATOM_CORRECTION> for each bonded hydrogen to compensate for this. For intramolecular overlap tests, hydrogens are considered explicitly (since version 1.5).
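As a rough illustration of the record layout and of the united atom correction, the following sketch reads name/value lines that follow a record identifier and then computes a corrected carbon radius. The function names are hypothetical, the treatment of `#` as a comment marker is an assumption carried over from the other static data files, and the correction value used in the example is a placeholder, since the actual value of <UNITED_ATOM_CORRECTION> is not given here.

```python
def parse_radius_record(text, record="@atom_vdw_radius"):
    """Collect '<name> <value>' lines that follow the given record identifier."""
    radii = {}
    in_record = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not line:
            continue
        if line.startswith("@"):
            in_record = (line == record)
            continue
        parts = line.split()
        if in_record and len(parts) == 2:
            try:
                radii[parts[0].lower()] = float(parts[1])
            except ValueError:
                pass  # skip lines that are not '<name> <number>'
    return radii


def united_atom_radius(base_radius, n_hydrogens, correction):
    """Radius of a carbon enlarged by `correction` per bonded hydrogen."""
    return base_radius + correction * n_hydrogens


example = "@atom_vdw_radius\nhydrogen 1.2\ncarbon 1.7\n"
radii = parse_radius_record(example)
# 0.1 Å is a placeholder for <UNITED_ATOM_CORRECTION>, not the FlexS value.
methyl_radius = united_atom_radius(radii["carbon"], 3, 0.1)
```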

10.5.2 Bond length of heavy atoms

If the SYBYL atom types are known, a table of standard bond lengths for connected atoms with a known SYBYL atom type is provided (@sybyl_bond_length). SYBYL types are not available for all elements, however. In this case the bond length can be estimated from the covalent radii and the bond order using a relation due to Linus Pauling. Conversely, the bond order can be estimated from the bond length and the element symbols of the connected atoms. The table of @covalent_radii is used for this.
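The manual does not spell out the Pauling relation it uses. One commonly quoted form is d(n) = r1 + r2 − c · log10(n), with c = 0.71 Å (a value of 0.60 Å is also quoted in the literature). The sketch below uses that form purely as an assumption; the function names and the carbon radius of 0.77 Å are illustrative and not taken from chempar.dat.

```python
from math import log10

PAULING_C = 0.71  # Å; assumed constant, FlexS may use a different one


def bond_length(r1, r2, order, c=PAULING_C):
    """Estimate a bond length (Å) from two covalent radii and a bond order."""
    return r1 + r2 - c * log10(order)


def bond_order(r1, r2, length, c=PAULING_C):
    """Invert the relation: estimate the bond order from an observed length."""
    return 10 ** ((r1 + r2 - length) / c)


R_CARBON = 0.77  # Å; illustrative covalent radius, not from the manual
single = bond_length(R_CARBON, R_CARBON, 1)
double = bond_length(R_CARBON, R_CARBON, 2)
```

With these assumptions a single bond equals the plain sum of the radii, and higher bond orders give shorter bonds.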

10.5.2.1 Covalent radii

The syntax is identical to @atom_vdw_radius. For all elements that can be part of a covalent bond a covalent radius is given. Any atom not listed here cannot be part of a covalent bond.

@atom_covalent_radius
* 0.0
HYDROGEN 0.300
LITHIUM 1.225
BERYLLIUM 0.889
:
:

10.5.2.2 SYBYL bond length

The @sybyl_bond_length table specifies the standard length for two connected atoms with a defined SYBYL atom type and a known SYBYL bond type. The syntax of each line is

@sybyl_bond_lengths
# syntax: <type atom 1> <bond type> <type atom 2> <length>
# Bond lengths taken from:
# Allen, Kennard, ... J. Chem. Soc.
# Perkin Trans. II (1987) S1 - S19
H 1 C.3 1.090
H 1 C.2 1.080
H 1 C.1 1.050
H 1 N.3 1.009
H 1 N.2 0.970
:
:

10.5.3 Atom types

In this section the atom types are defined; the definition is analogous to the definition of the SYBYL atom types. The user should not make any changes to this table. Each type has a numerical ID, a chemical element, a hybridization state (known hybridization states are S, UN, SP1, SP2, SP3, AR), a text name and a symbol that is used for writing to mol2 files. These redundant entries can be used to define special types for FlexS that are not compatible with the Tripos definitions, but for writing mol2 files a compatible string can be given (dummy atoms, Du for example). All atoms that do not match one of these types get an E.xx label, where xx is the two character element symbol. We chose this because it is more informative than Du, which is currently the Tripos default.

Example

#ID element hybrid name mol2
1   CARBON  SP3    C.3  C.3
2   CARBON  SP2    C.2  C.2
3   CARBON  AR     C.ar C.ar
:   :       :      :    :

10.5.4 Valence states

In this section the permitted valence states are defined for most elements that are relevant for molecule initialization. The valence state definitions are important for the calculation of formal charges, the number of attachable hydrogens, the calculation of pi-electrons for aromaticity perception and the check of illegal or unusual valence states in molecules (e.g. 5 binding carbons). As an example, the valence state definition for carbon looks like this:

CARBON 3 4 4
state 4 0 0
state 3 0 +1 [$([#6](N)(N)[N,c])] # allowed in some cases (C.cat)
state 3 2 -1
state 2 2 -2
state 1 0 -3
state 0 4 -4
state 5 0 +1 [$(c(:a)(:a)=[#8,#7;r])] # in some ring systems OK

The first line of a valence state definition contains the element name in capital letters, the maximum bond order (carbon can form single, double and triple bonds, so this is 3), the possible number of valences (here 4) and the number of valence electrons (4).

The number of possible valences and valence electrons may be identical in most cases, but sometimes not all valence electrons can be used to form covalent bonds. The number of valence electrons is further used to fill up hydrogens during protonation. If no special rules are given for valence states, the difference between the number of valence electrons and the valence found on a given atom is used as the number of missing hydrogens or as the formal charge, respectively.

As an example, take sulfur, which can provide six valence electrons, but for protonation only two are relevant, so the number of valence electrons is two. The valence states 4 and 6 are allowed as well, but they are constrained by SMARTS™ patterns that allow only bonds to electronegative partners. If 6 valence electrons were used, a sulfur with up to 6 hydrogens might result when protonating a sulfide ion, whereas the addition of up to two hydrogens would be correct.

The following lines describe the possible states in more detail. The first argument after the state keyword is the valence state (here, a carbon with 4 valences is defined), the second argument is the number of free electrons in this state, and the third argument is the formal charge that an atom of that state will have. Optionally, a fourth argument can be given which contains a SMARTS™ pattern where this state is allowed. A negative formal charge (argument 3) can be neutralized by hydrogens. For carbon there are two valence states defined for the valence 3. The first one results in a formal charge of +1 and is only allowed in a guanidine-like environment, which represents a C.cat carbon. Usually a carbon with three valences has a free valence, so the formal charge is -1, which means that one hydrogen will be added upon neutralization.

Note: Valence states are evaluated in order of occurrence, so constrained states must occur prior to unconstrained states. For the SMARTS™ patterns some rules (X, h, +, -) are not allowed here because they depend on these states themselves and would lead to a recursive test of valence states. They will cause an error message if used here.

A very special state is the valence state 5, which is illegal by default. In mol2 files, however, there are very often aromatic rings with exocyclic carbonyls which would result in a valence state of 5. It seems to be common practice to use such structures. To take them into account, this line was added. If you want to be more strict and to receive a warning for such structures, just comment the line out.

Note: This section does not contain all elements. If you are working heavily with ligands containing elements with no valence states defined here, feel free to add the necessary lines.
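To make the field layout of a valence state definition concrete, here is a minimal reading sketch. It is not the FlexS parser; the function name and the dictionary keys are hypothetical. One detail worth noting: SMARTS patterns themselves contain `#` (as in `[#6]`), so the sketch only treats a `#` preceded by whitespace as the start of a comment, which is an assumption about the file syntax.

```python
import re

CARBON_BLOCK = """\
CARBON 3 4 4
state 4 0 0
state 3 0 +1 [$([#6](N)(N)[N,c])] # allowed in some cases (C.cat)
state 3 2 -1
state 2 2 -2
state 1 0 -3
state 0 4 -4
state 5 0 +1 [$(c(:a)(:a)=[#8,#7;r])] # in some ring systems OK
"""


def parse_valence_block(text):
    """Return (header, states) for one element's valence state definition."""
    header, states = None, []
    for raw in text.splitlines():
        # Strip trailing comments; '#' inside a SMARTS token is left alone.
        line = re.sub(r"\s#.*$", "", raw).strip()
        if not line:
            continue
        parts = line.split()
        if parts[0] == "state":
            states.append({
                "valence": int(parts[1]),
                "free_electrons": int(parts[2]),
                "formal_charge": int(parts[3]),
                "smarts": parts[4] if len(parts) > 4 else None,
            })
        else:
            header = {
                "element": parts[0],
                "max_bond_order": int(parts[1]),
                "valences": int(parts[2]),
                "valence_electrons": int(parts[3]),
            }
    return header, states


header, states = parse_valence_block(CARBON_BLOCK)
```

Because the states are kept in file order, a consumer of this list can apply the rule that constrained states are tested before unconstrained ones simply by iterating over it.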

10.6 *Interaction types and compatibilities (contype_sp.dat)

The contype_sp static data file defines the different interaction types and possible matches between them.

The first record defines the interaction (or contact) types and groups them into sets:

@contact_types
<type1.1> <type1.2> ... <type1.N> [ | <type2.1> ... <type2.M> ]

The types <type1.1>, ..., <type1.N> and the types <type2.1>, ..., <type2.M>, separated by a |-character, are collected in two groups. The second group is the countergroup of the first one, i.e. every type from the second group is a potential counter interaction type for the types of the first group and vice versa. Only interactions within the same group can be matched. If the optional part containing the second group is missing, the first group is its own countergroup, i.e. every type of the group can be matched with every type from the group itself.

Example

@contact_types
# hydrogen bonding
h_don | h_acc

# hydrophobic interactions
phenyl_center
amide
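The group and countergroup rule can be illustrated with a small sketch that builds a lookup of admissible counter types from @contact_types lines. The function names are hypothetical and the code only mirrors the rule as described above (a `|` separates a group from its countergroup; a line without `|` is its own countergroup); it does not take the @noncompatibility record into account.

```python
def parse_contact_types(lines):
    """Map each interaction type to the set of types it may be matched with."""
    counter = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("@"):
            continue
        left, sep, right = line.partition("|")
        first = left.split()
        # Without a '|' part the group is its own countergroup.
        second = right.split() if sep else first
        for t in first:
            counter.setdefault(t, set()).update(second)
        for t in second:
            counter.setdefault(t, set()).update(first)
    return counter


def can_match(counter, type_a, type_b):
    """True if type_b is a counter type of type_a."""
    return type_b in counter.get(type_a, set())


example_lines = [
    "@contact_types",
    "# hydrogen bonding",
    "h_don | h_acc",
    "",
    "# hydrophobic interactions",
    "phenyl_center",
    "amide",
]
table = parse_contact_types(example_lines)
```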

Each interaction type is assigned to a specific placement level. The placement level determines how the interaction is used in the placement algorithms. We distinguish between four levels:

Level  Meaning
0      Inactive. These interactions are not used.
1      Geometrically less restrictive. These interactions are very unspecific from the geometric point of view as well as from the chemical point of view (they occur frequently). This level should contain unspecific hydrophobic contacts.
2      Geometrically less restrictive. These interactions are still unspecific from the geometric point of view but occur less frequently. This level should contain interactions between aromatic rings, etc.
3      Geometrically restrictive. These interactions are geometrically restrictive ones which are preferably used in the placement algorithms. This level should contain hydrogen bonds, salt bridges, etc.

For each base fragment FlexS decides which interaction levels should be used for fragment placement in order to achieve reliability as well as time efficiency. In the complex construction phase, only level 3 and level 2 interactions are used.

Example

@placement_level
# --------------------------------------------------------------------------
# hydrogen bonding / salt bridges
h_acc 3
h_don 3
metal 0
metal_acc 0
# hydrophobic interactions
phenyl_center 2
ch3_phe 0
phenyl_ring 0
amide 2
ch 0
ch2 0
ch3 0
sulfur 0
aro 0
ref_atom_match 3

The next record can be used to prohibit special matches between interactions.

@noncompatibility
<type1> <type2>
[<type3> <type4>
... ]

Thus, types can be put into groups, even if there are specific combinations which are energetically unfavorable.

The last record in this static data file defines a minimum van der Waals radius for each interaction type. This should be the minimum radius that an arbitrary group which can be assigned this interaction type can have. The radius is expected to be in Å.

@min_vdw_radius
<type1> <radius>
[<type2> <radius>
...]

If no radius is assigned, 1.0 Å is assumed.

Example

@min_vdw_radius
# hydrogen bonding
h_don 0.9
metal 1.39
h_acc 1.42

# hydrophobic interactions
phenyl_center 2.5
phenyl_ring 2.0
ch3_phe 2.0
amide 2.5

10.7 *Interaction geometries (geometry_sp.dat)

The static data file geometry_sp.dat contains the description of interaction geometries. Together with the interaction types defined in contype_sp.dat, they will be mapped to the reference and the test ligand.

10.7.1 Associating interaction geometries with molecular groups

The way interaction geometries are associated with molecular groups is essential for understanding the records defining the interaction geometries. We will therefore explain this first in this section.

The assignment of an interaction geometry to a molecular group contains a list of up to four atoms, named $a_0, \ldots, a_3$ in the following. These atoms have positions in space $\vec{c}_0, \ldots, \vec{c}_3$, which will be used to derive two different bases (local coordinate systems) (see also Figure 10.1):

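Base e:
$$\vec{e}_1 = \vec{c}_1 - \vec{c}_0, \qquad \vec{e}_2 = \vec{c}_2 - \vec{c}_0, \qquad \vec{e}_3 = \vec{c}_3 - \vec{c}_0 \quad (\vec{e}_3 = \vec{e}_1 \times \vec{e}_2 \text{ if } \vec{c}_3 \text{ is undefined})$$

Base b:
$$\vec{b}_1 = \vec{e}_1, \qquad \vec{b}_2 = \vec{e}_1 \times \vec{e}_2, \qquad \vec{b}_3 = \vec{b}_1 \times \vec{b}_2$$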

Figure 10.1: Definition of the e- and b-bases for a carboxylate group.

The origin of both bases is $\vec{c}_0$ but can be redefined by the center entry during the definition of the interaction geometry. Note that only base b is guaranteed to be orthogonal by construction. Furthermore, none of these 6 vectors are normalized in general. The three atoms must not lie on a straight line. If only two atoms are specified, only e1 and b1 are defined. If only one atom is specified, only the origin is defined.
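The construction of the two bases can be checked numerically with a few lines of code. The sketch below follows the definitions given in this section (differences to c0 for base e, cross products for base b); the helper names and the sample coordinates are illustrative only.

```python
def sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def local_bases(c0, c1, c2, c3=None):
    """Return ((e1, e2, e3), (b1, b2, b3)) for atom positions c0..c3."""
    e1 = sub(c1, c0)
    e2 = sub(c2, c0)
    # e3 falls back to e1 x e2 when no fourth atom is given.
    e3 = sub(c3, c0) if c3 is not None else cross(e1, e2)
    b1 = e1
    b2 = cross(e1, e2)
    b3 = cross(b1, b2)
    return (e1, e2, e3), (b1, b2, b3)


# Illustrative coordinates; the three atoms do not lie on a straight line.
(e1, e2, e3), (b1, b2, b3) = local_bases((0, 0, 0), (1, 0, 0), (1, 1, 0))
```

In this example e1 and e2 are not perpendicular, while b1, b2 and b3 are pairwise orthogonal, which matches the remark that only base b is orthogonal by construction.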

10.7.2 Defining interaction geometries

An interaction geometry is always a part of a spherical surface. It can be constructed from cones, cone sections, or spherical rectangles:

@geometry <geom_name>
radius <rad>
delta <polar incr> <azimuth incr>
[center <center vector>]
<geom specifier>
[<geom specifier> ...]
energy <opt energy>
[charge_scaling <c factor> <c threshold>]
[distance_scaling <d threshold1> <d threshold2>]
[angle_scaling <a threshold1> <a threshold2>]

A <geom specifier> can be one of the following lines:

sphere
cone <basis> <vec1> <polar1> <polar2>
s_area <basis> <vec1> <vec2> <azimuth1> <azimuth2> <polar1> <polar2>

The name of the geometry <geom_name> can be an arbitrary string and is used for referencing the geometry in the contact_sp.dat file (see 10.9). The entries of the @geometry record are:

radius defines the radius of the sphere where the interaction surfaces are located.

delta defines the step size for approximation of the surfaces by discrete point sets. The values are the maximum arc lengths between two consecutive points in the polar direction and azimuth direction respectively, measured in Å. The computation of the discrete point set for an interaction surface works as follows: first the polar angle interval is divided by an odd number of circles having a distance of at most <polar incr> Å on the spherical surface. Each circle (circle section in the case of spherical rectangles) is divided by an odd number of points having a distance of at most <azimuth incr> on the spherical surface. The distance from the border line of the interaction surface is less than or equal to half of <polar incr> or <azimuth incr>, respectively.

center defines the center of the sphere. If not specified, the origin of the local coordinate system is assumed to be the center. For the center definition the coordinate system e with origin c0 is used.

sphere defines the whole sphere to be the interaction surface. The <geom specifier>s defined in a @geometry record should describe disjoint surface parts of the sphere. Thus, if the whole sphere is chosen, no other <geom specifier> should occur in this @geometry record.

cone defines a cone (-section) as the interaction surface. The axis of the cone is given by the vector <vec1> in the local coordinate system <basis>. The cone section is delimited by the angles <polar1> < <polar2>. Setting <polar1> to 0 yields a closed cone (see Figure 10.2).

s_area defines a spherical rectangle as the interaction surface. <vec1>, <polar1>, <polar2> define a cone (-section) as above. <vec2> specifies the zero direction for the azimuth and must not be collinear with <vec1>. The part of the cone section lying between <azimuth1> < <azimuth2> in a right-handed rotation around <vec1> starting at <vec2> is the defined spherical rectangle (see Figure 10.2).

energy defines the energy contribution of this interacting group to the interaction geometry in the ideal case. Note that the interaction energy for a match is distributed among the two interaction partners.

charge_scaling defines the factor <c factor> by which the energy is multiplied if the product of the formal charges is less than the charge threshold <c threshold> (see section 10.7.3).

distance_scaling defines the two threshold values for the scaling factor for distance deviations (see section 10.7.3).

angle_scaling defines the two threshold values for the scaling factor for angle deviations and an optional interaction surface scaling factor (see section 10.7.3).

Note that lengths must be given in Å, angles in degrees, and energies in kJ/mol.

Figure 10.2: Definition of cones and spherical rectangles. The main direction corresponds to <vec1>, the zero direction to <vec2>, the φ angles to <polar>, and the ω angles to <azimuth>. In the right figure, the main direction is perpendicular to the drawing plane.

Example

@geometry coo- # Hydrogen acceptor geometry on a carboxylate oxygen
# c1=o, c2=c, c3=o
radius 1.8
delta 1.6 1.6
s_area b -1 0 0 0 1 0 -50 50 40 80
s_area b -1 0 0 0 1 0 130 230 40 80
energy -2.35
charge_scaling 1.766 -0.15
distance_scaling 0.3 0.7
angle_scaling 75 140

@geometry o/nh # Hydrogen donor geometry
# c1=o/n, c2=h
radius 1.9
delta 10.0 10.0
center 1.0 0.0 0.0 # centered in H
cone b 1 0 0 0 50
energy -2.35
charge_scaling 1.766 -0.15
distance_scaling 0.3 0.7
angle_scaling 30 80

10.7.3 Computing the scoring contributions of matched interaction groups

The energy contribution of a match between two interaction geometries g1 and g2 is com-puted as follows:The ideal energy of the match is the sum of the <opt energy> values defined in the entry’energy’ of g1 and g2. This value is scaled with 4 factors: a charge factor, a length deviationfactor, and two angle deviation factors. Let ~c1,~c2 be the centers of the interaction groupsand ~m be the matchpoint at the hypothetical receptor side in the global coordinate system.There are different penalty factors and deviation computation schemes used for long-rangeinteractions (radius>2Å).

• If the product of the formal charges of the first matched atoms a1 in g1 and g2, respectively, is less than or equal to the charge threshold <c threshold>, the energy is multiplied by the charge factor <c factor>.

• The distance deviation is computed as

$$d_{dev} = \bigl|\,|\vec{c}_1 - \vec{m}| - g_1.\mathrm{radius}\,\bigr| + \bigl|\,|\vec{c}_2 - \vec{m}| - g_2.\mathrm{radius}\,\bigr| \,/\, 2$$

The scaling factor is

$$f_d = \begin{cases} 1 & : d_{dev} \le d_1 \\ 1 - \dfrac{d_{dev} - d_1}{d_2 - d_1} & : d_1 < d_{dev} < d_2 \\ 0 & : d_{dev} \ge d_2 \end{cases}$$

where d1 and d2 are the distance thresholds <d threshold1>, <d threshold2> defined in the entry 'distance_scaling' of g1. The distance thresholds in g1 and g2 should be the same.

• The angle deviation is computed at each of the two interacting groups and from the point of view of the hypothetical receptor side. Here, the formulas are given for geometry g1. $\vec{d}$ is the main direction of the geometry: in the case of a cone, this is the axis of the cone; in the case of a spherical rectangle, this is the vector to the geometric center of the rectangle. $\varphi_1$ corresponds to the parameter <polar1> in the cone definition.

$$a_{dev} = \begin{cases} 0 & : \text{if interaction surface type = sphere} \\ |\angle(\vec{c}_2 - \vec{c}_1, \vec{d}) - \varphi_1| & : \text{if interaction surface type = cone} \\ \angle(\vec{c}_2 - \vec{c}_1, \vec{d}) & : \text{if interaction surface type = s\_area} \end{cases}$$

For the hypothetical receptor side located at match-point $\vec{m}$ the angle deviation is computed as:

$$a_{dev} = |\angle(\vec{c}_1 - \vec{m}, \vec{c}_2 - \vec{m})|$$

Then the scaling factor is computed as above:

$$f_a = \begin{cases} 1 & : a_{dev} \le a_1 \\ 1 - \dfrac{a_{dev} - a_1}{a_2 - a_1} & : a_1 < a_{dev} < a_2 \\ 0 & : a_{dev} \ge a_2 \end{cases}$$

where a1, a2 are the angle thresholds defined in the entry 'angle_scaling' of g1.
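The distance and angle scaling factors share the same piecewise-linear shape, which can be sketched in a few lines of Python. This is only an illustration and not FlexS code: the function name linear_scaling is an assumption, and the thresholds are the distance_scaling and angle_scaling values from the coo- example above.

```python
def linear_scaling(dev, t1, t2):
    """Return 1 up to t1, 0 from t2 on, and a linear ramp in between."""
    if t2 <= t1:
        raise ValueError("thresholds must satisfy t1 < t2")
    if dev <= t1:
        return 1.0
    if dev >= t2:
        return 0.0
    return 1.0 - (dev - t1) / (t2 - t1)


# distance_scaling 0.3 0.7 (coo- example), deviation of 0.5 Angstrom
f_d = linear_scaling(0.5, 0.3, 0.7)
# angle_scaling 75 140 (coo- example), deviation of 107.5 degrees
f_a = linear_scaling(107.5, 75.0, 140.0)
print(f_d, f_a)
```

Both example deviations lie halfway between their thresholds, so each factor is about one half; deviations at or below the first threshold are not penalized at all.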

10.8 *Assigning data to the ligands: the subgraph data files

Ligands can be arbitrary organic molecules. Thus, assigning additional physico-chemical data to the ligand is complicated. In FlexS, a subgraph matcher is provided for this task. The patterns that can be matched to the ligand are defined in subgraph data files.

There are different kinds of information which will be assigned by this mechanism: formal charges to atoms (section 10.11), interaction types and geometries to interacting groups (section 10.9), and torsion angle patterns to rotatable single bonds (section 10.10). This section contains a description of the definition of subgraphs which is independent of the data assigned to them.

10.8.1 Defining groups of atoms

With the @defgroup records at the beginning of a subgraph file, you can combine atom types to form groups.

@defgroup <grp name> <atom type 1> [<atom type 2> ...]

An atom type can be an element of more than one group. The group names used must be unique. The atom types follow the SYBYL notation [19]. Three groups are predefined: the group 'R' contains all atom types except hydrogen, the group 'RH' represents an arbitrary atom type, and the group 'RX' contains all atom types except hydrogen and carbon. Note that in subgraph files no wildcards are allowed, but you can use the asterisk character in the definition of groups. Recursive definition of groups is not possible. The group mechanism does not work for bonds.

Example

@defgroup N.* N.1 N.2 N.3 N.ar N.am N.pl3 N.4
@defgroup N2ar N.2 N.ar N.pl3
# hydrogen donor atom types
@defgroup DON O.2 O.3 N.1 N.2 N.3 N.ar N.am N.pl3 N.4
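To make the record structure concrete, the following Python sketch collects @defgroup records into a dictionary and rejects duplicate group names. It is not the FlexS reader: the function name parse_defgroups is hypothetical, and treating '#' as the start of a comment is an assumption taken from the example above.

```python
def parse_defgroups(text):
    """Collect @defgroup records into {group name: [atom types]}."""
    groups = {}
    for raw in text.splitlines():
        # Assumption: '#' starts a comment, as in the manual's example.
        fields = raw.split("#", 1)[0].split()
        if not fields or fields[0] != "@defgroup":
            continue
        if len(fields) < 3:
            raise ValueError(f"incomplete @defgroup record: {raw!r}")
        name, atom_types = fields[1], fields[2:]
        if name in groups:
            raise ValueError(f"group name defined twice: {name}")
        groups[name] = atom_types
    return groups


EXAMPLE = """\
@defgroup N.* N.1 N.2 N.3 N.ar N.am N.pl3 N.4
@defgroup N2ar N.2 N.ar N.pl3
# hydrogen donor atom types
@defgroup DON O.2 O.3 N.1 N.2 N.3 N.ar N.am N.pl3 N.4
"""

groups = parse_defgroups(EXAMPLE)
print(sorted(groups))
```

Note that the atom type N.2 ends up in all three groups, which is permitted; only the group names must be unique.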

10.8.2 Defining subgraphs

The second part of a subgraph file contains the subgraph definitions themselves. Each such definition starts with the keyword @subgraph followed by three parameters. The record consists of a list of atoms, a list of bonds and the data part:

@subgraph <class> <priority> <subgraph name>
<atom specification>
[<atom specification> ...]
<bond specification>
[<bond specification> ...]
data
<data area>
end

The parameter <class> allows you to divide the defined subgraphs into a set of classes. Definition of the classes can significantly reduce the number of subgraph matches you need to perform. Usage of the classification must be supported by the program part which is responsible for the evaluation of the data part. The classification scheme is therefore explained in the following sections, where the four instances of subgraph files in FlexS will be explained.

Subgraph definitions need not be disjoint, i.e. a subgraph can be a subgraph of another subgraph. This is an important feature, because it enables you to define more global subgraphs such as default rules, and specializations of them. The subgraph matcher tries to match every defined subgraph, even if a first hit has already occurred. This can be avoided by assigning priority numbers in the <priority> parameter. If a subgraph with a high priority is found, subgraphs with a lower priority will not be matched any more. Priority numbers are non-negative values; the priority increases with the value, i.e. priority 0 is the lowest possible priority.

The <subgraph name> is an arbitrary string (without newlines, blanks, tabs) which will be printed in FlexS's control output.

An atom specification consists of the keyword atom, a consecutive atom number, starting with 1, a type specifier and additional optional specifiers.

atom <atom no> <type specifier> [charge <op> <charge>] \
     [nof_bonds <op> <nof bonds>] [excl_match]

The atom number is used to reference the atom in the following bond specifications and in the data area. The type specifier can be a group, previously defined by a @defgroup record, or a SYBYL atom type [19].


There are three types of optional additional specifiers. The charge specifier enables you to restrict the matched atoms to atoms with a specific formal charge; <op> can be one of ==, >=, <=, >, <, or !=, applied to the given value. With the nof_bonds specifier you can restrict the set of matched atoms to atoms with the specified number of bonds. If excl_match (exclusive match) is specified, the atom is blocked for subgraph matchings with the same subgraph.

The bonds are specified as follows:

bond <source atom no> <dest atom no> <bond type> [<dest atom no> \
     <bond type>...]

The <source atom no> is the number of the first atom to which the bond is attached. A list of destination atom numbers alternated with bond types then follows. Thus, a set of bonds starting at atom <source atom no> can be defined in one bond specification. The bond types follow the SYBYL notation [19]; no groups for bonds are allowed. The bond type 'un' can be matched to arbitrary bonds. Each bond must only be defined once, from atom a to atom b or vice versa.

Example

@subgraph 0 1 Acceptor_N
atom 1 N2ar CHARGE 0.0 NOF_BONDS 2
atom 2 R
atom 3 R
bond 1 2 UN 3 UN
data
...
end

Example

@subgraph 0 1 Phenyl_group
atom 1 C.ar
atom 2 C.ar EXCL_MATCH
atom 3 C.ar EXCL_MATCH
atom 4 C.ar EXCL_MATCH
atom 5 C.ar EXCL_MATCH
atom 6 C.ar EXCL_MATCH
bond 1 2 ar 6 ar
bond 3 2 ar 4 ar
bond 5 4 ar 6 ar
data
...
end

The contents of the <data area> depend on the instance of the subgraph file and will be explained in the following two sections.
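The compact bond notation, in which one line lists several bonds leaving the same source atom, can be unpacked as sketched below. The helper names parse_bond_spec and collect_bonds are hypothetical, and lowercasing the bond types is an assumption based on the examples above using both 'UN' and 'ar'. The sketch also enforces the rule that each bond is defined only once, in either direction.

```python
def parse_bond_spec(line):
    """Expand one 'bond' line into (source, dest, bond type) triples."""
    fields = line.split("#", 1)[0].split()
    if len(fields) < 4 or len(fields) % 2 != 0 or fields[0].lower() != "bond":
        raise ValueError(f"malformed bond specification: {line!r}")
    source = int(fields[1])
    rest = fields[2:]  # alternating <dest atom no> <bond type>
    return [(source, int(rest[i]), rest[i + 1].lower())
            for i in range(0, len(rest), 2)]


def collect_bonds(lines):
    """Gather all bonds and reject a bond that is defined twice (a-b or b-a)."""
    seen = set()
    bonds = []
    for line in lines:
        for src, dst, btype in parse_bond_spec(line):
            key = frozenset((src, dst))
            if key in seen:
                raise ValueError(f"bond {src}-{dst} defined more than once")
            seen.add(key)
            bonds.append((src, dst, btype))
    return bonds


# Bond lines of the Phenyl_group example
phenyl = collect_bonds(["bond 1 2 ar 6 ar", "bond 3 2 ar 4 ar", "bond 5 4 ar 6 ar"])
print(phenyl)
```

The three lines of the Phenyl_group example expand to the six ring bonds, each of which appears exactly once.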

10.9 *Ligand interaction groups (contact_sp.dat)

The first application of the subgraph matcher is the detection of interaction groups in a molecule. The subgraphs for this task are defined in the subgraph file contact_sp.dat. The data area in the subgraph definition consists of a list of interactions provided as follows:


iact <atom no1> [<atom no2>] [<atom no3>] [<atom no4>] <contact type> \
     <ia geometry>

The atom numbers refer to the matched atoms from the subgraph in question. The <contact type> is one of the contact types defined in the static data file contype_sp.dat. The <ia geometry> is one of the interaction geometries defined in the static data file geometry_sp.dat. If the second, third or fourth atom is not defined, a '-' must be inserted instead.
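Because unused atom slots are filled with '-', every iact line carries exactly seven fields. The sketch below reads such a line; it is an illustration only, with the function name parse_iact and the contact type name h_don being hypothetical (the actual names are defined in contype_sp.dat). The geometry name o/nh is taken from the example in section 10.7.

```python
def parse_iact(line):
    """Split an 'iact' line into atom slots, contact type and geometry name."""
    fields = line.split()
    if len(fields) != 7 or fields[0] != "iact":
        raise ValueError(f"malformed iact line: {line!r}")
    # '-' marks an atom slot that is not defined
    atoms = [None if f == "-" else int(f) for f in fields[1:5]]
    if atoms[0] is None:
        raise ValueError("the first atom number is mandatory")
    return {"atoms": atoms, "contact_type": fields[5], "geometry": fields[6]}


# 'h_don' is a made-up contact type used only for this illustration
entry = parse_iact("iact 1 2 - - h_don o/nh")
print(entry)
```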

10.10 *Ligand torsion database (torsion_standard.dat)

The second subgraph data file in FlexS is the torsion_standard.dat or the torsion_fine.dat file. torsion_standard.dat contains subgraphs for the assignment of energetically favorable torsion angles to acyclic single bonds and is based on the old MIMUMBA model. torsion_fine.dat contains subgraphs for the assignment of 10 degree energy grids to acyclic single bonds and is based on the new MIMUMBA model without preselection of torsion angles.

The data area in the subgraph definitions of this file looks as follows:

tangle/te <angle> <energy>
[ tangle/te <angle> <energy> ...]
period <period>
symmetry <sym>

The first part is a list of torsion angles combined with energy values; each pair <angle>, <energy> is preceded by the keyword tangle/te. Then the periodicity of torsion angles (keyword period) and the symmetry (keyword symmetry) are defined. The set of torsion angles is extended in the following way. First, all angles are mirrored at the symmetry value. Then the angles are copied 360/<period> - 1 times; in the i-th copy, an offset of i * <period> is added to each torsion angle. An example can be found in Figure 10.3. All explicitly defined torsion angles (tangle/te) must be less than or equal to the symmetry value, and the period value must be a multiple of the symmetry value. Both values must be defined. Note that the subgraph matcher cannot distinguish between different stereoisomers.
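The mirroring and copying steps can be traced with a short Python sketch. It is not the FlexS implementation: the function name expand_torsions is hypothetical, integer angles are assumed, and wrapping the results into the range 0 to 359 degrees is an assumption made here to keep the grid within one full turn.

```python
def expand_torsions(entries, period, symmetry):
    """Mirror (angle, energy) pairs at `symmetry`, then repeat every `period`."""
    if period % symmetry != 0 or 360 % period != 0:
        raise ValueError("period must be a multiple of symmetry and divide 360")
    # Step 1: mirror every explicit angle at the symmetry value.
    mirrored = {}
    for angle, energy in entries:
        if angle > symmetry:
            raise ValueError("explicit angles must not exceed the symmetry value")
        mirrored[angle] = energy
        mirrored[2 * symmetry - angle] = energy
    # Step 2: the original set (i = 0) plus 360/period - 1 shifted copies.
    expanded = {}
    for i in range(360 // period):
        for angle, energy in mirrored.items():
            expanded[(angle + i * period) % 360] = energy
    return sorted(expanded.items())


# The entry plotted in Figure 10.3
grid = expand_torsions([(10, 5), (15, 3), (20, 0)], period=90, symmetry=45)
print(len(grid), grid[:6])
```

For the entry of Figure 10.3 the three explicit angles become six after mirroring at 45 degrees and 24 after repeating every 90 degrees.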

10.10.1 Constraining amides to planarity

For your convenience, with Release 2 we have predefined (but NOT activated) the enforcement of planar amides, i.e. the torsional angle R-(NH)-(CO)-R′ will be constrained to amount to 0° or 180°. To activate the respective settings, you must comment out the respective lines in torsion_standard.dat or torsion_fine.dat depending on what torsional database you chose to load in config_sp.dat. To find out what you have loaded, please edit config_sp.dat and look for the setting of TORSION.

We restricted the definition to those amides which carry an H at the amide nitrogen atom. You will find the definitions at the end of the torsional database files. Note that if you would like to extend or alter the definitions, you could as well use the SMARTS™ subgraph language. Here, the first three bonds of the SMARTS™ pattern define the torsion angle.


Figure 10.3: Example for a torsion.dat entry: This plot shows the potential for the following entry. The original values are plotted in black. These values are mirrored at 45 degrees (outlined boxes) and the resulting potential is repeated every 90 degrees (grey boxes).

tangle/te 10 5
tangle/te 15 3
tangle/te 20 0
period 90
symmetry 45


Figure 10.4: A simple diketo compound in mixed tautomeric form.

10.10.2 Fixing torsional angles at specified values: a sample case

Let us clarify this by means of a simple example.

We assume we want to constrain the superposition of a diketo compound flanked by an aromatic atom to planarity.¹

One way of doing this is to define a suitable subgraph including the assignment of a few, low-energy grid points on the grid of allowed torsion angles. In the following, we will assume we want to work with the following settings: TORSION_MODE should equal "0", and the value of TORSION (set in your configuration) should accordingly point to the torsion_standard.dat static data file. The idea is to "tighten" certain rotatable bonds.

It is a good idea to construct a sample compound. We have done so starting from a simple SMILES string (O=C(C=C(O)c1ccccc1)c2ccncc2).

In Fig. 10.4 we show what the compound looks like after reading it in with FlexS and calling a drawing command with FlexV.

Our task is to constrain the rotatable bonds C6–C4 and C2–C12. We can trace our progress by reading in the ligand with verbosity set to "10". The output contains lines such as:

>> Identification of rotatable bonds...

bond       C2|2 --- C12|12
          /     179      \
ref.  C3|3              C17|17
-----------------------------------------------------------------------------
--> (period 180) : 0 30 60 120 150
energy           : 0.0 1.0 9.0 9.0 1.0 0.0 1.0 9.0 9.0 1.0
<-- (period 360) : 0 30 60 120 150 180 210 240 300 330

...

¹ Since our case is a special extension of fragments already covered by the existing subgraph definitions in torsion_standard.dat, we did not include it in the distribution.


indicating that the bond between C2 and C12 read from the input file is currently associated with a torsional angle of 179 degrees. Further, the currently valid torsion angle grid points, including the listed energies, are given. Bear in mind that if there were no subgraph matchings onto our fragment of interest, the default procedure would have been applied: the torsion angle grid points would be distributed on a 30 degree grid. In that case, the output at verbosity "10" would look like this:

>> WARNING: Empty list of torsion angles at bond XX|X --> YY|Y .
>> WARNING: No torsions, taking 30 degree grid with arbitrary reference atoms.

bond       C4|4 --- O5|5
          /     143     \
ref.  C6|6             H19|19
-----------------------------------------------------------------------------
--> (period 360) : 0 30 60 90 120 150 180 210 240 270 300 330
energy           : 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
<-- (period 360) : 0 30 60 90 120 150 180 210 240 270 300 330

Now we must define a subgraph in such a way that the specified atoms match the desired rotatable bond and insert tangle/te values to which the fragment build-up procedure will be constrained. One possible solution is a subgraph weak enough to match both the keto and the enol side of such a compound depicted in Fig. 10.4. The first four atoms define the surrounding of the bond to be restricted to a planar environment. Some @defgroup statements allow for compressing the definition.

@defgroup C2 C.2 C.ar
@defgroup Aratm C.2 C.ar N.ar N.2 S.2 O.2 N.pl3
@defgroup Oketoenol O.co2 O.2 O.3
@defgroup Ca_ketoenol C.2 C.3 C.ar
#
@subgraph special 6 diketo_compound_flanked_by_aromatics
atom 1 Aratm
atom 2 Aratm
atom 3 C2 # a C bonded to keto-O
atom 4 Ca_ketoenol # the C in between two CO groups. might be CH2, or CH
atom 5 Oketoenol # either an hydroxyl-O or carbonyl-O
atom 6 C2 # bonded to enol-O
atom 7 Oketoenol # either an hydroxyl-O or carbonyl-O
bond 1 2 ar
bond 2 3 1 # the rotatable bond
bond 3 4 un # 1 or ar
bond 3 5 un # 2 or ar
bond 4 6 un # 1 or ar
bond 6 7 un # 1 or 2 or ar, depending on keto-enol tautomer.
data
tangle/te 0 0
tangle/te 5 1
tangle/te 175 1
tangle/te 180 0
period 360
symmetry 180
end

The priority of the subgraph must be high enough to overrule existing subgraph definitions, which is why we put a value of 6 in the @subgraph line (see section 10.8.2). Checking with FlexS reveals that the subgraph has been assigned properly:


bond C2|2 --> C12|12
Subgraph: diketo_compound_flanked_by_aromatics
Direction:-b Number of matches (different sets of torsion angles): 2 ( 2)

              1            2           3          4
Match 1: < C17|17 > < C12|12 > < C2|2 > < C3|3 >
Match 2: < C13|13 > < C12|12 > < C2|2 > < C3|3 >

bond       C2|2 --- C12|12
          /      0       \
ref.  C3|3              C13|13
-----------------------------------------------------------------------------
--> (period 180) : 0 5 175 180 185 355
energy           : 0.0 1.0 1.0 0.0 1.0 1.0
<-- (period 360) : 0 5 175 180 185 355

...
bond C4|4 --> C6|6
Subgraph: diketo_compound_flanked_by_aromatics
Direction:-b Number of matches (different sets of torsion angles): 2 ( 2)

              1           2          3          4
Match 1: < C11|11 > < C6|6 > < C4|4 > < C3|3 >
Match 2: < C7|7 > < C6|6 > < C4|4 > < C3|3 >

bond       C4|4 --- C6|6
          /      0      \
ref.  C3|3             C7|7
-----------------------------------------------------------------------------
--> (period 180) : 0 5 175 180 185 355
energy           : 0.0 1.0 1.0 0.0 1.0 1.0
<-- (period 360) : 0 5 175 180 185 355

The subgraph definition in fact matches from "both sides" and assigns energy values and torsion angle grid points as desired. We have plotted the allowed torsional grid values in Fig. 10.5. Also, we left a little flexibility of ±5 degrees, which in practice has produced better results and is probably also more realistic.

10.11 *Ligand formal charges (fcharges.dat)

Note: The static data file fcharges.dat comes with FlexS 2.1 only for backward compatibility; in further versions only the rules in transform.dat will be supported.

If the flag ASSIGN_FORMAL_CHARGES equals 1, subgraphs from the fcharges file are mapped onto the molecules in order to assign formal charges to the atoms. The data section of subgraphs in fcharges consists of a charge value which is assigned to the ligand atom mapped to atom 1 of the subgraph.

10.12 *Automatic correction of localized systems (delocalized.dat)

Figure 10.5: Allowed torsional angles and associated energies for the subgraph example. The respective points are coloured in red; grid points above 180 degrees are generated due to the symmetry entry in the subgraph definition.

Note: The static data file delocalized.dat comes with FlexS 2.1 only for backward compatibility; in further versions only the rules in transform.dat will be supported.

If the flag ASSIGN_DELOCALIZED equals 1, subgraphs from the deloc file are mapped onto the ligand in order to automatically change localized systems into delocalized ones. The subgraph file contains definitions for both directions (converting localized to delocalized and vice versa). Subgraphs for converting from localized to delocalized belong to class 1, otherwise to class 2.

The data section contains one line for each atom to be modified:

<atom no> <new type> <new fcharge> [<bond to> <new bond type>]*

<atom no> defines the atom in the subgraph; <new type> and <new fcharge> define the new type and formal charge values for this atom. The subsequent data pairs <bond to>, <new bond type> define a bond attached to <atom no> (and to <bond to>) and a new type for it.
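A data line of this form could be read as in the following sketch. The function name parse_deloc_line is hypothetical, and the sample line is made up for illustration; it is not taken from the delocalized.dat file shipped with FlexS.

```python
def parse_deloc_line(line):
    """Read '<atom no> <new type> <new fcharge> [<bond to> <new bond type>]*'."""
    fields = line.split()
    if len(fields) < 3 or (len(fields) - 3) % 2 != 0:
        raise ValueError(f"malformed data line: {line!r}")
    atom_no = int(fields[0])
    new_type = fields[1]
    new_fcharge = float(fields[2])
    pairs = fields[3:]  # alternating <bond to> <new bond type>
    bonds = [(int(pairs[i]), pairs[i + 1]) for i in range(0, len(pairs), 2)]
    return atom_no, new_type, new_fcharge, bonds


# Made-up line: atom 2 becomes O.co2 with charge -0.5, its bond to atom 1 becomes 'ar'
record = parse_deloc_line("2 O.co2 -0.5 1 ar")
print(record)
```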

10.13 SMARTS™ support

Another mechanism to define subgraphs is available from Release 2 of FlexS and is based on the SMARTS™ syntax introduced by Daylight Chemical Information Systems Inc. (www.daylight.com). Similar to SMILES, which can be used to define small molecules using a line notation, SMARTS™ is the corresponding code to define subgraphs. As an example, the SMILES string [OH]c1ccccc1 defines phenol; the identical pattern used in SMARTS™ identifies a phenol group in a molecule. In principle, SMARTS™ is an extension of SMILES. Whereas a SMILES expression defines one specific molecule, SMARTS™ defines patterns that match groups or families of molecules. This is achieved by a set of descriptors that must be matched by a substructure to be recognized. In contrast to the few rules defined in the classical FlexS subgraph definition language, which is additionally highly dependent on SYBYL's atom type specification, SMARTS™ is more chemistry based and has many more rules to define subgraphs of high complexity. At the moment, the implementation of the SMARTS™ matching mechanism is incomplete in terms of chirality and directional bonds. On the other hand, there are some extensions to the standard SMARTS™ rules. In the following sections, a complete overview of the actual SMARTS™ implementation in FlexS is given. It is described how classical subgraph definitions may be redefined using SMARTS™ and how those patterns are used for atom type assignment, structure correction and aromaticity perception.

10.13.1 Atomic primitives

The SMARTS™ definition used in FlexS is based on SMARTS™ version 4.83 and should comply with most existing implementations in other software packages.

The following table gives an overview of the available atomic properties.


Symbol      Symbol name                       Default   Atomic property requirements
*           wildcard                                    Any atom (including hydrogen)
#<n>        atomic number                     none      Element with number <n>, e.g. [#6] means any carbon
{x}         atom type                         [O.co2]   Explicit SYBYL atom type
a           aromatic                                    Aromatic (see 10.13.3)
A           aliphatic                                   Aliphatic (see 10.13.3)
D<n>        degree                            1         <n> explicit connections, including explicit hydrogens
H<n>        total H count                     1         <n> attached hydrogens (see 10.13.4)
h<n>        implicit H count                  1         <n> implicit hydrogens (see 10.13.4)
R<n>        ring membership                   any       In <n> SSSR rings (see 10.13.2)
r<n>        smallest SSSR ring has size <n>   any       In smallest SSSR ring of size <n> (see 10.13.2)
v<n>        valence                           1         Total bond order (see 10.13.4)
X<n>        <n> total connections             1         Number of bonds including bonds to implicit hydrogens (see 10.13.4)
-<n>        formal charge == <n>              -1        See 10.13.4
+<n>        formal charge == <n>              +1        See 10.13.4
@           chirality                                   Anticlockwise (NOT IMPLEMENTED)
@@          chirality                                   Clockwise (NOT IMPLEMENTED)
@<c><n>     chirality                                   Chiral class <c> chirality <n> (NOT IMPLEMENTED)
@<c><n>?    chirality                                   Chirality <c><n> or unspecified (NOT IMPLEMENTED)
<n>         atomic mass                                 Explicit atomic mass (NOT IMPLEMENTED)
^<x>        hybridization state               [^2]      Explicit hybridization state out of (s,1,2,3,ar) (SMARTS™ extension)

All atomic primitives except '*' must be written in square brackets []. Element symbols are usually written without square brackets, but if they are part of logical expressions the brackets are necessary as well.

There is no special rule for grouping several properties, but if a rule is composed of several properties, as in [Cv3X2AD3-2], the description of the basic element must be the first statement. This means that [C+] is OK, but [+C] is not allowed. This is to prevent things like [CD3X2N]: here a carbon is the principal element of the atom, so the nitrogen at the end is not allowed in this context and will be omitted. If no element symbol is given, as in the pattern [D1], then the rule [*,H;D1] is used. Elementary operators are element symbols (C, N, O, ...), *, SYBYL types ({N.3}, {O.co2}, ...) and element specifications ([#n]).


10.13.2 Ring perception

Ring systems are written as defined in SMILES, where atoms get labels for ring closure, so C1CCCCC1 means a cyclohexane ring and N1=CCc2ccccc21 defines an indole system. SMARTS™ uses identical rules, but any atom can be replaced by any SMARTS™ atom description. The simple, equivalent rules [R] and [r] mean that the desired atom must be part of a ring system. The rule [R2] will match any atom that is part of two rings, and the rule [r5] means that the smallest ring that the atom belongs to has a size of five atoms. These primitive rules can be combined to [R2r5], which would match two atoms in indole, for example.

10.13.3 Aromaticity perception and hybridization states

Aromaticity is a more problematic atomic property, due to different existing definitions. FlexS traditionally follows the SYBYL definition of aromaticity, where only six-membered rings that consist of nitrogen and carbon atoms only are accepted to be aromatic if they are planar with six aromatic bonds or alternating single and double bonds. Atoms in such systems get the SYBYL atom type N.ar or C.ar.

Additionally, many other ring systems must be treated as aromatic, not only for SMARTS™ but from the chemical point of view as well. A few examples are thiophene or furan and other five-membered and adjacent ring systems such as indole. FlexS treats all ring systems that fit Hückel's rule (4n+2 pi-electrons in a planar ring system) as aromatic as well. In addition, planar ring systems that miss Hückel's rule are checked for special subgraphs which should be treated as aromatic as well.

To achieve this behavior, the aromatic property is represented by two possible attributes. On the one hand, an atom can be simply aromatic irrespective of its assigned hybridization state or SYBYL atom type. On the other hand, SYBYL atom types assign a hybridization state to each atom, and if it is of type N.ar or C.ar the atom is aromatic as well.

Hybridization states are highly related to the atom types, because a SYBYL atom type consists in principle of an element and a hybridization state (refer to chempar.dat, where the atom types are defined). SMARTS™ basically does not know anything about hybridization states, but the information is sometimes useful, so in FlexS it is possible to change or check the hybridization state of an atom using the [^<type>] expression. This is a feature that is not originally defined by Daylight, but occurs in other SMARTS™ implementations and was included in FlexS too.

This property only makes sense in combination with specific atom types that handle information about hybridization states. Permitted hybridization states are currently s, 1, 2, 3, ar. The states 1, 2, 3 mean sp, sp2 and sp3 hybridization, respectively. But be careful: [^ar] has a double meaning, because an atom can be aromatic but not necessarily have the hybridization state aromatic. This is due to different representations of aromatic systems: the SMILES expression c1ccccc1 represents benzene, as does the Kekulé form C1-C=C-C=C-C=1, but in the first case all atoms get the hybridization state aromatic, whereas in the second they will have an sp2 hybridization. All these atoms nevertheless match the pattern [c]. So to get it right, use [a] for Hückel aromaticity and [^ar] for SYBYL aromaticity.

This behavior implies a special handling of bonds between aromatic atoms too. The simple case is that a bond has the bond type aromatic, which is always matched as an aromatic bond. But to match a pattern such as a:a or aa in a thiophene, single and double bonds must be matched as well, because a thiophene is usually represented by alternating single and double bonds. So a:a or aa matches all bonds between two aromatic atoms irrespective of the actual bond type. Sometimes aromatic bonds are used to represent delocalized systems as well. For those cases it may be useful to match an aromatic bond between aliphatic atoms, e.g. [A]:[A], or to find a bond of type aromatic in an aromatic system with a pattern like '[a]:;!=;!-[a]' that excludes single and double bonds in aromatic systems.

10.13.4 Implicit hydrogens, valences and formal charges

SMARTS™ includes several rules that depend on connected hydrogens. Usually FlexS expects a completely and correctly protonated molecule as ligand. If the transformation level 4 (H) is enabled (see command SELINIT (6.4.2)), then missing hydrogens can be assigned by FlexS itself.

To do this correctly and to match rules that expect implicit hydrogens ([X] or [h]) that are not explicitly given in the molecule structure, FlexS calculates the number of missing hydrogens based on the number of adjacent heavy atoms and the assigned bond order. Every bond contributes valence fractions to the overall bond order of an atom. The total valence of an atom is calculated by

$$\mathrm{valence} = \sum b_{single} + \sum b_{amide} + \sum b_{double} \cdot 2 + \sum b_{aromatic} \cdot 1.5 + \sum b_{triple} \cdot 3.$$

Odd numbers of aromatic bonds yield fractional valences, and FlexS reports a warning for those cases. For internal reasons fractional valences are usually rounded up by 0.5 to the next full valence (a single aromatic bond equals a valence of two, three aromatic bonds a valence of 5, and so on). The valence can be directly checked by the [v<n>] atomic primitive. For each chemical element a number of allowed valence states are defined in the static data file chempar.dat in the @valence_states section, where additional properties such as the resulting formal charge and the number of free electrons for each state are defined. Refer to 10.5.3 for more information.

It is assumed that negative formal charges will usually be found on acids or need to be compensated by missing hydrogens, so that the properties [C-1] and [Ch1] have more or less identical meaning. SMARTS™ provides a mechanism which allows you to work with implicit hydrogens for a given number of bonds, [X<n>]. So [CX4] matches any carbon that is connected to four neighbors, including implicit hydrogens that were not necessarily given in the molecule. This pattern would match a terminal methyl group independent of the number of assigned hydrogens in the range from 0 to 4, including methane and all negatively charged derivatives (C, [C-4], [C-3], [C-2], [C-1]), and of course any carbon connected to four heavy atoms. The number of implicit hydrogens is therefore the total number of hydrogens that can be added to reach a neutral state of an atom.

Note that this assumption is made for all atoms including nitrogen and oxygen, which means that both C[O-] and C[O] will match the pattern [X2] (one explicit bond to carbon and, in the case of the alcoholate, one potential/implicit hydrogen). Formal charges have a double meaning in FlexS. On the one hand, the formal charge is defined by the valence state, which is usually correct. On the other hand, FlexS uses so-called delocalized systems, where integral formal charges are distributed over several atoms. It is highly recommended that charge rules are used only for charges and not for missing hydrogens. As a FlexS-specific extension to SMARTS™, testing for fractional charges such as [O-0.5] is allowed. If no number is given, + and - characters are counted and summed to get the total charge. To give some examples, [-] means any atom with charge -1 and [Ca++] means a calcium ion. But [*+-+-] is identical to [+0] or [-0] and would match only uncharged atoms.
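Two of the rules in this section lend themselves to a small Python sketch: the valence sum with rounding up of fractional values, and the interpretation of the charge part of an atom expression. The function names total_valence and charge_from_suffix are hypothetical, the bond type labels are chosen for this illustration, and rounding with math.ceil is an assumption that reproduces the two cases quoted above (one aromatic bond gives 2, three give 5).

```python
import math
import re

# Bond-order contributions as listed in the valence formula of this section.
BOND_ORDER = {"single": 1.0, "amide": 1.0, "double": 2.0,
              "aromatic": 1.5, "triple": 3.0}


def total_valence(bond_types):
    """Sum the bond orders; a fractional sum is rounded up to the next integer."""
    return math.ceil(sum(BOND_ORDER[b] for b in bond_types))


def charge_from_suffix(suffix):
    """Interpret a charge suffix such as '++', '-0.5' or '+-+-'."""
    explicit = re.fullmatch(r"([+-])(\d+(?:\.\d+)?)", suffix)
    if explicit:
        sign = 1.0 if explicit.group(1) == "+" else -1.0
        return sign * float(explicit.group(2))
    if suffix and set(suffix) <= {"+", "-"}:
        # No number given: count the + and - characters and sum them.
        return float(suffix.count("+") - suffix.count("-"))
    raise ValueError(f"not a charge suffix: {suffix!r}")


print(total_valence(["aromatic"]), total_valence(["aromatic"] * 3))
print(charge_from_suffix("++"), charge_from_suffix("-0.5"), charge_from_suffix("+-+-"))
```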

10.13.5 Bond primitives

The supported bond types are shown in the following table:

Symbol   Bond property requirements
-        Single bond
=        Double bond
#        Triple bond
-^       Amide bond (FlexS-specific extension)
:        Aromatic bond
~        Any bond
@        Any ring bond
.        Not connected

The default or implicit bond type in SMARTS™ is single OR aromatic (-,:). As a FlexS-specific extension the amide bond type (-^) has been introduced, because FlexS uses this bond type internally to represent the non-rotatable bond in amide groups. This SMARTS™ expression should only be used for FlexS-specific patterns, and you should not expect to find it in any other SMARTS™ implementation.

The amide bond is usually compatible with a single bond, so to match it, it is sufficient to write a single bond. In ring systems like C1CCCC1, the ring closing bond is usually taken as any ring bond, so the expression C1CCCC@1 is equivalent. A ring label can appear before or after a bond descriptor. It is important to note that a bond specifier before the label specifies the type of the closing bond (e.g. C=1CCCCC1 equals C1CCCCC=1), whereas a bond specifier after a label specifies the bond to the following atom (e.g. C1=CCCCC1). The not-connected bond (.), the so-called component-level grouping, is not really supported by FlexS, because salts and complexes that are not covalently connected are currently not supported and such structures are rejected upon loading.

10.13.6 Logical operators

SMARTS™ allows atomic primitives to be combined using a couple of logical operators given in the table below.

Symbol   Expression      Example, meaning
!        not             [!#6] means any atom, but not carbon
,        or              [#6,#7] means any carbon or any nitrogen
none     and (highest)   [X2H1] combines two expressions tightly into one
&        and (high)      [a&#6] aromatic carbon
;        and (low)       [a;#6,#7] aromatic carbon or aromatic nitrogen

In contrast to most computer languages, SMARTS™ does not use brackets to define the precedence of an expression. It uses two kinds of logical AND expressions. Additionally, FlexS actually uses three ways to define the AND expression. The most recommended type is the implicit AND expression, which has the highest precedence. Let’s look at an example. [a#6] and [a&#6] have identical meaning, but different atomic primitives are stored in one subgraph vertex. Atomic primitives separated by logical expressions are stored in a variants queue that is interpreted sequentially based on the precedence of the logical expression. However, the expression [X2H2] has a higher precedence than [X2&H2]. As another example take [X2H2&a,C]. The highest precedence is assigned to [X2H2], after that [a] is recognized, and finally a logical OR test on carbon is performed.

Logical expressions may be given for bond types as well; the pattern *@;-,=&!#* matches two atoms connected by a cyclic bond that may be a single or double bond, but not a triple bond. The implicit AND combination is not allowed for bond types.
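The precedence order (implicit AND, then &, then the comma, then ;) can be made concrete with a small evaluator. The following Python sketch is an illustration only and not the FlexS variants queue: the atom record (keys z, aromatic, connections, hydrogens) and the function names are hypothetical, and only the primitives a, A, #<n>, X<n>, H<n> and ! are supported.

```python
import re

# Supported primitives, each optionally negated with '!'.
TOKEN = re.compile(r"!?(?:#\d+|[XH]\d+|[aA])")

def primitive(token: str, atom: dict) -> bool:
    """Evaluate one primitive on a hypothetical atom record."""
    if token.startswith("!"):
        return not primitive(token[1:], atom)
    if token == "a":
        return atom["aromatic"]
    if token == "A":
        return not atom["aromatic"]
    if token.startswith("#"):
        return atom["z"] == int(token[1:])
    if token.startswith("X"):
        return atom["connections"] == int(token[1:])
    return atom["hydrogens"] == int(token[1:])  # H<n>

def implicit_and(group: str, atom: dict) -> bool:
    """Primitives written side by side bind most tightly."""
    tokens = TOKEN.findall(group)
    if "".join(tokens) != group:
        raise ValueError(f"unsupported expression: {group!r}")
    return all(primitive(t, atom) for t in tokens)

def matches(expr: str, atom: dict) -> bool:
    """Split on ';' first (weakest), then ',', then '&' (strongest split)."""
    body = expr.strip("[]")
    return all(
        any(
            all(implicit_and(group, atom) for group in conj.split("&"))
            for conj in alt.split(","))
        for alt in body.split(";"))
```

Because the string is split on the weakest operator first, [a;#6,#7] requires aromaticity for both alternatives, whereas [a&#6,#7] accepts any nitrogen.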

10.13.7 Recursive SMARTS™

Any SMARTS™ expression may be used to define an atomic environment. The definition of such recursive patterns is usually enclosed in $(). The specified atom must be the first atom of the recursive expression and represents an atomic property like any other atomic primitive. As an example let’s take a carbonyl group pattern ’C=O’. To specify the carbonyl carbon only, you can write [$(C=O)]. The resulting subgraph specifies only one atom; the double bond to the carbonyl oxygen is just used as a further constraint. As another example, an amide nitrogen can be represented by [$(NC(=O))].

To use recursive SMARTS™ such as [$(C(O)O)] in a batch script, use the following expression: " ’[$(C(O)O)]’ ". Note the blanks between " and ’!

10.13.8 Branches

Branches are represented by parentheses "()". A carboxylic acid can be represented by ’C(=O)O’; the branching bond ’(=O)’ must be included in the parentheses. The expression ’C=(O)O’ would be accepted as well, but the double bond will be assigned to the next atom. Thus the expression shown is identical to ’C(O)=O’.

10.14 Defining subgraphs using SMARTS™

The most important application of SMARTS™ in FlexS is to enhance the subgraph definition using a more standardized mechanism. As an example let’s take the Acceptor_N definition from the previous section.

@subgraph 0 1 Acceptor_N
atom 1 N2ar CHARGE 0.0 NOF_BONDS 2
atom 2 R
atom 3 R
bond 1 2 UN 3 UN
data...
end

This definition can be written in SMARTS™ as:

@subgraph 0 1 Acceptor_N
smarts [nD2+0](~*)~*
data...

end

The atomic order in SMILES matches the occurrence in the pattern.

10.15 Using templates, vector bindings

An extension of the recursive pattern recognition mechanism in SMARTS™ is the usage of so-called vector bindings or substructure templates that can be predefined by the keyword @vector_binding in any static data file. Once defined, its name can be used as a template for recursive expressions.

@vector_binding phenol [$(Oc1ccccc1)]
@vector_binding ester [$([OD2]-C(=O)*)]

@subgraph 1 0 some_oxygen
smarts [$phenol,$ester]
data...
end

It is obvious that readability is much better with predefined vector bindings. Usually vector binding definitions are local, but if no appropriate vector binding is found in the local file, FlexS searches in all other static data files for a vector binding definition with the given name. Patterns for SYBYL atom types are defined as vector bindings as well. A classical @defgroup statement definition like

@defgroup Nar N.2 N.ar N.pl3 N.am

can be redefined as

@vector_binding Nar [$N.2, $N.ar, $N.pl3, $N.am]
@vector_binding Nar2 [{N.2},{N.ar},{N.pl3},{N.am}]

@subgraph 1 0 my_subgraph
smarts $Nar
data...

end

Note: The definition of Nar2 is only valid in the FlexS environment. To write compatible SMARTS™ we recommend you use the vector bindings syntax, which is more common, instead of direct SYBYL-type matching. However, direct SYBYL-type matching is much faster because the SYBYL types are represented internally by single numbers and no further subgraph matching is necessary.
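How a named binding stands in for a recursive expression can be illustrated by plain text substitution. The Python sketch below is an assumption-laden illustration, not the way FlexS resolves bindings: the function names are hypothetical, the outer square brackets of a binding are dropped on insertion, only one substitution pass is made, and names are restricted to letters, digits and underscores (so dotted names such as $N.2 are not handled).

```python
import re

def parse_bindings(text: str) -> dict:
    """Collect '@vector_binding <name> <pattern>' lines into a dict."""
    bindings = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "@vector_binding":
            bindings[parts[1]] = parts[2].strip()
    return bindings

def expand(pattern: str, bindings: dict) -> str:
    """Replace each $name by the body of its binding (outer [] removed).

    '$(' is left untouched because '(' cannot start a name.
    An unknown name raises KeyError.
    """
    def substitute(match):
        body = bindings[match.group(1)]
        if body.startswith("[") and body.endswith("]"):
            body = body[1:-1]
        return body
    return re.sub(r"\$([A-Za-z_]\w*)", substitute, pattern)
```

Applied to the phenol and ester bindings above, [$phenol,$ester] expands to [$(Oc1ccccc1),$([OD2]-C(=O)*)], which is the form a SMARTS™ implementation without vector bindings would need.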


10.16 Transforming molecules via SMARTS™

The subgraph recognition mechanism based on SMARTS™ is extended by a substructure modification facility called transformation. A new command transform in the ligand menu allows direct application of transformation rules. The static data file transform.dat contains all rules used during molecule import.

A transformation rule is defined by two SMARTS™ patterns, a match pattern and a transformation pattern. The match pattern describes a subgraph in a molecule, whereas the transformation pattern describes properties that should be assigned to the recognized atoms. A simple example is given below:

transform [CD3](~[OD1])[OD1] >> *(=[+0.0])-[-1.0]

The left side matches terminal carboxylic acids. The right side explicitly defines the properties that should be set. First we find an asterisk, which corresponds to the carbon atom in the match pattern. An asterisk in a transform pattern is just a placeholder that keeps the atom enumeration identical to the match pattern; nothing is done to this atom. The bond between carbon and the first oxygen should be a double bond and will be set to this type if necessary. Then a formal charge of 0.0 is assigned to the atom matching the first oxygen. This is the only property of that atom that will be changed. The bond between the carbon and the second oxygen will become a single bond, and a formal charge of -1.0 will be assigned to the second oxygen. All other properties of an atom that are not explicitly defined in the transform pattern will not be affected.
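The positional correspondence between the two patterns can be shown with a short Python sketch. It is an illustration under stated assumptions, not the FlexS parser: the function name pair_atoms is hypothetical, and only bracket atoms and the asterisk are recognized (bare element symbols and nested brackets in recursive expressions are not handled).

```python
import re

# A bracket atom such as [OD1], or the placeholder '*'.
ATOM = re.compile(r"\[[^\]]*\]|\*")

def pair_atoms(rule: str):
    """Pair the n-th atom of the match pattern with the n-th atom
    of the transformation pattern."""
    text = rule.strip()
    if text.startswith("transform"):
        text = text[len("transform"):]
    match_part, transform_part = (s.strip() for s in text.split(">>"))
    left = ATOM.findall(match_part)
    right = ATOM.findall(transform_part)
    if len(left) != len(right):
        raise ValueError("both patterns must enumerate the same atoms")
    return list(zip(left, right))
```

For the rule above, the pairs are ([CD3], *), ([OD1], [+0.0]) and ([OD1], [-1.0]), which mirrors the description: the carbon is left alone and each oxygen receives its own charge.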

10.16.1 Formal charges and hydrogens

A typical application of these transformation rules is to assign formal charges and protonation states to an atom. The formal charge and the protonation state depend on each other, so the two transformation rules below have identical meaning: each one implies the other.

transform [CD3](~[OD1])[OD1] >> *(=[+0.0])-[O-]
transform [CD3](~[OD1])[OD1] >> *(=[+0.0])-[OH0]

The situation becomes more difficult if the target atom is not part of an acidic or basic group, but just a carbon atom. The first rule below (1) neutralizes all groups: every atom gets as many hydrogens added as are needed to reach a formal charge of 0. The second rule (2) is less obvious: not all atoms are part of titratable groups, yet the transform command would add hydrogens to every atom until a formal charge of +1 is reached for that atom. To avoid being trapped in such cases, it is a good idea to specify the number of hydrogens together with the corresponding formal charge, as in example (3). Here we have a guanidinium-like carbon atom, and in these situations a formal charge of +1 will be placed on the carbon, which should then have the SYBYL type C.cat.

(1) transform [*] >> [+0]
(2) transform [*] >> [+1]
(3) transform [$(C(N)(N)N)] >> [C.catH0+1]


So the [Hn] rule has priority over the charge assignment. If an explicit number of hydrogens is specified, an additional formal charge is just assigned without any influence on the protonation. If no protonation is specified, the number of hydrogens is adjusted to reach the specified formal charge (refer to 10.5.3 for details). The example below shows a potential rule to delocalize a carboxylic acid. Here all hydrogens are removed and a formal charge of -0.5 will then be applied to both oxygen atoms.

transform C(~O)~O >> C(:[OH0-0.5]):[OH0-0.5]
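The priority of an explicit hydrogen count over the charge can be written down as a small decision function. The Python sketch below is a simplified model and not taken from FlexS: the function name and the parameter neutral_h (the hydrogen count of the atom in its neutral state) are hypothetical, only integral charges are covered, and the adjustment "one hydrogen per unit of charge" is an assumption based on the acid/base reading of formal charges used in this section.

```python
from typing import Optional, Tuple

def resolve_protonation(neutral_h: int, charge: int,
                        explicit_h: Optional[int] = None) -> Tuple[int, int]:
    """Return (hydrogen count, formal charge) after a transformation.

    If the rule gives [H<n>], that count wins and the charge is
    simply assigned. Otherwise the hydrogen count is adjusted so
    that the requested charge is reached.
    """
    if explicit_h is not None:
        return explicit_h, charge
    # Assumed model: each added hydrogen raises the charge by one.
    return max(0, neutral_h + charge), charge
```

In this model a hydroxyl oxygen (one hydrogen when neutral) asked for charge -1 ends up with no hydrogen, while a carbon given [H0+1] keeps zero hydrogens and simply carries the charge.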

Note: The transform rules are processed from left to right. The order of occurrence of the labels must therefore be identical on both sides of the rule, e.g.

[C:1](=O)[C:2] >> [C:1](=O).[C:2]    # correct order
[C:1](=O)[C:2] >> [C:2].[C:1](=O)    # wrong order
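The label-order requirement is easy to check mechanically before a rule is used. The following Python sketch only compares the sequence of :n labels on both sides; the function name is hypothetical, and the rule string is expected without a trailing comment, since '#' also has a meaning in SMARTS™ and is not stripped here.

```python
import re

# An atom label such as ':1' at the end of a bracket atom.
LABEL = re.compile(r":(\d+)\]")

def labels_in_same_order(rule: str) -> bool:
    """True if the labels occur in the same order on both sides of '>>'."""
    left, right = rule.split(">>")
    return LABEL.findall(left) == LABEL.findall(right)
```

Linker atoms such as [1*] carry no colon and are therefore ignored by this check.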

Furthermore, the rules must be defined so that the bonds are cut first and additional atoms/linkers are added afterwards:

[C:1][N:2](-[C:3])[C:4] >> \
  [C:1](-[1*]).[N:2](.[C:3](-[1*]))(-[2*])(-[2*])(-[2*]).[C:4](-[1*])
                     ^             ^
                     cut bond      add linkers     => correct

[C:1][N:2](-[C:3])[C:4] >> \
  [C:1](-[1*]).[N:2](-[2*])(-[2*])(-[2*]).[C:3](-[1*]).[C:4](-[1*])
                    ^                    ^
                    add linkers          cut bond  => wrong

Furthermore, if a rule should match more than once, make sure that the SMARTS™ patterns do not overlap, i.e. an atom can be matched by a SMARTS™ pattern only once. So make sure that your matching rule does not match too many atoms. Ideally, only the atoms directly next to the bond to be cut are matched. Recursive SMARTS™ expressions can be used to match the environment. For example, if you wanted to cut at an ester group, you could write

[*:1][O:2][C:3](=[O:4])[*:5] >> [*:1][O:2][1*].[2*][C:3](=[O:4])[*:5]

But in this case the match pattern is quite large. The molecule COC(=O)C(=O)COC could not be shredded twice, because the pattern would have to match the carbonyl carbons twice. It is better to use recursive SMARTS™ expressions which match only the atoms next to the cut:

[$([O:2]([*]))][$([C:3](=[O:4])[*])] >> [O:2]([*])[1*].[2*][C:3](=[O:4])[*]

If even recursive SMARTS™ expressions do not work, you will have to define a special rule with higher priority for these particular cases.
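Why a large match pattern fires less often than a small one follows directly from the rule that an atom can be matched only once. The Python sketch below filters a list of candidate matches accordingly. It is an illustration only: the function name is hypothetical, matches are given as tuples of atom indices, and the first-come-first-served order is an assumption, not a statement about how FlexS chooses among overlapping matches.

```python
def non_overlapping(matches):
    """Keep each match only if none of its atoms was used before."""
    used = set()
    kept = []
    for match in matches:
        if used.isdisjoint(match):
            kept.append(match)
            used.update(match)
    return kept
```

Two five-atom matches that share a single atom yield only one application of the rule, whereas two-atom matches around each cut position rarely collide.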

10.16.2 Atom type assignment

As mentioned earlier, FlexS traditionally makes heavy use of SYBYL atom types. This special property can be used for matching as well if the text definition of the atom type is given in {}. The SMARTS™ expression {C.3} would match all atoms that are of type C.3. In transformation rules, you can use the same expression to explicitly enforce a special SYBYL type for a given atom. A SYBYL atom type is internally represented by a hybridization state and the element. The element/hybridization state combinations are defined in the file chempar.dat. The following example assigns a SYBYL type C.3 to all atoms that match the SMARTS™ pattern on the left side of the expression. The changed properties are the element, the hybridization state and of course the SYBYL type number.

transform [#6X4] >> C.3

The following table describes in detail what kind of assignments are possible at the moment.

Element          Chemical element
C.3              SYBYL type, element, hybridization state
[a],[A]          Aromaticity
[+<n>],[-<n>]    Formal charge
[H<n>]           Number of hydrogens
bond-types       All bond types can be inter-converted

Note: Transformation is a very powerful mechanism given to the user and should be used very carefully, because it cannot always be assured that the resulting molecule is correct, that it can be docked correctly and that all properties are still consistent. It is advisable to use transformation rules in the context of the ligand initialization procedure during loading and not in a batch script afterwards. Nevertheless the TRANSFORM function is quite useful for testing a transformation rule before putting it into transform.dat.

10.17 Structure correction and atom type assignment

Traditionally, FlexS takes its ligands from files in SYBYL MOL2 format. All subgraph definitions depend on file formats and the underlying atom type definition is further used for torsion and interaction geometry assignment. To enable ligands to be imported from other sources such as MDL’s SD file format, or crystal structures such as the PDB format, correct assignment of atom and bond types is necessary.

Depending on which transformation level is enabled, the transformation rules from transform.dat in the static data directory are used during ligand import. There are several levels of assignment during import of a ligand structure.

From this list it becomes clear that at the stage of structure correction, the complete range of SMARTS™ properties is available. At the stage of aromaticity assignment it is clear that using properties that depend on a correct assignment of aromaticity makes no sense.

10.18 Transformation rules (transform.dat)

The static data file transform.dat is new since release 2.0 and covers a lot of things that were distributed over different static data files or used to be hard-coded in earlier versions of FlexS. This is a consistent extension of the principle of separating chemistry and algorithms. Nearly everything FlexS knows about correct chemistry is placed in these static data files, which gives computational chemists the chance to review what FlexS is doing.

The transformation rules placed in transform.dat cover the complete initialization process during ligand loading, which checks and adjusts the chemical properties of the ligands: correction of errors, valence check, aromaticity perception, atom type assignment, protonation and formal charges.

For this purpose a new keyword @transform has been introduced. The syntax of a transformation rule is

@transform <class/level> <prio> matchpattern >> transformpattern

The syntax is similar to the transform command described above, but contains a class/level and priority number as well. The class/level ID defines the group to which a transformation belongs. The priority number defines the order of application within a specific group. Rules with higher priority values are applied before rules with lower values. But there is no exclusion between different patterns.

No overlapping matches are allowed within the application of one single rule. As an example let’s take a bis-phosphate as a ligand, e.g. OP(=O)(-O)OP(=O)(-O)O, and a transformation rule that matches one side of the phosphate groups, OP(=O)(-O)[OD2]. At first glance, you would expect to match both phosphates, but the bridging oxygen between them would be matched twice for the same subgraph, which is not allowed. Another example is a subgraph for a phenyl ring, c1ccccc1: this would match benzene six times, and that is definitely not what we want.

So the phosphate rule will only match one of the two PO3 groups, because the bridging oxygen between them will not be matched a second time. It is usually a good idea to use recursive SMARTS™ expressions if only certain atoms should be modified; complete subgraphs are better if bonds need adjusting. The following example shows the difference between the explicit subgraph and the recursive representation. The second line will be able to match all four single-bonded oxygens in the bis-phosphate, the first one will only match two of them.

Example

@transform 5 1 bad_RPO3 OP(=O)(-[OD1])[OD2] >> [O-]P(=O)(-[O-])O
@transform 5 1 good_RPO3 [$(OP(=O)(-[OD1])[OD2])] >> [O-]
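The role of the level and priority fields can be illustrated by reading such lines and ordering them. The Python sketch below is an assumption-based illustration and not the FlexS reader: the function names are hypothetical, a rule name before the match pattern is treated as optional (the examples carry one, the syntax line does not), patterns must not contain blanks, and only the ordering inside one level is modeled because the text does not state how the levels themselves are ordered.

```python
from collections import defaultdict

def parse_transform(line: str) -> dict:
    """Split one '@transform' line into its fields."""
    keyword, level, prio, rest = line.split(None, 3)
    if keyword != "@transform":
        raise ValueError("not a @transform line")
    left, transform_pattern = (s.strip() for s in rest.split(">>"))
    *name, match_pattern = left.split()
    return {"level": int(level), "prio": int(prio),
            "name": name[0] if name else None,
            "match": match_pattern, "transform": transform_pattern}

def application_order(lines):
    """Group rules by level; inside a level, higher priority comes first."""
    by_level = defaultdict(list)
    for line in lines:
        rule = parse_transform(line)
        by_level[rule["level"]].append(rule)
    # sorted() is stable, so rules of equal priority keep their file order.
    return {level: sorted(rules, key=lambda r: -r["prio"])
            for level, rules in by_level.items()}
```

Sorting only decides the order of application; as stated in the text, a rule applied earlier does not exclude later rules from matching the same atoms.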

Summary: Within one rule overlaps are avoided, but different rules in a single level can match and modify atoms several times. The priority only sets the order of the application of rules.

It is highly advisable to test any new transformation rule with the transform command before inserting it into the transform.dat file. The transformation engine is intended to change simple atom properties; modifying 3D coordinates upon bond type changes is not possible. The resulting molecule can of course be passed through the built-in minimizer after the transformation process, but this is not the default behavior.

A detailed description of the transform.dat file that comes with your current FlexS release can be found in the transform.dat file in your static_data directory. Please consult the latest rules and advice there.

Note: The transform rules are processed from left to right. The order of occurrence of the labels must therefore be identical on both sides of the rule, e.g.


[C:1](=O)[C:2] >> [C:1](=O).[C:2]    # correct order
[C:1](=O)[C:2] >> [C:2].[C:1](=O)    # wrong order

Furthermore, the rules must be defined so that the bonds are cut first and additional atoms/linkers are added afterwards:

[C:1][N:2](-[C:3])[C:4] >> \
  [C:1](-[1*]).[N:2](.[C:3](-[1*]))(-[2*])(-[2*])(-[2*]).[C:4](-[1*])
                     ^             ^
                     cut bond      add linkers     => correct

[C:1][N:2](-[C:3])[C:4] >> \
  [C:1](-[1*]).[N:2](-[2*])(-[2*])(-[2*]).[C:3](-[1*]).[C:4](-[1*])
                    ^                    ^
                    add linkers          cut bond  => wrong

10.19 *Ligand Gaussian representation (gaussian.dat)

Subgraphs from the gaussian file are mapped onto the ligand in order to assign the Gaussian representation to the molecule. The data area in the subgraph definitions of this file looks as follows:

quality<i> <n>
<atom-no> <x> <y> <z> <height> <width>

Behind the keyword quality, i specifies the type of quality with i ∈ {−1, ..., 4}. Here -1 stands for the quality no-gauss, which can be used to specify atoms that are explicitly excluded from Gaussian representation. The other values for i correspond to the different physico-chemical qualities: 0 for electron density, 1 for partial charge, 2 for hydrophobicity, 3 for hydrogen bonding donors and 4 for hydrogen bonding acceptors. There must be one line for each of these qualities. The next figure n specifies the number of Gaussian functions that must be attached, followed by their specification in the next lines. The specification starts with the number of the atom in the subgraph file which is supposed to carry the Gaussian. Then the local coordinates followed by height and width of the respective Gaussian are given.
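The layout of the data area (a quality line followed by n specification lines) can be read with a few lines of Python. This is a sketch under assumptions and not the FlexS reader: the function name and dictionary keys are hypothetical, the keyword quality and the two numbers are assumed to be separated by white space, and blank lines are skipped.

```python
def parse_gaussian_data(lines):
    """Map each quality index to its list of Gaussian specifications."""
    qualities = {}
    rows = iter(line for line in lines if line.strip())
    for header in rows:
        keyword, quality, count = header.split()
        if keyword != "quality":
            raise ValueError(f"expected a quality line, got {header!r}")
        gaussians = []
        for _ in range(int(count)):
            # One Gaussian: carrying atom, local coordinates, height, width.
            atom_no, x, y, z, height, width = next(rows).split()
            gaussians.append({
                "atom": int(atom_no),
                "xyz": (float(x), float(y), float(z)),
                "height": float(height),
                "width": float(width),
            })
        qualities[int(quality)] = gaussians
    return qualities
```

A quality line with n = 0 produces an empty list, so a file that lists all qualities but attaches Gaussians to only some of them is still read completely.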


10.20 *Graphics (graphic_sp.dat)

This file contains the default settings for variables relating to FlexS graphics. The description is not yet complete, but the file is self-explanatory. If you want to change any defaults relating to graphics, therefore, do not hesitate to look in this file.

The graphic_sp.dat file is divided into sections; the name of each section must be preceded by ’@’.

The skeleton of graphic_sp.dat is:

@atom-colors
<element-name> <color>

@contact-colors
<contact-type-name> <color>

@test-ligand-defaults

@ref-ligand-defaults

@superpos-defaults

@tl-ref-coords-defaults

@combilib-defaults

@colors
<red_value> <green_value> <blue_value> [<lucent_value>] <COLORNAME>

For each section there is a set of valid keywords. The sections are described below.

Each of the values in the sections @test-ligand-defaults, @reference-ligand-defaults, @superposition-defaults and @lig-ref-coords-defaults can also be set by one of the commands SELGRA, SELADM, SELLAB or SELCOL.

All other values can only be set by graphic_sp.dat.

10.20.1 Colors

A valid <color> is one of the following four expressions:

• A number between 0 and 360 interpreted as an angle in the color circle.

In FlexS color 0 represents invisible. Thus, it is possible to exclude parts of a drawing(e.g. main directions, ligand surface) by setting their color to 0.

• A color name defined in graphic_sp.dat. A name with two or more words must be enclosed in double quotes, e.g. "light green". For invisible you can use the keyword invisible. You can use the prefix "trans" for translucent color, for example "translight green" gives a translucent light green color.

• Three (or four) floating point numbers between 0.0 and 1.0 representing RGB values (RGBA in the case of four values), separated by slashes, e.g. 0.8/0.33/0.0 or 0.8/0.33/0.0/1.0

• As above but separated by blanks and enclosed in double quotes, e.g. "0.8 0.33 0.0" or "0.8 0.33 0.0 1.0"
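The four accepted forms of a <color> can be told apart with a short classifier. The Python sketch below is an illustration of the list above and not FlexS code: the function name and the returned tags are hypothetical, and color names are returned as given without checking whether they are defined in graphic_sp.dat.

```python
def parse_color(spec: str):
    """Classify a color specification as invisible, angle, rgba or name."""
    text = spec.strip().strip('"')
    if text.lower() == "invisible":
        return ("invisible",)
    # RGB(A) values are separated either by slashes or by blanks.
    parts = text.split("/") if "/" in text else text.split()
    try:
        values = [float(p) for p in parts]
    except ValueError:
        return ("name", text)
    if len(values) == 1 and 0 <= values[0] <= 360:
        # Color 0 represents invisible; other numbers are angles.
        return ("invisible",) if values[0] == 0 else ("angle", values[0])
    if len(values) in (3, 4) and all(0.0 <= v <= 1.0 for v in values):
        return ("rgba", tuple(values))
    raise ValueError(f"not a valid color: {spec!r}")
```

Trying the numeric forms first and falling back to a name keeps the quoted blank-separated RGB form and quoted multi-word names apart without extra syntax rules.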

10.20.2 Switches

Syntax: switch <name> <int_val>

Description: Switches represent integer selections of the graphic settings. These are the default values for the commands SELADM, SELGRA, SELLAB and SELCOL.

10.20.3 Scalars

Syntax: scalar <name> <double_val | ’Inf’>

Description: Scalars represent doubles of the graphic settings. These are the default values for the commands SELADM, SELGRA, SELLAB and SELCOL. ’Inf’ stands for ∞.

10.20.4 Lists

Syntax: list <name> <int> [<int> . . . ]

Description: Lists represent integer lists of the graphic settings (e.g. contact types). These are the default values for the commands SELADM, SELGRA, SELLAB and SELCOL.
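The three syntax lines (switch, scalar, list) map naturally onto three Python value types. The sketch below follows the syntax lines as printed and is not the FlexS reader: the function name is hypothetical, the setting names in the usage are placeholders except where they appear in the tables of this chapter, and switches that take additional arguments are not covered.

```python
import math

def parse_setting(line: str):
    """Return (name, value) for a switch, scalar or list line."""
    kind, name, *values = line.split()
    if kind == "switch":
        (value,) = values            # exactly one integer
        return name, int(value)
    if kind == "scalar":
        (value,) = values            # one double, or Inf for infinity
        text = value.strip("'")
        return name, math.inf if text == "Inf" else float(text)
    if kind == "list":
        if not values:
            raise ValueError("a list needs at least one integer")
        return name, [int(v) for v in values]
    raise ValueError(f"unknown setting type: {kind!r}")
```

For example, parse_setting("switch DRAW_GAUSS 1") yields the pair ("DRAW_GAUSS", 1).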

10.20.5 Color modes

Color modes can be chosen for the following items: test ligand, reference ligand, superposition, main direction, geometry, ligand surface. The following table describes the available color modes, and indicates for which items a color mode is valid.

atom Bonds are drawn half-half in the colors of the atoms. Atoms are colored according to their element.

Parameters: None
Available for: Test ligand, reference ligand, surface

contact Interactions, main directions or interaction geometries, are colored depending on their contact type. The color value for each contact type must be specified in the graphic_sp.dat file.

Parameters: None
Available for: Superposition, main directions, interaction geometries.

fragment The test ligand is bi-colored to visualize the fragmentation. The base fragment gets an extra color.

Parameters: <color0> is the color for the base fragment, <color1> <color2> are alternately used for the remaining fragments.
Available for: Test ligand

energy Used with superposition, the interactions are colored according to their score. Used with test ligands, the ligands are drawn according to their superposition score.

opt-energy The interactions are colored according to their optimal score. The energy values are linearly mapped to <intervals> color values.

Parameters: <intervals> <min-energy> <max-energy> <min-color> <max-color>
Available for: Superposition, test ligand

invisible Nothing is drawn

Parameters: None
Available for: All

unique Item is drawn in one color

Parameters: <color>
Available for: All

surfpatch The surface is colored according to the surface patch type (concave, saddle, convex patch).

surf-atom The (surface) convex patches are colored by atom type, reentrant (saddle and concave) patches are colored in one user-defined color.

Parameters: <concave color> <saddle color> <convex color>
Available for: Surface
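The linear mapping of energy values to <intervals> color values mentioned for the energy-based color modes can be sketched as follows. This Python function is one plausible reading and not the FlexS implementation: the function name is hypothetical, colors are taken as angles in the color circle, energies outside the range are clamped, and the interval colors are spaced evenly between <min-color> and <max-color>.

```python
def energy_to_color(energy: float, intervals: int,
                    min_energy: float, max_energy: float,
                    min_color: float, max_color: float) -> float:
    """Map an energy to one of `intervals` evenly spaced color angles."""
    if intervals < 1 or max_energy <= min_energy:
        raise ValueError("need intervals >= 1 and min_energy < max_energy")
    fraction = (energy - min_energy) / (max_energy - min_energy)
    fraction = min(1.0, max(0.0, fraction))      # clamp to [0, 1]
    index = min(intervals - 1, int(fraction * intervals))
    if intervals == 1:
        return float(min_color)
    return min_color + index * (max_color - min_color) / (intervals - 1)
```

With five intervals between the angles 120 and 240, energies in the lowest fifth of the range map to 120 and energies in the highest fifth map to 240.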

10.20.6 Defining atom colors (@atom-colors)

Syntax: <element-name> <color>
<element-name> must be the chemical name of an element. Determines the colors of atoms in color mode ’atom’.

10.20.7 Defining colors for contact types (@contact-colors)

Syntax: <contact-name> <color>
<contact-name> must be the name of a contact type. Determines the colors of contact types in color mode ’contact-type’.


10.20.8 Setting test ligand graphic defaults (@test-ligand-defaults)

Name                      Type        Value

SELADM:
MOL_OBJ_NUMBER            switch      [0–255]
    selects the test ligand’s graphics object number
TMP_FILES                 switch      [0|1]
    if set to 1, all drawings are written to temporary files (see 6.4.13)
ORG_MODE                  switch      [0|1|2 <min> <max>]
    selects the test ligand’s organization mode, 0: undef, 1: default, 2: FIFO (see 6.4.13)

SELGRA:
DRAW_HYDROGEN             switch      [0–2]
    decides whether test ligand hydrogens are shown (see 6.4.14)
MOL_DISP_MODE             switch      [1–4]
    1: lines, 2: sticks, 3: balls & sticks, 4: atoms (see 6.4.14)
DRAW_INTERACT_GEOMS       switch      [0|1]
    decides whether interaction geometries and main directions of geometries are drawn (see 6.4.14)
DRAW_GAUSS                switch      [0|1]
    decides whether Gauss surfaces are drawn (see 6.4.14)
DRAW_ALL_CONTACT_TYPES    switch      [0|1]
    decides whether all interaction geometries are drawn (see 6.4.14)
DRAW_ALL_COMPONENTS       switch      [0|1]
    decides whether all components are drawn (see 6.4.14)
DRAW_SURFACE              switch      [0–4]
    selects the surface mode, 0: no-surf, 1: line-surf, 2: triangle-surf, 4: Connolly-surf (see 6.4.14)

SELLAB:
ATOM_NAMES                switch      [0|1]
    decides whether atom names are drawn (see 6.4.16)
INFILE_NUMBERS            switch      [0|1]
    decides whether numbers of the atom in the file are drawn (see 6.4.16)
SYBYL_TYPE_STRINGS        switch      [0|1]
    decides whether the sybyl type strings are drawn (see 6.4.16)
FRAGMENT_NUMBERS          switch      [0|1]
    decides whether numbers of the fragments are drawn (see 6.4.16)

SELCOL:
TEST_LIG                  colormode   [ INVISIBLE | ATOM | UNIQUE | FRAGMENT ]
    selects the color mode for test ligand bonds and atoms (see 6.4.15)
GEOMETRY                  colormode   [ INVISIBLE | UNIQUE | CONTACT ]
    selects the color mode for test ligand geometries (see 6.4.15)
MAIN_DIRECTION            colormode   [ INVISIBLE | UNIQUE | CONTACT ]
    selects the color mode for test ligand main directions (see 6.4.15)
SURFACE                   colormode   [ INVISIBLE | UNIQUE | CEN_DIST | SURFPATCH | SURF_ATOM ]
    selects the color mode for the test ligand surface (see 6.4.15)
<object-name>             color       [<colorname>]
    sets the color of the object (TEST_LIG, GEOMETRY, SURFACE, . . . ) (see 6.4.15)
gauss                                 [<color 1> <color 2> . . . ]
    coloring of the Gauss surfaces


10.20.9 Setting reference ligand graphic defaults (@reference-ligand-defaults)

Name                      Type        Value

SELADM:
MOL_OBJ_NUMBER            switch      [0–255]
    selects the reference ligand’s graphics object number
TMP_FILES                 switch      [0|1]
    if set to 1, all drawings are written to temporary files (see 6.5.6)
ORG_MODE                  switch      [0|1|2 <min> <max>]
    selects the reference ligand’s organization mode, 0: undef, 1: default, 2: FIFO (see 6.5.6)

SELGRA:
DRAW_GAUSS                switch      [0|1]
    decides whether Gauss surfaces are drawn (see 6.5.7)
DRAW_HYDROGEN             switch      [0–2]
    decides whether reference ligand hydrogens are shown (see 6.5.7)
MOL_DISP_MODE             switch      [1–4]
    1: lines, 2: sticks, 3: balls & sticks, 4: atoms (see 6.5.7)
DRAW_INTERACT_GEOMS       switch      [0|1]
    decides whether interaction geometries and main directions of geometries are drawn
DRAW_IA_POINTS            switch      [0|1]
    decides whether discrete interaction points are drawn (see 6.5.7)
DRAW_ALL_CONTACT_TYPES    switch      [0|1]
    decides whether all interaction geometries are drawn (see 6.5.7)
DRAW_SURFACE              switch      [0–4]
    selects the surface mode, 0: no-surf, 1: line-surf, 2: triangle-surf, 4: Connolly-surf

SELLAB:
ATOM_NAMES                switch      [0|1]
    decides whether atom names are drawn (see 6.5.9)
INFILE_NUMBERS            switch      [0|1]
    decides whether numbers of the atom in the file are drawn (see 6.5.9)

SELCOL:
REF_LIG                   colormode   [ INVISIBLE | ATOM | UNIQUE ]
    selects the color mode for reference ligand bonds and atoms (see 6.5.8)
GEOMETRY                  colormode   [ INVISIBLE | UNIQUE | CONTACT ]
    selects the color mode for reference ligand geometries (see 6.5.8)
MAIN_DIRECTION            colormode   [ INVISIBLE | UNIQUE | CONTACT ]
    selects the color mode for reference ligand main directions (see 6.5.8)
SURFACE                   colormode   [ INVISIBLE | UNIQUE | CEN_DIST | SURFPATCH | SURF_ATOM ]
    selects the color mode for the reference ligand surface (see 6.5.8)
<object-name>             color       [<colorname>]
    sets the color of the object (REF_LIG, GEOMETRY, SURFACE, . . . ) (see 6.4.15)
gauss                                 [<color 1> <color 2> . . . ]
    coloring of the Gauss surfaces


10.20.10 Setting superposition graphic defaults (@superposition_defaults)

A ’superposition’ itself consists only of a set of lines showing the paired intermolecular interactions. To visualize the complete superposition, you must add a drawing of the test ligand with the coordinates from the placement and a drawing of the reference ligand.

Name                      Type        Value

SELADM:
MOL_OBJ_NUMBER            switch      [0–255]
    selects the superposition’s graphics object number
TMP_FILES                 switch      [0|1]
    if set to 1, all drawings are written to temporary files (see 6.7.20)
ORG_MODE                  switch      [0|1|2 <min> <max>]
    selects the superposition’s organization mode, 0: undef, 1: default, 2: FIFO (see 6.7.20)

SELGRA:
INCLUDE_TEST_LIG          switch      [0|1]
    if set to ’1’, the test ligand is drawn together with the superposition (see 6.7.21)
INCLUDE_REF_LIG           switch      [0|1]
    if set to ’1’, the reference ligand is drawn together with the superposition (see 6.7.21)
DRAW_GAUSS                switch      [0|1]
    decides whether Gauss surfaces are drawn (see 6.7.21)
DRAW_ALL_CONTACT_TYPES    switch      [0|1]
    decides whether all interaction geometries are drawn (see 6.7.21)

SELLAB:
INTERACTION_TYPES         switch      [0|1]
    if set to ’1’, interaction labels contain the interaction type (see 6.7.23)
ENERGIES                  switch      [0|1]
    if set to ’1’, interaction labels contain the scoring value of the interaction (energy contribution of the interaction) (see 6.7.23)
OPTIMAL_ENERGIES          switch      [0|1]
    if set to ’1’, interaction labels contain the optimal scoring value interactions of the corresponding type can achieve (see 6.7.23)

SELCOL:
SUPERPOS                  colormode   [ INVISIBLE | UNIQUE | ENERGY | OPT_ENERGY | CONTACT ]
    selects the color mode for the superposition (see 6.7.22)
<object-name>             color       [<colorname>]
    sets the color of the object (SUPERPOS, GEOMETRY, SURFACE, . . . ) (see 6.7.22)
gauss                                 [<color 1> <color 2> . . . ]
    coloring of the Gauss surfaces


10.20.11 Setting combinatorial libraries (combilibs) graphic defaults (@combilib-defaults)

Name                      Type       Value

SELADM:
MOL_OBJ_NUMBER            switch     [0–255]
    selects the combilib's mol object number
TMP_FILES                 switch     [0|1]
    if set to 1, all drawings are written to temporary files (see 7.2.2.17)
ORG_MOD                   switch     []

SELGRA:
MOL_DISP_MODE             switch     [1–4]
    1 : lines, 2 : sticks, 3 : balls & sticks, 4 : atoms (see 7.2.2.18)
DRAW_HYDROGEN             switch     [0–2]
    decides whether combilib instance hydrogens are shown (see 7.2.2.18)
DRAW_INTERACT_GEOMS       switch     [0|1]
    decides whether interaction geometries and main directions of geometries are drawn
DRAW_ALL_CONTACT_TYPES    switch     [0|1]
    decides whether the interaction geometries of all contact types are drawn (see 7.2.2.18)
DRAW_ALL_COMPONENTS       switch     [0|1]
    decides whether the atoms of all components are drawn (see 7.2.2.18)
DRAW_SURFACE              switch     [0–4]
    selects the surface mode, 0 : no-surf, 1 : line-surf, 2 : triangle-surf, 4 : Connolly-surf
DRAW_R_VECTORS            switch     [0|1]
    decides whether vectors are drawn at the position where further R-groups can be added (see 7.2.2.18)
DRAW_X_VECTORS            switch     [0|1]
    decides whether a vector is drawn at the position where the current R-group is connected
    to the parent R-group or core (see 7.2.2.18)

SELLAB:
MOL_NAMES                 switch     [0|1]
MOL_IDS                   switch     [0|1]
ATOM_NAMES                switch     [0|1]
    decides whether atom names are drawn (see 7.2.2.20)
INFILE_NUMBERS            switch     [0|1]
    decides whether numbers of the atom in the file are drawn (see 7.2.2.20)
SYBYL_TYPE_STRINGS        switch     [0|1]
    decides whether the sybyl type strings are drawn (see 7.2.2.20)
FRAGMENT_NUMBERS          switch     [0|1]
    decides whether numbers of the fragments are drawn (see 7.2.2.20)

SELCOL:
COMBILIB                  colormode  [ INVISIBLE | ATOM | UNIQUE | FRAGMENT ]
    selects the color mode for combilib instance bonds and atoms (see 7.2.2.19)
GEOMETRY                  colormode  [ INVISIBLE | UNIQUE | CONTACT ]
    selects the color mode for combilib instance geometries (see 7.2.2.19)
MAIN_DIRECTION            colormode  [ INVISIBLE | UNIQUE | CONTACT ]
    selects the color mode for combilib instance main directions (see 7.2.2.19)
SURFACE                   colormode  [ INVISIBLE | UNIQUE | CEN_DIST | SURFPATCH | SURF_ATOM ]
    selects the color mode for the combilib instance surface (see 7.2.2.19)
<object-name>             color      [<colorname>]
    sets the color of the object (COMBILIB, GEOMETRY, SURFACE, ...) (see 7.2.2.19)


10.21 *Optimization/Rigid-body superposition/Scoring parameter file (optpar.dat)

10.21.1 Scoring parameters

The record @scoring_parameters_sp specifies the global scoring parameters, i.e. the coefficients for the different scoring contributions. Different parameter sets may be defined for the different phases of the algorithm (0 base placement, 1 ≤ i ≤ n scoring after placing the ith fragment, -1 scoring after complete placement). The parameter set of phase -2 is defined for the flexible superposition influencing the Gaussian overlap index (see subsection 6.9.1).

@scoring_parameters_sp
# scoring of phase 0, 1, or -1
PHASE    0, 1, -1
G_ED     -0.000057
G_CHG    -236.445496
G_HYD    -8.743366
G_DON    -0.005366
G_ACC    -1.654658
G_DONG   -0.000000
G_ACCG   -0.000000
G_STM    -0.098095
G_OVL    -0.000158
G_ROT    -0.000000

# scoring during postoptimization/superposition
PHASE    -2
G_ED     -0.009921
G_CHG    -996.813477
G_HYD    -29.102539
G_DON    -74.266846
G_ACC    -0.000995
G_DONG   -0.000000
G_ACCG   -0.000000
E_OVL    -8.000000
E_LJP    -2.000000
E_TOR    -2.500000

<PHASE> Placement phase.

<G_ED> Electron density overlap.

<G_CHG> Partial charge overlap.

<G_HYD> Hydrophobicity overlap.

<G_DON> Overlap of hydrogen bonding donor Gaussians.

<G_ACC> Overlap of hydrogen bonding acceptor Gaussians.

<G_DONG> Overlap of hydrogen bonding donor geometry Gaussians.

<G_ACCG> Overlap of hydrogen bonding acceptor geometry Gaussians.

<G_OVL> Van der Waals overlap volume.

<G_STM> Subtree matching score.


<G_ROT> Internal energy of the conformation.

Phase -2: Gaussian overlap parameters of similarity/overlap index:

<G_ED> Electron density overlap.

<G_CHG> Partial charge overlap.

<G_HYD> Hydrophobicity overlap.

<G_DON> Overlap of hydrogen bonding donor Gaussians.

<G_ACC> Overlap of hydrogen bonding acceptor Gaussians.

<G_DONG> Overlap of hydrogen bonding donor geometry Gaussians.

<G_ACCG> Overlap of hydrogen bonding acceptor geometry Gaussians.

Phase -2: Energy parameters:

<E_OVL> Parameter for the overlap index.

<E_LJP> Parameter for the Lennard-Jones potential.

<E_TOR> Parameter for the torsion energy.
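The record layout described above can be read with a few lines of code. The following is a minimal sketch, not FlexS code: it assumes one keyword/value pair per line, '#' as the comment character, and that a PHASE line with several comma-separated numbers applies the following weights to each listed phase. The function name parse_scoring_parameters is hypothetical.

```python
def parse_scoring_parameters(text):
    """Return {phase: {parameter_name: weight}} for one record (illustrative only)."""
    phases = {}
    active = []  # phases that the following weight lines belong to
    for raw in text.splitlines():
        tokens = raw.split("#", 1)[0].split()  # strip comments, split on whitespace
        if not tokens or tokens[0].startswith("@"):
            continue  # skip blank lines and the record header
        if tokens[0] == "PHASE":
            # "PHASE 0, 1, -1" lists several phases sharing one parameter set
            active = [int(t) for t in " ".join(tokens[1:]).replace(",", " ").split()]
            for phase in active:
                phases.setdefault(phase, {})
        else:
            for phase in active:
                phases[phase][tokens[0]] = float(tokens[1])
    return phases


SAMPLE = """\
@scoring_parameters_sp
# scoring of phase 0, 1, or -1
PHASE 0, 1, -1
G_ED -0.000057
G_CHG -236.445496
# scoring during postoptimization/superposition
PHASE -2
G_ED -0.009921
E_OVL -8.000000
"""

params = parse_scoring_parameters(SAMPLE)
print(sorted(params))          # phases found in the record
print(params[-2]["E_OVL"])     # energy weight used in phase -2
```

Keeping one dictionary per phase makes it easy to see that the E_* energy weights exist only for phase -2, while G_STM, G_OVL and G_ROT appear only in the placement phases.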

10.21.2 Flexible optimization parameter

The record @flexible_superposition specifies the stop criteria of optimization.

@flexible_superposition
gradient_tolerance          0.00001
step_size_stop_criterion    yes 0.9
energy_stop_criterion       no
maximal_iter_step           100

<gradient_tolerance> Minimum gradient criterion (see 6.9.3).

<step_size_stop_criterion> Minimum step size criterion (see 6.9.3). Turned on.

<energy_stop_criterion> Minimum energy size criterion (see 6.9.3). Turned off.

<maximal_iter_step> Maximum number of optimization steps (see 6.9.3).
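As an illustration of how the four entries fit together, the sketch below reads such a record into a dictionary. It is an assumption-laden example rather than the FlexS reader: it assumes one entry per line, that a criterion is switched with yes/no, and that an optional number after yes is the threshold of that criterion. The name parse_flexible_superposition is hypothetical.

```python
def parse_flexible_superposition(text):
    """Read a @flexible_superposition record into a dict (illustrative only)."""
    settings = {}
    for raw in text.splitlines():
        tokens = raw.split("#", 1)[0].split()
        if not tokens or tokens[0].startswith("@"):
            continue  # skip blank lines and the record header
        key, values = tokens[0], tokens[1:]
        if values and values[0] in ("yes", "no"):
            # switchable criterion: (enabled, optional threshold)
            threshold = float(values[1]) if len(values) > 1 else None
            settings[key] = (values[0] == "yes", threshold)
        elif key == "maximal_iter_step":
            settings[key] = int(values[0])
        else:
            settings[key] = float(values[0])
    return settings


RECORD = """\
@flexible_superposition
gradient_tolerance 0.00001
step_size_stop_criterion yes 0.9
energy_stop_criterion no
maximal_iter_step 100
"""

stop = parse_flexible_superposition(RECORD)
print(stop)
```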

10.21.3 Rigid-body superposition parameter

The RigFit module is controlled by various parameters which allow a fine tuning of the method. Only a few of them are intended to be modified frequently. These parameters are listed in the command OPTPARAM/SETPAR.

The QUALITY table specifies which quality is used in the Fourier space optimization (recspc) and in the real space optimization (realspc), and which weight factors are used for each quality.

Name (type): <laue_radius> (floating-point)
Description: Determines the number of Laue vectors used in Fourier space. All Laue vectors within a sphere of radius <laue_radius> are considered. This parameter essentially affects the accuracy and the speed of the calculations in Fourier space. Fast calculations can be achieved with a <laue_radius> of ≈ 2.4.
Default value: 3.5
Reasonable range: ≥ 1.5

Name (type): <start_rotation_sample_step> (floating-point)
Description: Determines the step length used in rotational sampling to generate start positions. The default value of 120 degrees gives 15 start rotations. A value of 90 degrees gives 24 start rotations.
Default value: 120
Reasonable range: (0,180]

Name (type): <translation_sample_method> (int)
Description: Determines the method used to generate start positions for translation optimization. Available methods are

• 0: No sampling: the gravity center of the test ligand is placed upon the gravity center of the reference ligand;
• 1: Sampling by approximate bond length: grid with bond length;
• 2: Sampling by atom: the gravity center of the test ligand is placed upon each atom of the reference ligand;
• 3: The gravity center of the test ligand is placed upon the component gravity centers of the reference ligand;
• 4: Each fragment gravity center of the test ligand is placed upon each component gravity center of the reference ligand;
• 5: Rigid grid with 27 start translations.

Default value: 3
Reasonable range: [0..5]
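To make the sampling methods more concrete, the following sketch generates start translations for methods 0, 2 and 5. It is an illustration under stated assumptions, not the RigFit implementation: the gravity center is taken as the unweighted mean of the atom coordinates, and method 5 is assumed to be a 3 x 3 x 3 grid around the reference gravity center with a hypothetical spacing grid_step. Methods 1, 3 and 4 need bond lengths or component and fragment information and are left out.

```python
from itertools import product


def centroid(points):
    """Unweighted mean of 3D points (assumed stand-in for the gravity center)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def start_translations(test_atoms, ref_atoms, method, grid_step=1.5):
    """Return translation vectors that move the test ligand to each start position."""
    c_test = centroid(test_atoms)
    if method == 0:
        # single start: gravity center onto gravity center
        targets = [centroid(ref_atoms)]
    elif method == 2:
        # one start per reference atom
        targets = list(ref_atoms)
    elif method == 5:
        # assumed 3 x 3 x 3 grid (27 points) around the reference gravity center
        c_ref = centroid(ref_atoms)
        targets = [
            tuple(c_ref[i] + offset[i] * grid_step for i in range(3))
            for offset in product((-1, 0, 1), repeat=3)
        ]
    else:
        raise NotImplementedError("method needs component or bond information")
    return [tuple(t[i] - c_test[i] for i in range(3)) for t in targets]


test_ligand = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ref_ligand = [(4.0, 0.0, 0.0), (6.0, 0.0, 0.0), (5.0, 3.0, 0.0)]

print(start_translations(test_ligand, ref_ligand, 0))
print(len(start_translations(test_ligand, ref_ligand, 2)))
print(len(start_translations(test_ligand, ref_ligand, 5)))
```

The number of start translations grows from one (method 0) to the number of reference atoms (method 2) or a fixed 27 (method 5), which is the trade-off between speed and coverage that this parameter controls.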

Name (type): <nof_translation_optimization> (int)
Description: The best <nof_translation_optimization> sample points are optimized.
Default value: 10
Reasonable range: [10,50]

Name (type): <rotation_filter_tolerance> (floating-point)
Description: If fbest is the function value of the best rotation solution, solutions with function values inferior to <rotation_filter_tolerance> * fbest are rejected.
Default value: 0.5
Reasonable range: [0,1]

Name (type): <baseplace_filter_tolerance> (floating-point)
Description: If fbest is the function value of the best base placement, placements with function values inferior to <baseplace_filter_tolerance> * fbest are rejected.
Default value: 0.5
Reasonable range: [0,1]

Name (type): <glob_opt_filter_tolerance> (floating-point)
Description: If fbest is the function value of the best translation solution, solutions with function values inferior to <glob_opt_filter_tolerance> * fbest are rejected.
Default value: 0.5
Reasonable range: [0,1]
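The three filter tolerances (rotation, base placement, global optimization) follow the same rule, which can be sketched as below. This is only an illustration: it assumes that larger function values are better and that the values are positive, so that "inferior" means smaller than tolerance * fbest. The function name filter_solutions is hypothetical.

```python
def filter_solutions(solutions, tolerance=0.5):
    """Keep (label, value) pairs whose value is at least tolerance * best value.

    Assumes larger values are better and all values are positive.
    """
    if not solutions:
        return []
    f_best = max(value for _, value in solutions)
    return [(label, value) for label, value in solutions if value >= tolerance * f_best]


candidates = [("a", 10.0), ("b", 6.0), ("c", 4.9)]
print(filter_solutions(candidates, 0.5))
```

With a tolerance of 0 nothing is rejected, and with a tolerance of 1 only solutions as good as the best one survive, which matches the stated range [0,1].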

Name (type): <rotation_cluster_tolerance> (floating-point)
Description: Two rotation solutions are clustered if the difference of the two rotation angles is smaller than <rotation_cluster_tolerance> degrees.
Default value: 0.1
Reasonable range: [0,0.5]

Name (type): <translation_cluster_tolerance> (floating-point)
Description: Two translation solutions are clustered if the difference of the two translation vectors is smaller than <translation_cluster_tolerance> Angstroms.
Default value: 0.1
Reasonable range: [0,0.5]
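One simple way to apply such a cluster tolerance to translation solutions is a greedy pass that keeps a solution only if it is not within the tolerance of an already kept one. The manual does not state which clustering scheme RigFit uses, so the sketch below is an assumed stand-in; it takes the "difference of the two translation vectors" to be their Euclidean distance and expects the solutions sorted best first.

```python
import math


def cluster_translations(vectors, tolerance=0.1):
    """Greedy clustering: keep the first vector of every group closer than tolerance.

    vectors is assumed to be sorted best first, so each cluster is
    represented by its best member.
    """
    representatives = []
    for v in vectors:
        if all(math.dist(v, r) >= tolerance for r in representatives):
            representatives.append(v)
    return representatives


solutions = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.08, 0.0)]
print(cluster_translations(solutions, 0.1))
```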

Name (type): <function_difference_tolerance> (floating-point)
Description: Special parameter related to the tightness of a stop criterion: If, during several optimisation steps, no sizable reduction of the function can be obtained (the reduction is smaller than <function_difference_tolerance>), then the optimisation will be aborted. (A value of 10^-4 can strongly decrease the number of iterations.)
Default value: 0.001
Reasonable range: [0.00001,0.01]
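The stop criterion can be pictured as a check on the recent history of function values. The sketch below is an assumption, not the RigFit code: the manual only says "several" steps, so the window length patience is a hypothetical parameter, and the function is assumed to be minimized.

```python
def should_abort(history, tolerance=0.001, patience=3):
    """Return True if each of the last `patience` steps reduced the
    function value by less than `tolerance`.

    history holds the function values in order, most recent last.
    """
    if len(history) <= patience:
        return False  # not enough steps to judge
    recent = history[-(patience + 1):]
    return all(prev - cur < tolerance for prev, cur in zip(recent, recent[1:]))


stalled = [10.0, 5.0, 4.9995, 4.9993, 4.9992]
improving = [10.0, 5.0, 4.0, 3.9995, 3.9993]
print(should_abort(stalled), should_abort(improving))
```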

10.21.4 RIF optimization parameter sets

The table in section @rif_alignment specifies the number of solutions which are flexibly relaxed with a 4-step optimization method. After performing a coarse alignment by rigidly superimposing the given match points (step 1) (see section 6.8.1.3), the complexity of the flexible alignment, and with it the run time of the computation, is increased in the next steps. First, only the RMSD of the match points is flexibly optimized (step 2), then the shape of the molecules is also taken into account (step 3), and finally all Gaussian properties of the FlexS scoring function are used (step 4).

There are four predefined settings that can be configured here. The very fast SCREEN mode is basically meant for automatically filtering out unfavorable alignments in screening runs. The INTERACTIVE mode should be used if you use FlexS interactively in order to visually inspect the alignments. If the alignments generated in this mode are not sufficient, you can also use the PRECISE mode, which, however, takes rather long. Finally, you can configure the settings that are used for the command line option -ri (see 5.2.11) in the line CLIRIF.

The default mode which is used in the command ALIGN (see section 6.8.2) is the INTERACTIVE mode. It can be changed with the command SWITCHP (see section 6.8.3) to the SCREEN or PRECISE mode. Alternatively, user defined parameters can be set with this command.

@rif_alignment
#----------------------------------------------------------------------------
# Mode          Nof     Max     Max     Max     Max
#               sol     iter    iter    iter    iter
#                       step 1  step 2  step 3  step 4
#----------------------------------------------------------------------------
SCREEN          1       0       50      20      10
INTERACTIVE     1       -3      50      20      30
PRECISE         3       -3      50      50      50
CLIRIF          1       0       50      20      10

If any of the parameters is set to 0, the corresponding optimization step is switched off. The value for the maximum number of iteration steps for the first optimization step can be negative. In this case the maximum number of iterations will be set to the absolute value multiplied by the square root of the fragment number of the test ligand.
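The rule for negative step 1 values can be written out as a short calculation. The sketch below is illustrative only; the manual does not say how a fractional result is rounded, so rounding up is an assumption, and the function name is hypothetical.

```python
import math


def effective_step1_iterations(max_iter, fragment_count):
    """Resolve the step 1 iteration limit from a @rif_alignment table entry.

    0 switches the step off, positive values are used as given, and negative
    values are scaled by the square root of the test ligand's fragment count
    (rounded up here, which is an assumption).
    """
    if max_iter >= 0:
        return max_iter
    return math.ceil(abs(max_iter) * math.sqrt(fragment_count))


# INTERACTIVE mode uses -3 for step 1; SCREEN uses 0 (step switched off)
print(effective_step1_iterations(-3, 16))
print(effective_step1_iterations(0, 16))
```

For a test ligand with 16 fragments, the INTERACTIVE entry of -3 therefore corresponds to 3 * 4 = 12 iterations, so larger, more flexible ligands automatically receive more iterations in the first step.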


11 *Program interfaces

11.1 Interface to PYTHON

From version 2.1, the interface to Python no longer exists.

11.2 Interface to WHATIF

From version 1.9, the interface library for connecting FlexS to WHATIF is no longer maintained.

11.3 Interface to SCA

The interface to SCA no longer exists.

11.4 Interface to CORINA

CORINA [4, 16] is a 3D structure generator used by FlexS to generate ring conformations or to clean up ligand molecule structures. FlexS requires CORINA versions no older than v2.6.

In the latest versions of CORINA, the driver option nh is no longer supported. If your CORINA is that new, please remove the nh from the respective CORINA call of RCGENERATOR in your config_sp.dat.

Ring conformation generation:

• External usage of CORINA: The value of <RING_MODE> must be set to 1, and the flag value of <RCGENERATOR> must point to the CORINA executable to use.

CORINA has a specially tailored interface for FlexS which is activated by the driver option '-d flexx'. Every ring system will then be written into corina_in_*.mol2 in the temporary directory specified in the configuration file. CORINA creates conformations which are written into files named corina_out_*.mol2 (and some temporary files). FlexS subsequently processes these files. Tracing information and error output will be written into the standard CORINA trace file named corina.trc in the current directory.


For the generation of stereoisomers of rings (which is switched off by default), it is necessary to activate the respective switch for CORINA. This is done by adding stergen to the list of CORINA flags in the line starting with <RCGENERATOR> in your configuration.

• With Release 1.31, FlexS comes with an integrated version of CORINA which can optionally be used to compute ring conformers for superposition.

FlexS can be configured to use this built-in CORINA if you set the value of RING_MODE to 3 in config_sp.dat or explicitly in the command line interface.

Please note that stereoisomerism in rings is currently not supported with the internal version of CORINA.

Please note: To use the built-in CORINA you need a CORINA_F license, which is available from BioSolveIT.

General compound cleanup (3D coordinate generation):

Here <3DGENERATOR> must point to the CORINA executable to be used. For this purpose CORINA has a specially tailored interface for FlexS, switched on by the driver option '-i t=sdf -o t=sdf'. Every ligand molecule structure will be written into flexclean__in_*.sdf in the specified temporary directory (see above). CORINA creates its output structures in files named flexclean__out_*.sdf. Error output will again be written into flexclean__trc_*.trc in your temporary directory.

11.5 Interface to CONFORT

The interface to CONFORT works in the same way as the CORINA interface described above.

11.6 The FlexV graphical interface

The second visualization software interfaced to FlexS is FlexV. Because it is an in-house development, FlexV supports all the graphic features of FlexS. Thus, if you do not already have a specific preference, we suggest you use FlexV.

When you execute a display command, a FlexV viewer is started and linked to FlexS automatically. The linking is done via named pipes (_pipe_*). The pipes are used for sending commands only. The graphics are stored in .gdf files. If the graphics files are temporary, they are written into the directory specified with the parameter TEMP, with flexv_tmp_*.gdf as filenames.

Sometimes it is useful to have two FlexV windows open at the same time, for example when comparing two results. This can easily be done: type toflexv b at the FlexS prompt. This command, toflexv, sends commands to FlexV in the same way as the internal drawing functions do. The parameter 'b' stands for 'break the pipes'. As a result, your running FlexV will be disconnected from FlexS and stay separate from it. When you now draw your second object and type 'display', FlexS will recognize that your previous instance of FlexV has been disconnected and will start a new one. Admittedly, the drawback of this is that you cannot reconnect to the first FlexV later on. (For more on commands like this, please refer to the FlexV documentation, which is freely available from the BioSolveIT web site http://www.biosolveit.de/download.)

Remember that FlexV is only a visualization tool and knows (almost) nothing about molecules. It is therefore not possible to change the coloring of atoms or bonds or to change the labels in FlexV. All these actions must be done in FlexS before executing the DRAW commands.


IV

APPENDIX


A Sample configuration file

Here is a sample configuration file. Hashes mark comment lines. Your file may be quite different from this one; one of the most important changes you will have to make is the @ROOTDIR variable.

#--------------------------------------------------------------------------#
#                                                                          #
# CONFIG_SP.DAT: Configuration file for the FlexS program, version 2.1     #
#                                                                          #
#--------------------------------------------------------------------------#

# --------------------------------------------------------------------------
# LICENSE FILES
# --------------------------------------------------------------------------
# DESCRIPTION:
# FlexS starts only with valid license strings. These can be either
# license strings issued by Tripos, Inc. or by BioSolveIT GmbH.
# --------------------------------------------------------------------------
@LICENSE_FILES
__CONFIG_LICDIR__/flexs.lic.

# --------------------------------------------------------------------------
# ROOT DIRECTORY
# --------------------------------------------------------------------------
# DESCRIPTION:
# FlexS needs the definition of some paths and files, for example where
# it should search for test ligands, reference ligands, etc. To make the
# definition of these paths easier, you can define a root directory.
# Each subsequent path definition starting with a directory name or
# ../ is then relative to this directory. Note that ./ is relative
# to the current directory.
# REMARKS:
# The root directory should be an absolute path.
#---------------------------------------------------------------------------
@ROOTDIR __CONFIG_ROOTDIR__

# --------------------------------------------------------------------------
@DIRECTORIES
# --------------------------------------------------------------------------
# DESCRIPTION:
# FlexS uses user-defined standard directories for all kinds of files.
# This is the place where you can define them. If a filename is entered
# in FlexS, the program looks first for a local file and then in the
# standard directory. The following standard directories can be defined:
#
# Each path may contain multiple directory names separated by ':' or '+'.
# Directory names may be relative or absolute as described in section
# 'ROOT DIRECTORY'.
#
# Ligand:
#   LIGAND   : location of mol2 files of the ligands


# Optional FlexS modules:
#   COMBILIB : location of combinatorial library input files
# Misc.:
#   SCRIPT   : location of batch files
#   HELP     : location of the FlexS User Guide and shorthelp_sp file
#   LIF      : location of the library information files (lif)
#   PREDICT  : place where output of FlexS is written to
#   TEMP     : space for temporary files
#   PVM_TEMP : space for pvm temporary files
#
# REMARKS:
# If more than one user is working with FlexS in the same temp space,
# problems can occur with SCA.
# --------------------------------------------------------------------------
LIGAND .+./lig/+example/lig/

#application COMBISUPER
COMBILIB .+./clib/+example/clib/
#end_application

LIF predict/

SCRIPT .+example/bat/
HELP doc/

PREDICT predict/
TEMP /tmp/
#system Windows
TEMP tmp
#end_system
#system Linux Linuxx86_64
PVM_TEMP tmp/
#end_system

# --------------------------------------------------------------------------
@STATIC_DATA
# --------------------------------------------------------------------------
# DESCRIPTION:
# FlexS needs a lot of background knowledge which is stored in a set of
# ASCII data files. A description of the content of the different
# files can be found in the User Guide.
# REMARKS:
# FlexS must have all these files. If one of them is not present under
# the described path, FlexS terminates directly in the initialization phase.
# SHORT DESCRIPTIONS:
# (for a more detailed description refer to the manual)
#   SETTINGS : tool specified parameters
#   CHEMPAR  : general chemistry knowledge of FlexS
#   GEOMETRY : interaction geometries
#   FCHARGES : formal charges
#   DELOC    : delocalization of electrons
#   CONTACT  : intermolecular contacts
#   TORSION  : torsional angle database (see flag TORSION_MODE)
#              torsion_standard.dat (=>TORSION_MODE 0)
#              torsion_fine.dat (=>TORSION_MODE 0/1/2)
#
#   Note: The files torsion_standard.dat and torsion_fine.dat
#         should be unpacked from the package, therefore the following lines
#           TORSION static_dat/torsion_fine.dat
#         and
#           TORSION static_dat/torsion_standard.dat
#         have to be given below. After the two files have been
#         unpacked, one of the lines must be removed according
#         to which torsion database should be used in FLEXS.
#


#   CONTYPE  : contact types
#   GAUSSIAN : Gaussian representation
#   OPTPAR   : optimization and scoring parameters
#   GRAPHIC  : graphic output, colors
# --------------------------------------------------------------------------
SETTINGS static_data/flexs_settings.dat
CHEMPAR static_data/chempar.dat
GEOMETRY static_data/geometry_sp.dat
FCHARGES static_data/fcharges.dat
DELOC static_data/delocalized.dat
CONTACT static_data/contact_sp.dat
TORSION static_data/torsion_fine.dat
TORSION static_data/torsion_standard.dat
CONTYPE static_data/contype_sp.dat
GAUSSIAN static_data/gaussian.dat
OPTPAR static_data/optpar.dat
GRAPHIC static_data/graphic_sp.dat

#SMARTS support
TRANSFORM static_data/transform.dat

# --------------------------------------------------------------------------
@PROGRAMS
# --------------------------------------------------------------------------
# DESCRIPTION:
# FlexS executes some external programs. All the programs you need are
# distributed with FlexS (except the editor!).
# REMARKS:
# With the aid of so-called directives (#system <name> and #end_system) you
# may specify program names for different operating systems simultaneously.
# Currently Linux, Linuxx86_64, Windows and Darwin (MacOSX) are supported.
#   EDITOR        : must be defined if you want to edit static data files
#   PDF_VIEWER    : must be defined if you want to use the manual online
#   FLEXS         : location of the FlexS executable program
#   FLEXV         : must be defined if you want to visualize results with the
#                   FlexV-3D viewer
#   RCGENERATOR   : optional but highly recommended, must be defined if you
#                   want to handle flexible ring systems
#   3DGENERATOR   : optional, must be defined if you want to clean up 3D coordinates
#                   of a ligand
#   PCGENERATOR   : optional, must be defined if you want to read molecule
#                   files without partial charges input
#   CONFGENERATOR : optional, must be defined if you want to use an external
#                   conformation generator for the command ALIGN.
# --------------------------------------------------------------------------
EDITOR /usr/bin/vi
PDF_VIEWER ./acroread

# LINUX versions
#system Linux

FLEXS flexs
FLEXV flexv -size 900 750 -pos 400 400
RCGENERATOR __CONFIG_BINDIR_CORINA__ -d flexx
3DGENERATOR __CONFIG_BINDIR_CORINA__ -i t=sdf,dummies -o t=sdf,m2l
PCGENERATOR __CONFIG_BINDIR_PCGENERATOR__
CONFGENERATOR __CONFIG_BINDIR_CORINA__ -d stergen,rc,mc=5 -i t=mol2,dummies -o t=mol2

#end_system

# Intel based PCs with Linux (64bit)
#system Linuxx86_64

FLEXS flexs
FLEXV flexv -size 900 750 -pos 400 400
RCGENERATOR __CONFIG_BINDIR_CORINA__ -d flexx
3DGENERATOR __CONFIG_BINDIR_CORINA__ -i t=sdf,dummies -o t=sdf,m2l
PCGENERATOR __CONFIG_BINDIR_PCGENERATOR__
CONFGENERATOR __CONFIG_BINDIR_CORINA__ -d stergen,rc,mc=5 -i t=mol2,dummies -o t=mol2


#end_system

# Windows NT PC version (32 bit)
#system Windows

FLEXS flexs.exe
FLEXV flexv.exe
RCGENERATOR __CONFIG_BINDIR_CORINA__ -d flexx
3DGENERATOR __CONFIG_BINDIR_CORINA__ -i t=sdf,dummies -o t=sdf,m2l
PCGENERATOR __CONFIG_BINDIR_PCGENERATOR__
CONFGENERATOR __CONFIG_BINDIR_CORINA__ -d stergen,rc,mc=5 -i t=mol2,dummies -o t=mol2

#end_system

# ------------------------------------------------------------------------ #
# END OF PATH, FILE AND PROGRAM DEFINITION                                 #
# the remaining part of this file is not important for installing FlexS   #
# ------------------------------------------------------------------------ #

# --------------------------------------------------------------------------
@FLAGS
# --------------------------------------------------------------------------
# DESCRIPTION:
# Elementary features of FlexS are controlled by a set of flags. You can
# define the default value for these flags here. The flags and their
# values are:
# VERBOSITY: defines the level of verbosity, the following is output:
#   0 : nearly silent, only warnings, error messages and direct results
#   1 : user messages, displaying the program flow
#   2 : important statistical output
#   3 : runtime information (if the output is directed into a file, the
#       verbosity level must be less than 3)
#   9 : all except control output of READ and WRITE commands
# USER_MODE: defines the user mode
#   0 : off, complete menu not present
#   1 : standard user mode
#   2 : advanced user mode
#   3 : special user mode, all available commands are present
#   4 : special user mode + debug information in error cases
# PRINT_TIMES: printing runtimes
#   0 : off
#   1 : on, prints the elapsed process time after each command
# PRINT_SIZE: printing current process size (KB)
#   0 : off
#   1 : on, prints the process size after each command
# SIZE_LIMIT: stops the FlexS process, if the memory requirement
#   exceeds the limit
#   0 : off
#   x : size limit of x KB (check after each command)
# MOL_NAME: switches between different output name strings
#   1 : mol2 molecule name (from input file)
#   2 : mol2 molecule name + infile number (multi-mol file) or
#       mol2 molecule name + solution number (placement solution)
#   3 : output filename + infile number (multi-mol file) or
#       output filename + solution number (placement solution)
# USE_RL_TRANSFORMS:
#   0 : use old molecule initialisation
#   1 : the rules defined in TRANSFORM static data file are applied upon
#       loading a reference ligand.
# USE_TL_TRANSFORMS:
#   0 : use old molecule initialisation
#   1 : the rules defined in TRANSFORM static data file are applied upon
#       loading a test ligand.
# RIGID_TORSIONS: switches flexibility of torsion angles on or off
#   0 : on
#   1 : off
# RING_MODE: way of computing ring conformation
#   0 : RIGID RINGS (take input coordinates)


#   1 : CORINA
#   2 : CONFORT
#   3 : internal CORINA
# TORSION_MODE: selection of the torsion angle model
#   If the variable TORSION is set to torsion_standard.dat, the flag
#   TORSION_MODE must be 0. But if the variable TORSION is set to
#   torsion_fine.dat, the flag TORSION_MODE may be 0, 1 or 2.
#   But it should be 1.
#   0 : old MIMUMBA model (torsion angles are intersections of torsion
#       angle sets of different patterns)
#       static-data file: torsion_fine.dat
#   1 : new MIMUMBA model (merging the statistics)
#       Remark: you must change the torsion database together with
#       this flag
#       static-data file: torsion_fine.dat
#   2 : new MIMUMBA model without preselection of angles (take all 10 deg.)
#       Remark: you must change the torsion database together with
#       this flag
#       static-data file: torsion_fine.dat
# SECONDARY_TORSION_MODE: calculation of torsion angles
#   0 : off
#   1 : on, if no torsion angles for a rotatable bond are defined in the torsion db,
#       the torsion potential is calculated in the force field and appropriate
#       angles are selected
# ASSIGN_EXT_PARTIAL_CHARGE: this enables or disables the use of an
#   external partial charge generator for the molecule initialization
#   0 : off
#   1 : on, use partial charge generator defined by PCGENERATOR
# KEEP_RCGEN_FILES:
#   0 : delete corina in/out/trace files
#   1 : save all RC generator (CORINA, CONFORT) files and display filenames
#       on the screen
# SUPERPOSITION_MODE: Selection of the type of superpositioning
#   0 : usual flexible ligand superpositioning (FlexS)
#   1 : usual rigid-body ligand superpositioning (RigFit)
#   2 : database screening (RigFit)
# OPTIMIZE: mode for the flexible postoptimization
#   0 : similarity (overlap) index will be used as energy function
#   1 : superpos-energy will be used as energy function
# KEEP_3DGEN_FILES:
#   0 : delete cleanup in/out/trace files
#   1 : save all cleanup program files and display filenames on the screen
# SDF_MOL_ID_TYPE: determines the field from which the molecule ID in an SD
#   file is taken
#   0 : First line of header block
#   1 : Property line with the name given by SDF_MOL_ID_STRING
#   2 : Property line starting/ending with <ID.. / <id.. / ..ID> / ..id>
#   3 : Take the SDF_MOL_ID_NUMth field in the data section
# SDF_MOL_ID_NUM: determines the field from which the molecule ID in an SD
#   file is taken, if SDF_MOL_ID_TYPE is set to 3
#   x : molecule ID is read from the xth property line of type '> <..>'
# USE_CONF_GEN: enables or disables the use of conformation generator
#   for the command RIF/ALIGN
#   0 : off
#   1 : use internal generator, exclude given conformation of the template
#   2 : use internal generator, include given conformation of the template
#   3 : use external generator, exclude given conformation of the template
#   4 : use external generator, include given conformation of the template
# EXT_CONF_GEN_FORMAT: defines the input format for the external conformation
#   generator.
#   0 : input: name - output: multi MOL2
#   1 : input: MOL2 - output: multi MOL2
# USE_PVM_FEATURE:
#   0 : a batch script is executed only sequentially
#   1 : the execution of a parallel batch file is available
#


# --------------------------------------------------------------------------
# User interface
VERBOSITY 3
USER_MODE 3
PRINT_TIMES 0
PRINT_SIZE 0
SIZE_LIMIT 0

# Ligand data preparation
RIGID_TORSIONS 0
RING_MODE 3
TORSION_MODE 0
SECONDARY_TORSION_MODE 0
KEEP_RCGEN_FILES 0
USE_RL_TRANSFORMS 1
USE_TL_TRANSFORMS 1
ASSIGN_EXT_PARTIAL_CHARGE 0

# Type of superpositioning
SUPERPOSITION_MODE 0

# Mode of flexible postoptimization
OPTIMIZE 0

# Preprocessing and consistency checking of input data
KEEP_3D_GEN_FILES 0
SDF_MOL_ID_TYPE 1
SDF_MOL_ID_NUM 1
MOL_NAME 2

# RIF conformation generator flags
USE_CONF_GEN 1
EXT_CONF_GEN_FORMAT 1

# PVM flag
USE_PVM_FEATURE 1

# --------------------------------------------------------------------------
@ID_STRINGS
# --------------------------------------------------------------------------
# DESCRIPTION:
# This optional section is for parsing SD files (SDF) only. The field names
# that determine the molecule ID can be defined here. If they are not defined,
# "ID" is assumed for SDF. Both values cannot be changed at FlexS runtime!
# --------------------------------------------------------------------------
SDF_MOL_ID_STRING ID

# --------------------------------------------------------------------------
@parallel
# --------------------------------------------------------------------------
# DESCRIPTION:
# This section is for FlexS with PVM interface only. It describes the hosts
# where a parallel script can be executed. After each host name, the maximum
# number of allowed processes and the optional nice value follows (max 39).
# --------------------------------------------------------------------------
#

@ALIASES
# --------------------------------------------------------------------------
# DESCRIPTION:
# To make working with FlexS easier, you can define an alias for each
# command and submenu in FlexS. Commands and submenu names are not
# case sensitive.
# REMARKS:
# You can only define ONE alias per command or submenu.
# Even first arguments can be determined together with a command as one
# alias, using the pipe character ’|’ as separator for following arguments,
# eg. "alias SET|verbosity|10 TELLALL"
# --------------------------------------------------------------------------
# general commands / main menu commands
END E
QUIT Q
QUIT|Y X
SCRIPT S
EDITCFG EC
SETPAR SP
TOFLEXV TOF
LIST L
HELP H
# submenus
TEST_LIG TL
REF_LIG RL
SUPERPOS SUP
EVALUATE EV
DATABASE DB
OPTPARAM OP
#application COMBISUPER
CSUPER CS
CLIB CL
#end_application
# special menu commands
WRITE W
WRITEG WG
MERGEG MG
DELETEG DG
READ R
READREF RR
QUERY Y
LISTSOL LS
LISTMAT LM
LISTALL LA
SELBAS SB
PLACEBAS PB
COMPLEX CC
# graphic alias
DISPLAY GO
SELGRA SG
SELCOL SC
SELADM SA

End of the sample configuration file config_sp.dat.
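
The alias convention described in the @ALIASES section (a command, optionally followed by preset arguments joined with the pipe character, then the alias name) can be illustrated with a small sketch. This is not FlexS code: `parse_alias_line` and `expand` are hypothetical helper names, and the sketch assumes each alias line holds exactly two whitespace-separated tokens and that aliases, like commands, are matched case-insensitively.

```python
def parse_alias_line(line):
    """Split 'COMMAND|arg1|arg2 ALIAS' into (ALIAS, [COMMAND, arg1, arg2])."""
    target, alias = line.split()
    return alias.upper(), target.split("|")


def expand(tokens, aliases):
    """Replace a leading alias by its command and preset arguments."""
    if not tokens:
        return []
    head = aliases.get(tokens[0].upper())
    return head + tokens[1:] if head else tokens


# Alias lines taken from the sample file and from its REMARKS example.
alias_lines = ["QUIT Q", "QUIT|Y X", "SET|verbosity|10 TELLALL"]
aliases = dict(parse_alias_line(line) for line in alias_lines)

print(expand(["x"], aliases))        # ['QUIT', 'Y']
print(expand(["tellall"], aliases))  # ['SET', 'verbosity', '10']
```

Under these assumptions, typing the alias X behaves like QUIT with the first argument Y already supplied.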


B Examples of script files

The following examples are intended to help you set up batch files.

B.1 Example I: Flexibly superpose a pair of ligands (1stTest.bat)

Example

# Flexibly superpose a pair of ligands
# C.Lemmen
# 04.12.98
#
# Variables to be provided as parameters in the command
# line are: $(reflig) = name of the reference ligand
#           $(testlig) = name of the test ligand
############################################################

OUTPUT " >> Superposing " $(testlig) " flexibly onto " $(reflig)
OUTPUT "-----------------------------------------------------------"

# Part 1: Loading data
REF_LIG
  READ $(reflig)
END
TEST_LIG
  READ $(testlig)
  SELBAS a                      # Automatically select base frag.
END

# Part 2: compute placements
SUPERPOS
  PLACEBAS 3                    # Place base frag. with triangle alg.
  COMPLEX all                   # Add all fragments

# Part 3: results
  LISTSOL 30                    # Generate output
END
TEST_LIG
  WRITE $(PREDICT)/1stTest_pred y y c 1-10
END
OUTPUT " >> Done. Results written to "
OUTPUT " " $(PREDICT) "1stTest_pred.mol2"

# Part 4: cleanup and quit
DELALL y                        # delete data
QUIT y

B.2 Example II: Flexibly superpose a set of ligand pairs sequentially (flexs.bat)

Example

# Flexibly superpose a set of ligand pairs sequentially
# C.Lemmen
# 14.02.98
###########################################################

FOR_EACH $0 $1 $2 $3 in "testset.list"  # Read every line in this file
                                        # not starting with ’#’ and fill in
                                        # the variables $0 to $3
  TIMER start                           # Start timer
  TEST_LIG                              # Go to Test Ligand menu
    READ $0/$2_min_h                    # Read the test ligand
    SELBAS m $3                         # Select a base fragment manually
                                        # as specified in $3
    READREF $0/$2_kpl                   # Read the reference coordinates
  END
  REF_LIG                               # Go to Reference Ligand menu
    READ $0/$1_kpl_h                    # Read the reference ligand
    TRIHASH                             # Build hashing table
  END
  OUTPUT "Preparation time:"            # Print preparation time ...
  TIMER stop                            # ... with the stop time command
  SUPERPOS                              # Go to Superpos menu
    PLACEBAS                            # Base placement
    COMPLEX all                         # Complex construction
    LISTSOL 30                          # Generate output
    SELOUTP red_testset a               # Redirect output into file red_testset.log
    INFO l                              # Generate short output (one line)
    SELOUTP screen                      # Redirect output back to the screen
  END
  DELALL                                # Delete all data structures
END_FOR                                 # End of loop for systems
QUIT y                                  # finish FlexS session

One line of the corresponding file testset.list looks as follows:

carboxyptd 1cbx 2ctc "1 6 7"

The variable expressions in the batch files expand to the following filenames:

Expression      Meaning
$0/$2_min_h     carboxyptd/2ctc_min_h
$3              1 6 7
$0/$2_kpl       carboxyptd/2ctc_kpl
$0/$1_kpl_h     carboxyptd/1cbx_kpl_h
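
The way a FOR_EACH line fills the variables can be mimicked outside FlexS with a short sketch. It is an illustration only: it assumes that the fields of a list line are assigned positionally to $0, $1, ... and that double quotes group several words into one field, as the sample line suggests. The helper names `read_list_line` and `expand` are hypothetical.

```python
import re
import shlex


def read_list_line(line):
    """Split one list-file line into fields; double quotes group a field."""
    return shlex.split(line)


def expand(expression, fields):
    """Replace each $N in the expression by the N-th field of the line."""
    return re.sub(r"\$(\d+)", lambda m: fields[int(m.group(1))], expression)


fields = read_list_line('carboxyptd 1cbx 2ctc "1 6 7"')
print(expand("$0/$1_kpl_h", fields))  # carboxyptd/1cbx_kpl_h
print(expand("$3", fields))           # 1 6 7
```

A check like this can be useful before a long batch run, to confirm that every line of a list file produces the filenames that are actually present on disk.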


B.3 RigFit example I: Rigid-body superpose a set of ligands all against all (rigfit.bat)

Example

# Rigid-body superpose a set of ligands all against all
# C.Lemmen 14.02.98
###########################################################
SET SUPERPOSITION_MODE 1            # Rigid-body mode
OPTPARAM                            # Go to OPTPARAM menu
  SETPAR laue_radius 2.0            # Select low resolution
END                                 # for high-speed comparison
# Input ligand set 1 and ligand set 2
TEST_LIG                            # Go to Test Ligand menu
  FOR_EACH $0 $1 in "lig1.list"     # For each (multi)-mol2 file
    READ $0_min_h                   # specified in list1, load
  END_FOR                           # the entire (multi)-mol2 file
END
REF_LIG                             # Go to Reference Ligand menu
  FOR_EACH $2 $3 in "lig2.list"     # For each (multi)-mol2 file
    READ $2_min_h                   # specified in list2, load
  END_FOR                           # the entire (multi)-mol2 file
END

# All by all superposition
TIMER start                         # Start timer
FOR_EACH $8 fromto 1 $1             # Enumerate all ligands of list1
  TEST_LIG                          # Note that $1 needs to specify
    SELECT $8                       # the accumulated number of
  END                               # all test ligands loaded
  FOR_EACH $9 fromto 1 $3           # Enumerate all ligands of list2
    REF_LIG                         # Note that $3 needs to specify
      SELECT $9                     # the accumulated number of
    END                             # all reference ligands loaded
    SUPERPOS                        # Go to the SUPERPOS menu
      PLACEBAS o
      SELOUTP cross-test a          # Redirect output into file cross-test
      INFO l                        # Generate short output (one line)
      SELOUTP screen                # Redirect output back to the screen
    END
  END_FOR                           # End of inner loop
END_FOR                             # End of outer loop
OUTPUT "Cross test time consumption:"
timer stop                          # Output time consumption

An example file lig?.list looks as follows:

1cbx 10
2ctc 20
3cpa 30

In this case the cumulative number of ligands loaded from the multi-mol2 files 1cbx.mol2, 2ctc.mol2 and 3cpa.mol2 would be set to 30.
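
Because the second column has to hold a running total rather than a per-file count, it can be convenient to generate the list file instead of typing it. The sketch below is a hypothetical helper, not part of FlexS; the per-file counts of 10 are assumed values chosen to reproduce the example above.

```python
from itertools import accumulate


def cumulative_list(counts):
    """Turn (name, ligands_in_file) pairs into 'name running_total' lines."""
    names = [name for name, _ in counts]
    totals = accumulate(n for _, n in counts)
    return [f"{name} {total}" for name, total in zip(names, totals)]


# Assumed per-file ligand counts (10 each), matching the example list.
lines = cumulative_list([("1cbx", 10), ("2ctc", 10), ("3cpa", 10)])
print("\n".join(lines))
```

The last line of the generated file then carries the total that the enumeration loops in rigfit.bat rely on.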


B.4 RigFit example II: Rigid-body screen a fragment against a set of DB ligands (screen.bat)

Example

# Rigid-body screening a fragment against set of DB ligands
# C.Lemmen 01.01.2001
###########################################################
SET VERBOSITY 0                         # Minimum of output
SET SUPERPOSITION_MODE 2                # Screening mode

# Read fragment
TEST_LIG                                # Go to Test Ligand menu
  READ <LigandWithFragmentOFInterest>   # Read ligand with
  SELBAS m <SpecifyFragmentOfInterest>  # desired base fragment
END

# Read the reference ligand database in chunks of 1000
FOR_EACH $0 IN "DB.list"
  REF_LIG                               # Go to Reference Ligand menu
    DELETE all                          # Clean up first
    READ database/$0 all                # Read multi-mol2
  END

  # RigFit screening
  TIMER start                           # Start timer
  SUPERPOS                              # Go to the SUPERPOS menu
    SELOUTP db_screen a                 # Redirect output to file
    SCREEN 1 2.0                        # Do the fragment screening
                                        # against all reference
                                        # ligands currently loaded
    SELOUTP screen                      # Redirect output back
  END
  OUTPUT "screening time for \$0"
  timer stop                            # Output time consumption
END_FOR                                 # End for all DB files

DB.list contains a list of multi-mol2 files, each holding at most 1000 compounds. The screening output, which is a score per compound, is stored in db_screen.log. The format is: scr1 scr2 scr3 scr4 scr5 | refFilenam_xxx refName | testFilenam_xxx testName, where scr1–scr5 are score columns that accommodate up to five fragments selected on the test ligand, xxx gives the number of the compound in a multi-mol2 file, and refName, testName are the respective compound names.
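
For post-processing, one record of db_screen.log could be read along the following lines. This is only a sketch under assumptions: the three parts are taken to be separated by the ’|’ character exactly as in the format description, scores are taken to be plain decimal numbers, and the sample record is invented for illustration. The function names are hypothetical.

```python
def split_entry(entry):
    """Split 'filename_xxx' into the file stem and the compound number xxx."""
    stem, _, number = entry.rpartition("_")
    return stem, int(number)


def parse_screen_line(line):
    """Parse one 'scores | ref | test' record into a dictionary."""
    scores_part, ref_part, test_part = (part.strip() for part in line.split("|"))
    ref_entry, ref_name = ref_part.split(None, 1)
    test_entry, test_name = test_part.split(None, 1)
    return {
        "scores": [float(s) for s in scores_part.split()],
        "ref": split_entry(ref_entry) + (ref_name,),
        "test": split_entry(test_entry) + (test_name,),
    }


# Invented record; the real number format in db_screen.log may differ.
record = parse_screen_line(
    "0.81 0.00 0.00 0.00 0.00 | chunk01_017 cmpdA | frag_001 cmpdB"
)
print(record["scores"][0], record["ref"], record["test"])
```

Sorting the parsed records by the first score column would then give a ranked hit list for a single selected fragment.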


B.5 Examples for the alignment of combinatorial libraries

Example

# Various examples of combinatorial libraries alignment
# M.Lilienthal 05.12.02
#########################################################################
set PREDICT /home/markus/tmp/tmp/
set store_placement_mode -1         # The placements file will be pdf file
set keep_all_scores_achieved 1      # FlexS keeps the scores achieved for
                                    # all placements
set verbosity 2

# Read the reference ligand
ref_lig
  read ref_ligand
  trihash
end

# Read combinatorial libraries files
clib
  # core instance
  read core R
  # instances of the rgroups
  read rgroup_1 R 0 X
  read rgroup_2 R 0 X
  read rgroup_3 R 0 X
  close
  info
  # The number of active instances will be restricted (optional)
  # or all instances will be used
  select 1 0,1,2,3,4,5
  select 2 0,1,2,3,4,5
  select 3 0,1,2,3,4,5
end

# Alignment of the core instances. The placements of core instance 0 will
# be saved in core_place_C-000.pdf
csuper
  selins
  placec 3
  listp c
  writec 0 core_place
end

# Combinatorial alignment of an R-group
# optional: extendr 1 y rgroup_place
# placements will be saved in rgroup_place_C-000_R-001
csuper
  timer start
  selins
  readc 0 core_place_C-000
  extendr 1 n
  listp s
  output "Preparation time:"
  timer stop
  delete
end

# Combinatorial alignment of all R-groups
# optional: extendmr n 1 2 3 y all_rgroup_place
# placements will be saved in all_rgroup_place_C-000_R-001_...
csuper
  timer start
  selins
  readc 0 core_place_C-000
  extendmr n 1 2 3 n
  listp m
  listp a
  output "Preparation time:"
  timer stop
end

# Sequentialize alignment of all R-groups (normal alignment)
# optional: placeseq n 1 2 3 y seq_place n
# placements will be saved in seq_place_C-000_R-001_...
csuper
  timer start
  selins
  readc 0 core_place_C-000
  placeseq n 1 2 3 3 n
  listp m
  listp a
  output "Preparation time:"
  timer stop
end


B.6 Examples and results of the postoptimization

Example

# popt.bat
# Superpose a set of ligands all against all with flexible postoptimization
# M.Lilienthal 31.05.01
###########################################################
set OPTIMIZE 0                      # Flag OPTIMIZE 0 or 1
optparam                            # Go to OPTPARAM menu
  switchp y 1 1 1 n                 # Select energy parameter
  stopcrt 1e-3 1 0.9 0 50           # Select stop criteria for flexible
                                    # superposition
end
#
for_each $1 in "ligand1.list"       # For each mol2 file (reference ligand)
  for_each $2 in "ligand2.list"     # For each mol2 file (test ligand)
    ref_lig                         # Go to Reference Ligand menu
      read $1_kpl_h.mol2            # specified in ligand1, load
    end
    test_lig                        # Go to Test Ligand menu
      read $2_min_h.mol2            # specified in ligand2, load
      readref $2_kpl_h.mol2         # Read the reference coordinates
      selbas a f                    # Select a base fragment automatically
                                    # for flexible fitting
    end
    superpos                        # Go to Superpos menu
      placebas 3                    # Base placement
      complex all                   # Complex construction
      #
      popt y all                    # Postoptimization: flexible superposition
                                    # of all placements to the reference
                                    # ligand $1
      sort e_total                  # Sort optimized placements
    end
    delall y                        # Delete all
  end_for
end_for

An example file ligand?.list looks as follows:

1dwc
1dwd
1ett


A table of results:

Result table: (Flag OPTIMIZE = 0)
No. |     Index     |       E_ovl       |      E_total      |    RMS    |
-----------------------------------------------------------------------------
 10 | -0.718 -0.927 | -811.871 -916.633 | -887.455 -994.684 | 2.58 2.63 |
 18 | -0.806 -0.940 | -891.443 -913.050 | -966.899 -991.019 | 2.60 2.55 |
 15 | -0.644 -0.935 | -807.042 -911.210 | -876.840 -989.206 | 2.53 2.51 |
 20 | -0.625 -0.841 | -798.478 -884.150 | -869.774 -962.120 | 2.54 2.56 |
  1 | -0.840 -0.897 | -863.709 -882.877 | -942.770 -960.875 | 2.59 2.61 |
  6 | -0.677 -0.897 | -828.426 -882.860 | -901.096 -960.859 | 2.70 2.61 |
  7 | -0.668 -0.776 | -817.705 -844.534 | -896.759 -922.223 | 2.67 2.72 |
 12 | -0.712 -0.776 | -808.072 -844.534 | -886.313 -922.223 | 2.85 2.72 |
 19 | -0.689 -0.776 | -797.417 -844.463 | -874.933 -919.214 | 2.68 2.72 |
  2 | -0.739 -0.859 | -834.932 -839.586 | -911.383 -914.962 | 2.83 2.61 |
  3 | -0.685 -0.859 | -837.204 -839.586 | -911.273 -914.962 | 2.79 2.61 |
  4 | -0.684 -0.859 | -837.223 -839.642 | -911.124 -914.908 | 2.80 2.61 |
  5 | -0.738 -0.859 | -834.123 -839.642 | -910.615 -914.908 | 2.84 2.61 |
 14 | -0.721 -0.822 | -808.782 -842.244 | -880.803 -914.581 | 2.24 2.09 |
  8 | -0.661 -0.775 | -819.603 -833.163 | -896.640 -910.969 | 2.71 2.63 |
 13 | -0.646 -0.775 | -807.258 -833.163 | -883.814 -910.969 | 2.77 2.63 |
 16 | -0.629 -0.775 | -802.864 -833.163 | -876.694 -910.969 | 2.66 2.63 |
  9 | -0.660 -0.775 | -817.866 -833.226 | -894.767 -910.844 | 2.72 2.63 |
 11 | -0.646 -0.775 | -809.869 -833.226 | -886.509 -910.844 | 2.77 2.63 |
 17 | -0.628 -0.775 | -801.942 -833.226 | -875.922 -910.844 | 2.67 2.63 |

Result table: (Flag OPTIMIZE = 1)
No. |     E_match     |       E_ovl       |      E_total      |    RMS    |
-------------------------------------------------------------------------------
 18 | -75.455 -75.441 | -891.443 -895.463 | -966.899 -970.904 | 2.60 2.58 |
  1 | -80.089 -80.090 | -863.709 -864.393 | -943.798 -944.483 | 2.59 2.58 |
  2 | -76.451 -76.750 | -834.932 -836.719 | -911.383 -913.470 | 2.83 2.80 |
  5 | -76.492 -76.863 | -834.123 -836.209 | -910.615 -913.073 | 2.84 2.81 |
  3 | -74.069 -74.162 | -837.204 -838.350 | -911.273 -912.513 | 2.79 2.79 |
  4 | -73.900 -73.954 | -837.223 -837.939 | -911.124 -911.893 | 2.80 2.79 |
  6 | -73.699 -73.828 | -828.426 -828.802 | -902.124 -902.630 | 2.70 2.69 |
  7 | -80.083 -80.084 | -817.705 -821.442 | -897.788 -901.526 | 2.67 2.65 |
  8 | -77.037 -77.816 | -819.603 -822.760 | -896.640 -900.576 | 2.71 2.68 |
  9 | -76.900 -77.686 | -817.866 -821.427 | -894.767 -899.113 | 2.72 2.68 |
 10 | -76.609 -77.020 | -811.871 -814.228 | -888.480 -891.249 | 2.58 2.58 |
 15 | -75.122 -76.424 | -807.042 -812.857 | -882.164 -889.281 | 2.53 2.52 |
 12 | -79.270 -79.056 | -808.072 -809.778 | -887.342 -888.834 | 2.85 2.85 |
 11 | -76.930 -77.224 | -809.869 -811.005 | -886.799 -888.229 | 2.77 2.75 |
 13 | -76.845 -76.959 | -807.258 -808.306 | -884.103 -885.265 | 2.77 2.76 |
 14 | -69.006 -69.359 | -808.782 -815.887 | -877.788 -885.246 | 2.24 2.18 |
 17 | -73.175 -74.515 | -801.942 -806.365 | -875.117 -880.880 | 2.67 2.63 |
 16 | -73.026 -74.352 | -802.864 -806.305 | -875.889 -880.658 | 2.66 2.63 |
 20 | -76.816 -77.554 | -798.478 -802.121 | -875.294 -879.675 | 2.54 2.54 |
 19 | -78.544 -78.458 | -797.417 -799.245 | -875.962 -877.703 | 2.68 2.68 |
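
Rows of these result tables can be read back for further analysis with a few lines of code. The sketch below is illustrative only. It assumes that the two numbers in each column are the values before and after the postoptimization, which the table headers themselves do not state, and it uses the first three rows of the OPTIMIZE = 0 table as sample data. The name `parse_row` is hypothetical.

```python
def parse_row(row):
    """Parse 'No. | a b | c d | e f | g h |' into (No., [(a, b), (c, d), ...])."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    number = int(cells[0])
    pairs = [tuple(float(v) for v in cell.split()) for cell in cells[1:]]
    return number, pairs


# First three rows of the OPTIMIZE = 0 table.
rows = [
    "10 | -0.718 -0.927 | -811.871 -916.633 | -887.455 -994.684 | 2.58 2.63 |",
    "18 | -0.806 -0.940 | -891.443 -913.050 | -966.899 -991.019 | 2.60 2.55 |",
    "15 | -0.644 -0.935 | -807.042 -911.210 | -876.840 -989.206 | 2.53 2.51 |",
]
parsed = [parse_row(row) for row in rows]

# Second value of the E_total column (third pair) for each placement.
e_total_after = [pairs[2][1] for _, pairs in parsed]
print(e_total_after)
print(e_total_after == sorted(e_total_after))
```

For these rows the second E_total value increases from one row to the next, which is consistent with the `sort e_total` command at the end of popt.bat.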


B.7 Examples and results of the special postoptimization

Example

# spopt.bat
# Flexible postoptimization of placements to a set of references
# M.Lilienthal 31.05.01
###########################################################
# Set parameters
optparam                            # Go to OPTPARAM menu
  switchp y 1 1 1 n                 # Select energy parameter
  stopcrt 1e-3 1 0.9 0 50           # Select stop criteria for flexible
                                    # superposition
end

# flexible postoptimization of placements to a set of references
ref_lig                             # Go to Reference Ligand menu
  read 1dwc_kpl_h.mol2              # Reference ligand (rigid)
end
test_lig                            # Go to Test Ligand menu
  read secref.mol2                  # Read second flexible reference
  read thirdref.mol2                # Read third flexible reference
  read 1ett_min_h.mol2              # Read test ligand
end
superpos                            # Go to Superpos menu
  read 1ett_1dwc.pdf                # Load placements of test ligand
                                    # from 1ett_1dwc.pdf
  for_each $0 fromto 1 25           # for first 25 placements of
                                    # 1ett_1dwc.pdf
    #
    spopt y 1-2 1 y 4 n 1 y 4 n y $0
                                    # Special postoptimization: flexible
                                    # superposition of placement $0 and
                                    # three flexible/rigid reference ligands;
                                    # first ref: reference ligand
                                    # (1dwc_kpl_h.mol2)
                                    # second ref: secref.mol2 (use fix-coord,
                                    # transform modus: local transformation,
                                    # don’t freeze flexible bonds)
                                    # third ref: thirdref.mol2 (use fix-coord,
                                    # transform modus: local transformation,
                                    # don’t freeze flexible bonds)
                                    # flexible placement $0 of 1ett_min_h
  end_for


B.8 Examples and results of the flexible superposition

Example

# fsuper.bat
# Superpose a set of ligands
# M.Lilienthal 31.05.01
###########################################################
optparam                            # Go to OPTPARAM menu
  switchp y 1 1 1 n                 # Select energy parameter
  transmod 7                        # Select default transformation modus
  stopcrt 1e-3 1 0.9 0 75           # Select stop criteria for flexible superposition
end
#
ref_lig                             # Go to Reference Ligand menu
  read ref_mol.mol2                 # Load a reference ligand
end
test_lig                            # Go to Test Ligand menu
  read lig_mol1.mol2                # Load a test ligand
  readref lig_mol1_kpl.mol2         # Read the reference coordinates
  read lig_mol2.mol2                # Load a test ligand
  readref lig_mol2_kpl.mol2         # Read the reference coordinates
  read lig_mol3.mol2                # Load a test ligand
  readref lig_mol3_kpl.mol2         # Read the reference coordinates
end
optparam
  #
  fsuper y 0 1-3 1 n n 1 n n 1 n y 4
                                    # Flexible superposition of test
                                    # ligands to the reference ligand
                                    # lig_mol1 and lig_mol2: use fix coord, use
                                    # default transformation modus, don’t
                                    # freeze flexible bonds;
                                    # lig_mol3: use fix coord, use default
                                    # transformation modus, freeze bond no. 4
  # or
  # fsuper n 1-3 1 y 4 n 1 n n 1 n n
                                    # Flexible superposition of test
                                    # ligands without a reference ligand
                                    # lig_mol1: use fix coord, transform modus:
                                    # local transformation, don’t freeze
                                    # flexible bonds;
                                    # lig_mol2 and lig_mol3: use fix coord, use
                                    # default transformation modus, don’t
                                    # freeze flexible bonds;
end


Example

I) Flexible superposition using a rigid reference ligand as target:
----------------------------------------------------------------------

Reference ligand: ref_mol
-----------------------------------------------------------------------------------
No.| Ligand   | Overl.Index  |Position|    Ref RMSD    | Nof | Max |Stop| Time
   |          | before after |  RMSD  | before  after  | Step| Step|    |
-----------------------------------------------------------------------------------
  1| lig_mol1 | 0.519  0.996 | 3.1668 | 3.1922  0.0816 |  50 |  50 | S  | 4.90 s
  2| lig_mol2 | 0.620  0.998 | 1.5882 | 1.5950  0.0634 |  43 |  50 | S  | 5.15 s
  3| lig_mol3 | 1.000  0.998 | 0.0615 | 0.0000  0.0615 |  38 |  50 | S  | 5.61 s

Data of optimized ligand(s):
--------------------------------------------------------------------------------
No.| Ligand   |  LJ-Potential  |Torsions-Energy |   Energy Score   |T-Mode
   |          | before  after  | before  after  |  before   after  |
--------------------------------------------------------------------------------
  1| lig_mol1 | 0.5699  0.5694 | 0.2175  0.2046 | -0.8808  -2.7950 | 7
  2| lig_mol2 | 0.5690  0.5694 | 0.1246  0.3602 | -1.3102  -2.7638 | 7
  3| lig_mol3 | 0.5694  0.5694 | 0.2725  0.1828 | -2.7930  -2.8083 | 7

Legend:

Reference ligand : Name of the reference ligand.
No. : Number of the current test ligand/placement.
Ligand : Name of the current test ligand/placement.
Overl.Index : Similarity (overlap) index (see eq. (6.2) in section 6.9.1) before and after the optimization.
Position RMSD : $\sqrt{\frac{1}{n} \sum_{i=1}^{n} \| a_i^{new} - a_i^{old} \|_2^2}$, where $a_i^{old}$ and $a_i^{new}$ are the old and the new coordinates of atom $i$ of the conformation, respectively, and $n$ is the number of atoms.
Ref-RMSD : If reference coordinates (see 6.4.3) are read, the RMSD between the old and the reference conformation and the RMSD between the new and the reference conformation will be computed.
  - before : $\sqrt{\frac{1}{n} \sum_{i=1}^{n} \| a_i^{old} - a_i^{ref} \|_2^2}$
  - after : $\sqrt{\frac{1}{n} \sum_{i=1}^{n} \| a_i^{new} - a_i^{ref} \|_2^2}$
Nof Step : Number of optimization steps.
Max Step : Maximum number of optimization steps.
Stop : Stop criterion of the optimization algorithm (see section 6.9.3). G: gradient criterion; S: step criterion; M: maximum number of steps reached; E: energy stop criterion; A: abort, the algorithm can’t compute the next step and stops with the current step; N: numerical error
Time : Running time of the optimization.
LJ-Potential : Scaling (12,6) Lennard-Jones potential before and after the optimization.
Torsions-Energy : Scaling torsion energy before and after the optimization.
Energy Score : Score of the energy formula before and after the optimization.
T-Mode : Transformation mode (see section 6.9.4).
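
The RMSD quantities in the legend above all follow the same pattern: the square root of the mean squared Euclidean distance between corresponding atoms of two conformations. A minimal sketch of that calculation is given here for orientation; it is not FlexS code, the function name `rmsd` is hypothetical, and it assumes both conformations list the same atoms in the same order.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        sum((p - q) ** 2 for p, q in zip(atom_a, atom_b))
        for atom_a, atom_b in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two atoms, each shifted by exactly 1 along z.
old = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
new = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(new, old))  # 1.0
```

Calling the function with the new and old coordinates corresponds to the Position RMSD column; calling it with either set and the reference coordinates corresponds to the before and after values of Ref-RMSD.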


Example

II) Flexible superposition treating all ligands as flexible:
------------------------------------------------------------

------------------------------------------------------------------------
Nof Ligands | Overl.Index  |   Energy Score   | Nof | Max |Stop| Time
            | before after |  before   after  | step| step|    |
------------------------------------------------------------------------
          3 | 0.519  0.937 | -0.8841  -2.5708 |   7 |  15 | S  | 22.63 s

------------------------------------------------------------------------------------
No.| Ligand   |  LJ-Potential  |Torsions Energy |T-Mode| Overl.Index  | with
   |          | before  after  | before  after  |      | before after | ligand
------------------------------------------------------------------------------------
  1| lig_mol1 | 0.5699  0.5694 | 0.2175  0.2506 |  4   | 0.468  0.906 | 2-3
  2| lig_mol2 | 0.5690  0.5693 | 0.1246  0.1718 |  7   | 0.518  0.953 | 1-1, 3-3
  3| lig_mol3 | 0.5694  0.5693 | 0.2725  0.0593 |  7   | 0.569  0.953 | 1-2

Legend:

Nof Ligands : Total number of optimized test ligands/placements.
Overl.Index : Overlap index (see eq. (6.3) in section 6.9.1) before and after the optimization.
Energy Score : Score of the energy formula before and after the optimization.
Nof Step : Number of optimization steps.
Max Step : Maximum number of optimization steps.
Stop : Stop criterion of the optimization algorithm (see section 6.9.3). G: gradient criterion; S: step criterion; M: maximum number of steps reached; E: energy stop criterion; C: stop criterion of the centers of molecules; A: abort, the algorithm can’t compute the next step and stops with the current step; N: numerical error
Time : Running time of the optimization.
No. : Number of the current test ligand/placement.
Ligand : Name of the current test ligand/placement.
LJ-Potential : Scaling (12,6) Lennard-Jones potential before and after the optimization.
Torsions-Energy : Scaling torsion energy before and after the optimization.
T-Mode : Transformation mode (see section 6.9.4).
Overl.Index : Overlap index (see eq. (6.4) in section 6.9.1) with the other ligand before and after the optimization.
with ligand : The number of the other test ligands/placements.

Page 232: BioSolveIT: FlexS reference manual · Version 2.1 Reference Manual Christian Lemmen, Matthias Rarey, Bernd Kramer, Thomas Lengauer, Markus Lilienthal, Frank Sonnenburg, Marc Zimmermann

IndexFlexV , 32, 70

about FlexS, 13additional modules, 119aliases, 58alternative directory, 60amides, 174Appendix, 211atom type assignment, 189atom types, 62, 162atomic charges, 62

basefragment, 34, 77placement, 34, 94, 156selection, 156

batch, 219, 220, 222, 223branch, 152command

FOR_EACH/END_FOR, 151FOREVER, 151IF/ELSE/ENDIF, 152INCR, 153INPUT, 153OUTERR, 153OUTPUT, 153PROCSIZE, 153SELINP, 152SETVAR, 153TIMER, 153WAIT, 153WHILE, 151

input, 153loops, 151mode, 74, 148output, 153parameter list, 150progsize, 153timer, 153variables, 149, 152

wait, 153batch mode, 59

arguments, 59BioSolveIT Host ID, 21bond, 20, 63, 174bond length

heavy atoms, 162SYBYL, 162

bond types, 62build-up procedure, 95, 158

charges, 178reference ligand, 91test ligand, 84

Combinatorial librariescombinatorial alignment, 43, 134handling CombiLibs, 126introduction, 126

combinatorial libraries, 40, 41example, 224

command?, 30, 68ALIGN, 105, 109CLOSE, 42, 127CLUSTER, 96COC(=O)C(=O)COC, 188COMPLEX, 95DECRYPT, 94DELALL, 73DELETE, 80, 88, 97, 128, 138DELETEG, 91DISPLAY, 31, 32, 72DRAW, 31–33, 83, 90, 103, 133EDIT, 79, 93EDITCFG, 68END, 29, 68ENUM, 130ENUMALL, 87ERASE, 73EXTEND, 130

232

Page 233: BioSolveIT: FlexS reference manual · Version 2.1 Reference Manual Christian Lemmen, Matthias Rarey, Bernd Kramer, Thomas Lengauer, Markus Lilienthal, Frank Sonnenburg, Marc Zimmermann

INDEX 233

EXTENDCORE, 129EXTENDMR, 138EXTENDR, 44, 137EXTRACT, 130, 139EXTRACTTOP, 140FIXRMSD, 87FSUPER, 114GRAINF, 84, 91, 104, 134HELP, 29, 68INFO, 29, 42, 79, 98, 128LIST, 69LISTALL, 99LISTMAT, 99LISTONE, 100LISTP, 44, 45, 138LISTRMS, 99LISTSOL, 36, 98MAIN, 68MANUAL, 68MAPREF, 76, 128MATCH, 85MATCH_LS, 104MDRAW, 36, 83, 103MERGEG, 91MINCONF, 86MINFO, 131OVERLAP, 105PLACEBAS, 94PLACEC, 44, 135PLACER, 136PLACESEQ, 136POPT, 96POPTC, 135PRINTSOL, 101QHIST, 101QUERY, 100QUIT, 67READ, 42, 74, 88, 97, 126, 127READC, 84, 91, 136READG, 92READREF, 76, 127RELEASE, 130, 139RELEXTCORE, 129RELOADDAT, 93RESETCORE, 130RGROUP, 129RMSHIST, 105

SAVEGC, 93SCORE, 104SCREEN, 39, 95SCRIPT, 74SDRAW, 84SELADM, 80, 89, 102, 131SELBAS, 34, 77SELCOL, 32, 34, 81, 102, 132SELECT, 43, 78, 88, 96, 128SELECTR, 137SELGRA, 32, 34, 81, 89, 102, 131SELINS, 134SELLAB, 82, 90, 103, 133SELOUTP, 36, 69, 125SET, 37, 58, 69SETALGO, 115SETC, 84, 91SETPAR, 117SETREF, 77, 128SMARTS, 85SORT, 97STOPCRT, 113SWITCH, 129SWITCHP, 110, 113SWITCHTYPE, 85TOFLEXV, 70TRANSFORM, 86TRANSMOD, 114TRIHASH, 89VOLUME, 80WRITE, 37, 79, 97WRITEC, 44, 136WRITEFC, 84WRITEG, 84, 91WRITEOPT, 86WRITESOL, 139WRITONE, 87WRITRAND, 87

command line options, 59commandline, 143commands, 67complex, 95, 158configuration

as command line argument, 60configuring FlexS, 27, 47, 60, 68, 69, 117, 211conformation, 86, 87Conformation generator, 108

Page 234: BioSolveIT: FlexS reference manual · Version 2.1 Reference Manual Christian Lemmen, Matthias Rarey, Bernd Kramer, Thomas Lengauer, Markus Lilienthal, Frank Sonnenburg, Marc Zimmermann

234 INDEX

CONFORT, 206console, 18constraining torsional angles/ring confor-

mations, 63control flags, 37coordinates

switching type, 85writing type ’opt’, 86

core, 41CORINA, 50, 205

deprecated driver options, 205crypted files, 94

DB, 95Debian, 141decryption, 94degrees of freedom

constraining, 63delete, 73, 80, 88, 97directories, 27, 47–49

edittest ligand, 79

email, 143encryption, 94environment variable

data, 48, 93directories, 47flags, 50list, 69programs, 47, 49, 60set, 69strings, 58

errors, 62example files, 18exit, 67

FAQ, 143FAQs, 141file format, 147files, 147Filter for aligned solutions (RIF), 108first steps, 27fix bond in test ligand, 63fix ring in test ligand, 63fixing ring conformations, 63fixing torsional angles, 63flags, 50, 58

flexible ring systems, 20flexible superposition, 112

energy/overlap parameters, 199examples, 229FSUPER, 114special algorithm for fsuper, 115stop criteria, 113, 200switch parameter on or off, 113transformations mode, 114

FlexIDgen, 21flexlm, 21FlexV, 206

atom coloring, 207graphical interface, 206libraries needed, 141two instances with one FlexS, 206

fragment, 34complex, 95placement, 94, 156selection, 77, 156

Gaussiandelete, 91description, 154merge, 91read, 92representation, 191volume, 80write, 84, 91

global commands, 67graphic

commands, 31, 33, 72, 73, 83, 84, 90, 91,103, 104

files, 93, 192setting, 80–82, 89, 90, 102, 103

graphics, 19graphics problems, 141

hash table, 89help, 29, 60, 68, 143Host ID, 21

I/Oredirect, 60, 69table output, 79, 98–101, 105

initializationcombinatorial library, 127reference ligand, 89

Page 235: BioSolveIT: FlexS reference manual · Version 2.1 Reference Manual Christian Lemmen, Matthias Rarey, Bernd Kramer, Thomas Lengauer, Markus Lilienthal, Frank Sonnenburg, Marc Zimmermann

INDEX 235

    test ligand, 75
initialization procedure, 75, 89, 127
input, 62
installation, 17
    host ID, 21
    ROOT directory, 18
insufficient memory, 141
interaction
    geometry, 166, 167
    matching, 104
    score, 170
    type, 164, 173
interaction geometries, 166
    definition, 167
interactive mode, 59
interface options, 61
interfaces, 205
    corina, 205
introduction, 13
known issues and problems, 141
license, 21
license scheme, 22
license scheme HP-UXia64/SunOS/SGI-Irix, 24
Linux, 17
Linux distributions
    Ubuntu, Debian, 141
menu
    CLIB, 126
    CSUPER, 134
    DATABASE, 93
    OPTPARAM, 112
    PVM, 119
    REF_LIG, 88
    RIF, 106
    SUPERPOS, 94
    TEST_LIG, 74
menus, 67
mode
    batch, 74, 148
mol2, 62
mol2 format, 30
OpenGL, 141
operating system
    console, 18
operating systems, 17
overlap, 105, 112, 154
parallel, 58
parallel script execution, 20
parameters
    chemical, 161
    program, 154
placement
    analysis, 104
    clustering, 96
    delete, 97
    lists, 98–101
    postoptimization, 96
    read, 97
    selection, 96
    sort, 97
    write, 97
postoptimization
    energy/overlap parameters, 199
    examples, 226, 228
    POPT, 96
    POPTC, 135
    special algorithm for spopt, 115
    stop criteria, 113, 200
    switch parameter on or off, 113
preconditions for FlexS to run, 17
processor ID, 21, 60
PVM, 119
    aborting and recovering, 123
    batch files, 121
    command
        ADD, 120
        INFO, 120
        OFMERGE, 123
        RECOVER, 123
        REMOVE, 121
        TOPVM, 121
    configuring PVM, 120
    filenames in scripts, 70, 102, 125
    kill work process, 124
    merging of files, 70, 102, 125
    parallel, 58
    preliminaries, 119
    problems, 70, 102, 120, 124, 125
    starting PVM, 120


    working with PVM, 124
Python, 205
query, 100, 101
quit, 67
R-groups, 41
RCGENERATOR, 205
read
    charges of reference ligand, 91
    charges of test ligand, 84
    Gaussian, 92
    placement, 97
    reference ligand, 31, 88
    test ligand, 33, 74
reference coordinates
    file, 64
    read, 76
    set, 77
reference ligand, 30, 41, 64, 88
resulting information file (rif), 106
RIF
    command line option ri, 60
    command line option ro, 61
rif optimization parameter sets, 202
rigfit, 200
RMSD calculation, 87
RMSD histogram, 105
ROOTDIR, 18, 47
SCA, 205
scoring parameters, 199
screening, 37, 95
scripting, 59, 74, 148
    example, 219, 220, 222, 223
    variables, 149
sdf, 62
select
    placement, 96
    reference ligand, 88
    test ligand, 78
session logging, 60
shell, 61, 73
single ligand superposition, 27
SMARTS, 179
    aromaticity perception, 182
    hydrogens, 183
    logical operators, 184
    recursive, 185
    ring perception, 182
    subgraphs, 185
start-up, 59
starting FlexS, 59
static data, 93
    environment variable, 48
static data files, 48, 147, 154
    contype_sp.dat, 164
    optpar.dat, 199
    flexs_settings.dat, 154
    chempar.dat, 161
    contact_sp.dat, 173
    fcharges.dat, 178
    gaussian.dat, 191
    decrypting, 94
structure correction, 189
subgraph matching, 171
superposition, 34, 43, 94
support, 143
SYBYL atom types, 62
system ID, 60
technical reference, 147
templates
    use of, 186
test ligand, 33, 62, 74
tests, 219, 220, 222, 223
Token not numeric, 141
torsion angles, 20, 174
transformation, 86
tripos, 30
troubleshooting, 141
tutorial, 27
    FlexV, 32
    combinatorial libraries, 40, 41
    csuper, 43
    help, 29
    mol2 format, 30
    overview, 29
    reference ligand, 30, 41
    screening, 37
    single ligand, 27
    superposition, 34, 43
    test ligand, 33
Ubuntu, 141


USE_RL_TRANSFORMS, 56
USE_TL_TRANSFORMS, 56
User Guide, 14
user guide, 27
valence states, 163
van der Waals
    volume, 80
van der Waals radii, 161
version information, 61
Vista, 142
visualization, 19
volume, 80
warnings, 62
WHATIF, 205
Windows, 17, 141
    insufficient memory, 141
    Vista compatibility, 142
working with FlexS, 47
write
    atom coordinates of type ’opt’, 86
    charges of test ligand, 84
    Gaussian, 91
    placement, 97
    test ligand, 79


Bibliography

[1] F.C. Bernstein, T.F. Koetzle, G.J.B. Williams, E.F. Jr. Meyer, M.D. Brice, J.R. Rodgers, O. Kennard, T. Shimanouchi, and M. Tasumi. The protein data bank: a computer-based archival file for macromolecular structures. Journal of Molecular Biology, 112:535–542, 1977. 47, 79

[2] H.-J. Böhm. LUDI: rule-based automatic design of new substituents for enzyme inhibitor leads. Journal of Computer-Aided Molecular Design, 6:593–606, 1992. 13

[3] H.-J. Böhm. The development of a simple empirical scoring function to estimate the binding constant for a protein-ligand complex of known three-dimensional structure. Journal of Computer-Aided Molecular Design, 8:243–256, 1994. 13

[4] J. Gasteiger, C. Rudolph, and J. Sadowski. Automatic generation of 3D-atomic coordinates for organic molecules. Tetrahedron Computer Methodology, 3:537–547, 1990. 20, 205

[5] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam. PVM: Parallel Virtual Machine. A Users’ Guide and Tutorial for Networked Parallel Computing. The MIT Press, Cambridge, Massachusetts, 1994. http://www.netlib.org/pvm3/book/pvm-book.html. 120

[6] A. Griewel, O. Kayser, J. Schlosser, and M. Rarey. Conformational sampling for large-scale virtual screening: Accuracy versus ensemble size. J. Chem. Inf. Model., 49:2303–2311, 2009. 56, 108

[7] G. Klebe and T. Mietzner. Correlation of crystal data to analyze and predict ligand/receptor interactions. In D. W. Jones, editor, Organic Crystal Chemistry. Oxford University Press, Oxford, UK, 1992. 13

[8] G. Klebe and T. Mietzner. A fast and efficient method to generate biologically relevant conformations. Journal of Computer-Aided Molecular Design, 8:583–606, 1994. 20

[9] C. Lemmen, C. Hiller, and T. Lengauer. RIGFIT: A new approach to superimpose ligand molecules. Journal of Computer-Aided Molecular Design, 12:491–502, 1998. 13

[10] C. Lemmen and T. Lengauer. Time-efficient flexible superposition of medium-sized molecules. Journal of Computer-Aided Molecular Design, 11:357–368, 1997. 13

[11] C. Lemmen and T. Lengauer. Fragment-based screening of ligand databases. In K. Gundertofte and F.S. Jorgensen, editors, Molecular Modelling and Prediction of Bioactivity, Proceedings of the 12th European Symposium on Quantitative Structure-Activity Relationships (QSAR’98), New York, 1999. Plenum Press. Talk. 13


[12] C. Lemmen, T. Lengauer, and G. Klebe. FLEXS: A method for fast flexible ligand superposition. Journal of Medicinal Chemistry, 41:4502–4520, 1998. 13

[13] M. Rarey, B. Kramer, T. Lengauer, and G. Klebe. Predicting receptor-ligand interactions by an incremental construction algorithm. Journal of Molecular Biology, 261:470–489, 1996. 13

[14] M. Rarey and T. Lengauer. A recursive algorithm for efficient combinatorial library docking. Perspectives in Drug Discovery and Design, 20:63–81, 2000. 138

[15] M. Rarey and M. Zimmermann. FTrees 2.3.0 User Guide. BioSolveIT GmbH, St. Augustin, Germany, 2010. 106, 107

[16] J. Sadowski, J. Gasteiger, and G. Klebe. Comparison of automatic three-dimensional model builders using 639 x-ray structures. Journal of Chemical Information and Computer Sciences, 34:1000–1008, 1994. 20, 205

[17] V. Sunderam, J. Dongarra, A. Geist, and R. Manchek. The PVM concurrent computing system: Evolution, experiences, and trends. Parallel Computing, 20(4), 1994. 20, 58

[18] Symyx Technologies Inc., www.symyx.com, 2440 Camino Ramon, Suite 300, San Ramon, CA 94583. CTFile Formats November 2007, 2007. 62, 64, 74, 79, 88, 148

[19] TRIPOS Associates, Inc., St. Louis, Missouri, USA. SYBYL Molecular Modeling Software Version 6.x, 1994. 47, 62, 63, 64, 74, 76, 79, 86, 88, 92, 148, 171, 172, 173