chapter 17 · eric ennifar (ed.), nucleic acid crystallography: methods and protocols , methods in...

14
269 Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols, Methods in Molecular Biology, vol. 1320, DOI 10.1007/978-1-4939-2763-0_17, © Springer Science+Business Media New York 2016 Chapter 17 RNA Structure Refinement Using the ERRASER-Phenix Pipeline Fang-Chieh Chou, Nathaniel Echols, Thomas C. Terwilliger, and Rhiju Das Abstract The final step of RNA crystallography involves the fitting of coordinates into electron density maps. The large number of backbone atoms in RNA presents a difficult and tedious challenge, particularly when experimental density is poor. The ERRASER-Phenix pipeline can improve an initial set of RNA coordi- nates automatically based on a physically realistic model of atomic-level RNA interactions. The pipeline couples diffraction-based refinement in Phenix with the Rosetta-based real-space refinement protocol ERRASER ( Enumerative Real-Space Refinement ASsisted by Electron density under Rosetta). The com- bination of ERRASER and Phenix can improve the geometrical quality of RNA crystallographic models while maintaining or improving the fit to the diffraction data (as measured by R free ). Here we present a complete tutorial for running ERRASER-Phenix through the Phenix GUI, from the command-line, and via an application in the Rosetta On-line Server that Includes Everyone (ROSIE). Key words RNA structure, Structure prediction, X-ray crystallography, Refinement, Force field 1 Introduction Over the last decade, fruitful progress in RNA X-ray crystallogra- phy has revealed three-dimensional all-atom models of numerous riboswitches, ribozymes, and ribonucleoprotein machines [13]. Due to the difficulty of manually fitting RNA backbones into experimental density maps, many of these crystallographic models contain myriad unlikely conformations and unusually close con- tacts as revealed by automated MolProbity tools for geometric evaluation of models [4, 5]. Inspired by recent advances in ab ini- tio RNA structure prediction [68] and successful applications of the Rosetta modeling suite in crystallographic and electron micros- copy density fitting problems [9, 10], we developed the ERRASER method and integrated it with Phenix diffraction-based refinement [11]. In our previous publication [12], we demonstrated that the ERRASER-Phenix pipeline resolves the majority of steric clashes

Upload: others

Post on 09-Aug-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

269

Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols, Methods in Molecular Biology, vol. 1320,DOI 10.1007/978-1-4939-2763-0_17, © Springer Science+Business Media New York 2016

Chapter 17

RNA Structure Refi nement Using the ERRASER-Phenix Pipeline

Fang-Chieh Chou , Nathaniel Echols , Thomas C. Terwilliger , and Rhiju Das

Abstract

The fi nal step of RNA crystallography involves the fi tting of coordinates into electron density maps. The large number of backbone atoms in RNA presents a diffi cult and tedious challenge, particularly when experimental density is poor. The ERRASER-Phenix pipeline can improve an initial set of RNA coordi-nates automatically based on a physically realistic model of atomic-level RNA interactions. The pipeline couples diffraction-based refi nement in Phenix with the Rosetta-based real-space refi nement protocol ERRASER ( E numerative R eal-Space R efi nement AS sisted by E lectron density under R osetta). The com-bination of ERRASER and Phenix can improve the geometrical quality of RNA crystallographic models while maintaining or improving the fi t to the diffraction data (as measured by R free ). Here we present a complete tutorial for running ERRASER-Phenix through the Phenix GUI, from the command-line, and via an application in the Rosetta On-line Server that Includes Everyone (ROSIE).

Key words RNA structure , Structure prediction , X-ray crystallography , Refi nement , Force fi eld

1 Introduction

Over the last decade, fruitful progress in RNA X-ray crystallogra-phy has revealed three-dimensional all-atom models of numerous riboswitches, ribozymes, and ribonucleoprotein machines [ 1 – 3 ]. Due to the diffi culty of manually fi tting RNA backbones into experimental density maps, many of these crystallographic models contain myriad unlikely conformations and unusually close con-tacts as revealed by automated MolProbity tools for geometric evaluation of models [ 4 , 5 ]. Inspired by recent advances in ab ini-tio RNA structure prediction [ 6 – 8 ] and successful applications of the Rosetta modeling suite in crystallographic and electron micros-copy density fi tting problems [ 9 , 10 ], we developed the ERRASER method and integrated it with Phenix diffraction-based refi nement [ 11 ]. In our previous publication [ 12 ], we demonstrated that the ERRASER-Phenix pipeline resolves the majority of steric clashes

Page 2: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

270

and anomalous backbone and bond geometries assessed by MolProbity in a benchmark of 24 RNA crystal structures. Furthermore, this method led to models with similar or better R free . This chapter describes the details of using ERRASER in three easily accessible ways: by a GUI in the Phenix package, the command-line, and the ROSIE server [ 13 ].

2 Materials

The ERRASER-Phenix pipeline relies on two software toolkits: the Rosetta modeling suite [ 14 ] and the Phenix package [ 11 ]. Both toolkits are currently offi cially supported on Linux and Mac- OS X platforms. (Phenix is available on Windows; Rosetta might be compiled in Windows using Cygwin but is not offi cially supported and well-tested.) To run the pipeline locally, the user needs to have the following versions of the above toolkits installed on their computer:

Rosetta (version 3.5) http://www.rosettacommons.org/ . Phenix (version 1.8.3) http://www.phenix-online.org/ .

Both Rosetta and Phenix are freely available to academic and non-profi t institutions. Details of downloading, licensing, and the installation instructions can be found in the above listed websites. Phenix installation instructions can be found at http://www.phenix- online.org/documentation/install.htm . On Mac OS X sys-tems, the installation simply consists of downloading a .dmg fi le and double-clicking the icon. On Linux systems it consists of unpacking a tar archive and running an installation script. Instructions for Rosetta installation compatible with Phenix and ERRASER can be found at http://www.phenix-online.org/docu-mentation/erraser.htm .

It is also possible to run the ERRASER part of the pipeline online and privately using the ROSIE server ( http://rosie.rosettacommons.org/ ).

3 Methods

The standard ERRASER-Phenix pipeline consists of three major stages: an initial Phenix refi nement, followed by iterative ERRASER refi nement, and a fi nal Phenix refi nement (Fig. 1 ). Here the initial Phenix refi nement can be skipped if the input structure has already been refi ned with all hydrogen atoms included in the model. In general, we fi nd that maintaining hydrogen atoms during diffraction- based refi nement tends to give models with better geometrical quality, particularly with regards to steric interactions, as assessed by the MolProbity clashscore. Since ERRASER

Fang-Chieh Chou et al.

Page 3: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

271

performs only real-space refi nement, a fi nal diffraction-based refi nement is necessary to fi t the model directly to the original data and evaluate R free statistics. We have carried out tests using the Phenix refi nement tool for these two refi nement stages [ 15 ], but users should be able to substitute in refi nement tools if preferred (e.g. SHELXL [ 16 ], Refmac [ 17 ], CNS [ 18 ], etc.).

In the sections below, we will focus on the details of the ERRASER refi nement stage. We will mainly briefl y describe how to run ERRASER using the Phenix GUI interface, and briefl y describe how to run ERRASER using shell command lines and ROSIE web server. Finally, we discuss some settings and options we found useful in the Phenix refi nement of RNA.

After both Phenix and Rosetta are properly installed and compiled on the user’s local computer, the user should set the path so that Phenix can locate the Rosetta applications. Suppose you have Rosetta installed at “/home/user/rosetta-3.5.” If using the bash or sh shells, add the following line into “~/.profi le” or “~/.bashrc”:

export PHENIX_ROSETTA_PATH=/home/user/rosetta-3.5

Or if using C-shell, put the following line into “~/.cshrc”:

setenv PHENIX_ROSETTA_PATH /home/user/rosetta-3.5

The following fi les need to be prepared before running ERRASER. Note that ERRASER is designed for the fi nal fi ne- tuning of the RNA models, and has only been tested for such cases. Applying ERRASER to models from earlier refi nement stages might lead to unexpected behavior and poor models.

ERRASER accepts PDB fi les of RNA with the standard format as input. The users should ensure the input fi le follows the standard PDB format, with correct residue and atom naming conventions.

3.1 Set Up the Phenix- Rosetta Connection

3.2 Prepare the ERRASER Input Files

3.2.1 Input PDB File

Fig. 1 Flow chart of the ERRASER-Phenix pipeline

RNA Structure Refi nement with ERRASER

Page 4: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

272

In addition, the user should ensure that there are no missing heavy atoms in the model, or ERRASER may not run properly.

Currently, ERRASER supports input 2 mF o – DF c density maps in the CCP4 format. It is important to exclude R free diffraction data during map creation to ensure that ERRASER is not infl uenced by the set-aside data and that fi nal R free values are appropriate for cross-validation. All density data covering the entire unit cell should be included to allow ERRASER to correctly evaluate the correlation to density during the sampling. Here, we demonstrate the process of creating the density map using the “calculate maps” GUI in Phenix [ 19 ]. In the GUI window, click the “CCP4 or XPLOR Maps” button, and a window will pop out. In the new window, select the following options (Fig. 2 ):

Map type: 2mFo-DFc. File format: CCP4. Map region: Unit cell.

Enable the following three options:

1. “Kicked.” 2. “Fill missing f obs.” 3. “Exclude free r refl ections.”

Other options are kept default. Here the “kicked” option makes use of the kicked map algorithm to improve the map quality and to reduce map bias [ 19 ]. The “Fill missing f obs” option will

3.2.2 Density Map File

Fig. 2 Snapshot and useful options of the phenix.maps utility

Fang-Chieh Chou et al.

Page 5: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

273

allow Phenix to fi ll missing experimental diffraction data ( F obs ) with calculated diffraction ( F calc ), thereby avoiding Fourier truncation errors. We found that these two options lead to better ERRASER results (e.g., in terms of fi nal R free ) in many cases. “Exclude free r refl ections” ensures that the test set of refl ections is not included. If you fail to check this option then your fi nal R free may be biased and be misleadingly low.

It is straightforward to run ERRASER using the Phenix GUI. Click on “Refi nement”- > “ERRASER” in the main Phenix GUI to access the ERRASER GUI (Fig. 3 ). As a quick start, put in a PDB fi le and a corresponding CCP4 map using the “Input PDB” and “CCP4 map” boxes. Then click on the “Run” button. ERRASER will run in a new tab. Descriptions of ERRASER options that the user may wish to explore are given below in Subheading 3.5 .

After refi nement is complete, ERRASER runs several valida-tion metrics from MolProbity [ 4 , 5 ] as implemented in Phenix. Results are summarized along with output fi les in a new tab (Fig. 4 ), with buttons to load the results in Coot and PyMOL. The validation display is identical to the all-atom contact and RNA- specifi c components in the Phenix GUI, which interact directly with Coot [ 20 ]. These include:

– Steric clashes [ 21 ], defi ned as atomic overlaps of at least 0.4 Å when explicit hydrogen atoms are present.

– RNA bond length and angle geometry outliers, which has values > 4 s.d. from the Phenix reference values.

3.3 A Simple ERRASER Job Example

Fig. 3 The main window of the Phenix ERRASER GUI

RNA Structure Refi nement with ERRASER

Page 6: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

274

– RNA pucker outliers. RNA sugar rings typically adopt either C2′-endo or C3′-endo conformation, known as the pucker. MolProbity can confi dently identify erroneous puckers by measuring the perpendicular distance between the 3′-phosphate and the glycosidic bond vector.

– RNA “suite” outliers. Previous research [ 22 , 23 ] shows that most of the RNA backbone “suites” (sets of two consecutive sugar puckers with fi ve connecting backbone torsions ) fall into 54 rotameric classes. Non-rotameric outlier “suites” are likely to be problematic.

Phenix also outputs a “Validation summary” text fi le that com-pares the model before and after the ERRASER refi nement, which can be opened from the GUI by clicking on the corresponding button. In addition to the standard MolProbity metrics above, the “Validation summary” also gives a list of χ angle (glycosidic bond torsion) fl ips by ERRASER ( see Note 1 ).

The users can use this “Validation summary” list to guide their manual inspection of the post-ERRASER model. If during the inspection, the users think that any residue is not properly rebuilt in the standard ERRASER run, they might try to rebuild such resi-due with the single-residue rebuilding mode described below.

Fig. 4 Output window of the ERRASER GUI

Fang-Chieh Chou et al.

Page 7: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

275

For model regions with low density or for which other information is known (e.g., based on homology to a high-resolution structure), ERRASER’s top picks for residue conformations may not be opti-mal. In addition, ERRASER might not always resolve all errors in the models and might, in a few cases, lead to new errors in the model while fi xing other errors. In such cases, the user may want to inspect other possibilities for residue conformations. Therefore, in addition to the standard ERRASER protocol described above, there is a single-residue rebuilding mode available in ERRASER. In this mode, the user selects a particular residue in the complete structure and rebuilds it using the ERRASER algorithm. Up to ten different models, sorted by their relative ERRASER score, are returned as output which the user can then inspect.

The user can run the single-residue mode under the same GUI interface similar to the standard ERRASER application. After inputting the PDB fi le and CCP4 map fi le, click on the “single resi-due rebuilding mode” checkbox, and input the residue to be rebuilt in the “Residue to be rebuilt” box. The format of input residue is chain ID followed by residue number, e.g., “A35” or “F152.” Then click “Run” button to execute the program. After completion, the generated models and validation results are dis-played in the summary tab. The validation includes the “suite,” pucker, and glycosidic torsion assignments, as well as the fi nal ERRASER scores.

The following options can be found in the main window of the ERRASER GUI:

It is usually a good idea to provide the resolution of the input den-sity map, so that ERRASER can give a more accurate estimation of the fi t between the model and the input map. If not provided, ERRASER will assume a default value of 2.5 Å.

The total number of iterations of the ERRASER cycles (Fig. 1 ). By default it is 1. The user can increase the number to perform multi- cycle ERRASER refi nement. We recommend not to use any value greater than 3. Beyond 3 iterations, the ERRASER model will most likely be converged, and it is a waste of time to continue the iterations.

To speed up the ERRASER rebuilding process, by default ERRASER only considers conformations within 3.0 Å RMSD of the input model during the sampling of each residue. The user can change this cutoff RMSD value by editing this option. If the screen RMSD is set to be larger than 10.0 Å, ERRASER will simply skip the RMSD screening step.

3.4 Single-Residue Rebuilding Mode

3.5 Available Options in ERRASER

3.5.1 Map File Resolution

3.5.2 Iterations

3.5.3 Screen RMSD

RNA Structure Refi nement with ERRASER

Page 8: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

276

This is for the single-residue rebuilding mode only. See Subheading 3.4 .

This is for the single-residue rebuilding mode only. See Subheading 3.4 .

By default ERRASER will only rebuild residues that are assessed by MolProbity as having outlier pucker, suite, bond length or bond angle, or residues that moved signifi cantly during the Rosetta mini-mization step. Selecting this option forces ERRASER to rebuild all residues in the model. In general, however, we recommend using the default rebuilding protocol. Rebuilding all residues is likely to be a time-consuming process without obvious model improvement. In some cases such excessive rebuilding might even lead to worse models, due to artifacts in the input density map and the Rosetta scoring function. If the user wishes to rebuild some specifi c resi-dues, use the “Extra residues to be rebuilt” option below.

This option is for debugging only. Turning on this option will generate a lot of output and slow down the code. If you fi nd a bug, it might be a good idea to run this mode and send all the output to the developers to aid diagnosis ([email protected]).

The following options are accessible by clicking the “Other settings” button in the GUI.

This option allows the user to specify residues that should be kept fi xed throughout the ERRASER refi nement. The format is “A32”, where A is the ID of the chain and 32 is the residue number. The user can also input “A22-30” to fi x all residues from A22 to A30. This option is especially useful when the RNA contains ligands, modifi ed nucleobases, or strong crystal contacts, none of which are currently modeled in ERRASER. In these cases, residues near to these unmodeled parts of the molecule might get rebuilt into unreasonable conformations due to absence of critical interactions to the unmodeled parts. It is therefore necessary to fi x these resi-dues using the “fi xed residues” option. See Note 2 for general sug-gestions on determining the necessary fi xed residues.

This option allows the user to specify particular residues in the model that ERRASER must rebuild regardless of whether it has apparent errors. The format is the same as “fi xed residues” above.

Syn conformers of pyrimidines (U and C) are rare in RNA struc-tures. By default, ERRASER only samples syn conformers of pyrimidine residue if the user supplies an input residue with the syn conformer. The purpose is, fi rst, to speed up the computation, and second, to avoid possibly problematic syn pyrimidine conformers that show up in the fi nal models due to artifacts of the electron

3.5.4 Residue to be Rebuilt

3.5.5 Single-Residue Rebuilding Mode

3.5.6 Rebuild all Residues

3.5.7 Debugging Output

3.5.8 Fixed Residues

3.5.9 Extra Residues to Rebuild

3.5.10 Native Syn Only Pyrimidine

Fang-Chieh Chou et al.

Page 9: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

277

density map. Turning off this option will allow ERRASER to sam-ple syn pyrimidine in all cases. In practice we found that this option is advantageous for low-resolution models (>2.5 Å, as a rule of thumb); for high-resolution models where it is possible to distin-guish syn/anti conformers from the density map, the user may explore turning off this option. The user should carefully examine all the syn–anti pyrimidines fl ips in the fi nal model if this option is turned off, to make sure there are no suspicious syn pyrimidines in the models ( see Note 1 ).

By default, ERRASER applies a weak constraint on the χ angle (glycosidic torsion) to favor χ conformers that are similar to the input model. The purpose is also to avoid suspicious syn–anti fl ips during rebuilding; only alternative χ conformers with obvious energy bonuses will be accepted into the fi nal model. Similar to “native syn only pyrimidine” option above, this option is worth exploring for high-resolution data sets, but the user should care-fully examine the resulting output ( see Note 1 ).

The following options are accessible by clicking the “Directories” button in the main GUI.

If you have not followed Subheading 3.1 to set up the PHENIX_ROSETTA_PATH environmental variable, you can enter the path to the root Rosetta folder here. Otherwise just leave it blank.

This option specifi es the location of ERRASER scripts in the Rosetta root folder. By default Phenix should automatically fi gure out the correct path for ERRASER, but if the folder organization in future Rosetta releases change, the user can modify it here to make Phenix and ERRASER compatible.

It is also easy to run ERRASER from command-line using the phe-nix.erraser command line tool (Fig. 5 ). As a simple example, just run

phenix.erraser model.pdb map.ccp4

Here “model.pdb” is the input pdb model, and “map.ccp4” is the density map fi le. For a complete list of all available options in phenix.erraser, simply run

phenix.erraser --help

For example, one can use the following command line:

phenix.erraser model.pdb map.ccp4 single_res_mode = True rebuild_res_pdb = A25 map_reso = 2.0

This allows you to run ERRASER in the single-residue mode to rebuild residue A25. The input map resolution is also given as 2.0 Å.

3.5.11 Constrain χ

3.5.12 Rosetta Path

3.5.13 Rosetta Eraser Directory Name

3.6 Run ERRASER from Command Line

RNA Structure Refi nement with ERRASER

Page 10: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

278

An alternative way of running ERRASER is to use to the ROSIE online server (Fig. 6 ). In this way, the user does not have to install Phenix and Rosetta locally. The ROSIE ERRASER app can be found at http://rosie.rosettacommons.org/erraser/ . Simply fol-low the online documentation to run the job. ROSIE can run ERRASER in both standard mode and single-residue rebuilding mode. Note that the ROSIE server has fewer tunable options than local installation. For example, users cannot currently carry out multiple iterations of an ERRASER run (it is still possible manu-ally; see Note 3 ). One additional warning: the memory on each core of the ROSIE backend is limited; therefore the job may crash if the user inputs a very large electron density map fi le in ROSIE.

The refi nement tool in Phenix can be accessed by the phenix.refi ne GUI from the main Phenix window. Comprehensive documenta-tion for the phenix.refi ne GUI can be found in the Phenix website ( http://www.phenix-online.org/documentation/refi nement.htm ). Here we will discuss some non-default options that we found useful in refi ning RNA crystallographic models produced by ERRASER. All of the options discussed below can be found under the “refi nement setting” tab in the phenix.refi ne GUI. Since the refi nement setting is sensitive to the initial model and diffraction data, there is no generic rule in setting up the refi nement that works in all cases. The user can monitor the change of R , R free and other geometric valida-tion results of the refi nement outputs to help decide whether the selected refi nement strategy is appropriate.

3.7 Running ERRASER Using the ROSIE Server

3.8 Advice for Phenix RNA Refi nement

Fig. 5 The phenix.erraser command line application

Fang-Chieh Chou et al.

Page 11: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

279

We recommend turning off this option during refi nement of ERRASER models. By default Phenix alternates between real- space and diffraction-based refi nement during a refi nement cycle. As the ERRASER models have already gone through extensive real-space refi nement in the ERRASER protocol, further real-space refi nement in Phenix usually leads to no model improvement, or even leads to worse models in terms of clashes or other quality measures. For the exact parameters being used in refi ning the models in the 24 ERRASER benchmark set, please refer to the supplementary information of the original ERRASER paper.

While the default of three cycles of refi nement is usually good enough to refi ne the ERRASER models, sometimes additional cycles of refi nement leads to better results. The user may try a few different numbers (usually 3–10) for best results.

This option enables Phenix to perform a grid search to fi nd the best X-ray/ADP (Atomic Displacement Parameter, also known as B-factor) weight ratio during each refi nement cycle. In practice we found using this option generally leads to fi nal models with better R and R free .

3.8.1 Refi nement Strategy: Real-Space

3.8.2 Refi nement Strategy: Number of Cycles

3.8.3 Targets and Weighting: Optimize X-ray/ADP Weights

Fig. 6 The ROSIE ERRASER server application

RNA Structure Refi nement with ERRASER

Page 12: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

280

This option enables Phenix to perform a grid search to fi nd the best X-ray/Stereochemistry weight ratio during each refi nement cycle. In some cases this option leads to better R and R free , but in other cases it is better to use a constant X-ray/Stereochemistry weight (see below). For best refi nement results, the user can try both and select the best model.

This option sets constant X-ray/Stereochemistry weight during the refi nement. In practice we found that it is better to use a lower wxc_scale for models with lower resolution. For example, we used the following heuristic rule in determining the wxc_scale in refi n-ing the ERRASER benchmark: wxc_scale = 0.5 for Resolution < 2.3 Å, wxc_scale = 0.1 for 2.3 Å ≤ Resolution < 3 Å, wxc_scale = 0.05 for 3 Å ≤ Resolution ≤ 3.6 Å, and wxc_scale = 0.03 for Resolution > 3.6 Å.

The option enables automatic water picking during Phenix refi ne-ment. This procedure removes crystallographic waters lacking evident electron density and adds waters to locations with strong unoccupied electron density. In general we found this algorithm improved the model and reduced R and R free .

4 Notes

1. ERRASER can introduce fl ips of the χ angles (glycosidic tor-sion) during the run; we have shown previously that many of these fl ips are accurate when comparisons can be made of ERRASER-refi ned models to higher resolution data sets. If “constrain chi” or “native syn only pyrimidine” options are turned off, χ angle fl ips are made more often, and some of these fl ips may be incorrect. In low-resolution density maps (>2.5 Å), it can be hard to determine the glycosidic conformer, especially for pyrimidines. While most χ angle fl ips are reason-able, there are cases where the fl ips are likely to be incorrect. We recommend the user to inspect visually all the χ angle fl ips before proceeding with the fi nal model. Here are some general suggestions on examining the χ angle fl ips: First, syn conform-ers are rare, especially for syn pyrimidines. Any anti-to-syn fl ips should be closely examined. Unless there is strong electron density evidence or new hydrogen bond interactions, the fl ips are likely to be problematic. Syn-to-anti fl ips are more likely to be correct, but the user should still examine the models to make sure no important hydrogen-bonding interactions are broken during the fl ips.

2. Currently ERRASER does not handle crystal contacts, ligands and modifi ed residues during the refi nement. To address these

3.8.4 Targets and Weighting: Optimize X-ray/Stereochemistry Weights

3.8.5 Targets and Weighting: Refi nement Target Weights: wxc_scale

3.8.6 Other Options: Update Waters

Fang-Chieh Chou et al.

Page 13: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

281

issues, the user can use the “fi xed residues” option. Residues contacting ligands and residues in crystal contacts can be fi xed by adding them to the list of “fi xed residues.” For modifi ed resi-dues, one should further constrain the residues which the modi-fi ed residue is directly bonded to. For circular RNA or lariat connections, one can constrain the residues with additional bonds to ensure the bond does not break during ERRASER refi nement. An alternative way to handle crystal contacts is to manually add in the crystal packing partners during ERRASER step; this can be achieved using the symexp utility in PyMol. Usually it is unnecessary to add in the entire molecules of the crystal packing partners; one can cut out just the relevant resi-dues nearby. These extra add-in residues should be removed from the fi nal ERRASER models before Phenix refi nement.

3. To manually iterate ERRASER refi nement in ROSIE, simply use the output pdb fi le of the fi rst job as the input pdb fi le for a second job, and use the same input density map and options as the fi rst iteration.

5 Acknowledgements

The authors would like to thank Sergey Lyskov for implementing and maintaining the ROSIE Rosetta server, and members of the Rosetta and the Phenix communities for discussions and code sharing. The authors acknowledge support from Howard Hughes International Student Research Fellowship (F.C.), Stanford BioX graduate student fellowship (F.C.), Burroughs-Wellcome Career Award at Scientifi c Interface (R.D.), NIH grants R21 GM102716 (R.D.) and P01 GM063210 (P.D. Adams, PI, to N.E. and T.T.).

References

1. Golden BL, Kim H, Chase E (2005) Crystal structure of a phage Twort group I ribozyme- product complex. Nat Struct Mol Biol 12:82–89

2. Serganov A, Huang L, Patel DJ (2008) Structural insights into amino acid binding and gene control by a lysine riboswitch. Nature 455:1263–1267

3. Dunkle JA, Wang L, Feldman MB, Pulk A, Chen VB, Kapral GJ, Noeske J, Richardson JS, Blanchard SC, Cate JH (2011) Structures of the bacterial ribosome in classical and hybrid states of tRNA binding. Science 332:981–984

4. Davis IW, Leaver-Fay A, Chen VB, Block JN, Kapral GJ, Wang X, Murray LW, Arendall WB

3rd, Snoeyink J, Richardson JS, Richardson DC (2007) MolProbity: all-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res 35(suppl 2):W375–W383

5. Chen VB, Arendall WB 3rd, Headd JJ, Keedy DA, Immormino RM, Kapral GJ, Murray LW, Richardson JS, Richardson DC (2010) MolProbity: all-atom structure validation for macromolecular crystallography. Acta Cryst D 66:12–21

6. Das R, Baker D (2007) Automated de novo prediction of native-like RNA tertiary struc-tures. Proc Natl Acad Sci U S A 104:14664–14669

RNA Structure Refi nement with ERRASER

Page 14: Chapter 17 · Eric Ennifar (ed.), Nucleic Acid Crystallography: Methods and Protocols , Methods in Molecular Biology, vol. 1320, ... X-ray crystallography , ReÞ nement , Force Þ

282

7. Das R, Karanicolas J, Baker D (2010) Atomic accuracy in predicting and designing nonca-nonical RNA structure. Nat Methods 7:291–294

8. Sripakdeevong P, Kladwang W, Das R (2011) An enumerative stepwise ansatz enables atomic-accuracy RNA loop modeling. Proc Natl Acad Sci U S A 108:20573–20578

9. DiMaio F, Terwilliger TC, Read RJ, Wlodawer A, Oberdorfer G, Wagner U, Valkov E, Alon A, Fass D, Axelrod HL, Das D, Vorobiev SM, Iwaï H, Pokkuluri PR, Baker D (2011) Improved molecular replacement by density- and energy-guided protein structure optimiza-tion. Nature 473:540–543

10. DiMaio F, Tyka MD, Baker ML, Chiu W, Baker D (2009) Refi nement of protein struc-tures into low-resolution density maps using Rosetta. J Mol Biol 392:181–190

11. Adams PD, Afonine PV, Bunkoczi G, Chen VB, Davis IW, Echols N, Headd JJ, Hung LW, Kapral GJ, Grosse-Kunstleve RW, McCoy AJ, Moriarty NW, Oeffner R, Read RJ, Richardson DC, Richardson JS, Terwilligen TC, Zwart PH (2010) PHENIX: a compre-hensive Python- based system for macromo-lecular structure solution. Acta Cryst D 66:213–221

12. Chou FC, Sripakdeevong P, Dibrov SM, Hermann T, Das R (2013) Correcting perva-sive errors in RNA crystallography through enumerative structure prediction. Nat Methods 10:74–76

13. Lyskov S, Chou FC, Conchuir SO, Der BS, Drew K, Kuroda D, Xu J, Weitzner BD, Renfrew PD, Sripakdeevong P, Borgo B, Havranek JJ, Kuhlman B, Kortemme T, Bonneau R, Gray JJ, Das R (2013) Serverifi cation of molecular modeling applica-tions: the Rosetta online server that includes everyone (ROSIE). PLoS One 8:e63906

14. Leaver-Fay A, Tyka M, Lewis SM, Lange OF, Thompson J, Jacak R, Kaufman K, Renfrew PD, Smith CA, Sheffl er W, Davis IW, Cooper S, Treuille A, Mandell DJ, Richter F, Ban YE, Fleishman SJ, Corn J, Kortemme T, Gray JJ, Kuhlman B, Baker D, Bradley P (2011) ROSETTA3: an object-oriented software suite

for the simulation and design of macromole-cules. Methods Enzymol 487:545–574

15. Afonine PV, Grosse-Kunstleve RW, Echols N, Headd JJ, Moriarty NW, Mustyakimov M, Terwilliger TC, Urzhumtsev A, Zwart PH, Adams PD (2012) Towards automated crystal-lographic structure refi nement with phenix.refi ne. Acta Cryst D 68:352–367

16. Sheldrick G (2008) A short history of SHELX. Acta Cryst A 64:112–122

17. Vagin AA, Steiner RA, Lebedev AA, Potterton L, McNicholas S, Long F, Murshdov GN (2004) REFMAC5 dictionary: organization of prior chemical knowledge and guidelines for its use. Acta Cryst D 12:2184–2195

18. Brunger AT (2007) Version 1.2 of the crystal-lography and NMR system. Nat Protocols 2:2728–2733

19. Praznikar J, Afonine PV, Guncar G, Adams PD, Turk D (2009) Averaged kick maps: less noise, more signal…and probably less bias. Acta Cryst D 65:921–931

20. Echols N, Grosse-Kunstleve RW, Afonine PV, Bunkoczi G, Chen VB, Headd JJ, McCoy AJ, Moriarty NW, Read RJ, Richardsson DC, Richardson JS, Terwillerger TC, Adams PD (2012) Graphical tools for macromolecular crystallography in PHENIX. J Appl Crystallogr 45:581–586

21. Word JM, Lovell SC, LaBean TH, Taylor HC, Zalis ME, Presley BK, Richardson JS, Richardson DC (1999) Visualizing and quan-tifying molecular goodness-of-fi t: small-probe contact dots with explicit hydrogen atoms. J Mol Biol 285:1711–1733

22. Murray LJW, Arendall WB, Richardson DC, Richardson JS (2003) RNA backbone is rota-meric. Proc Natl Acad Sci U S A 100:13904–13909

23. Richardson JS, Schneider B, Murray LW, Kapral GJ, Immormino RM, Headd JJ, Richardson DC, Ham D, Hershkovits E, Williams LD, Keating KS, Pyle AM, Micallef D, Westbrook J, Berman HM, RNA Ontology Consortium (2008) RNA backbone: consen-sus all-angle conformers and modular string nomenclature (an RNA Ontology Consortium contribution). RNA 14:465–481

Fang-Chieh Chou et al.