-
PHENIX and neutrons
Neutrons in Biology Santa Fe, NM
October 26, 2009
Pavel Afonine
Computational Crystallography Initiative, Physical Biosciences Division
Lawrence Berkeley National Laboratory, Berkeley CA, USA
-
PHENIX project
Collaboration between several groups:
Los Alamos National Lab: Tom Terwilliger, Li-Wei Hung (SOLVE / RESOLVE); Paul Langan et al. (tools for neutron crystallography, MNC)
Cambridge University, UK: Randy Read et al. (PHASER)
Duke University: Jane & David Richardson et al. (MolProbity, hydrogens)
Lawrence Berkeley National Lab: Paul Adams et al. (CCI Apps: phenix.refine, eLBOW, Xtriage, …)
Paul Adams – project director
-
PHENIX and Neutron crystallography
Macromolecular Neutron Crystallography Consortium (MNC)
Lawrence Berkeley National Lab (LBNL): Paul Adams, Pavel Afonine
Los Alamos National Lab: Paul Langan, Marat Mustyakimov, Benno Schoenborn
http://mnc.lanl.gov/
-
The PHENIX project: facts
PHENIX is a new package for automated structure solution that handles both X-ray and neutron data
PHENIX is not a pipeline made of existing programs, but highly integrated software
Library based development (Python, C++) and new algorithms
Designed to be used by both novices and experienced users
Long-term development and support
-
Why Automation ?
• Automation can increase efficiency, and reduce human error
-
Why Automation ?
• Can speed up the process and can help reduce errors
• Makes difficult cases more feasible for experts
• Routine structure solution cases are accessible to a wider group of structural biologists
• Software can try more possibilities than we are typically willing to bother with
• Multiple trials or use of different parameters can be used to estimate uncertainties
• What is required:
– Software carrying out individual steps
– Integration between the steps (collaboration between developers)
– Algorithms to decide which is best from a list of possible results
– Strategies for structure determination and decision-making
-
PHENIX: principal tools
Complete set of tools for crystallographic structure determination: from experimental data to PDB deposited structure
-
Running PHENIX
Running PHENIX programs:
- GUI: easy for beginners, guided process, less chance of errors
- Command line: convenient for scripting multiple and large-scale tasks
Command line tools are still easy to run:
Autobuild (from starting phases to complete and refined model):
phenix.autobuild data=scale.mtz model=mr.pdb seq_file=correct.seq
Ligandfit (automatically find and build ligands into density):
phenix.ligandfit data=nsf.mtz model=noligand.pdb ligand=atp.pdb
AutoMR (molecular replacement with Phaser + Autobuild = refined model):
phenix.automr nsf-d2.mtz nsf.pdb
phenix.refine (highly automated structure refinement, X-ray, Neutron):
phenix.refine nsf-d2.mtz nsf.pdb
phenix.xtriage (complete data analysis):
phenix.xtriage porin_fp.mtz
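As an illustration of the "scripting of multiple tasks" point, here is a minimal Python sketch that builds one phenix.refine command per dataset, using only the positional form shown on the slide (data file, then model file). The helper names and the second file pair are hypothetical, and by default the script only prints the commands instead of running them.

```python
import shutil
import subprocess


def build_refine_command(data_file, model_file):
    """Argument list for one phenix.refine run: data file first, then model."""
    return ["phenix.refine", data_file, model_file]


def run_batch(jobs, dry_run=True):
    """Build (and, if dry_run is False, execute) one command per (data, model) pair."""
    commands = [build_refine_command(data, model) for data, model in jobs]
    if not dry_run:
        if shutil.which("phenix.refine") is None:
            raise RuntimeError("phenix.refine not found on PATH")
        for command in commands:
            subprocess.run(command, check=True)  # stop at the first failing job
    return commands


if __name__ == "__main__":
    # The first pair is from the slide; the second pair is a hypothetical example.
    jobs = [("nsf-d2.mtz", "nsf.pdb"), ("porin_fp.mtz", "porin.pdb")]
    for command in run_batch(jobs):
        print(" ".join(command))
```

Keeping command construction separate from execution makes it easy to inspect a large batch before anything is launched.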
-
PHENIX: new GUI
-
Neutron structure determination
An X-ray derived structure is always available and is used as a starting model for neutron structure determination, therefore the main software needs are:
- model preparation (add H or D, exchangeable H/D; library files for ligands)
- model refinement and completion (adding DOD, OD or O)
- model validation
-
Neutron Crystallography Challenges
Crystallographic software is designed and optimized to work with X-ray data:
- Manual work is required to customize the software to work with neutron data
• define scattering tables
• change libraries to adapt handling of H, D or H/D atoms
• add D or exchangeable H/D (and not only H) – manual file editing
- Cannot use all the features of the software (for example, TLS)
- Statistics for PDB deposition (REMARK 3 formatted header)
Methodological challenges:
- Build H, D or H/D into the model, including water or ligands
- Optimize fit of water (DOD) into the density
- Optimize fit of rotatable X-H/D into the density
- Adding H and/or D increases the number of refinable parameters: need to use X-ray and neutron data simultaneously (joint XN refinement)
- Cancellation effect makes X-H species poorly defined in density. This needs to be addressed in refinement
- Constrained occupancy refinement of H/D sites
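One way to see why partially exchanged H/D sites can vanish from neutron maps is to compute the effective scattering length of a mixed site. The sketch below is an illustration, not a PHENIX routine. It assumes the standard tabulated coherent scattering lengths (about -3.739 fm for H and 6.671 fm for D), which are not given on the slides, and the function names are hypothetical.

```python
# Coherent neutron scattering lengths in femtometres (standard tabulated values;
# an assumption of this sketch, not taken from the slides).
B_H = -3.739
B_D = 6.671


def effective_scattering_length(d_fraction):
    """Scattering length of a site occupied by D with fraction q and by H with 1 - q."""
    if not 0.0 <= d_fraction <= 1.0:
        raise ValueError("d_fraction must be between 0 and 1")
    return d_fraction * B_D + (1.0 - d_fraction) * B_H


def null_d_fraction():
    """D fraction at which the H and D contributions cancel exactly."""
    return -B_H / (B_D - B_H)


if __name__ == "__main__":
    q0 = null_d_fraction()
    print(f"cancellation at D fraction ~ {q0:.3f}")
    for q in (0.0, q0, 1.0):
        print(f"q = {q:.3f}: b_eff = {effective_scattering_length(q):+.3f} fm")
```

Near that D fraction a site contributes almost nothing to the map, which is one reason the H and D occupancies are refined under a constraint.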
-
Solutions in PHENIX
Automated tools to add H, D and/or exchangeable H/D sites and create consistent library (CIF) files for ligands
Refinement using neutron data alone or both X-ray and neutron data
Joint refinement: T_JOINT = E_XRAY * w_XC + E_NEUTRON * w_NC + w_C * E_GEOM
Constrained occupancy refinement of H/D sites is done completely automatically
Automatic adjustment of H/D positions into density map by local real space search (as part of refinement)
Ready-to-PDB-deposit output files: complete refinement statistics in REMARK 3 for both X-ray and neutron data. Example: PDB code 2R24.
Automatic building of DOD (O and D atoms) into neutron maps (development version).
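The joint refinement target listed above is a weighted sum of three terms. A minimal sketch of that arithmetic follows; the function and argument names are hypothetical, and how PHENIX actually chooses the weights is not described on the slide.

```python
def joint_target(e_xray, e_neutron, e_geom, w_xc, w_nc, w_c):
    """Weighted sum of the X-ray, neutron and geometry terms, as written on the slide."""
    return e_xray * w_xc + e_neutron * w_nc + w_c * e_geom


if __name__ == "__main__":
    # Illustrative numbers only.
    print(joint_target(2.0, 3.0, 4.0, w_xc=1.0, w_nc=1.0, w_c=1.0))  # all three terms
    print(joint_target(2.0, 3.0, 4.0, w_xc=1.0, w_nc=0.0, w_c=1.0))  # neutron term switched off
```

Setting one data weight to zero reduces the same expression to refinement against a single data set plus geometry restraints.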
-
phenix.refine
Highly automated, state-of-the-art structure refinement tool of PHENIX
Active development mainly at Lawrence Berkeley National Laboratory:
- Paul Adams
- Pavel Afonine
- Nathaniel Echols
- Ralf Grosse-Kunstleve
- Nigel Moriarty
- Peter Zwart
+ valuable scientific support by many others (Tom Terwilliger, Randy Read, Sasha Urzhumtsev, Vladimir Lunin, …)
+ other developers: Marat Mustyakimov, …
-
Automation of structure refinement
How it used to be done in the past … and often still is nowadays
Clearly, modern software should do all these steps automatically
PHENIX is making good progress in achieving this goal
Acta Cryst. (2002). D58, 2009-2017, Yousef et al.
-
Automation of structure refinement
Recognize any input data file format: CNS/Xplor, MTZ, SHELX, SCALA, …
Robust processing of PDB files
Decide refinement strategy based on inputs: iso/aniso, twinning
Combine multiple steps into one (water picking, TLS, SA, etc.)
Make each step robust
Integrate validation tools to minimize human error and optimize time of structure solution
Output complete information
Preserve as much information as reasonable
- Do not discard riding hydrogens
- Have a complete footprint of restraints used
-
phenix.refine: single program for a very broad range of resolutions
Resolution ranges: Low | Medium and High | Subatomic
- Group ADP refinement
- Rigid body refinement
- Torsion angle dynamics
- Restrained refinement (xyz, ADP: isotropic, anisotropic, mixed)
- Automatic water picking
- Automatic NCS restraints
- Simulated Annealing
- Occupancies (individual, group, automatic constraints for alternative conformations)
- TLS refinement
- Use hydrogens at any resolution
- Refinement with twinned data
- X-ray, Neutron, joint X-ray + Neutron
- Built-in water picking and refinement
- Bond density model
- Unrestrained refinement
- FFT or direct
- Explicit hydrogens
-
Refine any part of a model with any strategy: all in one run
+ Automatic water picking + Simulated Annealing
+ Add and use hydrogens
-
Refinement flowchart
Input: PDB model, any data format (CNS, Shelx, MTZ, …)
Input data and model processing
Refinement strategy selection
Repeated several times:
- Bulk-solvent, anisotropic scaling, twinning parameters refinement
- Ordered solvent (add / remove)
- Target weights calculation
- Coordinate refinement (rigid body, individual) (minimization or Simulated Annealing)
- ADP refinement (TLS, group, individual iso / aniso)
- Occupancy refinement (individual, group)
Output: Refined model, various maps, structure factors, complete statistics, ready for deposition PDB file
Files for COOT, O, PyMol
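The flowchart can be read as a fixed sequence of steps repeated over several macro-cycles. The sketch below only records the order in which the steps would run. Which steps sit inside the repeated block is an interpretation of "Repeated several times", and all names are hypothetical. It is not the phenix.refine implementation.

```python
# Steps taken from the flowchart; grouping them into the repeated block is an assumption.
MACRO_CYCLE_STEPS = [
    "bulk-solvent, anisotropic scaling, twinning parameters",
    "ordered solvent (add / remove)",
    "target weights calculation",
    "coordinate refinement",
    "ADP refinement",
    "occupancy refinement",
]


def refinement_schedule(n_macro_cycles):
    """Return the ordered list of steps for a run with n_macro_cycles repeats."""
    schedule = ["input data and model processing", "refinement strategy selection"]
    for cycle in range(1, n_macro_cycles + 1):
        for step in MACRO_CYCLE_STEPS:
            schedule.append(f"macro-cycle {cycle}: {step}")
    schedule.append("output: refined model, maps, structure factors, statistics")
    return schedule


if __name__ == "__main__":
    for line in refinement_schedule(2):
        print(line)
```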
-
Automatic Water Picking
Water picking done within refinement
- remove “bad” water:
• 2mFo-DFc (peak height)
• Distances
• map CC (2mFo-DFc, Fc)
• B-factors and anisotropy
• occupancy
- add new:
• mFo-DFc
• distances
- refine water’s ADP before adding to model
- Neutron refinement: automatically add and optimize D atoms
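The "remove bad water" criteria can be pictured as a simple filter over per-water statistics. In the sketch below every threshold is a hypothetical placeholder chosen for illustration, not a PHENIX default, and the anisotropy criterion from the slide is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Water:
    peak_2mfo_dfc: float  # 2mFo-DFc map value at the water position (sigma units)
    min_distance: float   # distance to the nearest non-water atom (Å)
    map_cc: float         # correlation between the 2mFo-DFc and Fc maps around the water
    b_iso: float          # isotropic B-factor (Å^2)
    occupancy: float


def keep_water(w, min_peak=1.0, distance_range=(1.8, 3.2),
               min_cc=0.7, max_b=80.0, min_occupancy=0.1):
    """Return True if the water passes every criterion (all thresholds are hypothetical)."""
    return (
        w.peak_2mfo_dfc >= min_peak
        and distance_range[0] <= w.min_distance <= distance_range[1]
        and w.map_cc >= min_cc
        and w.b_iso <= max_b
        and w.occupancy >= min_occupancy
    )


if __name__ == "__main__":
    waters = [Water(2.0, 2.8, 0.9, 30.0, 1.0), Water(0.5, 2.8, 0.9, 30.0, 1.0)]
    print([keep_water(w) for w in waters])
```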
Input data and model processing
Refinement strategy selection
Bulk-solvent, Anisotropic scaling, Twinning parameters refinement
Ordered solvent (water picking)
Target weights calculation
Coordinate refinement (rigid body, individual) (minimization or SA)
ADP refinement (TLS, group, individual iso / aniso)
Occupancy refinement (individual, group)
Output: Refined model, various maps, structure factors, complete statistics, ready for deposition PDB file
-
Local and global real-space refinement
Update Fmodel and re-compute 2mFobs-DFmodel map
Real-space refine whole model into 2mFobs-DFmodel
Compute 2mFobs-DFmodel, mFobs-DFmodel, Fmodel maps
for residue in residues:
    compute start map- and CC-values for residue
    for rotamer in rotamers:
        if criteria1: residue = rotamer
    real-space refine residue: residue_refined
    if criteria1: residue = residue_refined
    update structure with residue
Validate changes: compute 2mFobs-DFmodel, mFobs-DFmodel and Fmodel
for residue in residues:
    if criteria2: restore original residue (discard change)
N macro-cycles (phenix.refine protocol)
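The local search and the validation pass can be sketched in Python as follows. This is a toy under stated assumptions: `score`, `rotamers_for` and `refine` are hypothetical callables standing in for the map/CC evaluation, the rotamer library and real-space refinement, and "criteria1" and "criteria2" are replaced by plain score comparisons.

```python
def local_rotamer_fit(residues, rotamers_for, score, refine):
    """For each residue keep the best-scoring candidate (stand-in for criteria1)."""
    updated = {}
    for name, current in residues.items():
        best, best_score = current, score(current)  # start map/CC value
        for rotamer in rotamers_for(name):
            candidate_score = score(rotamer)
            if candidate_score > best_score:
                best, best_score = rotamer, candidate_score
        refined = refine(best)  # real-space refinement of the chosen conformation
        if score(refined) > best_score:
            best = refined
        updated[name] = best
    return updated


def validate_changes(original, updated, score):
    """Restore the original residue when the change scores worse (stand-in for criteria2)."""
    return {
        name: updated[name] if score(updated[name]) >= score(original[name]) else original[name]
        for name in original
    }


if __name__ == "__main__":
    # Toy model: a conformation is one chi angle, and the "map" prefers 60 degrees.
    def toy_score(chi):
        return -abs(chi - 60.0)

    residues = {"LEU 10": 170.0, "VAL 20": 62.0}
    fitted = local_rotamer_fit(residues, lambda name: [-60.0, 60.0, 180.0],
                               toy_score, lambda chi: chi)
    print(validate_changes(residues, fitted, toy_score))
```

In the real protocol the maps are recomputed between the fitting and validation passes, which is why a change accepted locally can still be rejected later.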
-
Results evaluation
Automation requires quick and simple tools for assessing whether a model is good in general.
More sophisticated tools can be used for thorough model validation (such as MolProbity).
“Is this model good?” The answer depends on many details, such as resolution. A typical question on crystallographic bulletin boards:
• “I got Rwork=20% and Rfree=25%, is it a good result?”
- Yes, it is likely a good result if the data resolution is around 2.5 Å.
- No, it is a very bad result if the data resolution is 1.0 Å or higher.
• One can ask similar questions about other parameters, such as bond/angle RMSDs, average B-factors, etc.
-
POLYGON: Crystallographic Model Quality at a Glance
Say you are refining a structure at 1.0 Å resolution and the R-factors are: Rwork = 18% and Rfree = 23%.
- Are these values good? Is refinement completed?
Go to PDB and plot the histograms for Rwork, Rfree and Rfree-Rwork for all similar structures:
Rwork at 0.9-1.1 Å (Rwork bin: number of structures)
0.10 - 0.12: 68
0.12 - 0.14: 94
0.14 - 0.16: 73
0.16 - 0.18: 17
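Using only the four bins listed on the slide (the full histogram may contain more), a short calculation shows where Rwork = 0.18 falls relative to comparable structures. The function name is hypothetical.

```python
# (lower edge, upper edge) of each Rwork bin and the number of structures, from the slide.
RWORK_HISTOGRAM = [
    ((0.10, 0.12), 68),
    ((0.12, 0.14), 94),
    ((0.14, 0.16), 73),
    ((0.16, 0.18), 17),
]


def fraction_at_or_below(value, histogram):
    """Fraction of listed structures in bins whose upper edge does not exceed value."""
    total = sum(count for _, count in histogram)
    below = sum(count for (_, upper), count in histogram if upper <= value)
    return below / total


if __name__ == "__main__":
    print(f"Rwork <= 0.16: {fraction_at_or_below(0.16, RWORK_HISTOGRAM):.1%} of listed structures")
    print(f"Rwork <= 0.18: {fraction_at_or_below(0.18, RWORK_HISTOGRAM):.1%} of listed structures")
```

Among the listed bins, 235 of 252 structures have Rwork no higher than 0.16, so a value of 0.18 sits at the very edge of what was deposited at this resolution.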
-
POLYGON: Crystallographic Model Quality at a Glance
Good model Likely incorrect model
Acta Cryst. (2009). D65, 297-300
-
R-factors (all models in PDB at resolution < 1 Å )
[Plot: R-factor, % versus resolution, Å]
PDB code (year)    R-work, % (published)    R-work, % (phenix.refine)
2ppn (2007)        20.9                     11.7
1g2y (2000)        19.5                     12.3
1zlb (2005)        16.8                     12.0
2g6f (2006)        18.4                     12.9
2elg (2007)        23.2                     13.0
1aho (1997)        16.3                     9.6
1zf5 (2005)        29.0                     16.9
There are ~25 models out of 324 that have suspiciously high or very high R-factors.
- For most of them the R-factors can be decreased to values typical for this resolution (~10-15%) in one phenix.refine run.
Automated software with integrated validation would immediately flag these models as suspicious.
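A check of this kind can be very simple. The sketch below flags models whose R-work exceeds the upper end of the "typical" range quoted above (15%), using the R-work values from the table on this slide. The function name and the choice of 15% as a hard limit are assumptions made for illustration.

```python
# PDB code -> (published R-work %, R-work % after one phenix.refine run), from the slide.
RWORK = {
    "2ppn": (20.9, 11.7),
    "1g2y": (19.5, 12.3),
    "1zlb": (16.8, 12.0),
    "2g6f": (18.4, 12.9),
    "2elg": (23.2, 13.0),
    "1aho": (16.3, 9.6),
    "1zf5": (29.0, 16.9),
}


def flag_high_rwork(rwork_by_code, limit=15.0):
    """Return the sorted PDB codes whose R-work (in %) is above the limit."""
    return sorted(code for code, value in rwork_by_code.items() if value > limit)


if __name__ == "__main__":
    published = {code: values[0] for code, values in RWORK.items()}
    refined = {code: values[1] for code, values in RWORK.items()}
    print("flagged as published:", flag_high_rwork(published))
    print("flagged after phenix.refine:", flag_high_rwork(refined))
```

With these numbers all seven published models are flagged, and only 1zf5 remains above the limit after re-refinement.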
-
Under-refined models or why automation is important
Structure from PDB: 1eic (resolution = 1.4 Å; deposition year: 2000)
PUBLISHED: Rwork = 20% Rfree = 25%
Clear problems:
- No ‘riding’ H atoms
- All atoms are isotropic
Potential problems:
- Suboptimal weights, refinement is not converged, incomplete solvent model
Fixing the model with PHENIX:
- Add and refine H as riding model
- Update ordered solvent
- Refine atoms as anisotropic (except H and water)
- Optimize X-ray/restraints weights
FINAL MODEL: Rwork = 14% Rfree = 17%
All this could be done by the software automatically, preventing deposition of under-refined models into the PDB
-
All neutron structures deposited in PDB with data available
1: sum of exchangeable H/D does not add up to 1
2: severe geometry problems
3: minor geometry problems
4: twinning
5: bad or missing information in PDB file header
6: H/D exchange is not modeled or incomplete
7: free-R flags might be bad or absent
8: I/F mismatch in input data file (F reported as I, for example)
9: negative occupancies
10: atoms with unknown scattering type
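Problems 1 and 9 in this list are the kind of thing a constrained parameterization rules out by construction. A minimal sketch, with hypothetical function names and tolerance: one function checks a deposited H/D pair, the other derives both occupancies from a single D fraction so that they always sum to 1 and are never negative.

```python
def check_hd_site(occ_h, occ_d, tol=0.01):
    """Return a list of problems found for one exchangeable H/D site."""
    problems = []
    if occ_h < 0 or occ_d < 0:
        problems.append("negative occupancy")
    if abs(occ_h + occ_d - 1.0) > tol:
        problems.append("H/D occupancies do not sum to 1")
    return problems


def constrained_pair(d_fraction):
    """Occupancies (H, D) derived from one parameter clamped to the range 0..1."""
    q = min(1.0, max(0.0, d_fraction))
    return 1.0 - q, q


if __name__ == "__main__":
    print(check_hd_site(0.5, 0.7))                  # sum is 1.2
    print(check_hd_site(-0.1, 1.1))                 # sums to 1 but one value is negative
    print(check_hd_site(*constrained_pair(1.3)))    # constrained pair passes both checks
```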
-
Re-refinement of neutron structures deposited in PDB
-
Conclusion
• PHENIX provides a complete set of tools for automated structure solution, refinement and validation
• PHENIX provides complete support for neutron structure solution
• Using PHENIX tools requires a minimal amount of manual work, minimizing related errors
• Highly integrated validation tools make it possible to spot problems at earlier stages
• PDB deposited structures need to be revisited and remediated. This should be done in close contact with the original authors.
-
PHENIX distribution
• Academic releases every few months
• Nightly builds
• Supported on:
– Linux (e.g. RedHat, Fedora)
– Mac OS X
– Some limited support on Windows
• Extensive documentation
http://www.phenix-online.org
-
Acknowledgments
• Computational Crystallography Initiative: Paul Adams, Nat Echols, Ralf Grosse-Kunstleve, Nigel Moriarty, Nicholas Sauter, Peter Zwart
• Los Alamos National Laboratory: Tom Terwilliger, Li-Wei Hung, Paul Langan, Marat Mustyakimov
• Cambridge University: Randy Read, Airlie McCoy, Laurent Storoni, Gabor Bunkoczi, Robert Oeffner
• Duke University: Jane Richardson, David Richardson, Ian Davis, Vincent Chen, Jeff Headd
• Funding:
- NIH/NIGMS: P01GM063210, P50GM062412, P01GM064692, R01GM071939, PN2EY016525
- Lawrence Berkeley Laboratory
- PHENIX Industrial Consortium
All PHENIX users
Non-PHENIX colleagues for scientific support, very fruitful collaboration and constant valuable feedback:
Sacha Urzhumtsev, Vladimir Lunin, Dale Tronrud, Dusan Turk
-
Workshop tomorrow !!!