PHENIX and neutrons. Neutrons in Biology, Santa Fe, NM, October 26, 2009. Pavel Afonine, Computational Crystallography Initiative, Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA

Upload: others

Post on 04-Feb-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

  • PHENIX and neutrons

    Neutrons in Biology Santa Fe, NM

    October 26, 2009

    Pavel Afonine

    Computational Crystallography Initiative, Physical Biosciences Division

    Lawrence Berkeley National Laboratory, Berkeley CA, USA

  • PHENIX project

    Collaboration between several groups:

      Los Alamos National Lab: Tom Terwilliger, Li-Wei Hung (SOLVE / RESOLVE); Paul Langan et al. (tools for neutron crystallography, MNC)

      Cambridge University, UK: Randy Read et al. (PHASER)

      Duke University: Jane & David Richardson et al. (MolProbity, hydrogens)

      Lawrence Berkeley National Lab: Paul Adams et al. (CCI Apps: phenix.refine, eLBOW, Xtriage, …)

    Paul Adams – project director

  • PHENIX and Neutron crystallography

    Macromolecular Neutron Crystallography Consortium (MNC)

    Lawrence Berkeley National Lab (LBNL): Paul Adams, Pavel Afonine

    Los Alamos National Lab: Paul Langan, Marat Mustyakimov, Benno Schoenborn

    http://mnc.lanl.gov/

  • The PHENIX project: facts

      PHENIX is a new package for automated structure solution that handles both X-ray and neutron data

      PHENIX is not a pipeline made of existing programs, but a highly integrated software package

      Library-based development (Python, C++) and new algorithms

      Designed to be used by both novices and experienced users

      Long-term development and support

  • Why Automation?

    •  Automation can increase efficiency and reduce human error

    •  Can speed up the process and can help reduce errors

    •  Makes difficult cases more feasible for experts

    •  Routine structure solution cases are accessible to a wider group of structural biologists

    •  Software can try more possibilities than we are typically willing to bother with

    •  Multiple trials or use of different parameters can be used to estimate uncertainties

    •  What is required:
       –  Software carrying out individual steps
       –  Integration between the steps (collaboration between developers)
       –  Algorithms to decide which is best from a list of possible results
       –  Strategies for structure determination and decision-making

  • PHENIX: principal tools

    Complete set of tools for crystallographic structure determination: from experimental data to PDB deposited structure

  • Running PHENIX

      Running PHENIX programs:
    -  GUI: easy for beginners, guided process, less chance of errors
    -  Command line: convenient for scripting multiple and large-scale tasks

      Command line tools are still easy to run:

      Autobuild (from starting phases to a complete and refined model):
        phenix.autobuild data=scale.mtz model=mr.pdb seq_file=correct.seq

      Ligandfit (automatically find and build ligands into density):
        phenix.ligandfit data=nsf.mtz model=noligand.pdb ligand=atp.pdb

      AutoMR (molecular replacement with Phaser + Autobuild = refined model):
        phenix.automr nsf-d2.mtz nsf.pdb

      phenix.refine (highly automated structure refinement, X-ray, neutron):
        phenix.refine nsf-d2.mtz nsf.pdb

      phenix.xtriage (complete data analysis):
        phenix.xtriage porin_fp.mtz
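The point about scripting many command-line runs can be illustrated with a small Python driver. This is only a sketch: it reuses the `phenix.refine <data> <model>` form shown on this slide, the second pair of file names is hypothetical, and by default it only prints the commands instead of running them.

```python
import shlex
import shutil
import subprocess


def refine_command(data_file, model_file):
    """Argument list for one phenix.refine run: data file first, then model."""
    return ["phenix.refine", data_file, model_file]


def run_all(jobs, dry_run=True):
    """Print every command; execute only if dry_run is False and PHENIX is on PATH."""
    commands = [refine_command(data, model) for data, model in jobs]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run and shutil.which(cmd[0]):
            subprocess.run(cmd, check=True)
    return commands


# The first pair comes from the slide; the second is a hypothetical extra data set.
jobs = [("nsf-d2.mtz", "nsf.pdb"), ("other.mtz", "other.pdb")]
commands = run_all(jobs)
```

Calling `run_all(jobs, dry_run=False)` would launch the runs one after another on a machine where PHENIX is installed.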

  • PHENIX: new GUI

  • Neutron structure determination

      An X-ray derived structure is always available and is used as a starting model for neutron structure determination. The main software needs are therefore:

    -  model preparation (add H or D, exchangeable H/D; library files for ligands)
    -  model refinement and completion (adding DOD, OD or O)
    -  model validation

  • Neutron Crystallography Challenges

      Crystallographic software is designed and optimized to work with X-ray data:

    -  Manual work is required to customize the software to work with neutron data
       •  define scattering tables
       •  change libraries to handle H, D or H/D atoms
       •  add D or exchangeable H/D (and not only H), which means manual file editing
    -  Cannot use all the features of the software (for example, TLS)
    -  Statistics for PDB deposition (REMARK 3 formatted header)

      Methodological challenges:

    -  Build H, D or H/D into the model, including water and ligands
    -  Optimize fit of water (DOD) into the density
    -  Optimize fit of rotatable X-H/D into the density
    -  Adding H and/or D increases the number of refinable parameters: need to use X-ray and neutron data simultaneously (joint XN refinement)
    -  Cancellation effect makes X-H species poorly defined in density. This needs to be addressed in refinement
    -  Constrained occupancy refinement of H/D sites

  • Solutions in PHENIX

      Automated tools to add H, D and/or exchangeable H/D sites and create consistent library (CIF) files for ligands

      Refinement using neutron data alone, or X-ray and neutron data jointly

    Joint refinement: $T_{\mathrm{JOINT}} = E_{\mathrm{XRAY}} \cdot w_{\mathrm{XC}} + E_{\mathrm{NEUTRON}} \cdot w_{\mathrm{NC}} + w_{\mathrm{C}} \cdot E_{\mathrm{GEOM}}$

      Constrained occupancy refinement of H/D sites is done completely automatically

      Automatic adjustment of H/D positions into density map by local real space search (as part of refinement)

      Ready-to-PDB-deposit output files: complete refinement statistics in REMARK 3 for both X-ray and neutron data. Example: PDB code 2R24.

      Automatic building of DOD (O and D atoms) into neutron maps (development version).
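The joint refinement target on this slide is a weighted sum of three terms, which is easy to check numerically. The function below is only an illustration of that arithmetic. The argument names mirror the symbols on the slide, and the numeric values are made up for the example, not taken from any refinement.

```python
def joint_target(e_xray, e_neutron, e_geom, w_xc, w_nc, w_c):
    """T_JOINT = E_XRAY * w_XC + E_NEUTRON * w_NC + w_C * E_GEOM."""
    return e_xray * w_xc + e_neutron * w_nc + w_c * e_geom


# Made-up values, chosen only so the sum is easy to follow:
# 2.0 * 0.5 + 3.0 * 0.25 + 1.0 * 4.0 = 1.0 + 0.75 + 4.0
t_joint = joint_target(e_xray=2.0, e_neutron=3.0, e_geom=4.0,
                       w_xc=0.5, w_nc=0.25, w_c=1.0)
print(t_joint)  # 5.75
```

Setting `w_xc` to zero reduces the expression to a neutron-only target with geometry restraints, which matches the "neutron data alone" option listed above.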

  • phenix.refine

      Highly automated, state-of-the-art structure refinement tool of PHENIX

      Active development mainly at Lawrence Berkeley National Laboratory:

    -  Paul Adams
    -  Pavel Afonine
    -  Nathaniel Echols
    -  Ralf Grosse-Kunstleve
    -  Nigel Moriarty
    -  Peter Zwart

    +  valuable scientific support by many others (Tom Terwilliger, Randy Read, Sasha Urzhumtsev, Vladimir Lunin, …)

    + other developers: Marat Mustyakimov, …

  • Automation of structure refinement

      How it used to be done in the past … and often still is nowadays

      Clearly, modern software should do all these steps automatically

      PHENIX is making good progress towards this goal

    Acta Cryst. (2002). D58, 2009-2017, Yousef et al.

  • Automation of structure refinement

      Recognize any input data file format: CNS/Xplor, MTZ, SHELX, SCALA, …

      Robust processing of PDB files

      Decide refinement strategy based on inputs: iso/aniso, twinning

      Combine multiple steps into one (water picking, TLS, SA, etc.)

      Make each step robust

      Integrate validation tools to minimize human error and shorten the time to structure solution

      Output complete information

      Preserve as much information as reasonable:
    -  Do not discard riding hydrogens
    -  Keep a complete footprint of the restraints used

  • phenix.refine: single program for a very broad range of resolutions

    Resolution ranges covered: low, medium and high, subatomic

    -  Group ADP refinement
    -  Rigid body refinement
    -  Torsion angle dynamics
    -  Restrained refinement (xyz, ADP: isotropic, anisotropic, mixed)
    -  Automatic water picking
    -  Automatic NCS restraints
    -  Simulated annealing
    -  Occupancies (individual, group, automatic constraints for alternative conformations)
    -  TLS refinement
    -  Use hydrogens at any resolution
    -  Refinement with twinned data
    -  X-ray, neutron, joint X-ray + neutron
    -  Built-in water picking and refinement
    -  Bond density model
    -  Unrestrained refinement
    -  FFT or direct
    -  Explicit hydrogens

  • Refine any part of a model with any strategy: all in one run

    + Automatic water picking + Simulated Annealing

    + Add and use hydrogens

  • Refinement flowchart

    Input: PDB model, any data format (CNS, SHELX, MTZ, …)

    Input data and model processing

    Refinement strategy selection

    Repeated several times:
    -  Bulk-solvent, anisotropic scaling, twinning parameters refinement
    -  Ordered solvent (add / remove)
    -  Target weights calculation
    -  Coordinate refinement (rigid body, individual) (minimization or simulated annealing)
    -  ADP refinement (TLS, group, individual iso / aniso)
    -  Occupancy refinement (individual, group)

    Output: refined model, various maps, structure factors, complete statistics, ready-for-deposition PDB file

    Files for COOT, O, PyMol
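The flowchart on this slide can be read as a fixed schedule: two setup steps, a block of steps repeated several times, then output. The sketch below only lists that order for a chosen number of repeats. The step names are copied from the slide, the assumption that the repeated block runs from bulk-solvent scaling to occupancy refinement is mine, and none of this is the actual phenix.refine code.

```python
# Steps assumed to be repeated in every cycle (names taken from the slide).
REPEATED_STEPS = [
    "bulk-solvent, anisotropic scaling, twinning parameters",
    "ordered solvent (add / remove)",
    "target weights calculation",
    "coordinate refinement",
    "ADP refinement",
    "occupancy refinement",
]


def refinement_schedule(n_cycles, steps=REPEATED_STEPS):
    """Ordered list of step labels for n_cycles repeats of the inner block."""
    schedule = ["input data and model processing", "refinement strategy selection"]
    for cycle in range(1, n_cycles + 1):
        schedule.extend(f"cycle {cycle}: {step}" for step in steps)
    schedule.append("output")
    return schedule


for label in refinement_schedule(2):
    print(label)
```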

  • Automatic Water Picking

    Water picking is done within refinement:

    -  remove “bad” water, judged by:
       •  2mFo-DFc (peak height)
       •  distances
       •  map CC (2mFo-DFc, Fc)
       •  B-factors and anisotropy
       •  occupancy
    -  add new water, judged by:
       •  mFo-DFc
       •  distances
    -  refine water’s ADP before adding to model
    -  Neutron refinement: automatically add and optimize D atoms

    (The refinement flowchart is repeated on this slide; water picking is its ordered solvent step.)
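The removal criteria for "bad" water listed on this slide amount to a set of per-water checks. A minimal sketch of such a filter follows. The cutoff values and field names are hypothetical, chosen only for illustration, and the anisotropy check is left out. The slide does not state the thresholds phenix.refine actually uses.

```python
from dataclasses import dataclass


@dataclass
class Water:
    peak_2mfo_dfc: float      # 2mFo-DFc map value at the water position (sigma)
    nearest_atom_dist: float  # distance to the closest non-water atom (Å)
    map_cc: float             # map correlation (2mFo-DFc vs Fc)
    b_iso: float              # isotropic B-factor (Å^2)
    occupancy: float


# Hypothetical cutoffs, for illustration only.
LIMITS = {"min_peak": 1.0, "min_dist": 1.8, "max_dist": 3.2,
          "min_cc": 0.7, "max_b": 80.0, "min_occ": 0.1}


def is_bad_water(w, lim=LIMITS):
    """True if the water fails any of the listed criteria."""
    return (
        w.peak_2mfo_dfc < lim["min_peak"]
        or not (lim["min_dist"] <= w.nearest_atom_dist <= lim["max_dist"])
        or w.map_cc < lim["min_cc"]
        or w.b_iso > lim["max_b"]
        or w.occupancy < lim["min_occ"]
    )


waters = [Water(2.5, 2.8, 0.9, 30.0, 1.0),   # passes every check
          Water(2.5, 4.5, 0.9, 30.0, 1.0),   # too far from any atom
          Water(0.5, 2.8, 0.9, 30.0, 1.0)]   # weak density
kept = [w for w in waters if not is_bad_water(w)]
```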

  • Local and global real-space refinement

    Repeated for N macro-cycles:

    Compute 2mFobs-DFmodel, mFobs-DFmodel, Fmodel maps

    for residue in residues:
        compute start map- and CC-values for residue
        for rotamer in rotamers:
            if criteria1: residue = rotamer
        real-space refine residue: residue_refined
        if criteria1: residue = residue_refined
        update structure with residue

    Validate changes: compute 2mFobs-DFmodel, mFobs-DFmodel and Fmodel
    for residue in residues:
        if criteria2: restore original residue (discard change)

    Update Fmodel and re-compute 2mFobs-DFmodel map

    Real-space refine whole model into 2mFobs-DFmodel

    phenix.refine protocol
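The per-residue loop on this slide can be mimicked with a toy model. In the sketch below a "conformation" is a single torsion angle and the "map fit" is a made-up score, so every function here is a hypothetical stand-in: the slide does not define criteria1 or criteria2, and the validation pass, which the slide runs separately after recomputing maps, is folded into the same loop for brevity.

```python
TARGET = 65.0  # toy "density peak" position, in degrees


def score(chi):
    """Made-up fit score: higher is better, best at TARGET."""
    return -abs(chi - TARGET)


def refine(chi):
    """Toy real-space refinement: move at most 3 degrees towards TARGET."""
    return chi + max(-3.0, min(3.0, TARGET - chi))


def rotamers_for(name):
    """Toy rotamer library, identical for every residue."""
    return [60.0, 180.0, 300.0]


def local_fit(residues):
    """One pass over all residues; returns the updated conformations."""
    updated = {}
    for name, start in residues.items():
        best = start
        for rotamer in rotamers_for(name):
            if score(rotamer) > score(best):   # stand-in for criteria1
                best = rotamer
        refined = refine(best)
        if score(refined) > score(best):       # stand-in for criteria1
            best = refined
        # Stand-in for criteria2: keep the change only if it is no worse.
        updated[name] = best if score(best) >= score(start) else start
    return updated


result = local_fit({"SER 12": 10.0})
print(result)  # {'SER 12': 63.0}
```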

  • Results evaluation

      Automation requires quick and simple tools for assessing whether a model is good in general.

      More sophisticated tools can be used for thorough model validation (such as MolProbity).

      “Is this model good?” The answer depends on many details, such as resolution. A typical question on crystallographic bulletin boards:

    •  “I got Rwork=20% and Rfree=25%, is it a good result?”

    -  Yes, it is likely a good result if the data resolution is around 2.5 Å.

    -  No, it is a very bad result if the data resolution is 1.0 Å or higher.

    •  One can ask similar questions about other parameters, such as bond/angle RMSDs, average B-factors, etc.

  • POLYGON: Crystallographic Model Quality at a Glance

      Say you are refining a structure at 1.0 Å resolution and the R-factors are Rwork = 18% and Rfree = 23%.

    -  Are these values good? Is refinement completed?

      Go to the PDB and plot the histograms of Rwork, Rfree and Rfree-Rwork for all similar structures:

    Rwork at 0.9-1.1 Å (Rwork range: number of structures)
      0.10 - 0.12: 68
      0.12 - 0.14: 94
      0.14 - 0.16: 73
      0.16 - 0.18: 17
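The histogram on this slide already answers the question for Rwork = 18%, as a short calculation shows. The sketch below uses only the four bins listed here, so it assumes the slide shows the whole distribution. If further bins were cut off, the fractions would change.

```python
# Rwork bins (as fractions) and structure counts at 0.9-1.1 Å, from the slide.
RWORK_BINS = [((0.10, 0.12), 68), ((0.12, 0.14), 94),
              ((0.14, 0.16), 73), ((0.16, 0.18), 17)]


def fraction_at_or_below(r_work, bins=RWORK_BINS):
    """Fraction of listed structures in bins whose upper edge is <= r_work."""
    total = sum(count for _, count in bins)
    below = sum(count for (_, upper), count in bins if upper <= r_work)
    return below / total


print(f"{fraction_at_or_below(0.16):.1%}")  # 93.3%
print(f"{fraction_at_or_below(0.18):.1%}")  # 100.0%
```

With these counts, about 93% of the comparable structures have Rwork of 16% or less, and none of the listed bins goes above 18%, so a model at Rwork = 18% sits at the very edge of the distribution.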

  • POLYGON: Crystallographic Model Quality at a Glance

    Figure panels: good model; likely incorrect model

    Acta Cryst. (2009). D65, 297-300

  • R-factors (all models in PDB at resolution < 1 Å)

    Plot axes: resolution (Å) vs R-factor (%)

    PDB code (year)   R-work, % (published)   R-work, % (phenix.refine)
    2ppn (2007)       20.9                    11.7
    1g2y (2000)       19.5                    12.3
    1zlb (2005)       16.8                    12.0
    2g6f (2006)       18.4                    12.9
    2elg (2007)       23.2                    13.0
    1aho (1997)       16.3                     9.6
    1zf5 (2005)       29.0                    16.9

      There are ~25 models out of 324 that have suspiciously high or very high R-factors.

    - For most of them, the R-factors can be decreased to values typical for this resolution (~10-15%) in one phenix.refine run.

      Automated software with integrated validation would immediately flag these models as suspicious.

  • Under-refined models or why automation is important

      Structure from PDB: 1eic (resolution = 1.4 Å; deposition year: 2000)

    PUBLISHED: Rwork = 20% Rfree = 25%

      Clear problems:
    -  No ‘riding’ H atoms
    -  All atoms are isotropic

      Potential problems:
    -  Suboptimal weights, refinement not converged, incomplete solvent model

      Fixing the model with PHENIX:
    -  Add and refine H as riding model
    -  Update ordered solvent
    -  Refine atoms as anisotropic (except H and water)
    -  Optimize X-ray/restraints weights

    FINAL MODEL: Rwork = 14% Rfree = 17%

      All this could be done by the software automatically, preventing deposition of under-refined models into the PDB

  • All neutron structures deposited in PDB with data available

    1: sum of exchangeable H/D does not add up to 1
    2: severe geometry problems
    3: minor geometry problems
    4: twinning
    5: bad or missing information in PDB file header
    6: H/D exchange is not modeled or incomplete
    7: free-R flags might be bad or absent
    8: I/F mismatch in input data file (F reported as I, for example)
    9: negative occupancies
    10: atoms with unknown scattering type
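Two of the problems in this list (items 1 and 9) are simple occupancy checks that software can run on every exchangeable site. The sketch below shows one way to express them. The function name and the tolerance are assumptions for illustration, not the test PHENIX applies.

```python
def hd_site_problems(occ_h, occ_d, tol=0.01):
    """Return the occupancy problems found at one exchangeable H/D site."""
    problems = []
    if occ_h < 0 or occ_d < 0:
        problems.append("negative occupancy")
    if abs(occ_h + occ_d - 1.0) > tol:
        problems.append("H/D occupancies do not add up to 1")
    return problems


print(hd_site_problems(0.3, 0.7))   # []
print(hd_site_problems(-0.2, 1.2))  # ['negative occupancy']
print(hd_site_problems(0.5, 0.8))   # ['H/D occupancies do not add up to 1']
```

Constraining the refinement so that the D occupancy is always one minus the H occupancy removes the second problem by construction, which is the idea behind the constrained occupancy refinement of H/D sites mentioned earlier in the talk.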

  • Re-refinement of neutron structures deposited in PDB

  • Conclusion

    •  PHENIX provides a complete set of tools for automated structure solution, refinement and validation

    •  PHENIX provides complete support for neutron structure solution

    •  Using PHENIX tools requires a minimal amount of manual work, which minimizes related errors

    •  Highly integrated validation tools make it possible to spot problems at earlier stages

    •  PDB deposited structures need to be re-visited and remediated. This should be done in close contact with the original authors.

  • PHENIX distribution

    •  Academic releases every few months

    •  Nightly builds

    •  Supported on:

    –  Linux (e.g. RedHat, Fedora)

    –  Mac OS X

    –  Some limited support on Windows

    •  Extensive documentation

    http://www.phenix-online.org

  • Acknowledgments

    •  Computational Crystallography Initiative: Paul Adams, Nat Echols, Ralf Grosse-Kunstleve, Nigel Moriarty, Nicholas Sauter, Peter Zwart

    •  Los Alamos National Laboratory: Tom Terwilliger, Li-Wei Hung, Paul Langan, Marat Mustyakimov

    •  Cambridge University: Randy Read, Airlie McCoy, Laurent Storoni, Gabor Bunkoczi, Robert Oeffner

    •  Duke University: Jane Richardson, David Richardson, Ian Davis, Vincent Chen, Jeff Headd

    •  Funding:
       -  NIH/NIGMS: P01GM063210, P50GM062412, P01GM064692, R01GM071939, PN2EY016525
       -  Lawrence Berkeley Laboratory
       -  PHENIX Industrial Consortium

      All PHENIX users

      Non-PHENIX colleagues for scientific support, very fruitful collaboration and constant valuable feedback:

    Sacha Urzhumtsev, Vladimir Lunin, Dale Tronrud, Dusan Turk

  • Workshop tomorrow !!!