Wide Search Molecular Replacement and the NEBioGrid Portal Interface

Wide-Search Molecular Replacement, Ian Stokes-Rees, http://portal.nebiogrid.org/

Upload: ian-stokes-rees

Post on 11-May-2015


Page 1: Wide Search Molecular Replacement and the NEBioGrid portal interface

Wide-Search Molecular Replacement

Ian Stokes-Rees
http://portal.nebiogrid.org/


When WS-MR is suitable

• You’ve got good data (< 4 Å)
• You’ve tried MR with lots of good candidates
  • a priori knowledge
  • sequence similarity (PSI-BLAST search)
• Or
  • protein not sequenced
  • no a priori knowledge of expected fold
• You haven’t found any good models to use for phasing
• Time to try a brute-force search: WS-MR


When WS-MR is not suitable

• Complexes containing significant DNA or RNA
  • at least right now, these will probably not work
• You haven’t tried MR and just want a “quick fix”
• Very large or very small structures
  • both are computationally difficult
• Low resolution (> 4.5 Å)
  • experience so far suggests these aren’t going to be helped much


Requirements

• Reflection data in MTZ file format
  • Must have amplitude columns (e.g. FP, SIGFP)
  • Doesn’t work with intensities (I, SIGI)
• Time
  • To analyze results
  • To take next steps
• Managed expectations
  • Good MR candidates are identified in about 1 in 4 cases
  • We don’t produce a fully phased structure, only a list of good MR candidates and their best placements as returned by Phaser
• Experience with Phaser to interpret results and re-run candidate models


Background

• Utilizes Phaser for MR
• Utilizes Open Science Grid for computing
• References
  • Stokes-Rees, Sliz, Protein structure determination by exhaustive search of Protein Data Bank derived databases, Proc. Nat'l Academy of Sciences, doi:10.1073/pnas.1012095107
  • Stokes-Rees, Sliz, Compute and data management strategies for grid deployment of high throughput protein structure studies, IEEE Workshop on Many Task Computing on Grids and Supercomputers 2010 (MTAGS10), Seattle, November 2010
  • Phaser: McCoy, Grosse-Kunstleve, Adams, Winn, Storoni, Read; J. Appl. Cryst. (2007). 40, 658-674
  • Murzin A. G., Brenner S. E., Hubbard T., Chothia C. (1995). SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536-540.
• Requires 20,000-50,000 hours of computing
• Produces 300,000 files
• Attempts 100,000 single-domain MR trials using all SCOP domains
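A quick back-of-the-envelope check of the figures above, sketched in Python. It assumes the compute estimate means 20,000 to 50,000 CPU hours and that the work is spread evenly across trials; the constant and function names are illustrative only.

```python
# Figures quoted on the Background slide.
TRIALS = 100_000               # single-domain MR trials
FILES = 300_000                # files produced
CPU_HOURS = (20_000, 50_000)   # assumed reading of the quoted compute range


def minutes_per_trial(cpu_hours, trials=TRIALS):
    """Average compute time per MR trial, in minutes."""
    return cpu_hours * 60 / trials


low, high = (minutes_per_trial(h) for h in CPU_HOURS)
files_per_trial = FILES / TRIALS

print(f"{low:.0f}-{high:.0f} minutes per trial, {files_per_trial:.0f} files per trial")
# prints "12-30 minutes per trial, 3 files per trial"
```

So each Phaser trial averages somewhere between 12 and 30 minutes, and leaves about three files behind.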


Step 1: Register to use Portal

https://portal.nebiogrid.org/d/accounts/create


Step 2: Submit Computational Task

https://portal.nebiogrid.org/d/apps/wsmr/create


Side Note: MTZ columns

• Use CCP4 tool “mtzdmp” to check column names and resolution if you’re not sure

$ mtzdmp GAS.mtz | less
...
 * Column Labels : H K L FP SIGFP FreeRflag        (column names)
...
 * Resolution Range : 0.00050 0.25197 ( 44.699 - 1.992 A )        (resolution)
...
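If you would rather check the columns and resolution in a script, the sketch below parses text already captured from mtzdmp. It assumes the output contains the "* Column Labels :" and "* Resolution Range :" headers shown on the slide, with the values either on the same line or on the next non-blank line. The sample text and function names are illustrative and are not part of CCP4.

```python
import re

# Illustrative excerpt modelled on the slide; real output has more sections.
SAMPLE = """\
 * Column Labels :

 H K L FP SIGFP FreeRflag

 * Resolution Range :

    0.00050    0.25197     (     44.699 -      1.992 A )
"""


def value_after(header, text):
    """Return the text following '* <header> :' (same line or next non-blank line)."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        match = re.match(rf"\s*\*\s*{re.escape(header)}\s*:\s*(.*)$", line)
        if match:
            if match.group(1).strip():
                return match.group(1).strip()
            for later in lines[i + 1:]:
                if later.strip():
                    return later.strip()
    raise ValueError(f"'{header}' not found")


def column_labels(text):
    """List of column labels, e.g. ['H', 'K', 'L', 'FP', 'SIGFP', ...]."""
    return value_after("Column Labels", text).split()


def resolution_limits(text):
    """(low, high) resolution limits in Angstrom, read from the bracketed range."""
    match = re.search(r"\(\s*([\d.]+)\s*-\s*([\d.]+)\s*A\s*\)",
                      value_after("Resolution Range", text))
    if match is None:
        raise ValueError("resolution range not recognised")
    return float(match.group(1)), float(match.group(2))


def has_amplitudes(labels, f="FP", sigf="SIGFP"):
    """True if both the amplitude column and its sigma column are present."""
    return f in labels and sigf in labels


labels = column_labels(SAMPLE)
print(labels, resolution_limits(SAMPLE), has_amplitudes(labels))
```

A file that only lists I and SIGI would fail the has_amplitudes check, which matches the requirement that intensities are not accepted.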


Step 3a: Review active task list on portal

click here to access task


Step 3b: Check email for task details and link

click here to access task


Step 4: Log into job page


Step 5a: Review web page


Step 5b: Check status

R = Running
I = Idle
H = Held

Remember: Someone from SBGrid will manually review your job and release it. Until that happens your job won’t even be in the queue. Even after that, it could be in the queue for several days before it starts running. Do email us if you have questions or if it seems stuck or not running.

Click here
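The three status letters can be tallied with a few lines of Python. This is only a sketch: it assumes you have already collected one status letter per sub-job (how you obtain them from the status page is not covered here), and the names are illustrative.

```python
from collections import Counter

# Status letters as listed on the slide.
STATUS_NAMES = {"R": "Running", "I": "Idle", "H": "Held"}


def summarize(codes):
    """Count sub-jobs per status; reject letters not listed on the slide."""
    counts = Counter(codes)
    unknown = set(counts) - set(STATUS_NAMES)
    if unknown:
        raise ValueError(f"unexpected status codes: {sorted(unknown)}")
    return {name: counts.get(code, 0) for code, name in STATUS_NAMES.items()}


print(summarize(["H", "H", "H", "I"]))
```

A job where everything is Held has most likely not yet been reviewed and released.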


Step 5c: Check status

outcomes to date

summary of active jobs


Step 6a: Review scatter graphs

Look for a cluster of high TFZ and high LLG results distinct from the rest

NOTE: This graph is a static image
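Spotting a cluster of high TFZ and high LLG results by eye can be backed up with a simple numerical screen. The sketch below flags results whose TFZ and LLG both sit far above the bulk, using robust z-scores (median and MAD). This is an assumption of ours for illustration, not how the portal scores results; the tuple layout, the threshold of 5, and the sample data are all hypothetical.

```python
from statistics import median


def robust_z(values):
    """Distance of each value from the median, in units of scaled MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = (1.4826 * mad) or 1.0   # fall back to 1.0 if all values are equal
    return [(v - med) / scale for v in values]


def standout_hits(results, threshold=5.0):
    """results: list of (scop_id, tfz, llg). Return ids far above the bulk in both."""
    tfz_z = robust_z([r[1] for r in results])
    llg_z = robust_z([r[2] for r in results])
    return [r[0] for r, zt, zl in zip(results, tfz_z, llg_z)
            if zt > threshold and zl > threshold]


# Made-up data: 20 background trials plus two clear outliers.
background = [(f"bg{i}", 3.0 + 0.1 * i, 10.0 + i) for i in range(20)]
hits = [("hit1", 12.5, 210.0), ("hit2", 11.8, 190.0)]
print(standout_hits(background + hits))   # prints ['hit1', 'hit2']
```

An empty list is the usual outcome, in line with the caveat that most searches produce no strong candidates.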


Step 6b: Cases with no strong MR candidates*

* Remember this is usually the case, unfortunately


Step 6c: Review scatter graphs

NOTE: This graph is a dynamic clickable image. Only the first 5000 results by LLG are currently available because of memory constraints.

Click this button to load data and enable clickable image
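The same "first 5000 by LLG" cut can be reproduced on downloaded results. A minimal sketch, assuming each result is a dict with an "llg" key (a hypothetical layout, not the portal's own data structure):

```python
import heapq


def top_by_llg(results, limit=5000):
    """Return up to `limit` results with the highest LLG, best first."""
    return heapq.nlargest(limit, results, key=lambda r: r["llg"])


# Made-up results for illustration.
results = [{"id": f"d{i}", "llg": float(i)} for i in range(10)]
print([r["id"] for r in top_by_llg(results, limit=3)])   # prints ['d9', 'd8', 'd7']
```

heapq.nlargest avoids sorting the whole list when only the top few thousand entries are needed.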


Step 6d: Review scatter graphs

Click data point to view details

Click large cartoon image to add to image basket

PDB details


Step 7: Review tabular data

live results (space delimited)

sorted results (tab delimited), generated by “check status”
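Both tabular files can be read with the same small helper; only the delimiter differs. This is a sketch: the column order in the sample rows (identifier, TFZ, LLG) is a guess for illustration, so check the real files before relying on it.

```python
def parse_results(text, delimiter=None):
    """Split delimited text into rows of string fields.

    delimiter=None splits on runs of whitespace (space-delimited live results);
    pass a tab character for the tab-delimited sorted results.
    """
    rows = []
    for line in text.splitlines():
        if line.strip():                     # skip blank lines
            rows.append(line.split(delimiter))
    return rows


# Hypothetical rows: identifier, TFZ, LLG.
live = "d1 5.2 41.0\n\nd2 4.8 30.5\n"
sorted_results = "d1\t5.2\t41.0\n"

print(parse_results(live))
print(parse_results(sorted_results, "\t"))
```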


Step 8: Wait for job to finish

results approx. 100,000
errors < 5,000

No running jobs (all done)

NOTE: This job is not yet finished!
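The three signs of a finished job (about 100,000 results, fewer than 5,000 errors, nothing still running) can be written as a single check. A sketch only: the JobSummary fields and the 5% tolerance on "approx." are our assumptions, not values taken from the portal.

```python
from dataclasses import dataclass


@dataclass
class JobSummary:
    results: int   # completed MR trials reported so far
    errors: int    # failed trials
    running: int   # sub-jobs still running


def looks_finished(job, expected=100_000, max_errors=5_000, tolerance=0.05):
    """True when the result count is near `expected`, errors are low, nothing runs."""
    near_expected = abs(job.results - expected) <= tolerance * expected
    return near_expected and job.errors < max_errors and job.running == 0


print(looks_finished(JobSummary(results=99_200, errors=800, running=0)))      # True
print(looks_finished(JobSummary(results=60_000, errors=300, running=1_200)))  # False
```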


Step 9: Download finalized augmented results

augmented contains static SCOP domain class and name (25 MB)

final contains a sorted, cleaned set of results (5 MB)


Step 10: Review and download specific SCOP PDB

• Use the tabular results to identify specific SCOP codes that look promising
• PDBs can be fetched using one of these resources:
  http://portal.nebiogrid.org/biodb/scop/v1.75/clean/code2/
  http://abitibi.sbgrid.org/cgi/pdbview.py
  http://abitibi.sbgrid.org/cgi/tmalign.py


Step 11: Recreate Phaser output

Click on “test” directory (bottom of job page)

This is the command input to Phaser:

ROOT 2vlj-test
MODE MR_AUTO
HKLIn ../2vlj.mtz
LABIn F=FP SIGF=SIGFP
ENSEmble 200la_ PDB 00/200la_.pdb IDENtity 0.3
COMPosition SOLVENT 50.0
RESOlution 2.4
SEARch ENSEmble 200la_ NUM 1
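To re-run several candidate models it can help to generate this command input from a template. The sketch below reproduces the keyword lines from the slide for a given model; the function name and its parameters are ours, and the default values are simply those shown in the example.

```python
def phaser_input(root, mtz, ensemble, pdb, f="FP", sigf="SIGFP",
                 identity=0.3, solvent=50.0, resolution=2.4):
    """Build Phaser keyword input matching the layout shown on the slide."""
    lines = [
        f"ROOT {root}",
        "MODE MR_AUTO",
        f"HKLIn {mtz}",
        f"LABIn F={f} SIGF={sigf}",
        f"ENSEmble {ensemble} PDB {pdb} IDENtity {identity}",
        f"COMPosition SOLVENT {solvent}",
        f"RESOlution {resolution}",
        f"SEARch ENSEmble {ensemble} NUM 1",
    ]
    return "\n".join(lines) + "\n"


# Values taken from the slide's example.
print(phaser_input("2vlj-test", "../2vlj.mtz", "200la_", "00/200la_.pdb"), end="")
```

Change the ensemble name and PDB path to try another SCOP domain from the results table, and adjust the column labels to match your own MTZ file.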


Step 12: Over to you

• You now need to refine your structure
• WS-MR only gets you as far as attempting to identify promising MR candidates if you haven’t had success with conventional model identification methods
• Some further MR options that exist:
  • Second domain search with first domain fixed
  • homo-dimer/homo-trimer searches
  • Custom PDB search library: you give us the PDBs, we can run WS-MR over the set


Conclusion and Thanks

• We welcome ideas for improvements
• Special processing requirements?
  • We may be able to do this from the command line interface
• Please contact us if you have any questions
  • [email protected]
• Open Science Grid is a big enabler here!
  • http://opensciencegrid.org
• Thanks to SBGrid team:
  • http://www.sbgrid.org
• Thanks to the Sliz Lab at Harvard Medical School:
  • http://hkl.hms.harvard.edu