Progress report on Crank: Experimental phasing Biophysical Structural Chemistry Leiden University, The Netherlands


Post on 12-Jan-2016



Crank developments available in CCP4 6.1

• “Greatly enhanced” – better tested
• Underlying programs haven’t changed (much), but crank has been almost completely re-written from the version in 6.0.2
• Better ccp4i interface
• Support for more programs (PIRATE, BUCCANEER, RESOLVE, COOT)
• Faster substructure detection
– Using BP3 to (quickly) check trials and looking at deviations between different CRUNCH2 trials significantly decreases the time required for successful substructure detection.

Speeding up CRUNCH2: Results showing improvement

Data set            | Resol. (Å) | Anom. atoms | Exp.      | Time (old) (min) | Time (new) (min)
subtilisin          | 1.77       | 3 Ca        | SAD       | 28.91            | 6.33
Carboxyl proteinase | 1.8        | 9 Br        | SAD-peak  | 247.99           | 19.19
gere                | 2.75       | 12 Se       | MAD p/i/h | 5.77             | 1.42
cyanase             | 2.41       | 40 Se       | MAD p/i/h | 31.72            | 8.18
thioesterase        | 1.81       | 20 Br       | SAD-peak  | 6.32             | 1.95
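
The speed-up implied by these timings can be worked out directly. The following is a minimal Python sketch that uses only the old and new times listed above; the names `TIMINGS_MIN`, `speedup` and `SPEEDUPS` are illustrative and are not part of Crank.

```python
# Timings in minutes, copied from the table above: (old, new) per data set.
TIMINGS_MIN = {
    "subtilisin": (28.91, 6.33),
    "carboxyl proteinase": (247.99, 19.19),
    "gere": (5.77, 1.42),
    "cyanase": (31.72, 8.18),
    "thioesterase": (6.32, 1.95),
}


def speedup(old_min, new_min):
    """Return how many times faster the new run is than the old one."""
    return old_min / new_min


# Speed-up factor for every test case.
SPEEDUPS = {name: speedup(old, new) for name, (old, new) in TIMINGS_MIN.items()}

if __name__ == "__main__":
    for name, factor in SPEEDUPS.items():
        print(f"{name:20s} {factor:5.1f}x faster")
```

On these five test cases the new procedure is between roughly 3 and 13 times faster.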

Improved CCP4i interface

Preliminary substructure detection results from JCSG test cases

• 144 mostly MAD Se-Met data sets
• Defaults only: the only input was the number of Se-Met per monomer (the number of monomers was guessed), the mtz files, f’ and f”.
• Some data sets had f” < 1 (solved by MR)
• Some data sets had X-PLOR files incorrectly labelled as mtz.
• DISCLAIMER: 1st logfiles produced and analyzed yesterday after dinner (until 4 a.m.).

AFRO/CRUNCH2 vs SHELXC/D (both run in CRANK)

               | CRUNCH2 | SHELXD
100 % found    | 104     | 72
0 % found      | 15      | 15
Input error    | 25      | 25
My input error | 0       | 32
total          | 144     | 144

Of the 79 jobs in common, CRUNCH2 was faster in 20 jobs, while SHELXD was faster in 59.
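
Because the two programs lost different numbers of jobs to input errors, the raw counts are easier to compare as rates. The sketch below is only an illustration: it assumes that "100 % found" means solved and "0 % found" means not solved, and the names `COUNTS` and `success_rates` are hypothetical.

```python
# Counts copied from the comparison above (144 data sets per program).
COUNTS = {
    "CRUNCH2": {"solved": 104, "failed": 15, "input_error": 25, "my_input_error": 0},
    "SHELXD": {"solved": 72, "failed": 15, "input_error": 25, "my_input_error": 32},
}


def success_rates(counts):
    """Return (rate over all data sets, rate over jobs that actually ran)."""
    total = sum(counts.values())
    ran = counts["solved"] + counts["failed"]  # excludes both kinds of input error
    return counts["solved"] / total, counts["solved"] / ran


if __name__ == "__main__":
    for program, counts in COUNTS.items():
        overall, of_ran = success_rates(counts)
        print(f"{program}: {overall:.1%} of all data sets, {of_ran:.1%} of jobs that ran")
```

Counting only the jobs that ran, the two programs come out much closer (about 87 % against 83 %) than the raw totals suggest.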

Comparison not fair

• The same BP3-based algorithm for identifying a solution can also be used with SHELXD.

• SHELXD uses much better FA values (i.e. it uses the MAD data; at the moment, AFRO just uses ΔF from the data set with the greatest anomalous signal).

Improving FA values

• An early step in solving a structure by SAD/MAD or SIRAS is to determine FA values.

• FA is the structure factor amplitude of the substructure; it is the input to direct methods and/or Patterson programs (e.g. SHELXD or CRUNCH2).

Current FA estimation

• FA is currently estimated by | |F+| - |F-| | for SAD data in most programs.

• Direct method programs are very sensitive to FA values.

• Improving the estimates can improve the hit rates of direct methods and solve substructures that could not previously be solved.
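
The simple SAD estimate described above, | |F+| - |F-| |, can be sketched in a few lines of Python. This is an illustration only, not code from AFRO or any other program: the function names and the dictionary layout (Miller indices mapped to a pair of amplitudes, with `None` marking a missing measurement) are assumptions.

```python
def delta_f(f_plus, f_minus):
    """Bijvoet difference | |F+| - |F-| |, the simple FA estimate for SAD data."""
    return abs(abs(f_plus) - abs(f_minus))


def estimate_fa(reflections):
    """Return {hkl: delta F} for reflections given as {hkl: (|F+|, |F-|)}.

    Reflections where either Bijvoet mate is missing (None) are skipped,
    because no difference can be formed for them.
    """
    return {
        hkl: delta_f(f_plus, f_minus)
        for hkl, (f_plus, f_minus) in reflections.items()
        if f_plus is not None and f_minus is not None
    }
```

Note that this estimate ignores the experimental sigmas and the correlation between F+ and F-.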

Multivariate SAD equation

$$E(|F_A|, |F^+|, |F^-|) = \int |F_A| \, P(|F_A|, \alpha_A \mid |F^+|, \alpha^+, |F^-|, \alpha^-) \, d|F_A| \, d\alpha_A \, d\alpha^+ \, d\alpha^-$$

• Giacovazzo previously proposed multivariate FA estimation, with an implementation assuming Bijvoet phases are equal.

• An equation can be obtained without the equal-phase assumption, requiring only one numerical integration.

• This equation has been implemented; it reduces to Giacovazzo’s equation if the Bijvoet phases are equal.

Covariance matrix properties

• The covariance matrix considers experimental sigmas and correlations between F+, F- and FA.

• Problem: The covariance matrix also depends on the (overall) substructure occupancy and B-factor.

• Solution: Obtain a multivariate likelihood estimate for unknown parameters.

Refining overall substructure parameters

• Initial guess of number of substructure atoms per monomer obtained from user.

• Initial guess of B-factor obtained from likelihood estimate of overall B-factor of data set.

• Result: Refinement is stable and maximizes correlation with calculated final E’s.

• Another possible application: Use refined overall occupancy and B-factor for anomalous signal estimation.

Test cases: Correlations with final calculated E’s

Data set     | Reso (Å) | Anom. atom | f"   | Corr ΔE | Corr Emulti
Ferredoxin   | 0.94     | Fe         | 1.25 | 0.252   | 0.338
Thioesterase | 2.5      | Se         | 5.3  | 0.529   | 0.549
Lyso 180     | 1.64     | S          | 0.56 | 0.324   | 0.348
Lyso 135     | 1.64     | S          | 0.56 | 0.262   | 0.319
DNA 360      | 1.5      | P          | 0.43 | 0.517   | 0.540
DNA 90       | 1.5      | P          | 0.43 | 0.422   | 0.478
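
The gain in correlation from the multivariate estimate can be read off the two correlation columns. A minimal Python sketch, using only the values listed above; the names `CORRELATIONS`, `gain` and `GAINS` are illustrative.

```python
# (Corr with delta E, Corr with E_multi), copied from the test cases above.
CORRELATIONS = {
    "Ferredoxin": (0.252, 0.338),
    "Thioesterase": (0.529, 0.549),
    "Lyso 180": (0.324, 0.348),
    "Lyso 135": (0.262, 0.319),
    "DNA 360": (0.517, 0.540),
    "DNA 90": (0.422, 0.478),
}


def gain(corr_delta_e, corr_e_multi):
    """Absolute improvement in correlation from the multivariate estimate."""
    return corr_e_multi - corr_delta_e


# Rounded to three decimals, the precision of the reported correlations.
GAINS = {name: round(gain(*pair), 3) for name, pair in CORRELATIONS.items()}

if __name__ == "__main__":
    for name, value in GAINS.items():
        print(f"{name:13s} +{value:.3f}")
```

In all six test cases the multivariate estimate correlates better with the final calculated E's, with gains between 0.020 and 0.086.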

More robustness in difficult cases with CRUNCH2

• Using default parameters (resolution cutoff of 0.5 Å from the high-resolution limit).

Data set   | Hit rate: ΔE | Hit rate: Emulti
Ferredoxin | 1/20         | 4/20
DNA 90*    | 0/20         | 2/20

* Can be solved with ΔE by using data to 1.5 Angstroms