phoelix: a package for semi-automated helical reconstruction


Ultramicroscopy 58 (1995) 245-259 (Elsevier)

PHOELIX: a package for semi-automated helical reconstruction

Michael Whittaker a,*, Bridget O. Carragher b, Ronald A. Milligan a

a Department of Cell Biology, The Scripps Research Institute, 10666 North Torrey Pines Road, La Jolla, CA 92037, USA
b Beckman Institute and Department of Cell and Structural Biology, University of Illinois at Urbana-Champaign, USA

Received 1 December 1994; in final form 7 April 1995

Abstract

We describe a set of procedures and algorithms which have been developed to provide an efficient and reliable method for reconstructing a three-dimensional density map from specimens with helical symmetry. These procedures build on the original MRC helical processing suite, with extensions principally developed using the SUPRIM image processing package. Actomyosin is used as a model specimen to demonstrate the utility of this repackaged and expanded set of routines. The time required to complete a three-dimensional map has been reduced from several weeks using traditional manual techniques to a few days. The increased signal/noise provided has allowed for the extraction of additional layer lines not previously identified by manual techniques.

1. Introduction

Procedures for processing electron images of macromolecular structures with helical symmetry were originally developed at the MRC, Cambridge [1-4], and have been used successfully for many years for the determination of helical structures at moderate resolutions (9-30 Å; for examples see Refs. [5-11]). The series of steps required to process electron images of helical structures are well documented and may be used consistently and reliably for the generation of three-dimensional density maps. These procedures have traditionally relied heavily upon operator intervention to manually extract and manipulate data during the numerous processing steps. Such heavy demand on an operator's time is understandable, and perhaps even desirable, when processing a specimen of unknown selection rule for the first time as it requires the operator to pay careful attention to every step in the processing. The principal disadvantage of such a manual approach is that, by its time consuming nature, it limits the number of images which may reasonably be processed. It is not uncommon for an experienced operator using these procedures to spend several weeks processing the large number of images required to produce a single, averaged 3D map at an appropriate resolution. While the resolution could be improved still further by increasing the number of images which contribute to the average [12], practical time considerations effectively limit what can reasonably be achieved. Furthermore, in situations where several independent 3D maps are required for a thorough understanding of a structure, the time required of the operator for the image processing is an onerous burden. One final disadvantage is that the tedium of the task does not encourage repeating the processing either as a means of error checking or in order to measure the effects of alternative preparative or processing steps.

* Corresponding author.

0304-3991/95/$09.50 © 1995 Elsevier Science B.V. All rights reserved. SSDI 0304-3991(95)00057-7

These disadvantages of the manual approach may be overcome through the judicious use of computational tools. At the most basic level, much of the operator input required when using the standard helical processing tools involves manually entering the results of one step as parameters to the next. This would be much more efficiently and rapidly handled by passing the parameters directly between the various computational steps without operator intervention. At a more complex level, there are a few critical steps where an intelligent decision must be made in order to extract the data. Examples of such steps include the straightening of the helical axis or the determination of layer line intercepts and the correct helical selection rule. Such steps require a computer algorithm which approximates the operator's decision-making processes.

In this paper we present procedures and algorithms which we have developed in order to provide a time efficient and reliable helical processing method. The approach we have taken bears many similarities to the approach of Morgan and DeRosier [12] in which they described a set of automated procedures which they used to extract data to 10 Å from the bacterial flagellar filament. The set of procedures presented here have been developed for Silicon Graphics workstations (or other workstations which utilize the SGI graphics language) as an extension of the original MRC helical processing suite [1-4], with extensions principally developed using the SUPRIM image processing package [13]. While many of the routines are copied or derived from these previously existing packages we have for the sake of simplicity and clarity assigned the name "PHOELIX" to this repackaged and expanded set of routines. The PHOELIX package is designed to run either in batch or interactive mode and in principle could proceed from a selected helical filament to the generation of a 3D density map without operator intervention. In practice the operator is provided with intermediate data which are used to evaluate the integrity of the results of each step. A number of automatic checks on these intermediate results will halt the process if any serious anomalies are detected. In addition, following the three critical steps where we have developed algorithms to emulate operator decisions, the results are presented to the operator and can be modified if necessary. When running in batch mode the processing runs to completion without any operator inspection. However, a single command then presents the intermediate data to the operator in the same manner as in an interactive run. If the operator is required to modify the presented data in any way, all the subsequent steps may be repeated starting from the modified step.

Using actomyosin as a model specimen, the results obtained using the PHOELIX package are compared to our previously published actomyosin data which was obtained using standard manual image processing methods. We discuss these results in terms of both the time required for the processing and the quality of the resulting data. We also discuss the applicability of these procedures to specimens other than actomyosin.

2. Description of the PHOELIX package

2.1. Overview

The overall design of the software adheres to a very modular structure in which data is sequentially passed along a series of individual processing steps. These steps are controlled by a UNIX shell script which may be readily edited to modify, reorder, add or delete steps as required by the operator. The individual software modules which operate on the data have been largely drawn, or derived from, two software packages, SUPRIM [13] and programs written originally at the MRC, Cambridge [1-4]. In choosing to use the SUPRIM software package as a basis for PHOELIX we were influenced by the very modular approach to the problems of image processing taken by this package and also its high level of organization and documentation. This has made the process of adding new modules and modifying existing ones very straightforward. The MRC helical software routines were used to perform the Fourier-Bessel reconstruction because these routines are very widely used and accepted. The overall modular design of the current package allows for the incorporation of new or modified MRC code in a straightforward manner. Libraries used for the MRC routines compatible with the UNIX operating system were generously provided by Michael Schmid and Wah Chiu of Baylor College of Medicine. Other routines were ported to or rewritten for the UNIX operating system as required.

An overview of the semi-automated helical reconstruction procedure is represented schematically in Fig. 1. Those steps which have been written specifically for PHOELIX will be described in some detail in the following paragraphs. For those steps which have been drawn from the MRC helical processing suite it is suggested that the original publications be used as a reference [1-4]. We will use actin decorated with myosin II subfragment 1 (acto-S1) as a model specimen throughout this discussion. An extensive appendix to this document containing additional documentation on all software modules used in the PHOELIX package is omitted here for brevity. It is available as part of the PHOELIX distribution.

2.2. Straightening the filament

Cryoelectron images of acto-S1 were recorded under low dose conditions and selected filaments converted into digital density arrays as described previously [8]. A typical digitized array is shown in Fig. 2a. In order to isolate the filament from surrounding noise, other filaments, etc., a box is constructed by having the user interactively specify a few points tracking the axis of the filament of interest. Using these specified points a curve is fitted to the axis and the filament is isolated by boxing off a region of a given width around the curve ("snake" boxing). This is illustrated in Fig. 2b.
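The "snake" boxing step can be illustrated with a minimal sketch. It assumes the filament runs roughly along x, that the image is a list of rows, and that the axis is approximated by straight segments between the user-specified points; the text only states that a curve is fitted, so the piecewise-linear axis, the function names and the fill value are all illustrative assumptions rather than the PHOELIX implementation.

```python
def axis_y(points, x):
    """Piecewise-linear axis height at column x; points are (x, y), sorted, distinct x."""
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def snake_box(image, axis_points, half_width, fill=0.0):
    """Keep pixels within half_width rows of the axis; replace all others by fill."""
    points = sorted(axis_points)
    boxed = [[fill] * len(row) for row in image]
    for x in range(len(image[0])):
        centre = axis_y(points, x)
        for y in range(len(image)):
            if abs(y - centre) <= half_width:
                boxed[y][x] = image[y][x]
    return boxed
```

A rectangular box corresponds to the special case in which all axis points share the same y value.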

Filaments selected for processing often exhibit a slight curvature along their length. While it is possible to identify short segments of a filament which are essentially straight, such regions are not generally more than a few helical repeats in length. Processing much longer straight filaments increases signal/noise, thus making it easier to index the helical diffraction pattern and extract layer line data. Additionally, longer filaments reduce the number of filaments which must be densitometered, processed, and reconstructed and so also reduce the overall time and effort necessary to obtain a final 3D density map. Computational straightening of a curved filament [14] is now a well established technique and is performed as a matter of routine as an initial step during processing. The straightening procedure begins by calculating a Fourier-space cross correlation map between a template and the curved filament. The template can consist of either a short segment of the filament under consideration or a short section of an average structure calculated from an initial 3D map (inset in Fig. 3a). The cross correlation map thus calculated will have a series of maxima along the helical axis. We have found that identification of these peaks is often made easier if the digitized image is first low pass filtered prior to calculating the cross correlation map. This step is at the discretion of the operator and the filtered image is discarded once it has been used to help define the helical axis of the original image.

[Fig. 1 flowchart, steps in order: Scanned filament array -> Box filament -> Calculate cross-correlation -> Search for cross-correlation peaks -> Fit curve to peaks -> Reinterpolate filament to straight helical axis -> Background subtraction -> Calculate fft of straight filament -> Search for layer lines -> Determine selection rule -> Interpolate original filament for integral number of helical repeats -> Extract layer line data -> Center filament in box and determine out-of-plane tilt -> Fit to reference data set -> Average a number of data sets -> Average layer line data]

Fig. 1. Schematic diagram of the PHOELIX helical processing package. Those procedures which were developed specifically as part of PHOELIX are discussed in the text. For those procedures taken from the MRC helical image processing package additional documentation is available in the original publications [1-4] and with the PHOELIX distribution.

Fig. 2. Boxing of the densitometered filament image. Images of filaments were converted to computer density arrays using a Perkin Elmer 1010G densitometer operating with 25 µm spot and step sizes (7.14 Å at the image). A typical filament which has been boxed rectangularly is shown in panel (a) and an image which has been "snake" boxed is shown in panel (b).
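The Fourier-space correlation can be illustrated in one dimension with a minimal sketch based on the correlation theorem. The radix-2 FFT below is a plain textbook routine written for self-containment, the function names are hypothetical, and the real procedure operates on two-dimensional images; the sketch only shows why maxima appear wherever the template matches the signal.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def cross_correlate(signal, template):
    """Circular cross-correlation; the template is zero-padded to len(signal)."""
    n = len(signal)
    padded = list(template) + [0.0] * (n - len(template))
    f_signal = fft([complex(v) for v in signal])
    f_template = fft([complex(v) for v in padded])
    # Correlation theorem: multiply one transform by the conjugate of the other.
    product = [a * b.conjugate() for a, b in zip(f_signal, f_template)]
    return [v.real / n for v in fft(product, inverse=True)]
```

For a signal containing two copies of the template, the returned array has its two largest values at the offsets where the copies begin.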

Precise peak locations and values are determined using a parabolic fit in a 3 × 3 neighborhood around the highest values in the cross correlation map. The number of peaks expected to lie on the filament axis is estimated by calculating the number of times the length of the template divides into the total length of the filament. As a number of peaks will be identified which are not precisely on the filament axis, the total number of peaks examined is this expected value times some multiplier, in this case 3 (Fig. 3a). The set of identified peaks are then passed along to a routine which is designed to identify and eliminate the spurious peaks. It does so by starting with the highest peak (assuming that this highest peak is on the helical axis) and then working along the helical axis in both directions rejecting peaks that are outside of certain defined error limits. These error limits are defined to reject peaks which are either too close together along the helical axis or which show too rapid a change perpendicular to the axis. These criteria are highly effective in identifying an axis which is free of spurious points and work perfectly for about 90-100% of the filament length (Fig. 3b). The method fails occasionally when a peak passes all of the error requirements but is still off the axis of the filament. Usually this peak can be simply deleted during the interactive examination of the selected axis, resulting in a curve which is smooth and continuous along the length of the filament. The curve which is fitted to the axis points may be selected to be either a cubic spline or a low order polynomial. A straightened image of the curved filament is calculated by interpolating the filament along lines perpendicular to the fitted curve. The power spectra of a filament before and after straightening are shown in Fig. 4.

Fig. 3. Intermediate stages in the straightening process. The template used for cross correlation (inset in panel (a)) is a projection of one crossover (360 Å) of acto-S1, shown here for clarity at twice the scale of the other panels in the figure. Peak values from the cross correlation map, calculated as described in the text, are displayed overlying the filament (a). Beginning with the highest peak, neighboring peaks are compared. A peak is discarded if, compared to the preceding peak, it is closer than 40 pixels in x or diverges further than 12 pixels in y and has a slope (i.e. Δy/Δx) greater than 0.3. The remaining peaks (b) are used to define the filament axis. A cubic spline is fit to these points, and the curve is used to map the filament onto a linear helical axis. This straightened filament image is then background corrected and used for determination of the selection rule. Using the chosen selection rule, the original filament image is reinterpolated and restraightened such that an integral number of helical repeats precisely fills a box suitable for Fourier transformation (c).
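The rejection of spurious peaks can be sketched as follows. The thresholds are those quoted in the caption of Fig. 3; the grouping of the criteria (too close in x, or both a large y divergence and a steep slope) is one reading of that caption and should be treated as an assumption, as should the function name and the (x, y, value) peak representation.

```python
def filter_axis_peaks(peaks, min_dx=40, max_dy=12, max_slope=0.3):
    """Walk outwards from the highest peak and drop peaks that violate the limits.

    peaks is a list of (x, y, value) tuples; the result is sorted by x.
    """
    ordered = sorted(peaks)
    start = max(range(len(ordered)), key=lambda i: ordered[i][2])
    accepted = [ordered[start]]
    # Process peaks to the right of the start, then peaks to the left.
    for direction in (ordered[start + 1:], reversed(ordered[:start])):
        last = ordered[start]
        for peak in direction:
            dx = abs(peak[0] - last[0])
            dy = abs(peak[1] - last[1])
            too_close = dx < min_dx
            too_steep = dy > max_dy and dx > 0 and dy / dx > max_slope
            if too_close or too_steep:
                continue  # spurious: compare the next peak against the same reference
            accepted.append(peak)
            last = peak
    return sorted(accepted)
```

Each candidate is compared with the most recently accepted peak, so a single outlier does not cause the peaks that follow it to be rejected as well.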

2.3. Background subtraction

The straightened filaments can be greater than 1 micron in length. For cryoelectron micrographs it is not unusual for there to be significant variation in the thickness of the ice layer over this distance, resulting in a non-uniform background over the image. This non-uniformity in turn introduces low frequency artifacts into the Fourier transform which may affect the amplitudes and phases of the low order layer lines extracted from these transforms. This problem may be reduced if the background variations are subtracted from the image prior to calculating the transform. This is achieved by extracting the first and last rows of the straightened filament, fitting a polynomial curve to the intensities along these two rows, calculating an interpolated surface defined by these two curves, and subtracting this from the original image. During this procedure the image is also floated to a mean value of zero.

Fig. 4. Power spectra of an unstraightened (Fig. 2b) and straightened filament (Fig. 3c) are shown in panels (a) and (b), respectively.
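A minimal sketch of this background correction is given below. The polynomial order is not stated in the text, so a first-order (straight line) fit is assumed here, and the surface between the two fitted rows is assumed to be a linear blend; both choices, and the function names, are illustrative only.

```python
def fit_line(values):
    """Least-squares straight line through values sampled at x = 0, 1, 2, ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_v = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxv = sum((x - mean_x) * (v - mean_v) for x, v in enumerate(values))
    slope = sxv / sxx
    return [mean_v + slope * (x - mean_x) for x in range(n)]


def subtract_background(image):
    """Subtract a surface blended between fits to the first and last rows,
    then float the result to a mean of zero."""
    rows = len(image)
    top, bottom = fit_line(image[0]), fit_line(image[-1])
    corrected = []
    for r, row in enumerate(image):
        w = r / (rows - 1)  # 0 at the first row, 1 at the last row
        corrected.append(
            [v - ((1 - w) * t + w * b) for v, t, b in zip(row, top, bottom)]
        )
    mean = sum(map(sum, corrected)) / (rows * len(image[0]))
    return [[v - mean for v in row] for row in corrected]
```

The sketch assumes that the first and last rows contain background only, which is why the filament must already be boxed with some margin on either side.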

2.4. Identification of layer lines

In order to extract the layer lines from the transform they are first located in the following manner. A region around the meridian of the power spectrum which contains strong amplitudes is excised from the transform and the integrated intensities perpendicular to the meridian are plotted versus transform pixel number. The background of this curve is removed by calculating a very low pass filtered image of the curve and subtracting this from the original. The result is illustrated in Fig. 5. A peak searching algorithm is then used to search for a number of peaks at set intervals along the curve which are more than a given number of standard deviations above the background. The interval used is the approximate location of the first layer line, determined from the user-provided crossover length of the helix. The peaks which are identified in this way are noted in Fig. 5.
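The layer line search can be sketched as three small steps. The sketch assumes that the power spectrum is stored as a list of rows indexed along the meridian, uses a running mean as a stand-in for the "very low pass filter", and searches a small window around each multiple of the expected interval; the names, the window size and the 3 standard deviation default are assumptions made for illustration.

```python
from statistics import mean, pstdev


def collapse(power, half_width):
    """Sum each row over the columns within half_width of the meridian column."""
    centre = len(power[0]) // 2
    return [sum(row[centre - half_width:centre + half_width + 1]) for row in power]


def remove_background(profile, window):
    """Subtract a running mean (a crude low pass filter) from the 1D profile."""
    n = len(profile)
    return [
        profile[i] - mean(profile[max(0, i - window):min(n, i + window + 1)])
        for i in range(n)
    ]


def find_layer_lines(profile, interval, tolerance, n_sigma=3.0):
    """Return the highest point near each multiple of interval that exceeds
    n_sigma standard deviations of the background-corrected profile."""
    threshold = n_sigma * pstdev(profile)
    hits, k = [], 1
    while True:
        centre = round(k * interval)
        lo = max(0, centre - tolerance)
        hi = min(len(profile), centre + tolerance + 1)
        if lo >= hi:
            break  # search window has run off the end of the profile
        best = max(range(lo, hi), key=profile.__getitem__)
        if profile[best] > threshold:
            hits.append(best)
        k += 1
    return hits
```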

2.5. Determination of the correct selection rule

The layer line positions which are identified in the peak search are assigned layer line numbers and Bessel orders according to a list of possible selection rules provided by the operator (Table 1). A linear regression algorithm is used to determine the selection rule which gives the best fit to a straight line for layer line position versus layer line number. If none of the provided selection rules give a good fit (χ² < 1), a warning message is sent to the operator. Once the best selection rule has been determined the filament is reboxed so as to contain an integral number of helical repeats which are exactly enclosed in an array which is 2^n in length. This is done so that the computed layer lines will lie exactly on the transform sampling raster. The reboxing is performed by reinterpolating the image from the original curved filament, thus creating an interpolated and straightened image in a single step. The selection rule of this new straightened image is checked for consistency with the selection rule determined during the initial straightening. Finally, this image is background corrected and floated to a mean density of zero as described above (Fig. 3c).

Fig. 5. Identification of layer line intercepts. The power spectrum in Fig. 4b is collapsed to a one-dimensional array and corrected for background as described in the text. Peaks which are more than 3 standard deviations above the background and which are located at the predicted layer line spacings (within a given range) are indicated by arrows. [Plot; horizontal axis: Reciprocal Lattice Units, 0 to 1000.]
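The choice among candidate selection rules can be sketched as a least-squares comparison. Several assumptions are made here: the Bessel order of each located peak is taken as known, the fitted line is constrained to pass through the origin (the equator), the goodness of fit is a plain residual sum of squares rather than the χ² used by PHOELIX, and the sign convention in `layer_line_number` is simply the one that reproduces the layer line numbers listed in Table 1. All names are hypothetical.

```python
def layer_line_number(n, units, turns):
    """Lowest non-negative layer line carrying Bessel order n for a units/turns
    helix, using l = (-turns * n) mod units (convention chosen to match Table 1)."""
    return (-turns * n) % units


def fit_rule(positions_by_order, units, turns):
    """Fit position = spacing * l through the origin.

    positions_by_order maps Bessel order -> located intercept.
    Returns (spacing, residual sum of squares)."""
    pairs = [
        (layer_line_number(n, units, turns), p)
        for n, p in positions_by_order.items()
    ]
    spacing = sum(l * p for l, p in pairs) / sum(l * l for l, _ in pairs)
    rss = sum((p - spacing * l) ** 2 for l, p in pairs)
    return spacing, rss


def best_rule(positions_by_order, rules):
    """Return the (units, turns) rule with the smallest residual."""
    return min(rules, key=lambda rule: fit_rule(positions_by_order, *rule)[1])
```

With intercepts for J2, J4 and J-1 at 10, 20 and 62.5 pixels, for example, the 54/25 rule fits exactly (spacing 2.5), while the 13/6, 28/13 and 41/19 rules leave non-zero residuals.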

2.6. Extracting the layer line data

The layer line spacing for the final straightened filament is determined from the intercepts of certain specified strong layer lines. For the data shown here the J2 and J-1 Bessel orders are used, and the spacing is determined by summing the layer line intercepts and dividing by the sum of the layer line numbers (e.g. for a 13/6 selection rule, if the intercept of layer line 1 was at 10 reciprocal lattice units (rlu), and the intercept of layer line 6 was 60 rlu, the average spacing would be (10 + 60)/(1 + 6) = 10 rlu). This average spacing is used to predict layer line intercepts out to some defined resolution. For those layer line intercepts which were located as peaks in the 1D array during the determination of the selection rule, the located intercept rather than the predicted intercept is used. If this located intercept differs by more than 1 rlu from the predicted intercept, a warning message is sent to the operator. In interactive mode the operator is presented with a power spectrum of this questionable region of the transform and asked to decide whether to use the predicted or located intercept. Alternatively, an intercept of the operator's choosing may be selected. In batch mode the located intercept is used and the operator may decide to accept or correct this value at a later time. The intercepts are written into a parameter deck required by the MRC helical program suite and used to extract layer lines from an MRC format transform file. Further processing, including correction for out-of-plane tilt, centering in the transform box, fitting to a reference data set and averaging was performed essentially as previously described [8]. Correction for tilt and centering require the operator to specify amplitude peaks on certain strong layer lines which approximately match across the meridian. These peaks are determined computationally by identifying the amplitude maximum in the vector average of the near and far side of each layer line. In interactive mode the amplitude peaks selected in this way are presented graphically to the operator (Fig. 6) and may be edited.

Table 1
Layer line numbers and Bessel orders for various helical selection rules

Bessel order   Layer line number (selection rule)
               (13/6)   (28/13)   (41/19)   (54/25)
J0                0        0         0         0
J2                1        2         3         4
J4                2        4         6         8
J-9               2        5         7         9
J6                3        6         9        12
J-7               3        7        10        13
J8                4        8        12        16
J-5               4        9        13        17
J10               5       10        15        20
J-3               5       11        16        21
J-1               6       13        19        25
J1                7       15        22        29
J3                8       17        25        33
J-10              8       18        26        34
J5                9       19        28        37
J-8               9       20        29        38
J7               10       21        31        41
J-6              10       22        32        42
J9               11       23        34        45
J-4              11       24        35        46
J-2              12       26        38        50
J0               13       28        41        54
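The spacing calculation and the choice between located and predicted intercepts can be sketched as follows. The dictionary representation, the function names and the warning list are illustrative assumptions; the 1 rlu tolerance and the worked numbers are those given in the text.

```python
def average_spacing(strong_intercepts):
    """strong_intercepts maps layer line number -> located intercept (rlu).
    The spacing is the sum of the intercepts over the sum of the numbers."""
    return sum(strong_intercepts.values()) / sum(strong_intercepts.keys())


def choose_intercepts(layer_lines, spacing, located, tolerance=1.0):
    """Prefer a located intercept where one exists, otherwise use the prediction.

    Returns (chosen, warnings); warnings lists layer lines whose located
    intercept differs from the prediction by more than tolerance."""
    chosen, warnings = {}, []
    for l in layer_lines:
        predicted = spacing * l
        value = located.get(l, predicted)
        if abs(value - predicted) > tolerance:
            warnings.append(l)
        chosen[l] = value
    return chosen, warnings
```

For the example in the text, `average_spacing({1: 10, 6: 60})` gives 10 rlu, and a located intercept of 31.5 rlu on layer line 3 would be kept but flagged, since it lies 1.5 rlu from the predicted 30 rlu.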

3. Use of the PHOELIX package

In practice an operator would begin a new reconstruction by editing a number of values in a global parameter file. These would include the list of possible selection rules, approximate crossover distance, Bessel orders of the strongest layer lines and a number of other critical parameters which control the procedures and software modules. While some of the controlling parameters (e.g. selection rule, strong Bessel orders, etc.) can be measured directly from the data, others (straightening parameters, radius of low pass filter, etc.) might need to be empirically determined. The entire process runs rapidly enough (~5 min per filament) so that a large number of parameter values may be tested in a relatively short time. Once a set of parameters has been determined to work for a given helical filament it should be possible to use them for all further processing of this structure.

Fig. 6. Amplitude peaks used for correction of out-of-plane tilt and centering of the filament in the transform box. Near- and far-side amplitudes of those layer lines identified as "strong" are displayed together. Note that on the workstation screen the near- and far-side data are displayed in different colors for clarity. The maximum value in the vector average of each layer line is computed and marked with a "+". These amplitudes and phases are extracted for tilt and shift determination. [Plot; horizontal axis: Radius (Å^-1), 0 to 0.03.]

The filament must be in SUPRIM format, boxed (using a rectangular or snake box) and oriented with its long axis parallel to the x axis. The template may be either a small piece of the original filament or a model structure determined from a previously calculated map. Once the parameter file has been set up and the filament and template have been selected, the operator begins a reconstruction by starting the main controlling script using the command:

"ssphoelix [filament name] [template] | tee > [output file]"

If the processing is running in interactive mode the information written to the output file will also be echoed to the screen. This information consists of terse comments detailing the actions taken by each script which is called and relevant data pertaining to the results of these actions. These data include the list of layer line intercepts for the strong Bessel orders which are used to determine the selection rule, the chosen selection rule and the χ² value representing the goodness of fit to this selection rule, the location of each strong layer line intercept in relation to its predicted value, tilt/shift search results, and results of the fit to a reference data set. In addition, the operator is presented with a number of graphical displays including the location of the cross correlation peak values used to straighten the filament, the collapsed power spectrum, the final reinterpolated filament and its power spectrum, and layer line data indicating the radial positions of the peaks used in the tilt/shift search. The operator is also prompted for input at the following points: (1) approval or correction of the cross correlation points used for straightening the filament; (2) choice of layer line intercept when the determined intercept differs by more than one pixel from the predicted value; and (3) approval or correction of the amplitude peaks chosen for out-of-plane tilt and axis centering correction.

If these procedures are run in batch mode, the graphical output may be viewed at a later stage to assess the success of the processing steps. During this examination any of the parameters requiring operator approval may be edited and the processing restarted from that point.

4. Results

The procedures presented here were initially tested by processing cryoelectron images of actin decorated with S1 containing the alkali 2 light chain (acto-S1(A2)). The final layer lines are an average of 14 individual data sets (near- and far-side layer lines from 7 filament images). For each filament, the selection rule was 54/25 (the majority of our decorated actin filaments fit best to this selection rule). Those few filaments which did not conform to this selection rule or which fit only poorly were not included in the average. Approximately half of the filaments we have processed required intervention by the operator during the straightening step or the step in which peaks are selected to use for correcting out-of-plane tilt and axis centering of the filament. These interventions were typically as simple as deleting a single automatically determined point. The average length of each filament was approximately 1.15 µm, representing 8 repeats of the 54/25 helix. The final average layer line data, representing a total of 3024 individual acto-S1(A2) subunits, are shown in Fig. 7b. Data obtained by manually processing a large number of short, naturally straight regions of filament are shown for comparison in Fig. 7a. By manual processing, a total of 10 layer lines were identified, extending to a resolution of 45 Å axially and 35 Å radially. The increased signal/noise ratio provided by the long, straightened filaments enabled us to define the selection rule better and to identify a total of 22 layer lines extending to ~27 Å in all directions.

Fig. 7. Final S1(A2) layer line averages. (a) The 10 layer lines obtained by manual processing of short, straight regions of actomyosin [16]. (b) The 22 layer lines obtained using PHOELIX. (c) As in (b), except that amplitudes are the vector sum of three pixels centered on the layer line intercept. Amplitudes on layer lines 34-54 have been scaled 3×. [Plots; horizontal axis: R (1/Å).]

In addition to those layer lines beyond the J[?] which had previously not been identified, we were able to extract a number of layer lines at low resolution (e.g. J[?]) which had been missed previously. Although the amplitudes of the higher resolution layer lines (layer lines 34-54) are weak, the data are reliable as phases across the amplitude peaks are reproducible in data representing the same structure calculated from independent populations of filaments (data not shown).

One consequence of processing long filaments using these procedures is an attenuation of layer line amplitudes at increasing meridional spacing. For example, the J2 layer lines in the current data set and in the previously published data set have approximately equal amplitudes (see Figs. 7a and 7b). In contrast, the amplitudes on the J-1 layer line in the current data set have been reduced by approximately 25% relative to the earlier data set. The observed attenuation is likely a result of small variations in pitch along the length of these very long filaments [17-20]. Indeed, the pitch probably varies from monomer to monomer rather than from repeat to repeat. As a result, layer lines which should be a single pixel in width are blurred somewhat onto surrounding pixels. This attenuation of amplitudes may be reduced if the long straightened filaments are broken into shorter segments, each containing an integral number of repeats, and processed separately (data not shown). This does, however, increase the noise in transforms from individual filaments, which makes the determination of the selection rule more difficult. It also increases the number of computations, and therefore the time required, to complete a map. As a simpler correction, amplitudes are extracted by axial integration across several pixels in the transform, centered on the calculated layer line position. The amplitude component is determined from a vector addition across three pixels centered on the layer line and the phase component taken from the center pixel, where the amplitude is highest and the phase best defined. The resulting average layer lines are presented in Fig. 7c. A comparison of the layer lines in this new data set to that presented in Fig. 7b shows that while the amplitude of the J2 remains approximately unchanged the J-1 amplitude is no longer attenuated relative to the previously published data.

Three-dimensional maps were calculated by Fourier-Bessel inversion of the layer line data in Figs. 7a and 7c. The maps are surfaced at a contour level which encloses approximately 100% of the expected mass of the structures and these surfaces are displayed in Fig. 8. All of the features which were present in the earlier map are also present in the new map. However, as a result of the additional layer lines identified by processing long straightened filaments, the surface envelope of the structure is more detailed, and contours at 10% of the expected mass (the internal contours in Fig. 8) show that the peak density, which was an elongated smear, has now been resolved into two sharp density peaks.

Fig. 8. Surface views of acto-S1(A2). Three-dimensional maps were calculated from our previously published data (a) and from data obtained from straightened filaments (b) by Fourier-Bessel inversion of the layer line data in Figs. 7a and 7c, respectively. The surface enclosing ~100% of the expected mass of acto-S1(A2) is displayed here transparently to allow viewing of an internal solid contour representing 10% of the expected total mass of the structure. Surfaces are visualized using the program SYNU [15]. It is the map displayed in panel (b) which was used to model the atomic structures of actin and S1 into a filament [27].

Fig. 9. Surface views of the decorated thin filament. The surfaces have been calculated from our previously published data [16] (a) and from data obtained from straightened filaments (b). By calculating difference maps between maps containing and lacking the DTNB light chain, the additional mass at high radius in panel (b) (arrow) has been identified as representing the DTNB light chain domain.
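The three-pixel extraction described in this section (amplitude from a vector sum across the layer line, phase from the center pixel) can be illustrated with a minimal sketch. It is not code from PHOELIX: the function name `extract_layer_line`, the `half_width` parameter, and the representation of the transform as a list of rows of complex Fourier coefficients are all assumptions made for illustration.

```python
import cmath


def extract_layer_line(transform, row, half_width=1):
    """Extract one layer line from a 2D transform (hypothetical helper).

    transform  : list of rows; each row is a list of complex coefficients,
                 with rows running along the axial (meridional) direction.
    row        : index of the calculated layer line position.
    half_width : pixels summed on each side of the center row
                 (1 gives the three-pixel sum described in the text).
    """
    lo, hi = row - half_width, row + half_width
    if lo < 0 or hi >= len(transform):
        raise ValueError("integration window falls outside the transform")

    layer_line = []
    for col in range(len(transform[row])):
        # Amplitude: modulus of the complex (vector) sum across the window.
        vector_sum = sum(transform[r][col] for r in range(lo, hi + 1))
        amplitude = abs(vector_sum)
        # Phase: taken from the center pixel only.
        phase = cmath.phase(transform[row][col])
        layer_line.append(cmath.rect(amplitude, phase))
    return layer_line
```

Summing complex values rather than moduli means that pixels whose phases agree with the center pixel reinforce the amplitude, whereas uncorrelated noise in the neighboring rows tends to add less coherently.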

The processing of acto-S1 demonstrated the utility of these procedures and allowed us to gain experience in using them. To estimate the length of time required for an experienced user to complete a new reconstruction we next analyzed images of a related structure: the thin filament (actin + tropomyosin + troponin) decorated with S1 containing both the alkali and DTNB light chains (TFS1). The final layer line data are an average of 12 data sets (6 filaments, 1512 molecules; not shown). Beginning with scanned density arrays, a complete reconstruction (i.e. collection of final average layer lines and calculation of a 3D density map) was performed in a single day. A statistical comparison to the acto-S1(A2) map discussed above required a second day of computing. This compares very favorably to the approximately 3-4 weeks required to calculate an earlier map of the same structure using traditional image processing techniques [16]. This earlier map was calculated from a large number of short, naturally straight filaments, for which a 13/6 helical selection rule was assumed. The earlier average was calculated from 1339 TFS1 molecules, compared to 1512 molecules in the new map. In a difference map calculated between the earlier TFS1 map and a map lacking the DTNB light chain there was no statistically significant density identifiable as the light chain (Fig. 9a). In contrast, a comparison of maps determined by processing long filaments using PHOELIX shows additional highly significant density at high radius which we ascribe to this light chain (Fig. 9b). A full account of these data will be published elsewhere.

5. Conclusions and future prospects

We have developed a semi-automated set of procedures for the processing of helical filaments and demonstrated its utility by applying it to actomyosin. These procedures dramatically reduce the time taken to complete a map and result in a three-dimensional data set with a resolution better than that achieved using manual methods. The package in its current form has been optimized for the processing of actomyosin filaments but has also been successfully applied to undecorated actin. The modular design should readily accommodate any modifications necessary to apply these procedures to other helical structures.

One goal in developing this package was simply to streamline the helical reconstruction procedures so that they might be performed quickly and routinely. This goal has been met and it is now possible to complete a 3D map in a day or two, a process that would have taken several weeks previously. As a further advantage of the increased efficiency of the reconstruction process we are able to experiment with the parameters and individual processing steps in order to optimize conditions for our particular data set. The procedures can also be easily and quickly repeated by a number of different operators in order to explore the effects of subjective bias on the final outcome.

A second goal for these procedures was to increase the length of the filaments as well as the total number of molecules contributing to the final average layer line data in a given map. The resulting overall increase in signal/noise has allowed for more precision in the identification of layer lines and in the determination of the helical selection rule as compared to our previous maps. We have been able to extend the nominal axial resolution of our maps from 45 to 27 Å and to collect previously unidentified layer lines at low resolution. These improvements in the data have enabled us to locate the DTNB light chain in acto-S1. The ability to identify an additional light chain in maps calculated using PHOELIX as opposed to maps calculated using manual techniques was particularly significant. The two maps were calculated from averages containing approximately equal numbers of molecules, indicating that the increase in precision of the image processing steps provides much of the improvement seen in the final data.

A further extension of the resolution is probably limited by remaining disorder in the specimen and by difficulties inherent in collecting electron images of frozen specimens, e.g. inelastic scattering, specimen movement, and charging [21]. These limitations may be thought of as analogous to those caused by thermal motion in X-ray crystallographic studies, albeit on a larger scale. By analogy then, one can describe these limitations as contributing to a “temperature factor” which describes a reduction in scattering power of a molecule as a result of disorder and imaging difficulties. Previous work has described limits in structure factor amplitudes at high resolution resulting from temperature factors and noise, using 2D crystals of bacteriorhodopsin as a model [22]. The authors describe a strong attenuation of amplitudes which becomes progressively worse as the temperature factor is increased. Additionally, as the signal/noise decreases at high resolution, the extraction of accurate phase information becomes problematic. With a crystalline specimen such as bacteriorhodopsin, such difficulties become very important at high resolution and may be overcome by combining amplitudes obtained from electron diffraction patterns with phases from the electron images. For helical structures, which are both very weak diffractors [12] and, in the case of actomyosin, much more poorly ordered than the crystals, these temperature factor effects become significant at the moderate resolution described here.
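The text does not give an explicit form for the “temperature factor”. As an illustration only, the conventional isotropic (Debye-Waller) form used in crystallography can be written as below, where the symbols $B$ (temperature factor, in Å$^2$), $s$ (reciprocal spacing) and $d$ (resolution in Å) are the usual crystallographic ones and are assumptions here rather than quantities defined in this paper.

```latex
F_{\mathrm{obs}}(s) \;=\; F(s)\,\exp\!\left(-\frac{B\,s^{2}}{4}\right),
\qquad s = \frac{1}{d}

% Relative attenuation between the two nominal axial resolutions quoted
% in the text (45 A and 27 A), for any given B:
\frac{F_{\mathrm{obs}}(1/27)\,/\,F(1/27)}{F_{\mathrm{obs}}(1/45)\,/\,F(1/45)}
 \;=\; \exp\!\left[-\frac{B}{4}\left(\frac{1}{27^{2}}-\frac{1}{45^{2}}\right)\right]
 \;\approx\; \exp\!\left(-2.19\times10^{-4}\,B\right)
```

Under this form the attenuation grows with the square of the reciprocal spacing, which is consistent with the qualitative statement that amplitudes are increasingly suppressed at higher resolution as the temperature factor rises.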

Possible sources of disorder in the actomyosin filaments used in this study include errors in axis straightening, variability in helical pitch, and variations in out-of-plane tilt along the filament. Initially, the accuracy of the straightening algorithm could perhaps be improved. For example, peaks in the cross correlation map can be sharpened considerably by orienting the template to more closely match the local orientation of small sections along the filament. Additional improvement may be obtained by the incorporation of iterative techniques similar to real-space methods described for 2D crystals (whereby a projection of the calculated 3D map is used as a reference to restraighten and reprocess the images) [23] or correlation functions which emphasize high resolution features in images [24]. It should be noted that the modular nature of PHOELIX makes addition of modules with new features very convenient. Errors due to noise in the image are, however, likely to remain a problem, particularly for ice embedded specimens, where the contrast is low.

Variability in helical pitch [17-20] is a much larger source of remaining disorder in these filaments, as was demonstrated by the attenuation of layer line amplitudes seen in Fig. 7. As noted, it was possible to improve the data by breaking the filament into shorter and shorter lengths, though this sacrifices signal/noise and significantly increases computational requirements. The simpler correction of collecting data by integration across a number of pixels in the transform appears to provide an approximately equivalent improvement in the data, albeit at the expense of increased noise on weak layer lines. Further improvements might be achieved by explicitly measuring and taking into account the instantaneous pitch along each filament. This approach has been successfully applied in the analysis of sickle hemoglobin filaments [25], and could be used for actomyosin now that the high resolution X-ray crystal structures of both actin [26,27] and the myosin head [28,29] are available. Similarly, local variations in out-of-plane tilt along the filament can be measured [12], and while algorithms for their correction have yet to be described, techniques akin to those applied for the correction of variable pitch should be applicable.

A further increase in the signal-to-noise ratio of high resolution layer lines might be achieved through the use of a layer line “sniffer” algorithm as described by DeRosier [30]. In this algorithm an initial set of average layer lines is calculated by extracting and averaging data from individual filaments for which the predicted positions of high resolution layer lines have been determined from the selection rule. As a result of disorder in the filaments the actual layer line position might differ from the predicted layer line position, particularly at increasing resolution. The sniffer algorithm thus calculates a phase residual between each layer line in the average data set and each of a set of layer lines centered around the predicted layer line position in the individual filament. The layer lines corresponding to the lowest phase residual are then used to compute a second average and the process is iterated until no further improvement is achieved.
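One search step of such a sniffer can be sketched as follows. This is an illustrative sketch only, not the algorithm of Ref. [30] as implemented: the residual definition (mean absolute phase difference weighted by the product of the two amplitudes), the function names, the `search` window, and the storage of the transform as rows of complex values are all assumptions.

```python
import cmath
import math


def phase_residual(reference, candidate):
    """Amplitude-weighted mean absolute phase difference, in degrees.

    Both arguments are sequences of complex values sampled at the same
    radial positions. Returns 180.0 if there is no overlapping signal.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for ref, cand in zip(reference, candidate):
        weight = abs(ref) * abs(cand)
        if weight == 0.0:
            continue
        # Phase of cand relative to ref, in [-pi, pi].
        diff = cmath.phase(cand * ref.conjugate())
        weighted_sum += weight * abs(diff)
        weight_total += weight
    if weight_total == 0.0:
        return 180.0
    return math.degrees(weighted_sum / weight_total)


def sniff_layer_line(transform, reference, predicted_row, search=2):
    """Return (row, residual) for the row nearest the average layer line.

    Rows within `search` pixels of `predicted_row` are compared with the
    averaged layer line `reference`; the lowest residual wins.
    """
    first = max(0, predicted_row - search)
    last = min(len(transform) - 1, predicted_row + search)
    best_row, best_residual = None, math.inf
    for row in range(first, last + 1):
        residual = phase_residual(reference, transform[row])
        if residual < best_residual:
            best_row, best_residual = row, residual
    return best_row, best_residual
```

In a full iteration, the rows selected for every filament would be re-averaged to give a new `reference`, and the search repeated until the selected rows stop changing.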

Finally, while these corrections may help to compensate for specimen disorder, they do not address those difficulties inherent in recording electron images. It may be possible to attempt a correction for these difficulties similar to that described by Schertler, Villa and Henderson [31]. In this study the attenuation of amplitudes computed from electron images of rhodopsin was determined by comparison to electron diffraction patterns obtained from bacteriorhodopsin. This amplitude correction relies on the similarity between rhodopsin and bacteriorhodopsin and would not be directly applicable to actomyosin. It might be possible, however, to obtain a similar correction by computing a diffraction pattern from a “model” actomyosin filament based on the X-ray crystallographic map of actin and myosin molecules. Comparison of the theoretical layer line amplitudes to our calculated average amplitudes would provide an equivalent temperature factor correction.
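The comparison of theoretical and observed amplitudes proposed above could, under simple assumptions, be reduced to a straight-line fit. The sketch below assumes a single isotropic temperature factor B and overall scale k, so that observed = k · model · exp(−B/(4d²)); this functional form, the function names and the input layout are illustrative assumptions and are not taken from the paper or from Ref. [31].

```python
import math


def fit_temperature_factor(spacings, observed, model):
    """Least-squares fit of ln(observed/model) = ln(k) - B / (4 d^2).

    spacings : resolution d (in Angstrom) of each amplitude sample
    observed : averaged amplitudes from the images
    model    : amplitudes computed from a model structure
    Returns (B, k), with B in square Angstrom.
    """
    xs, ys = [], []
    for d, f_obs, f_model in zip(spacings, observed, model):
        if f_obs > 0 and f_model > 0:
            xs.append(1.0 / (d * d))
            ys.append(math.log(f_obs / f_model))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two usable amplitude pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all samples lie at the same resolution")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return -4.0 * slope, math.exp(intercept)


def correct_amplitude(amplitude, d, b_factor, scale=1.0):
    """Undo the fitted attenuation for one amplitude at resolution d."""
    return amplitude * math.exp(b_factor / (4.0 * d * d)) / scale
```

Because the correction multiplies weak high-resolution amplitudes by a large factor, it also amplifies their noise, so in practice it would need to be limited to layer lines with reliable signal.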

The current implementation of the PHOELIX package is available on request. Send e-mail to [email protected] or [email protected].

Acknowledgements

We thank Wah Chiu and Mike Schmid (Baylor College of Medicine) for generously providing UNIX-compatible versions of the MRC image libraries. This work was supported by grant AR39155 (to R.A.M.) from the National Institutes of Health. R.A.M. is an Established Investigator of the American Heart Association.

References

[1] P.B. Moore, H.E. Huxley and D.J. DeRosier, J. Mol. Biol. 50 (1970) 279.

[2] D.J. DeRosier and P.B. Moore, J. Mol. Biol. 52 (1970) 355.

[3] T. Wakabayashi, H.E. Huxley, L.A. Amos and A. Klug, J. Mol. Biol. 93 (1975) 477.

[4] L.A. Amos and A. Klug, J. Mol. Biol. 99 (1975) 51.

[5] K.A. Taylor and L.A. Amos, J. Mol. Biol. 147 (1981) 297.

[6] P. Vibert and R. Craig, J. Mol. Biol. 157 (1982) 299.

[7] C. Toyoshima and T. Wakabayashi, J. Biochem. 97 (1985) 219.

[8] R.A. Milligan and P.F. Flicker, J. Cell Biol. 105 (1987) 29.

[9] N. Unwin, J. Mol. Biol. 229 (1993) 1101.

[10] M.F. Schmid, J.M. Agris, J. Jakana et al., J. Cell Biol. 124 (1994) 341.

[11] A. McGough, M. Way and D.J. DeRosier, J. Cell Biol. 126 (1994) 433.

[12] D.G. Morgan and D.J. DeRosier, Ultramicroscopy 46 (1992) 263.

[13] J.K. Stoops, J.P. Schroeter, J.P. Bretaudiere, N.H. Olson, T.S. Baker and D.K. Strickland, J. Struct. Biol. 106 (1991) 172.

[14] E.H. Egelman, Ultramicroscopy 19 (1986) 367.

[15] D. Hessler, S.J. Young, B.O. Carragher, M. Martone, J.E. Hinshaw, R.A. Milligan, E. Masliah, M. Whittaker, S. Lamont and M.H. Ellisman, Microscopy 22 (1992) 73.

[16] R.A. Milligan, M. Whittaker and D. Safer, Nature 348 (1990) 217.

[17] J. Hanson, Nature 213 (1967) 353.

[18] E.H. Egelman and D.J. DeRosier, Acta Crystallogr. A 38 (1982) 796.

[19] U. Aebi, R. Millonig, H. Salvo et al., Ann. N.Y. Acad. Sci. 483 (1986) 100.

[20] D.L. Stokes and D.J. DeRosier, J. Cell Biol. 104 (1987) 1005.

[21] R. Henderson, Ultramicroscopy 46 (1992) 1.

[22] R.M. Glaeser and K.H. Downing, Ultramicroscopy 47 (1992) 256.

[23] R.H. Crepeau and E.K. Fram, Ultramicroscopy 6 (1981) 7.

[24] M. Schatz and M. van Heel, Ultramicroscopy 45 (1992) 15.

[25] D.A. Bluemke, B. Carragher and R. Josephs, Ultramicroscopy 26 (1988) 255.

[26] W. Kabsch, H.G. Mannherz, D. Suck et al., Nature 347 (1990) 37.

[27] K.C. Holmes, D. Popp, W. Gebhard et al., Nature 347 (1990) 44.

[28] I. Rayment, W.R. Rypniewski, K. Schmidt-Base et al., Science 261 (1993) 50.

[29] I. Rayment, H.M. Holden, M. Whittaker et al., Science 261 (1993) 58.

[30] D.G. Morgan and D. DeRosier, Biophys. J. 64 (1993) a243.

[31] G.F.X. Schertler, C. Villa and R. Henderson, Nature 362 (1993) 770.