journal of structural biology - university of southern...

14
Exploring the spatial and temporal organization of a cell’s proteome Martin Beck a , Maya Topf b , Zachary Frazier c , Harianto Tjong c , Min Xu c , Shihua Zhang c , Frank Alber c,a European Molecular Biology Laboratory, Meyerhofstr. 1, 69117 Heidelberg, Germany b Molecular Biology, Crystallography, Department of Biological Sciences, Birkbeck College, University of London, London, UK c Program in Molecular and Computational Biology, University of Southern California, 1050 Childs Way, RRI 413E, Los Angeles, CA 90089, USA article info Article history: Available online 19 November 2010 Keywords: Macromolecular assemblies Visual proteomics Cryo electron tomography Cryo electron microscopy Density fitting Brownian dynamics simulations Reaction-diffusion modeling abstract To increase our current understanding of cellular processes, such as cell signaling and division, knowl- edge is needed about the spatial and temporal organization of the proteome at different organizational levels. These levels cover a wide range of length and time scales: from the atomic structures of macro- molecules for inferring their molecular function, to the quantitative description of their abundance, and spatial distribution in the cell. Emerging new experimental technologies are greatly increasing the availability of such spatial information on the molecular organization in living cells. This review addresses three fields that have significantly contributed to our understanding of the proteome’s spatial and temporal organization: first, methods for the structure determination of individual macromolecular assemblies, specifically the fitting of atomic structures into density maps generated from electron microscopy techniques; second, research that visualizes the spatial distributions of these complexes within the cellular context using cryo electron tomography techniques combined with computational image processing; and third, methods for the spatial modeling of the dynamic organization of the prote- ome, specifically those methods for simulating reaction and diffusion of proteins and complexes in crowded intracellular fluids. The long-term goal is to integrate the varied data about a proteome’s orga- nization into a spatially explicit, predictive model of cellular processes. Ó 2010 Elsevier Inc. All rights reserved. 1. Introduction A vast amount of information on a cell’s proteome is being gen- erated by a variety of experimental methods, revealing multiple levels of spatial organization in a cell (Fig. 1) Robinson et al., 2007; Yus et al., 2009; Kühner et al., 2009; Kiel and Serrano, 2009; Collins et al., 2007; Lippincott-Schwartz and Manley, 2009. The molecular structures of individual macromolecules and their assemblies are routinely determined by X-ray crystallography, NMR spectroscopy and electron microscopy (EM). These structures are necessary to determine the molecular functions of macromole- cules and their complexes, and provide insights into their evolu- tion (Robinson et al., 2007; Sali et al., 2003). Insights into the proteome organization at a cellular level can be determined by fluorescence imaging (Betzig et al., 2006; Manley et al., 2008) and cryo electron tomography (cryoET) (Leis et al., 2009; Li and Jensen, 2009; Baumeister, 2002). CryoET can be applied to a pleo- morphic biological specimen in its near native state, and can not only determine a cell’s ultra-structure (i.e., the membrane bound compartmentalization in the cell) (Leis et al., 2009; Briegel et al., 2009; Morris and Jensen, 2008), but also can increasingly produce 3D snapshots of the distributions of large complexes inside the cell (Kühner et al., 2009; Medalia et al., 2002; Beck et al., 2009). At lower levels of organization, new genomics and proteomic technologies are documenting what genes are transcribed and when (Morris and Jensen, 2008; Yus et al., 2009; Güell et al., 2009). Quantitative mass spectrometry can provide information about which types of proteins are active at a single point in time, and can also reveal the relative and absolute abundance of these proteins and their assemblies in cells and organelles (Kühner et al., 2009; Malmström et al., 2009; Ranish et al., 2003; Aebersold and Mann, 2003). It is also possible to obtain quantitative informa- tion about the association rates of reversible protein interactions and the rates of enzymatic reactions in many process pathways (Kinkhabwala and Bastiaens, 2010; Maeder et al., 2007; Dehmelt and Bastiaens, 2010). A compelling long-term goal of biology is to combine all avail- able spatial and quantitative information about a cell’s constituent parts into a predictive spatially explicit model of cell processes, metabolism and behavior (Morris and Jensen, 2008; Kinkhabwala and Bastiaens, 2010; Lemerle et al., 2005; Ridgway et al., 2006). Ideally, such models would include atomic details relevant to the molecular function of macromolecular assemblies as well as infor- mation about the cell’s ultra-structure and the higher-order spatial organization of these assemblies at a cellular level. Both scales are 1047-8477/$ - see front matter Ó 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.jsb.2010.11.011 Corresponding author. Fax: +1 213 740 2437. E-mail address: [email protected] (F. Alber). Journal of Structural Biology 173 (2011) 483–496 Contents lists available at ScienceDirect Journal of Structural Biology journal homepage: www.elsevier.com/locate/yjsbi

Upload: others

Post on 25-Aug-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

Journal of Structural Biology 173 (2011) 483–496

Contents lists available at ScienceDirect

Journal of Structural Biology

journal homepage: www.elsevier .com/locate /y jsbi

Exploring the spatial and temporal organization of a cell’s proteome

Martin Beck a, Maya Topf b, Zachary Frazier c, Harianto Tjong c, Min Xu c, Shihua Zhang c, Frank Alber c,⇑a European Molecular Biology Laboratory, Meyerhofstr. 1, 69117 Heidelberg, Germanyb Molecular Biology, Crystallography, Department of Biological Sciences, Birkbeck College, University of London, London, UKc Program in Molecular and Computational Biology, University of Southern California, 1050 Childs Way, RRI 413E, Los Angeles, CA 90089, USA

a r t i c l e i n f o

Article history:Available online 19 November 2010

Keywords:Macromolecular assembliesVisual proteomicsCryo electron tomographyCryo electron microscopyDensity fittingBrownian dynamics simulationsReaction-diffusion modeling

1047-8477/$ - see front matter � 2010 Elsevier Inc. Adoi:10.1016/j.jsb.2010.11.011

⇑ Corresponding author. Fax: +1 213 740 2437.E-mail address: [email protected] (F. Alber).

a b s t r a c t

To increase our current understanding of cellular processes, such as cell signaling and division, knowl-edge is needed about the spatial and temporal organization of the proteome at different organizationallevels. These levels cover a wide range of length and time scales: from the atomic structures of macro-molecules for inferring their molecular function, to the quantitative description of their abundance,and spatial distribution in the cell. Emerging new experimental technologies are greatly increasing theavailability of such spatial information on the molecular organization in living cells. This reviewaddresses three fields that have significantly contributed to our understanding of the proteome’s spatialand temporal organization: first, methods for the structure determination of individual macromolecularassemblies, specifically the fitting of atomic structures into density maps generated from electronmicroscopy techniques; second, research that visualizes the spatial distributions of these complexeswithin the cellular context using cryo electron tomography techniques combined with computationalimage processing; and third, methods for the spatial modeling of the dynamic organization of the prote-ome, specifically those methods for simulating reaction and diffusion of proteins and complexes incrowded intracellular fluids. The long-term goal is to integrate the varied data about a proteome’s orga-nization into a spatially explicit, predictive model of cellular processes.

� 2010 Elsevier Inc. All rights reserved.

1. Introduction

A vast amount of information on a cell’s proteome is being gen-erated by a variety of experimental methods, revealing multiplelevels of spatial organization in a cell (Fig. 1) Robinson et al.,2007; Yus et al., 2009; Kühner et al., 2009; Kiel and Serrano,2009; Collins et al., 2007; Lippincott-Schwartz and Manley, 2009.The molecular structures of individual macromolecules and theirassemblies are routinely determined by X-ray crystallography,NMR spectroscopy and electron microscopy (EM). These structuresare necessary to determine the molecular functions of macromole-cules and their complexes, and provide insights into their evolu-tion (Robinson et al., 2007; Sali et al., 2003). Insights into theproteome organization at a cellular level can be determined byfluorescence imaging (Betzig et al., 2006; Manley et al., 2008)and cryo electron tomography (cryoET) (Leis et al., 2009; Li andJensen, 2009; Baumeister, 2002). CryoET can be applied to a pleo-morphic biological specimen in its near native state, and can notonly determine a cell’s ultra-structure (i.e., the membrane boundcompartmentalization in the cell) (Leis et al., 2009; Briegel et al.,2009; Morris and Jensen, 2008), but also can increasingly produce

ll rights reserved.

3D snapshots of the distributions of large complexes inside the cell(Kühner et al., 2009; Medalia et al., 2002; Beck et al., 2009).

At lower levels of organization, new genomics and proteomictechnologies are documenting what genes are transcribed andwhen (Morris and Jensen, 2008; Yus et al., 2009; Güell et al.,2009). Quantitative mass spectrometry can provide informationabout which types of proteins are active at a single point in time,and can also reveal the relative and absolute abundance of theseproteins and their assemblies in cells and organelles (Kühneret al., 2009; Malmström et al., 2009; Ranish et al., 2003; Aebersoldand Mann, 2003). It is also possible to obtain quantitative informa-tion about the association rates of reversible protein interactionsand the rates of enzymatic reactions in many process pathways(Kinkhabwala and Bastiaens, 2010; Maeder et al., 2007; Dehmeltand Bastiaens, 2010).

A compelling long-term goal of biology is to combine all avail-able spatial and quantitative information about a cell’s constituentparts into a predictive spatially explicit model of cell processes,metabolism and behavior (Morris and Jensen, 2008; Kinkhabwalaand Bastiaens, 2010; Lemerle et al., 2005; Ridgway et al., 2006).Ideally, such models would include atomic details relevant to themolecular function of macromolecular assemblies as well as infor-mation about the cell’s ultra-structure and the higher-order spatialorganization of these assemblies at a cellular level. Both scales are

Page 2: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

Fig.1. Bridging molecular and cellular biology. (Upper panels) Structural information gathered at different levels of organization, ranging from atomic structures ofmacromolecules (left), to the molecular architectures of larger cellular components (middle) and the higher order proteome organization (right). The right scheme shows amodel of the intracellular organization of E. coli (Reprinted with permission of David S. Goodsell, The Scripps Research Institute, La Jolla, CA). (Lower panels) Selection ofexperimental and computational methods that can provide spatial information of the proteome organization at the various scales of resolution. (SAXS: Small-angle X-rayscattering. TEM: Transmission electron microscopy).

484 M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496

necessary to understand key biochemical processes. For instancespatial gradients, irregularities and discontinuities of macromolec-ular distributions in a cell are known to play an active role in bio-logical processes such as cell division, genome segregation, generegulation, cell morphogenesis, and shape maintenance. Many bio-logical processes related to cell signaling, cell cycle, and proteintransport are also modulated by the precise spatial organizationof molecules in space and time (Dehmelt and Bastiaens, 2010; Ga-vin et al., 2006; Scott and Pawson, 2009; Bork and Serrano, 2005).For instance, in the so-called ‘‘push–pull’’ signaling networks, vari-ations in the spatial distribution of two antagonistic enzymes thatcovalently modify a substrate can either dramatically weaken oramplify the corresponding signal transduction (Morelli and tenWolde, 2008). Two other examples are the chemotaxis pathwayin E. coli (Lipkow et al., 2005; Bray, 2002), and the mitogen-acti-vated protein kinase (MAPK) cascade processes (Maeder et al.,2007; Dehmelt and Bastiaens, 2010). Even the simplest cells arequite heterogeneous in their cellular proteome organization (Shap-iro et al., 2009), but so far relatively little is known about the orga-nization of the cellular proteome at molecular level and how thisorganization dynamically changes over time.

For a better understanding of cellular processes, it is necessaryto expand the scope of structural biology from the molecular func-tions of the isolated macromolecular assemblies to the concertedroles that such assemblies play in their cellular context (Leiset al., 2009; Nickell et al., 2006; Shtengel et al., 2009). Three fieldsthat in recent years have had a significant contribution in this con-text are addressed in this review: first, computational methods forcalculating pseudo-atomic models of large macromolecular assem-blies by combining EM density maps and atomic models of the iso-lated components – such models provide insights into the

molecular functions of the key players in cell processes. Thesestructures provide also a starting point for detecting their localiza-tion in the cellular context. Second, cryoET techniques that areused for characterizing the spatial distributions of these assemblieswithin the cell using computational image processing; and third,spatial modeling of the dynamic organization of the proteome, inparticular particle-based reaction-diffusion techniques that simu-late the trajectories of these macromolecules and their assembliesover time during reaction processes in their cellular context.

2. Modeling structures of macromolecular assemblies bycryoEM density fitting

High-resolution structures of macromolecular assemblies arekey for understanding biological processes at the molecular level(Alber et al., 2008). They can serve as templates in the identifica-tion of complexes in whole-cell tomograms (Section 3) and areuseful to predict realistic association rates (Schlosshauer and Ba-ker, 2004) and diffusion properties in particle-based simulationsof cellular mixtures (Section 4). Although X-ray crystallographyhas provided spectacular insights into the structure and functionof such assemblies, (e.g., the ribosome (Rodnina and Wintermeyer,2009) and the exosome (Lorentzen et al., 2008)), most of the struc-tural data on abundant large and stable assemblies has been de-rived in recent years by EM methods (Chiu et al., 2006; Frank,2009). EM produces 3D density maps of large assemblies typicallyat intermediate to low levels of resolution (�5–30 Å). The numberof density maps in the public domain is growing rapidly, with over360 structures in the EM databank alone (Henrick et al., 2003). Re-cent advances in cryoEM data acquisition and image processing

Page 3: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496 485

have made it possible to determine a rapidly increasing number ofstructures at sub-nanometer resolution, allowing in many casesthe identification of secondary structure elements and even thetracing of the backbone (Chiu et al., 2005; Zhou, 2008; Lindertet al., 2009; Stahlberg and Walz, 2008). A number of structureswere even solved at near-atomic resolutions (�3.8–4.6 Å), wheremany detailed structural features such as turns and deep grovesof helices, strand separation in beta-sheets, densities for loopsand bulky amino acids side chains could be resolved (Zhou,2008). Such structures have so far been obtained in highly sym-metrical objects such as the bacterial flagellar filament (Yonekuraet al., 2003), various viruses (the cytoplasmic polyhedrosis virus(Yu et al., 2008), rotavirus inner capsid particle (Zhang et al.,2008), the Epsilon 15 bacteriophage (Jiang et al., 2008), the tobaccomosaic virus (Clare and Orlova, 2010), and the GroEL chaperonin(Ludtke et al., 2008). For most cryoEM maps, however, the levelof resolution is still not sufficient to directly determine the struc-ture at atomic detail. Indeed, the number of medium- to low-reso-lution density maps is increasing exponentially while the increaseof high-resolution maps grows at a smaller rate. At the intermedi-ate- to low-resolution range, pseudo-atomic models can be ob-tained by fitting atomic structures and models of individualcomponents (such as whole proteins, domains and nucleic acidchains) into the EM density maps (Stahlberg and Walz, 2008; Orl-ova and Saibil, 2004; Rossmann et al., 2005).

In this section we review computational methods for rigid fit-ting, flexible fitting and multiple-component fitting into EM den-sity maps of macromolecular assemblies. We give examples ofsuccessful applications of computational methods that producedpseudo-atomic models of macromolecular assemblies.

2.1. Rigid fitting

Rigid fitting methods have been very successful in providingmany pseudo-atomic models of macromolecular assemblies (Ta-ble 1) (Rossmann et al., 2005; Wriggers and Chacon, 2001; Fabiolaand Chapman, 2005; Topf and Sali, 2005). In most of these meth-ods, the quality-of-fit measure for the placement of a probe struc-ture in a density map is the cross-correlation coefficient (or itsvariants), calculated between the EM density map and the densityof the probe structure (computed by convolving the atomic struc-ture using a point-spread function). To find the highest cross-cor-relation some of these methods (e.g. COLORES (Chacon andWriggers, 2002)) perform an exhaustive search over all possibleprobe map positions and orientations relative to the reference

Table 1Computational methods for fitting of atomic structures into cryo-EM density maps.

Rigid fitting Density-based fittingCOAN (Volkmann and Hanein, 1999), EMFit (Rossmann, 2(Chacon and Wriggers, 2002) (Wriggers et al., 1999), UROXthreading MC (Wu et al., 2003), 3SOM (Ceulemans and RuFit_in_Map (Goddard et al., 2007)Point-matching methodsSITUS Wriggers et al., 1999 (Birmanns and Wriggers, 2007

Flexible fitting Fitting multiple conformationsNMA (Tama et al., 2002; Ma, 2005), NORMA(Suhre et al.,(Velazquez-Muriel et al., 2006), Rosetta/Foldhunter (BakeRefinement methodsRSRef Chen et al., 2003, NMFF(Tama et al., 2004), Flex-EMYUP.SCX (Tan et al., 2008), FIRST/FRODA (Jolley et al., 2002010), Cross-correlation based MD (Orzechowski and Tam2008)

Simultaneous Fitting ofComponents

GMFIT Kawabata, 2008, MultiFit (Lasker et al., 2009), IQP

Comprehensive dataintegration

IMP (Alber et al., 2008, 2007a,b; Förster et al., 2009; Lask

EM map. Since performing such a search is computationallyintensive, they usually increase the sampling efficiency by apply-ing a Fast Fourier Transform (FFT)-accelerated translational searchbased on the convolution theorem (Katchalski-Katzir et al., 1992).Sampling efficiency can also be accelerated by a fast rotationalsearch based on spherical harmonics (Kovacs et al., 2003) (e.g., asimplemented in ADP_EM (Garzon et al., 2007)). In contrast to theexhaustive search, some programs can perform a heuristic search,for example by using a Monte Carlo-based search algorithm (Wuet al., 2003; Topf et al., 2005).

Other methods increase the computational efficiency by reduc-ing the complexity of density maps to a small set of feature points(so-called code-book vectors) whose positions best reproduce thedensity map’s gross features, such as its shape and mass distribu-tion (Table 1). The optimal positions of feature points can be deter-mined by the vector quantization (VQ) technique (Martinetz et al.,1993). The fitting problem then effectively reduces into a commonpoint-set matching problem. This matching has been implementedin the SITUS package by an exhaustive search method for single-molecule fitting (Wriggers et al., 1999) and a heuristic anchor-point registration method for component fitting into an assemblymap (Birmanns and Wriggers, 2007). The latter method uses a hier-archical alignment of the point sets and reduces the search-spacecomplexity by an integrated tree-pruning technique.

There are also a number of programs that allow a local search(e.g., Mod-EM (Topf et al., 2005) and Chimera/Fit_in_Map (Goddardet al., 2007)), that is, maximizing the quality-of-fit measure bysearching only in the immediate neighborhood of the initial posi-tion of the probe structure. This option is rather useful because of-ten searching for a correct fit of a component in the entireassembly map may lead to an incorrect fit. If knowledge aboutthe approximate position of the component in the map exists(e.g., based on the position of other components or labeling exper-iments) local fitting may help avoiding this problem. Another wayto solve this problem is to use simultaneous multiple-componentfitting (see below).

2.2. Flexible fitting

A common problem in EM density fitting is that the isolatedcomponent structure may be in a different conformation than inthe assembly density map. These conformational differences canoriginate from the varied experimental conditions under whichcomponents and assembly structures were determined as well asfrom distortions due to crystal packing and experimental noise.

000), DOCKEM (Roseman, 2000), FoldHunter (Jiang et al., 2001), Situs/Colores(Navaza et al., 2002; Siebert and Navaza, 2009), FRM (Kovacs et al., 2003), grid-

ssell, 2004), Mod-EM (Topf et al., 2005), ADP_EM (Garzon et al., 2007), Chimera/

)

2006; Siebert and Navaza, 2009), Moulder-EM (Topf et al., 2006), S-flexfitr et al., 2006), CHOYCE (Rawi et al., 2010)

(Topf et al., 2008), MDFF (Trabuco et al., 2008), Rosetta (DiMaio et al., 2009),8), DireX (Schroder et al., 2007), DDFF (Kovacs et al., 2008), EM-IMO (Zhu et al.a, 2008), SITUS (Wriggers et al., 1999; Birmanns and Wriggers, 2007; Rusu et al.,

(Zhang et al., 2010)

er et al., 2010)

Page 4: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

486 M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496

In addition, due to sample heterogeneity single-particle cryoEMand image processing often results in a number of maps describingdifferent functional conformations of the entire complex (Spahnand Penczek, 2009). Common conformational differences are shearand hinge movements of domains and secondary structure ele-ments, as well as loop distortions. When a component structure isgenerated by a structure prediction method (Baker and Sali, 2001)additional errors can be introduced due to misassignment of sec-ondary structure elements to incorrect sequence regions resultingin their shifts in space (Alber et al., 2008; Topf et al., 2008).

The simplest approach to incorporate conformational variabilityis to divide the probe structure into rigid bodies, such as domains,and fit each of them independently into the map (Wendt et al.,2001; Volkmann et al., 2000). Clearly, this approach is problematicbecause the mechanical properties of the probe structure are notreflected in the deformation of the structure. A more objective ap-proach is to generate multiple ‘‘valid’’ conformations of the probestructure and select the top ranking conformation based on its fitin the density. This approach, which essentially separates samplingand scoring, is based on the principle that a correlation exists be-tween the quality of a structural model (e.g., in terms of the RMSDfrom the native conformation) and its cross correlation with thedensity map (Topf et al., 2005). An alternative to this approachare methods that simultaneously optimize the conformation ofthe probe and its position and orientation in the cryoEM map(Fabiola and Chapman, 2005). Such fitting methods are similar tocrystallographic refinement programs, except that they generallyrefine groups of atoms held together instead of individual atoms.Common to all the automated methods is the limited samplingof conformational degrees of freedom. Therefore, they are usuallyapplied to components that are first placed into the density mapby rigid fitting.

2.2.1. Fitting multiple conformationsIn order to generate candidate conformations for fitting one can

use normal mode analysis (Tama et al., 2002; Ma, 2005; Suhreet al., 2006), comparative protein structure modeling (Moulder-EM Topf et al., 2006 and the CHOYCE webserver (Rawi et al.,2010), ab intio modeling (Rosetta/Foldhunter, Baker et al., 2006),and exploit the structural variability of protein domains within agiven superfamily (S-flexfit, Velazquez-Muriel et al., 2006). Fittingof multiple conformations is particular useful when crystal struc-

Fig.2. Pseudo-atomic models of the yeast 60S (A) and 40S (B) ribosomal subunits determhomology modeling. YUP.SCX (Tan et al., 2008) and Flex-EM (Topf et al., 2008) were applishown as magenta and green tubes, respectively. Ribosomal proteins, with known hoaccounting for proteins that are unique to eukaryotes with no structural homolog availabin yellow (A) and slate (B). The boundary of each cryoEM map is shown as a gray mesh

tures of the components are not available and must be modeledby comparative modeling or by ab initio protein structure predic-tion methods. In an example of this approach, most eukaryoticribosomal proteins were modeled using comparative modelingand fitted into the ribosome density map at �9 Å resolution(Chandramouli et al., 2008; Taylor et al., 2009) (Fig. 2). A similarapproach was implemented using de novo protein structure predic-tions (Rosetta Rohl et al., 2004). Different conformations were gen-erated and assessed by cryoEM density fitting (Foldhunter Jianget al., 2001), leading to the discovery of a novel fold for the herpes-virus VP26 core domain (Baker et al., 2006).

2.2.2. Refinement methodsReal-space refinement methods (e.g., RSRef (Chen et al., 2003),

Flex-EM (Topf et al., 2008), MDFF (Trabuco et al., 2008), Rosetta (Di-Maio et al., 2009), and EM-IMO (Zhu et al. 2010)) use the density toguide the conformational changes in the probe structure. Generallythese methods optimize the fit of the probe structure in the densitywhile maintaining its mechanical properties. In Flex-EM for exam-ple, the scoring function includes a cross-correlation term as well asstereochemical and non-bonded interaction terms (Topf et al.,2008). A heuristic optimization that relies on a Monte Carlo search,a conjugate-gradients minimization, and simulated annealingmolecular dynamics (MD) is applied to a series of subdivisions ofthe structure into progressively smaller rigid bodies. The methodhas been used to refine a number of crystal structures and homol-ogy models in cryoEM maps, among them the structure of the elon-gation factor EF4 in the �11 Å resolution cryoEM map of EF4 boundto the E. coli ribosome (Fig. 3) (Connell et al., 2008) and the structureof the apotosome-procaspase-9 CARD complex at 9.5 Å (Yuan et al.2010). In YUP.SCX (Tan et al., 2008) intra-molecular forces areapproximated by a Gaussian Network Model (GNM) and the opti-mization protocol uses simulated annealing MD. Flex-EM andYUP.SCX were applied to refine pseudo-atomic models of proteinsand RNA, respectively, using the density of an eukaryotic ribosomeat �9 Å resolution (Taylor et al., 2009) (Fig. 2). In MDFF (Trabucoet al., 2008), external forces proportional to the gradient of the den-sity map are combined with an MD protocol to optimize the confor-mations of the component structures. This method was recentlyapplied to studies on the mechanisms of protein synthesis by theribosome (Seidelt et al., 2009) and the dynamics of protein translo-cation (Becker et al., 2009).

ined by combining cryoEM density fitting (at �9 Å resolution) with RNA and proteined to refine the models. Ribosomal RNA is represented as tubes. 5S and 5.8S rRNA aremologs and placement, are shown as orange (A) and pink (B) cartoons. Densityle or that have not been localized within the context of the ribosome is represented. (Figure is reproduced from Ref. (Taylor et al., 2009)).

Page 5: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

Fig.3. The molecular model of ribosome-bound EF4 (rbEF4) in the process of back-translocating tRNAs. (A) The model of rbEF4 based on Flex-EM refinement of the unboundEF4 crystal structure (PDB 3CB4 (Evans et al., 2008)) in the cryoEM density (mesh), which was computationally segmented from the density of the entire 70S–EF4 complex (at�11 Å resolution). (B) The individual domains (corresponding to the EF-G domain nomenclature (Connell et al., 2007) are colored blue (I), green (II), yellow (III), dark orange(V) lighter orange (V0) and red (CTD). (B) The EF4 X-ray structure (gray cartoon) superimposed on the rbEF4 structure by aligning domain I to illustrate the domainrearrangement between the two. (C and D) The contrasting interaction of domain IV in rbEF-G (Connell et al., 2007) (PDB2OM7) and domain V0 in rbEF4 with the tRNAintermediate (A/L-tRNA, purple ladder representation) is shown (rbEF-G superimposed on the cryo-EM density of rbEF4). Domain IV and the G0 sub-domain of EF-G (orange)have no equivalent in EF4. The absence of domain IV in EF4 is a prerequisite for allowing the tRNA to access the A-site during the back-translocation reaction (from the P-siteto the A-site). The contacts with the A/L-tRNA (via sub-domain V0 and the CTD), revealed by flexible fitting, allow EF4–GTP to form a ternary complex with tRNA on theribosome. (Figure is reproduced from Reference (Connell et al., 2008)).

M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496 487

In many refinement methods the conformation is optimized forgroups of atoms, either by defining them as rigid bodies (Topf et al.,2008) or by applying restraints on secondary structure elements(Trabuco et al., 2008). Rigid bodies can be selected manually orby automated methods, such as those based on graph theory(e.g., FIRST/FRODA (Jolley et al., 2008)). However, the definitionof rigid bodies can limit the conformational degrees of freedomof the structure: if the number of rigid bodies is too small the opti-mization may not reach the global minimum because a more de-tailed modification of the conformation is needed. If an all-atomrepresentation is chosen the computational efficiency is largely re-duced and the system is likely to get trapped in local minima. Away to avoid this problem can be the use of coarse-grained (CG)or reduced structure representation by considering only Ca atoms(e.g., as in DireX which combines an elastic network model withrandom walk displacements and distance restraints (Schroderet al., 2007)) or by representing the side-chain of a residue witha pseudo-atom (e.g., as in Damped-Dynamics Flexible Fitting -DDFF (Kovacs et al., 2008)). A recent method applied a Go-modelwith cross-correlation biased MD without imposing any restraintson the secondary structure elements (Grubisic et al., 2010). A re-duced representation can also be used in conjunction with interpo-lation methods, as an alternative to the constrained MD (Rusuet al., 2008). Finally, conformational changes in a structure can alsobe guided by normal mode analysis (Tama et al., 2002; Ma, 2005;Suhre et al., 2006). In NMFF for example, a linear combination oflow-frequency normal modes is used to iteratively deform thestructure to conform to the low-resolution density map (Tamaet al., 2004). One of the advantages of this method is that it allowsfor large conformational changes as observed in the anthrax com-plex (Tama et al., 2006) or GroEL (Falke et al., 2005). However,

NMA is often limited in describing smaller-scale conformationalchanges (Tama and Sanejouand, 2001) that are often reflected inintermediate-resolution cryoEM density map.

2.3. Simultaneous fitting of multiple components

A considerable challenge is the fitting of multiple componentsinto the density map of an assembly if no a priori knowledge aboutthe location of the components is available. Sequential fitting ofcomponents often fails when they cannot be unambiguouslyplaced in the density map as is often the case for maps ofintermediate-to low-resolution and for assemblies with a largenumber of components. In such cases, all the components mustbe fitted simultaneously into the map to identify the global opti-mum of the quality-of-fit measure. However, simultaneous fittingis difficult as the large search space makes an exhaustive searchprotocol (that uniformly samples over all degrees of freedom) com-putationally unfeasible.

One method (MultiFit (Lasker et al., 2009)) uses discrete sam-pling in combination with an inference optimizer and expandsthe quality-of-fit measure by additional information such as shapecomplementarities between interacting components and the pro-trusion of components from the map envelope. Other methods,such as GMFIT (Kawabata, 2008) and IQP (Zhang et al., 2010), sim-plify the search problem by reducing the complexity of the assem-bly density map and structures (Wriggers et al., 1999; Birmannsand Wriggers, 2007). In GMFIT the initial density distribution ofthe assembly and the components are approximated by a smallset of Gaussian functions to efficiently use gradient-based optimi-zation methods for the structural optimization of component orien-tations (GMFIT). In the IQP method integer quadratic programming

Page 6: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

488 M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496

is used for matching two point sets that represent the reduced den-sity distribution of several components and the assembly. IQP isable to match multiple component point sets simultaneously whileconsidering both information about the geometric architecture ofthe point distributions as well as the consistency of the densitymap in the immediate neighborhood of the points. Moreover, themethod allows straightforward integration of additional informa-tion about the assembly. Reducing the complexity of the densityinformation to point sets is accompanied by an inevitably loss inaccuracy in the fitting process. To overcome this challenge, typicallya large number of independent point set matches must be per-formed, which generates an ensemble of candidate solutions. Thisensemble is then assessed using an independent scoring functionthat measures the quality-of-fit between the component structuresand the assembly map using the cross-correlation function. Finally,the best scoring structures are refined to locally optimize the fit be-tween assembly and component density maps.

Although progress has been achieved, both simultaneous andflexible fitting of assembly components are still a challenge, in par-ticular for cryoEM maps at resolution worse than �10 Å (themajority of maps). There is currently a need to develop faster scor-ing functions that can help discriminate between good and bad fitsas well as optimization methods that can explore both conforma-tional and configurational space more efficiently.

2.4. Comprehensive data integration

For the fitting of low-resolution density maps the sampling can beimproved if additional information is available about the assemblystructure, for instance about protein interactions, sub-complex com-position, or shape information from mass spectrometry and SAXS.Recently, a systematic computational framework was introduced(IMP, integrative modeling platform) to determine the structures oflarge macromolecular assemblies by comprehensive integration ofdiverse data sources (Alber et al., 2008, 2007a,b; Förster et al.,2009; Lasker et al., 2010). The method can integrate structural infor-mation that can vary greatly in terms of their resolution and preci-sion, for instance, data from X-ray crystallography, NMRspectroscopy, electron microscopy, chemical cross-linking, affinitypurification, yeast two-hybrid experiments, and computationaldocking (Alber et al., 2007a; Förster et al., 2009; Lasker et al.,2010). The approach attempts to compute structures that are consis-tent with all available information about the assembly regarding thecomposition and structure of the complex. The structure determina-tion process is formulated as an optimization problem where struc-tures that are consistent with all input information are found byminimizing a scoring function. To comprehensively sample the solu-tion space, an ensemble is generated that contains independentlycalculated structures, each satisfying all of the input restraints.

By applying this framework, the structure of the 450-proteincontaining nuclear pore complex (NPC) was determined based onthe integration of data from seven independent experimental andtheoretical sources (Alber et al., 2007a,b). The averaged structuralNPC model has led to functional insights and evolutionary under-standing of the yeast NPC (Alber et al., 2007b). Another studydetermined the pseudo-atomic model AAA-ATPase/20S core parti-cle sub-complex of the 26S proteasome (Nickell et al., 2009; Försteret al., 2009). The structure was determined based on a cryoEM mapof the 26S proteasome, structures of homologs, and physical pro-tein–protein interaction data.

3. Determining the spatial organization of the cellularproteome

High-resolution structures of macromolecular complexes pro-vide major insights in their molecular mechanism. But to fully

understand their function in the context of cellular processes,knowledge of their spatial distribution within living cells is crucial.We currently only begin to understand that the spatial organiza-tion of the cell goes far beyond membrane confined compartmentsand solely diffusion controlled reactions. Cooperating functionalmodules can function more effectively in local proximity and theirreactive sites might be specifically orientated towards each other.For example, recent work demonstrated that certain mRNAs canbe locally translated (Lin and Holt, 2008) and that polysomal ribo-somes specifically orient along the mRNA trace (Brandt et al.,2009). Such polysomal ribosome arrangements occur in protru-sions of Glia cells (Brandt et al., 2010), which might form subcom-partments that could allow chaperones to easily access theemerging polypeptide chain and effectively shielding it from thecrowded environment until folding is complete. These findingsillustrate the need for techniques that reveal the spatial positionand orientation of macromolecules at sufficient resolution on aproteome-wide scale.

Classical proteomics approaches determine the protein contentof cells, however, any spatial information is lost and cell specificproperties are averaged over the population of lysed cells. Recentadvances in technology will allow a much better identificationand localization of these assemblies in the cell. Among these meth-ods are (i) high-resolution fluorescence imaging, which can pro-vide spatial and dynamic information about the localization,abundance and diffusion rates of macromolecules and their assem-blies (Werner et al., 2009) and (ii) cryoET, which not only can pro-vide insights into the ultra-structure of cells but increasingly eveninto the cellular distribution of assemblies (Beck et al., 2009; Ben-Harush et al., 2009; Förster et al., 2010; Grünewald, 2009). Withthese methods it may be possible to gain information about thecellular environment of macromolecular assemblies, and gain in-sights into the positions of all their instances in a cell, during dif-ferent time points in its lifetime. They also promise to revealdifferences in the distribution of macromolecules between individ-ual cells and their contribution to the behavior of the entire popu-lation. Such information will help bridging the knowledge aboutthe molecular function of macromolecules derived from theiratomic structure with their cellular function at the cellular systemslevel. In the following section we focus on methods and applica-tions of cryoET.

3.1. Visual proteomics using cryo electron tomography

CryoET provides 3D snapshots of cells under close-to-live condi-tions (Figs. 4 and 5). However, assigning the protein identity to theobserved shapes and distribution of complexes is not a trivial taskdue to a number of technical challenges: a tomogram typically haslow signal-to-noise ratio, non-isotropic resolution, and straight-forward in vivo labeling techniques, analogous to GFP-tags usedin fluorescence imaging, are not available for cryoEM (Leis et al.,2009; Morris and Jensen, 2008; Nickell et al., 2006; Lucic et al.,2005). For the proportion of protein assemblies whose structuresare known, visual proteomics methods have been developed thatattempt to generate molecular atlases containing the positionand angular orientation of protein complexes in cryo electrontomograms of cells by template matching (or pattern recognition)(Baumeister, 2002; Nickell et al., 2006; Förster et al., 2010;Frangakis et al., 2002; Böhm et al., 2000). At first a template libraryis built that contains the reference structures of the protein com-plexes to be localized. Subsequently, an exhaustive search is per-formed that scans the tomographic volumes for matches with thestructural templates by calculating a cross-correlation score. Final-ly, the observed distribution of the cross-correlation score is trans-lated into a position list using peak extraction and statisticalmethods. In 2000, the first proof-of-concept for template matching

Page 7: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

Fig.4. Visual proteomics of the human pathogen Leptospira interrogans. (A) Scoredistributions obtained from artificial tomograms of the observed (blue), truepositive (green) false positive hits (red). The arrowhead marks the threshold thataccounts for a specificity of 40%. (B) Centered slices through an artificial Leptospiracell shown perpendicular (left) and parallel (right) to the electron optical axis, aswell as before (top) and after (bottom) simulating the image formation process.Numbers account for protein complexes as described below. (C) Centered slicethrough a tomographic reconstruction, oriented perpendicular to the electronoptical axis. (D) The templates assigned into this volume are shown surfacerendered. 70S ribosomes (1) are shown in dark brown, 2 RNA-polymerase II (2) ingreen, ATP-synthase (3) in bright brown, GroEL in bright blue, GroEL-ES (4) in darkblue, Hsp (5) in red, cytoplasmic membrane in transparent blue, and the cell wall intransparent brown). (Figure is adapted from Reference (Beck et al., 2009)).

M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496 489

in cryo electron tomograms using simulations and a first experi-mental application were provided (Böhm et al., 2000). The ultimategoal is to fit high-resolution structures into the cellular context,which would allow for observing protein interactions at work.Nevertheless, the unambiguous detection of macromolecularassemblies in cryo electron tomograms of intact cells remains aconsiderable challenge (Beck et al., 2009; Förster et al., 2010),not only due to the biological complexity. At the currently achiev-able resolution only large protein complexes have a chance ofbeing identified and the low signal-to-noise ratio peculiar to cryoelectron tomograms hampers a robust and reliable detection.

3.1.1. Template librariesVisual proteomics approaches are so far dependent on the avail-

ability of template structures, or, with other words, are biased bytheir own template libraries. In order to generate libraries of tem-plate structures that best account for the signal observed in a cryoelectron tomogram, the imaging process has to be simulated asrealistically as possible. First, the imaged quantity needs to be cal-culated from the coordinates and identities of the atoms specifiedin the PDB. Subsequently the density is convoluted with the con-trast transfer function, which describes the imaging in the trans-mission electron microscopy in linear approximation (Försteret al., 2010; Lucic et al., 2005). Beyond these technicalities, thereare a few biological aspects to be taken into account when tem-plate libraries are being assembled. Can the template sufficientlyaccount for the structural state of the targeted molecule withinthe cell? A recent structural-proteomic survey of the largest andmost abundant protein complexes in Desulfovibrio vulgaris revealeda high structural diversity between different microorganisms (Han

et al., 2009). The selection of reference structures solved from otherspecies has therefore to be done with caution. Furthermore, theprotein complex might exist in diverse oligomeric states in the cellthat might differ from the species observed during structure deter-mination. For this reason, visual proteomics approaches so far werelimited to highly stable and conserved targets. What is the abun-dance of the targeted protein complexes in the cell and to what ex-tent does the template library account for all large proteincomplexes that exist in the cell? This is an important question aswell, because high abundant protein complexes of similar struc-ture might compete for assignments and thereby have a negativeeffect on the performance. Targeted and directed mass spectromet-ric measurements can provide the required auxiliary informationabout protein abundances on a proteome-wide scale (Malmstromet al., 2009).

3.1.2. Assessment of template matchingOnce the template library is built, matches are identified by cal-

culating a cross-correlation score between reference structures andthe subvolume of a tomogram. Förster implemented a local, con-strained correlation function for template matching to accountfor missing wedge effects (due to incomplete tomogram sampling)(Förster, 2005; Förster et al., 2005; Roseman, 2004; Ortiz et al.,2006). Recently, a scoring function was introduced that not onlyevaluates the cross-correlation value for the template, but alsothe cross-correlation values of competing templates and decoysat the same position in the tomogram (Beck et al., 2009). This scor-ing scheme outperformed the classical workflow particularly forcompeting protein complexes of similar structural signature. Forvisualization, the template is rotated and positioned at the deter-mined coordinates and the corresponding angles that lead to themaximal cross-correlation values. The major challenge thereby isto discriminate true positive from false positive detections(Fig. 4A). The performance not only depends on tomogram-specificparameters such as acquisition settings, specimen thickness, andmolecular crowding, but also target-specific parameters, such asmolecular weights, cellular abundance, and cellular abundance ofprotein assemblies competing for assignments. Therefore, the per-formance can vary for each template contained in the library. Thefalse positive discovery rates of all matches have to be estimatedusing a statistical model (Beck et al., 2009; Keller et al., 2002).The performance of template matching can also be assessed basedon a priori knowledge about the spatial distribution of the templatestructures, such as the cellular localization or orientation of proteinassemblies, which can provide clues about the expected false posi-tive discovery rates. For example, some membrane associated pro-tein complexes exhibit specific positioning and orientation relativeto the membrane (Ortiz et al., 2006). Also matching with a mir-rored template of ‘non-native’ handedness can provide comple-mentary information about the distribution of cross-correlationvalues. In addition, simulated tomograms can serve as test casesto estimate the performances. Because the position and orientationof all protein complexes in the artificial tomograms are known, thefalse positive rate can be calculated for specific template com-plexes under the specific experimental conditions (Fig. 4B).

3.2. Applications

In a recent study, namely the visual proteomics project for thepathogen Leptospira interrogans (Beck et al., 2009; Förster et al.,2010; Grünewald, 2009; Malmstrom et al., 2009), targeted proteo-mics experiments detecting the identity and concentrations of cel-lular proteins were combined with cryoET-based templatematching to detect the spatial localizations of a set of protein com-plexes (Fig. 4). To identify template structures that might be suit-able for the template matching, 26 candidate assemblies of a

Page 8: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

Fig.5. By a combination of pattern recognition and classification algorithms, the following TAP-identified complexes from M. pneumoniae, matching to existing electronmicroscopy and X-ray and tomogram structures (A), were placed in a whole-cell tomogram (B): the structural core of pyruvate dehydrogenase in blue (�23 nm), the ribosomein yellow (�26 nm), RNA polymerase in purple (�17 nm), and GroEL homo-multimer in red (�20 nm). Cell dimensions are �300 nm by 700 nm. The cell membrane is shownin light blue. The rod, a prominent structure filling the space of the tip region, is depicted in green. Its major structural elements are HMW2 (Mpn310) in the core and HMW3(Mpn452) in the periphery, stabilizing the rod.The individual complexes (A) are not to scale, but they are shown to scale within the bacterial cell (B). (Figure is reproducedfrom Reference (Kühner et al., 2009).

490 M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496

certain minimal size (>250 kDa) and a certain sequence conserva-tion were retrieved from structural databases (Beck et al., 2009;Förster et al., 2010). Targeted proteomics on a proteome-wide scaledetermined that the cellular protein abundance in L. interrogansranged over 3.5 orders of magnitude (Malmstrom et al., 2009).Only ten other protein complexes of sufficient size for templatematching were found with abundances of at least 100 copies percell. Tomograms of L. interrogans cells were acquired and subjectedto template matching and scoring (Fig. 4 C and D). The local con-centration of the targeted protein complexes varied within andacross data sets: the cells displayed an average ribosome concen-tration of�20 lM (�40 mg/ml) in the cytoplasm, but the local con-centration ranged from 5 to 30 lM (�10–65 mg/ml). The localfluctuations in case of total GroEL together with GroEL-ES were lar-ger and ranged from �8 to 100 lM (�0.5–6.5 mg/ml). This studyshowed that: (i) background noise reduces the performancedepending on the MTF of the particular CCD camera; (ii) the miss-ing wedge can introduce angular bias to the discovery rate of tem-plates with anisotropic signal content (such as the elongated ATP-synthase); (iii) the specificity increases with the template abun-dance; and (iv) decreases with the degree of molecular crowding(Beck et al., 2009; Malmstrom et al., 2009). The last two effectsare generally more pronounced for smaller than for larger tem-plates. The conclusion was that the specificity achieved for highabundant Megadalton complexes is satisfactory. However, true-po-sitive discovery rates higher than 50% are difficult to achieve whenprotein complexes of smaller molecular weights are targeted. Forcomplexes of very low cellular abundance it is quite a challenge

to obtain robust statistical models (Beck et al., 2009; Malmstromet al., 2009).

Recently, the proteome organization of the small model bacteriaM. pneumoniae was analyzed, revealing its proteome compositionand organization, as well as its transcriptional and metabolic reg-ulation (Yus et al., 2009; Kühner et al., 2009; Güell et al., 2009).Proteomics-based TAP-mass spectrometry data were comple-mented with structural modeling and cryoET to provide insightsinto the structural anatomy of the model bacteria (Kühner et al.,2009). The study identified more than 170 protein complexes. Fourof these protein complexes whose structures have been knownwere localized in the tomogram of an intact M. pneumoniae cell(Fig. 5). This ratio of detected to structurally-characterized proteincomplexes demonstrates current throughput limitations of cryoEMarising from: (i) the stability of protein complexes during the cryopreparation; (ii) their orientation on the EM-grid and; (iii) the factthat the data analysis by image processing is time consuming andrequires intensive user interference. Nevertheless, the study dem-onstrates that mapping of macromolecular structures into entire-cell tomograms is a powerful strategy when combined with unbi-ased large-scale complex purification techniques. The template li-brary in this study was in part built from single-particle EMstructures of M. pneumoniae protein complexes. This considerableeffort overcomes the difficulty of structural conservation acrossspecies discussed above.

CryoET combined with image averaging provides uniqueopportunities for structure determination of membrane-boundmacromolecular assemblies in intact cells (Bartesaghi and

Page 9: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496 491

Subramaniam, 2009; Förster and Hegerl, 2007). For instance, a re-cent study investigated receptor clustering in bacterial chemotaxisand found a universal architecture in a wide range of bacteria(Briegel et al., 2009, 2008). In addition, cryoET can provide snap-shots of individual time points along a process pathway, as hasbeen demonstrated for cytoskeletal-driven processes that are in-volved in cell movements (Ben-Harush et al., 2010) and membranefusion in herpes simplex virus 1 entry (Maurer et al., 2008).

3.3. Outlook

A number of limitations to visual proteomics might be over-come by further technical developments. Currently the most criti-cal limitation in cryoET is the moderate signal-to-noise ratio. Sincethe dosage of electrons that can be applied to a biological specimenultimately limits the resolution of cryoET, visual proteomics cur-rently remains restricted to complexes of a certain minimal molec-ular weight that can be targeted by template matching. However,the signal in visual proteomics primarily arises from the contrastbetween the targeted protein complex and the surrounding sol-vent. Therefore, improving contrast, e.g. through phase plates(Murata et al. 2010), is likely to improve the performance in the fu-ture. Furthermore, specimen thinning techniques, e.g. through fo-cused ion beams (Rigort et al., 2010), will be crucial forimproving SNR and imaging larger cell types, such as eukaryoticcells, which exceed 500 nm in diameter; thus, application of visualproteomics to eukaryotic cells depends on progress in this field. Fi-nally, direct electron detection systems promise to largely improveSNR of cameras (McMullan et al., 2009), which will be highly ben-eficial for techniques that target pleomorphic structures, such ascryoET, in which the SNR cannot be improved by imaging the mul-tiple copies of the same specimen.

However, some limitations will remain: cryoET provides staticsnapshots of single cells without temporal resolution and the spa-tial resolution of live cell imaging techniques is comparably low.However, cryoET information can be complemented by computa-tional modeling of the proteome diffusion and reaction processes.Such combined approaches hold great potential to reveal insightsinto the dynamic spatial organization of the proteome and the bio-logical processes in cells.

4. Modeling the dynamic proteome organization

Novel experimental methods have greatly increased the avail-ability of information on the molecular organization of living cells.As described in the previous sections information is increasinglyavailable about the membrane confined compartments of cellsand the identity, abundance and spatial distributions of the macro-molecular components in the cell. There is a pressing need to inte-grate these varied data into spatially explicit, predictive models ofspecific biological processes such as the selective nucleo-cytoplas-mic transport, signal transduction, genome separation and mitosis.This goal not only requires knowledge about the atomic structuresand molecular functions of macromolecular assemblies involved,but also the quantitative rates of reactions these assemblies are in-volved in as well as their dynamic diffusion properties, whichdependent on their spatial environment in the cell. For instance,macromolecular crowding can influence protein diffusion and thusthe thermodynamic and statistical properties of important reactionprocesses. In the following sections, we review computationalmethods that simulate reaction processes while incorporating thedynamic diffusion behavior of proteins and their assemblies incrowded intracellular fluids. We begin by briefly mentioning themajor applications of network models for process simulations,

and then list the major particle-based reaction-diffusion modelingapproaches, which are the main focus of this section.

4.1. Network models

Network models are based on a mathematical description of thelaw of mass action and rate equations and often treat the cell as awell-mixed reactor where all components of the reaction networkhave spatially uniform concentrations (Kinkhabwala and Bastiaens,2010; Ridgway et al., 2006). However, many cellular processes,such as cell division and nucleo-cytoplasmic transport, are eitherspatially constrained or segregated (Terry et al., 2007). These activ-ities actively exploit spatial gradients in the macromolecular distri-butions. Spatial network models, such as Virtual Cell (Schaff et al.,1997; Slepchenko et al., 2003), MesoRD (Hattne et al., 2005), andSmartCell (Ander et al., 2004) incorporate such spatial effects bypartitioning the cell volume into microdomains (Lemerle et al.,2005; Ridgway et al., 2006). Virtual Cell (Schaff et al., 1997; Sle-pchenko et al., 2003), can incorporate experimentally determinedcell geometries into a network model of reaction processes.

When biomolecules are present in relatively low copy numbers,their local concentrations can fluctuate widely, which can causestochastic effects in reaction processes. The effective behavior ofsuch molecules may be very different from their behavior undera constant distribution; furthermore, in such cases the law of massaction for reaction kinetics no longer applies. It has been shownthat such stochastic particle behavior can significantly influencegene regulation (Raser and O’Shea, 2005; Cai et al., 2006), signaltransduction (Kollmann et al., 2005) and many other processes(Ridgway et al., 2006). To model stochastic behavior several net-work models (including SmartCell (Ander et al., 2004) and MesoRD(Hattne et al., 2005)) replace the law of mass action with a chem-ical master equation (Ridgway et al., 2006; Turner et al., 2004).Master equations are also simulated by the Stochastic SimulationAlgorithm (SSA) (Gillespie, 1977), and its variants such as thetau-leaping method (Rathinam et al., 2003), which can accountfor low copy numbers in cellular network simulations.

4.2. Particle-based reaction–diffusion simulations

Particle-based simulations naturally incorporate the concepts ofspace, crowding and stochasticity, so often provide a more realistictreatment than network models. A protein or other reactants areoften treated as single particles. Every particle in the reaction sys-tem is treated explicitly, and the particle positions are sampled atdiscrete time intervals. To simulate particle diffusion, these simu-lations use a method known as Brownian dynamics (BD) (Ridgwayet al., 2006; Ermak and McCammon, 1978): the net force experi-enced by a particle contains a random element in addition to con-tributions from interactions with other particles. The randomelement is an explicit approximation to the statistical propertiesof Brownian forces, due to the effects of collisions with solventmolecules, which are not explicitly modeled. More specifically,the particles are displaced from their position at each time intervalby a random vector whose norm is chosen from a probability dis-tribution function that is a solution to the Einstein diffusion equa-tion. As a consequence, spatial gradients and localized interactionsnaturally occur. A number of reaction–diffusion algorithms incor-porating Brownian dynamics have been developed (Morelli andten Wolde, 2008; Ridgway et al., 2008; Boulianne et al., 2008; An-drews and Bray, 2004; Sun and Weinstein, 2007). Interactions andreactions occur upon contact between particles according to spe-cific probabilities, which are chosen to reproduce the correct reac-tion kinetics (Morelli and ten Wolde, 2008; Ridgway et al., 2008;Boulianne et al., 2008; Andrews and Bray, 2004). The particle-based simulation scheme, described above, has been implemented

Page 10: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

492 M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496

by several packages, including GridCell (Boulianne et al., 2008),MCell (Coggan et al., 2005), Smoldyn (Andrews and Bray, 2004),and the molecular simulators described by Ridgway et al. (2008)and Morelli and ten Wolde (2008). In GridCell, particles move onlyto occupy vertices of cubic grid structures. In MCell and Smoldynthe molecules move freely and are treated as point particles with-out exclusion volume. Thus the latter two methods do not explic-itly resolve particle collisions. In Smoldyn, reactions are accepted iftwo particles pass within a certain distance chosen such that theoverall reaction rate is reproduced correctly. As the particles lackvolume, macromolecular crowding in the cell interior is incorpo-rated indirectly by adding inert stationary, impenetrable blocksto the cytoplasm (Lipkow et al., 2005; Andrews and Bray, 2004).Smoldyn can be applied with relatively long time steps, allowingsimulations at biologically relevant time scales. Smoldyn was usedto study the effects of cellular architecture on signal transductionin E. coli chemotaxis (Lipkow et al., 2005; Lipkow, 2006)(Fig. 6A). Simulations were performed for the diffusion of CheYpfrom the cluster of receptors to the flagellar motors, first undercontrol conditions and then in response to attractant and repellentstimuli (Lipkow et al., 2005). Smoldyn has also been used to studythe signal amplification in a lattice of coupled protein kinases(Goldman et al., 2009). MCell is based on Monte Carlo simulations,and can incorporate specific shapes and cell features derived fromcryoET and other experimental methods. MCell has been used tostudy presynaptic calcium dynamics and neurotransmitter release

Fig.6. (A) Smoldyn simulations of the chemotaxis pathway of the bacterium E. coli. Each ptype. The simulations show differential localization of CheZ and its oligomeric forms. UppCheZ dimers are unbound and freely diffusing in the cytoplasm. Middle left panel: Upon spanel: After 1 min at constant kinase activity, the increase of CheYp has led to the formshown as black circles. Panel A is reproduced from Reference (Lipkow, 2006). (B) Hard-sphrepresent various macromolecules and their complexes. The particle-size distributioncomplexes using an experimentally derived proteome catalog from E. coli K12. Panel B isthe E. coli cytoplasmic environment that includes 50 of the most abundant types of mcytoplasm model at the end of a Brownian dynamics simulation performed. Panel C is r

at a neuronal synapse (Coggan et al., 2005). Another study simu-lated macromolecular transport through the nuclear pore complexto interpret the experimentally determined cargo distribution inlight of existing models for nuclear import (Beck et al., 2007).

The intracellular space is a highly crowded and non-uniformwith an occupied volume of typically up to 50% in eukaryotesand 30–40% in bacteria (Ridgway et al., 2006; Minton, 2006; Ellisand Minton, 2003) (Fig. 6 B and C). In order to accurately simulatecrowding effects, a number of reaction–diffusion methods takeinto account the physical dimensions of proteins and complexes(Ridgway et al., 2008; Sun and Weinstein, 2007). Most of themmodel the proteins and complexes as hard, elastic spheres (Ridg-way et al., 2008; Sun and Weinstein, 2007; Kim and Yethiraj,2009). Crowding can have several profound effects on the dynam-ics within the cell, such as anomalous subdiffusion of macromole-cules (Golding and Cox, 2006; Weiss et al., 2004). These effects inturn have a direct impact on the thermodynamics and kinetics ofbiological processes, including protein interactions, diffusion, sta-bility, folding, and biochemical reactions rates (Ellis and Minton,2003; Schreiber et al., 2008). Crowding slows rates of diffusion-limited reactions (Kim and Yethiraj, 2009; Schreiber et al., 2008;Wieczorek and Zielenkiewicz, 2008), but accelerates reactions withlow association rates (i.e., very small reaction probabilities per col-lision), because it forces colliding particles to remain longer in clo-ser proximity. However, simply giving the particles an excludedvolume is not sufficient to accurately reproduce crowding effects

oint represents the position of a molecule, the color of the point indicates its specificer left panel: In the absence of CheAlong kinase activity and phosphorylated CheY, alludden increase of kinase activity, the level of CheYp rises in the entire cell. Lower leftation of oligomeric CheZ2Yp clusters at the inner face of the polar receptor clusterere representation of the E. coli cytoplasm at 50% volume fraction occupied. Spheres

was estimated from the relative abundance, mass, and stoichiometries of proteinreproduced from reference (Ridgway et al., 2008) (C) Atomically detailed model ofacromolecules at experimentally measured concentrations. Rendering shows the

eproduced from Reference (McGuffee and Elcock, 2010).

Page 11: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496 493

(Ridgway et al., 2008; Sun and Weinstein, 2007; Ando et al., 2010).Significantly better results can be obtained by combining the elas-tic collision method for hard spheres (Sun and Weinstein, 2007)with a mean field description of hydrodynamic interactions (Andoet al., 2010; Urbina-Villalba et al., 2003) and a Yukawa-like interac-tion potential for electrostatic interactions (Tokuyama and Oppen-heim, 1994; Dzubiella and McCammon, 2005). The mean-fieldapproach updates the diffusion constant for each particle at everytime step according to its local volume fraction (Urbina-Villalbaet al., 2003; Heyes, 1996). Other advanced approaches have beenproposed for the inclusion of hydrodynamic and electrostatic inter-actions (Banchio and Brady, 2003; Tanaka and Araki, 2000; Tucciand Kapral, 2004; Hansen and Lowen, 2000). Another hard-spherealgorithm has been developed that rigorously obeys the detailedbalance requirement of equilibrium reactions, allowing for anaccurate description of equilibrium properties in biochemical net-works (Morelli and ten Wolde, 2008). The method was used tostudy the effect of spatial fluctuations in a push–pull model oftwo antagonistic enzymes covalently modifying a substrate. Theresults demonstrate that spatial density fluctuations of the compo-nents can strongly reduce the gain of a response in a biochemicalnetwork (Morelli and ten Wolde, 2008), and once again highlightthe importance of the spatial dimension. Recently a BD simulationexplored macromolecular diffusion at atomic resolution in thecrowded E. coli cytoplasmic environment (Fig. 6C). Snapshots ofthe simulation trajectories were used to compute the cytoplasm’seffects on the thermodynamics of protein folding, association andaggregation (McGuffee and Elcock, 2010; Elcock, 2010).

A disadvantage of time-driven BD schemes is that they require asmall time step, which limits the effective simulation time. Event-driven simulations, such as the Green’s function reaction dynamics(GRFD) scheme (Takahashi et al., 2009; van Zon and ten Wolde,2005), on the other hand, make large jumps in time and spacewhen the particles are far apart from each other (Takahashiet al., 2009; van Zon and ten Wolde, 2005). When particle concen-trations are low, these methods are much more efficient than time-driven methods. GRFD was used to study a mitogen-activated-pro-tein kinase (MAPK) pathway (Takahashi et al., 2009). This researchfound that rapid enzyme-substrate rebindings can turn a distribu-tive mechanism into a processive mechanism. Furthermore, the re-sponse of the network differed dramatically from that obtained bya mean-field analysis of the chemical rate equations (Takahashiet al., 2009). Another event-driven approach is the first-passageMonte Carlo algorithm (Opplestrup et al., 2006).

The difficulty of reaching long simulation times is a drawback ofall particle-based methods. This limitation is particularly onerouswhen dealing with crowded intracellular fluids, which requireshort time steps to accurately simulate protein diffusion. Biologicalprocesses occur on a wide range of time scales. Some proteins mayencounter each other in fractions of a millisecond, while otherstake hours. DNA transcription and translation are completed in afew minutes, but these processes are made up of thousands ofsmall diffusion-limited reactions, which take fractions of a nano-second. It is an ongoing challenge to develop particle-based simu-lations that can cover a wide range of time scales while accuratelyreproducing the properties of diffusion and reaction networks. Inaddition, a more realistic description of the cellular environmentis required, which includes the spatial organization of the cell’smembrane bound compartments and the knowledge of the types,abundance and spatial distributions of all the constituentmacromolecules.

5. Conclusions

Understanding biological processes will require knowledgeabout the molecular functions of the cellular components as well

as quantitative information about their abundance, interactionsand spatio-temporal distributions in the confined geometry ofthe cell. Emerging new experimental technologies are greatlyincreasing the availability of such spatial information about themolecular organization in living cells. In this review we have ad-dressed three distinct fields that have contributed to our under-standing of the proteome’s spatial and temporal organization.CryoEM density fitting methods can provide pseudo-atomic struc-tures of macromolecular assemblies. Such structures provide in-sights into the molecular functions of assemblies and can alsoserve as templates for visual proteomics approaches, which canpotentially identify their abundance and distributions in the celland therefore provide a three-dimensional view of the cellular pro-teomes and interactomes. Although the method is currently lim-ited to small cells and large macromolecular assemblies, anumber of limitations to visual proteomics might be overcomeby further technical developments, which will eventually allowto expand this exciting technology to larger eukaryotic cells. Final-ly, ongoing research is reviewed on the dynamic modeling of cellu-lar reaction processes. Over the last few years several particle-based reaction diffusion methods have been developed that accu-rately reproduce protein diffusion and reactions in highly crowdedcellular environments. However, it is an ongoing challenge to ex-tend such methods to cover the wide range of time scales seen inbiological processes, while maintaining high accuracy in reproduc-ing the reaction properties. The long-term goal will remain to inte-grate information derived at the various levels of spatial proteomeorganization, into a predictive spatial model of cell processes. Suchintegration will be facilitated by the continued extension of struc-tural biology methods from the molecular to the cellular level.

Acknowledgments

The authors acknowledge financial support from the HumanFrontier Science Program (RGY0079/2009-C to F.A. and M.T.), theNational Institute of Health (NIH) (1R01GM096089 and2U54RR022220-06 to F.A.), Alfred P. Sloan Research foundation(to F.A.); F.A. is a Pew Scholar in Biomedical Sciences, supportedby the Pew Charitable Trusts, Medical Research Council CareerDevelopment Award (G0600084) (to M.T.) and the ‘Special Presi-dential Prize and Excellent PhD Thesis Award – Scientific ResearchFoundation of the Chinese Academy of Sciences (to S.Z.). This re-view is partially based on refs. (Beck et al., 2009; Alber et al.,2008; Förster et al., 2010). The authors also acknowledge M.S.Madhusudhan and Ben Mathiesen for assistance in preparing themanuscript and useful discussions.

References

Aebersold, R., Mann, M., 2003. Mass spectrometry-based proteomics. Nature 422(6928), 198–207. March.

Alber, F. et al., 2007a. Determining the architectures of macromolecular assemblies.Nature 450 (7170), 683–694.

Alber, F. et al., 2007b. The molecular architecture of the nuclear pore complex.Nature 450 (7170), 695–701.

Alber, F. et al., 2008. Integrating diverse data for structure determination ofmacromolecular assemblies. Annu. Rev. Biochem. 77, 443–477.

Ander, M. et al., 2004. SmartCell, a framework to simulate cellular processes thatcombines stochastic approximation with diffusion and localisation: analysis ofsimple networks. Syst. Biol. (Stevenage) 1 (1), 129–138.

Ando, T., Skolnick, J., 2010. Crowding and hydrodynamic interactions likelydominate in vivo macromolecular motion. Proc. Natl. Acad. Sci. USA 107 (43),18457–18462.

Andrews, S.S., Bray, D., 2004. Stochastic simulation of chemical reactions withspatial resolution and single molecule detail. Phys. Biol. 1 (3–4), 137–151.

Baker, D., Sali, A., 2001. Protein structure prediction and structural genomics.Science 294 (5540), 93–96.

Baker, M.L. et al., 2006. Ab initio modeling of the herpesvirus VP26 core domainassessed by CryoEM density. PLoS Comput. Biol. 2 (10), e146.

Banchio, A.J., Brady, J.F., 2003. Accelerated Stokesian dynamics: Brownian motion. J.Chem. Phys. 118, 10323–10333.

Page 12: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

494 M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496

Bartesaghi, A., Subramaniam, S., 2009. Membrane protein structure determinationusing cryo-electron tomography and 3D image averaging. Curr. Opin. Struct.Biol. 19 (4), 402–407.

Baumeister, W., 2002. Electron tomography: towards visualizing the molecularorganization of the cytoplasm. Curr. Opin. Struct. Biol. 12 (5), 679–684.

Beck, M. et al., 2007. Snapshots of nuclear pore complexes in action captured bycryo-electron tomography. Nature 449 (7162), 611–615.

Beck, M. et al., 2009. Visual proteomics of the human pathogen Leptospirainterrogans. Nat. Methods 6 (11), 817–823.

Becker, T. et al., 2009. Structure of monomeric yeast and mammalian Sec61complexes interacting with the translating ribosome. Science 326 (5958),1369–1373.

Ben-Harush, K. et al., 2009. The supramolecular organization of the C. elegansnuclear lamin filament. J. Mol. Biol. 386 (5), 1392–1402.

Ben-Harush, K. et al., 2010. Visualizing cellular processes at the molecular level bycryo-electron tomography. J. Cell Sci. 123 (Pt 1), 7–12.

Betzig, E. et al., 2006. Imaging intracellular fluorescent proteins at nanometerresolution. Science 313 (5793), 1642–1645.

Birmanns, S., Wriggers, W., 2007. Multi-resolution anchor-point registration ofbiomolecular assemblies and their components. J. Struct. Biol. 157 (1), 271–280.

Böhm, J. et al., 2000. Toward detecting and identifying macromolecules in a cellularcontext: template matching applied to electron tomograms. Proc. Natl. Acad.Sci. USA 97 (26), 14245–14250.

Bork, P., Serrano, L., 2005. Towards cellular systems in 4D. Cell 121 (4), 507–509.Boulianne, L. et al., 2008. GridCell: a stochastic particle-based biological system

simulator. BMC Syst. Biol. 2 (1), 66.Brandt, F. et al., 2009. The native 3D organization of bacterial polysomes. Cell 136

(2), 261–271.Brandt, F. et al., 2010. The three-dimensional organization of polyribosomes in

intact human cells. Mol Cell. 39 (4), 560–569.Bray, D., 2002. Bacterial chemotaxis and the question of gain. Proc. Natl. Acad. Sci.

USA 99 (1), 7–9.Briegel, A. et al., 2008. Location and architecture of the Caulobacter crescentus

chemoreceptor array. Mol. Microbiol. 69 (1), 30–41.Briegel, A. et al., 2009. Universal architecture of bacterial chemoreceptor arrays.

Proc. Natl. Acad. Sci. USA 106 (40), 17181–17186.Cai, L., Friedman, N., Xie, X.S., 2006. Stochastic protein expression in individual cells

at the single molecule level. Nature 440 (7082), 358–362.Ceulemans, H., Russell, R.B., 2004. Fast fitting of atomic structures to low-resolution

electron density maps by surface overlap maximization. J. Mol. Biol. 338 (4),783–793.

Chacon, P., Wriggers, W., 2002. Multi-resolution contour-based fitting ofmacromolecular structures. J. Mol. Biol. 317 (3), 375–384.

Chandramouli, P. et al., 2008. Structure of the mammalian 80S ribosome at 8.7 Aresolution. Structure 16 (4), 535–548.

Chen, J.Z. et al., 2003. Low-resolution structure refinement in electron microscopy. J.Struct. Biol. 144 (1–2), 144–151.

Chiu, W. et al., 2005. Electron cryomicroscopy of biological machines atsubnanometer resolution. Structure (Camb) 13 (3), 363–372.

Chiu, W., Baker, M.L., Almo, S.C., 2006. Structural biology of cellular machines.Trends Cell. Biol. 16 (3), 144–150.

Clare, D.K., Orlova, E.V., 2010. 4.6A Cryo-EM reconstruction of tobacco mosaic virusfrom images at 300keV on a 4k � 4k CCD camera. J. Struc. Biol. 171 (3), 303–308.

Coggan, J.S. et al., 2005. Evidence for ectopic neurotransmission at a neuronalsynapse. Science 309 (5733), 446–451.

Collins, S.R. et al., 2007. Functional dissection of protein complexes involved inyeast chromosome biology using a genetic interaction map. Nature 446 (7137),806–810.

Connell, S.R. et al., 2007. Structural basis for interaction of the ribosome with theswitch regions of GTP-bound elongation factors. Mol. Cell 25 (5), 751–764.

Connell, S.R. et al., 2008. A new tRNA intermediate revealed on the ribosome duringEF4-mediated back-translocation. Nat. Struct. Mol. Biol. 15 (9), 910–915.

Dehmelt, L., Bastiaens, P.I., 2010. Spatial organization of intracellularcommunication: insights from imaging. Nat. Rev. Mol. Cell Biol. 11 (6), 440–452.

DiMaio, F. et al., 2009. Refinement of protein structures into low-resolution densitymaps using Rosetta. J. Mol. Biol. 392 (1), 181–190.

Dzubiella, J., McCammon, J.A., 2005. Substrate concentration dependence of thediffusion-controlled steady-state rate constant. J. Chem. Phys. 122 (18), 184902.

Elcock, A.H., 2010. Models of macromolecular crowding effects and the need forquantitative comparisons with experiment. Curr. Opin. Struct. Biol. 20 (2), 196–206.

Ellis, R.J., Minton, A.P., 2003. Cell biology: join the crowd. Nature 425 (6953), 27–28.Ermak, D.L., McCammon, J.A., 1978. Brownian dynamics with hydrodynamic

interactions. J. Chem. Phys..Evans, R.N. et al., 2008. The structure of LepA, the ribosomal back translocase. Proc.

Natl. Acad. Sci. USA 105 (12), 4673–4678.Fabiola, F., Chapman, M.S., 2005. Fitting of high-resolution structures into electron

microscopy reconstruction images. Structure (Camb) 13 (3), 389–400.Falke, S. et al., 2005. The 13 angstroms structure of a chaperonin GroEL-protein

substrate complex by cryo-electron microscopy. J. Mol. Biol. 348 (1), 219–230.Förster, F., 2005. PhD thesis. Technical University of Munich, vol. 1, issue 1. p. 1.Förster, F., Han, B.G., Beck, M., 2010. Visual Proteomics. Book 1 (1), 1.Förster, F., Hegerl, R., 2007. Structure determination in situ by averaging of

tomograms. Methods Cell Biol. 79, 741–767.

Förster, F. et al., 2005. Retrovirus envelope protein complex structure in situ studiedby cryo-electron tomography. Proc. Natl. Acad. Sci. USA 102 (13), 4729–4734.

Förster, F. et al., 2009. An atomic model AAA-ATPase/20S core particle sub-complexof the 26S proteasome. Biochem. Biophys. Res. Commun. 388 (2), 228–233.

Frangakis, A.S. et al., 2002. Identification of macromolecular complexes incryoelectron tomograms of phantom cells. Proc. Natl. Acad. Sci. USA 99 (22),14153–14158.

Frank, J., 2009. Single-particle reconstruction of biological macromolecules inelectron microscopy – 30 years. Q. Rev. Biophys. 42 (3), 139–158.

Garzon, J.I. et al., 2007. ADP_EM: fast exhaustive multi-resolution docking for high-throughput coverage. Bioinformatics 23 (4), 427–433.

Gavin, A.C. et al., 2006. Proteome survey reveals modularity of the yeast cellmachinery. Nature 440 (7084), 631–636.

Gillespie, D.T., 1977. Exact stochastic simulation of coupled chemical reactions. J.Phys. Chem. 81 (25), 2340–2361.

Goddard, T.D., Huang, C.C., Ferrin, T.E., 2007. Visualizing density maps with UCSFChimera. J. Struct. Biol. 157 (1), 281–287.

Golding, I., Cox, E.C., 2006. Physical nature of bacterial cytoplasm. Phys. Rev. Lett. 96(9), 098102.

Goldman, J.P., Levin, M.D., Bray, D., 2009. Signal amplification in a lattice of coupledprotein kinases. Mol. Biosyst..

Grubisic, I. et al., 2010. Biased coarse-grained molecular dynamics simulationapproach for flexible fitting of X-ray structure into cryo electron microscopymaps. J. Struct. Biol. 169 (1), 95–105.

Grünewald, K., 2009. Adding a spatial dimension to the proteome. Nat. Methods 6(11), 798–800.

Güell, M. et al., 2009. Transcriptome complexity in a genome-reduced bacterium.Science 326 (5957), 1268–1271.

Han, B.G. et al., 2009. Survey of large protein complexes in D. vulgaris reveals greatstructural diversity. Proc. Natl. Acad. Sci. USA 106 (39), 16580–16585.

Hansen, J.P., Lowen, H., 2000. Effective interactions between electric double layers.Annu. Rev. Phys. Chem. 51 (20), 209–242.

Hattne, J., Fange, D., Elf, J., 2005. Stochastic reaction-diffusion simulation withMesoRD. Bioinformatics 21 (12), 2923–2924.

Henrick, K. et al., 2003. EMDep: a web-based system for the deposition andvalidation of high-resolution electron microscopy macromolecular structuralinformation. J. Struct. Biol. 144 (1–2), 228–237.

Heyes, D.M., 1996. Brownian dynamics simulations of self and collective diffusion ofnear hard sphere colloidal liquids: inclusion of many-body hydrodynamics.Mol. Phys. 87 (2), 287–298.

Jiang, W. et al., 2001. Bridging the information gap: computational tools forintermediate resolution structure interpretation. J. Mol. Biol. 308 (5), 1033–1044.

Jiang, W. et al., 2008. Backbone structure of the infectious epsilon15 virus capsidrevealed by electron cryomicroscopy. Nature 451 (7182), 1130–1134.

Jolley, C.C. et al., 2008. Fitting low-resolution cryo-EM maps of proteins usingconstrained geometric simulations. Biophys. J. 94 (5), 1613–1621.

Katchalski-Katzir, E. et al., 1992. Molecular surface recognition: determination ofgeometric fit between proteins and their ligands by correlation techniques.Proc. Natl. Acad. Sci. USA 89 (6), 2195–2199.

Kawabata, T., 2008. Multiple subunit fitting into a low-resolution density map of amacromolecular complex using a gaussian mixture model. Biophys. J. 95 (10),4643–4658.

Keller, A. et al., 2002. Empirical statistical model to estimate the accuracy of peptideidentifications made by MS/MS and database search. Anal. Chem. 74 (20), 5383–5392.

Kiel, C., Serrano, L., 2009. Cell type-specific importance of ras-c-raf complexassociation rate constants for MAPK signaling. Sci. Signal. 2 (81), ra38.

Kim, J.S., Yethiraj, A., 2009. Effect of macromolecular crowding on reaction rates: acomputational and theoretical study. Biophys. J. 96 (4), 1333–1340.

Kinkhabwala, A., Bastiaens, P.I., 2010. Spatial aspects of intracellular informationprocessing. Curr. Opin. Genet. Dev. 20 (1), 31–40.

Kollmann, M. et al., 2005. Design principles of a bacterial signalling network. Nature438 (7067), 504–507.

Kovacs, J.A. et al., 2003. Fast rotational matching of rigid bodies by fast Fouriertransform acceleration of five degrees of freedom. Acta Crystallogr., D Biol.Crystallogr. 59 (Pt 8), 1371–1376.

Kovacs, J.A., Yeager, M., Abagyan, R., 2008. Damped-dynamics flexible fitting.Biophys. J. 95 (7), 3192–3207.

Kühner, S. et al., 2009. Proteome organization in a genome-reduced bacterium.Science 326 (5957), 1235–1240.

Lasker, K. et al., 2009. Inferential optimization for simultaneous fitting of multiplecomponents into a CryoEM map of their assembly. J. Mol. Biol. 388 (1), 180–194.

Lasker, K. et al., 2010. Integrative structure modeling of macromolecular assembliesfrom proteomics data. Mol Cell Proteomics 1 (1), 1.

Leis, A. et al., 2009. Visualizing cells at the nanoscale. Trends Biochem. Sci. 34 (2),60–70.

Lemerle, C., Ventura, B., Di Serrano, L., 2005. Space as the final frontier in stochasticsimulations of biological systems. FEBS Lett. 579 (8), 1789–1794.

Li, Z., Jensen, G.J., 2009. Electron cryotomography: a new view into microbialultrastructure. Curr. Opin. Microbiol. 12 (3), 333–340.

Lin, A.C., Holt, C.E., 2008. Function and regulation of local axonal translation. Curr.Opin. Neurobiol. 18 (1), 60–68.

Lindert, S., Stewart, P.L., Meiler, J., 2009. Hybrid approaches: applyingcomputational methods in cryo-electron microscopy. Curr. Opin. Struct. Biol.19 (2), 218–225.

Page 13: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496 495

Lipkow, K., 2006. Changing cellular location of CheZ predicted by molecularsimulations. PLoS Comput. Biol. 2 (4), e39.

Lipkow, K., Andrews, S.S., Bray, D., 2005. Simulated diffusion of phosphorylatedCheY through the cytoplasm of Escherichia coli. J. Bacteriol. 187 (1), 45–53.

Lippincott-Schwartz, J., Manley, S., 2009. Putting super-resolution fluorescencemicroscopy to work. Nat. Methods 6 (1), 21–23.

Lorentzen, E., Basquin, J., Conti, E., 2008. Structural organization of the RNA-degrading exosome. Curr. Opin. Struct. Biol. 18 (6), 709–713.

Lucic, V., Förster, F., Baumeister, W., 2005. Structural studies by electrontomography: from cells to molecules. Annu. Rev. Biochem. 74, 833–865.

Ludtke, S.J. et al., 2008. De novo backbone trace of GroEL from single particleelectron cryomicroscopy. Structure 16 (3), 441–448.

Ma, J., 2005. Usefulness and limitations of normal mode analysis in modelingdynamics of biomolecular complexes. Structure (Camb) 13 (3), 373–380.

Maeder, C.I. et al., 2007. Spatial regulation of Fus3 MAP kinase activity through areaction-diffusion mechanism in yeast pheromone signalling. Nat. Cell. Biol. 9(11), 1319–1326.

Malmstrom, J. et al., 2009. Proteome-wide cellular protein concentrations of thehuman pathogen Leptospira interrogans. Nature 460 (7256), 762–765.

Malmström, J. et al., 2009. Proteome-wide cellular protein concentrations of thehuman pathogen Leptospira interrogans. Nature 460 (7256), 762–765.

Manley, S. et al., 2008. High-density mapping of single-molecule trajectories withphotoactivated localization microscopy. Nat. Methods 5 (2), 155–157.

Martinetz, T.M., Berkovich, S.G., Schulten, K.J., 1993. Neural-gas’ network for vectorquantization and its application to time-series prediction. IEEE Trans. NeuralNetwork 4 (4), 558–569.

Maurer, U.E., Sodeik, B., Grunewald, K., 2008. Native 3D intermediates of membranefusion in herpes simplex virus 1 entry. Proc. Natl. Acad. Sci. USA 105 (30),10559–10564.

McGuffee, S.R., Elcock, A.H., 2010. Diffusion, crowding and protein stability in adynamic molecular model of the bacterial cytoplasm. PLoS Comput. Biol. 6 (3),e1000694.

McMullan, G. et al., 2009. Detective quantum efficiency of electron area detectors inelectron microscopy. Ultramicroscopy 109 (9), 1126–1143.

Medalia, O. et al., 2002. Macromolecular architecture in eukaryotic cells visualizedby cryoelectron tomography. Science 298 (5596), 1209–1213.

Minton, A.P., 2006. Macromolecular crowding. Curr. Biol. 16 (8), R269–R271.Morelli, M.J., ten Wolde, P.R., 2008. Reaction Brownian dynamics and the effect of

spatial fluctuations on the gain of a push-pull network. J. Chem. Phys. 129 (5),054112.

Morris, D.M., Jensen, G.J., 2008. Toward a biomechanical understanding of wholebacterial cells. Annu. Rev. Biochem. 77, 583–613.

Murata, K. et al., 2010. Zernike phase contrast cryo-electron microscopy andtomography for structure determination at nanometer and subnanometerresolutions. Structure 18 (8), 903–912.

Navaza, J. et al., 2002. On the fitting of model electron densities into EMreconstructions: a reciprocal-space formulation. Acta Crystallogr., D Biol.Crystallogr. 58 (Pt 2, Pt 10), 1820–1825.

Nickell, S. et al., 2006. A visual approach to proteomics. Nat. Rev. Mol. Cell Biol. 7 (3),225–230.

Nickell, S. et al., 2009. Insights into the molecular architecture of the 26Sproteasome. Proc. Natl. Acad. Sci. USA 106 (29), 11943–11947.

Opplestrup, T. et al., 2006. First-passage Monte Carlo algorithm: diffusion withoutall the hops. Phys. Rev. Lett. 97 (23), 230602–230606.

Orlova, E.V., Saibil, H.R., 2004. Structure determination of macromolecularassemblies by single-particle analysis of cryo-electron micrographs. Curr.Opin. Struct. Biol. 14 (5), 584–590.

Ortiz, J.O. et al., 2006. Mapping 70S ribosomes in intact cells by cryoelectrontomography and pattern recognition. J. Struct. Biol. 156 (2), 334–341.

Orzechowski, M., Tama, F., 2008. Flexible fitting of high-resolution X-ray structuresinto cryoelectron microscopy maps using biased molecular dynamicssimulations. Biophys. J. 95 (12), 5692–5705.

Ranish, J.A. et al., 2003. The study of macromolecular complexes by quantitativeproteomics. Nat. Genet. 33 (3), 349–355.

Raser, J.M., O’Shea, E.K., 2005. Noise in gene expression: origins, consequences, andcontrol. Science 309 (5743), 2010–2013.

Rathinam, M. et al., 2003. Stiffness in stochastic chemically reacting systems: theimplicit tau-leaping method. J. Chem. Phys. 119 (24), 12784–12794.

Rawi, R., Whitmore, L., Topf, M., 2010. CHOYCE – a web server for constrainedhomology modelling with cryoEM maps. Bioinformatics. 26 (13), 1673–1674.

Ridgway, D., Broderick, G., Ellison, M.J., 2006. Accommodating space, time andrandomness in network simulation. Curr. Opin. Biotechnol. 17 (5), 493–498.

Ridgway, D. et al., 2008. Coarse-grained molecular simulation of diffusion andreaction kinetics in a crowded virtual cytoplasm. Biophys. J. 94 (10), 3748–3759.

Rigort, A. et al., 2010. Micromachining tools and correlative approaches for cellularcryo-electron tomography. J. Struct. Biol..

Robinson, C.V., Sali, A., Baumeister, W., 2007. The molecular sociology of the cell.Nature 450 (7172), 973–982.

Rodnina, M.V., Wintermeyer, W., 2009. The ribosome goes Nobel. Trends Biochem.Sci. 35 (1), 1–5.

Rohl, C.A. et al., 2004. Protein structure prediction using Rosetta. Methods Enzymol.383, 66–93.

Roseman, A.M., 2000. Docking structures of domains into maps from cryo-electronmicroscopy using local correlation. Acta Crystallogr., D Biol. Crystallogr. 56 (Pt10), 1332–1340.

Roseman, A.M., 2004. FindEM – a fast, efficient program for automatic selection ofparticles from electron micrographs. J. Struct. Biol. 145 (1–2), 91–99.

Rossmann, M.G., 2000. Fitting atomic models into electron-microscopy maps. ActaCrystallogr., D Biol. Crystallogr. 56 (Pt 10), 1341–1349.

Rossmann, M.G. et al., 2005. Combining X-ray crystallography and electronmicroscopy. Structure (Camb) 13 (3), 355–362.

Rusu, M., Birmanns, S., Wriggers, W., 2008. Biomolecular pleiomorphism probed byspatial interpolation of coarse models. Bioinformatics 24 (21), 2460–2466.

Sali, A. et al., 2003. From words to literature in structural proteomics. Nature 422(6928), 216–225.

Schaff, J. et al., 1997. A general computational framework for modeling cellularstructure and function. Biophys. J. 73 (3), 1135–1146.

Schlosshauer, M., Baker, D., 2004. Realistic protein–protein association rates from asimple diffusional model neglecting long-range interactions, free energybarriers, and landscape ruggedness. Protein Sci. 13 (6), 1660–1669.

Schreiber, G., Haran, G., Zhou, H.X., 2008. Fundamental aspects of protein–proteinassociation kinetics. Chem. Rev. 109 (3), 839–860.

Schroder, G.F., Brunger, A.T., Levitt, M., 2007. Combining efficient conformationalsampling with a deformable elastic network model facilitates structurerefinement at low resolution. Structure 15 (12), 1630–1641.

Scott, J.D., Pawson, T., 2009. Cell signaling in space and time: where proteins cometogether and when they’re apart. Science 326 (5957), 1220–1224.

Seidelt, B. et al., 2009. Structural insight into nascent polypeptide chain-mediatedtranslational stalling. Science 326 (5958), 1412–1415.

Shapiro, L., McAdams, H.H., Losick, R., 2009. Why and how bacteria localize proteins.Science 326 (5957), 1225–1228.

Shtengel, G. et al., 2009. Interferometric fluorescent super-resolution microscopyresolves 3D cellular ultrastructure. Proc. Natl. Acad. Sci. USA 106 (9), 3125–3130.

Siebert, X., Navaza, J., 2009. UROX 2.0: an interactive tool for fitting atomic modelsinto electron-microscopy reconstructions. Acta Crystallogr., D Biol. Crystallogr.65 (Pt 7), 651–658.

Slepchenko, B.M. et al., 2003. Quantitative cell biology with the virtual cell. TrendsCell Biol. 13 (11), 570–576.

Spahn, C.M., Penczek, P.A., 2009. Exploring conformational modes ofmacromolecular assemblies by multiparticle cryo-EM. Curr. Opin. Struct. Biol.19 (5), 623–631.

Stahlberg, H., Walz, T., 2008. Molecular electron microscopy: state of the art andcurrent challenges. ACS Chem. Biol. 3 (5), 268–281.

Suhre, K., Navaza, J., Sanejouand, Y.H., 2006. NORMA: a tool for flexible fitting ofhigh-resolution protein structures into low-resolution electron-microscopy-derived density maps. Acta Crystallogr., D Biol. Crystallogr. 62 (Pt 9), 1098–1100.

Sun, J., Weinstein, H., 2007. Toward realistic modeling of dynamic processes in cellsignaling: quantification of macromolecular crowding effects. J. Chem. Phys.127 (15), 155105.

Takahashi, K., Tanase-Nicola, S., ten Wolde, P.R., 2009. Spatio-temporal correlationscan drastically change the response of a MAPK pathway. Proc. Natl. Acad. Sci.USA 107 (6), 2473–2478.

Tama, F., Sanejouand, Y.H., 2001. Conformational change of proteins arising fromnormal mode calculations. Protein Eng. 14 (1), 1–6.

Tama, F., Wriggers, W., Brooks 3rd, C.L., 2002. Exploring global distortions ofbiological macromolecules and assemblies from low-resolution structuralinformation and elastic network theory. J. Mol. Biol. 321 (2), 297–305.

Tama, F., Miyashita, O., Brooks 3rd, C.L., 2004. Flexible multi-scale fitting of atomicstructures into low-resolution electron density maps with elastic networknormal mode analysis. J. Mol. Biol. 337 (4), 985–999.

Tama, F. et al., 2006. Model of the toxic complex of anthrax: responsiveconformational changes in both the lethal factor and the protective antigenheptamer. Protein Sci. 15 (9), 2190–2200.

Tan, R.K., Devkota, B., Harvey, S.C., 2008. YUP.SCX: coaxing atomic models intomedium resolution electron density maps. J. Struct. Biol. 163 (2), 163–174.

Tanaka, H., Araki, T., 2000. Simulation method of colloidal suspensions withhydrodynamic interactions: fluid particle dynamics. Phys. Rev. Lett. 85 (6),1338–1341.

Taylor, D.J. et al., 2009. Comprehensive molecular structure of the eukaryoticribosome. Structure 17 (12), 1591–1604.

Terry, L.J., Shows, E.B., Wente, S.R., 2007. Crossing the nuclear envelope: hierarchicalregulation of nucleocytoplasmic transport. Science 318 (5855), 1412–1416.

Tokuyama, M., Oppenheim, I., 1994. Dynamics of hard-sphere suspensions. Phys.Rev. E.

Topf, M., Sali, A., 2005. Combining electron microscopy and comparative proteinstructure modeling. Curr. Opin. Struct. Biol. 15 (5), 578–585.

Topf, M. et al., 2005. Structural characterization of components of proteinassemblies by comparative modeling and electron cryo-microscopy. J. Struct.Biol. 149 (2), 191–203.

Topf, M. et al., 2006. Refinement of protein structures by iterative comparativemodeling and CryoEM density fitting. J. Mol. Biol. 357 (5), 1655–1668.

Topf, M. et al., 2008. Protein structure fitting and refinement guided by cryo-EMdensity. Structure 16 (2), 295–307.

Trabuco, L.G. et al., 2008. Flexible fitting of atomic structures into electronmicroscopy maps using molecular dynamics. Structure 16 (5), 673–683.

Tucci, K., Kapral, R., 2004. Mesoscopic model for diffusion-influenced reactiondynamics. J. Chem. Phys. 120 (7), 8262–8270.

Turner, T.E., Schnell, S., Burrage, K., 2004. Stochastic approaches for modellingin vivo reactions. Comput. Biol. Chem. 28 (3), 165–178.

Page 14: Journal of Structural Biology - University of Southern Californiaweb.cmb.usc.edu/people/alber/pdf/science_JSB2011.pdf · 2012. 7. 9. · to expand the scope of structural biology

496 M. Beck et al. / Journal of Structural Biology 173 (2011) 483–496

Urbina-Villalba, G., García-Sucre, M., Toro-Mendoza, J., 2003. Averagehydrodynamic correction for the Brownian dynamics calculation offlocculation rates. Phys. Rev. E. 68 (6), 61408-1–61408-9.

van Zon, J.S., ten Wolde, P.R., 2005. Green’s-function reaction dynamics: a particle-based approach for simulating biochemical networks in time and space. J.Chem. Phys. 123 (23), 234910.

Velazquez-Muriel, J.A. et al., 2006. Flexible fitting in 3D-EM guided by the structuralvariability of protein superfamilies. Structure 14 (7), 1115–1126.

Volkmann, N., Hanein, D., 1999. Quantitative fitting of atomic models into observeddensities derived by electron microscopy. J. Struct. Biol. 125 (2–3), 176–184.

Volkmann, N. et al., 2000. Evidence for cleft closure in actomyosin upon ADPrelease. Nat. Struct. Biol. 7 (12), 1147–1155.

Weiss, M. et al., 2004. Anomalous subdiffusion is a measure for cytoplasmiccrowding in living cells. Biophys. J. 87 (5), 3518–3524.

Wendt, T. et al., 2001. Three-dimensional image reconstruction ofdephosphorylated smooth muscle heavy meromyosin reveals asymmetry inthe interaction between myosin heads and placement of subfragment 2. Proc.Natl. Acad. Sci. USA 98 (8), 4361–4366.

Werner, J.N. et al., 2009. Quantitative genome-scale analysis of protein localizationin an asymmetric bacterium. Proc. Natl. Acad. Sci. USA 106 (19), 7858–7863.

Wieczorek, G., Zielenkiewicz, P., 2008. Influence of macromolecular crowding onprotein–protein association rates – a Brownian dynamics study. Biophys. J. 95(11), 5030–5036.

Wriggers, W., Chacon, P., 2001. Modeling tricks and fitting techniques formultiresolution structures. Structure (Camb) 9 (9), 779–788.

Wriggers, W., Milligan, R.A., McCammon, J.A., 1999. Situs: a package for dockingcrystal structures into low-resolution maps from electron microscopy. J. Struct.Biol. 125 (2–3), 185–195.

Wu, X. et al., 2003. A core-weighted fitting method for docking atomic structuresinto low-resolution maps: application to cryo-electron microscopy. J. Struct.Biol. 141 (1), 63–76.

Yonekura, K., Maki-Yonekura, S., Namba, K., 2003. Complete atomic model of thebacterial flagellar filament by electron cryomicroscopy. Nature 424 (6949),643–650.

Yu, X., Jin, L., Zhou, Z.H., 2008. 3.88 A structure of cytoplasmic polyhedrosis virus bycryo-electron microscopy. Nature 453 (7193), 415–419.

Yuan, S. et al., 2010. Structure of an apoptosome-procaspase-9 CARD complex.Structure 18 (5), 571–583.

Yus, E. et al., 2009. Impact of genome reduction on bacterial metabolism and itsregulation. Science 326 (5957), 1263–1268.

Zhang, X. et al., 2008. Near-atomic resolution using electron cryomicroscopy andsingle-particle reconstruction. Proc. Natl. Acad. Sci. USA 105 (6), 1867–1872.

Zhang, S. et al., 2010. A fast mathematical programming procedure for simultaneousfitting of assembly components into cryoEM density maps. Bioinformatoics(ISMB) 28 (12), i261–i268.

Zhou, Z.H., 2008. Towards atomic resolution structural determination by single-particle cryo-electron microscopy. Curr. Opin. Struct. Biol. 18 (2), 218–228.

Zhu, J. et al., 2010. Building and refining protein models within cryo-electronmicroscopy density maps based on homology modeling and multiscalestructure refinement. J. Mol. Biol. 397 (3), 835–851.