Hue Sun Chan Departments of Biochemistry and Molecular Genetics University of Toronto, Ontario M5S 1A8 Canada http://biochemistry.utoronto.ca/person/hue-sun-chan/ UofT Physics November 30, 2017 Physics of Protein Function and Evolution: From Sequence-Structure to Sequence-Ensemble Relationships


  • Hue Sun Chan Departments of Biochemistry and Molecular Genetics

    University of Toronto, Ontario M5S 1A8 Canada

    http://biochemistry.utoronto.ca/person/hue-sun-chan/

    UofT Physics November 30, 2017

    Physics of Protein Function and Evolution: From Sequence-Structure to

    Sequence-Ensemble Relationships

  • Figure from: L. Stryer, Biochemistry. W.H. Freeman & Co., NY (1981)

    Proteins are polymer chains of specific sequences of amino acids.

    bovine ribonuclease

    ● As a polymer, a protein molecule with n amino acids can adopt many (~μ^n) different shapes (conformations), many of which are open and disordered, others are more compact and ordered.

    ● All conformational states can be utilized by Nature to perform biological functions.
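
    The ~μ^n growth in the number of conformations can be illustrated with a toy count. The sketch below is not taken from the talk: it assumes a chain modeled as a self-avoiding walk on the 2D square lattice (a hypothetical stand-in for a real polypeptide) and counts walks by brute-force enumeration. The ratio of successive counts gives a rough estimate of the growth constant μ for that lattice.

```python
def count_self_avoiding_walks(n_steps):
    """Count self-avoiding walks with n_steps bonds on the 2D square lattice."""
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(x, y, visited, remaining):
        # Depth-first enumeration; 'visited' holds the sites already occupied.
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt[0], nxt[1], visited, remaining - 1)
                visited.remove(nxt)
        return total

    return extend(0, 0, {(0, 0)}, n_steps)


# A chain of n residues has n - 1 bonds; here we index by number of bonds.
counts = [count_self_avoiding_walks(n) for n in range(1, 9)]
ratios = [counts[i + 1] / counts[i] for i in range(len(counts) - 1)]
print(counts)
print(ratios)
```

    The counts include conformations related by rotation and reflection; removing those symmetries changes the prefactor but not the exponential growth.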

  • Figure from: Radisky & Koshland,

    PNAS 99:10316 (2002)

    CI2

    subtilisin

    chymotrypsin inhibitor 2 (CI2)

    The “classical” Sequence-(Folded) Structure-Function Paradigm

    Example of a protein functioning in its folded form:

  • Figure Credit: Pomès Group, The Hospital for Sick

    Children & University of Toronto

    Example: The spring/rubber-like elasticity of skin, lungs, blood vessels, and uterine tissue is imparted by the protein elastin.

    Entropy-driven rubber elasticity: The functional conformations of some proteins are disordered

    Figure from: Rauscher & Pomès, “Structural disorder and protein elasticity”. In: Fuzziness: Structural

    Disorder in Protein Complexes. Edited by Fuxreiter & Tompa. Springer (2012).

  • Figure from: Chan, Zhang, Wallin & Liu, Annu Rev Phys Chem (2011)

    Intrinsically Disordered Proteins (IDPs)

    ● IDPs do not fold spontaneously

    ● IDPs perform prominent functions in cellular signaling and regulation

    ● IDPs are made up of “low complexity” amino acid sequences

    ● Compared with sequences for globular proteins, IDP sequences have more polar, charged, and aromatic residues and fewer hydrophobic/nonpolar residues

    Some IDPs fold (become ordered) upon binding:

    phosphorylated kinase-inducible domain (pKID)

    kinase-inducible domain interacting domain (KIX)

  • Mittag, Orlicky, Choy, Tang, Lin, Sicheri, Kay, Tyers & Forman-Kay, PNAS 105:17772-17777 (2008)

    NMR experiments show that multiple phosphorylated

    sites of Sic1 engage Cdc4 without global ordering

    a dynamic, “fuzzy” complex

    Cdc4

    7pSic1 dynamic complex

    Some IDPs remain largely disordered even upon binding, forming “fuzzy complexes”:

    Borg, Mittag, Pawson, Tyers, Forman-Kay & Chan, PNAS 104:9650-9655 (2007)

  • Some IDPs function not only as individual molecules but also collectively by undergoing phase separation on a mesoscopic

    length scale to form condensed liquid-phase IDP-rich droplets that may encompass RNA and other biomolecules.

    ■ A phase-separated “assemblage” is formed reversibly when a critical

    concentration is reached. The formation of the assemblage provides spatial

    organization that can be regulated. Its liquid-phase properties allow exchange

    of constituent molecules with the surrounding solution to facilitate localization

    of and biochemical interactions with other protein and/or RNA species.

    Figure: Toretsky & Wright, J Cell Biol (2014)

  • Zeng et al. & Zhang, Cell 166:1163–1175 (2016) http://www.ust.hk/

    Protein Phase Transition in Synaptic Function

  • Organelles: Substructures - “little organs” - of the Cell

    Organelles can be membrane-bound or membrane-less

    Plant cell Animal cell

    Figures from: Encyclopædia Britannica (2010)

  • Membraneless Organelles underpinned by IDP phase separation

    A form of cellular compartmentalization

    P granules as liquid droplets.

    ◄Fluorescence recovery after photobleaching.

    ▼Two P granules fuse.

    Figure: Hyman, Weber & Jülicher, Annu Rev Cell Dev Biol (2014)

    One-cell stage of C. elegans embryo

    Figure: Wang & Seydoux, Curr Biol (2014)

    Adult hermaphrodite gonad

    Figure: Biology Reference http://www.biologyreference.com/Mo-Nu/Nucleolus.html

    Nucleolus (“ribosome factory”)

    as coexisting liquid phases

    [see Feric et al., Cell (2016)]

  • Nucleoli in C. elegans

    Image credit: Stephanie Weber, Department of Biology, McGill University

    Weber & Brangwynne, Curr Biol 25:641–646 (2015)

  • How do the basic physical forces

    encoded by protein amino acid

    sequences give rise to these

    remarkable biological phenomena?

    ● Conformational Switches Between Folded Globular Structures

    ● Liquid-Liquid Phase Separation of Intrinsically Disordered Proteins

  • Hydrophobic Interaction is a Major Driving Force for Protein Folding

    Chan & Dill, Physics Today (1993)

    Kauzmann (1959), Dill (1990), etc.

  • Lau & Dill, Macromolecules (1989); Chan & Dill, J Chem Phys (1991) ▪ Figure from: Chan & Dill, Physics Today (1993)

    A Simple Exact Biophysical Model: The Hydrophobic-Polar (HP) Model
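
    As a concrete illustration of the HP model, the sketch below scores a chain conformation on the 2D square lattice by counting contacts between H residues that are lattice neighbours but not bonded along the chain. The energy of -1 per H-H contact, the function name hp_energy and the four-residue example are assumptions made for illustration, not details taken from the slides.

```python
def hp_energy(sequence, coords):
    """HP-model energy: -1 per H-H contact between residues that are lattice
    neighbours but not adjacent along the chain (assumed unit contact energy)."""
    if len(sequence) != len(coords) or len(set(coords)) != len(coords):
        raise ValueError("need one distinct lattice site per residue")
    position = {site: i for i, site in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Looking only right and up counts each neighbouring pair exactly once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                energy -= 1
    return energy


# Hypothetical 4-residue example: a compact square versus an extended chain.
compact = [(0, 0), (1, 0), (1, 1), (0, 1)]
extended = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(hp_energy("HPPH", compact), hp_energy("HPPH", extended))
```

    Because the lattice and alphabet are so small, every conformation of a short chain can be enumerated exhaustively, which is what makes the model "exact".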

  • Superfunnels: Correlation between Thermodynamic and Mutational Stabilities

    Neutral net of sequences encoding for a given structure tends to be organized around a “prototype sequence” with maximum thermodynamic stability and mutational stability (robustness).

    prototype sequence

    Bornberg-Bauer & Chan, PNAS (1999);

    Wroe, Bornberg-Bauer & Chan, Biophys J (2005)

    ● Sequences with higher mutational robustness tend to have higher steady-state populations under evolutionary dynamics; cf. van Nimwegen et al., PNAS 96, 9716 (1999)
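
    The link between robustness and steady-state population can be sketched on a toy neutral net. In the large-population limit analyzed by van Nimwegen et al., the steady-state distribution over a neutral network follows the principal eigenvector of the network's adjacency matrix. The five-node graph below is hypothetical and the power iteration is a generic method chosen for illustration; neither is taken from the cited papers.

```python
def steady_state_population(adjacency, n_iter=500):
    """Principal eigenvector of a neutral-net adjacency matrix, normalised to
    sum to 1, obtained by power iteration on (A + I)."""
    n = len(adjacency)
    pop = [1.0 / n] * n
    for _ in range(n_iter):
        # Adding the identity keeps the same eigenvectors but lets the
        # iteration converge on bipartite graphs as well.
        new = [pop[i] + sum(adjacency[i][j] * pop[j] for j in range(n))
               for i in range(n)]
        total = sum(new)
        pop = [p / total for p in new]
    return pop


# Hypothetical neutral net: node 0 is the most connected ("prototype") sequence.
edges = [(0, 1), (0, 2), (0, 3), (1, 4)]
adjacency = [[0] * 5 for _ in range(5)]
for a, b in edges:
    adjacency[a][b] = adjacency[b][a] = 1

population = steady_state_population(adjacency)
print(population)
```

    In this toy net the best-connected node carries the largest share of the population, and nodes at the periphery carry the least.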

  • Evolutionary Paths and Conformational Switches in

    Sequence Space

    Lipman & Wilbur, Proc R Soc London B (1991)

    Neutral net Neutral net

  • Figure axis: structural/sequence similarity to target

    Latent Evolutionary Potentials: Selection of excited-state (promiscuous) functions can speed up evolution dramatically

    Wroe, Chan & Bornberg-Bauer, HFSP J (2007)

    with excited-state selection

    without excited-state selection

    cf. Amitai, Gupta & Tawfik, HFSP J 1, 67 (2007); Tokuriki & Tawfik, Science 324, 203 (2009) for a review.

  • Role of excited-state selection in Escape from Adaptive Conflict

    The “Escape from Adaptive Conflict” Perspective focuses on adaptation before gene duplication [Hittinger & Carroll, Nature 449, 677 (2007); Des Marais & Rausher, Nature 454, 762 (2008)]

    Neofunctionalization Subfunctionalization

    specialist generalist

    gene duplication

  • Sikosek, Chan & Bornberg-Bauer, PNAS (2012)

    Escape from Adaptive Conflict Follows from Weak Functional Trade-Offs and Mutational Robustness.

    ● Biophysics-based network connections.

    ● Evolutionary dynamics under mutations and gene duplications computed using both an analytical master equation and stochastic Monte Carlo simulations.

    ● Fitness is proportional to the stability (concentration) of the functional structures up to a certain optimum concentration above which fitness does not increase further with concentration.

    ● The optimal concentration corresponds to a measure of selection pressure.
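
    The saturating fitness rule described above can be written down in a few lines. This is a minimal sketch under stated assumptions: fitness is normalised to 1 at the plateau, and the concentration of the functional structure is approximated by its Boltzmann population among a list of conformational energies. The function names and these two choices are illustrative, not the published model's exact definitions.

```python
import math


def boltzmann_population(energies, index, kT=1.0):
    """Fractional population of conformation 'index' at thermal energy kT."""
    weights = [math.exp(-e / kT) for e in energies]
    return weights[index] / sum(weights)


def fitness(concentration, optimum):
    """Fitness rises linearly with concentration and saturates at the optimum
    (normalised so the plateau value is 1; the normalisation is an assumption)."""
    if optimum <= 0:
        raise ValueError("optimum must be positive")
    return min(max(concentration, 0.0), optimum) / optimum


# Hypothetical two-state example with equal energies: population 0.5.
pop = boltzmann_population([0.0, 0.0], index=0)
print(fitness(pop, optimum=0.8), fitness(pop, optimum=0.4))
```

    A lower optimum means the plateau is reached at a smaller population of the functional structure, so more sequences are equally fit.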

  • Sikosek et al., PNAS (2012); Sikosek et al., PLoS Comput Biol (2012); Sikosek & Chan, J R Soc Interface (2014)

    Evolving a new folded structure: traversing two superfunnels

  • Sikosek & Chan, J R Soc Interface (2014)

    Subfunctionalization can be driven solely by Mutational Robustness

    Figure axis: mutational robustness

    generalists less robust than specialists

  • Figure from: Ugalde, Chang & Matz, Science 305, 1433 (2004)

    Escape from Adaptive Conflict (EAC) in the real world

    Experiments showing that a reconstructed common ancestor of the fluorescent proteins in corals that emit either red or green

    light can emit light of both colors are indicative of EAC.

  • Sikosek & Chan (2014)

  • Experimentally designed bi-stable proteins and mutation-induced conformational switches

    Bouvignies et al., Nature (2011)

    Cordes et al., Nature Struct Biol (2000)

    Meier et al., Curr Biol (2007)

    Alexander et al., PNAS (2009)

    Anderson et al., Protein Eng Des Sel (2011)

    Figure from: Sikosek & Chan, J R Soc Interface (2014)

  • The GA/GB System: An Experimentally Designed Conformational Switch

    Figure from: Alexander et al. (2009)

    GA: 3α GB: 4β+α

    ● One-mutation switch between the human serum albumin-binding domain (GA) and the IgG-binding domain (GB) of Streptococcus protein G [Alexander, He, Chen, Orban & Bryan, PNAS 106, 21149 (2009)]. NMR structures were determined for GA98 & GB98 (the 56aa sequences are 98% identical and differ only by a single L45Y substitution).

    Can theory capture the biophysics of this switching behavior?

    Explicit-water molecular dynamics simulations using current force fields cannot account for this behavior to date [van Gunsteren and coworkers, Biochemistry 50:10965 (2011); 52:4962 (2013)]

  • Shea et al., PNAS (1999); Micheletti et al., Phys Rev Lett (1999); Clementi et al., JMB (2000); Koga & Takada, JMB (2001); Kaya & Chan, JMB (2003)

    Gō (Native-Centric, Structure-Based) Protein Chain Models

    CI2

    Taketomi, Ueda & Gō, Int J Peptide Res (1975)

  • “Hybrid Models”

    Augmenting the native-centric (SBM) potential with sequence-dependent (transferrable) physical nonnative effects such as hydrophobic interactions

    Equation term labels: total, native-centric, sequence-dependent

    [cf. Shea et al., J Chem Phys (1998); Clementi & Plotkin, Protein Sci (2004); Pogorelov & Luthey-Schulten, Biophys J (2004)]

    Zarrine-Afsar, Wallin, Neculai, Neudecker, Howell, Davidson & Chan, PNAS (2008)

  • Sikosek, Krobath & Chan, PLoS Comput Biol (2016)

    Explicit-chain model for the GA/GB system:

    A bi-stable, multiple-structure structure-based (native-centric)

    potential (SBM)

  • Sikosek, Krobath & Chan, PLoS Comput Biol (2016)

    *Irbäck & Mohanty, J Comput Chem 27:1548-55 (2006)

    ‡Chen, Song & Chan, Curr Opin Struct Biol (2015)

    The experimental GA/GB trend is captured by an explicit-chain hybrid‡ atomic model

    bi-stable Gō (native-centric SBM) + PROFASI* (simple, physical & transferrable)

    Figure: −T ln(population) plotted against QA and QB (axis ticks 0 and 1); panel label: L45Y
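
    The quantity −T ln(population) on the figure axis is a free energy profile built from how often each value of an order parameter is visited in a simulation. The sketch below shows the idea in one dimension, assuming samples of a native-contact fraction Q in [0, 1] and units with k_B = 1. The function name, the binning and the sample data are hypothetical, and the slide's landscape is two-dimensional (QA and QB).

```python
import math
from collections import Counter


def free_energy_profile(q_samples, n_bins=10, temperature=1.0):
    """Map each occupied bin centre to -T ln(population) for samples in [0, 1]."""
    # Clamp q = 1.0 into the last bin instead of creating an extra one.
    counts = Counter(min(int(q * n_bins), n_bins - 1) for q in q_samples)
    total = len(q_samples)
    return {(b + 0.5) / n_bins: -temperature * math.log(c / total)
            for b, c in sorted(counts.items())}


# Hypothetical samples: three visits to a low-Q state, one to a high-Q state.
profile = free_energy_profile([0.05, 0.05, 0.05, 0.95])
for q_centre, f_value in profile.items():
    print(q_centre, f_value)
```

    Bins that are never visited have no finite free energy and are simply omitted from the returned profile.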

  • Summary

    ■ Biophysical sequence-to-structure mappings based upon simple explicit-chain protein models are versatile conceptual tools for addressing general principles of evolution.

    ■ Subfunctionalization after duplication of a bi-stable gene with dual functions can be driven by sequence-space topology (i.e., mutational robustness) in an essentially nonadaptive manner.

    ■ The hybrid approach to modeling protein folding can be applied in the context of a simple atomic potential to rationalize the GA/GB conformational switch. Our physics-based analysis suggests a significant role of nonpolar and aromatic interactions in the striking behavior of this system. But probably much remains to be learned before we can provide an account based entirely on a transferrable interaction potential.

    Figure: Mutational effects on the GA/GB energy landscape (Sikosek et al., 2016)

  • Experimental study by: Nott, Petsalaki, Farber, Jervis, Fussner, Plochowietz, Craggs,

    Bazett-Jones, Pawson, Forman-Kay & Baldwin, Mol Cell (2015)

    Intrinsically Disordered N Terminus of RNA Helicase Ddx4 Forms Organelles in Cells and in vitro

    ● Ddx4 proteins are essential for the assembly and maintenance of the related nuage in mammals, P-granules in worms, and pole plasm and polar granules in flies.

    Modeling Liquid-Liquid Phase Separation of Intrinsically Disordered Proteins

  • Results & Figure from: Nott et al., Mol Cell (2015)

    HeLa cells

    Ddx4 Spontaneously Self-Assembles to Form Organelles in Live Cells

  • 150 mM → ~150 μM ionic strength

    in live cell:

    in vitro:

    Fluorescence recovery after photobleaching (panels: in vitro, in cell)

    Ddx4 Reversibly Forms Organelles In Live Cell and In Vitro

    Results & Figure from: Nott et al., Mol Cell (2015)

  • Charge-Scrambled and F→A Mutants of Ddx4 Do Not Form

    Organelles in Cell or in vitro under Physiological Conditions

    Results & Figure from: Nott et al., Mol Cell (2015)

    Sequence dependence of IDP liquid-liquid phase separation:

  • An Approximate Analytical Theory for Electrostatics-Driven Sequence-Dependent Heteropolymer Phase Separation

    ■ The approximate partition function depends only on the 0th-order and 2-body correlation of density (ρ) fluctuation:

    ■ This approach is known as “random phase approximation” (RPA)*.

    ■ RPA accounts for pairwise electrostatic interactions but neglects higher-order density correlations arising from chain connectivity.

    ■ RPA provides an approximate account of chain connectivity and hence sequence-dependent interactions (beyond mean-field).

    density ρ(r) ~ concentration c(r)

    biphasic region

    Figures from: Doi & Edwards, The Theory of Polymer Dynamics (Oxford 1986).

    *see, e.g., Mahdi & Olvera de la Cruz, Macromolecules 33:7649 (2000)

  • ■ Free energy per unit volume:

    ■ Flory-Huggins (FH) mixing and excluded volume entropy:

    ■ Electrostatic free energy in RPA:

    where Ĝ_k, Û_k, and ρ̂ are (N + 2) × (N + 2) matrices; N = chain length. The formulation is set up to account for the charge pattern of the N amino acids along the sequence.

    ■ The other 2 components are for the +/‒ ions (salt and/or counterions) in solution. A dielectric constant is used but electrostatic screening is treated directly (not via Debye length).

    Lin, Forman-Kay & Chan, Phys Rev Lett (2016); Lin, Song, Forman-Kay & Chan, J Mol Liquids (2017)

    Theory of Sequence-Dependent Polyampholyte Phase Separation
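
    To give a feel for the Flory-Huggins part of such a free energy, the sketch below uses the textbook two-component (polymer plus solvent) form with a single contact parameter χ. This is a deliberate simplification and an assumption: the theory on this slide also includes salt and counterion terms and obtains the sequence-dependent attraction from the RPA electrostatic term rather than from χ. The function names are hypothetical.

```python
import math


def fh_free_energy(phi, n, chi):
    """Flory-Huggins free energy per lattice site in units of k_B T."""
    return ((phi / n) * math.log(phi) + (1 - phi) * math.log(1 - phi)
            + chi * phi * (1 - phi))


def fh_curvature(phi, n, chi):
    """Second derivative of fh_free_energy with respect to phi."""
    return 1 / (n * phi) + 1 / (1 - phi) - 2 * chi


def fh_critical_point(n):
    """Critical volume fraction and critical chi for chain length n."""
    root_n = math.sqrt(n)
    return 1 / (1 + root_n), 0.5 * (1 + 1 / root_n) ** 2


def spinodal(n, chi, grid=2000):
    """Range of volume fractions with negative curvature (locally unstable),
    located on a uniform grid; None if the mixture is stable everywhere."""
    unstable = [i / grid for i in range(1, grid)
                if fh_curvature(i / grid, n, chi) < 0]
    return (unstable[0], unstable[-1]) if unstable else None


phi_c, chi_c = fh_critical_point(100)
print(phi_c, chi_c)
print(spinodal(100, 0.5), spinodal(100, 0.8))
```

    Below the critical χ the curvature is positive everywhere and no phase separation occurs; above it, an unstable window opens around the critical volume fraction.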

  • Lin, Forman-Kay & Chan, Phys Rev Lett (2016); Lin, Song, Forman-Kay & Chan, J Mol Liquids (2017)

    positive, negative, aromatic

    salt-free and salt-dependent co-existence curves

    Consistent with experiment, RPA predicts a significantly higher propensity for wildtype Ddx4N1 than the charge-scrambled mutant Ddx4N1CS to phase separate

    wildtype charges exhibit block-like properties

  • Lin, Forman-Kay & Chan, Phys Rev Lett (2016); Lin, Song, Forman-Kay & Chan, J Mol Liquids (2017)

    Figure from: Meyer, Castellano & Diederich, Angew Chem Int Ed (2003)

    Figure labels: π-π stacking, cation-π, O-H/π

    Augmented RPA+FH Theory that accounts also for π-interactions and possibly other effects

    fe → fe + fFH, where fFH is a mean-field term for π-interactions

    agrees well with experiments

    π-interactions

  • Figures from: R. K. Das & R. V. Pappu, Proc Natl Acad Sci USA 110:13392–13397 (2013)

    same number of + and ‒ charges

    Average radius of gyration

    κ: a charge pattern parameter that quantifies local deviations from global charge asymmetry

    Single-chain IDP conformational dimensions are highly sensitive to not

    only total positive and negative charges but also the exact charge pattern

  • Y.-H. Lin & H.S. Chan, Biophys J (2017)

    Multiple-chain phase separation and single-chain conformational compactness of charged disordered proteins are strongly correlated

    critical temperature radius of gyration

    γ ≈ 5.8

  • Single-chain conformational compactness and multiple-chain

    phase separation are favored by similar block-like charge

    patterns that promote sequence-nonlocal attraction

    Y.-H. Lin & H.S. Chan, Biophys J (2017)

  • Sawle & Ghosh, J Chem Phys (2015)

    Das & Pappu, PNAS (2013)

    Sequence charge pattern parameters are predictive of

    conformational dimensions and phase separation tendency

    Y.-H. Lin & H.S. Chan, Biophys J (2017)
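
    One of the charge pattern parameters referred to here, the "sequence charge decoration" (SCD) of Sawle & Ghosh, is simple enough to compute directly: SCD = (1/N) Σ_{i<j} q_i q_j |i − j|^(1/2). The sketch below is an illustration only. The charge assignment (K, R = +1; D, E = −1; everything else, including histidine, neutral) and the two example sequences are assumptions chosen for the demonstration.

```python
def charges_from_sequence(seq):
    """Assign +1 to K/R, -1 to D/E, 0 otherwise (assumed charge states)."""
    table = {"K": 1, "R": 1, "D": -1, "E": -1}
    return [table.get(aa, 0) for aa in seq]


def sequence_charge_decoration(charges):
    """SCD = (1/N) * sum over pairs i < j of q_i * q_j * sqrt(j - i)."""
    n = len(charges)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += charges[i] * charges[j] * (j - i) ** 0.5
    return total / n


# Same composition, different patterns: strictly alternating versus diblock.
alternating = "EK" * 25
diblock = "E" * 25 + "K" * 25
scd_alt = sequence_charge_decoration(charges_from_sequence(alternating))
scd_block = sequence_charge_decoration(charges_from_sequence(diblock))
print(scd_alt, scd_block)
```

    Both sequences have zero net charge, yet the diblock has a far more negative SCD, in line with block-like patterns promoting sequence-nonlocal attraction.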

  • FIB1

    NPM1

    Image credit: Marina Feric & Cliff Brangwynne, Princeton University.

    From: New J Phys Focus on “Phase Transitions in Cells: From Metastable Droplets to Cytoplasmic Assemblies”

    How do different IDPs find one another to form the many separate intracellular compartments and subcompartments? Why don’t they all condense together into a large gemisch? A multivalent, stochastic, “fuzzy” mode of molecular recognition?

    nucleoli

  • Lin, Brady, Forman-Kay & Chan, New J Phys (2017)

  • Binary Coexistence of Two Charged Sequences: A step toward understanding the mechanisms of molecular recognition in IDP phase separation

    generalization:

    generalization:

    Lin, Brady, Forman-Kay & Chan, New J Phys (2017)

  • Coexistence Conditions are determined numerically by identifying phase-separated volume fractions that are consistent with the bulk volume fractions but yield a lower free energy

    Lin, Brady, Forman-Kay & Chan, New J Phys (2017)
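
    The numerical idea stated on this slide can be sketched for the simplest case of a single solute species. The code below is a hedged illustration, not the authors' procedure: it uses a Flory-Huggins free energy as a stand-in for the full RPA expression, treats one species instead of two, and replaces a proper numerical solver with a brute-force grid search. For each candidate pair of phase compositions, the lever rule fixes the phase volumes so that the bulk volume fraction is conserved, and the pair with the lowest total free energy is kept if it beats the mixed state.

```python
import math


def free_energy(phi, n=1, chi=3.0):
    """Flory-Huggins free energy density (k_B T units); stand-in for the theory."""
    return ((phi / n) * math.log(phi) + (1 - phi) * math.log(1 - phi)
            + chi * phi * (1 - phi))


def coexistence(phi_bulk, f, grid=200):
    """Grid-search two phases (phi_a < phi_bulk < phi_b) that conserve the bulk
    volume fraction and minimise the total free energy.
    Returns (phi_a, phi_b), or None if the mixed state is already lowest."""
    points = [i / grid for i in range(1, grid)]
    low = [p for p in points if p < phi_bulk]
    high = [p for p in points if p > phi_bulk]
    best, best_f = None, f(phi_bulk) - 1e-12
    for a in low:
        for b in high:
            frac_a = (b - phi_bulk) / (b - a)  # lever rule: volume share of phase a
            total = frac_a * f(a) + (1 - frac_a) * f(b)
            if total < best_f:
                best, best_f = (a, b), total
    return best


separated = coexistence(0.3, lambda p: free_energy(p, 1, 3.0))
mixed = coexistence(0.3, lambda p: free_energy(p, 1, 1.0))
print(separated, mixed)
```

    With two polyampholyte species the search runs over two volume fractions per phase, but the logic is the same: conserve the bulk composition and accept a split only when it lowers the free energy.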

  • T* = 4

    The Binary Phase Diagram (Pattern of Coexistence) for a Pair of Polyampholytes

    Varies Significantly with the Charge Patterns Along their Sequences

    Lin, Brady, Forman-Kay & Chan, New J Phys (2017)

  • Lin, Brady, Forman-Kay & Chan, New J Phys (2017)

    Asymmetry in the concentration ratios of two polyampholytes 1, 2 in the two phase-separated states α, β correlates with the difference in the sequences’ charge patterns

  • Summary

    Some membraneless organelles are essentially condensed liquids underpinned by IDP liquid-liquid phase separation. It is a fundamental form of cellular compartmentalization enabling spatial and temporal organization of biomolecular processes.

    Some instances of IDP phase separation are associated with pathologies.

    The phase behaviors of IDPs – involving a single or multiple IDP species – are based on sequence-dependent multivalent interactions.

    RPA provides a reasonable account of the effect of charge pattern on phase separation, as exemplified by the comparison with experimental data on Ddx4.

    Charge pattern matching can be a “fuzzy” mode of molecular recognition for partitioning different IDPs into different membraneless organelles and their subcompartments.

    Future efforts should be extended to treat the sequence dependence of other forms of interactions and to assess the accuracy of the analytical theory by explicit-chain simulations.

  • Current group members: Suman DAS, Yi-Hsuan LIN, Alan AMIN

    Former group members: Artem Badasyan, Mikael Borg, Tao Chen, Cristiano Dias, Allison Ferguson, Loan Huynh, Hüseyin Kaya, Michael Knott, Heinrich Krobath, Zhirong Liu, Maria Sabaye Moghaddam, Seishi Shimizu, Tobias Sikosek, Jianhui Song, Stefan Wallin, Zhuqing Zhang

    Coworkers

    University of Toronto: Prof. Julie D. Forman-Kay, Dr. Patrick Farber, Dr. Veronika Csizmok; Prof. Claudiu C. Gradinaru, Gregory-Neal Gomes; Prof. Régis Pomès

    University of Münster: Prof. Erich Bornberg-Bauer

    Baylor College of Medicine: Prof. E. Lynn Zechiedrich, Jennifer K. Mann

    Peking University: Prof. Zhirong Liu

    Supported by the Canadian Institutes of Health Research and the Natural Sciences and Engineering Research Council of Canada