Proteins
Attila Ambrus
versatile functions in biological systems
linear polymers of amino acids
spontaneous folding to 3D structures that eventually determines function
wide range of functional groups, most of them are chemically reactive
functional groups account for function (e.g. enzymes)
complexes with other biomacromolecules (proteins, RNA/DNA, lipids, carbohydrates, inorganics (e.g. ions), etc.) adopt even more functionalities that proteins alone lack
structure dictates function
(DNA replication machinery)
some proteins are rigid, some are flexible: rigid proteins may work for connective tissues or cytoskeleton while flexible ones can assemble with other molecules for more complex functions (e.g. transmit some kind of information in or between cells)
flexibility and function (the protein lactoferrin undergoes a substantial conformational change upon binding Fe³⁺; apo- and holo-enzymes)
Alpha-amino acids
building blocks of proteins
four different substituents around (alpha-)carbon: chirality (except Gly)
“side chain”
L=S (except for cysteine)
absolute configuration (Cahn–Ingold–Prelog [CIP] system)
CORN rule: viewed with the H atom pointing away from the viewer, if COOH, R, NH2 are arranged clockwise: D-form, anticlockwise: L-form
Ionization state of amino acids as a function of pH (without side chain contributions)
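The pH dependence named in this slide title can be sketched with the Henderson-Hasselbalch relation applied to the two backbone groups. This is only an illustration: the pKa values (about 2.3 for the alpha-carboxyl and about 9.6 for the alpha-amino group) are typical textbook figures assumed here, not values taken from the slides, and the function name is hypothetical.

```python
def backbone_net_charge(ph, pka_carboxyl=2.3, pka_amino=9.6):
    """Average net charge of an amino acid whose side chain does not ionize.

    The default pKa values are assumed, typical figures; real values
    differ slightly from one amino acid to another.
    """
    # Fraction of carboxyl groups that are deprotonated (-COO-), charge -1 each
    frac_carboxylate = 1.0 / (1.0 + 10 ** (pka_carboxyl - ph))
    # Fraction of amino groups that are protonated (-NH3+), charge +1 each
    frac_ammonium = 1.0 / (1.0 + 10 ** (ph - pka_amino))
    return frac_ammonium - frac_carboxylate


# Cationic at low pH, zwitterionic (net ~0) near neutral pH, anionic at high pH
for ph in (1.0, 5.95, 13.0):
    print(ph, backbone_net_charge(ph))
```

With these assumed pKa values the net charge crosses zero midway between them, at pH (2.3 + 9.6) / 2 = 5.95, which is the isoelectric point of such an amino acid.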
Side chains
they differ in size, shape, charge, H-bonding capacity, hydrophobic character and chemical reactivity
twenty amino acids build up all proteins in all species in the evolutionary tree (with few exceptions; this “alphabet” is several billion years old)
hydrophobic effect in proteins: hydrophobic core resisting contact with water (apolar character), multimerization surfaces (protein-protein interactions)
polar side chains prefer being on the surface contacting water
Proline is a special amino acid
the ring structure markedly influences local protein structure due to its rigid nature (see also cis/trans peptide bonds later)
Aromatic side chains
reactive
Determination of protein concentration
the numbers of Tyr and Trp residues and S-S bonds account for the molar extinction coefficient (ε, at 280 nm) of a protein
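A minimal sketch of how such a count can be turned into a concentration estimate. The per-chromophore coefficients used here (5500 for Trp, 1490 for Tyr and 125 M^-1 cm^-1 per disulfide bond at 280 nm) are the commonly used values of Pace et al. (1995); they are an assumption of this example and are not given in the slides. Function names are hypothetical.

```python
def extinction_coefficient_280(sequence, n_disulfides=0):
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1).

    `sequence` uses one-letter amino acid codes; `n_disulfides` is the
    number of S-S bonds (cystines), which must be supplied by the user.
    Coefficients are assumed literature values, not from the lecture.
    """
    seq = sequence.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * n_disulfides


def concentration_molar(a280, sequence, n_disulfides=0, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * path length)."""
    eps = extinction_coefficient_280(sequence, n_disulfides)
    if eps == 0:
        raise ValueError("no Trp, Tyr or S-S bonds: A280 cannot be used")
    return a280 / (eps * path_cm)


# Example: the pentapeptide Tyr-Gly-Gly-Phe-Leu has a single Tyr
print(extinction_coefficient_280("YGGFL"))
print(concentration_molar(0.149, "YGGFL"))
```

The estimate only applies to proteins that contain Trp, Tyr or disulfide bonds, which is why the sketch raises an error when the computed coefficient is zero.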
Polar/uncharged amino acids
reactive
additional asymmetric center
much more reactive than -OH
two Cys –SH groups can form a disulfide bond (-S-S-, by oxidation, forming cystine), which is particularly important in stabilizing the 3D structure of proteins
Cysteine is also special in a way…
Polar/charged amino acids
at near neutral pH, depending on local environment (catalytic effects,
enzyme active centers)
Aspartic acid
Glutamic acid
in special environments/settings in a protein, Asp/Glu can be (partially or transiently) protonated, which generally has an important functional role in enzymatic mechanisms
Why these amino acids (why not others)?
they are versatile enough for structure and function of necessary proteins/enzymes for life
they were probably available from prebiotic reactions (before the origin of life)
other possible amino acids may be too reactive for the purpose (e.g. homoserine or homocysteine)
spontaneous cyclization (limitations for protein structure)
(amide bond)
endergonic reaction under most conditions, needs input of free energy
the peptide bond is kinetically stabilized (metastable): its lifetime in water is ~1000 years in the absence of a catalyst
in folded proteins the trans isomer dominates overwhelmingly (~1000:1)
(for X-Pro peptide bonds this ratio is only ~3:1, because the two isomers have similar energies)
two resonance forms, Ea = ~20 kcal/mol, less reactive than esters; detection of the peptide bond: at 190-230 nm (UV spectrometry)
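The trans:cis ratios quoted above can be converted into approximate free energy differences with ΔG = RT ln(ratio). The sketch below assumes that the ratios are equilibrium populations and that the temperature is 25 °C; neither is stated in the slides, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def free_energy_gap_kcal(trans_to_cis_ratio, temperature_k=298.15):
    """Free energy by which trans is favored over cis, in kcal/mol.

    Assumes the ratio is an equilibrium population ratio, so that
    delta_G = R * T * ln(ratio).
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(trans_to_cis_ratio)


# Ordinary peptide bond (~1000:1) versus X-Pro peptide bond (~3:1)
print(round(free_energy_gap_kcal(1000), 2))
print(round(free_energy_gap_kcal(3), 2))
```

Under these assumptions the ordinary peptide bond favors trans by roughly 4 kcal/mol, while for X-Pro the gap is well under 1 kcal/mol, consistent with the "similar state of energy" remark.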
Peptide bond residue
dihedral angle: ω = 0° for cis, 180° for trans isomer (isomerization is slow [10-100 s], but can be facilitated by peptidyl-prolyl isomerases; normal protein folding takes 10-100 ms)
condensation
steric clashes in cis configuration
with proline the magnitudes of the effects are similar
relatively high dipole moment in the double-bonded form (~3.5 D), lining up these dipoles e.g. in an alpha-helix produces great net dipole moments (important in physico-chemical properties of proteins)
peptide bonds (proteins) can be broken down to amino acids (or smaller peptides) chemically by acids or bases (generally with 6 M HCl, 110 °C, 18-96 h or 2-4 M NaOH, 100 °C, 4-8 h) or enzymatically by peptidases (proteinases, proteases, see later)
60% 40%
Protein termini
the protein chain has a polarity (the two ends of the chain are different)
by convention, the amino (–NH2) terminus is written first in the sequence (Leu-Phe-Gly-Gly-Tyr is therefore a different oligopeptide from Tyr-Gly-Gly-Phe-Leu, with different properties)
distinctive side chains (variable parts)
main chain or backbone (repeating/constant)
Backbone and side chains
there are great H-bonding potentials in the backbone: N-H is a good donor, C=O is a good acceptor
they interact with one another and with functional groups from side chains and stabilize structural elements in proteins
proteins generally contain 50-2000 amino acids (one muscle protein, titin, contains ~27,000 amino acids)
sequences of small numbers of amino acids are called oligopeptides or just peptides (although if they already serve protein-like functions, they may be called miniproteins)
the average molecular weight of an amino acid is ~110 g/mol,
hence the molecular weight (MW) of a protein generally
ranges from 5,500 to 220,000 g/mol
the Dalton is also used as a unit of molecular weight for biomacromolecules (after John Dalton [1766-1844], who in 1803 suggested the weight of an H atom as the unit of atomic mass; since 1961 12C has been the basis of atomic weight, especially due to the discovery of elemental isotopes in 1912). The unit can be written as Da, D, d and kDa, kD, kd;
it is numerically practically the same as the regular MW, so for example:
50 kDa = ~50,000 g/mol
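The size range above follows directly from the ~110 g/mol average residue weight. A small sketch of the arithmetic (the function name is hypothetical, and the result is only a rough estimate because real residue masses vary with composition):

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # average amino acid residue weight from the text


def estimate_mass_kda(n_residues):
    """Rough molecular weight of a protein in kDa from its residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Lower and upper ends of the typical range, and the ~27,000-residue muscle protein
for n in (50, 2000, 27000):
    print(n, "residues ->", estimate_mass_kda(n), "kDa")
```

For 50 and 2000 residues this reproduces the 5,500 and 220,000 g/mol limits quoted above.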
Cross-linking disulfide bonds
oxidation occurs especially for extracellular proteins (the intracellular environment is generally too reductive for the S-S bond)
the periplasm of bacterial strains is also rather oxidative and may support correct folding if proteins are stable with specific –SH groups oxidized (an advantage of a periplasmic protein over-expression system, see protein purification, later)
rarely there are other side chains participating in cross-links in proteins, like in collagen fibers in connective tissue or in fibrin blood clots
Frederick Sanger, 1953
amino acid sequence of insulin (protein hormone, the very first protein sequence determined)
~2,000,000 protein sequences are known today!
amino acid sequence = primary structure (of a protein)
What is a protein sequence good for?
essential to get to know the mechanism of action (e.g. catalytic mechanism of an enzyme)
proteins with novel properties can be generated by varying the sequences of known proteins (the science of protein engineering)
the primary sequence determines the 3D structure of the protein and it is the link between the genetically encoded information in DNA and the actual biological function of the protein
analysis of the relation between primary and 3D structures uncovers mechanisms of folding/unfolding/refolding of proteins
sequence determination is a component of molecular pathology (searching for mutations that determine predisposition to various diseases; alterations in amino acid sequence may result in abnormal function and disease)
the sequence of a protein reveals much about its evolutionary history: protein sequences that resemble one another are likely to have a common ancestor, hence molecular events in evolution can be traced (phylogenetics, “relatedness”, molecular paleontology)
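One simple way to put a number on how much two sequences "resemble" one another is percent identity. The sketch below is a deliberately minimal illustration: it assumes the two sequences are already aligned and of equal length, whereas real comparisons first align the sequences and use substitution scoring. The function name is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions with identical residues.

    Assumes both sequences are pre-aligned and of equal length
    (one-letter amino acid codes); no alignment is performed here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Two pentapeptides differing only in the last residue (Leu vs Met)
print(percent_identity("YGGFL", "YGGFM"))
```

High identity between two proteins is suggestive of common ancestry, but the threshold at which this becomes convincing depends on sequence length and is not addressed by this sketch.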