the 20 amino acids

Upload: janae

Post on 09-Jan-2016

DESCRIPTION

The 20 Amino Acids. Amino Acid Similarities. buried vs. exposed: compare average buried surface area or solvent-exposure correlation with partition coefficient and free-energy of transfer hydrophobicity scales... AAindex: online database http://www.genome.jp/aaindex/ - PowerPoint PPT Presentation

TRANSCRIPT

  • The 20 Amino Acids

  • Amino acid   Number in the whole database   Percentage in the whole database
    Ala          6844                           9%
    Arg          3437                           4%
    Asp          3802                           5%
    Asn          4862                           6%
    Cys          1134                           1%
    Gln          2893                           4%
    Glu          4506                           6%
    Gly          6475                           8%
    His          1783                           2%
    Ile          4272                           5%
    Leu          6366                           8%
    Lys          4624                           6%
    Met          1606                           2%
    Phe          3224                           4%
    Pro          3748                           5%
    Ser          4920                           6%
    Thr          4689                           6%
    Trp          1254                           2%
    Tyr          3007                           4%
    Val          5532                           7%

  • Amino Acid Similarities
    buried vs. exposed: compare average buried surface area or solvent exposure
    correlation with partition coefficient and free energy of transfer
    hydrophobicity scales...
    AAindex: online database, http://www.genome.jp/aaindex/
    thousands of dimensions of similarity
    Cornette et al. (1987): factor analysis
    (1) Janin (1979); (2) Wolfenden et al.; (3) Kyte and Doolittle; and (4) Rose et al.

  • Lesser and Rose (1990)

    residue (n)   surface area (Å^2)   avg exposed area   difference   fraction   delta G
    Ala (397)     118.1                31.5               86.6         0.74       0.5
    Arg (137)     256                  93.8               162.2        0.64
    Asn (221)     165.5                62.2               103.3        0.63
    Asp (239)     158.7                60.9               97.8         0.62
    Cys (98)      146.1                13.9               132.3        0.91
    Gln (164)     193.2                74                 119.2        0.62
    Glu (217)     186.2                72.3               113.9        0.62
    Gly (435)     88.1                 25.2               62.9         0.72       0
    His (99)      202.5                46.7               155.8        0.78       0.5
    Ile (255)     181                  23                 158          0.88
    Leu (297)     193.1                29                 164.1        0.85       1.8
    Lys (288)     225.8                110.3              115.5        0.52
    Met (66)      203.4                30.5               172.9        0.85       1.3
    Phe (135)     222.8                28.7               194.1        0.88       2.5
    Pro (152)     146.8                53.7               92.9         0.64
    Ser (341)     129.8                44.2               85.6         0.66       -0.3
    Thr (265)     152.5                46                 106.5        0.7        0.4
    Trp (75)      266.3                41.7               224.6        0.85       3.4
    Tyr (181)     236.8                59.1               177.7        0.76       2.3
    Val (348)     164.5                23.5               141          0.86       1.5
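
The "difference" and "fraction" columns in the Lesser and Rose rows can be recomputed from the first two numeric columns. The sketch below assumes "difference" means surface area minus average exposed area and "fraction" means that difference divided by the surface area; this is a reading of the column headers, not something the slide states. The function name is hypothetical. Small disagreements with the tabulated fractions may come from rounding in the source.

```python
def buried_fraction(total_area, exposed_area):
    """Return (buried area, fraction buried) for one residue type.

    Assumes buried = total - exposed, which is an interpretation of the
    'difference' column, not a definition given on the slide.
    """
    buried = total_area - exposed_area
    return buried, buried / total_area


# (surface area, avg exposed area) for two rows of the table, in A^2
rows = {"Leu": (193.1, 29.0), "Lys": (225.8, 110.3)}

for name, (total, exposed) in rows.items():
    buried, frac = buried_fraction(total, exposed)
    print(f"{name}: buried {buried:.1f} A^2, fraction {frac:.2f}")
```

Leu buries most of its surface while Lys leaves roughly half exposed, which is the buried vs. exposed contrast the slide is drawing.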

  • Rose (1985). Science.
    Richards (1977). Ann. Rev. Biophys. Bioeng.

  • Evolution reflects a combination of multiple similarities...
    PAM250 (Dayhoff, 1978): 10 x log-odds of substitution (relative to the rate expected at random)
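
As a rough illustration of what "10 x log-odds" means, the sketch below scores a substitution as 10 times the base-10 logarithm of an observed rate over the rate expected by chance, rounded to an integer. The function name and the example rates are hypothetical; real PAM250 entries come from Dayhoff's mutation-probability matrices, which are not reproduced here.

```python
import math


def log_odds_score(observed, expected, scale=10.0):
    """scale * log10(observed / expected), rounded to the nearest integer.

    `observed` and `expected` are illustrative rates, not PAM data.
    """
    return round(scale * math.log10(observed / expected))


print(log_odds_score(0.02, 0.01))   # pair seen twice as often as chance
print(log_odds_score(0.005, 0.01))  # pair seen half as often as chance
print(log_odds_score(0.01, 0.01))   # pair seen exactly as often as chance
```

Positive scores mark substitutions that occur more often than chance, negative scores mark those that are disfavored, and a score of zero is neutral.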

  • peptide bond

  • resonance (tautomers) promotes planarity

  • Bond lengths
    3.4 Å

  • steric conflicts
    trans, eclipsed, gauche (+/- 60 deg torsion angles)

  • Ramachandran Plots
    beta-sheets
    right-handed alpha-helices
    2 degrees of freedom per AA in backbone
    left-handed helices
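
The two backbone degrees of freedom on a Ramachandran plot are the torsion angles phi and psi. The sketch below computes a signed torsion angle from four atom positions using the standard atan2 formulation; the helper names are hypothetical and the coordinates in the usage lines are toy points, not protein data. By the usual definition, phi uses the atoms C(i-1), N(i), CA(i), C(i) and psi uses N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy geometry: first and last points on the same side (cis) or
# on opposite sides (trans) of the central bond along z.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))  # trans
```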

  • (show model)

  • both forms have clashes, so there is less preference (cis allowed)
    ~5% cis (~2 kcal/mol delta G)
    50% of proteins have a cis peptide bond, 87% are Pro
    0.002/sec flipping frequency (~10 min) (14-24 kcal/mol activation barrier)
    role in turns

    Pal & Chakrabarti (1999), JMB

    5-HT neurotransmitter receptor/ion channel (Dougherty, Nature, 2005)

    Kun Ping Lu, Greg Finn, Tae Ho Lee & Linda K. Nicholson (2007). Prolyl cis-trans isomerization as a molecular timer. Nature Chemical Biology 3, 619-629.
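
The link between a cis population of a few percent and a free-energy difference of about 2 kcal/mol follows from the two-state Boltzmann relation, delta G = -RT ln K. The sketch below assumes a simple two-state equilibrium at 298.15 K; the function names are hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def delta_g_from_fraction(frac_minor, temp_k=298.15):
    """Free energy (kcal/mol) of the minor state relative to the major one."""
    k = frac_minor / (1.0 - frac_minor)
    return -R_KCAL * temp_k * math.log(k)


def fraction_from_delta_g(delta_g, temp_k=298.15):
    """Population of the minor state for a given free-energy gap (kcal/mol)."""
    k = math.exp(-delta_g / (R_KCAL * temp_k))
    return k / (1.0 + k)


print(f"5% cis -> {delta_g_from_fraction(0.05):.2f} kcal/mol")
print(f"2 kcal/mol -> {100 * fraction_from_delta_g(2.0):.1f}% cis")
```

Under these assumptions a 5% cis population corresponds to a little under 2 kcal/mol, consistent with the approximate figures on the slide.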

  • Rotamers
    libraries: Ponder and Richards; Dunbrack; Richardson
    chi-angles
    conflict with carbonyl O (show picture)
    backbone-independent and backbone-dependent (conditional)

  • Ponder and Richards (1987)

  • Tuffery et al.
    http://bioserv.rpbs.jussieu.fr/doc/Rotamers.html

    aa    chi1      chi2     num     freq     std.dev.
    --    ----      ----     ---     ----     --------
    TYR   -67.50    82.50    5364    0.2326   12.1920
    TYR   -62.50   -77.50    5262    0.2282    8.8149
    TYR   -62.50   -22.50    1442    0.0625   12.1077
    TYR    62.50   -82.50    1122    0.0487    8.3048
    TYR    62.50    82.50    1618    0.0702    8.0934
    TYR  -177.50   -82.50    1522    0.0660   11.8303
    TYR   177.50    77.50    6728    0.2918   10.0764

    VAL   -62.50             4225    0.1843   15.1085
    VAL    67.50             1941    0.0847   20.3667
    VAL   177.50            16754    0.7310    9.8021

  • Dunbrack & Cohen: BBDep library
    Bayesian statistics
    counts in bins -> conditional probs
    use Dirichlet priors
    infer posteriors by simulation
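
To make the role of a Dirichlet prior concrete, the sketch below combines sparse rotamer counts from one backbone bin with prior pseudo-counts and returns the posterior mean probabilities. This uses the closed-form posterior mean of the Dirichlet-multinomial model, which is a simplification: the slide says the BBDep posteriors are inferred by simulation. The bin counts and the prior weight of 10 are hypothetical; the prior proportions reuse the backbone-independent VAL frequencies listed in the Tuffery table.

```python
def dirichlet_posterior_mean(counts, prior):
    """Posterior mean of multinomial probabilities under a Dirichlet prior.

    counts: observed counts per rotamer in one bin.
    prior:  Dirichlet pseudo-counts per rotamer (same keys).
    """
    total = sum(counts.values()) + sum(prior.values())
    return {k: (counts[k] + prior[k]) / total for k in counts}


# Hypothetical sparse counts for VAL chi1 rotamers in one (phi, psi) bin
bin_counts = {"-62.5": 3, "67.5": 0, "177.5": 7}

# Prior pseudo-counts: backbone-independent VAL frequencies scaled by a
# hypothetical prior weight of 10 observations
prior = {"-62.5": 1.843, "67.5": 0.847, "177.5": 7.310}

for rotamer, p in dirichlet_posterior_mean(bin_counts, prior).items():
    print(f"chi1 {rotamer}: {p:.3f}")
```

The rotamer with zero observations in the bin still receives a small nonzero probability, which is the practical reason for using a prior when bins are sparsely populated.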

  • Disulfide bridges
    only non-linear connection; adds stability
    intracellular environment is usually reducing
    secreted proteins have disulfide bridges more often
    dsbABCD
    disulfide-bond isomerases
    glutathione reductases

  • disulfide conformations
    Ca-Ca dist: 4.5-7.5 Å
    Richardson (1981)
    left-handed spiral, right-handed hook
    adjacent Cys: (thioredoxin), near: Zn-finger (C-X-X-C)
    buried vs. solvent-exposed
    more prevalent in secreted proteins (immunoglobulins, chymotrypsin, insulin...); cytosol is usually a reducing environment

  • Insulin
    Immunoglobulins (IgG)

  • Contribution of disulfides to protein stability
    Can you increase stability by engineering in a disulfide?
    Betz (1993) Protein Science
    effects on ΔH vs. ΔS: main effect comes from reducing entropy of unfolded state
    disruption of Cys6-Cys127 in HEL lysozyme costs 7.5 kcal/mol
    disruption of 1 disulfide in RNase T1 costs 3.3 kcal/mol
    disruption of Cys14-Cys38 in BPTI costs 8 kcal/mol

  • Alpha-helices
    standard alpha-helices (predominant form)
    right-handed
    H-bonds: i:i+4
    O points forward
    3.6 residues per turn of helix (100 deg/aa)
    π-helices (i:i+5)
    tighter, examples?
    often near ends?
    3/10-helices (i:i+3)
    1EHK, 153-157, chain B
    left-handed helices
    Ramachandran plot disallowed region
    examples: alanine racemase (res 40-44), nitrate reductase, collagen
    87% are short (only 4 residues long)
    Novotny and Kleywegt (2005)

  • α helix (i:i+4), 3/10 helix (i:i+3), π helix (i:i+5)
    carbonyl oxygens point forward
    Cβs point slightly backward
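
The i:i+n notation and the 100 degrees per residue figure can be turned into two small helpers: one lists the backbone H-bond pairs of an ideal helix, the other gives the angular position of each residue on a helical wheel. This is a sketch for idealized geometry only; the function and dictionary names are hypothetical.

```python
# H-bond offsets for the three helix types named on the slides
HBOND_OFFSET = {"3-10": 3, "alpha": 4, "pi": 5}


def hbond_pairs(n_residues, helix="alpha"):
    """(i, i+n) backbone H-bond pairs, 1-based, for an ideal helix."""
    step = HBOND_OFFSET[helix]
    return [(i, i + step) for i in range(1, n_residues - step + 1)]


def wheel_angle(index, degrees_per_residue=100.0):
    """Angular position in degrees of residue `index` (0-based) on a helical wheel."""
    return (index * degrees_per_residue) % 360.0


print(hbond_pairs(8, "alpha"))
print(hbond_pairs(8, "3-10"))
print([wheel_angle(i) for i in range(5)])
```

At 100 degrees per residue the wheel returns to its starting angle after 18 residues (five full turns), which is why helical wheels are often drawn with 18 positions.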

  • Helical Trivia
    helix dipole
    C-cap (JMB paper) http://dx.doi.org/10.1016/S0022-2836(02)00734-9
    N-cap: Ser, Thr
    helix packing angles (Bowie, 1997)
    helix bundles, hemoglobin, leu-zippers
    kinks: Pro (disrupt H-bonds), see 1MLT

  • Beta-sheets
    parallel
    anti-parallel
    twist
    ladder of H-bonds
    side-chains alternate up and down (pleated)
    topology, Greek keys (5PCY), jelly rolls
    beta-bulge (RNase A, 1Z6S, res 88-91)
    nitrate reductase

  • antiparallel, parallel
    good examples to look at:
    flavodoxin (1CZN): 5-stranded parallel
    immunoglobulin (4FAB): antiparallel
    see twist of sheet in 2o2v (kinase)
    notice C=O and Ca-Cb vectors, H-bonds, twist of sheet

  • Turns
    defined when C-alpha atoms are < 7 Å apart
    A γ-turn is characterized by hydrogen bond(s) in which the donor and acceptor residues are separated by two residues (i:i+2).
    A β-turn (the most common form) is characterized by hydrogen bond(s) in which the donor and acceptor residues are separated by three residues (i:i+3).
    An α-turn is characterized by hydrogen bond(s) in which the donor and acceptor residues are separated by four residues (i:i+4).
    A π-turn is characterized by hydrogen bond(s) in which the donor and acceptor residues are separated by five residues (i:i+5).
    An ω-loop is a catch-all term for a longer loop with no internal hydrogen bonding.
    role of Gly, Pro...
    Richardson, 1980; Wilmot and Thornton, 1988

  • Designation   Residue 2 Phi,Psi   Residue 3 Phi,Psi   Comments
    -----------   -----------------   -----------------   --------
    I             -60,-30             -90,0               Most common type.
    II            -60,120             80,0
    III           -60,-30             -60,-30             Like 3/10 helix.
    IV                                                    unclassified turns
    V             -80,80              80,-80
    VIa           -60,120             -90,0               *
    VIb           -120,120            -60,0               *
    VII                                                   **
    VIII          -60,-30             -120,120
    * 2-3 peptide bond is cis, residue 3 is proline.
    ** Type VII is a bend recognized by psi(2)~180 and phi(3)
  • secondary structure length distributions
    means: alpha-helices: ?; beta-sheets: ?

  • DSSP
    Kabsch and Sander (1983)
    secondary structure identification based on geometry (φ/ψ angles) AND H-bonding patterns
    identifies sub-types of helices, turns, etc.
    calculates solvent accessibility
    patterns and merging rules

    H-bond criteria: up to 5 Å or 60° (but not both)
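
The distance and angle tolerances above are a consequence of how DSSP scores backbone hydrogen bonds. The sketch below uses the electrostatic energy from Kabsch and Sander (1983), with partial charges 0.42e and 0.20e, a factor of 332, and a cutoff of -0.5 kcal/mol. The formula comes from that paper rather than from the slide, and the function names and toy coordinates are hypothetical.

```python
import math


def dssp_hbond_energy(c, o, n, h):
    """Kabsch-Sander electrostatic H-bond energy in kcal/mol.

    c, o: backbone C and O of the acceptor residue.
    n, h: backbone N and H of the donor residue.
    All positions are (x, y, z) tuples in Angstroms.
    """
    q1, q2, f = 0.42, 0.20, 332.0
    return q1 * q2 * f * (1.0 / math.dist(o, n) + 1.0 / math.dist(c, h)
                          - 1.0 / math.dist(o, h) - 1.0 / math.dist(c, n))


def is_hbond(c, o, n, h, cutoff=-0.5):
    """True when the energy is below the DSSP cutoff."""
    return dssp_hbond_energy(c, o, n, h) < cutoff


# Toy collinear geometry along x: C=O ... H-N with an O...N distance of 2.9 A
c, o, h, n = (0.0, 0, 0), (1.23, 0, 0), (3.13, 0, 0), (4.13, 0, 0)
print(f"{dssp_hbond_energy(c, o, n, h):.2f} kcal/mol", is_hbond(c, o, n, h))
```

A well-aligned bond at 2.9 Å scores close to -3 kcal/mol, so the -0.5 cutoff is generous: it tolerates either a long bond or a strongly bent one, but not both.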

  • # RESIDUE AA STRUCTURE BP1 BP2 ACC N-H-->O O-->H-N N-H-->O O-->H-N TCO KAPPA ALPHA PHI PSI X-CA Y-CA Z-CA
    1 2 A T 0 0 47 0, 0.0 728,-1.9 0, 0.0 2,-0.4 0.000 360.0 360.0 360.0 123.8 39.4 15.0 22.0
    2 3 A L - 0 0 88 725,-0.2 2,-0.4 726,-0.2 725,-0.2 -0.962 360.0-170.9-117.7 126.1 36.4 12.8 22.6
    3 4 A L B -A 726 0A 3 723,-2.9 723,-3.0 -2,-0.4 8,-0.1 -0.944 7.9-152.7-117.8 141.8 32.8 13.9 22.1
    4 5 A G - 0 0 2 -2,-0.4 2,-0.5 721,-0.2 7,-0.2 -0.182 31.3 -88.3 -94.2-165.9 29.7 11.9 23.1
    5 6 A T > - 0 0 4 719,-0.3 3,-2.1 740,-0.1 6,-0.3 -0.920 45.7-105.5-108.1 127.9 26.2 12.0 21.6
    6 7 A A T 3 S+ 0 0 1 28,-1.6 29,-0.1 -2,-0.5 7,-0.1 -0.224 103.8 14.8 -54.2 136.7 23.7 14.5 23.0
    7 8 A L T 3 S+ 0 0 51 1,-0.3 -1,-0.2 5,-0.1 28,-0.0 0.266 104.9 104.3 84.2 -9.1 21.1 12.9 25.2
    8 9 A R S X S- 0 0 60 -3,-2.1 3,-2.3 716,-0.1 -1,-0.3 -0.606 88.0-100.3 -97.2 162.2 23.0 9.6 25.5
    9 10 A P T 3 S+ 0 0 100 0, 0.0 714,-0.1 0, 0.0 -2,-0.1 0.801 122.0 47.9 -52.9 -37.5 24.9 8.6 28.6
    10 11 A A T 3 S+ 0 0 37 -5,-0.1 -6,-0.2 714,-0.1 -4,-0.1 0.294 86.0 145.7 -89.6 12.1 28.3 9.5 27.3
    11 12 A A < - 0 0 2 -3,-2.3 2,-0.6 -6,-0.3 -5,-0.1 -0.021 51.7-126.2 -49.0 148.5 27.0 12.9 26.2
    12 13 A T - 0 0 15 22,-0.4 24,-2.9 -8,-0.1 2,-0.5 -0.885 31.5-152.1 -97.5 120.9 29.2 16.0 26.3
    13 14 A R E -c 36 0B 47 -2,-0.6 63,-2.9 61,-0.2 64,-1.8 -0.905 15.0-172.4-110.2 129.7 27.3 18.7 28.2
    14 15 A V E -cd 37 77B 0 22,-2.8 24,-2.7 -2,-0.5 2,-0.5 -0.957 8.6-156.2-115.6 127.9 27.6 22.4 27.8
    15 16 A M E -cd 38 78B 0 62,-3.4 64,-4.0 -2,-0.4 2,-0.6 -0.927 6.8-153.5-107.5 123.8 25.9 24.9 30.2
    16 17 A L E -cd 39 79B 2 22,-2.8 24,-2.8 -2,-0.5 2,-1.0 -0.891 2.3-160.8 -98.4 119.4 25.2 28.4 28.8
    17 18 A L E S+cd 40 80B 0 62,-1.9 64,-3.6 -2,-0.6 65,-0.8 -0.841 80.7 24.8-100.8 90.1 25.1 31.1 31.5
    18 19 A G - 0 0 0 22,-2.7 23,-0.2 -2,-1.0 22,-0.1 0.221 67.5-156.8 113.4 119.6 23.2 33.7 29.5
    19 20 A S + 0 0 0 20,-0.1 29,-2.6 4,-0.1 30,-0.5 -0.240 44.5 129.7-117.7 37.4 21.0 32.7 26.6
    20 21 A G S > S- 0 0 6 27,-0.2 4,-2.2 26,-0.1 28,-0.3 0.001 81.7 -72.1 -78.5-168.2 20.9 36.0 24.5
    21 22 A E H > S+ 0 0 15 1,-0.2 4,-1.1 2,-0.2 5,-0.1 0.737 134.8 53.5 -59.0 -30.3 21.6 36.2 20.7
    22 23 A L H > S+ 0 0 23 2,-0.2 4,-1.4 1,-0.2 3,-0.4 0.932 111.1 44.4 -71.6 -46.8 25.3 35.6 21.3
    23 24 A G H > S+ 0 0 3 1,-0.2 4,-2.9 2,-0.2 -2,-0.2 0.865 105.2 65.8 -63.1 -35.2 24.6 32.4 23.3
    24 25 A K H X S+ 0 0 6 -4,-2.2 4,-1.9 1,-0.2 -1,-0.2 0.880 103.7 44.3 -54.5 -42.6 22.1 31.3 20.7
    25 26 A E H X S+ 0 0 10 -4,-1.1 4,-2.4 -3,-0.4 -1,-0.2 0.830 110.9 53.5 -73.8 -32.9 24.8 31.0 18.1
    26 27 A V H X S+ 0 0 5 -4,-1.4 4,-2.1 2,-0.2 -2,-0.2 0.934 108.8 51.3 -65.7 -41.5 27.1 29.2 20.5
    27 28 A A H X S+ 0 0 0 -4,-2.9 4,-3.0 2,-0.2 -2,-0.2 0.923 109.8 49.3 -58.2 -46.8 24.1 26.7 21.1
    28 29 A I H X S+ 0 0 0 -4,-1.9 4,-2.0 1,-0.2 -1,-0.2 0.941 110.9 48.7 -61.1 -46.3 23.8 26.2 17.4
    29 30 A E H < S+ 0 0 22 -4,-2.4 4,-0.3 2,-0.2 -1,-0.2 0.840 112.4 48.6 -64.1 -31.7 27.4 25.5 17.0
    30 31 A C H >S+ 0 0 0 -4,-2.1 5,-2.3 1,-0.2 3,-1.8 0.940 110.8 50.6 -69.9 -47.3 27.3 23.1 19.9
    31 32 A Q H >

  • H = alpha helix
    B = residue in isolated beta-bridge
    E = extended strand, participates in beta ladder
    G = 3-helix (3/10 helix)
    I = 5 helix (pi helix)
    T = hydrogen bonded turn
    S = bend (direction change by > 70 degrees)
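
Prediction methods are usually scored against three states rather than the full DSSP alphabet. The sketch below collapses DSSP codes to helix (H), strand (E) and coil (C). The mapping chosen here (H, G, I to helix; E, B to strand; everything else to coil) is one common convention and an assumption of this example; other reductions treat B or G as coil. The example string transcribes the summary column of the 31 DSSP records shown above, with "-" standing for a blank code.

```python
# One common 8-state to 3-state reduction (an assumption, not the only one)
DSSP_TO_3STATE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(ss):
    """Map a string of DSSP codes to H/E/C; unknown or blank codes become C."""
    return "".join(DSSP_TO_3STATE.get(code, "C") for code in ss)


# Summary codes for the 31 records listed above ("-" marks a blank)
dssp_codes = "--B--TTSTT--EEEEE--SHHHHHHHHHHH"
print(reduce_dssp(dssp_codes))
```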

  • identical peptide fragments up to 8-mers can be found in both alpha-helical and beta-strand conformations in different proteins/contexts
    Zhou, F. Alber, G. Folkers, G. Gonnet and Chelvanayagam (2000)
    Design of Protein Conformational Switches, Ambroggio and Brian Kuhlman (2006)
    discusses how to engineer regions that can change states
    also discusses relation of alternative folding states to amyloid formation
    Chameleon peptides (Minor & Kim, Nature, 1996)
    11-mer as both α-helix and β-strand in GB1

  • Secondary Structure Prediction
    Chou/Fasman
    aa propensities
    alpha-helix preference (aliphatic, non-branched): Ala, Leu, Met, Phe, Glu, Gln, His, Lys, Arg
    beta-sheet preference (hydrophobic): Tyr, Trp, (Phe, Met), Ile, Val, Thr, Cys
    rules: nucleation, helix-breaker
    60-70% accuracy
    A helix is predicted if, in a run of six residues, four are helix favoring and the average value of the helix propensity is greater than 1.0 and greater than the average strand propensity. Such a helix is extended along the sequence until a proline is encountered (helix breaker) or a run of 4 residues with helical propensity less than 1.0 is found.
    A strand is predicted if, in a run of 5 residues, three are strand favoring, and the average value of the strand propensity is greater than 1.04 and greater than the average helix propensity. Such a strand is extended along the sequence until a run of 4 residues with strand propensity less than 1.0 is found.
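
The nucleation rules above can be sketched as a sliding-window test. The example below covers only the nucleation step (no extension, no proline handling) and treats "favoring" as a propensity above 1.0, which is an assumption. The propensity tables are illustrative placeholders for a seven-letter alphabet, not the published Chou-Fasman parameters, and the function name is hypothetical.

```python
def find_nuclei(seq, p_self, p_other, window, min_favoring, min_avg):
    """Start indices of windows that satisfy the nucleation rule.

    A window qualifies when at least `min_favoring` residues have
    p_self > 1.0, the mean p_self exceeds `min_avg`, and the mean
    p_self exceeds the mean p_other.
    """
    hits = []
    for start in range(len(seq) - window + 1):
        frag = seq[start:start + window]
        favoring = sum(1 for aa in frag if p_self[aa] > 1.0)
        avg_self = sum(p_self[aa] for aa in frag) / window
        avg_other = sum(p_other[aa] for aa in frag) / window
        if favoring >= min_favoring and avg_self > min_avg and avg_self > avg_other:
            hits.append(start)
    return hits


# Placeholder propensities (NOT published values), for illustration only
P_HELIX = {"A": 1.4, "E": 1.5, "L": 1.2, "V": 1.0, "I": 1.0, "G": 0.6, "P": 0.6}
P_STRAND = {"A": 0.8, "E": 0.4, "L": 1.3, "V": 1.7, "I": 1.6, "G": 0.8, "P": 0.6}

seq = "GPAEELAEGPVIVIVG"  # made-up sequence
helix_nuclei = find_nuclei(seq, P_HELIX, P_STRAND, window=6, min_favoring=4, min_avg=1.0)
strand_nuclei = find_nuclei(seq, P_STRAND, P_HELIX, window=5, min_favoring=3, min_avg=1.04)
print("helix nuclei start at:", helix_nuclei)
print("strand nuclei start at:", strand_nuclei)
```

Passing the two tables in swapped order lets one function serve both rules; a fuller implementation would go on to extend each nucleus and resolve overlaps.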

  • secondary structure propensities (relative to overall frequency)

  • PHD (Rost and Sander, 1993)
    exploits evolutionary information (multiple alignment of family members)
    neural network, window-size=17 residues
    70-75% accuracy
    limits of prediction? (rest is due to non-local interactions)
    identical fragments up to 7 aa can be found in both helices and sheets

  • Transmembrane regions
    single helix (endolysins)
    helix bundle (6-12) (K+ channel, ABC transporters, GPCRs)
    beta barrel (OMP)

  • Predicting Transmembrane Regions
    hydrophobic (no moment)
    charged near ends (like membrane)
    positive-inside rule (von Heijne, 1992)
    characteristic lengths (15-35 for helices, with caps)
    TMPred
    TMHMM (Sonnhammer)

    gluconate permease 3 from E. coli
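
A simple way to see the hydrophobicity signal that transmembrane predictors rely on is a sliding-window hydropathy average. The sketch below uses the Kyte and Doolittle scale with a 19-residue window and a threshold of 1.6, a commonly used heuristic. It is not how TMPred or TMHMM work internally, and the function names and the test sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values
KD = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathy_profile(seq, window=19):
    """List of (center index, mean hydropathy) for each full window."""
    half = window // 2
    profile = []
    for center in range(half, len(seq) - half):
        frag = seq[center - half:center + half + 1]
        profile.append((center, sum(KD[aa] for aa in frag) / window))
    return profile


def candidate_tm_centers(seq, window=19, threshold=1.6):
    """Window centers (0-based) whose mean hydropathy exceeds the threshold."""
    return [c for c, score in hydropathy_profile(seq, window) if score > threshold]


# Made-up sequence: a 21-residue hydrophobic stretch flanked by lysines
toy = "K" * 10 + "L" * 21 + "K" * 10
centers = candidate_tm_centers(toy)
print(centers[0], centers[-1])
```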

  • Signal Peptides
    SignalP (Nielsen, von Heijne, Brunak) - HMM
    signal peptidases (gram-pos. vs. gram-neg. vs. eukaryotic)
    isoelectric point differences
    The average length of signal peptides ranges from 22 (eukaryotes) and 24 (Gram-negatives) to 32 amino acid residues for Gram-positives
    (-3,-1) rule: small and neutral

  • Disordered regions
    Keith Dunker group
    a 4th category of secondary structure
    not just random coil (which has irregular but fixed φ/ψ, H-bonds)
    unstructured, molten globule, meta-stable
    flexible, dynamic, sample multiple conformations in solution (HSQC)
    correlation with B-factors, disorder in crystals?
    role in translocation, recognition, chromatin...
    PEST signals target proteins for proteasomal degradation, enriched in unstructured proteins (Singh, Proteins, 2006)
    role in disease (amyloidosis)
    NACP: non-Aβ component precursor (14 kDa), intrinsically unstructured in solution

  • Disorder in CREB transcriptional activator
    phosphorylation modulates interaction with CBP
    Wright and Dyson (JMB, 1999); Radhakrishnan (1997, 1998)
    kinase-induced domain (KID) of CREB binds CBP (res 586-672) only when phosphorylated on Ser133; forms pair of helices
    disordered when de-phosphorylated (

  • Calcineurin
    helix must be accessible to get bound by calmodulin

  • enriched in P, E, K, S, and Q (charged)
    depleted in W, Y, F, C, I, L, and N (hydrophobic)
    low sequence complexity (Romero et al., 2001)
    repeated aas, like collagen or silk; poly-Ala, Gly, Pro...
    low entropy of aa probs (window=45)
    K2
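
"Low entropy of aa probs" can be made concrete with the Shannon entropy of the residue composition in a sliding window. The sketch below is a minimal version using the 45-residue window mentioned above; it is not the exact complexity measure of any published predictor, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(fragment):
    """Shannon entropy in bits of the residue composition of a fragment."""
    n = len(fragment)
    counts = Counter(fragment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_profile(seq, window=45):
    """Entropy of every full window along the sequence."""
    return [shannon_entropy(seq[i:i + window])
            for i in range(len(seq) - window + 1)]


print(shannon_entropy("AAAAAAAAAA"))            # homopolymer
print(shannon_entropy("AGAGAGAGAG"))            # two-letter repeat
print(shannon_entropy("ACDEFGHIKLMNPQRSTVWY"))  # all 20 residues once
```

The maximum for a 20-letter alphabet is log2(20), about 4.32 bits; repeats such as poly-Ala or Gly/Pro-rich segments fall far below that.
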
  • PONDR (Obradovic): Prediction of Naturally Disordered Regions
    neural networks
    input sliding window 9-21 aa; output smoothed over 9 aa
    multiple classifiers (different training sets): VL1, VSL1, XL1, XC...
    VL-XT (accuracy ~ 80%)
    The VL-XT predictor integrates three feedforward neural networks: the VL1 predictor (Romero et al. 1997), the N-terminus predictor (XN), and the C-terminus predictor (XC) (both from Li et al. 1999). VL1 was trained using 8 long disordered regions identified from missing electron density in x-ray crystallographic studies, and 7 long disordered regions characterized by NMR. The XN and XC predictors, together called XT, were also trained using x-ray crystallographic data, where the terminal disordered regions were 5 or more amino acids in length.
    Coordination number is the average number of side chain neighbors that are in contact with the given side chain when it is fully buried as determined from a set of 33 non-homologous proteins.

  • (from PSB98)

  • PDB files

    records: ATOM, HETATM, TER, ENDMDL, chain ids
    connectivity assumed; usually no H's
    B-factors, alt conf, NMR, ligands
    resolution vs. coordinate precision
    poly-ala, missing, mutations, truncations, His-tag
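
ATOM and HETATM records are fixed-column text, so they are read by slicing rather than by splitting on whitespace. The sketch below pulls out the fields mentioned on this slide (chain id, alternate conformation flag, B-factor) using the standard PDB column positions. The helper names and the example atom are hypothetical; the example line is built with a format string so the columns are guaranteed to line up.

```python
def parse_atom_line(line):
    """Parse one ATOM/HETATM record into a dict, or return None."""
    if not line.startswith(("ATOM", "HETATM")):
        return None
    return {
        "record": line[0:6].strip(),
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "altloc": line[16].strip(),      # alternate conformation flag
        "resname": line[17:20].strip(),
        "chain": line[21],
        "resseq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "bfactor": float(line[60:66]),
    }


def format_atom_line(serial, name, resname, chain, resseq, x, y, z, occ, b):
    """Build a minimal ATOM record (illustration only, no element column)."""
    return ("{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   "
            "{:>8.3f}{:>8.3f}{:>8.3f}{:>6.2f}{:>6.2f}").format(
        "ATOM", serial, name, "", resname, chain, resseq, "", x, y, z, occ, b)


# Made-up atom, not taken from any PDB entry
line = format_atom_line(1, " N", "MET", "A", 1, 27.340, 24.430, 2.614, 1.00, 9.67)
atom = parse_atom_line(line)
print(atom["resname"], atom["chain"], atom["resseq"], atom["bfactor"])
```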

  • Post-translational modifications
    phosphorylation, glycosylation, lipidation
    proteolysis
    disulfide bridges
    side-chain adducts: GFP, katG
    covalent co-factors - PLP
    oxidation of sulfhydryls
    fMet - peptide deformylase; inteins
    acylation, ACE?
    co-factors: Fe/S proteins, hemes