Post on 03-May-2018


Protein Structure

Marianne Øksnes Dalheim, PhD candidate

Biopolymers, TBT4135, Autumn 2013

The presentation is based on the presentation by Professor Alexander Dikiy, which is given in the course compendium: Part 4.4 on page 165

Outline

• Part 1: Protein structure fundamentals

• Part 2: Determining the protein structure

Part 1:

Protein structure fundamentals

Polypeptides

• Biopolymer

• Monomers (building blocks): Amino Acids

• Monodisperse: DNA → RNA → Protein

– Defined sequence of amino acids

• A protein: one or more polypeptide chains folded into a structure, having a biological function

All proteins are polypeptides, but not all polypeptides are proteins

Amino acids – the building blocks

Amino acids – stereochemistry

Proteins are built from the L-isomers of amino acids (L-AA)

In Fischer projection:
• Vertical bonds: stretch out into the space behind the paper
• Horizontal bonds: stretch up and out of the plane of the paper
• L-configuration: functional group (-NH3+) to the left
• D-configuration: functional group (-NH3+) to the right

Amino acids – chemistry of the side group

• Nonpolar, aliphatic
• Polar, uncharged
• Aromatic
• Charged
  – Positively
  – Negatively

Chemistry of the side group → function

• 20 different amino acids → many different functional groups in one molecule

• Proteins are tailor-made for specific biological functions and reactions

• Proteins with the same biological function from very different organisms have almost identical or very similar primary structures (homology)

• Complex proteins
  – Glycoproteins
  – Lipoproteins
  – Phosphoproteins

Functions:
• Catalysis
• Regulation
• Structure
• Movement
• Transport
• Signaling

The protein alphabet

• The protein alphabet is written in either a one-letter or a three-letter code

• Each AA has its own unique code

Alanine Ala A

Arginine Arg R

Asparagine Asn N

Aspartic acid Asp D

Cysteine Cys C

Glutamine Gln Q

Glutamic acid Glu E

Glycine Gly G

Histidine His H

Isoleucine Ile I

Leucine Leu L

Lysine Lys K

Methionine Met M

Phenylalanine Phe F

Proline Pro P

Serine Ser S

Threonine Thr T

Tryptophan Trp W

Tyrosine Tyr Y

Valine Val V
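The table above can be sketched as a small lookup, here used to convert a three-letter sequence into one-letter code. The names `THREE_TO_ONE` and `to_one_letter` are illustrative choices, not part of the presentation.

```python
# Three-letter -> one-letter codes for the 20 standard amino acids (from the table above).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(sequence):
    """Convert a dash-separated three-letter sequence to one-letter code."""
    residues = [r for r in sequence.split("-") if r]  # tolerate leading/trailing dashes
    return "".join(THREE_TO_ONE[r] for r in residues)


print(to_one_letter("-Val-His-Leu-Thr-Pro-Glu-Glu-Lys-"))  # VHLTPEEK
```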

Acid-base properties of amino acids

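Monovalent acid: HA ⇌ H+ + A- (dissociation constant Ka)

Henderson–Hasselbalch equation:

$$\mathrm{pH} = \mathrm{p}K_a - \log\frac{[\text{acid}]}{[\text{base}]}$$

$$\mathrm{pH} = \mathrm{p}K_a + \log\frac{\alpha}{1-\alpha}$$

where α is the degree of dissociation, α = [base] / ([acid] + [base]).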
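As a rough numerical illustration of the Henderson–Hasselbalch relation, the sketch below computes the pH from a base/acid ratio and the fraction of the group present in its base (deprotonated) form at a given pH. The function names and the example pKa of 4.76 (roughly that of acetic acid) are assumptions chosen for illustration.

```python
import math


def ph_from_ratio(pka, base, acid):
    """pH = pKa + log10([base]/[acid]), i.e. pKa - log10([acid]/[base])."""
    return pka + math.log10(base / acid)


def fraction_base(ph, pka):
    """Fraction of the group in the deprotonated (base) form at a given pH."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


example_pka = 4.76  # illustrative value, not from the slides
print(ph_from_ratio(example_pka, base=1.0, acid=1.0))   # equal amounts: pH equals pKa
print(fraction_base(ph=example_pka, pka=example_pka))   # half dissociated at pH = pKa
```

When acid and base are present in equal amounts the logarithm vanishes, so the pH equals the pKa and half of the groups are dissociated.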

Ionization of Gly and His

The isoelectric point (pI) of amino acids

Definition: pI = pH when net charge is zero

∑(+) = ∑(-)
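The definition can be turned into a small numerical search: sum the charges of all ionizable groups and find the pH at which the sum is zero. This is a minimal sketch, assuming each group follows the Henderson–Hasselbalch equation independently; the function names and the glycine pKa values (2.34 for the carboxyl group, 9.60 for the amino group, typical textbook values) are not taken from the slides.

```python
def net_charge(ph, acidic_pkas, basic_pkas):
    """Net charge at a given pH.

    acidic_pkas: groups that are negative when deprotonated (e.g. -COOH)
    basic_pkas:  groups that are positive when protonated (e.g. -NH3+)
    """
    positive = sum(1.0 / (1.0 + 10 ** (ph - pka)) for pka in basic_pkas)
    negative = sum(1.0 / (1.0 + 10 ** (pka - ph)) for pka in acidic_pkas)
    return positive - negative


def isoelectric_point(acidic_pkas, basic_pkas, lo=0.0, hi=14.0):
    """Find the pH where the net charge is zero by bisection."""
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if net_charge(mid, acidic_pkas, basic_pkas) > 0:
            lo = mid  # still positive: the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


glycine_pi = isoelectric_point(acidic_pkas=[2.34], basic_pkas=[9.60])
print(round(glycine_pi, 2))  # 5.97
```

For an amino acid without an ionizable side chain the result is simply the mean of the two pKa values.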

The properties of the individual amino acids are reflected in the structure and functional characteristics of the protein

Importance of the amino acid nature for protein structure

- The hemoglobin

Hemoglobin A: -Val-His-Leu-Thr-Pro-Glu-Glu-Lys-

Hemoglobin S: -Val-His-Leu-Thr-Pro-Val-Glu-Lys-

Mutation of Glu (hydrophilic) to Val (hydrophobic) alters the protein structure and thereby causes disease: sickle cell anemia.

The peptide bond

Formed by a condensation reaction: carboxyl + amine = amide + H2O

Rotation flexibility of AA

cis- and trans- AA

Backbone dihedral (torsion) angles

Dihedral angle

- Angle between two planes
- Determined from 4 atoms

Phi angle (φ): the dihedral angle defined by the four atoms C(i-1) - N(i) - Cα(i) - C(i). Free rotation around the N-Cα bond.

Psi angle (ψ): the dihedral angle defined by the four atoms N(i) - Cα(i) - C(i) - N(i+1). Free rotation around the Cα-C(O) bond.

Omega angle (ω): the dihedral angle defined by the four atoms Cα(i) - C(i) - N(i+1) - Cα(i+1). Rotation around the C(O)-N bond (the peptide bond) is restricted: ω is 0° or 180° (cis or trans).
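A dihedral angle can be computed directly from the coordinates of its four atoms. The sketch below uses a common vector formulation (projecting the two outer bonds onto the plane perpendicular to the central bond); the function name and the test coordinates are made up for illustration. For φ one would pass C(i-1), N(i), Cα(i), C(i) in that order.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees for four points given as (x, y, z) tuples."""
    def sub(a, b):
        return tuple(ai - bi for ai, bi in zip(a, b))

    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p0, p1)          # bond from atom 2 back to atom 1
    b1 = sub(p2, p1)          # central bond (rotation axis)
    b2 = sub(p3, p2)          # bond from atom 3 to atom 4

    norm = math.sqrt(dot(b1, b1))
    b1 = tuple(c / norm for c in b1)

    # Components of the outer bonds perpendicular to the central bond
    v = sub(b0, tuple(dot(b0, b1) * c for c in b1))
    w = sub(b2, tuple(dot(b2, b1) * c for c in b1))

    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Planar test geometries (hypothetical coordinates)
cis = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
print(cis, trans)
```

The two planar test cases correspond to the cis (0°) and trans (180°) arrangements mentioned for the ω angle.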

The phi and psi dihedral angles cannot take arbitrary combinations of values, because of steric hindrance

[Figure: allowed phi/psi combinations; axis label: psi angle]

Main area 1 (α-helix):
- φ: -60° → -180°
- ψ: -75° → -15°

Main area 2 (β-sheet):
- φ: -60° → -180°
- ψ: 10° → 180°
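The two main areas can be expressed as a simple lookup on a (φ, ψ) pair. This is only a sketch using the rectangular ranges quoted on the slide; real allowed regions have irregular boundaries, and the function name and example angles are hypothetical.

```python
def classify_phi_psi(phi, psi):
    """Assign a (phi, psi) pair in degrees to one of the slide's two main areas."""
    if -180 <= phi <= -60:
        if -75 <= psi <= -15:
            return "alpha helix"
        if 10 <= psi <= 180:
            return "beta sheet"
    return "other"


print(classify_phi_psi(-65, -40))    # alpha helix
print(classify_phi_psi(-120, 130))   # beta sheet
print(classify_phi_psi(60, 60))      # other
```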

Polypeptide chain

Protein structure

Primary structure

• The amino acid sequence.

• The nascent polypeptide chain must, in most cases, adopt the fold of the protein.

Let's consider a protein with 100 AA. If each AA can assume 3 different conformations (in practice many more), this protein would have 3^100 ≈ 5 × 10^47 possible conformations.

However, the protein chooses its unique fold within around picoseconds.
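The size of this conformational space is easy to check numerically. The snippet below is a minimal sketch; the variable names are arbitrary.

```python
residues = 100
conformations_per_residue = 3

# Python integers have arbitrary precision, so the count is exact
total_conformations = conformations_per_residue ** residues

print(f"{total_conformations:.2e}")  # 5.15e+47
```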

Anfinsen’s experiment

→ Proteins adopt their native structure (conformation) spontaneously

• Proteins fold through interactions between amino acids.

• Weak interactions: electrostatic, hydrophobic, hydrogen bonding, metal-AA coordination bonds

• Covalent bonds in a protein exist only within the AAs, in the peptide bonds and in disulfide bridges (S-S).

Protein folding

Secondary structure

• Interactions between AAs lead to different types of secondary structure.

• Local folding

α-helix, β-sheet and loops

Different types of helices

[Figure: helix types, labelled 3.6, 3 and 5 residues per turn]

Hydrogen bonding network: i → i+3 residue (3_10 helix), i → i+4 residue (α-helix), i → i+5 residue (π helix)

α-helix

β-sheets

antiparallel

parallel

Tertiary structure

Tertiary structure represents the protein fold: the spatial arrangement of the elements of secondary structure (α-helices, β-sheets) together with connecting loops, turns and unfolded (unstructured) regions.

The total number of different folds is estimated at approximately 2000.

Protein domains

A protein domain is a part of protein sequence and structure that can evolve, function, and exist independently of the rest of the protein chain. Each domain forms a compact three-dimensional structure and often can be independently stable and folded. Many proteins consist of several structural domains. One domain may appear in a variety of different proteins. (Source: Wikipedia)

[Figure: Pyruvate kinase, PDB entry 1pkn]

Quaternary structure

Only multi-chain proteins have quaternary structure. The inter-chain interaction is based on weak interactions and disulfide (S-S) bridges.

Some structural examples

1) Membrane protein: Rhodopsin
2) Globular protein: SelW
3) Fibrous protein: Collagen

Membrane protein: Rhodopsin

Rhodopsin is the protein component of the light receptor in the retinal rods of the vertebrate eye. Similar molecules are found in the light-sensing structures of all animals

Globular protein: The mammalian SelW protein

SelW is a selenoprotein involved in cellular redox reactions.
• Small: 89 amino acids
• Motif: Cys-X-X-U, where U is selenocysteine
• Its structure reveals a β-α-β-β-β-α fold
• Globular

Fibrous protein: Collagen

Collagen is the most abundant protein in mammals. About one quarter of all of the protein in your body is collagen. Collagen is the main protein of connective tissue. It has great tensile strength.

Fibrous protein: Collagen

• Three polypeptide chains with the repeat sequence Gly-X-Y
• X is often proline
• Y is often hydroxyproline (posttranslational modification)
• Each chain is about 1000 amino acid residues long
• Synthesized as procollagen, with globular propeptides that are excised by extracellular enzymes
• Excision of the propeptides allows the triple-chain molecule to polymerize into fibrils

Branden, C., Tooze, J. (1999) Introduction to protein structure, 2nd ed., Garland Publishing, New York, p. 284

Fibrous protein: Collagen

Each of the three polypeptide chains is folded into an extended left-handed helix
• 3.3 residues per turn (α-helix: 3.6)
• Rise per residue: 2.9 Å (α-helix: 1.5 Å)
• Rise per turn: 9.6 Å (α-helix: 5.4 Å)

→ More extended conformation than the α-helix.

The three helices in collagen form a trimeric molecule by coiling about the central axis to form a right-handed superhelix. The side chain of every third residue is close to the central axis, where there is no room for a side chain; consequently every third residue must be a glycine.

Branden, C., Tooze, J. (1999) Introduction to protein structure, 2nd ed., Garland Publishing, New York, p. 284
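The helix parameters quoted for collagen and the α-helix are related by a simple product: rise per turn = residues per turn × rise per residue. A minimal check of the quoted numbers, with illustrative function and variable names:

```python
def rise_per_turn(residues_per_turn, rise_per_residue):
    """Helix pitch in Å from residues per turn and rise per residue (Å)."""
    return residues_per_turn * rise_per_residue


collagen_pitch = rise_per_turn(3.3, 2.9)
alpha_helix_pitch = rise_per_turn(3.6, 1.5)

print(round(collagen_pitch, 1), round(alpha_helix_pitch, 1))  # 9.6 5.4
```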

Part 2:

Determining the protein structure

What can we learn analyzing the protein structure?

• Protein function
• Protein mechanism
• Protein evolution
• Protein systems biology
• Structure-based drug design

What does it mean to determine the 3D structure of a protein?

Determine either
• ALL the distances between each atom and the remaining protein atoms, or
• ALL of the protein's dihedral angles

Experimental techniques for macromolecular structure determination

Low resolution techniques
1. Electron microscopy
2. SAXS (small angle X-ray scattering)
→ Rough structure, topology, quaternary structure of large proteins
→ Not the position of each atom

High resolution techniques
1. X-ray crystallography – first applied in 1961 (Kendrew and Perutz – Nobel prize winners)
2. NMR spectroscopy – first applied in 1983 (Ernst and Wüthrich – Nobel prize winners)
→ Position of each atom

X-ray Crystallography

• Most widespread technique to determine high-resolution structures of molecules in the solid state

• The method depends on directing a beam of X-rays onto a regular, repeating array of many identical molecules: a crystal

• The X-rays diffract from the crystal in a diffraction pattern

• The diffraction data from the crystal are used to calculate an electron density map

• The map is interpreted as a polypeptide chain with a particular amino acid sequence

Branden, C., Tooze, J. (1999) Introduction to protein structure, 2nd ed., Garland Publishing, New York, p. 377

X-ray Crystallography

Prerequisite
• Well-ordered crystals that diffract X-rays must be obtained

Proteins can be difficult: they are large and roughly spherical, with irregular surfaces, so they cannot pack tightly into a crystal.
• Large channels between the individual molecules, filled with disordered solvent molecules
• Only a few contact points between the protein molecules

This is also the reason why the structures determined by X-ray crystallography are the same as those of the proteins in solution

NMR spectroscopy

A technique that relies on observing the absorption of energy by nuclei in an external magnetic field under radio frequency (RF) irradiation

– Place the protein molecules in a strong magnetic field and the spins of their nuclei will align along the field. This alignment is an equilibrium process

– If you apply radio frequency pulses, the equilibrium alignment is changed to an excited state

– When the nuclei return to the equilibrium state, they emit radio frequency radiation that can be measured

– The frequency of the emitted radiation depends on the chemical environment of the nucleus and will therefore be different for each atom

– The frequencies are measured relative to a reference signal; this relative frequency is what we call a chemical shift
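The chemical shift is conventionally reported in parts per million (ppm) of the reference frequency, which makes it independent of the field strength. A minimal sketch of that conversion follows; the function name and the example frequencies are hypothetical and not taken from the slides.

```python
def chemical_shift_ppm(nu_sample_hz, nu_reference_hz):
    """Chemical shift in ppm: (nu_sample - nu_ref) / nu_ref * 1e6."""
    return (nu_sample_hz - nu_reference_hz) / nu_reference_hz * 1e6


# Hypothetical example: a nucleus resonating 600 Hz above a 600 MHz reference
shift = chemical_shift_ppm(600_000_600.0, 600_000_000.0)
print(round(shift, 3))  # 1.0
```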

NMR spectroscopy

• Distinguish various nuclei on the basis of their magnetic properties, which are determined by their chemical environment

• The nuclei must have an intrinsic magnetic moment (non-zero spin): 1H, 13C, 15N

• The nature, duration, and combination of the applied RF pulses can be varied to probe different molecular properties of the sample

• Assign the spectrum of chemical shifts

• Measure distances and dihedral angles

• Solid state NMR or solution NMR

→ Complementary techniques
Crystallography: high resolution, fast technique, strong macromolecular complexes
NMR: structure in solution, dynamics (folding), weak macromolecular complexes

How can I find whether the structure I am interested in is already determined?

Internet address: www.rcsb.org

→ All determined structures are deposited in the Protein Data Bank (PDB)

Statistics available at RCSB on October 9, 2012

85212 released atomic coordinate entries

Molecule Type:

78911 proteins, peptides, and viruses

2432 nucleic acids

3845 protein/nucleic acid complexes

24 other

Experimental Technique

78911 X-ray

9626 NMR

499 electron microscopy

165 other

ATOM   1  N    CYS A 1  -23.284  7.726  4.920  1.00  5.78  N
ATOM   2  CA   CYS A 1  -23.838  6.461  5.494  1.00  4.91  C
ATOM   3  C    CYS A 1  -22.786  5.345  5.449  1.00  3.96  C
ATOM   4  O    CYS A 1  -21.826  5.419  4.700  1.00  3.93  O
ATOM   5  CB   CYS A 1  -25.060  6.097  4.640  1.00  5.02  C
ATOM   6  SG   CYS A 1  -26.538  6.897  5.318  1.00  5.60  S
ATOM   7  H2   CYS A 1  -24.029  8.449  4.870  1.00  6.28  H
ATOM   8  HA   CYS A 1  -24.152  6.627  6.514  1.00  5.16  H
ATOM   9  HB2  CYS A 1  -24.908  6.431  3.625  1.00  5.62  H
ATOM  10  HB3  CYS A 1  -25.201  5.025  4.645  1.00  4.44  H
ATOM  11  HG   CYS A 1  -26.624  7.759  4.904  1.00  5.70  H
ATOM  12  H1   CYS A 1  -22.908  7.542  3.966  1.00  5.73  H
ATOM  13  H3   CYS A 1  -22.516  8.073  5.530  1.00  6.16  H
ATOM  14  N    ALA A 2  -22.968  4.318  6.246  1.00  3.36  N
ATOM  15  CA   ALA A 2  -21.993  3.182  6.271  1.00  2.48  C
ATOM  16  C    ALA A 2  -22.085  2.364  4.975  1.00  1.96  C
ATOM  17  O    ALA A 2  -23.145  2.256  4.384  1.00  2.54  O
ATOM  18  CB   ALA A 2  -22.369  2.322  7.481  1.00  3.05  C
ATOM  19  H    ALA A 2  -23.753  4.294  6.832  1.00  3.63  H
ATOM  20  HA   ALA A 2  -20.991  3.564  6.403  1.00  2.30  H
ATOM  21  HB1  ALA A 2  -22.564  2.957  8.333  1.00  2.93  H
ATOM  22  HB2  ALA A 2  -23.252  1.744  7.252  1.00  3.30  H

From the coordinates we can easily calculate the distance between two points (x1, y1, z1) and (x2, y2, z2):

$$D = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$$
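As an illustration, the distance between the backbone N and CA atoms of the first residue can be computed from the first two ATOM records of the listing. This is a minimal sketch: the helper name `atom_coordinates` is made up, and it splits the records on whitespace for simplicity, whereas real PDB files use fixed column positions.

```python
import math


def atom_coordinates(record):
    """Extract (x, y, z) from a whitespace-separated ATOM record as listed above."""
    fields = record.split()
    return tuple(float(value) for value in fields[6:9])


n_record = "ATOM 1 N CYS A 1 -23.284 7.726 4.920 1.00 5.78 N"
ca_record = "ATOM 2 CA CYS A 1 -23.838 6.461 5.494 1.00 4.91 C"

n_xyz = atom_coordinates(n_record)
ca_xyz = atom_coordinates(ca_record)

# math.dist (Python 3.8+) implements the Euclidean distance formula
distance = math.dist(n_xyz, ca_xyz)
print(round(distance, 3))  # about 1.496 (in Å)
```

Repeating the calculation for every pair of atoms gives the complete set of interatomic distances.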

What does it mean to determine the 3D structure of a protein?

Determine either
• ALL the distances between each atom and the remaining protein atoms, or
• ALL of the protein's dihedral angles

Therefore, the coordinates of each atom allow us to determine ALL the distances within the protein, and thus describe the structure of our protein

References

Branden, C., Tooze, J. (1999) Introduction to protein structure, 2nd ed., Garland Publishing, New York

Smidsrød, O., Moe, S.T. (2008) Biopolymer chemistry, Tapir Academic Press, Trondheim, chapters 3 & 8

Christensen, B.E. (2013) Compendium TBT4135 Biopolymers
