bits protein structure

Post on 11-May-2015


BITS Training: Protein Structure

Joost Van Durme
VIB Switch Laboratory

Vrije Universiteit Brussel

http://www.bits.vib.be/training

VIB Switch Laboratory

Topics for today

• Exploring the protein structure databank (PDB)

• Viewing and analyzing protein structures with YASARA

• Comparing similar protein structures

• In silico mutagenesis with FoldX

• Homology modeling with FoldX

Sequences and structures

• PDB contains 65,000 structures

• EMBL-Bank contains 114,475,051 sequences or 215,540,553,360 nucleotides!

The sequence-structure gap

[Figure: number of structures and sequences per year, 1976 to 2006, on a logarithmic axis from 1 to 100,000,000]

Structures can be solved

• X-ray crystallography (crystals)

• Nuclear Magnetic Resonance (NMR) (in solution)

• Electron microscopy (in native tissue)

But ...

• Solving structures is a lot of work (6 months to years)

• It needs lots of material and reagents. Solubility is a problem.

• Some protein structures are really difficult to solve: membrane proteins, extremely large proteins, protein complexes

• The field evolves fast: techniques improve, software becomes more user-friendly, and more steps are automated (X-ray infrastructure, crystal growth)

• Despite this progress, the sequence-structure gap is not expected to ever be closed.

What can we learn from models/structures?

• Active site structure: structure-based drug design

• Protein-protein interactions

• Function

• Antigenic behavior / vaccine development

• Stabilising proteins using structural knowledge

Human vs parasite

Parasite

Active site

1918 Influenza Epidemic

Influenza Virus

NEURAMINIDASE POCKET

SIALIC ACID

RELENZA SIALIC ACID

RELENZA TAMIFLU

5,000,000+ doses in NL

Trouw, 3 March 2009

RELENZA: WT Ki = 1.0, H274Y Ki = 1.9

TAMIFLU: WT Ki = 1.0, H274Y Ki = 265

PDB structures come from ...

• X-Ray crystallography experiments

• NMR structure determination

The PDB no longer contains:

• EM structures (too low resolution)

• Models (too unreliable)

Principle of X-Ray crystallography

initial model

electron densities

X-Ray structure

X-Ray models components

• x, y, z coordinates: define the mean atom position

• Disorder about this mean: B-factor and occupancy
    • variations in time and space

• B-factor:
    • models the ‘smearing out’ of disorder around the mean atom position (ellipsoids)
    • a higher B-factor means more uncertainty about the position

• Occupancy:
    • accounts for alternative conformations of the same sidechain
    • how often we find this sidechain in one conformation and how often in the other

Occupancy

ATOM 625 C ILE A 77 -11.322 28.374 -1.179 1.00 28.77 C

ATOM 626 O ILE A 77 -11.946 29.453 -1.112 1.00 28.84 O

ATOM 627 CA AILE A 77 -11.432 27.329 -0.087 0.70 28.15 C

ATOM 628 CB AILE A 77 -12.918 26.874 0.087 0.70 28.64 C

ATOM 629 CG1AILE A 77 -13.042 25.758 1.141 0.70 26.75 C

ATOM 630 CG2AILE A 77 -13.516 26.421 -1.241 0.70 28.13 C

ATOM 631 CD1AILE A 77 -13.378 26.302 2.501 0.70 26.47 C

ATOM 632 CA BILE A 77 -11.423 27.327 -0.082 0.30 28.50 C

ATOM 633 CB BILE A 77 -12.874 26.775 0.117 0.30 28.79 C

ATOM 634 CG1BILE A 77 -13.519 26.423 -1.227 0.30 28.62 C

ATOM 635 CG2BILE A 77 -13.748 27.739 0.916 0.30 28.40 C

ATOM 636 CD1BILE A 77 -14.720 25.518 -1.100 0.30 28.69 C

ATOM 637 N ARG A 78 -10.521 28.048 -2.183 1.00 28.70 N

ATOM 638 CA ARG A 78 -10.258 28.952 -3.268 1.00 28.47 C

ATOM 639 C ARG A 78 -10.857 28.469 -4.584 1.00 28.22 C

2VWC
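The occupancy column in the records above can also be read programmatically. The following is a minimal sketch, assuming the standard fixed-column layout of PDB ATOM records. The slide text has lost its column alignment, so the sample records are rebuilt from the slide's values by a hypothetical helper, `pdb_atom_line`; `parse_atom` is likewise an illustrative name, not part of any library.

```python
def pdb_atom_line(serial, name, alt, res, chain, resseq, x, y, z, occ, b, elem):
    """Hypothetical helper: build one fixed-column PDB ATOM record (78 columns)."""
    return (
        f"ATOM  {serial:5d} {name:<4}{alt:1}{res:>3} {chain:1}{resseq:4d}{'':4}"
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}{'':10}{elem:>2}"
    )

def parse_atom(line):
    """Slice an ATOM record by column. Splitting on whitespace would fail here,
    because the alternate-location flag is fused to the residue name (AILE)."""
    return {
        "name": line[12:16].strip(),
        "altloc": line[16].strip(),
        "resname": line[17:20],
        "chain": line[21],
        "resseq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "occupancy": float(line[54:60]),
        "bfactor": float(line[60:66]),
    }

# Three atoms of ILE 77, values taken from the records shown above
records = [
    (625, " C  ", " ", "ILE", "A", 77, -11.322, 28.374, -1.179, 1.00, 28.77, "C"),
    (628, " CB ", "A", "ILE", "A", 77, -12.918, 26.874, 0.087, 0.70, 28.64, "C"),
    (633, " CB ", "B", "ILE", "A", 77, -12.874, 26.775, 0.117, 0.30, 28.79, "C"),
]
lines = [pdb_atom_line(*r) for r in records]
atoms = [parse_atom(line) for line in lines]

# Occupancy of each alternative conformation of the CB atom
cb_occupancy = {a["altloc"]: a["occupancy"] for a in atoms if a["name"] == "CB"}
print(cb_occupancy)
```

The occupancies of the alternative conformations A and B of the same atom add up to 1, while the backbone C atom has a single conformation with full occupancy.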


Atomic B-factors

• Value that indicates the precision of an atom’s given position

• Atoms with the largest B-factors have the largest positional uncertainty

• Indication of the mobility of an atom

• 0 < B < 20: Atom is most likely OK

• 20 < B < 40: Atom is probably OK, but positional errors up to 0.5 Ångstrom are normal

• 40 < B < 60: Atom is probably reasonably OK, but be careful, because positional errors up to 1.0 Ångstrom can be observed

• B > 60: Atom is not likely to be within 1.0 Ångstrom of where you see it

• B around 100: Atom is guaranteed not to be within 1.0 Ångstrom of where you see it
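The B-factor bands above can be turned into a small lookup. This is a sketch only: the function names are illustrative, and assigning the boundary values 20, 40 and 60 to the lower band is an assumption, since the slide leaves them open. The second function uses the standard isotropic relation B = 8π²⟨u²⟩, which is not on the slide, to convert a B-factor into a root-mean-square displacement.

```python
import math

def bfactor_band(b):
    """Map an isotropic B-factor (in Å^2) onto the rule-of-thumb bands.
    Boundary values are put in the lower band (an assumption)."""
    if b <= 20:
        return "most likely OK"
    if b <= 40:
        return "probably OK, errors up to 0.5 Å are normal"
    if b <= 60:
        return "reasonably OK, errors up to 1.0 Å possible"
    return "unlikely to be within 1.0 Å of the modeled position"

def rms_displacement(b):
    """Solve B = 8 * pi^2 * <u^2> for sqrt(<u^2>), in Å."""
    return math.sqrt(b / (8 * math.pi ** 2))

# B-factors in the range seen for ILE 77 and ARG 78 in the occupancy example
for b in (13.79, 28.77, 75.0):
    print(f"B = {b:6.2f}: {bfactor_band(b)}; rms displacement {rms_displacement(b):.2f} Å")
```

Under that relation, a B-factor of 20 corresponds to a displacement of about 0.5 Å.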

B-factor

[Figure: B-factor scale from Low to High; image from www.YASARA.org]

Resolution (Ångstrom)

• Level of detail that can be observed in the electron density map

• The greater the disorder in the crystal, the lower the resolution (disorder tends to grow with protein size)

[Figure panels: 3.0 Å, 2.0 Å, 1.2 Å]

R-factor

• The difference between the observed and computed diffraction pattern

• A measure of how well the refined structure predicts the observed data

• Higher values mean less agreement

• 0.40-0.60: very unreliable

• 0.20 seems to be the standard threshold

[Figure: electron density map]
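The slide does not spell out how the R-factor is computed. A minimal sketch, assuming the conventional definition R = Σ| |F_obs| − |F_calc| | / Σ|F_obs| over all reflections, with the calculated amplitudes already on the same scale as the observed ones. The amplitudes below are made up for illustration.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over all reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Made-up structure-factor amplitudes, for illustration only
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 50.0, 45.0]

r = r_factor(f_obs, f_calc)
print(f"R = {r:.3f}")
```

A perfect model gives R = 0, and larger disagreement between computed and observed amplitudes pushes R up.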

NMR Structure determination


NMR models components

• In solution
    • study protein dynamics
    • solve protein structures that are difficult to crystallize

• Nuclear Overhauser Effect (NOE)
    • intensities of signal peaks correspond to short inter-atomic distances between spatially close protons (NOE distances)

• NOE constraints are known with low precision, e.g. NOEs are binned into 2.5-4.0, 4.0-5.5, and 5.5-7.0 Ångstrom

• Multiple models that are consistent with the distance and angle constraints are generated, e.g. with molecular dynamics: the NMR ensemble

• Take the average or best model from the ensemble for PDB deposition, or deposit a selected ensemble of superposed structures
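What "consistent with the distance constraints" means can be sketched as a check of one model against binned NOE restraints. The bins are the ones quoted above. The function name, the proton labels and the coordinates are made up for illustration, and real structure-calculation software does considerably more than this.

```python
import math

# The three NOE distance bins quoted above, in Å (lower, upper)
NOE_BINS = [(2.5, 4.0), (4.0, 5.5), (5.5, 7.0)]

def violations(coords, restraints, tolerance=0.0):
    """Return the restraints whose proton-proton distance in this model
    exceeds the upper bound of its bin by more than `tolerance` Å."""
    violated = []
    for atom_i, atom_j, (_lower, upper) in restraints:
        d = math.dist(coords[atom_i], coords[atom_j])
        if d > upper + tolerance:
            violated.append((atom_i, atom_j, round(d, 2)))
    return violated

# Made-up proton coordinates of one model, in Å
coords = {"HA": (0.0, 0.0, 0.0), "HB": (3.0, 0.0, 0.0), "HN": (0.0, 6.0, 0.0)}
restraints = [
    ("HA", "HB", NOE_BINS[0]),  # strong NOE: protons at most 4.0 Å apart
    ("HA", "HN", NOE_BINS[1]),  # medium NOE: protons at most 5.5 Å apart
]

print(violations(coords, restraints))
```

A model enters the ensemble only if it violates few or none of the restraints; here the HA-HN pair is 6.0 Å apart and breaks its 5.5 Å bound.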

Structure superposition (root mean square deviation)

$$ RMSD = \sqrt{\frac{1}{n} \sum_{i=1}^{n} d_i^2} $$

n = number of atoms
d_i = distance between 2 corresponding atoms i in 2 structures

The better the atoms superpose on each other, the lower the RMSD

Unit of RMSD => Ångstroms

identical structures => RMSD = 0
similar structures => RMSD is small (1-3 Å)
distant structures => RMSD > 3 Å

However, care has to be taken, as RMSD is length dependent and dominated by outliers:

• a comparison of two short peptide structures can result in a small RMSD even if their structures are visibly different

• very similar structures can have a bad RMSD because a short part of the structures (e.g. a loop) is very different

• insertions and deletions are not accounted for in the RMSD calculation, since we only look at equivalent atoms/residues (see figure)
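The RMSD definition and its sensitivity to outliers can be illustrated in a few lines. This sketch assumes the two structures are already superposed and that the coordinate lists contain equivalent atoms in the same order; the coordinates are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD over equivalent atoms of two already-superposed structures."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Ten made-up atoms: nine deviate by 0.5 Å, one "loop" atom deviates by 10 Å
reference = [(float(i), 0.0, 0.0) for i in range(10)]
model = [(x, y + 0.5, z) for x, y, z in reference]
model[-1] = (9.0, 10.0, 0.0)

all_atoms = rmsd(reference, model)
without_outlier = rmsd(reference[:-1], model[:-1])
print(f"all atoms: {all_atoms:.2f} Å, without the loop atom: {without_outlier:.2f} Å")
```

One badly placed atom out of ten is enough to move the RMSD from the "similar" range to above 3 Å, which is why local RMSD values are often reported alongside the global one.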

NMR ensemble RMSD (root mean square deviation)

• Superpose the NMR models

• Calculate the RMSD of local regions and also of whole models

• Regions with high RMSD are less well defined by the data
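A local RMSD across an ensemble can be sketched as the spread of each atom around its mean position. This assumes the models are already superposed and list the same atoms in the same order; `per_atom_rmsd` is an illustrative name and the three two-atom "models" are made up.

```python
import math

def per_atom_rmsd(models):
    """For each atom index, the RMSD of its positions around the ensemble mean.
    `models` is a list of already-superposed coordinate lists."""
    n_models = len(models)
    spread = []
    for positions in zip(*models):  # the same atom across all models
        mean = tuple(sum(axis) / n_models for axis in zip(*positions))
        msd = sum(math.dist(p, mean) ** 2 for p in positions) / n_models
        spread.append(math.sqrt(msd))
    return spread

# Three made-up models: atom 0 is well defined, atom 1 varies between models
ensemble = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 2.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, -2.0, 0.0)],
]

spread = per_atom_rmsd(ensemble)
print([round(s, 2) for s in spread])
```

Atoms with a large spread sit in regions that the restraints define poorly.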

Structural data is stored in the Protein Data Bank (PDB)

http://www.pdb.org

Protein Data Bank (PDB)

©CMBI 2009

Protein Data Bank (PDB)

• Databank for 3-dimensional structures of biomolecules:
    • Protein
    • DNA
    • RNA
    • Ligands

• Obligatory deposit of coordinates in the PDB before publication

• ~65,000 entries (April 2010), of which ~27,000 are “unique” structures

• A PDB file is a keyword-organised flat file (80 columns):
    1) human readable
    2) every line starts with a keyword (3-6 letters)
    3) platform independent
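Because every line starts with a keyword, a PDB file can be summarized with very little code. A minimal sketch, assuming the keyword occupies the first six columns of each line; `record_counts` is an illustrative name, and the sample uses values from the 1CRN entry with approximate column spacing.

```python
from collections import Counter

def record_counts(pdb_text):
    """Count lines per record keyword (the first six columns of each line)."""
    return Counter(
        line[:6].strip() for line in pdb_text.splitlines() if line.strip()
    )

# A few records of entry 1CRN; the column spacing is approximate
sample = """\
HEADER    PLANT SEED PROTEIN                      30-APR-81   1CRN
COMPND    CRAMBIN
SOURCE    ABYSSINIAN CABBAGE (CRAMBE ABYSSINICA) SEED
SSBOND   1 CYS      3    CYS     40
SSBOND   2 CYS      4    CYS     32
"""

counts = record_counts(sample)
print(counts)
```

The same function works on a complete file read with `open(path).read()`, where the ATOM keyword will dominate the counts.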

PDB important records (1)

• PDB nomenclature
    Filename = accession number = PDB code
    Filename is 4 positions (often 1 digit & 3 letters, e.g. 1CRN.pdb)

• HEADER: describes the molecule & gives the deposition date
    HEADER PLANT SEED PROTEIN 30-APR-81 1CRN

• COMPND: name of the molecule
    COMPND CRAMBIN

• SOURCE: organism
    SOURCE ABYSSINIAN CABBAGE (CRAMBE ABYSSINICA) SEED

PDB important records (2)

• SEQRES: sequence of the protein. Be aware: 3D coordinates are not always present for all the amino acids in SEQRES!
    SEQRES 1 46 THR THR CYS CYS PRO SER ILE VAL ALA ARG SER ASN PHE 1CRN 51
    SEQRES 2 46 ASN VAL CYS ARG LEU PRO GLY THR PRO GLU ALA ILE CYS 1CRN 52
    SEQRES 3 46 ALA THR TYR THR GLY CYS ILE ILE ILE PRO GLY ALA THR 1CRN 53
    SEQRES 4 46 CYS PRO GLY ASP TYR ALA ASN 1CRN 54

• SSBOND: disulfide bridges
    SSBOND 1 CYS 3 CYS 40
    SSBOND 2 CYS 4 CYS 32
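The SEQRES records can be converted to a one-letter sequence for use in alignment tools. A minimal sketch, assuming records laid out as on the slide (keyword, serial number, chain length, residue names, then an optional trailing entry code and line number); `seqres_to_sequence` is an illustrative name and unknown residue names are mapped to "X".

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

def seqres_to_sequence(lines):
    """Join the residue names of SEQRES lines into a one-letter sequence."""
    sequence = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "SEQRES":
            continue
        for field in fields[3:]:
            # Keep three-letter alphabetic fields only; this skips the
            # trailing entry code (1CRN) and line number
            if len(field) == 3 and field.isalpha():
                sequence.append(THREE_TO_ONE.get(field, "X"))
    return "".join(sequence)

seqres = [
    "SEQRES 1 46 THR THR CYS CYS PRO SER ILE VAL ALA ARG SER ASN PHE 1CRN 51",
    "SEQRES 2 46 ASN VAL CYS ARG LEU PRO GLY THR PRO GLU ALA ILE CYS 1CRN 52",
    "SEQRES 3 46 ALA THR TYR THR GLY CYS ILE ILE ILE PRO GLY ALA THR 1CRN 53",
    "SEQRES 4 46 CYS PRO GLY ASP TYR ALA ASN 1CRN 54",
]

sequence = seqres_to_sequence(seqres)
print(sequence, len(sequence))
```

The resulting length matches the chain length of 46 stated in the records, and the cysteines at positions 3/40 and 4/32 are the ones paired in the SSBOND records.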

PDB important records (3)

and at the end of the PDB file the “real” data:

• ATOM: one line for each atom with its unique name and its x, y, z coordinates
    ATOM 1 N THR 1 17.047 14.099 3.625 1.00 13.79 1CRN 70
    ATOM 2 CA THR 1 16.967 12.784 4.338 1.00 10.80 1CRN 71
    ATOM 3 C THR 1 15.685 12.755 5.133 1.00 9.19 1CRN 72
    ATOM 4 O THR 1 15.268 13.825 5.594 1.00 9.85 1CRN 73
    ATOM 5 CB THR 1 18.170 12.703 5.337 1.00 13.02 1CRN 74
    ATOM 6 OG1 THR 1 19.334 12.829 4.463 1.00 15.06 1CRN 75
    ATOM 7 CG2 THR 1 18.150 11.546 6.304 1.00 14.23 1CRN 76
    ATOM 8 N THR 2 15.115 11.555 5.265 1.00 7.81 1CRN 77
    ATOM 9 CA THR 2 13.856 11.469 6.066 1.00 8.31 1CRN 78
    ATOM 10 C THR 2 14.164 10.785 7.379 1.00 5.80 1CRN 79
    ATOM 11 O THR 2 14.993 9.862 7.443 1.00 6.94 1CRN 80

PDB entry


Structure Visualization

Structures from PDB can be visualized with:

1. YASARA (http://www.yasara.org)

2. SwissPDBViewer (http://spdbv.vital-it.ch/)

3. PyMOL (http://www.pymol.org)

4. Chimera (http://www.cgl.ucsf.edu/chimera)

YASARA View nomenclature

Atom

Residue = any continuous stretch of atoms sharing the same residue name, residue number and molecule name

Molecule = any continuous stretch of residues sharing the same molecule name (PDB calls this a CHAIN)

Object = a collection of molecules and additional items

Standard atom colors

• C = cyan

• O = red

• N = blue

• H = white

• S = green

Atom nomenclature

[Figure: atom names in a peptide chain; labels: N-term, N, C, O, Cδ1, Cδ2, OT1, OT2, C-term]

FoldX: a molecular design toolkit

• Predict the effect of a point mutation on protein stability

• Predict the 3D structure of a sequence: homology modeling

Predict effect of point mutation

• FoldX is an empirical force field
    • It is validated with calorimetric experiments
    • E.g. if such an experiment concludes that breaking a hydrogen bond costs 1.5 kcal/mol, FoldX uses this knowledge rather than theoretical physics equations

• FoldX compares WT and mutant for:
    • Hydrogen bonds, electrostatics, Van der Waals clashes and contacts, entropy, desolvation, ...

• FoldX energies
    • The energy of a single molecule is meaningless
    • The difference in energy between two molecules (such as WT and a point mutant) approaches realistic values

Predict effect of point mutation

• FoldX calculates the stability of the wild type (WT) and the mutant (MT) and takes the difference (net effect of the mutation):
    • ΔG_MT - ΔG_WT = ΔΔG_mutation

• If ΔΔG_mutation is
    • > 0: the mutation is bad for stability
    • < 0: the mutation is good for stability

• The FoldX error margin is 0.5 kcal/mol, so changes within this margin are meaningless
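The interpretation rule above amounts to simple arithmetic on two energies. The sketch below does not call FoldX; it only assumes that the wild-type and mutant stabilities (in kcal/mol) have already been computed. The function name and the example energies are made up for illustration.

```python
FOLDX_ERROR_MARGIN = 0.5  # kcal/mol, the error margin quoted above

def classify_mutation(dg_wt, dg_mt, margin=FOLDX_ERROR_MARGIN):
    """Return (ddG, verdict), with ddG = dG_MT - dG_WT in kcal/mol."""
    ddg = dg_mt - dg_wt
    if abs(ddg) <= margin:
        verdict = "within error margin"
    elif ddg > 0:
        verdict = "destabilizing"
    else:
        verdict = "stabilizing"
    return ddg, verdict

# Made-up stabilities for one wild type and three hypothetical mutants
for dg_mt in (-7.5, -10.25, -12.0):
    ddg, verdict = classify_mutation(-10.0, dg_mt)
    print(f"ddG = {ddg:+.2f} kcal/mol: {verdict}")
```

Treating a value exactly at the margin as "within error margin" is a choice made here, not something the slide specifies.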

Introduction to homology modeling

• Goal: predict a structure from its sequence with an accuracy that is comparable to the best results achieved experimentally (X-Ray)

• Protein modeling is the only way to obtain structural information when experimental techniques (x-ray, NMR, EM) fail

Homology Modeling

Principles of Homology Modeling

• Search for a sequence with a known structure that is very similar to the sequence with the unknown structure. Build the model using the known structure as a template

• The structure of a protein is uniquely determined by its amino acid sequence

• Structure is more conserved than sequence
    • Similar sequences adopt nearly identical structures
    • Distantly related sequences can still fold into a similar structure

Sequence similarity rule

• Rost (1999) modeled lots of structures and compared them to the real ones in the PDB
    • Derived precise limits for homology modeling
    • This rule tells you whether a model will be reliable or unreliable

FoldX plugin for YASARA


Acknowledgements

• Gert Vriend, Radboud Universiteit Nijmegen, NL (www.cmbi.ru.nl)

• Sander Nabuurs, Lead Pharma, Nijmegen, NL

• Greet De Baets, VIB Switch Laboratory
