![Page 1: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/1.jpg)
Protein Tertiary Structure
![Page 2: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/2.jpg)
Protein Data Bank (PDB)
• Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids: ~87,000 structures.
• The data is typically obtained by X-ray crystallography or NMR (nuclear magnetic resonance) spectroscopy and is submitted by biologists and biochemists from around the world.
• Freely accessible.
![Page 3: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/3.jpg)
![Page 4: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/4.jpg)
PDB file
Accession number
Java based visualization tools
Secondary structure
![Page 5: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/5.jpg)
PDB file example:
A PDB file can be viewed with various visualization tools, such as PyMOL.
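Besides viewing a PDB file, its atom records can also be read directly. The following is a minimal sketch, assuming the standard fixed-column layout of ATOM records in the PDB file format; the function names `parse_atom_line` and `ca_coordinates_by_chain` are hypothetical, and the sketch ignores alternate locations and multi-model (NMR) files.

```python
from collections import defaultdict


def parse_atom_line(line):
    """Parse one ATOM record using the fixed PDB column layout."""
    return {
        "record": line[0:6].strip(),       # columns 1-6: record name
        "serial": int(line[6:11]),         # columns 7-11: atom serial number
        "atom_name": line[12:16].strip(),  # columns 13-16: atom name
        "res_name": line[17:20].strip(),   # columns 18-20: residue name
        "chain_id": line[21],              # column 22: chain identifier
        "res_seq": int(line[22:26]),       # columns 23-26: residue number
        "x": float(line[30:38]),           # columns 31-38: x coordinate
        "y": float(line[38:46]),           # columns 39-46: y coordinate
        "z": float(line[46:54]),           # columns 47-54: z coordinate
    }


def ca_coordinates_by_chain(lines):
    """Collect C-alpha coordinates per chain, skipping non-ATOM records."""
    chains = defaultdict(list)
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        atom = parse_atom_line(line)
        if atom["atom_name"] == "CA":
            chains[atom["chain_id"]].append((atom["x"], atom["y"], atom["z"]))
    return dict(chains)
```

Passing the lines of a downloaded PDB file to `ca_coordinates_by_chain` gives one list of backbone points per chain, so the number of keys is the number of protein chains in the file.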
![Page 6: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/6.jpg)
Protein, chain, domain
• Here is a protein composed of 4 chains.
• Which protein is that?
![Page 7: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/7.jpg)
Protein, chain, domain
• One chain may have multiple domains.
• A protein domain is a conserved part of a given protein sequence and structure that can evolve, function, and exist independently of the rest of the protein chain.
• Each domain has a stable 3D structure.
![Page 8: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/8.jpg)
Protein domain classifications
• Scientists have tried to classify proteins by their structural properties into a tree-like hierarchy.
• The 2 most used domain classifications are CATH and SCOP.
![Page 9: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/9.jpg)
CATH: Protein Domain Structure Classification
Class, Architecture, Topology and Homology
• Class: The secondary structure composition: mainly-alpha, mainly-beta and alpha-beta.
• Architecture: The overall shape of the domain structure, determined by the orientations of the secondary structures, e.g. barrel or 3-layer sandwich.
• Topology: Structures are grouped into fold groups at this level depending on both the overall shape and connectivity of the secondary structures.
• Homologous Superfamily: Evolutionarily conserved structures.
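The four CATH levels can be handled as a small data structure. The sketch below assumes CATH's convention of writing a superfamily as a dotted four-number code (Class.Architecture.Topology.Homologous Superfamily, of the form 3.40.50.300) and the class numbering 1, 2, 3 for the three classes listed above; `CathCode`, `parse_cath_code` and `deepest_shared_level` are hypothetical names, and the numbering should be checked against the current CATH release.

```python
from typing import NamedTuple

CATH_LEVELS = ("Class", "Architecture", "Topology", "Homologous Superfamily")

# Assumed class numbering, following the CATH convention.
CATH_CLASSES = {1: "mainly-alpha", 2: "mainly-beta", 3: "alpha-beta"}


class CathCode(NamedTuple):
    cath_class: int
    architecture: int
    topology: int
    superfamily: int


def parse_cath_code(code):
    """Split a dotted C.A.T.H code into its four numeric levels."""
    parts = [int(p) for p in code.split(".")]
    if len(parts) != 4:
        raise ValueError("expected four dot-separated numbers: " + code)
    return CathCode(*parts)


def deepest_shared_level(code_a, code_b):
    """Name the deepest CATH level two domains share, or None."""
    shared = None
    for level, x, y in zip(CATH_LEVELS, parse_cath_code(code_a), parse_cath_code(code_b)):
        if x != y:
            break
        shared = level
    return shared
```

Two domains that agree down to Topology but not Homologous Superfamily share a fold without being grouped as evolutionarily related, which is the distinction the last two levels draw.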
![Page 11: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/11.jpg)
CATH: Protein Domain Structure Classification
Class, Architecture, Topology and Homology
![Page 12: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/12.jpg)
SCOP: Structural Classification of Proteins
http://scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.html
• Based on known protein structures
• Manually created by visual inspection
• Hierarchical database structure:
– Class, Fold, Superfamily, Family, Protein and Species
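A hierarchy like this is naturally a tree in which every node knows its parent and its children. The following is a minimal sketch of such a tree using the six SCOP level names listed above; the `Node` class is a hypothetical illustration, not the format of the SCOP database itself.

```python
SCOP_LEVELS = ("Class", "Fold", "Superfamily", "Family", "Protein", "Species")


class Node:
    """One entry in the hierarchy, linked to its parent and children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        # Depth 0 is a Class node; each step down moves one level deeper.
        self.depth = 0 if parent is None else parent.depth + 1
        if self.depth >= len(SCOP_LEVELS):
            raise ValueError("no level below Species")
        if parent is not None:
            parent.children.append(self)

    @property
    def level(self):
        return SCOP_LEVELS[self.depth]

    def lineage(self):
        """Return (level, name) pairs from the Class down to this node."""
        node, path = self, []
        while node is not None:
            path.append((node.level, node.name))
            node = node.parent
        return list(reversed(path))
```

Walking `parent` links upward gives the full classification of a protein, and walking `children` downward lists everything grouped under a fold or superfamily.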
![Page 13: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/13.jpg)
![Page 14: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/14.jpg)
Parents of node
Children of node
Node
![Page 15: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/15.jpg)
Protein structure alignment
• Structural alignment attempts to establish homology between two or more protein structures based on their 3D conformation.
• Structural alignment often implies evolutionary relationships between proteins with low sequence identity.
![Page 16: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/16.jpg)
Sequence – structure relations
• Similar sequences → similar structures.
• Different sequences → ???
• Different sequences that fold into similar structures are most interesting, since they imply a common origin.
• This is what we aim to find
![Page 17: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/17.jpg)
Protein structure alignment
• Alignment tools try to superimpose the 2 structures, so that the distance between them is minimal.
• The distance measure is RMSD - Root Mean Square Deviation.
• Given two sets of n points v and w, the RMSD is defined as follows:
$$
\mathrm{RMSD}(v, w) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \lVert v_i - w_i \rVert^2}
$$
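As a minimal sketch, the RMSD of two equally sized point sets is the square root of the mean squared distance between paired points. The function name `rmsd` is an assumption, and the sketch only measures the deviation for the coordinates as given; it does not search for the superposition that minimizes it.

```python
import math


def rmsd(v, w):
    """Root mean square deviation between paired 3D points v[i] and w[i]."""
    if len(v) != len(w) or len(v) == 0:
        raise ValueError("point sets must be non-empty and of equal size")
    total = 0.0
    for p, q in zip(v, w):
        # Squared Euclidean distance between the i-th points.
        total += sum((a - b) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / len(v))
```

For example, shifting every point of a structure by 1 along one axis gives an RMSD of 1, regardless of how many points there are.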
![Page 18: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/18.jpg)
Protein structure alignment
• The structural alignment servers do LOCAL structural alignment.
• They try to align larger stretches of protein backbone with minimal RMSD.
• Thus, another parameter to assess the quality of the alignment is the alignment length.
![Page 19: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/19.jpg)
Protein structure alignment
• Low RMSD → similar structures
• Low alignment length → dissimilar structures
• SAS score = 100*RMSD/(alignment length)
• Low SAS → similar structures
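The SAS score combines both quality measures into one number. A minimal sketch of the formula above follows; the function name `sas_score` and the numbers in the usage note are made up for illustration.

```python
def sas_score(rmsd, alignment_length):
    """SAS = 100 * RMSD / alignment length; lower means more similar."""
    if alignment_length <= 0:
        raise ValueError("alignment length must be positive")
    return 100.0 * rmsd / alignment_length
```

With the same RMSD of 2.0, an alignment over 100 residues scores 2.0 while an alignment over only 50 residues scores 4.0, so the longer alignment is ranked as the better match.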
![Page 20: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/20.jpg)
Structure alignment servers
DaliLite:
http://www.ebi.ac.uk/Tools/structure/dalilite/
• 1XIS and 1NAR have only 7% sequence identity, but they are structurally similar.
• We will download their PDB files from the PDB, and structurally align them using DaliLite.
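A sequence identity figure like the one quoted for 1XIS and 1NAR comes from comparing two aligned sequences column by column. The sketch below assumes the two sequences are already aligned, with `-` marking gaps, and assumes identity is counted over columns where both sequences have a residue; other definitions divide by a different length, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned
```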
![Page 21: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/21.jpg)
Insert PDB files
![Page 22: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/22.jpg)
![Page 23: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/23.jpg)
This file can be loaded into the PyMOL viewer.
![Page 24: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/24.jpg)
Food for thought
How can structure alignment help us in structure prediction?
![Page 25: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/25.jpg)
Structure prediction
• Input: protein sequence.
• Output: protein 3D structure.
• This is a VERY difficult task.
• CASP: Critical Assessment of Techniques for Protein Structure Prediction
• Worldwide experiment for protein structure prediction taking place every two years.
![Page 26: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/26.jpg)
Structure prediction
Comparative Modeling
• Uses previously solved structures as starting points, or templates.
– Protein threading: sequence-to-structure alignment against a database of 'templates' (known structures).
– Homology modeling: searches for sequence similarity to proteins with known structures.
Ab Initio Modeling
• Builds 3D protein models "from scratch", i.e., based on physical principles rather than on previously solved structures.
![Page 27: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/27.jpg)
I-TASSER structure prediction server
• based on multiple-threading alignments
• I-TASSER (as 'Zhang-Server') was ranked as the No. 1 server for protein structure prediction in the CASP7, CASP8, CASP9, and CASP10 experiments.
![Page 28: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/28.jpg)
![Page 29: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/29.jpg)
I-TASSER results
![Page 30: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/30.jpg)
I-TASSER results
![Page 31: Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:](https://reader035.vdocuments.mx/reader035/viewer/2022070403/56649f2b5503460f94c454a7/html5/thumbnails/31.jpg)
I-TASSER results