Pfam: a resource for remote homology domain identification
Pfam: a resource for remote homology domain identification. http://pfam.xfam.org. Finn et al., NAR 2014. Building families: identify target, build SEED MSA of representative members, build profile HMM, search UniProtKB, run QCs and fix significance thresholds, annotate (or abandon).
![Page 1: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/1.jpg)
Pfam: a resource for remote homology domain identification
http://pfam.xfam.org
Finn et al., NAR 2014
![Page 2: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/2.jpg)
Building families
• Identify target
• Build SEED MSA of representative members
• Build profile HMM
• Search UniProtKB
• QCs and fix significance thresholds
• Annotate
• Abandon (appears twice in the workflow)
EMBO Workshop, Cape Town, 2014
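The build and search steps above can be sketched with the HMMER3 command-line tools. This is a minimal sketch, not Pfam's internal pipeline: the file names (`SEED.sto`, `family.hmm`, `uniprot.fasta`, `hits.domtbl`) and the helper function names are hypothetical, and the E-value cutoff is an arbitrary placeholder.

```python
import shlex


def hmmbuild_cmd(seed_msa: str, hmm_out: str) -> list[str]:
    """Command line that builds a profile HMM from a SEED alignment."""
    # hmmbuild expects the output HMM file first, then the input alignment.
    return ["hmmbuild", hmm_out, seed_msa]


def hmmsearch_cmd(hmm: str, seqdb: str, domtbl_out: str,
                  evalue: float = 1e-3) -> list[str]:
    """Command line that searches a sequence database with a profile HMM."""
    # --domtblout writes a per-domain hit table; -E sets the reporting E-value.
    return ["hmmsearch", "--domtblout", domtbl_out, "-E", str(evalue), hmm, seqdb]


if __name__ == "__main__":
    # Hypothetical file names; print the commands rather than running them.
    for cmd in (hmmbuild_cmd("SEED.sto", "family.hmm"),
                hmmsearch_cmd("family.hmm", "uniprot.fasta", "hits.domtbl")):
        print(shlex.join(cmd))
```

The functions only assemble argument lists, so they can be passed to `subprocess.run` once HMMER is installed and the input files exist.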
![Page 3: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/3.jpg)
Old Family
New Family
QC: family overlaps
![Page 4: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/4.jpg)
Old Family
New Family
SNLVMYIVIIIHWNACVFYSISKAIGFGNDTWVYPDINDPEFGRLARKYVYSLYWSTLTLTTIGETPPPVRDSEYVFVVVDFLIGVLIFATIVGNIGSMISN
QC: family overlaps
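A family overlap means that regions assigned to two families cover the same residues of one sequence. The check can be sketched as an interval comparison. The `Region` type, the sequence identifier and the coordinates are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    family: str
    seq_id: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlap_length(a: Region, b: Region) -> int:
    """Number of residues shared by two regions (0 if on different sequences)."""
    if a.seq_id != b.seq_id:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def find_overlaps(old: list[Region], new: list[Region]) -> list[tuple[Region, Region, int]]:
    """All pairs of old/new family regions that share at least one residue."""
    pairs = []
    for o in old:
        for n in new:
            shared = overlap_length(o, n)
            if shared > 0:
                pairs.append((o, n, shared))
    return pairs


# Hypothetical coordinates on a hypothetical sequence "seq1".
old_hits = [Region("Old Family", "seq1", 1, 60)]
new_hits = [Region("New Family", "seq1", 50, 101)]
overlaps = find_overlaps(old_hits, new_hits)
```

Each reported pair then needs the manual judgement described on the following slides: are the two families evolutionarily related or not?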
![Page 5: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/5.jpg)
QC: family overlaps
A – Old and New family are evolutionarily related
Evidence to consider: nature of the overlaps, profile-profile comparison, functional residues, functional annotation, structure
![Page 6: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/6.jpg)
QC: family overlaps
A – Old and New family are evolutionarily related
• Solution 1: Merge
![Page 7: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/7.jpg)
QC: family overlaps
A – Old and New family are evolutionarily related
• Solution 2: Create/Add to clan
![Page 8: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/8.jpg)
QC: family overlaps
A – Old and New family are NOT evolutionarily related -> overlaps might be false positives
![Page 9: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/9.jpg)
QC: family overlaps
A – Old and New family are NOT evolutionarily related
• Solution 1: Separate (expunge sequences from SEED, trim ends, raise threshold)
![Page 10: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/10.jpg)
QC: family overlaps
A – Old and New family are NOT evolutionarily related
• Solution 2: Manually edit (no change to family, but sequence removed)
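The options from the overlap slides can be summarised as a small lookup. This is only a restatement of the slides in code form; the function name is hypothetical, and deciding whether two families are related remains a curator's judgement.

```python
def overlap_resolutions(evolutionarily_related: bool) -> list[str]:
    """Options listed on the slides for resolving an overlap between two families."""
    if evolutionarily_related:
        return [
            "Merge the families",
            "Create a clan or add the families to an existing clan",
        ]
    # Unrelated families: the overlaps might be false positives.
    return [
        "Separate: expunge sequences from SEED, trim ends, raise threshold",
        "Manually edit: no change to the family, but the sequence is removed",
    ]
```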
![Page 11: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/11.jpg)
False positive detection
• Overlaps
• Hit scores vs. taxonomic distribution
• Known annotation (e.g. functional/structural residues)
• Known structures
• …
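One way to read "hit scores vs. taxonomic distribution" is to look for hits that score just above the threshold and come from a taxon that is rare among the confident hits. The heuristic below is an assumption made for illustration, not Pfam's procedure; the `margin` and `rare_fraction` parameters and the example data are invented.

```python
from collections import Counter


def flag_suspicious_hits(hits, threshold, margin=5.0, rare_fraction=0.05):
    """Return IDs of hits near the threshold whose taxon is rare among confident hits.

    hits: list of (seq_id, bit_score, taxon) tuples.
    """
    # Confident hits score well above the threshold.
    confident = [h for h in hits if h[1] >= threshold + margin]
    taxa = Counter(taxon for _, _, taxon in confident)
    total = sum(taxa.values())

    flagged = []
    for seq_id, score, taxon in hits:
        near_threshold = threshold <= score < threshold + margin
        rare = total == 0 or taxa[taxon] / total < rare_fraction
        if near_threshold and rare:
            flagged.append(seq_id)
    return flagged


# Invented example: mostly bacterial hits plus one low-scoring eukaryotic hit.
example_hits = [
    ("a", 80.0, "Bacteria"),
    ("b", 75.0, "Bacteria"),
    ("c", 27.0, "Bacteria"),
    ("d", 26.0, "Eukaryota"),
]
suspicious = flag_suspicious_hits(example_hits, threshold=25.0)
```

A flagged hit is only a candidate for inspection; the other checks on the slide (overlaps, known annotation, known structures) decide whether it is a true member.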
![Page 12: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/12.jpg)
Building families
• Identify target
• Build SEED MSA of representative members
• Build profile HMM
• Search UniProtKB
• QCs and fix significance thresholds
• Annotate
• Abandon (appears twice in the workflow)
![Page 13: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/13.jpg)
Are all Pfam families structural domains?
![Page 14: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/14.jpg)
Pfam families with/without PDB structure
• PDB (43%)
• No PDB (57%)
![Page 15: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/15.jpg)
Family
Domain
Repeat
Motif
Pfam types
![Page 16: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/16.jpg)
Domain and repeats
• A - Domain
• B - Metal stabilised domain
• C - 7 repeats form domain
• D - 9 repeats form domain; could be unlimited number
![Page 17: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/17.jpg)
Example: Lipoprotein attachment site, LPAM_1
Alignment coloured by Residue-type
Motifs
![Page 18: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/18.jpg)
Family
Domain
Repeat
Disordered Family?
Pfam types
![Page 19: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/19.jpg)
![Page 20: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/20.jpg)
![Page 21: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/21.jpg)
![Page 22: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/22.jpg)
PDBid: 2JGC
![Page 23: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/23.jpg)
The Pfam website
![Page 24: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/24.jpg)
The Pfam website
![Page 25: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/25.jpg)
The Pfam website
![Page 26: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/26.jpg)
The Pfam website
![Page 27: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/27.jpg)
The Pfam website
![Page 28: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/28.jpg)
The Pfam website
![Page 29: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/29.jpg)
Pfam families’ interactions: iPfam
Finn et al. NAR 2013 http://www.ipfam.org
![Page 30: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/30.jpg)
TUM, January 2013
Some caveats
• Identifying repeats is challenging, especially with HMMER3 (local alignment only)
• Functional diversity within families and clans
• Domains of Unknown Function
• Family boundaries are uncertain if no structure is available
![Page 31: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/31.jpg)
Comparison of Enolase clan/superfamily in Pfam and SFLD
SFLD: Akiva et al. NAR 2013
Picture courtesy of Patsy Babbitt (UCSF)
![Page 32: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/32.jpg)
How far from covering the sequence space: H. sapiens
From the Pfam blog: http://xfam.wordpress.com/tag/pfam/
![Page 33: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/33.jpg)
Building a Pfam family
![Page 34: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/34.jpg)
Pick a target region
2KX7
1. Open Chimera
2. File -> Open “2KX7.pdb”
![Page 35: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/35.jpg)
Pick a target region
2KX7 model 1
• Select “2KX7.pdb (#0.1) chain A”
• Actions -> Ribbon -> hide
• Actions -> Ribbon -> show
![Page 36: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/36.jpg)
2KX7
Schmöe et al. Structure 2011
Rcs-signaling system: bacterial two-component system (sensor kinase + response regulator)
![Page 37: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/37.jpg)
Pick a target region
Look up UniProtKB accession P39838 on the Pfam website (http://pfam.xfam.org)
![Page 38: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/38.jpg)
Pick a target region
Look up UniProtKB accession P39838 on the Pfam website (http://pfam.xfam.org)
![Page 39: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/39.jpg)
Pick a target region
2KX7 (Schmöe et al. Structure 2011)
Domain labels in the figure: HK, S, ABL, HPt
![Page 40: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/40.jpg)
Pick a target region
2KX7 (Schmöe et al. Structure 2011)
Domain labels in the figure: HK, S, ABL, HPt
![Page 41: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/41.jpg)
Pick a target region
![Page 42: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/42.jpg)
Pick a target region
![Page 43: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/43.jpg)
Look for homologs
http://hmmer.janelia.org
Click Start
HMMER website: Finn et al. NAR 2011
![Page 44: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/44.jpg)
Look for homologs
http://hmmer.janelia.org
Choose “Marco-Data/Other/2KX7.fasta”
![Page 45: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/45.jpg)
Select your dataset
Select rp75 in Sequence Database
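The slides run this search through the HMMER website. A rough local equivalent, assuming HMMER3 is installed, is a `phmmer` search of the single query sequence against a sequence database. The database file name `rp75.fasta` and the output file names are hypothetical; the E-value cutoff is a placeholder.

```python
def phmmer_cmd(query_fasta: str, seqdb: str, domtbl_out: str, ali_out: str,
               evalue: float = 1e-3) -> list[str]:
    """Command line for a single-sequence phmmer search against a sequence database."""
    return [
        "phmmer",
        "--domtblout", domtbl_out,  # per-domain table of hits
        "-A", ali_out,              # save a multiple alignment of the hits
        "-E", str(evalue),          # reporting E-value cutoff
        query_fasta,
        seqdb,
    ]


# Hypothetical local file names standing in for the website's rp75 dataset.
cmd = phmmer_cmd("2KX7.fasta", "rp75.fasta", "hits.domtbl", "hits.sto")
```

The returned list can be handed to `subprocess.run(cmd, check=True)` when the files are in place.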
![Page 46: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/46.jpg)
Parse hits
![Page 47: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/47.jpg)
Parse hits
Click
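When the search is run locally instead of on the website, hits can be parsed from HMMER's `--tblout` table, which has 18 whitespace-separated columns followed by a free-text description. The sketch below keeps only the target name, full-sequence E-value and full-sequence score; the example lines are invented, not real search output.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    target: str
    evalue: float
    score: float
    description: str


def parse_tblout(text: str) -> list[Hit]:
    """Parse the per-sequence hit table written by HMMER's --tblout option."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split(maxsplit=18)  # 18 fixed columns + description
        if len(fields) < 18:
            continue  # not a complete hit line
        description = fields[18] if len(fields) > 18 else ""
        hits.append(Hit(fields[0], float(fields[4]), float(fields[5]), description))
    return hits


def lowest_scoring(hits: list[Hit], n: int = 5) -> list[Hit]:
    """The n hits with the lowest bit scores, the ones most worth a manual check."""
    return sorted(hits, key=lambda h: h.score)[:n]


# Invented example lines in tblout layout.
example = """# target name accession query name accession E-value score bias ...
seqA - query1 - 1.2e-30 105.3 0.1 1.5e-30 105.0 0.1 1.0 1 0 0 1 1 1 1 hypothetical protein
seqB - query1 - 0.004 18.7 0.0 0.005 18.4 0.0 1.1 1 0 0 1 1 1 1 sensor kinase
"""
parsed = parse_tblout(example)
```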
![Page 48: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/48.jpg)
Check conservation and coverage
![Page 49: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/49.jpg)
Check low scores
Scroll down
![Page 50: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/50.jpg)
Check taxonomic distribution
Click Taxonomy
![Page 51: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/51.jpg)
Check taxonomic distribution
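The taxonomy view on the website summarises which lineages the hits come from. A minimal offline version, assuming a mapping from hit ID to taxon label is already available (the mapping below is invented), is a frequency count:

```python
from collections import Counter


def taxonomic_distribution(hit_taxa: dict[str, str]) -> dict[str, float]:
    """Fraction of hits per taxon, most common taxon first."""
    counts = Counter(hit_taxa.values())
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.most_common()}


# Invented mapping from hit ID to a taxon label.
distribution = taxonomic_distribution({
    "s1": "Bacteria",
    "s2": "Bacteria",
    "s3": "Archaea",
    "s4": "Bacteria",
})
```

A distribution confined to one lineage with a few isolated outliers is a cue to inspect the outliers as possible false positives.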
![Page 52: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/52.jpg)
Check domain architectures/overlaps
Click Domain
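A domain architecture is the ordered list of families found along a sequence. Grouping hits by architecture, as the Domain view does, can be sketched as follows. The input tuples and the family names `FamA` and `FamB` are hypothetical.

```python
from collections import defaultdict


def group_by_architecture(domains):
    """Group sequence IDs by their ordered tuple of family names.

    domains: iterable of (seq_id, family, start, end) tuples.
    """
    per_seq = defaultdict(list)
    for seq_id, family, start, _end in domains:
        per_seq[seq_id].append((start, family))

    groups = defaultdict(list)
    for seq_id, doms in per_seq.items():
        # Order the families by start position to obtain the architecture.
        architecture = tuple(family for _, family in sorted(doms))
        groups[architecture].append(seq_id)
    return dict(groups)


# Hypothetical domain assignments.
architectures = group_by_architecture([
    ("s1", "FamB", 120, 200),
    ("s1", "FamA", 10, 100),
    ("s2", "FamA", 5, 95),
    ("s3", "FamA", 12, 101),
    ("s3", "FamB", 130, 210),
])
```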
![Page 53: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/53.jpg)
Download aligned hits
1. Click on Download and then on Aligned FASTA
2. Save as “RcsD-ABL-hmmer-ali.fasta”
![Page 54: Pfam a resource for remote homology domain identification](https://reader035.vdocuments.mx/reader035/viewer/2022081519/56813956550346895da0f823/html5/thumbnails/54.jpg)
Manipulate alignment
1. Open Jalview
2. File -> Input Alignment -> From File “RcsD-ABL-hmmer-ali.fasta”
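One common manipulation before using an alignment as a SEED is to remove columns that are mostly gaps. The slides do this interactively in Jalview; the sketch below shows the same idea in code, assuming an aligned FASTA input. The 50% gap cutoff and the toy alignment are placeholders.

```python
def read_aligned_fasta(text: str) -> dict[str, str]:
    """Read aligned FASTA text into {sequence name: aligned sequence}."""
    seqs: dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep the first word of the header
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line
    return seqs


def drop_gappy_columns(aln: dict[str, str], max_gap_fraction: float = 0.5,
                       gap_chars: str = "-.") -> dict[str, str]:
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction."""
    rows = list(aln.values())
    if not rows:
        return {}
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("sequences are not aligned to the same length")
    keep = [
        i for i in range(length)
        if sum(r[i] in gap_chars for r in rows) / len(rows) <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in aln.items()}


# Toy alignment; a real run would read the text of "RcsD-ABL-hmmer-ali.fasta".
toy = read_aligned_fasta(">a\nAC-G\n>b\nA--G\n>c\nAC-G\n")
trimmed = drop_gappy_columns(toy)
```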