Pfam: a resource for remote homology domain identification
http://pfam.xfam.org Finn et al. NAR 2014
Building families
Identify target
Build SEED MSA of representative members
Build Profile-HMM
Search UniProtKB
QCs and fix Significance thresholds
Annotate
Abandon
EMBO Workshop, Cape Town, 2014
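The seed, profile, search and threshold steps can be illustrated with a toy example. This is only a minimal sketch: it swaps the profile-HMM for a much simpler ungapped log-odds position weight matrix (no insert or delete states), and the seed alignment, the two-sequence "database" and the threshold value are all hypothetical. It is not how HMMER or the Pfam pipeline is implemented.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(seed, pseudocount=1.0):
    """Return one log-odds score table per column of an ungapped alignment."""
    background = 1.0 / len(AMINO_ACIDS)  # uniform background: an assumption
    profile = []
    for column in zip(*seed):
        counts = Counter(column)
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile

def best_hit(profile, sequence):
    """Slide the profile along the sequence; return (best score, start index)."""
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col.get(aa, 0.0) for col, aa in zip(profile, window))
        best = max(best, (score, start))
    return best

# Hypothetical ungapped SEED alignment of representative members.
seed = ["GDTWVY", "GDSWVY", "GNTWIY", "GDTWVF"]
profile = build_profile(seed)

# Hypothetical stand-ins for UniProtKB and for the family's significance threshold.
database = {"seqA": "MKLGDTWVYPDIN", "seqB": "MKLAAAAAAPDIN"}
threshold = 10.0

hits = {name: best_hit(profile, seq) for name, seq in database.items()}
family = sorted(name for name, (score, _) in hits.items() if score >= threshold)
print(family)
```

Only sequences scoring at or above the threshold join the family, which is why choosing that threshold is part of the QC step.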
QC: family overlaps
Old Family
New Family
SNLVMYIVIIIHWNACVFYSISKAIGFGNDTWVYPDINDPEFGRLARKYVYSLYWSTLTLTTIGETPPPVRDSEYVFVVVDFLIGVLIFATIVGNIGSMISN
A - Old and New family are evolutionarily related (nature of overlaps, profile-profile, functional residues, functional annotation, structure)
• Solution 1: Merge
• Solution 2: Create/Add to clan
A - Old and New family are NOT evolutionarily related -> then overlaps might be false positives
• Solution 1: Separate (expunge seqs from SEED, trim ends, raise threshold)
• Solution 2: Manually edit (no change to family but sequence removed)
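The overlap check itself comes down to interval arithmetic: two families overlap when they both claim the same residues of the same sequence. The sketch below shows that test under simple assumptions; the hit tuples (sequence id, start, end) and the function name `find_overlaps` are hypothetical and not part of any Pfam tool.

```python
from collections import defaultdict

def find_overlaps(old_hits, new_hits):
    """Return (sequence id, old region, new region) for regions sharing residues."""
    by_sequence = defaultdict(list)
    for seq_id, start, end in old_hits:
        by_sequence[seq_id].append((start, end))
    overlaps = []
    for seq_id, start, end in new_hits:
        for old_start, old_end in by_sequence.get(seq_id, []):
            # Closed, 1-based intervals overlap when each one starts
            # before the other one ends.
            if start <= old_end and old_start <= end:
                overlaps.append((seq_id, (old_start, old_end), (start, end)))
    return overlaps

# Hypothetical regions matched by the old and the new family.
old_family = [("P1", 10, 120), ("P2", 5, 90)]
new_family = [("P1", 100, 200), ("P2", 95, 180), ("P3", 1, 80)]
overlapping = find_overlaps(old_family, new_family)
print(overlapping)
```

Here only P1 is reported, because residues 100 to 120 are claimed by both families. Whether that overlap is resolved by merging, building a clan, or separating the families is the curator's decision.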
False positive detection
• Overlaps
• Hits score vs taxonomic distribution
• Known annotation (e.g. functional/structural residues)
• Known structures
• …
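One way to read "hits score vs taxonomic distribution" is that a weak hit from an unexpected taxon deserves a second look. The sketch below encodes that single heuristic: it flags hits from outside the dominant taxon that score below every hit from the dominant taxon. The heuristic, the function name `flag_suspects` and the example hits are assumptions made for illustration, not Pfam's actual procedure.

```python
from collections import Counter

def flag_suspects(hits):
    """hits: list of (sequence id, bit score, taxon). Return suspicious ids."""
    taxa = Counter(taxon for _, _, taxon in hits)
    dominant, _ = taxa.most_common(1)[0]
    # Lowest score among hits from the taxon that dominates the family.
    floor = min(score for _, score, taxon in hits if taxon == dominant)
    return [seq_id for seq_id, score, taxon in hits
            if taxon != dominant and score < floor]

# Hypothetical search results.
hits = [
    ("A", 150.2, "Bacteria"),
    ("B", 98.7, "Bacteria"),
    ("C", 64.1, "Bacteria"),
    ("D", 120.0, "Eukaryota"),
    ("E", 27.3, "Eukaryota"),
]
suspects = flag_suspects(hits)
print(suspects)
```

A flagged hit is only a candidate false positive; the other checks in the list (overlaps, known annotation, known structures) are still needed to decide.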
Are all Pfam families structural domains?
Pfam families with/without PDB structure
PDB (43%), No PDB (57%)
Pfam types
Family
Domain
Repeat
Motif
Domain and repeats
• A - Domain
• B - Metal stabilised domain
• C - 7 repeats form domain
• D - 9 repeats form domain, could be unlimited number
(figure: panels A, B, C, D)
Motifs
Example: Lipoprotein attachment site, LPAM_1
Alignment coloured by residue type
Pfam types
Family
Domain
Repeat
Disordered Family?
PDBid: 2JGC
The Pfam website
Pfam families’ interactions: iPfam
Finn et al. NAR 2013 http://www.ipfam.org
TUM, January 2013
Some caveats
• Identifying repeats is challenging, especially with HMMER3 (local alignment only)
• Functional diversity within families and clans
• Domains of Unknown Function
• Family boundaries if no structure available
Comparison of Enolase clan/superfamily in Pfam and SFLD
SFLD: Akiva et al. NAR 2013
Picture courtesy of Patsy Babbitt (UCSF)
How far from covering the sequence space: H. sapiens
From the Pfam blog: http://xfam.wordpress.com/tag/pfam/
Building a Pfam family
Pick a target region
2KX7
1. Open Chimera
2. File -> Open “2KX7.pdb”
3. Actions -> Ribbon -> hide
4. Select “2KX7.pdb (#0.1) chain A” (2KX7 model 1)
5. Actions -> Ribbon -> show
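The same selection (chain A of the first model) can also be made outside Chimera by reading the PDB file directly. The sketch below relies on the fixed-column layout of PDB `ATOM` records and stops at the first `ENDMDL`, on the assumption that the file holds several models, as the "#0.1" label suggests. The helper `atom_line` and the toy records are hypothetical and exist only so the example runs without the real file.

```python
def chain_residues(pdb_text, chain_id="A"):
    """Return (residue number, residue name) for CA atoms of one chain, first model only."""
    residues = []
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # stop after the first model of a multi-model file
        if (line.startswith("ATOM")
                and line[12:16].strip() == "CA"   # atom name, columns 13-16
                and line[21] == chain_id):        # chain identifier, column 22
            # Residue number is in columns 23-26, residue name in columns 18-20.
            residues.append((int(line[22:26]), line[17:20]))
    return residues

def atom_line(serial, name, res_name, chain, res_seq):
    """Hypothetical helper: build the leading columns of a PDB ATOM record."""
    return f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}"

# Toy two-model file; these records are made up for illustration.
pdb_text = "\n".join([
    "MODEL        1",
    atom_line(1, " N", "MET", "A", 1),
    atom_line(2, " CA", "MET", "A", 1),
    atom_line(3, " CA", "LYS", "A", 2),
    atom_line(4, " CA", "GLY", "B", 1),
    "ENDMDL",
    "MODEL        2",
    atom_line(1, " CA", "MET", "A", 1),
    "ENDMDL",
])
residues = chain_residues(pdb_text)
print(residues)
```

With the real structure, the text would come from the downloaded file, for example `chain_residues(open("2KX7.pdb").read())`.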
Rcs-signaling system: bacterial two-component system (sensor kinase + response regulator)
2KX7
Schmöe et al. Structure 2011
Pick a target region
Look up UniProtKB ID P39838 on the Pfam website (http://pfam.xfam.org)
Pick a target region
2KX7 (Schmöe et al. Structure 2011)
Figure labels: HK, S, ABL, HPt
Look for homologs
http://hmmer.janelia.org
Click Start
HMMER website: Finn et al. NAR 2011
Choose “Marco-Data/Other/2KX7.fasta”
Select your dataset
Select rp75 in Sequence Database
Parse hits
Click
Check conservation and coverage
Check low scores
Scroll down
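The slides inspect low scores on the HMMER website. If the same search were run with a local command-line HMMER installation and saved with `--tblout`, the weakest hits could be listed with a short script. This is a sketch under that assumption; the two result lines and the E-value cutoff are made up for illustration.

```python
def parse_tblout(text):
    """Parse HMMER --tblout lines into dictionaries (full-sequence columns only)."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        # 18 whitespace-separated columns, then a free-text description.
        fields = line.split(None, 18)
        hits.append({
            "target": fields[0],
            "evalue": float(fields[4]),  # full-sequence E-value
            "score": float(fields[5]),   # full-sequence bit score
            "bias": float(fields[6]),    # full-sequence bias correction
        })
    return hits

def low_scoring(hits, evalue_cutoff=1e-5):
    """Return targets whose E-value is weaker than the (hypothetical) cutoff."""
    return [hit["target"] for hit in hits if hit["evalue"] > evalue_cutoff]

# Hypothetical output lines in --tblout layout.
tblout = """# target name  accession  query name  accession  E-value  score  bias ...
seq1 - query - 1e-50 170.0 0.1 1.2e-50 169.8 0.1 1.0 1 0 0 1 1 1 1 hypothetical protein one
seq2 - query - 0.003 18.2 0.0 0.004 17.9 0.0 1.1 1 0 0 1 1 1 1 hypothetical protein two
"""
parsed = parse_tblout(tblout)
weak = low_scoring(parsed)
print(weak)
```

Hits in the weak list are the ones worth checking by eye before they are allowed into the family.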
Check taxonomic distribution
Click Taxonomy
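The taxonomy view can be approximated offline by tallying the organism named in each hit's FASTA header. The sketch assumes UniProt-style headers that carry an `OS=` field; headers exported by other tools may look different. The accessions and headers below are hypothetical.

```python
import re
from collections import Counter

# Organism name: text after "OS=" up to the next two-letter field tag or the line end.
OS_FIELD = re.compile(r"OS=(.+?)(?: [A-Z]{2}=|$)")

def species_counts(headers):
    """Count hits per organism; headers without an OS= field count as 'unknown'."""
    counts = Counter()
    for header in headers:
        match = OS_FIELD.search(header)
        counts[match.group(1) if match else "unknown"] += 1
    return counts

# Hypothetical headers of aligned hits.
headers = [
    ">tr|X00001|X00001_ECOLI Hypothetical protein OS=Escherichia coli OX=562 GN=abc PE=4 SV=1",
    ">tr|X00002|X00002_ECOLI Hypothetical protein OS=Escherichia coli OX=562 PE=4 SV=1",
    ">tr|X00003|X00003_SALTY Hypothetical protein OS=Salmonella typhimurium PE=4 SV=1",
    ">unlabelled_sequence",
]
counts = species_counts(headers)
for organism, n in counts.most_common():
    print(organism, n)
```

A family whose hits all come from one narrow group, plus a few distant outliers, is a typical case where the outliers should be examined.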
Check domain architectures/overlaps
Click Domain
Download aligned hits
1. Click on Download and then on Aligned FASTA
2. Save as “RcsD-ABL-hmmer-ali.fasta”
Manipulate alignment
1. Open Jalview
2. File -> Input Alignment -> From File “RcsD-ABL-hmmer-ali.fasta”
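Conservation and coverage, which Jalview shows graphically, can also be computed per column from the aligned FASTA file. The sketch below uses two deliberately simple definitions that are assumptions of this example: coverage is the fraction of sequences with a residue (not a gap) in the column, and conservation is the fraction of those residues that match the most common one. The four-sequence alignment is hypothetical.

```python
from collections import Counter

def read_fasta(text):
    """Parse FASTA text into {name: sequence}, joining wrapped lines."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif line and name is not None:
            records[name] += line
    return records

def column_stats(alignment):
    """Return (coverage, conservation) for every alignment column."""
    stats = []
    for column in zip(*alignment.values()):
        residues = [c.upper() for c in column if c not in "-."]
        coverage = len(residues) / len(column)
        if residues:
            top_count = Counter(residues).most_common(1)[0][1]
            conservation = top_count / len(residues)
        else:
            conservation = 0.0  # column made only of gaps
        stats.append((coverage, conservation))
    return stats

# Hypothetical aligned FASTA content.
aligned = """>s1
GDTW
>s2
GD-W
>s3
GNTW
>s4
G--F
"""
alignment = read_fasta(aligned)
stats = column_stats(alignment)
for index, (coverage, conservation) in enumerate(stats, start=1):
    print(index, round(coverage, 2), round(conservation, 2))
```

For the downloaded file, the input would be the contents of “RcsD-ABL-hmmer-ali.fasta” instead of the inline string. Columns with low coverage at either end are natural candidates for trimming.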