![Page 1: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/1.jpg)
Structure-based Evidence for Function (TIGRfam, Pfam and PDB)
![Page 2: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/2.jpg)
TIGRfams are protein families categorized by functional role
![Page 3: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/3.jpg)
Concept: HMMs
• HMM: A Hidden Markov Model is a probabilistic model built from observed sequences of proteins of known function. The profile HMM is used to score the alignment of the query amino acid sequence to other proteins based on amino acid identity and position.
A concrete example of an HMM:
http://en.wikipedia.org/wiki/Hidden_Markov_model
Consider two friends, Alice and Bob, who live far apart from each other and who talk together daily over the telephone about what they did that day. Bob is only interested in three activities: walking in the park, shopping, and cleaning his apartment. The choice of what to do is determined exclusively by the weather on a given day. Alice has no definite information about the weather where Bob lives, but she knows general trends. Based on what Bob tells her he did each day, Alice tries to guess what the weather must have been like.
Alice believes that the weather operates as a discrete Markov chain (system in various states that can change randomly). There are two states, "Rainy" and "Sunny", but she cannot observe them directly, that is, they are hidden from her. On each day, there is a certain chance that Bob will perform one of the following activities, depending on the weather: "walk", "shop", or "clean". Since Bob tells Alice about his activities, those are the observations. The entire system is that of a hidden Markov model (HMM).
Alice knows the general weather trends in the area, and what Bob likes to do on average. In other words, the parameters of the HMM are known.
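The guessing game above can be sketched in a few lines of Python. The probabilities below are illustrative assumptions (the slides give no numbers), and the `viterbi` function is one standard way to find the most likely sequence of hidden weather states from Bob's reported activities. It is a minimal sketch, not the method used by any particular search tool.

```python
# Illustrative parameters only: these numbers are assumed, not from the slides.
states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
emit_p = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}


def viterbi(observations):
    """Return (probability, path) for the most likely hidden-state sequence.

    Expects at least one observation.
    """
    # best[s] = (probability of the best path ending in state s, that path)
    best = {s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Pick the previous state that maximizes the path probability into s.
            prob, path = max(
                ((best[prev][0] * trans_p[prev][s], best[prev][1]) for prev in states),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * emit_p[s][obs], path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])


prob, path = viterbi(["walk", "shop", "clean"])
print(path, round(prob, 5))  # ['Sunny', 'Rainy', 'Rainy'] 0.01344
```

With these assumed numbers, Alice's best guess for three days of "walk, shop, clean" is a sunny day followed by two rainy days.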
![Page 4: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/4.jpg)
Follow this link from the lab notebook
TIGRfams: Haft et al. (2001) Nucleic Acids Research 29: 41-43.
![Page 5: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/5.jpg)
• Change Database to “TIGRFAMS”
• Change Scope to GLOBAL
• Change E-value cutoff to “0.01”
• Enter protein sequence in FASTA format in the box
• Click on “Start HMM search”
• Then wait…
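Before pasting a sequence into the search box, it can help to confirm that it really is a single protein record in FASTA format. The helper below is a minimal sketch: the function name, the example header, and the example sequence are hypothetical, and restricting residues to the 20 standard amino acid letters is an assumption (some tools also accept codes such as X).

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard amino acids (assumed alphabet)


def parse_fasta_protein(text):
    """Parse a single-record protein FASTA string into (header, sequence)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must start with a '>' header line")
    header = lines[0][1:]
    # Sequence lines may be wrapped; join them into one string.
    sequence = "".join(lines[1:]).upper()
    if not sequence:
        raise ValueError("FASTA record has no sequence")
    unexpected = sorted(set(sequence) - VALID_RESIDUES)
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {unexpected}")
    return header, sequence


# Hypothetical record used only to exercise the function.
example = """>hypothetical_protein_1 example
MKTAYIAKQR
QISFVKSHFS
"""
header, sequence = parse_fasta_protein(example)
print(header, len(sequence))
```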
Search TIGRFAM database
“Click”
![Page 6: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/6.jpg)
Enter the TIGRfam number (format -- TIGRXXXXX) from 'Model' column into imgACT lab notebook in box for significant TIGRfam hit
Enter TIGRfam name from ‘Description’ column into notebook
NOTE: If full name is cut off in ‘Description’ column, go to
http://cmr.jcvi.org/cgi-bin/CMR/shared/MakeFrontPages.cgi?page=text_search&crumbs=searches
Enter Score and E-value into Notebook as well
Score and E-value
RESULTS: Only hits with a positive Score and an E-value below 10⁻³ should be recorded.
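The recording rule can be expressed as a small filter. This is a sketch under stated assumptions: the `Hit` record, the `recordable` function, and the example hits are all hypothetical placeholders (not real search results), and treating the 10⁻³ threshold as a strict "less than" is an assumption.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    model: str        # value from the 'Model' column (format TIGRXXXXX)
    description: str  # value from the 'Description' column
    score: float
    e_value: float


def recordable(hit, max_e_value=1e-3):
    """True when the hit has a positive score and an E-value below the cutoff."""
    return hit.score > 0 and hit.e_value < max_e_value


# Made-up placeholder hits for illustration only.
hits = [
    Hit("TIGR00001", "hypothetical family A", 152.3, 2.1e-45),
    Hit("TIGR00002", "hypothetical family B", 12.4, 5.0e-3),
    Hit("TIGR00003", "hypothetical family C", -8.7, 0.9),
]
to_record = [h.model for h in hits if recordable(h)]
print(to_record)
```

Only the first placeholder hit passes both conditions; the second has a positive score but an E-value above the threshold, and the third fails on both.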
“Click”
![Page 7: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/7.jpg)
To obtain full TIGRfam name:
![Page 8: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/8.jpg)
Then what?
Full name
Complete description
![Page 9: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/9.jpg)
TIGRfam Results in imgACT Notebook
![Page 10: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/10.jpg)
Terms to Know for Pfam
• Domain: A structural unit which can be found in multiple protein contexts.
e.g., zinc finger, leucine zipper
• Family: A collection of related proteins containing the same domain.
e.g., immunoglobulins, CD4, MHC, TCR, etc.
• Clan: A collection of multiple protein families. The relationship may be defined by similarity of sequence, structure, or profile-HMM.
e.g., ATPase functioning in ETC vs. ATPase functioning in DNA replication.
![Page 11: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/11.jpg)
Click on the link provided in your
notebook.
![Page 12: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/12.jpg)
You know the Drill!
Enter your FASTA format amino acid sequence
Change E-value to 0.001
“Click”
![Page 13: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/13.jpg)
WAIT…this can sometimes take a while
![Page 14: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/14.jpg)
RESULTS!
Notice there may be two types of results based on your designated E-value:
Significant and insignificant matches. Only investigate significant matches.
NOTE: Insignificant matches may have a valid E-value, but Pfam considers the result insignificant because the alignment is very short, and it has detected and flagged this.
If you do not have any significant matches, make a note of this in your notebook by creating a COMMENTS section, entering “No significant hits”.
Be sure your search criteria were entered correctly (e.g., E-value of 0.001)
Graphic view of domain organization
![Page 15: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/15.jpg)
Investigate SIGNIFICANT matches
Click on [Show] to view the “pairwise alignment” for the Pfam match
Copy/paste this pair-wise alignment into designated box in your notebook.
![Page 16: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/16.jpg)
Top row (#HMM): all capital letters indicate conserved residues in the HMM consensus sequence.
Middle row (#MATCH): identical or functionally conserved (similar) amino acids.
Bottom row (#SEQ): query sequence aligned to the HMM representing the domain/family.
How do I interpret the alignment?
Legend for #MATCH
• Upper case = identical match (conserved and high frequency)
• Lower case = identical match (conserved but low frequency)
• + symbol = functionally similar (e.g., aspartic acid vs. glutamic acid)
• Space = no match
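The legend can be applied mechanically to a #MATCH line. The sketch below tallies each character type; the function names and the example match line are hypothetical and exist only to illustrate the four legend categories.

```python
from collections import Counter


def classify_match_char(ch):
    """Map one #MATCH character to its legend category."""
    if ch == " ":
        return "no match"
    if ch == "+":
        return "functionally similar"
    if ch.isupper():
        return "identical (high frequency)"
    if ch.islower():
        return "identical (low frequency)"
    return "other"


def summarize_match_line(match_line):
    """Count how many positions fall into each legend category."""
    return Counter(classify_match_char(ch) for ch in match_line)


# Made-up #MATCH line for illustration only.
counts = summarize_match_line("C+ C aG  +k")
for category, n in counts.items():
    print(category, n)
```

Positions in the "identical (high frequency)" category are the first candidates to examine when looking for conserved residues.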
What is an HMM consensus sequence?
![Page 17: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/17.jpg)
The HMM consensus sequence
Right “Click” Pfam link & open in new tab
On Pfam family summary page, click on “Alignments”
![Page 18: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/18.jpg)
The HMM consensus sequence
Full: Total number of sequences in database that have been categorized into this Pfam family
Seed: Number of sequences within multiple sequence alignment representing architectural variations within a single Pfam family
What does this mean?
![Page 19: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/19.jpg)
Architecture Diversity
• Domain organization within context of full protein
![Page 20: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/20.jpg)
Leave default settings and press the [View] button
The HMM consensus sequence
![Page 21: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/21.jpg)
Click on [Start Jalview] button to view the multiple sequence alignment
The HMM consensus sequence
A new window will pop up as shown:
![Page 22: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/22.jpg)
TOO MANY COLORS! How do we read this?!?
The HMM consensus sequence
Another new window will pop up as shown:
![Page 23: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/23.jpg)
Let’s make the view more manageable by simplifying the colors. . .
The HMM consensus sequence
Select “Percentage Identity” from menu.
NOTE: Take the time to browse
other color schemes to learn more about your
protein.
![Page 24: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/24.jpg)
Pay special attention to
BOTTOM graph: Consensus
sequence for protein family
This view reveals the amount of conservation in your amino acid sequence.
Dark = highest frequency
Light = lower frequency
Letters show which amino acids occur most frequently at that position.
This consensus sequence is used to construct the HMM
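To make the idea of a consensus concrete, the sketch below takes a tiny multiple sequence alignment and reports, for each column, the most frequent amino acid and how often it occurs. The function name and the four toy sequences are hypothetical, and this simple majority count is an illustration of the concept, not a reproduction of how Jalview or Pfam compute their displays.

```python
from collections import Counter


def column_consensus(alignment):
    """Return a list of (most frequent residue, frequency) for each column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]  # ignore gap characters
        if not residues:
            result.append(("-", 0.0))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # Frequency is relative to all sequences, so gaps lower it.
        result.append((residue, count / len(column)))
    return result


# Toy alignment, made up for illustration.
toy_alignment = ["CPHCG", "CPKCG", "CAHCS", "CPHCG"]
consensus = column_consensus(toy_alignment)
print("".join(residue for residue, _ in consensus))
print([freq for _, freq in consensus])
```

Columns with a frequency of 1.0 correspond to the darkest, tallest positions in the consensus graph.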
The HMM consensus sequence
![Page 25: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/25.jpg)
Return to Summary page for Pfam family
What else do I need for my notebook? Pfam name and Pfam number
Pfam number
Abbreviated Pfam name
Full Pfam name
Copy/paste full & abbreviated Pfam name as well as Pfam number into your lab notebook
![Page 26: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/26.jpg)
Note: Pay Attention to possible 3D Image
• You may see a 3D image when you view your summary.
• If you see this image, then this is your first clue that you should expect to have significant hits in the PDB search (next section of this module).
• If you don’t see an image, then this suggests no structure has yet been solved for proteins containing the domain identified by Pfam.
![Page 27: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/27.jpg)
What else do I need for my notebook? HMM Logo
On Summary page, click on “HMM logo”
![Page 28: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/28.jpg)
SAVE this image in .png format and insert into your notebook.
What else do I need for my notebook? HMM Logo
![Page 29: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/29.jpg)
How do we interpret the HMM Logo?
HMM Logo:
-- Highly conserved amino acids are represented by wide letters
-- Amino acids with a high frequency of occurrence in the alignment used to generate the HMM consensus sequence are represented by tall letters
![Page 30: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/30.jpg)
Return to Summary page:
What else do I need for my notebook? Clan name and number
Click BROWSE to search for clan information
Use key words from Pfam family name for clan search
![Page 31: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/31.jpg)
Investigate possible clans based on key word search from Pfam family description.
To learn more about the clan, click on hyperlink for more clan information.
What else do I need for my notebook? Clan name and number
![Page 32: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/32.jpg)
Abbreviated Clan name
Clan number
Tells you which Pfam families belong to this clan. If the Pfam family to which your protein belongs is not in this
list, then your protein is NOT a member of this clan.
What else do I need for my notebook? Clan name and number
Full Clan name
NOTE: Not all Pfam families belong to a clan. If no clan is found, enter “None found” in your lab notebook.
![Page 33: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/33.jpg)
What else do I need for my notebook? Key functional residues
You have THREE key tools to assist you in identifying the KEY FUNCTIONAL RESIDUES of your protein.
Tool #1: Pairwise Alignment
Tool #2: HMM Logo
Tool #3: Jalview consensus
![Page 34: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/34.jpg)
Capital letter in #MATCH line
Tall, wide letter in HMM logo
Tall bar in graphical depiction of consensus sequence
How do we identify key functional residues?
![Page 35: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/35.jpg)
Formula: AA(start + HMM# - 1)
Example:
C(47 + 8 - 1) = C54
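The notation formula is simple arithmetic and can be wrapped in a helper to avoid off-by-one mistakes when recording many residues. The function and parameter names below are hypothetical; the calculation is exactly the formula above.

```python
def residue_label(amino_acid, start, hmm_number):
    """Build a residue label such as 'C54' from AA(start + HMM# - 1)."""
    return f"{amino_acid}{start + hmm_number - 1}"


print(residue_label("C", 47, 8))  # C54
```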
How do we report key functional residues in the notebook?
HMM#
![Page 36: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/36.jpg)
1. Use the HMM pair-wise alignment to identify possible key functional residues.
2. Use the HMM Logo and Jalview alignment tools to verify key functional residues.
3. Scan the entire amino acid sequence and record all key functional residues using proper notation.
SUMMARY: Identifying key functional residues
![Page 37: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/37.jpg)
Recording results in your Lab Notebook
Scroll down
![Page 38: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/38.jpg)
Recording results in your Lab Notebook
![Page 39: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/39.jpg)
REPEAT procedure for all significant Pfam hits
3 hits = 3 notebook entries
![Page 40: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/40.jpg)
PDB: Protein Data Bank
o Worldwide depository for three-dimensional structures of large biological molecules, including proteins and nucleic acids
o Contains information about structure such as:
• sequence details
• atomic coordinates
• crystallization conditions
• 3-D structure neighbors
• derived geometric data
• structure factors
• 3-D images
Berman et al. (2003) Nature Structural Biology 10: 980.
![Page 41: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/41.jpg)
Click on the link provided in your
notebook.
![Page 42: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/42.jpg)
Select “Advanced Search”
![Page 43: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/43.jpg)
Select “Sequence (Blast/Fasta)” option
Copy/paste your FASTA format protein sequence into query box
Click when ready to initiate search
Change E-value cutoff to 0.001
![Page 44: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/44.jpg)
Scroll down
Results of PDB Search
Search hits listed by ascending E-value
![Page 45: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/45.jpg)
Evaluating PDB Results
Alignment and statistics
Assess the quality of the alignment:
• Is the E-value less than 10⁻³?
• Is a significant proportion of the protein aligned? (Hint: compare the alignment length to the total length.)
If so, it is a good hit.
PDB NAME
PDB CODE
Thumbnail of 3D structure. Click on it to get a high-resolution image for your notebook.
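The two quality questions can be checked together in code. This is a rough sketch: the function name is hypothetical, and the 50% coverage cutoff is an assumption chosen for illustration, since the slides do not say how much of the protein counts as a "significant proportion".

```python
def is_good_pdb_hit(e_value, alignment_length, query_length,
                    max_e_value=1e-3, min_coverage=0.5):
    """Return (good, coverage) for a PDB sequence-search hit.

    min_coverage is an assumed cutoff, not a value given in the slides.
    """
    coverage = alignment_length / query_length
    good = e_value < max_e_value and coverage >= min_coverage
    return good, coverage


# Made-up numbers for illustration only.
print(is_good_pdb_hit(1e-20, 180, 200))  # strong E-value, most of the protein aligned
print(is_good_pdb_hit(1e-20, 30, 200))   # strong E-value, but only a short region aligned
```

A hit that passes the E-value test but covers only a small fraction of the query deserves a closer look before it is recorded.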
Citation
![Page 46: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/46.jpg)
NOTE: Revise or add headings and boxes as needed
Recording results in your Lab Notebook
Scroll down
Add to your notebook
![Page 47: Structure-based Evidence for Function (TIGRfam, Pfam and PDB)](https://reader037.vdocuments.mx/reader037/viewer/2022110208/56649de55503460f94add014/html5/thumbnails/47.jpg)
You cannot simply copy/paste the entire alignment with correct formatting into your lab notebook. DELETE THIS SECTION.