Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
![Page 1: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/1.jpg)
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
Presented by 佘健生, 鄭為正, 李定達, 曾文鴻
![Page 2: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/2.jpg)
Outline
• BLAST 1.0
• BLAST 2.0
– The two-hit method
– Gapped alignment
– PSI-BLAST
• Performance evaluation
• Discussion and Conclusion
• NCBI website
![Page 3: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/3.jpg)
Statistical preliminaries
• HSP: high-scoring segment pair
– Locally optimal pair
• $S' = (\lambda S - \ln K) / \ln 2$
– $S'$: normalized score
– $P_i$: background probability with which amino acid $i$ occurs at random at any position
– $s_{ij}$: score for aligning the pair of amino acids $i$ and $j$
– $K$: minor constant
– $\lambda$: constant that adjusts for the scale of the scoring matrix
– $s_{ij}$ and $P_i$ determine $K$ and $\lambda$
![Page 4: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/4.jpg)
• $E = N / 2^{S'}$
– $E$: number of distinct HSPs with normalized score at least $S'$
– $N = mn$ is the search space
– $S' = \log_2(N/E)$
• $q_{ij} = P_i P_j e^{\lambda_u s_{ij}}$
– $q_{ij}$: target frequency of the aligned pair of letters $(i, j)$ within HSPs (high-scoring segment pairs)
– $\lambda_u$: the unique positive number for which the $q_{ij}$ sum to 1
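The relations between raw score, normalized score, and E-value on these two slides can be checked numerically. The following is a minimal sketch; the function names are hypothetical, and λ and K must be supplied by the caller because they depend on the scoring matrix and background frequencies.

```python
import math

def bit_score(raw_score, lam, K):
    """Normalized score S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(K)) / math.log(2)

def expected_hsps(bits, m, n):
    """E = N / 2**S', where N = m * n is the search space."""
    return (m * n) / 2.0 ** bits

def bits_for_evalue(e_value, m, n):
    """Inverse relation S' = log2(N / E)."""
    return math.log2((m * n) / e_value)

# Toy numbers chosen so the arithmetic is easy to follow:
# a 32 x 32 search space and a normalized score of 10 bits.
print(expected_hsps(10, 32, 32))    # N = 1024, so E = 1024 / 2**10
print(bits_for_evalue(1.0, 32, 32))
```

With λ = ln 2 and K = 1 the normalized score equals the raw score, which is a convenient sanity check for the first function.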
![Page 5: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/5.jpg)
BLAST
• Basic Local Alignment Search Tool (by Altschul, Gish, Miller, Myers and Lipman)
• The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities.
• BLAST is a heuristic that attempts to optimize a specific similarity measure.
• The central idea of the BLAST algorithm is that a statistically significant alignment is likely to contain a high-scoring pair of aligned words.
![Page 6: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/6.jpg)
The maximal segment pair measure
• MSP (maximal segment pair): the highest-scoring pair of identical-length segments chosen from 2 sequences
– for DNA: identities: +5; mismatches: -4
– for protein: BLOSUM62 …
• BLAST heuristically attempts to calculate the MSP score.
• DP is O(mn), but BLAST is O(m)
![Page 7: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/7.jpg)
BLAST 1.0
1) Build the hash table for Sequence A.
2) Scan Sequence B for hits.
3) Extend hits.
![Page 8: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/8.jpg)
Step 1: Build the hash table for Sequence A. (3-tuple example)
For DNA:
Seq. A = ACGTAGTA
positions 12345678
Hash table (3-tuple, followed by its start positions in Seq. A):
AAA
AAC
..
ACG 1
..
AGT 5
..
CGT 2
..
GTA 3 6
..
TAG 4
..
TTT
For protein:
Seq. A = YGGFM
Add xyz to the hash table if Score(xyz, YGG) ≧ T;
Add xyz to the hash table if Score(xyz, GGF) ≧ T;
Add xyz to the hash table if Score(xyz, GFM) ≧ T;
T: 'threshold' parameter. A high T yields greater speed, but weak similarities may be missed.
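The DNA example can be reproduced with a few lines of code. This is a minimal sketch using a Python dictionary as the hash table; the function name `build_word_index` is hypothetical, and positions are 1-based to match the slide.

```python
from collections import defaultdict

def build_word_index(seq, w=3):
    """Map every length-w word of seq to the list of its 1-based start positions."""
    index = defaultdict(list)
    for start in range(len(seq) - w + 1):
        index[seq[start:start + w]].append(start + 1)
    return dict(index)

index = build_word_index("ACGTAGTA")
for word in sorted(index):
    print(word, index[word])
```

Only words that actually occur in Seq. A get an entry; words such as AAA or TTT are simply absent from the dictionary.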
![Page 9: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/9.jpg)
List all words in query
YGGFMTSEKSQTPLVTLFKNAIIKNAHKKGQ
YGG GGF GFM FMT MTS TSE SEK …
![Page 10: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/10.jpg)
Augment word list
YGGFMTSEKSQTPLVTLFKNAIIKNAHKKGQ
YGG GGF GFM FMT MTS TSE SEK …
AAA AAB AAC
…
YYY
![Page 11: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/11.jpg)
BLOSUM62 scores
G G F
G G Y
6 + 6 + 3 = 15 (match)
G G F
A A A
0 + 0 + (-2) = -2 (non-match)
A user-specified threshold determines which three-letter words are considered matches and non-matches.
![Page 12: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/12.jpg)
YGGFMTSEKSQTPLVTLFKNAIIKNAHKKGQ
YGG GGF GFM FMT MTS TSE SEK …
GGI GGL GGM GGF GGW GGY
…
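The augmentation step (listing every word that scores at least T against a query word) can be sketched as follows. This is an illustration under stated assumptions: only the G and F rows of BLOSUM62 are embedded, transcribed from the standard table and worth re-checking against an authoritative copy, so the sketch only handles query words made of G and F. The names `word_score` and `neighborhood` are hypothetical.

```python
from itertools import product

AMINO = "ARNDCQEGHILKMFPSTWYV"

# Two rows of BLOSUM62 (assumed transcription), in the column order of AMINO.
ROWS = {
    "G": [0, -2, 0, -1, -3, -2, -2, 6, -2, -4, -4, -2, -3, -3, -2, 0, -2, -2, -3, -3],
    "F": [-2, -3, -3, -3, -2, -3, -3, -3, -1, 0, 0, -3, 0, 6, -4, -2, -2, 1, 3, -1],
}

def word_score(query_word, candidate):
    """Sum of pairwise scores between a query word and a candidate word."""
    return sum(ROWS[q][AMINO.index(x)] for q, x in zip(query_word, candidate))

def neighborhood(query_word, T):
    """All words of the same length scoring at least T against query_word."""
    words = ("".join(w) for w in product(AMINO, repeat=len(query_word)))
    return sorted(w for w in words if word_score(query_word, w) >= T)

print(word_score("GGF", "GGY"))   # the 6 + 6 + 3 example from the slide
print(word_score("GGF", "AAA"))   # the 0 + 0 + (-2) example from the slide
print(neighborhood("GGF", 13))
```

Lowering T enlarges the neighborhood quickly, which is the speed versus sensitivity trade-off controlled by the threshold.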
![Page 13: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/13.jpg)
Store words in search tree
Search tree
Augmented list of query words
“Does this query contain GGF?”
“Yes, at position 2.”
O(1) time
![Page 14: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/14.jpg)
Search tree
[Figure: tree with root branch G, then G, then leaves L, M, F, W, Y]
![Page 15: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/15.jpg)
Scan the database
[Figure: dot plot with axes "Database sequence" and "Query sequence"; each x marks a word hit]
![Page 16: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/16.jpg)
Extend hit
Query sequence: L P P Q G L L
Database sequence: M P P E G L L
Word (P Q G aligned with P E G), BLOSUM62 scores: 7 2 6, word score = 15
Extended in both directions, BLOSUM62 scores: 2 7 7 2 6 4 4, HSP score = 32
This is done by extending a hit in both directions, until the running alignment's score has dropped more than X below the maximum score yet attained.
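The extension rule (keep extending until the running score falls more than X below the best score seen so far) can be sketched for DNA, using the +5/-4 identity/mismatch scores mentioned earlier. The function names and the example sequences are hypothetical; this is an illustration of the X-drop idea, not the BLAST source code.

```python
def _best_run(pair_scores, x_drop):
    """Return (gain, length) of the best-scoring prefix of pair_scores.

    Scanning stops once the running score is more than x_drop below the best.
    """
    best = run = 0
    best_len = 0
    for n, s in enumerate(pair_scores, start=1):
        run += s
        if run > best:
            best, best_len = run, n
        elif best - run > x_drop:
            break
    return best, best_len

def extend_hit(a, b, i, j, w, x_drop, match=5, mismatch=-4):
    """Ungapped extension of the word hit a[i:i+w] / b[j:j+w] (0-based).

    Returns (score, start_in_a, end_in_a) with end exclusive.
    """
    def score(x, y):
        return match if x == y else mismatch

    word = sum(score(a[i + k], b[j + k]) for k in range(w))
    right = (score(x, y) for x, y in zip(a[i + w:], b[j + w:]))
    left = (score(x, y) for x, y in zip(reversed(a[:i]), reversed(b[:j])))
    r_gain, r_len = _best_run(right, x_drop)
    l_gain, l_len = _best_run(left, x_drop)
    return word + l_gain + r_gain, i - l_len, i + w + r_len

print(extend_hit("CCACGTTGCA", "GGACGTAGCT", 2, 2, 3, x_drop=10))
print(extend_hit("CCACGTTGCA", "GGACGTAGCT", 2, 2, 3, x_drop=3))
```

A larger X lets the extension ride through a mismatch to reach further matches; a smaller X stops earlier and saves time.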
![Page 17: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/17.jpg)
BLAST 2.0: The two-hit method
• BLAST 1.0
– The extension step typically accounts for >90% of BLAST's execution time
• Observations:
– An HSP of interest is much longer than a single word pair
– It may therefore entail multiple hits on the same diagonal and within a short distance of one another
• Invoke an extension only when two non-overlapping hits are found within distance A on the same diagonal
![Page 18: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/18.jpg)
• Recent[i]: the most recent hit found on the ith diagonal (always increasing)
– New hit overlaps the most recent hit: ignore
– Distance < A: extend!
– Distance > A: no extension
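The bookkeeping with one "most recent hit" per diagonal can be sketched as below. The diagonal of a hit is taken as database position minus query position; the function name, the default `A=40`, and the example hits are assumptions made for illustration.

```python
def two_hit_triggers(hits, w=3, A=40):
    """Return the hits that would trigger an extension under the two-hit rule.

    hits: iterable of (query_pos, db_pos) word hits, ordered by db_pos.
    """
    recent = {}      # diagonal -> db position of the most recent hit
    triggers = []
    for q, d in hits:
        diag = d - q
        last = recent.get(diag)
        if last is not None:
            dist = d - last
            if dist < w:
                continue                  # overlaps the most recent hit: ignore
            if dist < A:
                triggers.append((q, d))   # second non-overlapping hit within A
        recent[diag] = d
    return triggers

hits = [(0, 10), (1, 11), (20, 30), (5, 100), (60, 155)]
print(two_hit_triggers(hits))
```

In the example, (1, 11) overlaps (0, 10) and is ignored, (20, 30) lies 20 positions further along the same diagonal and triggers an extension, and the two hits on the other diagonal are 55 apart, which exceeds A.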
![Page 19: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/19.jpg)
• T must be lowered
– One-hit: W=3, T=13
– Two-hit: W=3, T=11
– More single hits are found, but the majority are dismissed
• Sensitivity
– For HSPs with at least 33 bits, the two-hit heuristic is more sensitive
• Speed (two-hit):
– Generates on average ~3.2 times as many hits, but only ~0.14 times as many hit extensions (for every hit, the program must still decide whether it needs to be extended)
– Twice as rapid as one-hit
![Page 20: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/20.jpg)
Gapped alignment
• Original BLAST: find several distinct HSPs
– All HSPs related to one alignment should be found
• Gapped BLAST: tolerate a much higher chance of missing any single moderately scoring HSP
– Seeking a single gapped alignment, rather than a collection of ungapped ones
– For example, the combined result should be found with probability > 0.95; p: probability of missing an HSP
• Original with 2 HSPs: (1-p)(1-p) > 0.95, so p < 0.025
• Now: p² < 0.05, so p < 0.22
– T can be raised, which makes the search faster
• Now:
– Find one HSP only
– Seed, then use two-hit
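The two bounds on the miss probability p follow from a short calculation, written out here for an alignment that contains two HSPs, assuming the two are missed independently:

$$
\begin{aligned}
\text{both HSPs required:}\quad & (1-p)^2 > 0.95 \;\Rightarrow\; p < 1 - \sqrt{0.95} \approx 0.025 \\
\text{either HSP suffices:}\quad & 1 - p^2 > 0.95 \;\Rightarrow\; p^2 < 0.05 \;\Rightarrow\; p < \sqrt{0.05} \approx 0.22
\end{aligned}
$$

The tolerable miss probability per HSP therefore grows by almost a factor of nine when a single HSP is enough to seed the gapped alignment.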
![Page 21: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/21.jpg)
Gapped alignment (contd)
• A gapped extension takes much longer to execute than an ungapped extension, but by performing very few of them the fraction of the total time could be kept low.
• Trigger a gapped extension for any HSP exceeding score Sg
• Sg should be set at ~22 bits (1:50)
![Page 22: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/22.jpg)
Original BLAST locates only the first and the last ungapped alignments; their combined E-value is more than 50 times larger than that of the gapped alignment.
![Page 23: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/23.jpg)
Gapped Local Alignments
• http://binfo.ym.edu.tw/post/internet/gap_blast.htm
![Page 24: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/24.jpg)
Before Gap Insertion
actaactattacagactaactattacagactaactataca
|||||||||||| ||||||||   | |        | |
actaactattacggactaacttacagactaactaaaca
Percent Identity = 24/40 = 0.6
After Gap Insertion
actaactattacagactaactattacagactaactataca
|||||||||||| ||||||||  ||||||||||||| |||
actaactattacggactaact--tacagactaactaaaca
Percent Identity = 36/40 = 0.9
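The two percent-identity figures can be recomputed directly. This is a small sketch; the sequences are read off the slide (with the duplicated overlay text removed), and the helper name `identities` is hypothetical.

```python
def identities(top, bottom):
    """Count aligned positions where both sequences carry the same residue."""
    return sum(1 for x, y in zip(top, bottom) if x == y and x != "-")

top      = "actaactattacagactaactattacagactaactataca"
ungapped = "actaactattacggactaacttacagactaactaaaca"
gapped   = "actaactattacggactaact--tacagactaactaaaca"

for label, bottom in (("before", ungapped), ("after", gapped)):
    n = identities(top, bottom)
    print(label, n, n / len(top))
```

Inserting the two-residue gap puts the second half of the sequences back in register, which is where the twelve additional identities come from.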
![Page 25: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/25.jpg)
• Start from a single aligned pair of residues, called the seed.
Gapped Local Alignments
![Page 26: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/26.jpg)
Gapped extension
– Find the ungapped region with the highest alignment score.
– If the score of the ungapped region exceeds Sg, try a gapped extension using DP
– Use its central residue pair as the seed.
– Gapped extension is invoked less than once per 50 database sequences.
![Page 27: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/27.jpg)
PSSM
![Page 28: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/28.jpg)
• Conserved regions
– Same protein family
– Some regions are very similar
– The structure and functionality typical of this family
![Page 29: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/29.jpg)
From: http://bioweb.pasteur.fr/seqanal/blast/intro-uk.html
PSI-BLAST (Position-Specific Iterated BLAST)
[1] Select a query and search it against a protein database
[2] PSI-BLAST constructs a multiple sequence alignment, then creates a "profile" or specialized position-specific scoring matrix (PSSM)
[3] The PSSM is used as a query against the database
[4] PSI-BLAST estimates statistical significance (E values)
[5] Repeat steps [3] and [4] iteratively, typically 5 times. At each new search, a new profile is used as the query.
![Page 30: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/30.jpg)
Score matrix architecture
• Each matrix has length precisely equal to that of the original query sequence.
![Page 31: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/31.jpg)
Multiple alignment construction
• Collect the database sequence segments aligned to the query with E-value < 0.01 in the BLAST output.
• Any row identical to the query segment with which it aligns is purged.
• Only one copy is retained of any rows that are more than 98% identical to one another.
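The two purging rules can be sketched as follows. This is an illustration under simplifying assumptions: each row is a string already aligned to the full query, '-' marks a position where a row has no residue, and identity is measured over the columns where both rows have a residue. The function names are hypothetical.

```python
def identity(row_a, row_b):
    """Fraction of identical residues over columns where both rows have a residue."""
    pairs = [(x, y) for x, y in zip(row_a, row_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)

def purge_rows(query, rows, max_identity=0.98):
    """Drop rows identical to the query; keep one copy of near-identical rows."""
    kept = []
    for row in rows:
        if identity(query, row) == 1.0:
            continue        # identical to the query segment it aligns with
        if any(identity(row, other) > max_identity for other in kept):
            continue        # more than 98% identical to a row already kept
        kept.append(row)
    return kept

query = "ACDEFGHIKL"
rows = ["ACDEFGHIKL", "ACDEFGHIKV", "ACDEFGHIKV", "MCDEYGHIKL"]
print(purge_rows(query, rows))
```

The first row is a copy of the query and the third duplicates the second, so two rows survive.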
![Page 32: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/32.jpg)
Multiple alignment construction
• Pairwise alignment columns that involve gap characters inserted into the query are simply ignored.
• So M has exactly the same length as the query.
![Page 33: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/33.jpg)
Multiple alignment construction
• The matrix scores for a given alignment column C should depend not only upon the residues appearing there, but also upon those in other columns.
• For each column C, build a reduced multiple alignment MC: the set R of sequences it includes is exactly those that contribute a residue to column C.
• The columns of MC are just those columns of M in which all the sequences of R are represented.
![Page 34: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/34.jpg)
![Page 35: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/35.jpg)
Sequence weights
• A large set of closely related sequences carries little more information than a single member, but its size may allow it to outvote a small number of more divergent sequences.
• One remedy is to assign weights to the sequences.
• Gap characters are treated as a 21st distinct character.
![Page 36: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/36.jpg)
Sequence weights
• In constructing matrix scores, not only a column's observed residue frequencies are important, but also the effective number of independent observations they constitute.
• Estimate the relative number NC of independent observations constituted by the alignment MC.
• NC: the mean number of different residue types observed in the columns of MC.
![Page 37: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/37.jpg)
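• Given a large number of independent sequences, the estimate of $Q_i$ should converge simply to the observed frequency $f_i$ of residue $i$ in that column.
• Pseudocount frequencies $g_i$:
$$g_i = \sum_j \frac{f_j}{P_j}\, q_{ij}$$
• Estimate $Q_i$ by:
$$Q_i = \frac{\alpha f_i + \beta g_i}{\alpha + \beta}, \qquad \alpha = N_C - 1$$
– $\beta$: weight given to the pseudocounts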
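The pseudocount estimate can be tried on a toy alphabet. The sketch below assumes the form used in the Gapped BLAST paper, Q_i = (α f_i + β g_i) / (α + β) with g_i = Σ_j (f_j / P_j) q_ij and α = N_C - 1; the two-letter alphabet, the numbers, and the function name `estimate_Q` are made up for illustration.

```python
def estimate_Q(f, P, q, N_C, beta):
    """Blend observed frequencies f with pseudocount frequencies g.

    f: observed (weighted) residue frequencies in the column
    P: background frequencies
    q: target frequencies q[i][j]
    N_C: effective number of independent observations
    beta: weight given to the pseudocounts
    """
    n = len(f)
    alpha = N_C - 1
    g = [sum(f[j] / P[j] * q[i][j] for j in range(n)) for i in range(n)]
    return [(alpha * f[i] + beta * g[i]) / (alpha + beta) for i in range(n)]

# Toy two-letter alphabet: the column contains only the first letter.
f = [1.0, 0.0]
P = [0.5, 0.5]
q = [[0.4, 0.1],
     [0.1, 0.4]]

print(estimate_Q(f, P, q, N_C=1, beta=10))    # alpha = 0: pseudocounts only
print(estimate_Q(f, P, q, N_C=11, beta=10))   # alpha = beta: equal blend
```

As N_C grows, α dominates β and the estimate moves toward the observed frequencies, which is the convergence behaviour the slide asks for.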
![Page 38: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/38.jpg)
Performance Evaluation
![Page 39: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/39.jpg)
![Page 40: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/40.jpg)
Gapped BLAST:
1. 3X faster than original BLAST, finds more
2. >100X faster than S-W (Smith-Waterman), misses only 8, same scores
PSI-BLAST:
1. Faster than original BLAST, 40X faster than S-W, much more sensitive
2. Multiple iterations are even better; better for the NCBI non-redundant database
3. Slower than gapped BLAST: time for construction of the PSSM
![Page 41: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/41.jpg)
PSI-BLAST Examples (1)
1. The two proteins have been shown to be structurally similar, but with HIT as the query, a BLAST search of SWISS-PROT reveals hits with E<0.01 only to other HIT proteins.
2. A PSI-BLAST search using the generated PSSM yields an E-value of 2×10^-4 for uridylyltransferase.
![Page 42: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/42.jpg)
PSI-BLAST Examples (2): BRCT proteins
![Page 43: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/43.jpg)
![Page 44: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/44.jpg)
Seven recent additions to the protein databases as members of BRCT superfamily
![Page 45: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/45.jpg)
Discussion
![Page 46: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/46.jpg)
Possible future improvement: Gap costs
• Allows a gap to involve residues in both sequences rather than just one
• A gap in which k residues are inserted or deleted and j pairs of residues are left unaligned receives the score –(a+bk+cj)
![Page 47: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/47.jpg)
Possible future improvement: Realignment
• Instead of combining every pairwise alignment that exceeds the threshold into a single multiple alignment, select only the most significant ones to construct the initial multiple alignment and PSSM, then use that PSSM to rescore and realign database sequences that received lower scores
• Advantages
– Improves weaker pairwise alignments
– False positives can be downgraded by an improved matrix
– False negatives can have their scores increased
![Page 48: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/48.jpg)
Conclusion
• The gapped version of BLAST is faster than the original one, and is able to produce gapped alignments.
• PSI-BLAST greatly increases sensitivity to weak but biologically relevant sequence relationships.
• PSI-BLAST retains the ability to report accurate statistics, runs per iteration in time not much greater than gapped BLAST, and can be used both iteratively and fully automatically.
![Page 49: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/49.jpg)
NCBI
• Books
• PubMed
• BLAST
(1) Nucleotide
-- Quickly search for highly similar sequences
-- Nucleotide-nucleotide BLAST
(2) Protein
-- Protein-protein BLAST
(3) Translated
-- Translated query vs. protein database
(4) Special
-- Align two sequences
![Page 50: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/50.jpg)
![Page 51: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/51.jpg)
![Page 52: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/52.jpg)
![Page 53: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/53.jpg)
![Page 54: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/54.jpg)
![Page 55: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/55.jpg)
![Page 56: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/56.jpg)
![Page 57: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/57.jpg)
Database searching
[Figure: a query and a sequence database are fed to a sequence comparison algorithm, which returns targets ranked by score]
![Page 58: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/58.jpg)
![Page 59: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/59.jpg)
![Page 60: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/60.jpg)
![Page 61: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/61.jpg)
![Page 62: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/62.jpg)
![Page 63: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/63.jpg)
![Page 64: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/64.jpg)
![Page 65: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/65.jpg)
![Page 66: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/66.jpg)
![Page 67: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/67.jpg)
![Page 68: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/68.jpg)
![Page 69: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/69.jpg)
![Page 70: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/70.jpg)
![Page 71: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/71.jpg)
![Page 72: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/72.jpg)
![Page 73: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/73.jpg)
![Page 74: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/74.jpg)
![Page 75: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/75.jpg)
![Page 76: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/76.jpg)
![Page 77: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/77.jpg)
![Page 78: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/78.jpg)
![Page 79: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/79.jpg)
![Page 80: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/80.jpg)
![Page 81: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/81.jpg)
![Page 82: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/82.jpg)
![Page 83: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/83.jpg)
![Page 84: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/84.jpg)
![Page 85: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/85.jpg)
![Page 86: Gapped BLAST and PSI-BLAST : a new generation of protein database search programs](https://reader036.vdocuments.mx/reader036/viewer/2022081502/56815956550346895dc6921a/html5/thumbnails/86.jpg)
Question Set of Final Exam
• 1. Explain the principle that lets BLAST quickly find sequences in a database.
• 2. What is the difference between two-hit and one-hit?
• 3. Briefly describe the improvements PSI-BLAST makes over BLAST.