(PSI-)BLAST & MSA via Max-Planck
General Issues
• Where? (to find homologues)
  • Structural templates: search against the PDB
  • Sequence homologues: search against SwissProt or UniProt (recommended!)
• How many?
  • As many as possible, as long as the MSA looks good (next week…)
General Issues
• How long? (length of homologues)
  • Fragments: short homologues (less than 50-60% of the query’s length) give a bad alignment
  • Ensure your sequences contain the wanted domain(s)
  • The N- and C-termini tend to vary in length between homologues
• How close? (distance from the query sequence)
  • All too close: no information
  • Too many too far: bad alignment
  • Ensure that you have a balanced collection!
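The length rule above can be sketched as a simple filter. This is only an illustration: the function name `filter_fragments`, the 0.6 cutoff (the upper end of the 50-60% range) and the toy sequences are assumptions, not part of the toolkit.

```python
def filter_fragments(query, homologues, min_fraction=0.6):
    """Keep homologues whose length is at least min_fraction of the query length."""
    min_len = min_fraction * len(query)
    return {name: seq for name, seq in homologues.items() if len(seq) >= min_len}


# Toy data (made-up sequences, for illustration only).
query = "MKTAYIAKQR"  # 10 residues
hits = {
    "hit_full": "MKTAYLAKQR",    # 10 residues: kept
    "hit_long": "MKTAYIAKQRGG",  # 12 residues: kept
    "hit_frag": "AKQR",          # 4 residues (40% of the query): dropped
}

kept = filter_fragments(query, hits)
print(sorted(kept))
```

A length filter does not check that the wanted domain is covered; that still has to be verified in the alignment itself.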
General Issues
• From whom? (which species the sequence belongs to)
  • Don’t care, all homologues are welcome
  • Orthologues/paralogues may be helpful
  • Sequences from distant/close species provide different types of information
• Which method? (BLAST/PSI-BLAST)
  • Depends on the protein, the available homologues, the goal in mind…
General Issues
Rules For Choosing Sequences
• Very similar sequences have little information
• Very different sequences cause trouble…
  • <30% identical with more than half of the other sequences in the set
• Choose sequences as distantly related as possible
  • Sequence between 30-80% identical with more than half of the sequences in the set
• The more sequences the better
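The 30-80% rule can be checked on an existing alignment with a few lines of Python. This is a rough sketch under stated assumptions: identity is computed over columns where both sequences have a residue (other definitions exist), and the names `percent_identity` and `select_balanced` and the toy alignment are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity over columns where both aligned sequences have a residue."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def select_balanced(aligned, low=30.0, high=80.0):
    """Keep sequences that are low-high% identical to more than half of the others."""
    names = list(aligned)
    keep = []
    for name in names:
        others = [n for n in names if n != name]
        in_range = sum(
            low <= percent_identity(aligned[name], aligned[other]) <= high
            for other in others
        )
        if others and in_range > len(others) / 2:
            keep.append(name)
    return keep


# Toy alignment (made-up sequences of equal length).
aligned = {
    "query":   "AAAAAAAAAA",
    "homA":    "AAAAAACCCC",  # 60% identical to the query
    "homB":    "AAAACCCCDD",  # 40% identical to the query, 60% to homA
    "outlier": "WWWWWWWWWW",  # 0% identical to everything
}
print(select_balanced(aligned))
```

Here the outlier is dropped because it falls below 30% identity with every other sequence. A single pass like this does not remove near-duplicates on its own; a redundancy filter would be a separate step.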
Overall work steps
1. Run the search:
   1. Select database
   2. E-value threshold
   3. BLAST or PSI-BLAST: how many rounds?
2. Take out sequences: HSP (slider region) or full sequences
3. Align sequences: choose an alignment program
4. View the alignment with BioEdit or another program
5. Calculate trees, conservation scores (ConSurf) etc…
(PSI-)BLAST via Max-Planck
http://toolkit.tuebingen.mpg.de/sections/search
• Databases: SwissProt, TrEMBL, NR, env, PDB or any combination for proteins, but only NT for DNA.
• All BLAST programs
Main advantage: you can easily extract and filter the HSPs, in addition to the full sequences
The Query Protein
Name: Dihydrodipicolinate reductase
Enzyme reaction:
Molecular process: Lysine biosynthesis (early stages)
Organism: E. coli
Sequence length: 273 aa
The Query Protein
Query: DAPB_ECOLI
>DAPB_ECOLI
MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGELAGAGKTGVTVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTTGFDEAGKQAIRDAAADIAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEIIEAHHRHKVDAPSGTALAMGEAIAHALDKDLKDCAVYSREGHTGERVPGTIGFATVRAGDIVGEHTAMFADIGERLEITHKASSRMTFANGAVRSALWLSGKESGLFDMRDVLDLNNL
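Before uploading a query it can help to confirm that the FASTA record parses cleanly and has the expected length (273 aa for this protein, per the slide above). The parser below is a minimal sketch; `parse_fasta` is a hypothetical helper, and the example record holds only the first 20 residues of the query, wrapped over two lines.

```python
def parse_fasta(text):
    """Return a dict mapping each record ID (first word after '>') to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""
            records[name] = []
        elif name is not None:
            # Sequence lines may be wrapped; collect them and join at the end.
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


example = """>DAPB_ECOLI_first20 truncated demo record
MHDANIRVAI
AGAGGRMGRQ
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```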
(PSI-)BLAST via Max-Planck
http://toolkit.tuebingen.mpg.de/psi_blast/
Choose database or databases (select several using CTRL)
Upload sequence or MSA
(PSI-)BLAST via Max-Planck
The E-value threshold can be assessed using the distribution of E-values among the hits
Forward results to MSA
http://toolkit.tuebingen.mpg.de/sections/alignment
Forward results to MSA
• All marked hits or filter by E-value
• HSP (slider region) or full sequences
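The two choices on this slide (E-value filter, HSP versus full sequence) can be illustrated offline. This is not the toolkit's implementation: the hit records, their field names (`evalue`, `sequence`, `hsp_start`, `hsp_end`, with 1-based inclusive coordinates) and the 1e-3 cutoff are all assumptions made for the sketch.

```python
def select_for_msa(hits, max_evalue=1e-3, use_hsp=True):
    """Filter hits by E-value and return either the HSP region or the full sequence."""
    selected = {}
    for hit in hits:
        if hit["evalue"] > max_evalue:
            continue  # not significant enough under the chosen threshold
        seq = hit["sequence"]
        if use_hsp:
            # Convert 1-based inclusive HSP coordinates to a Python slice.
            seq = seq[hit["hsp_start"] - 1 : hit["hsp_end"]]
        selected[hit["name"]] = seq
    return selected


# Toy hits (made-up values, for illustration only).
hits = [
    {"name": "hitA", "evalue": 1e-50, "sequence": "MKTAYIAKQRQISFVK",
     "hsp_start": 3, "hsp_end": 12},
    {"name": "hitB", "evalue": 0.5, "sequence": "MSSLLGKQR",
     "hsp_start": 1, "hsp_end": 9},
]

print(select_for_msa(hits, use_hsp=True))
print(select_for_msa(hits, use_hsp=False))
```

HSPs keep only the region that matched the query, which avoids aligning unrelated flanking domains; full sequences keep the termini, which may matter if the whole protein is of interest.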
Align via Max-Planck
Alignment results:
Save the alignment
Alignment viewing & editing
BioEdit
• http://www.mbio.ncsu.edu/BioEdit/BioEdit.html
• Easy-to-use sequence alignment editor
• View and manipulate alignments of up to 20,000 sequences.
• Four modes of manual alignment: select and slide, dynamic grab and drag, gap insert and delete by mouse click, and on-screen typing which behaves like a text editor.
• Reads and writes GenBank, FASTA, PHYLIP 3.2, PHYLIP 4, and NBRF/PIR formats. Also reads GCG and Clustal formats.
Alignment viewing & editing
Easiest using BioEdit
http://www.mbio.ncsu.edu/BioEdit/bioedit.html
• Find a specific sequence: “Edit -> search -> in titles”
• Erase/add sequences: “Edit -> cut/paste/delete sequence”
• “Sequence Identity matrix” under “Alignment”: useful for a rough evaluation of distances within the alignment.
• After taking out sequences, “Minimize Alignment” under “Alignment” removes the gaps that are no longer needed.
• To save an image: “File -> Graphic View”, then “Edit -> Copy page as BITMAP”
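What "Minimize Alignment" does after sequences are removed can be approximated outside BioEdit by dropping columns that contain only gaps. The sketch below is an assumption about the effect, not BioEdit's code; it assumes "-" is the only gap character and that all sequences have equal length, and `minimize_alignment` is a hypothetical name.

```python
def minimize_alignment(aligned):
    """Drop alignment columns in which every remaining sequence has a gap."""
    seqs = list(aligned.values())
    if not seqs:
        return {}
    keep_cols = [
        i for i in range(len(seqs[0]))
        if any(seq[i] != "-" for seq in seqs)
    ]
    return {
        name: "".join(seq[i] for i in keep_cols)
        for name, seq in aligned.items()
    }


# Toy alignment after a sequence with a two-residue insertion was deleted:
# columns 3 and 4 are now gaps in every remaining sequence.
aligned = {
    "seq1": "MK--TAY",
    "seq2": "MR--TAF",
    "seq3": "M---TAY",
}
print(minimize_alignment(aligned))
```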
A little about ConSurf
Compute Conservation Scores
• Provide an MSA, or ConSurf will compute one for you (given a FASTA sequence, it runs BLAST and builds the MSA)
• Main advantage: filters short HSPs, removes redundant sequences
• Shows conservation scores on the sequence or on a protein structure (if available)
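To get a feel for what a per-position conservation score looks like, the sketch below computes Shannon entropy for each alignment column. This is a deliberately simpler stand-in: ConSurf itself estimates evolutionary rates with a phylogeny-aware method, so its scores will differ. The function name `column_entropies` and the toy alignment are assumptions.

```python
import math
from collections import Counter


def column_entropies(aligned_seqs):
    """Shannon entropy (in bits) of each alignment column, ignoring gaps.

    Lower entropy means a more conserved column; gap-only columns get None.
    """
    entropies = []
    for column in zip(*aligned_seqs):
        residues = [c for c in column if c != "-"]
        if not residues:
            entropies.append(None)
            continue
        total = len(residues)
        counts = Counter(residues)
        # sum of p * log2(1/p) over the residues observed in this column
        h = sum((n / total) * math.log2(total / n) for n in counts.values())
        entropies.append(h)
    return entropies


# Toy alignment (made-up sequences of equal length).
msa = ["MKA", "MRA", "MKA", "MRG"]
for position, h in enumerate(column_entropies(msa), start=1):
    print(position, h)
```

In this toy example the first column is fully conserved (entropy 0), while the second is split evenly between two residues (entropy 1 bit).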
ConSurf
http://consurf.tau.ac.il/results/1321532763/output.php
ConSurf
MSA colored by conservation
PSI-BLAST result
MSA
Phylogenetic tree
Sequences used
Sequence conservation
Jmol: easy web-based viewer
WebLogo
http://weblogo.berkeley.edu/logo.cgi
Each sequence is a different story
Adjust parameters:
• BLAST: E-value, substitution matrix, gap penalties, database, minimum length, redundancy level, fragment overlap…
• PSI-BLAST: BLAST parameters + PSSM inclusion threshold (or choose manually), number of rounds…
• Try using HSPs or full sequences, different MSA programs…
No “miracle solution”