sequence alignments complete coverage

Genomics. Sequence Alignment: Complete Coverage. S. Prasanth Kumar, Dept. of Bioinformatics, Applied Botany Centre (ABC), Gujarat University, Ahmedabad, INDIA. Follow me on: www.facebook.com/Prasanth Sivakumar. Access my resources in SlideShare: prasanthperceptron. Contact me: prasanthbioinformatics@gmail.com

Upload: prasanthperceptron

Post on 11-May-2015


DESCRIPTION

Global and Local alignment manual interpretations

TRANSCRIPT

Page 1: Sequence alignments complete coverage

S. Prasanth Kumar, Bioinformatician

Genomics

Sequence Alignment: Complete Coverage-I

S.Prasanth Kumar Dept. of Bioinformatics Applied Botany Centre (ABC) Gujarat University, Ahmedabad, INDIA

www.facebook.com/Prasanth Sivakumar

FOLLOW ME ON

ACCESS MY RESOURCES IN SLIDESHARE

prasanthperceptron

CONTACT ME

prasanthbioinformatics@gmail.com

Page 2: Sequence alignments complete coverage

Alignment scoring schemes

Alignment of ATCGGATCT and ACGGACT

match: +2, mismatch: -1, indel: -2

6 matches, 1 mismatch, and 2 indels

$6 \times 2 + 1 \times (-1) + 2 \times (-2) = 7$
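The score above can be checked column by column. The sketch below is a minimal illustration, not the slide's own procedure: the function name `score_alignment` is made up here, and the gapped alignment passed to it is an assumption, chosen only because it yields 6 matches, 1 mismatch, and 2 indels for these two sequences (the slide's figure is not part of the transcript).

```python
def score_alignment(top, bottom, match=2, mismatch=-1, indel=-2):
    """Score a gapped pairwise alignment, one column at a time."""
    if len(top) != len(bottom):
        raise ValueError("aligned strings must have equal length")
    total = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            total += indel      # gap in either sequence
        elif x == y:
            total += match      # identical characters
        else:
            total += mismatch   # different characters
    return total


# Assumed alignment of ATCGGATCT and ACGGACT (not taken from the slide):
#   ATCGGATCT
#   A-CGG-ACT
print(score_alignment("ATCGGATCT", "A-CGG-ACT"))  # prints 7
```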

Page 3: Sequence alignments complete coverage

Optimal alignment of two sequences

Brute Force Method

Suppose there are two sequences X and Z to be aligned, where $|X| = m$ and $|Z| = n$. If gaps are allowed in the sequences, then the potential length of both the first and second sequences is $m + n$.

$2^{m+n}$ subsequences with spaces for the sequence X

$2^{m+n}$ subsequences with spaces for the sequence Z

Alignment = $2^{m+n} \times 2^{m+n} = 2^{2(m+n)} = 4^{m+n}$ comparisons
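To get a feel for how quickly this count grows, the short sketch below simply evaluates the slide's brute-force estimate for a few sequence lengths. The function name `brute_force_comparisons` and the example lengths are illustrative assumptions.

```python
def brute_force_comparisons(m, n):
    """Brute-force comparison count from the slide: 2^(m+n) * 2^(m+n) = 4^(m+n)."""
    return 4 ** (m + n)


# Example lengths are arbitrary; (9, 7) matches the two sequences scored earlier.
for m, n in [(2, 2), (9, 7), (50, 50)]:
    print(m, n, brute_force_comparisons(m, n))
```

Even for two sequences of 9 and 7 characters the estimate exceeds four billion comparisons, which is why brute force is impractical.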

Page 4: Sequence alignments complete coverage

Optimal alignment of two sequences

Dynamic Programming

DP aligns two sequences by beginning at the ends of the two sequences and attempting to align all possible pairs of characters (one from each sequence) using a scoring scheme for matches, mismatches, and gaps. The path with the highest total score defines the optimal alignment between the two sequences.

DP algorithms solve optimization problems by dividing the problem into smaller, overlapping subproblems, solving each subproblem once, and reusing the stored results.

Page 5: Sequence alignments complete coverage

Optimal alignment of two sequences

Dynamic Programming Matrix

$s(a_i, b_j) = +5$ if $a_i = b_j$ (match score)

$s(a_i, b_j) = -3$ if $a_i \neq b_j$ (mismatch score)

$w = -4$ (gap penalty)

• Initialization
• Matrix Fill (scoring)
• Traceback (alignment)

Page 6: Sequence alignments complete coverage

Global Alignment: Needleman-Wunsch Algorithm

Initialization Step

The first cell of each row, $S_{i,0}$, is set to $w \cdot i$. The first cell of each column, $S_{0,j}$, is set to $w \cdot j$.
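The initialization step can be sketched as follows, assuming the gap penalty w = -4 used in these slides. The function name `init_global_matrix` and the list-of-lists layout (rows indexed by i, columns by j) are choices made for this illustration.

```python
def init_global_matrix(m, n, gap=-4):
    """Build an (m+1) x (n+1) matrix with the global-alignment borders filled in."""
    S = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        S[i][0] = gap * i   # first column: i gaps
    for j in range(n + 1):
        S[0][j] = gap * j   # first row: j gaps
    return S


# Hypothetical sizes: sequences of length 2 and 3.
for row in init_global_matrix(2, 3):
    print(row)
```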

Page 7: Sequence alignments complete coverage

Global Alignment: Needleman-Wunsch Algorithm

Matrix Fill Step

G-G match score = +5

$S_{i,j} = \max[0 + 5,\ -4 + (-4),\ -4 + (-4)] = \max[5, -8, -8] = 5$

Confusing? Each cell takes the maximum of three values:

Diagonal + Match/Mismatch score

Left + Gap penalty

Up + Gap penalty
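The rule for a single cell can be written as a small function. This is a sketch under the slide's scoring scheme (+5 match, -3 mismatch, -4 gap). The name `fill_cell` and its argument order are assumptions for illustration, and the three candidates are the diagonal neighbor, the left neighbor, and the neighbor above.

```python
def fill_cell(diagonal, left, up, is_match, match=5, mismatch=-3, gap=-4):
    """Return the score of one cell from its three already-filled neighbors."""
    s = match if is_match else mismatch
    return max(diagonal + s,   # align the two characters
               left + gap,     # gap in one sequence
               up + gap)       # gap in the other sequence


# G-G match with neighbors 0 (diagonal), -4 (left), -4 (up):
print(fill_cell(0, -4, -4, is_match=True))  # prints 5
```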

Page 8: Sequence alignments complete coverage

Global Alignment: Needleman-Wunsch Algorithm

G-A mismatch score = -3

$S_{i,j} = \max[-4 + (-3),\ 5 + (-4),\ -8 + (-4)] = \max[-7, 1, -12] = 1$

Page 9: Sequence alignments complete coverage

Global Alignment: Needleman-Wunsch Algorithm

Traceback

Easy: find the bottom-right corner cell and follow the arrows back.
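Putting initialization, matrix fill, and traceback together gives a compact sketch of global alignment. Instead of storing arrows, this version rechecks which neighbor produced each cell's value while walking back from the bottom-right corner. The function name `needleman_wunsch`, the tie-breaking order (diagonal, then up, then left), and the example sequences are assumptions of this illustration, since the slide's sequences are only shown in its figures.

```python
def needleman_wunsch(a, b, match=5, mismatch=-3, gap=-4):
    """Global alignment sketch: returns (score, aligned_a, aligned_b)."""
    m, n = len(a), len(b)

    # Initialization: borders hold cumulative gap penalties.
    S = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        S[i][0] = gap * i
    for j in range(1, n + 1):
        S[0][j] = gap * j

    # Matrix fill: best of diagonal, left, and up for every cell.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + s,
                          S[i][j - 1] + gap,
                          S[i - 1][j] + gap)

    # Traceback: start at the bottom-right corner and walk to the top-left.
    i, j = m, n
    top, bottom = [], []
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and S[i][j] == S[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1])   # diagonal move
            i, j = i - 1, j - 1
        elif i > 0 and S[i][j] == S[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")        # move up
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])        # move left
            j -= 1

    return S[m][n], "".join(reversed(top)), "".join(reversed(bottom))


# Hypothetical input, not the slide's sequences.
print(needleman_wunsch("GA", "G"))
```

When several paths reach the same score, a different tie-breaking order may return a different, equally optimal alignment.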

Page 10: Sequence alignments complete coverage

Global Alignment: Needleman-Wunsch Algorithm

5 – 3 + 5 – 4 + 5 + 5 – 4 + 5 – 4 – 4 + 5 = 11

Page 11: Sequence alignments complete coverage

Local Alignment: Smith-Waterman Algorithm

Initialization Step

The first cell of each row, $S_{i,0}$, is set to 0. The first cell of each column, $S_{0,j}$, is set to 0.

The fill rule is the same; the initialization is different, and the traceback needs attention.
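A minimal sketch of the local-alignment matrix follows. Note one assumption that goes beyond the slide text: the standard Smith-Waterman formulation also includes 0 as a fourth candidate in the maximum, so no cell drops below zero, and the sketch follows that standard form. The function name `smith_waterman_matrix` and the example sequences are illustrative.

```python
def smith_waterman_matrix(a, b, match=5, mismatch=-3, gap=-4):
    """Fill a local-alignment score matrix (standard form, with a zero floor)."""
    m, n = len(a), len(b)
    # Initialization: first row and first column are all zeros.
    S = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(0,                      # assumption: standard zero floor
                          S[i - 1][j - 1] + s,    # diagonal
                          S[i][j - 1] + gap,      # left
                          S[i - 1][j] + gap)      # up
    return S


# Hypothetical input, not the slide's sequences.
for row in smith_waterman_matrix("GA", "G"):
    print(row)
```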

Page 12: Sequence alignments complete coverage

Local Alignment: Smith-Waterman Algorithm

Two cells hold the maximum score of 14, so more than one alignment produces the maximal alignment score. Which one should be considered? A value in the last row means the sequence is aligned fully.

Page 13: Sequence alignments complete coverage

Local Alignment: Smith-Waterman Algorithm

Two traceback pathway pointers

The two local alignments resulting in a score of 14
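When several cells share the maximum, each one is a valid starting point for a traceback. The sketch below collects one alignment per maximal cell, under the standard Smith-Waterman assumptions (zero floor in the fill, traceback stops at a cell with score 0). The function name `smith_waterman`, the tie-breaking order, and the example sequences are illustrative and do not reproduce the slide's matrix.

```python
def smith_waterman(a, b, match=5, mismatch=-3, gap=-4):
    """Local alignment sketch: returns (best_score, [(aligned_a, aligned_b), ...])."""
    m, n = len(a), len(b)
    S = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(0, S[i - 1][j - 1] + s,
                          S[i][j - 1] + gap, S[i - 1][j] + gap)

    best = max(max(row) for row in S)
    if best == 0:
        return 0, []   # no positive-scoring local alignment

    alignments = []
    for i0 in range(1, m + 1):
        for j0 in range(1, n + 1):
            if S[i0][j0] != best:
                continue
            # Trace back from this maximal cell until a zero is reached.
            i, j = i0, j0
            top, bottom = [], []
            while i > 0 and j > 0 and S[i][j] > 0:
                s = match if a[i - 1] == b[j - 1] else mismatch
                if S[i][j] == S[i - 1][j - 1] + s:
                    top.append(a[i - 1]); bottom.append(b[j - 1])
                    i, j = i - 1, j - 1
                elif S[i][j] == S[i - 1][j] + gap:
                    top.append(a[i - 1]); bottom.append("-")
                    i -= 1
                else:
                    top.append("-"); bottom.append(b[j - 1])
                    j -= 1
            alignments.append(("".join(reversed(top)), "".join(reversed(bottom))))
    return best, alignments


# Hypothetical input in which the motif ACGT occurs twice in the second sequence.
print(smith_waterman("ACGT", "ACGTTTACGT"))
```

In this example both maximal cells lead to the same aligned strings because the same motif is matched at two positions; in general the tracebacks can differ.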

Page 14: Sequence alignments complete coverage

Local Alignment: Smith-Waterman Algorithm

5 matches, 1 mismatch, and 2 gaps

score = $5 \times 5 - 1 \times 3 - 2 \times 4 = 25 - 3 - 8 = 14$

Page 15: Sequence alignments complete coverage

What's in the Next Coverage?

Scoring Matrices: PAM & BLOSUM

Assessing the significance of sequence alignments

Page 16: Sequence alignments complete coverage

Thank You For Your Attention !!!