Bioinformatics life sciences_2012
Introduction to Bioinformatics for the Life Sciences
![Page 1: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/1.jpg)
![Page 2: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/2.jpg)
Introduction to bioinformatics and computational biology
![Page 3: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/3.jpg)
Lab for Bioinformatics and computational genomics
10 “genome hackers”, mostly engineers (statistics)
42 scientists, technicians, geneticists, clinicians
>100 people: hardware engineers, mathematicians, molecular biologists
![Page 4: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/4.jpg)
![Page 5: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/5.jpg)
What is Bioinformatics ?
• Application of information technology to the storage, management and analysis of biological information (facilitated by the use of computers)
– Sequence analysis?
– Molecular modeling (HTX)?
– Phylogeny/evolution?
– Ecology and population studies?
– Medical informatics?
– Image analysis?
– Statistics? AI?
– Heavy-current or light-current engineering?
![Page 6: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/6.jpg)
Promises of genomics and bioinformatics
• Medicine (Pharma)
– Genome analysis allows the targeting of genetic diseases
– The effect of a disease or of a therapeutic on RNA and protein levels can be elucidated
– Knowledge of protein structure facilitates drug design
– Understanding of genomic variation allows the tailoring of medical treatment to the individual’s genetic make-up
• The same techniques can be applied to crop (Agro) and livestock improvement (Animal Health)
![Page 7: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/7.jpg)
Math
Informatics
Bioinformatics, a life science discipline …
(Molecular)Biology
![Page 8: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/8.jpg)
Math
Informatics
Bioinformatics, a life science discipline …
Theoretical Biology
Computational Biology
(Molecular)Biology
Computer Science
![Page 9: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/9.jpg)
Math
Informatics
Bioinformatics, a life science discipline …
Theoretical Biology
Computational Biology
(Molecular)Biology
Computer Science
Bioinformatics
![Page 10: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/10.jpg)
Math
Informatics
Bioinformatics, a life science discipline … management of expectations
Theoretical Biology
Computational Biology
(Molecular)Biology
Computer Science
Bioinformatics
Interface Design
AI, Image Analysis, structure prediction (HTX)
Sequence Analysis
Expert Annotation
NPDatamining
![Page 11: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/11.jpg)
Math
Informatics
Bioinformatics, a life science discipline … management of expectations
Theoretical Biology
Computational Biology
(Molecular)Biology
Computer Science
Bioinformatics
Discovery Informatics – Computational Genomics
Interface Design
AI, Image Analysis, structure prediction (HTX)
Sequence Analysis
Expert Annotation
NPDatamining
![Page 12: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/12.jpg)
Time (years)
![Page 13: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/13.jpg)
![Page 14: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/14.jpg)
• Timeline: Margaret Dayhoff …
![Page 16: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/16.jpg)
PCR + dye termination
Suddenly, a flash of insight caused him to pull the car off the road and stop. He awakened his friend dozing in the passenger seat and excitedly explained to her that he had hit upon a solution - not to his original problem, but to one of even greater significance. Kary Mullis had just conceived of a simple method for producing virtually unlimited copies of a specific DNA sequence in a test tube - the polymerase chain reaction (PCR)
![Page 17: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/17.jpg)
Nature: the human genome
Setting the stage …
![Page 18: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/18.jpg)
Biological Research
Adapted from John McPherson, OICR
![Page 19: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/19.jpg)
And this is just the beginning ….
Next Generation Sequencing is here
![Page 20: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/20.jpg)
![Page 21: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/21.jpg)
![Page 22: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/22.jpg)
One additional insight ...
![Page 23: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/23.jpg)
Read Length is Not As Important For Resequencing
[Figure: % of paired k-mers with uniquely assignable location (y-axis, 0% to 100%) versus length of k-mer reads (x-axis, 8 to 20 bp), plotted for E. coli and human. Source: Jay Shendure]
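The idea behind the plot can be illustrated with a minimal sketch that counts how many k-mer positions in a sequence are unique. This is a simplification under stated assumptions: it uses single (not paired) k-mers, looks at the forward strand only, and the function name `unique_kmer_fraction` and the toy sequence are hypothetical, not taken from the slides.

```python
from collections import Counter


def unique_kmer_fraction(genome: str, k: int) -> float:
    """Fraction of k-mer start positions whose k-mer occurs exactly once.

    Simplified illustration: forward strand only, single k-mers.
    """
    if k < 1 or k > len(genome):
        raise ValueError("k must be between 1 and len(genome)")
    # Enumerate every k-mer, one per start position.
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    # A position is "uniquely assignable" if its k-mer is seen only once.
    unique_positions = sum(1 for kmer in kmers if counts[kmer] == 1)
    return unique_positions / len(kmers)


if __name__ == "__main__":
    toy_genome = "ACGTACGTTT"  # hypothetical toy sequence
    for k in (2, 3, 4, 5):
        print(k, round(unique_kmer_fraction(toy_genome, k), 3))
```

Even on this toy sequence the uniquely assignable fraction rises quickly with k, which is the qualitative behaviour the figure shows for real genomes.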
![Page 24: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/24.jpg)
![Page 25: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/25.jpg)
ABI SOLID
![Page 26: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/26.jpg)
Paired End Reads are Important!
Repetitive DNA / Unique DNA
• Single read maps to multiple positions
• Paired read (Read 1 and Read 2, separated by a known distance) maps uniquely
![Page 27: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/27.jpg)
Single Molecule Sequencing
Helicos Biosciences Corp.
[Figure labels: microscope slide, single DNA molecule, primer, dNTP-Cy3, super-cooled TIRF microscope]
Adapted from: Barak Cohen, Washington University, Bio5488 http://tinyurl.com/6zttuq http://tinyurl.com/6k26nh
![Page 28: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/28.jpg)
Complete genomics
![Page 29: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/29.jpg)
Next next generation sequencing
Third generation sequencing
Now sequencing
![Page 30: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/30.jpg)
Pacific Biosciences: A Third Generation Sequencing Technology
Eid et al 2008
![Page 31: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/31.jpg)
Nanopore Sequencing
![Page 32: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/32.jpg)
Ultra-low-cost SINGLE molecule sequencing
![Page 33: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/33.jpg)
Genome Size
DOGS: Database Of Genome Sizes

| Organism | Genome size (bp) |
| --- | --- |
| E. coli | 4.2 × 10^6 |
| Yeast | 18 × 10^6 |
| Arabidopsis | 80 × 10^6 |
| C. elegans | 100 × 10^6 |
| Drosophila | 180 × 10^6 |
| Human/Rat/Mouse | 3000 × 10^6 |
| Lily | 300 000 × 10^6 |

With ... : 99.9%
To primates: 99%
![Page 34: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/34.jpg)
![Page 35: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/35.jpg)
Anno 2012
![Page 36: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/36.jpg)
Anno 2012
![Page 37: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/37.jpg)
Definitions
Identity: The extent to which two (nucleotide or amino acid) sequences are invariant.
Homology: Similarity attributed to descent from a common ancestor.
RBP: 26 RVKENFDKARFSGTWYAMAKKDPEGLFLQDNIVAEFSVDETGQMSATAKGRVRLLNNWD- 84
match: + K ++ + + GTW++MA+ L + A V T + +L+ W+
glycodelin: 23 QTKQDLELPKLAGTWHSMAMA-TNNISLMATLKAPLRVHITSLLPTPEDNLEIVLHRWEN 81
![Page 38: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/38.jpg)
Definitions
Orthologous: Homologous sequences in different species that arose from a common ancestral gene during speciation; may or may not be responsible for a similar function.
Paralogous: Homologous sequences within a single species that arose by gene duplication.
![Page 39: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/39.jpg)
speciation
duplication
![Page 40: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/40.jpg)
Overview
• Simple identity, which scores only identical amino acids as a match.
• Genetic code changes, which scores the minimum number of nucleotide changes needed to change a codon for one amino acid into a codon for the other.
• Chemical similarity of amino acid side chains, which scores as a match two amino acids that have a similar side chain, such as hydrophobic, charged and polar amino acid groups.
• The Dayhoff percent accepted mutation (PAM) family of matrices, which scores amino acid pairs on the basis of the expected frequency of substitution of one amino acid for the other during protein evolution.
• The blocks substitution matrix (BLOSUM) amino acid substitution tables, which score amino acid pairs based on the frequency of amino acid substitutions in aligned sequence motifs called blocks, which are found in protein families.
![Page 41: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/41.jpg)
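BLOSUM (BLOck – SUM) scoring
Block = ungapped alignment
E.g. amino acids D, N, V, A

| Sequence | a | b | c | d | e | f |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | D | D | N | A | A | V |
| 2 | D | N | A | V | D | D |
| 3 | N | N | V | A | V | V |

S = 3 sequences
W = 6 aa
N = (W*S*(S-1))/2 = 18 pairs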
![Page 42: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/42.jpg)
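A. Observed pairs
Block (sequences 1 to 3, columns a to f): DDNAAV, DNAVDD, NNVAVV

Pair counts f_ij, counted within each column of the block:

| f_ij | D | N | A | V |
| --- | --- | --- | --- | --- |
| D | 1 | 4 | 1 | 3 |
| N | | 1 | 1 | 1 |
| A | | | 1 | 4 |
| V | | | | 1 |

Relative frequency table g_ij = f_ij / 18: probability of obtaining a pair if randomly choosing pairs from block

| g_ij | D | N | A | V |
| --- | --- | --- | --- | --- |
| D | .056 | .222 | .056 | .167 |
| N | | .056 | .056 | .056 |
| A | | | .056 | .222 |
| V | | | | .056 |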
![Page 43: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/43.jpg)
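B. Expected pairs
Residues in the block, sorted: DDDDDNNNNAAAAVVVVV (from DDNAAV, DNAVDD, NNVAVV)

| | D | N | A | V |
| --- | --- | --- | --- | --- |
| P_i | 5/18 | 4/18 | 4/18 | 5/18 |

P{Draw DN pair} = P{Draw D, then N, or draw N, then D}
$P\{\text{DN pair}\} = P_D P_N + P_N P_D = 2 \cdot (5/18) \cdot (4/18) = .123$

Random relative frequency table e_ij: probability of obtaining a pair of each amino acid drawn independently from block

| e_ij | D | N | A | V |
| --- | --- | --- | --- | --- |
| D | .077 | .123 | .123 | .154 |
| N | | .049 | .099 | .123 |
| A | | | .049 | .123 |
| V | | | | .077 |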
![Page 44: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/44.jpg)
C. Summary (A/B)
$s_{ij} = \log_2 (g_{ij} / e_{ij})$
$(s_{ij})$ is the basic BLOSUM score matrix
Notes:
• Observed pairs in blocks contain information about relationships at all levels of evolutionary distance simultaneously (cf. Dayhoff's close relationships)
• The actual algorithm generates observed and expected pair distributions by accumulation over a set of approx. 2000 ungapped blocks of varying width (W) and depth (S)
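Steps A to C can be reproduced for the toy block with a short sketch. It follows the slide's simplified recipe (expected frequencies from residue frequencies within the single block, unscaled and unrounded log2 ratios). The function name `blosum_toy` is hypothetical, and this is not the full Henikoff procedure, which accumulates counts over many blocks.

```python
import math
from collections import Counter
from itertools import combinations


def blosum_toy(block):
    """Return (f, g, e, s) for one ungapped block, keyed by sorted residue pairs."""
    # A. Observed pairs: count every pair of residues within each column.
    f = Counter()
    for column in zip(*block):
        for x, y in combinations(column, 2):
            f[tuple(sorted((x, y)))] += 1
    n_pairs = sum(f.values())  # equals W*S*(S-1)/2
    g = {pair: count / n_pairs for pair, count in f.items()}

    # B. Expected pairs: residues drawn independently with block frequencies P_i.
    residue_counts = Counter("".join(block))
    total = sum(residue_counts.values())
    p = {aa: count / total for aa, count in residue_counts.items()}
    e = {}
    for x, y in g:  # only pairs that were actually observed
        e[(x, y)] = p[x] * p[y] if x == y else 2 * p[x] * p[y]

    # C. Log-odds score s_ij = log2(g_ij / e_ij).
    s = {pair: math.log2(g[pair] / e[pair]) for pair in g}
    return f, g, e, s


if __name__ == "__main__":
    f, g, e, s = blosum_toy(["DDNAAV", "DNAVDD", "NNVAVV"])
    for pair in sorted(s):
        print(pair, f[pair], round(g[pair], 3), round(e[pair], 3), round(s[pair], 2))
```

Pairs seen more often than chance (for example D with N) get a positive score, and pairs seen less often than chance get a negative one.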
![Page 45: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/45.jpg)
The BLOSUM Series
• blosum30, 35, 40, 45, 50, 55, 60, 62, 65, 70, 75, 80, 85, 90
• Transition frequencies are observed directly by identifying blocks that are at least
– 45% identical (BLOSUM 45)
– 50% identical (BLOSUM 50)
– 62% identical (BLOSUM 62) etc.
• No extrapolation made
• High blosum: closely related sequences
• Low blosum: distant sequences
• blosum45 is roughly comparable to pam250
• blosum62 is roughly comparable to pam160
• blosum62 is the most popular matrix
![Page 46: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/46.jpg)
Overview
![Page 47: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/47.jpg)
• Church of the Flying Spaghetti Monster
• http://www.venganza.org/about/open-letter
![Page 48: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/48.jpg)
Overview
– Henikoff and Henikoff compared the BLOSUM matrices to PAM by evaluating how effectively the matrices detect known members of a protein family in a database when searching with the ungapped local alignment program BLAST. They conclude that overall the BLOSUM 62 matrix is the most effective.
• However, all the substitution matrices investigated perform better than BLOSUM 62 for a proportion of the families. This suggests that no single matrix is the complete answer for all sequence comparisons.
• It is probably best to complement the BLOSUM 62 matrix with comparisons using PAM 250 and the Overington structurally derived matrices.
– It seems likely that as more protein three-dimensional structures are determined, substitution tables derived from structure comparison will give the most reliable data.
![Page 49: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/49.jpg)
Rat versus mouse RBP
Rat versus bacterial lipocalin
![Page 50: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/50.jpg)
Alignments
• Exhaustive …
– All combinations
• Algorithm
– Dynamic programming (much faster)
– Needleman-Wunsch for global alignments (Journal of Molecular Biology, 1970)
– Later adapted by Smith and Waterman for local alignment
• Heuristics
![Page 51: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/51.jpg)
A metric …
GACGGATTAG, GATCGGAATAG
GA-CGGATTAG
GATCGGAATAG
+1 (a match), -1 (a mismatch), -2 (gap)
9*1 + 1*(-1) + 1*(-2) = 6
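The metric above can be written as a small function that scores an alignment column by column. This is a minimal sketch; the function name `score_alignment` and its parameter names are hypothetical, and the default values are the ones given on the slide.

```python
def score_alignment(row1: str, row2: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> int:
    """Sum the column scores of two aligned rows ('-' marks a gap)."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    score = 0
    for a, b in zip(row1, row2):
        if a == "-" or b == "-":
            score += gap        # gap column
        elif a == b:
            score += match      # identical residues
        else:
            score += mismatch   # substitution
    return score


if __name__ == "__main__":
    print(score_alignment("GA-CGGATTAG", "GATCGGAATAG"))
```

With nine matches, one mismatch and one gap, the call reproduces the total of 6 computed on the slide.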
![Page 52: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/52.jpg)
Needleman-Wunsch-edu.pl
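The Score Matrix

| (i) | Seq2 \ Seq1 (j) | * | 1 C | 2 K | 3 H | 4 V | 5 F | 6 C | 7 R | 8 V | 9 C | 10 I |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | * | 0 | -1 | -2 | -3 | -4 | -5 | -6 | -7 | -8 | -9 | -10 |
| 1 | C | -1 | 1 | 0 | -1 | -2 | -3 | -4 | -5 | -6 | -7 | -8 |
| 2 | K | -2 | 0 | 2 | 1 | 0 | -1 | -2 | -3 | -4 | -5 | -6 |
| 3 | K | -3 | -1 | 1 | 1 | 0 | -1 | -2 | -3 | -4 | -5 | -6 |
| 4 | C | -4 | -2 | 0 | 0 | 0 | -1 | 0 | -1 | -2 | -3 | -4 |
| 5 | F | -5 | -3 | -1 | -1 | -1 | 1 | 0 | -1 | -2 | -3 | -4 |
| 6 | C | -6 | -4 | -2 | -2 | -2 | 0 | 2 | 1 | 0 | -1 | -2 |
| 7 | K | -7 | -5 | -3 | -3 | -3 | -1 | 1 | 1 | 0 | -1 | -2 |
| 8 | C | -8 | -6 | -4 | -4 | -4 | -2 | 0 | 0 | 0 | 1 | 0 |
| 9 | V | -9 | -7 | -5 | -5 | -3 | -3 | -1 | -1 | 1 | 0 | 0 |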
![Page 53: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/53.jpg)
![Page 54: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/54.jpg)
Needleman-Wunsch-edu.pl
The Score Matrix: each cell matrix(i,j) is the maximum of three candidates (a, b, c)
A: diag_score = matrix(i-1,j-1) + MATCH if substr(seq1,j-1,1) eq substr(seq2,i-1,1), otherwise matrix(i-1,j-1) + MISMATCH
B: up_score = matrix(i-1,j) + GAP
C: left_score = matrix(i,j-1) + GAP
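The three rules can be put together in a minimal Python sketch of the matrix fill and traceback. It is not the Needleman-Wunsch-edu.pl script itself: the function name is hypothetical, and the scoring values (match +1, mismatch -1, gap -1) are assumptions inferred from the first row and column of the score matrix shown on the slides.

```python
def needleman_wunsch(seq1: str, seq2: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Global alignment; seq1 runs along columns (j), seq2 along rows (i)."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    matrix = [[0] * cols for _ in range(rows)]
    # Initialization: leading gaps along the first row and first column.
    for j in range(cols):
        matrix[0][j] = j * gap
    for i in range(rows):
        matrix[i][0] = i * gap
    # Fill: each cell is the best of diagonal (A), up (B) and left (C).
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if seq1[j - 1] == seq2[i - 1] else mismatch
            diag_score = matrix[i - 1][j - 1] + step
            up_score = matrix[i - 1][j] + gap
            left_score = matrix[i][j - 1] + gap
            matrix[i][j] = max(diag_score, up_score, left_score)
    # Traceback from the bottom-right cell, preferring diagonal, then up, then left.
    aligned1, aligned2 = [], []
    i, j = rows - 1, cols - 1
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if seq1[j - 1] == seq2[i - 1] else mismatch
            if matrix[i][j] == matrix[i - 1][j - 1] + step:
                aligned1.append(seq1[j - 1])
                aligned2.append(seq2[i - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and matrix[i][j] == matrix[i - 1][j] + gap:
            aligned1.append("-")
            aligned2.append(seq2[i - 1])
            i -= 1
            continue
        aligned1.append(seq1[j - 1])
        aligned2.append("-")
        j -= 1
    return matrix, "".join(reversed(aligned1)), "".join(reversed(aligned2))


if __name__ == "__main__":
    matrix, a1, a2 = needleman_wunsch("CKHVFCRVCI", "CKKCFCKCV")
    print(a1)
    print(a2)
    print("score =", matrix[-1][-1])
```

When several alignments share the best score, the tie-breaking order in the traceback decides which one is reported; other orders are equally valid.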
![Page 55: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/55.jpg)
![Page 56: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/56.jpg)
Needleman-Wunsch-edu.pl
![Page 57: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/57.jpg)
Needleman-Wunsch-edu.pl
Seq1: CKHVFCRVCI
Seq2: CKKCFC-KCV
++--++--+- (+ match, - mismatch or gap)
score = 0
![Page 58: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/58.jpg)
• Practicum: use similarity function in initialization step -> scoring tables
• Time Complexity
• Use random proteins to generate histogram of scores from aligned random sequences
![Page 59: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/59.jpg)
Time complexity with needleman-wunsch.pl

| Sequence Length (aa) | Execution Time (s) |
| --- | --- |
| 10 | 0 |
| 25 | 0 |
| 50 | 0 |
| 100 | 1 |
| 500 | 5 |
| 1000 | 19 |
| 2500 | 559 |
| 5000 | Memory could not be written |
![Page 60: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/60.jpg)
Average around -64 !
-80
-78
-76
-74
-72 **
-70 *******
-68 ***************
-66 *************************
-64 ************************************************************
-60 ***********************
-58 ***************
-56 ********
-54 ****
-52 *
-50
-48
-46
-44
-42
-40
-38
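A histogram like this one can be generated by aligning pairs of random proteins and binning the scores. The sketch below is only an illustration: the sequence length, number of pairs, uniform amino acid frequencies and scoring values are all assumptions (the slide does not state them), so the resulting distribution will not match the slide's numbers. The score-only function keeps two rows in memory instead of the full matrix.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Needleman-Wunsch score only, keeping just the previous row in memory."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + step, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def random_protein(length: int, rng: random.Random) -> str:
    """Random sequence with uniform amino acid frequencies (an assumption)."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def score_histogram(n_pairs: int = 200, length: int = 50,
                    bin_width: int = 2, seed: int = 0) -> Counter:
    """Count alignment scores of random sequence pairs, binned by bin_width."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_pairs):
        score = nw_score(random_protein(length, rng), random_protein(length, rng))
        counts[(score // bin_width) * bin_width] += 1
    return counts


if __name__ == "__main__":
    histogram = score_histogram()
    for bin_start in range(min(histogram), max(histogram) + 1, 2):
        print(f"{bin_start:4d} {'*' * histogram[bin_start]}")
```

A real alignment score can then be compared against this background distribution to judge whether it is better than what unrelated sequences produce.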
![Page 61: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/61.jpg)
If the sequences are similar, the path of the best alignment should be very close to the main diagonal.
Therefore, we may not need to fill the entire matrix, rather, we fill a narrow band of entries around the main diagonal.
A banded algorithm therefore fills in only a band of width 2k+1 around the main diagonal.
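The banded idea can be sketched as follows. Only cells with |i - j| <= k are computed, so the work grows with the band width rather than with the full matrix. The function name `banded_nw_score` and the scoring defaults (match +1, mismatch -1, gap -1) are assumptions for illustration; the result equals the full Needleman-Wunsch score only when the best alignment stays inside the band.

```python
def banded_nw_score(a: str, b: str, k: int, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> int:
    """Global alignment score restricted to a band of width 2k+1."""
    m, n = len(a), len(b)
    if abs(m - n) > k:
        raise ValueError("band too narrow: need k >= |len(a) - len(b)|")
    neg_inf = float("-inf")
    # Row 0: only columns 0..k lie inside the band.
    prev = {j: j * gap for j in range(0, min(n, k) + 1)}
    for i in range(1, m + 1):
        cur = {}
        for j in range(max(0, i - k), min(n, i + k) + 1):
            if j == 0:
                cur[j] = i * gap  # first column: leading gaps
                continue
            step = match if a[i - 1] == b[j - 1] else mismatch
            diag = prev.get(j - 1, neg_inf) + step
            up = prev.get(j, neg_inf) + gap     # outside the band counts as -inf
            left = cur.get(j - 1, neg_inf) + gap
            cur[j] = max(diag, up, left)
        prev = cur
    return prev[n]


if __name__ == "__main__":
    # The slide example needs only one gap, so a band with k = 1 is enough.
    print(banded_nw_score("CKKCFCKCV", "CKHVFCRVCI", k=1))
```

Choosing k is a trade-off: a narrow band is fast but can miss alignments that need many gaps, so in practice k is often increased until the score stops improving.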
![Page 62: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/62.jpg)
Multiple Alignment Method
![Page 63: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/63.jpg)
Multiple Alignment Method
![Page 64: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/64.jpg)
Examples
Phylogenetic methods may be used to solve crimes, test purity of products, and determine whether endangered species have been smuggled or mislabeled:
– Vogel, G. 1998. HIV strain analysis debuts in murder trial. Science 282(5390):851-853.
– Lau, D. T.-W., et al. 2001. Authentication of medicinal Dendrobium species by the internal transcribed spacer of ribosomal DNA. Planta Med 67:456-460.
![Page 65: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/65.jpg)
![Page 66: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/66.jpg)
Examples
– Epidemiologists use phylogenetic methods to understand the development of pandemics, patterns of disease transmission, and development of antimicrobial resistance or pathogenicity:
• Basler, C.F., et al. 2001. Sequence of the 1918 pandemic influenza virus nonstructural gene (NS) segment and characterization of recombinant viruses bearing the 1918 NS genes. PNAS 98(5):2746-2751.
• Ou, C.-Y., et al. 1992. Molecular epidemiology of HIV transmission in a dental practice. Science 256(5060):1165-1171.
• Bacillus anthracis:
![Page 68: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/68.jpg)
![Page 69: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/69.jpg)
![Page 70: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/70.jpg)
![Page 71: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/71.jpg)
Modeling
![Page 72: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/72.jpg)
Ramachandran / Phi-Psi Plot
![Page 73: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/73.jpg)
Protein Architecture
![Page 74: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/74.jpg)
Modeling
• Finding a structural homologue
• Blast
– versus PDB database or PSI-blast (E<0.005)
– Domain coverage at least 60%
• Avoid gaps
– Prefer few gaps and reasonable similarity scores over many gaps and high similarity scores
![Page 75: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/75.jpg)
Bootstrapping - an example
Ciliate SSUrDNA - parsimony bootstrap
Majority-rule consensus
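[Figure: majority-rule consensus tree of Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Euplotes (8), Tetrahymena (9), Loxodes (4), Tracheloraphis (5), Spirostomum (6) and Gruberia (7); bootstrap support values shown on the branches: 100, 96, 84, 100, 100, 100]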
![Page 76: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/76.jpg)
![Page 77: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/77.jpg)
![Page 78: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/78.jpg)
![Page 79: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/79.jpg)
Overview
Personalized Medicine,
Biomarkers …
… Molecular Profiling
First Generation Molecular Profiling
Next Generation Molecular Profiling
Next Generation Epigenetic Profiling
Concluding Remarks
![Page 80: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/80.jpg)
![Page 81: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/81.jpg)
![Page 82: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/82.jpg)
![Page 83: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/83.jpg)
![Page 84: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/84.jpg)
![Page 85: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/85.jpg)
![Page 86: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/86.jpg)
Personalized Medicine
• The use of diagnostic tests (aka biomarkers) to identify in advance which patients are likely to respond well to a therapy
• The benefits of this approach are to
– avoid adverse drug reactions
– improve efficacy
– adjust the dose to suit the patient
– differentiate a product in a competitive market
– meet future legal or regulatory requirements
• Potential uses of biomarkers
– Risk assessment
– Initial/early detection
– Prognosis
– Prediction/therapy selection
– Response assessment
– Monitoring for recurrence
![Page 87: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/87.jpg)
Biomarker
First used in 1971 … An objective and « predictive » measure … at the molecular level … of normal and pathogenic processes and responses to therapeutic interventions
Characteristic that is objectively measured and evaluated as an indicator of normal biologic or pathogenic processes or pharmacologic response to a drug
A biomarker is valid if:
– It can be measured in a test system with well-established performance characteristics
– Evidence for its clinical significance has been established
![Page 88: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/88.jpg)
Rationale 1: Why now? Regulatory path becoming more clear
There is more at stake than efficient drug development. FDA « critical path initiative », Pharmacogenomics guideline
Biomarkers are the foundation of « evidence based medicine »: who should be treated, how and with what.
Without biomarkers, advances in targeted therapy will be limited and treatment will remain largely empirical. It is imperative that biomarker development be accelerated along with therapeutics.
![Page 89: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/89.jpg)
Why now ?
First-generation and maturing second-generation molecular profiling methodologies make it possible to stratify clinical trial participants on a pharmacogenomic basis, including those most likely to benefit from the drug candidate and excluding those who likely will not.
Clinical trials should attain more specific results with smaller numbers of patients. Smaller numbers mean lower costs (factor 2-10).
An additional benefit for trial participants and institutional review boards (IRBs) is that stratification, given the correct biomarker, may reduce or eliminate adverse events.
![Page 90: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/90.jpg)
Molecular Profiling
The study of specific patterns (fingerprints) of proteins, DNA, and/or mRNA and how these patterns correlate with an individual's physical characteristics or symptoms of disease.
![Page 91: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/91.jpg)
Generic Health advice
• Exercise (Hypertrophic Cardiomyopathy)
• Drink your milk (MCM6 – Lactose intolerance)
• Eat your green beans (glucose-6-phosphate dehydrogenase deficiency)
• & your grains (HLA-DQ2 – Celiac disease)
• & your iron (HFE – Hemochromatosis)
• Get more rest (HLA-DR2 – Narcolepsy)
![Page 92: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/92.jpg)
Generic Health advice (UNLESS)
• Exercise (Hypertrophic Cardiomyopathy)
• Drink your milk (MCM6 – Lactose intolerance)
• Eat your green beans (glucose-6-phosphate dehydrogenase deficiency)
• & your grains (HLA-DQ2 – Celiac disease)
• & your iron (HFE – Hemochromatosis)
• Get more rest (HLA-DR2 – Narcolepsy)
![Page 93: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/93.jpg)
![Page 95: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/95.jpg)
EGFR based therapy in mCRC
![Page 96: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/96.jpg)
Overview
Personalized Medicine,
Biomarkers …
… Molecular Profiling
First Generation Molecular Profiling
Next Generation Molecular Profiling
Next Generation Epigenetic Profiling
Concluding Remarks
![Page 97: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/97.jpg)
Before molecular profiling …
![Page 98: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/98.jpg)
![Page 99: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/99.jpg)
![Page 100: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/100.jpg)
![Page 101: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/101.jpg)
Before molecular profiling …
![Page 102: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/102.jpg)
Before molecular profiling …
![Page 103: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/103.jpg)
![Page 104: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/104.jpg)
First Generation Molecular Profiling
• Flow cytometry correlates surface markers, cell size and other parameters.
• Circulating tumor cell (CTC) assays quantitate the number of tumor cells in the peripheral blood.
• Exosomes are 30-90 nm vesicles secreted by a wide range of mammalian cell types.
• Immunohistochemistry (IHC) measures protein expression, usually on the cell surface.
![Page 105: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/105.jpg)
![Page 106: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/106.jpg)
![Page 107: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/107.jpg)
![Page 108: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/108.jpg)
First Generation Molecular Profiling
• Gene sequencing for mutation detection
• Microarray for mRNA message detection
• RT-PCR for gene expression
• FISH analysis for gene copy number
• Comparative Genomic Hybridization (CGH) for gene copy number
![Page 109: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/109.jpg)
Basics of the “old” technology
• Clone the DNA.
• Generate a ladder of labeled (colored) molecules that are different by 1 nucleotide.
• Separate mixture on some matrix.
• Detect fluorochrome by laser.
• Interpret peaks as string of DNA.
• Strings are 500 to 1,000 letters long.
• 1 machine generates 57,000 nucleotides/run.
• Assemble all strings into a genome.
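The final step in the list, assembling the strings into a longer sequence, can be sketched as a toy greedy overlap merge. This is only an illustration under simplifying assumptions (error-free reads, one strand, no repeats). The function names `overlap` and `greedy_assemble` and the three example reads are hypothetical and do not come from the slides; real assemblers also deal with sequencing errors, both strands and repeated regions.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest suffix/prefix overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; remaining contigs stay separate
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical short reads (real Sanger reads are 500 to 1,000 letters long)
reads = ["GATTTAGATC", "TAGATCGCGA", "CGCGATAGAG"]
contigs = greedy_assemble(reads)
print(contigs)
```

The greedy choice keeps the sketch short; it is not guaranteed to find the best assembly when reads overlap ambiguously.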
![Page 110: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/110.jpg)
![Page 111: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/111.jpg)
Genetic Variation Among People
0.1% difference among people
GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
Single nucleotide polymorphisms (SNPs)
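A minimal sketch of how the single-letter difference between the two 19-letter example sequences on this slide could be located. It assumes the sequences are already aligned and of equal length; the function name `find_snps` is hypothetical.

```python
def find_snps(ref, alt):
    """Return (position, ref_base, alt_base) for every mismatch; positions are 1-based."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be pre-aligned and of equal length")
    return [(i + 1, r, a) for i, (r, a) in enumerate(zip(ref, alt)) if r != a]


seq_a = "GATTTAGATCGCGATAGAG"
seq_b = "GATTTAGATCTCGATAGAG"
snps = find_snps(seq_a, seq_b)
print(snps)  # [(11, 'G', 'T')]
```

Real variant calling works on many noisy reads per position rather than on two clean strings, so this only shows the idea of a SNP as a one-letter substitution.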
![Page 112: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/112.jpg)
The genome fits as an e-mail attachment
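One possible reading of this claim is that only the differences from a reference genome need to be stored. The back-of-the-envelope calculation below is a sketch under stated assumptions: the genome size of 3.2 billion base pairs and the 5 bytes per stored difference are assumptions made for illustration, and only the 0.1% figure comes from the slides.

```python
GENOME_BP = 3_200_000_000   # assumed haploid human genome size in base pairs
DIFF_PER_THOUSAND = 1       # 0.1% difference among people (figure from the slides)
BYTES_PER_VARIANT = 5       # assumed encoding: 4-byte position + 1-byte base

# Whole genome at 2 bits per base (A, C, G, T)
full_genome_mb = GENOME_BP * 2 / 8 / 1e6

# Only the positions that differ from a reference
variants = GENOME_BP * DIFF_PER_THOUSAND // 1000
diff_only_mb = variants * BYTES_PER_VARIANT / 1e6

print(f"full genome, 2 bits per base: {full_genome_mb:.0f} MB")
print(f"differences only: {variants:,} positions, {diff_only_mb:.0f} MB")
```

Under these assumptions the difference file is tens of megabytes rather than hundreds, which is the scale of an e-mail attachment.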
![Page 113: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/113.jpg)
![Page 114: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/114.jpg)
mRNA Expression Microarray
![Page 115: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/115.jpg)
![Page 116: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/116.jpg)
![Page 117: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/117.jpg)
![Page 118: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/118.jpg)
Second Generation DNA profiling
• Exome sequencing (also known as targeted exome capture) is an efficient strategy to selectively sequence the coding regions of the genome to identify novel genes associated with rare and common disorders.
• 160K exons
![Page 119: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/119.jpg)
Second Generation DNA profiling
![Page 120: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/120.jpg)
Second Generation DNA profiling
![Page 121: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/121.jpg)
Second Generation RNA profiling
Besides the 6000 protein-coding genes …
140 ribosomal RNA genes
275 transfer RNA genes
40 small nuclear RNA genes
>100 small nucleolar RNA genes
Function of RNA genes
pRNA in φ29 rotary packaging motor (Simpson et al. Nature 408:745-750, 2000)
Cartilage-hair hypoplasia mapped to an RNA (Ridanpää et al. Cell 104:195-203, 2001)
The human Prader-Willi critical region (Cavaille et al. PNAS 97:14035-7, 2000)
![Page 122: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/122.jpg)
Second Generation RNA profiling
RNA genes can be hard to detect
UGAGGUAGUAGGUUGUAUAGU
C. elegans let-7; 21 nt (Pasquinelli et al. Nature 408:86-89, 2000)
Often small
Sometimes multicopy and redundant
Often not polyadenylated (not represented in ESTs)
Immune to frameshift and nonsense mutations
No open reading frame, no codon bias
Often evolving rapidly in primary sequence
![Page 123: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/123.jpg)
ncRNAs in human genome
tRNA 600
18S rRNA 200
5.8S rRNA 200
28S rRNA 200
5S rRNA 200
snoRNA 300
miRNA 250
U1 40
U2 30
U4 30
U5 30
U6 20
U4atac 5
U6atac 5
U11 5
U12 5
SRP RNA 1
RNase P RNA 1
Telomerase RNA 1
RNase MRP 1
Y RNA 5
Vault 4
7SK RNA 1
Xist 1
H19 1
BIC 1
Antisense RNAs 1000s?
Cis reg regions 100s?
Others ?
![Page 124: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/124.jpg)
![Page 125: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/125.jpg)
Mapping Structural Variation in Humans
- Thought to be common: 12% of the genome (Redon et al. 2006)
- Likely involved in phenotype variation and disease
- Until recently most methods for detection were low resolution (>50 kb)
Copy number variants (CNVs): >1 kb segments
![Page 126: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/126.jpg)
Size Distribution of CNV in a Human Genome
![Page 127: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/127.jpg)
![Page 128: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/128.jpg)
![Page 129: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/129.jpg)
CONFIDENTIAL
Defining Epigenetics
Reversible changes in gene expression/function
Without changes in DNA sequence
Can be inherited from precursor cells
Allows integration of intrinsic and environmental signals (including diet)
Methylation I Epigenetics | Oncology | Biomarker
[Diagram labels: Genome (DNA); Epigenome (Chromatin); Gene Expression; Phenotype]
I NEXT-GEN | PharmacoDX | CRC
![Page 130: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/130.jpg)
![Page 131: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/131.jpg)
Epigenetic Regulation: Post-Translational Modifications to Histones and Base Changes in DNA
Epigenetic modifications of histones and DNA include:
– Histone acetylation and methylation, and DNA methylation
[Figure labels: Histone Acetylation (Ac); Histone Methylation (Me); DNA Methylation (Me)]
![Page 132: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/132.jpg)
![Page 133: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/133.jpg)
MGMT Biology
O6-Methylguanine Methyltransferase
Essential DNA Repair Enzyme
Removes alkyl groups from damaged guanine bases
Healthy individual:
- MGMT is an essential DNA repair enzyme
- Loss of MGMT activity makes individuals susceptible to DNA damage and prone to tumor development
Glioblastoma patient on alkylator chemotherapy:
- Patients with MGMT promoter methylation have longer progression-free survival (PFS) and overall survival (OS) with the use of alkylating agents as chemotherapy
![Page 134: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/134.jpg)
MGMT Promoter Methylation Predicts Benefit from DNA-Alkylating Chemotherapy
Post-hoc subgroup analysis of a temozolomide clinical trial in primary glioblastoma patients shows benefit for patients with MGMT promoter methylation
[Bar chart: Median Overall Survival in months; treatment labels: radiotherapy, radiotherapy plus temozolomide. Methylated MGMT gene: 21.7 months; non-methylated MGMT gene: 12.7 months]
Adapted from Hegi et al., NEJM 2005; 352(10):1036-8. Study with 207 patients
![Page 135: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/135.jpg)
Genome-wide methylation by methylation sensitive restriction enzymes
![Page 136: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/136.jpg)
Genome-wide methylation by probes
![Page 137: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/137.jpg)
Genome-wide methylation … by next generation sequencing
[Figure: axes # samples and # markers; stages Discovery, Verification, Validation]
![Page 138: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/138.jpg)
MBD_Seq
[Figure labels: Condensed Chromatin; DNA Sheared; Immobilized Methyl Binding Domain]
![Page 139: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/139.jpg)
MBD_Seq
Immobilized Methyl Binding Domain
MgCl2
Next Gen Sequencing
GA Illumina: 100 million reads
![Page 140: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/140.jpg)
MBD_Seq
MGMT = dual core
![Page 141: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/141.jpg)
Genome-wide methylation … by next generation sequencing
[Figure: axes # samples and # markers; Discovery stage: MBD_Seq, 1-2 million methylation cores]
![Page 142: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/142.jpg)
Data integration
Correlation tracks
[Figure: two methylation versus expression panels, one with Corr = -1 and one with Corr = 1]
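The two extremes on this slide (Corr = -1 and Corr = 1) can be reproduced with a plain Pearson correlation between methylation and expression values across samples. This is a sketch only: the slides do not say which correlation measure was used, so Pearson is an assumption, and the function name `pearson` and all numbers below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical values for one locus measured in five samples
methylation = [0.1, 0.3, 0.5, 0.7, 0.9]
expression_silenced = [9.0, 7.0, 5.0, 3.0, 1.0]   # expression drops as methylation rises
expression_activated = [1.0, 3.0, 5.0, 7.0, 9.0]  # expression rises with methylation

r_neg = pearson(methylation, expression_silenced)
r_pos = pearson(methylation, expression_activated)
print(round(r_neg, 3), round(r_pos, 3))
```

Computing one such value per locus along the genome would give a track that can be displayed next to the methylation and expression data.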
![Page 143: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/143.jpg)
Correlation track in GBM @ MGMT
[Figure: correlation track, scale +1 to -1]
![Page 144: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/144.jpg)
Genome-wide methylation … by next generation sequencing
[Figure: axes # samples and # markers; methods MBD_Seq, 454_BT_Seq, MSP; stages Discovery, Verification, Validation]
![Page 145: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/145.jpg)
Deep Sequencing
GCATCGTGACTTACGACTGATCGATGGATGCTAGCAT
unmethylated alleles (less methylation)
methylated alleles (more methylation)
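A sketch of how per-site methylation could be estimated from deep sequencing of many alleles. It assumes bisulfite-converted reads, in which an unmethylated C is read as T and a methylated C stays C. That is an assumption: the slide shows only the sequence and the allele labels. The helper names and the four simulated alleles are hypothetical, and reads are taken to be full-length, error-free and on the same strand as the reference.

```python
def cpg_positions(ref):
    """0-based index of the C in every CG dinucleotide of the reference."""
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def bisulfite_convert(ref, methylated):
    """Simulate one read: every C becomes T unless its index is in `methylated`."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(ref)
    )


def methylation_levels(ref, reads):
    """Fraction of reads that still show C at each CpG site."""
    levels = {}
    for pos in cpg_positions(ref):
        c = sum(1 for r in reads if r[pos] == "C")
        t = sum(1 for r in reads if r[pos] == "T")
        levels[pos] = c / (c + t) if (c + t) else None
    return levels


ref = "GCATCGTGACTTACGACTGATCGATGGATGCTAGCAT"  # sequence shown on the slide
sites = cpg_positions(ref)

# Hypothetical alleles, ranging from fully methylated to fully unmethylated
patterns = [set(sites), set(sites[:2]), set(sites[:1]), set()]
reads = [bisulfite_convert(ref, p) for p in patterns]

levels = methylation_levels(ref, reads)
print(sites)   # [4, 13, 21]
print(levels)  # {4: 0.75, 13: 0.5, 21: 0.25}
```

With thousands of reads per locus, the same counting separates mostly methylated from mostly unmethylated alleles and exposes mixed patterns within a sample.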
![Page 146: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/146.jpg)
Deep MGMT
Heterogenic complexity
![Page 147: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/147.jpg)
![Page 148: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/148.jpg)
![Page 149: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/149.jpg)
Translational Medicine: An inconvenient truth
• 1% of the genome codes for proteins; however, more than 90% is transcribed
• Less than 10% of experimentally measured protein can be “explained” from the genome
• 1 genome? Structural variation
• > 200 epigenomes??
• Space/time continuum …
![Page 150: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/150.jpg)
![Page 151: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/151.jpg)
![Page 152: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/152.jpg)
![Page 153: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/153.jpg)
Epigenetic (meta)information = stem cells
Cellular programming
![Page 154: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/154.jpg)
Cellular reprogramming
Tumor
Epigenetically altered, self-renewing cancer stem cells
Tumor Development and Growth
![Page 155: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/155.jpg)
Gene-specific epigenetic reprogramming
Cellular reprogramming
![Page 156: Bioinformatics life sciences_2012](https://reader033.vdocuments.mx/reader033/viewer/2022052523/55501b13b4c90555618b5040/html5/thumbnails/156.jpg)
biobix
wvcrieki
biobix.be
bioinformatics.be