introducci ó a la bioinformàtica roderic guigó i serra [email protected]

64
Introducció a la Bioinformàtica Roderic Guigó i Serra [email protected] Bioinformàtica, UPF Curs 2013-2014

Upload: karis

Post on 23-Feb-2016

27 views

Category:

Documents


0 download

DESCRIPTION

Introducci ó a la Bioinformàtica Roderic Guigó i Serra [email protected]. Bioinform àtica , UPF Curs 2013-2014. Van Leeuwenhoek. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Introducció a la BioinformàticaRoderic Guigó i [email protected]

Bioinformàtica, UPF Curs 2013-2014

Page 2: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 3: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Van Leeuwenhoek

In 1676 his credibility was questioned when he sent the Royal Society a copy of his first observations of microscopic single celled organisms. Heretofore, the existence of single celled organisms was entirely unknown … The Royal Society arranged to send an English vicar, as well as a team of respected jurists and doctors to Delft, Holland to determine whether it was in fact Van Leeuwenhoek's ability to observe and reason clearly (wikipedia)

Page 4: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 5: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 6: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 7: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 8: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 9: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 10: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

ACTCAGCCCCAGCGGAGGTGAAGGACGTCCTTCCCCAGGAGCCGGTGAGAAGCGCAGTCGGGGGCACGGGGATGAGCTCAGGGGCCTCTAGAAAGATGTAGCTGGGACCTCGGGAAGCCCTGGCCTCCAGGTAGTCTCAGGAGAGCTACTCAGGGTCGGGCTTGGGGAGAGGAGGAGCGGGGGTGAGGCCAGCAGCAGGGGACTGGACCTGGGAAGGGCTGGGCAGCAGAGACGACCCGACCCGCTAGAAGGTGGGGTGGGGAGAGCATGTGGACTAGGAGCTAAGCCACAGCAGGACCCCCACGAGTTGTCACTGTCATTTATCGAGCACCTACTGGGTGTCCCCAGTGTCCTCAGATCTCCATAACTGGGAAGCCAGGGGCAGCGACACGGTAGCTAGCCGTCGATTGGAGAACTTTAAAATGAGGACTGAATTAGCTCATAAATGGAAAACGGCGCTTAAATGTGAGGTTAGAGCTTAGAATGTGAAGGGAGAATGAGGAATGCGAGACTGGGACTGAGATGGAACCGGCGGTGGGGAGGGGGAGGGGGTGTGGAATTTGAACCCCGGGAGAGAAAGATGGAATTTTGGCTATGGAGGCCGACCTGGGGATGGGGAAATAAGAGAAGACCAGGAGGGAGTTAAATAGGGAATGGGTTGGGGGCGGCTTGGTAACTGTTTGTGCTGGGATTAGGCTGTTGCAGATAATGGAGCAAGGCTTGGAAGGCTAACCTGGGGTGGGGCCGGGTTGGGGTCGGGCTGGGGGCGGGAGGAGTCCTCACTGGCGGTTGATTGACAGTTTCTCCTTCCCCAGACTGGCCAATCACAGGCAGGAAGATGAAGGTTCTGTGGGCTGCGTTGCTGGTCACATTCCTGGCAGGTATGGGGCGGGGCTTGCTCGGTTTTCCCCGCTTCTCCCCCTCTCATCCTCACCTCAACCTCCTGGCCCCATTCAAGCACACCCTGGGCCCCCTCTTCTTCTGCTGGTCTGTCCCCTGAGGGGAAAGCCCAGGTCTGAGGCTTCTATGCTGCTTTCTGGCTCAGAACAGCGATTTGACGCTCTGTGAGCCTCGGTTCCTCCCCCGCTTTTTTTTTTTCAGCCAGAGTCTCACTCTGTCGCCCAGGCTGGAGTGCAGTGGCGCAATCTCAGCTCACTGCAAGCTCCGCCTCCCGGGTTCACGCTATTCTCCCGCCTCAGCCTCCCGAGTAGCTGGGACTACAGGCGCCCGCCACCATGCCCGGCTAATTTTTTGTACTTTGAGTAGGGAAGGGGTTTCACTGTATTATCCAGGATGGTCTCTATCTCCTGACCTCGTGATCTGCCCGCCTGGCCTCCCAAAGTGCTGGAATTACAGGCGTGAGCCTCCGCGCCCGGCCTCCCCATCCTTAATATAGGAGTTAGAAGTTTTTGTTTGTTTGTTTTGTTTTGTTTTTGTTTTGTTTTGAGATGAAGTCCCTCTGTCGCCCAGGCTGGAGTGCAGTGGCTCCCAGGCTGGAGTTCAGTGGCTGGATCTCGGCTCACTGCAAGCTCCGCCTCCCAGGTTCACGCCATTCTCCTGCCTCAGCCTCCGGAGTAGCTGGGACTACAGGAACATGCCACCACACCCGACTAACTTTTTTTGTATTTTTAGTAGAGACGGGGTTTCACCATGTTGGCCAGGCTGGTCTGGAACTCCTGACCTCAGGTGATCTGCCTGCTTCAACCTCCCAAAGTGCTGGGATTACAGACGTGGGCCACCGCGCCCGGCTGGGAGTTAAGAGGTTTCTAATGCATTGCATTAGAATACCAGACACGGGACAGCTGTGATCTTTATTCTCCATCACCCCACACAGCCCTGCCTGGGGCACACAAGGACACTCAATACACGCTTTTCGGGCGCGGTGGCTCAAGCTGTAATCCCAGCACTTTGGGAGGCTGAGGCGGGTGGTACATGAGGTCAGGAGATCGAGACCATCCTGGCTAACATGGTGAAACCCCGTCTCTACTAAAAATACAAAAAACTAGCCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGGAGGCTGAGGCAGGAGAATGGCGTGAACCTGGGAGGCGGAGCTTGCAGTGAGCCGAGATCGCGCCACTGCACTCCAGCCTGGGTGACACAGCGCGAGACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAATACACGCTTTTCCGCTAGGCACGGTGGCTCACCCCTGTAATCCCAGCATTTTGGGAGGCCAAGGTGGGAGGATCACTTGAGCCCAGGAGTTCAACACCAGACTCAGCAACATAGTGAGACTCTCTCTACTAAAAATACAAAAATTAGCCAGGCCTGGTGCCACACACCTGTGGTCCCAGCTACTCAGAAGGCTAAGGCAGGAGGATCGCTTAAGCCCAGAAGGTCAAGGTTGCAGTGAACCACGTTCAGGCCACTGCAGTCCAGCCTGGGTGACAGAGCAAGACCCTGTCTGTAAATAAATAACGCTTTTCAAGTGATTAAACAGACTCCCCCCTCACCCTGCCCACCATGGCTCCAAAGCAGCATTTGTGGAGCACCTTCTGTGTGCCCCTAGGTACTAGCTGCCTGGACGGGGTCAGAAGGAACCTGAACCACCTTCAACTTGTTCCACACAGGATGCCAGGCCAAGGTGGAGCAACCGGTGGAGCCAGAGACAGAACCCGACGTTCGCCAGCAGGCTGAGTGGCAGAGCGGCCAGCCCTGGGAGCTGGCACTGGGTCGCTTTTGGGATTACCTGCGCTGGGTGCAGACACTGTCTGAGCAGGTGCAGGAGGAGCTGCTCAGCCCCCAGGTCACCCAGGAACTGACGTGAGTGTCCCCATCCCGGCCCTTGACCCTCCTGGTGGGCGGCTATACCTCCCCAGGTCCAGGTTTCATTCTGCCCCTGCCACTAAGTCTTGGGGGCCTGGGTCTCTGCTGGTTCTAGCTTCCTCTTCCCATTTCTGACTCCTGGCTTTAGCTCTCTGGAATTCTCTCTCTCAGTTCTGTTTCTCCCTCTTCCCTTCTGACTCAGCCTGTCACACTCGTCCTGGCGCTGTCTCTGTCCTTCACTAGCTCTTTTATATAGAGACAGAGAGATGGGGTCTCACTGTGTTGCCCAGGCTGGTCTTGAACTTCTGGGCTCAAGCGATCCTCCCACCTCGCCTCCCAAAGTGCTGGGAATAGAGACATGAGCCACCTTGCTCGGCCTCCTAGCTCTTTCTTCGTCTCTGCCTCTGCTCTCTGCGTCTGTCTTTGTCTCCTCTCTGCCTCTGTCCCGTTCCTTCTCTCTTGGTTCACTGCCCTTCTGTCTCTCCCTGTTCTCCTTAGGAGACTCTCCTCTCTTCCTTCTCGAGTCTCTCTGGCTGATCCCCATCTCACCCACACCTATCC

Page 11: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

ACTCAGCCCCAGCGGAGGTGAAGGACGTCCTTCCCCAGGAGCCGGTGAGAAGCGCAGTCGGGGGCACGGGGATGAGCTCAGGGGCCTCTAGAAAGATGTAGCTGGGACCTCGGGAAGCCCTGGCCTCCAGGTAGTCTCAGGAGAGCTACTCAGGGTCGGGCTTGGGGAGAGGAGGAGCGGGGGTGAGGCCAGCAGCAGGGGACTGGACCTGGGAAGGGCTGGGCAGCAGAGACGACCCGACCCGCTAGAAGGTGGGGTGGGGAGAGCATGTGGACTAGGAGCTAAGCCACAGCAGGACCCCCACGAGTTGTCACTGTCATTTATCGAGCACCTACTGGGTGTCCCCAGTGTCCTCAGATCTCCATAACTGGGAAGCCAGGGGCAGCGACACGGTAGCTAGCCGTCGATTGGAGAACTTTAAAATGAGGACTGAATTAGCTCATAAATGGAAAACGGCGCTTAAATGTGAGGTTAGAGCTTAGAATGTGAAGGGAGAATGAGGAATGCGAGACTGGGACTGAGATGGAACCGGCGGTGGGGAGGGGGAGGGGGTGTGGAATTTGAACCCCGGGAGAGAAAGATGGAATTTTGGCTATGGAGGCCGACCTGGGGATGGGGAAATAAGAGAAGACCAGGAGGGAGTTAAATAGGGAATGGGTTGGGGGCGGCTTGGTAACTGTTTGTGCTGGGATTAGGCTGTTGCAGATAATGGAGCAAGGCTTGGAAGGCTAACCTGGGGTGGGGCCGGGTTGGGGTCGGGCTGGGGGCGGGAGGAGTCCTCACTGGCGGTTGATTGACAGTTTCTCCTTCCCCAGACTGGCCAATCACAGGCAGGAAGATGAAGGTTCTGTGGGCTGCGTTGCTGGTCACATTCCTGGCAGGTATGGGGCGGGGCTTGCTCGGTTTTCCCCGCTTCTCCCCCTCTCATCCTCACCTCAACCTCCTGGCCCCATTCAAGCACACCCTGGGCCCCCTCTTCTTCTGCTGGTCTGTCCCCTGAGGGGAAAGCCCAGGTCTGAGGCTTCTATGCTGCTTTCTGGCTCAGAACAGCGATTTGACGCTCTGTGAGCCTCGGTTCCTCCCCCGCTTTTTTTTTTTCAGCCAGAGTCTCACTCTGTCGCCCAGGCTGGAGTGCAGTGGCGCAATCTCAGCTCACTGCAAGCTCCGCCTCCCGGGTTCACGCTATTCTCCCGCCTCAGCCTCCCGAGTAGCTGGGACTACAGGCGCCCGCCACCATGCCCGGCTAATTTTTTGTACTTTGAGTAGGGAAGGGGTTTCACTGTATTATCCAGGATGGTCTCTATCTCCTGACCTCGTGATCTGCCCGCCTGGCCTCCCAAAGTGCTGGAATTACAGGCGTGAGCCTCCGCGCCCGGCCTCCCCATCCTTAATATAGGAGTTAGAAGTTTTTGTTTGTTTGTTTTGTTTTGTTTTTGTTTTGTTTTGAGATGAAGTCCCTCTGTCGCCCAGGCTGGAGTGCAGTGGCTCCCAGGCTGGAGTTCAGTGGCTGGATCTCGGCTCACTGCAAGCTCCGCCTCCCAGGTTCACGCCATTCTCCTGCCTCAGCCTCCGGAGTAGCTGGGACTACAGGAACATGCCACCACACCCGACTAACTTTTTTTGTATTTTTAGTAGAGACGGGGTTTCACCATGTTGGCCAGGCTGGTCTGGAACTCCTGACCTCAGGTGATCTGCCTGCTTCAACCTCCCAAAGTGCTGGGATTACAGACGTGGGCCACCGCGCCCGGCTGGGAGTTAAGAGGTTTCTAATGCATTGCATTAGAATACCAGACACGGGACAGCTGTGATCTTTATTCTCCATCACCCCACACAGCCCTGCCTGGGGCACACAAGGACACTCAATACACGCTTTTCGGGCGCGGTGGCTCAAGCTGTAATCCCAGCACTTTGGGAGGCTGAGGCGGGTGGTACATGAGGTCAGGAGATCGAGACCATCCTGGCTAACATGGTGAAACCCCGTCTCTACTAAAAATACAAAAAACTAGCCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGGAGGCTGAGGCAGGAGAATGGCGTGAACCTGGGAGGCGGAGCTTGCAGTGAGCCGAGATCGCGCCACTGCACTCCAGCCTGGGTGACACAGCGCGAGACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAATACACGCTTTTCCGCTAGGCACGGTGGCTCACCCCTGTAATCCCAGCATTTTGGGAGGCCAAGGTGGGAGGATCACTTGAGCCCAGGAGTTCAACACCAGACTCAGCAACATAGTGAGACTCTCTCTACTAAAAATACAAAAATTAGCCAGGCCTGGTGCCACACACCTGTGGTCCCAGCTACTCAGAAGGCTAAGGCAGGAGGATCGCTTAAGCCCAGAAGGTCAAGGTTGCAGTGAACCACGTTCAGGCCACTGCAGTCCAGCCTGGGTGACAGAGCAAGACCCTGTCTGTAAATAAATAACGCTTTTCAAGTGATTAAACAGACTCCCCCCTCACCCTGCCCACCATGGCTCCAAAGCAGCATTTGTGGAGCACCTTCTGTGTGCCCCTAGGTACTAGCTGCCTGGACGGGGTCAGAAGGAACCTGAACCACCTTCAACTTGTTCCACACAGGATGCCAGGCCAAGGTGGAGCAACCGGTGGAGCCAGAGACAGAACCCGACGTTCGCCAGCAGGCTGAGTGGCAGAGCGGCCAGCCCTGGGAGCTGGCACTGGGTCGCTTTTGGGATTACCTGCGCTGGGTGCAGACACTGTCTGAGCAGGTGCAGGAGGAGCTGCTCAGCCCCCAGGTCACCCAGGAACTGACGTGAGTGTCCCCATCCCGGCCCTTGACCCTCCTGGTGGGCGGCTATACCTCCCCAGGTCCAGGTTTCATTCTGCCCCTGCCACTAAGTCTTGGGGGCCTGGGTCTCTGCTGGTTCTAGCTTCCTCTTCCCATTTCTGACTCCTGGCTTTAGCTCTCTGGAATTCTCTCTCTCAGTTCTGTTTCTCCCTCTTCCCTTCTGACTCAGCCTGTCACACTCGTCCTGGCGCTGTCTCTGTCCTTCACTAGCTCTTTTATATAGAGACAGAGAGATGGGGTCTCACTGTGTTGCCCAGGCTGGTCTTGAACTTCTGGGCTCAAGCGATCCTCCCACCTCGCCTCCCAAAGTGCTGGGAATAGAGACATGAGCCACCTTGCTCGGCCTCCTAGCTCTTTCTTCGTCTCTGCCTCTGCTCTCTGCGTCTGTCTTTGTCTCCTCTCTGCCTCTGTCCCGTTCCTTCTCTCTTGGTTCACTGCCCTTCTGTCTCTCCCTGTTCTCCTTAGGAGACTCTCCTCTCTTCCTTCTCGAGTCTCTCTGGCTGATCCCCATCTCACCCACACCTATCC

Page 12: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

ACTCAGCCCCAGCGGAGGTGAAGGACGTCCTTCCCCAGGAGCCGGTGAGAAGCGCAGTCGGGGGCACGGGGATGAGCTCAGGGGCCTCTAGAAAGATGTAGCTGGGACCTCGGGAAGCCCTGGCCTCCAGGTAGTCTCAGGAGAGCTACTCAGGGTCGGGCTTGGGGAGAGGAGGAGCGGGGGTGAGGCCAGCAGCAGGGGACTGGACCTGGGAAGGGCTGGGCAGCAGAGACGACCCGACCCGCTAGAAGGTGGGGTGGGGAGAGCATGTGGACTAGGAGCTAAGCCACAGCAGGACCCCCACGAGTTGTCACTGTCATTTATCGAGCACCTACTGGGTGTCCCCAGTGTCCTCAGATCTCCATAACTGGGAAGCCAGGGGCAGCGACACGGTAGCTAGCCGTCGATTGGAGAACTTTAAAATGAGGACTGAATTAGCTCATAAATGGAAAACGGCGCTTAAATGTGAGGTTAGAGCTTAGAATGTGAAGGGAGAATGAGGAATGCGAGACTGGGACTGAGATGGAACCGGCGGTGGGGAGGGGGAGGGGGTGTGGAATTTGAACCCCGGGAGAGAAAGATGGAATTTTGGCTATGGAGGCCGACCTGGGGATGGGGAAATAAGAGAAGACCAGGAGGGAGTTAAATAGGGAATGGGTTGGGGGCGGCTTGGTAACTGTTTGTGCTGGGATTAGGCTGTTGCAGATAATGGAGCAAGGCTTGGAAGGCTAACCTGGGGTGGGGCCGGGTTGGGGTCGGGCTGGGGGCGGGAGGAGTCCTCACTGGCGGTTGATTGACAGTTTCTCCTTCCCCAGACTGGCCAATCACAGGCAGGAAGATGAAGGTTCTGTGGGCTGCGTTGCTGGTCACATTCCTGGCAGGTATGGGGCGGGGCTTGCTCGGTTTTCCCCGCTTCTCCCCCTCTCATCCTCACCTCAACCTCCTGGCCCCATTCAAGCACACCCTGGGCCCCCTCTTCTTCTGCTGGTCTGTCCCCTGAGGGGAAAGCCCAGGTCTGAGGCTTCTATGCTGCTTTCTGGCTCAGAACAGCGATTTGACGCTCTGTGAGCCTCGGTTCCTCCCCCGCTTTTTTTTTTTCAGCCAGAGTCTCACTCTGTCGCCCAGGCTGGAGTGCAGTGGCGCAATCTCAGCTCACTGCAAGCTCCGCCTCCCGGGTTCACGCTATTCTCCCGCCTCAGCCTCCCGAGTAGCTGGGACTACAGGCGCCCGCCACCATGCCCGGCTAATTTTTTGTACTTTGAGTAGGGAAGGGGTTTCACTGTATTATCCAGGATGGTCTCTATCTCCTGACCTCGTGATCTGCCCGCCTGGCCTCCCAAAGTGCTGGAATTACAGGCGTGAGCCTCCGCGCCCGGCCTCCCCATCCTTAATATAGGAGTTAGAAGTTTTTGTTTGTTTGTTTTGTTTTGTTTTTGTTTTGTTTTGAGATGAAGTCCCTCTGTCGCCCAGGCTGGAGTGCAGTGGCTCCCAGGCTGGAGTTCAGTGGCTGGATCTCGGCTCACTGCAAGCTCCGCCTCCCAGGTTCACGCCATTCTCCTGCCTCAGCCTCCGGAGTAGCTGGGACTACAGGAACATGCCACCACACCCGACTAACTTTTTTTGTATTTTTAGTAGAGACGGGGTTTCACCATGTTGGCCAGGCTGGTCTGGAACTCCTGACCTCAGGTGATCTGCCTGCTTCAACCTCCCAAAGTGCTGGGATTACAGACGTGGGCCACCGCGCCCGGCTGGGAGTTAAGAGGTTTCTAATGCATTGCATTAGAATACCAGACACGGGACAGCTGTGATCTTTATTCTCCATCACCCCACACAGCCCTGCCTGGGGCACACAAGGACACTCAATACACGCTTTTCGGGCGCGGTGGCTCAAGCTGTAATCCCAGCACTTTGGGAGGCTGAGGCGGGTGGTACATGAGGTCAGGAGATCGAGACCATCCTGGCTAACATGGTGAAACCCCGTCTCTACTAAAAATACAAAAAACTAGCCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGGAGGCTGAGGCAGGAGAATGGCGTGAACCTGGGAGGCGGAGCTTGCAGTGAGCCGAGATCGCGCCACTGCACTCCAGCCTGGGTGACACAGCGCGAGACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAATACACGCTTTTCCGCTAGGCACGGTGGCTCACCCCTGTAATCCCAGCATTTTGGGAGGCCAAGGTGGGAGGATCACTTGAGCCCAGGAGTTCAACACCAGACTCAGCAACATAGTGAGACTCTCTCTACTAAAAATACAAAAATTAGCCAGGCCTGGTGCCACACACCTGTGGTCCCAGCTACTCAGAAGGCTAAGGCAGGAGGATCGCTTAAGCCCAGAAGGTCAAGGTTGCAGTGAACCACGTTCAGGCCACTGCAGTCCAGCCTGGGTGACAGAGCAAGACCCTGTCTGTAAATAAATAACGCTTTTCAAGTGATTAAACAGACTCCCCCCTCACCCTGCCCACCATGGCTCCAAAGCAGCATTTGTGGAGCACCTTCTGTGTGCCCCTAGGTACTAGCTGCCTGGACGGGGTCAGAAGGAACCTGAACCACCTTCAACTTGTTCCACACAGGATGCCAGGCCAAGGTGGAGCAACCGGTGGAGCCAGAGACAGAACCCGACGTTCGCCAGCAGGCTGAGTGGCAGAGCGGCCAGCCCTGGGAGCTGGCACTGGGTCGCTTTTGGGATTACCTGCGCTGGGTGCAGACACTGTCTGAGCAGGTGCAGGAGGAGCTGCTCAGCCCCCAGGTCACCCAGGAACTGACGTGAGTGTCCCCATCCCGGCCCTTGACCCTCCTGGTGGGCGGCTATACCTCCCCAGGTCCAGGTTTCATTCTGCCCCTGCCACTAAGTCTTGGGGGCCTGGGTCTCTGCTGGTTCTAGCTTCCTCTTCCCATTTCTGACTCCTGGCTTTAGCTCTCTGGAATTCTCTCTCTCAGTTCTGTTTCTCCCTCTTCCCTTCTGACTCAGCCTGTCACACTCGTCCTGGCGCTGTCTCTGTCCTTCACTAGCTCTTTTATATAGAGACAGAGAGATGGGGTCTCACTGTGTTGCCCAGGCTGGTCTTGAACTTCTGGGCTCAAGCGATCCTCCCACCTCGCCTCCCAAAGTGCTGGGAATAGAGACATGAGCCACCTTGCTCGGCCTCCTAGCTCTTTCTTCGTCTCTGCCTCTGCTCTCTGCGTCTGTCTTTGTCTCCTCTCTGCCTCTGTCCCGTTCCTTCTCTCTTGGTTCACTGCCCTTCTGTCTCTCCCTGTTCTCCTTAGGAGACTCTCCTCTCTTCCTTCTCGAGTCTCTCTGGCTGATCCCCATCTCACCCACACCTATCC

Page 13: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 14: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 15: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

La matèria cromosòmica és “un cristall aperiòdic”, constituït per la successió d'un nombre petit d'elements isomèrics*, la seqüència concreta dels quals és la responsable de la seva funcionalitat.

(*) “the number of atoms in such a structure need not to be very large to produce an almost unlimited number of possible arrangements. For illustration, think of the Morse code…”

1943: Schroëdinger, “What is life?”

Page 16: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

ENIAC

Late 40s: first digital computers

Page 17: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 18: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 19: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 20: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 21: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

MALWTRLRPLLALLALWPPPPARAFVNQHLCGSHLVEALYLVCGERGFFYTPKARREVEGPQVGALELAGGPGAGGLEGPPQKRGIVEQCCASVCSLYQLENYCN

Amino acid sequence of the bovine insuline

Page 22: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

http://www.ict-science-to-society.org/

Early 60s: the genetic code

Page 23: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

GAGTTTTATCGCTTCCATGACGCAGAAGTTAACACTTTCGGATATTTCTGATGAGTCGAAAAATTATCTTGATAAAGCAGGAATTACTACTGCTTGTTTACGAATTAAATCGAAGTGGACTGCTGGCGGAAAATGAGAAAATTCGACCTATCCTTGCGCAGCTCGAGAAGCTCTTACTTTGCGACCTTTCGCCATCAACTAACGATTCTGTCAAAAACTGACGCGTTGGATGAGGAGAAGTGGCTTAATATGCTTGGCACGTTCGTCAAGGACTGGTTTAGATATGAGTCACATTTTGTTCATGGTAGAGATTCTCTTGT

MALWTRLRPLLALLALWPPPPARAFVNQHLCGSHLVEALYLVCGERGFFYTPKARREVEGPQVGALELAGGPGAGGLEGPPQKRGIVEQCCASVCSLYQLENYCN

Page 24: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

1957: invention of the programming language FORTRAN

Page 25: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Computers become smaller and therefore faster and cheaper

During the 60s computers are introduced into bancs, financial institutions, universities and research centers

60s: Transistors and integrated circuits

Page 26: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 27: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 28: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Sequence alignment and comparison

Page 29: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

substitution matrices

Page 30: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Sequence alignment The substitution matrices provided a model

under which the concept of optimal alignment could be formalized, and computed. The optimal alignment between two sequences is the alignment that maximizes the sum of the amino acid substitution values at each aligned position.

A R N D C Q A R N D C Q S K - E A E - S K E A E +1+3-1+3-2+2=6 -1+0+1+3-2+2=3

Page 31: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

The total number of possible alignments between two sequences of length 100 is approximately 10200.

With DP the number of operations required to obtain the optimal alignment is aproximately 3x1002

Query: 25 IPREVIERLARSQIHSIRDLQRLLEIDSVGSEDSLDTSLRAHGVHATKHVPEKRPLPIRR 84 IP E+ + L+ I S DLQRLL+ DS G ED + L H+ + R Sbjct: 10 IPEELYKMLSGHSIRSFDDLQRLLQGDS-GKEDGAELDLNMTRSHSGGELESLA----RG 64

Query: 85 KRSI------EEAVPAVCKTRTVIYEIPRSQVDPTSANFLIWPPCVEVKRCTGCCNTSSV 138 KRS+ E A+ A CKTRT ++EI R +D T+ANFL+WPPCVEV+RC+GCCN +VSbjct: 65 KRSLGSLSVAEPAMIAECKTRTEVFEISRRLIDRTNANFLVWPPCVEVQRCSGCCNNRNV 124

Query: 139 KCQPSRVHHRSVKVAKVEYVRKKPKLKEVQVRLEEHLECAC 179 +C+P++V R V+V K+E VRKKP K+ V LE+HL C CSbjct: 125 QCRPTQVQLRPVQVRKIEIVRKKPIFKKATVTLEDHLACKC 165

DYNAMIC PROGRAMMING,Nedleman and Wunsch, 1970Smith and Waterman, 1981

70’s: Optimal sequence alignment

Page 32: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

mid70’s: DNA sequencing, Sanger. Maxam and Gilbert

By the end of the sixties, hundreds of proteins had been sequenced, but the sequencing on nucleic acids remained elusive

Sanger (Cambridge)Maxam and Gilbert (Harvard)

Page 33: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Anys 70: Internet. Advanced Research Projects Agency

Page 34: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

gagttttatcgcttccatgacgcagaagttaacactttcggatatttctgatgagtcgaaaaattatcttgataaagcaggaattactactgcttgtttacgaattaaatcgaagtggactgctggcggaaaatgagaaaattcgacctatccttgcgcagctcgagaagctcttactttgcgacctttcgccatcaactaacgattctgtcaaaaactgacgcgttggatgaggagaagtggcttaatatgcttggcacgttcgtcaaggactggtttagatatgagtcacattttgttcatggtagagattctcttgttgacattttaaaagagcgtggattactatctgagtccgatgctgttcaaccactaataggtaagaaatcatgagtcaagttactgaacaatccgtacgtttccagaccgctttggcctctattaagctcattcaggcttctgccgttttggatttaaccgaagatgatttcgattttctgacgagtaacaaagtttggattgctactgaccgctctcgtgctcgtcgctgcgttgaggcttgcgtttatggtacgctggactttgtgggataccctcgctttcctgctcctgttgagtttattgctgccgtcattgcttattatgttcatcccgtcaacattcaaacggcctgtctcatcatggaaggcgctgaatttacggaaaacattattaatggcgtcgagcgtccggttaaagccgctgaattgttcgcgtttaccttgcgtgtacgcgcaggaaacactgacgttcttactgacgcagaagaaaacgtgcgtcaaaaattacgtgcggaaggagtgatgtaatgtctaaaggtaaaaaacgttctggcgctcgccctggtcgtccgcagccgttgcgaggtactaaaggcaagcgtaaaggcgctcgtctttggtatgtaggtggtcaacaattttaattgcaggggcttcggccccttacttgaggataaattatgtctaatattcaaactggcgccgagcgtatgccgcatgacctttcccatcttggcttccttgctggtcagattggtcgtcttattaccatttcaactactccggttatcgctggcgactccttcgagatggacgccgttggcgctctccgtctttctccattgcgtcgtggccttgctattgactctactgtagacatttttactttttatgtccctcatcgtcacgtttatggtgaacagtggattaagttcatgaaggatggtgttaatgccactcctctcccgactgttaacactactggttatattgaccatgccgcttttcttggcacgattaaccctgataccaataaaatccctaagcatttgtttcagggttatttgaatatctataacaactattttaaagcgccgtggatgcctgaccgtaccgaggctaaccctaatgagcttaatcaagatgatgctcgttatggtttccgttgctgccatctcaaaaacatttggactgctccgcttcctcctgagactgagctttctcgccaaatgacgacttctaccacatctattgacattatgggtctgcaagctgcttatgctaatttgcatactgaccaagaacgtgattacttcatgcagcgttaccatgatgttatttcttcatttggaggtaaaacctcttatgacgctgacaaccgtcctttacttgtcatgcgctctaatctctgggcatctggctatgatgttgatggaactgaccaaacgtcgttaggccagttttctggtcgtgttcaacagacctataaacattctgtgccgcgtttctttgttcctgagcatggcactatgtttactcttgcgcttgttcgttttccgcctactgcgactaaagagattcagtaccttaacgctaaaggtgctttgacttataccgatattgctggcgaccctgttttgtatggcaacttgccgccgcgtgaaatttctatgaaggatgttttccgttctggtgattcgtctaagaagtttaagattgctgagggtcagtggtatcgttatgcgccttcgtatgtttctcctgcttatcaccttcttgaaggcttcccattcattcaggaaccgccttctggtgatttgcaagaacgcgtacttattcgccaccatgattatgaccagtgtttccagtccgttcagttgttgcagtggaatagtcaggttaaatttaatgtgaccgtttatcgcaatctgccgaccactcgcgattcaatcatgacttcgtgataaaagattgagtgtgaggttataacgccgaagcggtaaaaattttaatttttgccgctgaggggttgaccaagcgaagcgcggtaggttttctgcttaggagtttaatcatgtttcagacttttatttctcgccataattcaaactttttttctgataagctggttctcacttctgttactccagcttcttcggcacctgttttacagacacctaaagctacatcgtcaacgttatattttgatagtttgacggttaatgctggtaatggtggttttcttcattgcattcagatggatacatctgtcaacgccgctaatcaggttgtttctgttggtgctgatattgcttttgatgccgaccctaaattttttgcctgtttggttcgctttgagtcttcttcggttccgactaccctcccgactgcctatgatgtttatcctttgaatggtcgccatgatggtggttattataccgtcaaggactgtgtgactattgacgtccttccccgtacgccgggcaataacgtttatgttggtttcatggtttggtctaactttaccgctactaaatgccgcggattggtttcgctgaatcaggttattaaagagattatttgtctccagccacttaagtgaggtgatttatgtttggtgctattgctggcggtattgcttctgctcttgctggtggcgccatgtctaaattgtttggaggcggtcaaaaagccgcctccggtggcattcaaggtgatgtgcttgctaccgataacaatactgtaggcatgggtgatgctggtattaaatctgccattcaaggctctaatgttcctaaccctgatgaggccgcccctagttttgtttctggtgctatggctaaagctggtaaaggacttcttgaaggtacgttgcaggctggcacttctgccgtttctgataagttgcttgatttggttggacttggtggcaagtctgccgctgataaaggaaaggatactcgtgattatcttgctgctgcatttcctgagcttaatgcttgggagcgtgctggtgctgatgcttcctctgctggtatggttgacgccggatttgagaatcaaaaagagcttactaaaatgcaactggacaatcagaaagagattgccgagatgcaaaatgagactcaaaaagagattgctggcattcagtcggcgacttcacgccagaatacgaaagaccaggtatatgcacaaaatgagatgcttgcttatcaacagaaggagtctactgctcgcgttgcgtctattatggaaaacaccaatcttcccaagcaacagcaggtttccgagattatgcgccaaatgcttactcaagctcaaacggctggtcagtattttaccaatgaccaaatcaaagaaatgactcgcaaggttagtgctgaggttgacttagttcatcagcaaacgcagaatcagcggtatggctcttctcatattggcgctactgcaaaggatatttctaatgtcgtcactgatgctgcttctggtgtggttgatatttttcatggtattgataaagctgttgccgatacttggaacaatttctggaaagacggtaaagctgatggtattggctctaatttgtctaggaaataaccgtcaggattgacaccctcccaattgtatgttttcatgcctccaaatcttggaggcttttttatggttcgttcttattacccttctgaatgtcacgctgattattttgactttgag

1977: X174 virus genome

Page 35: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

1982: the first electronic databases

Page 36: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

FASTA, 1982: Wilbur and Lipman, 1985: Lipman and PearsonBLAST, 1990: Altschul, Gish, Miller, Myers and Lipman

accelerating database searches hash methods

1 2 3 4 5 6 7 8 9 10 11 12 13

W A T S N A N D C R I C K

A C D I K N R S T W26

912

8 11

13

57

10 4 3 1

Query Sequence

Hash table K=1

http://www.ccl.rutgers.edu/~ouyang/5020/FASTA-BLAST.ppt

Page 37: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Search of the Platelet Derived Growth Factor sequence

1982, Doolittle: relationship between oncogenes and growth factors

Page 38: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 39: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

1990:The human genome projectTHE HUMAN GENOME PROGRAM (HGP) is producing large quantities of complex map and DNA sequence data. Informatics projects in algorithms, software, and databases are crucial in accumulating and interpreting these data in a robust and automated fashion at genome and sequencing centers

Computer systems play essential roles in all aspects of genome research, from data acquisition and analysis to data management. Without powerful computers and appropriately designed data–management systems, high–volume genome research cannot proceed.

Page 40: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

This proposal concerns the management of general information about accelerators and experiments at CERN. It discusses the problems of loss of information about complex evolving systems and derives a solution based on a distributed hypertext system (Tim Berners-Lee)

1990:WWW at CERN

Page 41: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Human Genome Project Milestones

Page 42: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

2001: la culminació del projecte

Page 43: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 44: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

bioinformatics

Medline articles with keyword Bioinformatics.

year # articlesTo 1990 0

Page 45: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

bioinformatics

Medline articles with keyword Bioinformatics.

year # articlesTo 1990 0

1990-1994 15

Page 46: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

bioinformatics

Medline articles with keyword Bioinformatics.

year # articlesTo 1990 0

1990-1994 151995-1999 823

Page 47: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

bioinformatics

Medline articles with keyword Bioinformatics.

year # articlesTo 1990 0

1990-1994 151995-1999 8232000-2004 7827

Page 48: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

bioinformatics

Medline articles with keyword Bioinformatics.

year # articlesTo 1990 0

1990-1994 151995-1999 8232000-2004 78272005-2008 18822

Page 49: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

bioinformatics

Medline articles with keyword Bioinformatics.

year # articlesTo 1990 0

1990-1994 151995-1999 8232000-2004 78272005-2009 188222010-2013 52835

Page 50: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Bioinformatics, Genomics, Systems Biology in Medline

Page 51: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

What is past, is a prologue

W. Shakespeare, La Tempestad,

Page 52: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Evolution of Sequencing technologies

Page 53: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Evolution of sequencing technologies

• The human genome project– 12-15 years– 5 large centers: hundreds of instruments– Hundreds of scientists world wide– 3,000 millions dollars

• Today– A single instrument

2-3 genomes/today– 3,000 dollars/genoma

Page 54: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

ENIAC, 1950s2.4 x 0.9 x 30 (m) 385 operations/second. 10-6 operations/second/cm3

Page 55: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

ENIAC, 1950s2.4 x 0.9 x 30 (m) 385 operations/second. 10-6 operations/second/cm3

MAC AIR, 2010s~1 x 32.5 x 22.7 (cm) 133,656,056 operations/second. 105 operations/second/cm3

Page 56: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

CELERA GENOMICS, year 20001,000 m2. 2 yr. 3GB at 10x5x10-6 Gb/day/m3

Page 57: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

CELERA GENOMICS, year 20001,000 m2. 2 yr. 3GB at 10x5x10-6 Gb/day/m3

HISEQ 2500. year 2012119 x 94 x 76 (cm). 1 day 120 Gb102 Gb/day/m3

Page 58: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

Moore’s Law

Page 59: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 60: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 61: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat
Page 62: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

En un futur proper,disposarem del

genoma individual de cadascu de nosaltres

Page 63: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat

El futur es aqui: el nostre genoma

Page 64: Introducci ó a la Bioinformàtica Roderic Guigó i Serra roderic.guigo@crg.cat