UTACCEL 2010
Adventures in Biotechnology
Graham Cromar
Bioinformatics
Bioinformatics is about integrating biological themes together with the help of computer tools and biological databases, and gaining new knowledge from this.
Sanger sequencing
Automated Sequencing
In the past, separating DNA strands by electrophoresis was a time-consuming process. Today, fluorescent labels and advances in gel electrophoresis have made DNA sequencing fast and accurate. The process is also almost fully automated, including the readout of the final sequence.
Parallelizing Sequencing
Introduction 1.0
GenBank doubles every 14 months (from the National Center for Biotechnology Information)
That doubling time is shorter than Moore’s law (computer power doubling every 20 months!)
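To put the two doubling times side by side, here is a small worked calculation. It is only a sketch: the function name `growth_factor` and the 10-year horizon are illustrative choices, not part of the talk; the 14- and 20-month figures are the ones quoted above.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Factor by which a quantity grows over `years` if it doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


# Doubling times quoted on the slide; the 10-year horizon is an arbitrary choice.
genbank = growth_factor(10, 14)
computing = growth_factor(10, 20)

print(f"GenBank over 10 years:         x{genbank:.0f}")
print(f"Computing power over 10 years: x{computing:.0f}")
print(f"Data outpaces hardware by:     x{genbank / computing:.1f}")
```

With these figures, computing power grows 64-fold in a decade while GenBank grows roughly 380-fold, so the data outpaces the hardware by about a factor of six.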
Year  Genome                          Number of base pairs
1971  First published DNA sequence    12
1977  PhiX174                         5,375
1982  Lambda                          48,502
1992  Yeast Chromosome III            316,613
1995  Haemophilus influenzae          1,830,138
1996  Saccharomyces                   12,068,000
1998  C. elegans                      97,000,000
2000  D. melanogaster                 120,000,000
2001  H. sapiens (draft)              2,600,000,000
2003  H. sapiens                      2,850,000,000
Complexity does not always correlate with genome size. The largest genome known to date belongs to an amoeba!
The next step is to locate all of the genes and regulatory regions, describe their functions, and identify how they differ between groups (e.g. “disease” vs. “healthy”)… bioinformatics plays a critical role
Storage, search, retrieval and visualization are key
Bioinformatics will help with…
Structure-Function Relationships
Can we predict the function of protein molecules from their sequence?
sequence > structure > function
Prediction of some simple 3-D structures (α-helix, β-sheet, membrane spanning, etc.)
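As one illustration of predicting a simple structural feature from sequence alone, the sketch below scans a protein for candidate membrane-spanning segments using a sliding-window hydropathy average. This is an assumption about what such a prediction could look like, not necessarily the method shown in the talk: it uses the Kyte-Doolittle hydropathy scale, and the window of 19 residues and threshold of 1.6 are the values commonly quoted for that scale. The function names and the toy sequence are made up for the example.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every window of `window` residues along `seq`.

    Raises KeyError for characters outside the 20 standard amino acids.
    """
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_membrane_segments(seq, window=19, threshold=1.6):
    """Merge overlapping high-hydropathy windows into (start, end) spans.

    Indices are 0-based and `end` is exclusive.
    """
    segments = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score > threshold:
            start, end = i, i + window
            if segments and start <= segments[-1][1]:
                segments[-1] = (segments[-1][0], end)  # extend previous span
            else:
                segments.append((start, end))
    return segments


# Hypothetical toy sequence: a hydrophobic stretch flanked by charged residues.
toy = "DEKR" * 5 + "L" * 20 + "DEKR" * 5
print(candidate_membrane_segments(toy))
```

A hit from a scan like this is only a candidate; real predictors combine several signals, which is part of why structure prediction remains on the list of open challenges.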
BLAST Result
Basic Local Alignment Search Tool
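To make "local alignment" concrete, here is a minimal sketch of the Smith-Waterman algorithm, the exact dynamic-programming method for finding the best-matching region between two sequences. BLAST itself is a faster heuristic that approximates this kind of search, so the code below is not how BLAST works internally. The scoring values (match +2, mismatch -1, gap -2) and the two short DNA strings are illustrative assumptions.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # H[i][j]: best local score ending at a[i-1], b[j-1]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; scores never drop below zero, which is what
    # lets an alignment start anywhere (the "local" part).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score reaches zero.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


# Two made-up sequences that share the region ACGTACG.
score, top, bottom = smith_waterman("TTTACGTACGTTT", "GGGACGTACGGGG")
print(score)
print(top)
print(bottom)
```

The full matrix costs time proportional to the product of the two sequence lengths, which is why database-scale searches rely on heuristics such as BLAST instead.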
Micro-array analysis:
[Figure 4 and Figure 1 from the paper cited below]
Science Jan 1 1999: 83-87
The Transcriptional Program in the Response of Human Fibroblasts to Serum
Vishwanath R. Iyer, Michael B. Eisen, Douglas T. Ross, Greg Schuler, Troy Moore, Jeffrey C. F. Lee, Jeffrey M. Trent, Louis M. Staudt, James Hudson Jr., Mark S. Boguski, Deval Lashkari, Dari Shalon, David Botstein, Patrick O. Brown
Genetic Analysis of Cancer in Families
The Genetic Predisposition to Cancer
PubMed Text Neighboring
• Common terms could indicate similar subject matter
• Statistical method
• Weights based on term frequencies within document and within the database as a whole
• Some terms are better than others
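The weighting idea in the bullets above can be sketched with a generic TF-IDF scheme and cosine similarity. This is an assumption chosen for illustration; the exact weighting PubMed uses for related articles differs in its details. The function names are hypothetical, and the three "documents" are simply titles that appear elsewhere in this deck.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [w for w in words if w]


def tfidf_vectors(docs):
    """One {term: weight} dict per document.

    weight = (count in this document) * log(N / number of documents containing the term),
    so a term found in every document gets weight 0.
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * math.log(n / doc_freq[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


titles = [
    "Genetic Analysis of Cancer in Families",
    "The Genetic Predisposition to Cancer",
    "The Transcriptional Program in the Response of Human Fibroblasts to Serum",
]
vecs = tfidf_vectors(titles)
print(cosine(vecs[0], vecs[1]))  # the two cancer genetics titles
print(cosine(vecs[0], vecs[2]))  # cancer genetics vs. fibroblast paper
```

With only three titles, words like "of" and "the" are barely down-weighted; against a database of millions of papers they would score close to zero, which is the sense in which some terms are better than others.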
There are over 1 million papers published in the life sciences each year!
Top 10 Future Challenges for Bioinformatics
Precise, predictive model of transcription initiation and termination: ability to predict where and when transcription will occur in a genome
Precise, predictive model of RNA splicing/alternative splicing: ability to predict the splicing pattern of any primary transcript in any tissue
Precise, quantitative models of signal transduction pathways: ability to predict cellular responses to external stimuli
Determining effective protein:DNA, protein:RNA and protein:protein recognition codes
Accurate ab initio protein structure prediction
Rational design of small molecule inhibitors of proteins
Mechanistic understanding of protein evolution: understanding exactly how new protein functions evolve
Mechanistic understanding of speciation: molecular details of how speciation occurs
Continued development of effective gene ontologies - systematic ways to describe the functions of any gene or protein
Education: development of appropriate bioinformatics curricula for secondary, undergraduate and graduate education
Tutorial