1_introduction to bioinformatics
TRANSCRIPT
-
7/29/2019 1_introduction to Bioinformatics
1/28
341: Introduction to Bioinformatics
Dr. Nataša Pržulj & Prof. Yike Guo
Department of Computing
Imperial College London
Winter 2011
-
Course overview
Explosion in the availability of biological data:
Sequences and microarrays (Prof. Guo)
Networks: expected to be as useful as the sequence data in uncovering new biology (Dr. Pržulj)
The goal of systems biology:
Systems-level understanding of biological systems, e.g. the cell
Analyze not only individual components, but also their interactions and the functioning of the system as a whole
E.g.: Learn new biology from the topology of such interaction networks
However, biological network research faces considerable challenges:
Incomplete and noisy data
Computational infeasibility of many graph theoretic problems
-
Course overview
We will cover:
1. Biological aspects:
Basic biological concepts (e.g., DNA, genes, proteins, gene expression, ...)
Different types of biological networks
Experimental techniques for acquiring the data and their biases
Public databases and other sources of biological network data
2. Sequence analysis (Prof. Yi-Ke Guo)
3. Microarray analysis (Prof. Yi-Ke Guo)
4. Graph theoretic aspects:
Fundamental topics in graph theory (e.g., basic graph notation, graph representation, and special graph types)
Basic graph algorithms (e.g., graph search/traversal algorithms and running time analysis)
Important computational complexity concepts (e.g., complexity classes, subgraph isomorphism, and NP-completeness), which pose challenges for analyzing biological networks
5. Existing approaches for analyzing and modeling biological networks:
Structural properties of large networks
Network models
Network clustering
Graph alignment
Software tools for network analysis
6. Applications: interplay of topology and biology
Learn how the above methods have been applied
Discuss valuable insights gained into biological function, evolution, complex diseases (e.g., cancer), and drug discovery
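As a small illustration of the graph representation and traversal topics listed under item 4, the following is a minimal sketch, not course material. The five-node network and its node names are made up for illustration. It stores an undirected interaction network as an adjacency list and runs breadth-first search, which takes time proportional to the number of nodes plus the number of edges.

```python
from collections import deque

# Hypothetical toy interaction network: nodes stand for proteins,
# edges for interactions. The names are illustrative only.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]


def build_adjacency_list(edge_list):
    """Represent an undirected graph as a dict: node -> list of neighbours."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    return adjacency


def bfs_distances(adjacency, source):
    """Breadth-first search: number of edges on a shortest path from
    source to every node reachable from it."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:  # first visit is the shortest
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


network = build_adjacency_list(edges)
distances_from_a = bfs_distances(network, "A")
print(distances_from_a)
```

An adjacency list is used here because biological networks tend to be sparse; an adjacency matrix would also work but needs space quadratic in the number of nodes.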
-
Course overview
Grading scheme:
Two homework assignments
Each assignment is worth the same
Due at the beginning of class
Written exam
Standard College Grading Scheme will be used
-
Course overview
Course organization:
1. Lectures
Relevant theoretical concepts and examples
2. Tutorials
Exercises on concepts covered in class
3. Two homework assignments
Opportunity to solve practical problems using the methods learned in class
4. Written exam
Testing students' understanding of the concepts learned in lectures
-
Course overview
Textbooks and readings
Recommended textbooks:
Junker and Schreiber, Analysis of Biological Networks, Wiley, 2008.
West, Introduction to Graph Theory, 2nd edition, Prentice Hall, 2001, or
T. Cormen et al., Introduction to Algorithms, 3rd edition, MIT Press, 2009.
A list of up-to-date research papers selected by the instructor.
Recommended readings:
F. Kepes (Author, Editor), Biological Networks (Complex Systems and Interdisciplinary Science), World Scientific Publishing Company, 1st edition, 2007.
Bornholdt and Schuster (Editors), Handbook of Graphs and Networks: From the Genome to the Internet, Wiley, 2003, or
Dorogovtsev and Mendes (Authors), Evolution of Networks: From Biological Nets to the Internet and WWW (Physics), Oxford University Press, 2003.
Chapter 17 from: Chen and Lonardi (Editors), Biological Data Mining, Chapman and Hall/CRC Press, 2009.
Chapter 4 from: Jurisica and Wigle (Editors), Knowledge Discovery in Proteomics, CRC Press, 2005.
LEDA: A Platform for Combinatorial and Geometric Computing, by Kurt Mehlhorn and Stefan Näher, Cambridge University Press, 1999.
-
Course overview
When and where:
Fridays, 2-5 pm (2 hours of lecture, 1 hour of tutorial)
145 Huxley
Contact:
E-mail: [email protected]
Subject: 341 Bioinformatics
Office hours:
Fridays after class, 5 pm
Office: 407 A Huxley
-
Course overview
Prerequisites: none
Basic programming skills are desirable
An introduction to biological concepts will be provided
Course website (curriculum, class material, etc.):
http://www.doc.ic.ac.uk/~natasha/course/index.html
Academic code of honor
-
Topics
Introduction: biology (Dr. Przulj)
Sequence analysis (Prof. Guo, 2 lectures)
Microarray analysis (Prof. Guo, 3 lectures)
Network biology (Dr. Przulj):
Introduction to graph theory
Network properties
Network/node centralities
Network motifs
Network models
Network/node clustering
Network comparison/alignment
Software tools for network analysis
Interplay between topology and biology
-
Course overview
Any questions so far?
-
Course overview
About you
-
Introduction: biology
-
Introduction: biology
Cell: the building block of life
Cytoplasm and organelles separated by membranes:
Mitochondria, nucleus, etc.
-
Introduction: biology
Distinguish between:
Prokaryotes
Single-celled, no cell nucleus or any other membrane-bound organelles
The genetic material in prokaryotes is not membrane-bound
The bacteria and the archaea
Model organism: E. coli
Eukaryotes
Have "true" nuclei containing their DNA
May be unicellular, as in amoebae
May be multicellular, as in plants and animals
Model organism: S. cerevisiae (baker's yeast)
-
Introduction: biology
Nucleus contains DNA
Deoxyribonucleic acid
DNA nucleotides: A, T, C, and G (A pairs with T, C pairs with G)
DNA structure: double helix
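The base-pairing rule means that one strand of the double helix determines the other. A minimal sketch of this, assuming strands are written as plain strings over A, C, G, T; the function names and the example sequence are illustrative, not part of the lecture:

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Base-by-base partner of a DNA strand, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """The opposite strand of the double helix read in its own direction.
    The two strands run antiparallel, hence the reversal."""
    return complement(strand)[::-1]


example = "ATGCCG"  # made-up sequence
print(complement(example))
print(reverse_complement(example))
```

Applying `reverse_complement` twice returns the original strand, which is a quick sanity check on the pairing table.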
-
Introduction: biology
Chromosomes
RNA: similar to DNA, except that T is replaced by U and it is single-stranded
-
Introduction: biology
Main role of DNA: long-term storage of genetic information
Genes: DNA segments that carry this information
Intron: part of a gene that is not translated into protein; it is spliced out of the mRNA
Exon: part of a gene that is kept after splicing; the mRNA translated into protein consists only of exon-derived sequences
Genome: total set of (unique) genes in an organism
Every cell (except sex cells and mature red blood cells) contains the complete genome of an organism
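The intron and exon definitions can be made concrete with a small sketch of splicing. It assumes the exon positions are already known and given as 0-based, half-open (start, end) intervals. The transcript and the coordinates are made up for illustration, and `splice` is a hypothetical helper, not a library function:

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a transcript in order, dropping everything
    in between (the introns). Each exon is a 0-based, half-open
    (start, end) interval."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Made-up transcript: exon 1, one intron, exon 2.
exon_1 = "AUGGCC"
intron = "GUAAGUUUUCAG"
exon_2 = "UUUUAA"
pre_mrna = exon_1 + intron + exon_2

exons = [(0, 6), (18, 24)]  # hypothetical exon coordinates
mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)
```

Real splicing is carried out by cellular machinery that recognises sequence signals at intron boundaries; this sketch only models the outcome once the boundaries are known.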
-
Introduction: biology
Codons: sets of three nucleotides
4 nucleotides: 4^3 = 64 possible codons
Each codon codes for an amino acid (or a stop signal)
64 codons encode 20 different amino acids
More than one codon can stand for the same amino acid
Polypeptide:
String of amino acids, composed from a 20-character alphabet
Proteins: String composed of one or more polypeptides (70-3000 amino acids)
Sequence of amino acids is defined by a gene
Gene expression: information transmission from DNA to proteins
Proteome: total set of proteins in an organism
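The codon arithmetic above can be checked by brute-force enumeration. This short sketch also shows why codons need three nucleotides: words of length two give only 4^2 = 16 combinations, fewer than the 20 amino acids. The helper name `words` is illustrative:

```python
from itertools import product

NUCLEOTIDES = "ACGU"  # RNA alphabet; for DNA, replace U with T


def words(length):
    """All nucleotide strings of the given length."""
    return ["".join(letters) for letters in product(NUCLEOTIDES, repeat=length)]


doublets = words(2)
codons = words(3)

print(len(doublets))  # 4**2 combinations: too few for 20 amino acids
print(len(codons))    # 4**3 combinations: enough, with redundancy
```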
-
Introduction: biology
The 20 amino acids
-
Introduction: biology
Levels of protein structure:
-
Introduction: biology
Genes vs. proteins
Genes are passive; proteins are active
Protein synthesis: from genes to proteins
Transcription (in nucleus)
Splicing (eukaryotes)
Translation (in cytoplasm)
-
Introduction: biology
Transcription (in nucleus)
The RNA polymerase enzyme builds an RNA strand from a gene (the DNA is "unzipped")
The gene is transcribed to messenger RNA (mRNA)
Transcription is regulated by proteins called transcription factors
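At the sequence level, transcription can be sketched as a string operation. This is a simplified model that ignores regulation and where transcription starts and stops. It assumes strands are written 5' to 3' as strings over A, C, G, T; the function names and example sequences are illustrative:

```python
# Pairing used when RNA is built against a DNA template: A in DNA pairs with U in RNA.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_coding(coding_strand):
    """mRNA has the same sequence as the coding strand, with T replaced by U."""
    return coding_strand.replace("T", "U")


def transcribe_template(template_strand):
    """mRNA built against the template strand: complementary bases, and read
    in the opposite direction because the strands are antiparallel."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(template_strand))


coding = "ATGGCC"    # made-up coding strand
template = "GGCCAT"  # its reverse complement, i.e. the template strand
print(transcribe_coding(coding))
print(transcribe_template(template))
```

Both calls produce the same mRNA, since the coding and template strands carry the same information in complementary form.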
-
Introduction: biology
Translation (in cytoplasm)
Ribosomes synthesize proteins from mRNA
mRNA is decoded and used as a template to guide the synthesis of a chain of amino acids that form a protein
Translation: the process of converting the mRNA codon sequences into an amino acid polypeptide chain
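Decoding an mRNA codon by codon can be sketched as a table lookup. The sketch below uses the standard genetic code with one-letter amino acid symbols and `*` for a stop codon. As simplifying assumptions, translation starts at the first AUG and ends at the first stop codon in that reading frame. The function name and the example mRNA are illustrative:

```python
from itertools import product

# Standard genetic code, listed in UCAG order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG to the first stop codon (or the end)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCCUUUUAAGG"))  # made-up mRNA
```

The table also makes the redundancy of the code visible: counting entries per amino acid in `CODON_TABLE` shows that several codons map to the same amino acid.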
-
Introduction: biology
Microarrays:
Measure mRNA abundance for each gene
The amount of transcribed mRNA correlates with gene expression
Gene expression: the rate at which a gene produces the corresponding protein
It is hard to measure protein levels directly!
-
Introduction: biology
Every cell* contains the complete genome of an organism
How is the variety of different tissues encoded and expressed?
-
Introduction: biology
-ome and -omics
Genome and genomics
Proteome and proteomics