Advanced Algorithms and Models for
Computational Biology
Class Overview
Eric Xing & Ziv Bar-Joseph
Lecture 1, January 18, 2005
Reading: Chap. 1, DTM book
Class webpage: http://www.cs.cmu.edu/~epxing/Class/10810-06/
Logistics
4 homeworks: 40% of grade. Theory exercises and implementation exercises.
Final project: 40% of grade. Applying what you learned in the class to a realistic, non-trivial CompBio problem.
Sequence analysis, network modeling, microarray mining, genetic polymorphism, evolution … If based on your own current research, expectations are higher; talk to us first …
Expected outcome: a deliverable software package based on a new/extended model or algorithm; addressing a biological problem via a thorough analysis of a realistic dataset; a mini-paper …
Collaboration policies … Two persons per team at most
Class participation and reading: 20% of grade
Logistics
No required textbooks, but suggested reading will be announced prior to every class from:
Durbin et al., Biological Sequence Analysis.
Deonier, Tavare and Waterman, Computational Genome Analysis.
Selected papers.
Mailing Lists:
Send email to [email protected] with: full name, department, register/audit
To contact the instructors: [email protected]
Class announcements list: [email protected]
Class Assistant: Monica Hopes, Wean Hall 4616, x8-5527
Books
Class Plan
Introduction (1week)
Biological Sequence Analysis (2.5 weeks)
Gene Expression Analysis (3 weeks)
Population Genetics (2 weeks)
Evolution and Phylogeny (2 weeks)
Systems Biology (4 weeks)
Introduction to Cell Biology, Functional Genomics, Development, Probability, etc.
Model Organisms
Bacterial Phage: T4
Bacteria: E. coli
The Budding Yeast: Saccharomyces cerevisiae
The Fission Yeast: Schizosaccharomyces pombe
FEATURES OF THE NEMATODE Caenorhabditis elegans
• SMALL: ~ 250 µm
• TRANSPARENT
• 959 CELLS
• 300 NEURONS
• SHORT GENERATION TIME
• SIMPLE GROWTH MEDIUM
• SELF-FERTILIZING HERMAPHRODITE
• RAPID ISOLATION AND CLONING OF MULTIPLE TYPES OF MUTANT ORGANISMS
The Nematode: Caenorhabditis elegans
The Fruit Fly: Drosophila melanogaster
The Mouse
transgenic for human growth hormone
Prokaryotic and Eukaryotic Cells
A Close Look at a Eukaryotic Cell
The structure:
The information flow:
Cell Cycle
Signal Transduction
A variety of plasma membrane receptor proteins bind extracellular signaling molecules and transmit signals across the membrane to the cell interior
Signal Transduction Pathway
Functional Genomics and X-omics
A Multi-resolution View of the Chromosome
Organism                              Base pairs (millions)   Encoded proteins    Chromosomes
PROKARYOTIC
Mycoplasma genitalium (bacterium)     0.58                    470                 1
Helicobacter pylori (bacterium)       1.67                    1590                1
Haemophilus influenzae (bacterium)    1.83                    1743                1
EUKARYOTIC
Saccharomyces cerevisiae (yeast)      12                      5885                17
Drosophila melanogaster (insect)      165                     13,601              4
Caenorhabditis elegans (worm)         97                      19,099              6
Homo sapiens (human)                  2900                    30,000 to 40,000    23
Arabidopsis thaliana (plant)          125                     25,498              10
DNA Content of Representative Types of Cells
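One way to read the genome table is as a gene-density comparison. The following is a minimal sketch, assuming the base-pair and protein counts listed on the slide; the names `GENOMES`, `proteins_per_megabase`, and `density_table` are illustrative and not part of the lecture. Human is left out of the list because the slide gives only a range (30,000 to 40,000 proteins).

```python
# Genome statistics taken from the slide's table:
# (organism, genome size in millions of base pairs, number of encoded proteins).
# Human is omitted because the slide reports a range rather than a single count.
GENOMES = [
    ("Mycoplasma genitalium", 0.58, 470),
    ("Helicobacter pylori", 1.67, 1590),
    ("Haemophilus influenzae", 1.83, 1743),
    ("Saccharomyces cerevisiae", 12, 5885),
    ("Drosophila melanogaster", 165, 13601),
    ("Caenorhabditis elegans", 97, 19099),
    ("Arabidopsis thaliana", 125, 25498),
]


def proteins_per_megabase(size_mb, n_proteins):
    """Return the number of encoded proteins per million base pairs."""
    return n_proteins / size_mb


def density_table(genomes):
    """Map each organism name to its protein density (proteins per Mb)."""
    return {name: proteins_per_megabase(size, n) for name, size, n in genomes}


if __name__ == "__main__":
    densities = density_table(GENOMES)
    # Print organisms from densest to sparsest genome.
    for name, d in sorted(densities.items(), key=lambda kv: -kv[1]):
        print(f"{name:26s} {d:8.1f} proteins per Mb")
```

With these numbers the three bacteria come out at several hundred proteins per megabase, while the multicellular eukaryotes are roughly an order of magnitude sparser, which is the contrast the table is meant to convey.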
Functional Genomics
The various genome projects have yielded the complete DNA sequences of many organisms, e.g. human, mouse, yeast, fruit fly, etc. Human: 3 billion base pairs, 30-40 thousand genes.
Challenge: go from sequence to function, i.e., define the role of each gene and understand how the genome functions as a whole.
Regulatory Machinery of Gene Expression
[Figure label: motif]
Classical Analysis of Transcription Regulation Interactions
“Gel shift”: electrophoretic mobility shift assay (“EMSA”) for DNA-binding proteins
[Figure: gel lanes showing the labeled (*) free DNA probe and the labeled (*) protein-DNA complex]
Advantage: sensitive
Disadvantage: requires stable complex; little “structural” information about which protein is binding
Modern Analysis of Transcription Regulation Interactions
Genome-wide Location Analysis (ChIP-chip)
Advantage: high throughput
Disadvantage: inaccurate
Gene Regulatory Network
Gene Expression networks
Regulatory networks
Protein-protein Interaction networks
Metabolic networks
Biological Networks and Systems Biology
Systems Biology:
understanding cellular events in a system-level context
Genome + proteome + lipidome + …
Gene Regulatory Functions in Development
Temporal-spatial Gene Regulation and Regulatory Artifacts
[Figure panels: A normal fly; Hopeful monster?]
Gene Regulation and Carcinogenesis
[Figure: cell-cycle and p53 regulatory pathway leading to cancer. Recoverable labels: cell-cycle phases G0 or G1, G1, S, G2, M; labels E, A, B; PCNA (not cycle specific); Gadd45 (DNA repair); Rb, E2F, Rb-P; phosphorylation of Rb by Cyclin/Cdk; p53, p21, p16, p15, p14; apoptosis (Fas, TNF, TGF-...); oncogenetic stimuli (i.e. Ras); extracellular stimuli (TGF-); edge labels: promotes, inhibits, activates, transcriptional activation; cell damage, time required for DNA repair, severe DNA damage; outcome: Cancer!]
[Figure panels: Normal, BCH, CIS, DYS, SCC]
The Pathogenesis of Cancer
Bio-technology: Manipulating the Genome
Restriction enzymes: enzymes, naturally occurring in bacteria, that cut DNA at very specific places.
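To make "cut DNA at very specific places" concrete, here is a minimal sketch of a restriction digest on a single strand of linear DNA. It assumes the EcoRI recognition site GAATTC, which is cut between the G and the first A; the function name `digest` and its parameters are illustrative and not from the lecture, and the sketch ignores the complementary strand and the sticky ends a real digest leaves.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut a linear DNA string at every occurrence of a recognition site.

    `site` is the recognition sequence (default: EcoRI, GAATTC) and
    `cut_offset` is how many bases into the site the cut falls
    (1 for EcoRI, i.e. G^AATTC). Only one strand is modeled.
    Returns the list of fragments in 5'-to-3' order.
    """
    dna = dna.upper()
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])   # fragment up to the cut point
        start = cut
        pos = dna.find(site, pos + 1)      # look for the next site
    fragments.append(dna[start:])          # trailing fragment after last cut
    return fragments


if __name__ == "__main__":
    # Two EcoRI sites give three fragments.
    print(digest("AAGAATTCTTGAATTCGG"))
```

A sequence with no recognition site comes back as a single uncut fragment, which is why the same enzyme yields a different fragment pattern for different DNA molecules.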
Recombinant DNA
Transformation
Formation of Cell Colony
How was Dolly cloned?
Dolly is an exact genetic replica of another sheep.
Definitions
Recombinant DNA: Two or more segments of DNA that have been combined by humans into a sequence that does not exist in nature.
Cloning: Making an exact genetic copy. A clone is one of the exact genetic copies.
Cloning vector: Self-replicating agents that serve as vehicles to transfer and replicate genetic material.
Software and Databases
NCBI/NLM Databases: GenBank, PubMed, PDB
[Figure labels: DNA, Protein, Protein 3D, Literature]