TRANSCRIPT
English okay?
Masters studies offer tracks. This is part of:
Lecture (VL) Microarray data analysis, Tuesday, 8:30 – 10:00
Exercises (Ü), Thursday, 10:15 – 11:45 (start: Oct. 23)
Next semester: practical course (Praktikum) + seminar. Thereafter, possibility for a Masters thesis.
Attendance is mandatory in lecture and exercises (sign-in list!).
Literature: see course web page.
1  21 Oct  Microarray technologies  Martin Vingron
2  28 Oct  Fundamentals of data analysis  Christine Steinhoff
3  4 Nov  Analysis of variance I  Christine Steinhoff
4  11 Nov  Analysis of variance II  Christine Steinhoff
5  18 Nov  LOWESS, variance stabilization  Anja von Heydebreck
6  25 Nov  Statistical testing  Anja von Heydebreck
7  2 Dec  Clustering methods  Anja von Heydebreck
8  9 Dec  Classification, linear discriminant analysis  Rainer Spang
9  16 Dec  Applications in cancer research  Rainer Spang
10  6 Jan  Principal component analysis  Martin Vingron
11  13 Jan  Statistical learning theory  Rainer Spang
12  20 Jan  Sequence annotation  Rainer Spang
13  27 Jan  Bayesian networks  Rainer Spang
14  3 Feb  Regulation  Martin Vingron
15  10 Feb  Summary, review, outlook
Functional Genomics:
Genome Sequencing:
• Determination of DNA sequence
• Derivation of amino acid sequences
• Analysis, comparison, classification
Study of gene function:
• Gene expression studies
• Proteomics
• Metabolic networks
DNA (gene) → transcription → messenger RNA (mRNA) → translation → protein (sequence, structure)
A cell and its population of genes:
What is the problem?
Determine the amount of mRNA for each gene that is present in a cell/tissue.
DNA forms double strands by a process called hybridization:
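Hybridization rests on Watson-Crick base pairing (A with T, C with G) between antiparallel strands. The following is a minimal sketch of that rule, not part of the original slides; the function names and the example sequence are made up for illustration.

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the strand that pairs with `strand`, read in 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def can_hybridize(strand_a: str, strand_b: str) -> bool:
    """True if the two strands are perfectly complementary (antiparallel)."""
    return strand_b == reverse_complement(strand_a)


# Hypothetical example sequence, chosen only for illustration.
target = "ATGGCGTAC"
probe = reverse_complement(target)
print(probe, can_hybridize(target, probe))
```

Real hybridization also tolerates partial matches depending on temperature and salt conditions; this sketch only models the perfect-match case.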
Labeling
Hybridization
Expression Arrays
• cDNA arrays, oligonucleotide arrays
• Glass arrays, membrane-based arrays
Glass Slide Microarrays
… were first produced at Stanford University (Schena et al., 1995).
Whole cDNA: 500-1500 bp
Filter “Macro”arrays
… were first published by Lennon and Lehrach, 1991.
[Figure size labels: ca. 21 cm; 7.5 x 2.5 cm]
Oligonucleotide Arrays
… were first published by Lockhart et al., 1996.
[Figure: probe set layout. A probe set consists of probe pairs 1 to 20; each probe pair has a perfect match (PM) and a mismatch (MM) probe cell.]
Reference sequence: ... TGTGATGGTGGGAATGGGTCAGAAGGACTCCTATGTGGGTGACGAGGCC
PM oligo: TTACCCAGTCTTCCTGAGGATACAC
MM oligo: TTACCCAGTCTTGCTGAGGATACAC
(ca. 25 bp)
Probe - Reference
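The relation between the reference sequence and the two oligos on this slide can be checked directly. The sketch below uses the three sequences exactly as printed on the slide; the helper name `complement` is an illustrative choice. It assumes the oligos are printed aligned under the reference, so the plain (non-reversed) complement is compared.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Sequences copied from the slide.
reference = "TGTGATGGTGGGAATGGGTCAGAAGGACTCCTATGTGGGTGACGAGGCC"
pm = "TTACCCAGTCTTCCTGAGGATACAC"  # perfect match (PM) oligo
mm = "TTACCCAGTCTTGCTGAGGATACAC"  # mismatch (MM) oligo


def complement(seq: str) -> str:
    """Base-wise complement, keeping the printed orientation."""
    return "".join(COMPLEMENT[base] for base in seq)


# Where in the reference does the PM oligo bind? (0-based offset, -1 if nowhere)
start = reference.find(complement(pm))

# Positions (0-based) at which PM and MM differ.
mismatches = [i for i, (a, b) in enumerate(zip(pm, mm)) if a != b]

print(len(pm), start, mismatches)
```

The PM oligo is complementary to a 25-base window of the reference, and the MM oligo differs from it at a single position in the middle of the oligo.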
There are other technologies, too, to estimate expression levels:
• EST sequencing: “electronic Northern”
• SAGE: tags of mRNAs are concatenated and sequenced
• Reliability of results depends on depth of probing (number of ESTs, number of tags)
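Why depth matters can be made concrete with a simple sampling model. The sketch below assumes, as a simplification not stated on the slide, that each sequenced tag is an independent draw and that a given transcript makes up a fixed fraction of the mRNA pool (binomial sampling). The function name and the example numbers are illustrative.

```python
import math


def relative_error(fraction: float, n_tags: int) -> float:
    """Coefficient of variation (std / mean) of a tag count under binomial sampling."""
    expected = n_tags * fraction
    std = math.sqrt(n_tags * fraction * (1.0 - fraction))
    return std / expected


# A transcript making up 0.1% of the mRNA pool (made-up value),
# measured at increasing sequencing depth.
for n in (1_000, 10_000, 100_000):
    print(n, relative_error(1e-3, n))
```

Under this model the relative error shrinks roughly with the square root of the number of tags, so rare transcripts need deep sampling before their counts become reliable.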
Why do we want to know?
• “Tissue profiling”: which genes are expressed in a tissue
• Comparing healthy and diseased (e.g., tumor) tissue
• Studying dynamic processes: E.g., cell cycle (time series)
Example: Renal clear cell carcinoma
Comparison of kidney cancer cells to normal tissue. Which genes are altered in their expression?
[Figure labels: T98-8880, N98-8880]
Molecular Genome Analysis, Dr. Judith Boer
Example: Cell cycle time course
Cell cycle phases: G1, S, G2, M
Spellman et al. took several samples per time point and hybridized the RNA to glass chips carrying all yeast genes.
Data processing
• Image collection
• Image analysis, intensity determination
• Within slide normalization
Hess et al., Trends in Biotech 19(11), 2001
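One very simple form of within-slide normalization for two-color data is to shift all log ratios so that their median is zero. This is only a sketch under the assumption that most genes are not differentially expressed; it is not necessarily the method used in the course (which lists LOWESS and variance stabilization for a later lecture). All names and intensity values are made up.

```python
import math
from statistics import median


def log_ratios(green, red):
    """log2(green / red) per spot."""
    return [math.log2(g / r) for g, r in zip(green, red)]


def median_center(values):
    """Subtract the slide-wide median so that the typical log ratio is 0."""
    m = median(values)
    return [v - m for v in values]


# Made-up foreground intensities: the green channel is globally twice as
# bright, and the last spot shows an additional real change.
green = [200.0, 400.0, 800.0, 1600.0, 3200.0]
red = [100.0, 200.0, 400.0, 800.0, 400.0]

normalized = median_center(log_ratios(green, red))
print(normalized)
```

After centering, the global dye bias is gone and only the last spot retains a nonzero log ratio.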
OUTPUT: Scanner + scanner software
Different technologies
• Support: membrane or glass slide
• Spotted material: PCR product or oligo (short/long)
• Labeling:
  – 1-channel: radioactive, Affy (absolute values)
  – 2-channel: 2-color fluorescent labeling (relative values)
Quality issues
Subpopulations: PCR
[Figure: empirical distribution function F̂ of log(fg.green/fg.red); panel title "43 a73-u02400vene.txt"; curves labeled Kidney1 to Kidney6 and Onco1 to Onco5]
Remedies: improve PCR protocols; model “random effect” through plate-wise calibration
Subpopulations: pin
[Figure: empirical distribution function F̂ of log(fg.green/fg.red); panel title "41 (a42-u07639vene.txt) by spotting pin"; curves labeled 1:1 to 4:4]
Remedies: handling of pins; pin-wise calibration
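A pin-wise (or plate-wise) calibration can be sketched as estimating one offset per group and subtracting it. The version below uses the per-group median of the log ratios as the offset; this is an illustrative assumption, not necessarily the calibration model meant on the slide. The function name and the data are made up.

```python
from collections import defaultdict
from statistics import median


def groupwise_center(log_ratios, groups):
    """Subtract from each log ratio the median of its own group (pin or plate)."""
    by_group = defaultdict(list)
    for value, group in zip(log_ratios, groups):
        by_group[group].append(value)
    offsets = {group: median(values) for group, values in by_group.items()}
    return [value - offsets[group] for value, group in zip(log_ratios, groups)]


# Made-up log ratios: spots printed by pin 1:1 are shifted downward.
pins = ["1:1", "1:1", "1:1", "1:2", "1:2", "1:2"]
values = [-0.5, -0.4, -0.6, 0.1, 0.0, 0.2]

calibrated = groupwise_center(values, pins)
print(calibrated)
```

After calibration both pin groups are centered at zero, so the systematic shift between pins no longer shows up as separate subpopulations.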
Distribution of intensities: log-normal?
[Figure: histogram and QQ plot, shown for intensities and for log intensities]
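The question of log-normality can be explored numerically by comparing sample quantiles with normal quantiles, which is what a QQ plot displays. The sketch below uses simulated intensities (not course data) and a simple, ad hoc deviation measure; both are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Pairs (theoretical normal quantile, standardized sample quantile)."""
    xs = sorted(sample)
    n = len(xs)
    mu, sd = mean(xs), stdev(xs)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((i + 0.5) / n), (x - mu) / sd)
            for i, x in enumerate(xs)]


def qq_deviation(sample):
    """Mean absolute distance of the QQ points from the diagonal."""
    points = qq_points(sample)
    return sum(abs(t - s) for t, s in points) / len(points)


# Simulated intensities drawn from a log-normal distribution (made-up parameters).
rng = random.Random(0)
intensities = [rng.lognormvariate(7.0, 1.0) for _ in range(2000)]
log_intensities = [math.log(x) for x in intensities]

print(qq_deviation(intensities), qq_deviation(log_intensities))
```

If the intensities are roughly log-normal, the QQ points of the log intensities lie close to the diagonal while those of the raw intensities bend away from it.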
Chip design
• Type of chip:
  – Global “whole genome” (yeast, Drosophila, mouse, human)
  – Domain specific, e.g. cancer, infection
• Spots:
  – PCR products: e.g., 3' UTR (avoid cross-hybridization)
  – Oligos: uniqueness, stability
Databases
• Stanford
• TIGR
• Gene expression atlas
• GEO
• ArrayExpress
• MIAME standard: Minimum Information About a Microarray Experiment
Software
• R + Bioconductor
• J-Express
• GeneSpring
• Rosetta Resolver
Affymetrix technology
• Per gene, spot 20 perfectly matching oligos and 20 oligos with 1 mismatch
• Intensity: weighted average of pixel intensities in perfect and mismatch oligos
(More on this next week)
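As a rough illustration of how PM and MM intensities of one probe set can be combined into a single expression value, the sketch below averages the PM minus MM differences with equal weights. The slide mentions a weighted average without giving the weights, so equal weighting is an assumption here, and this is not presented as the vendor's algorithm. Names and intensities are made up.

```python
def average_difference(pm, mm):
    """Mean of PM - MM over all probe pairs of one probe set (equal weights)."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("need equally many PM and MM intensities, at least one pair")
    return sum(p - m for p, m in zip(pm, mm)) / len(pm)


# Made-up intensities for a probe set with four probe pairs
# (a real probe set on the slide has 20 pairs).
pm_intensities = [1200.0, 900.0, 1500.0, 1100.0]
mm_intensities = [300.0, 400.0, 500.0, 200.0]

signal = average_difference(pm_intensities, mm_intensities)
print(signal)
```

Subtracting MM from PM is meant to remove the part of the signal that comes from unspecific binding, so that the summary reflects specific hybridization.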