Post on 18-Dec-2015

Page 1: Microarray technology and analysis of gene expression data Hillevi Lindroos

Microarray technology and analysis of gene expression data

Hillevi Lindroos

Introduction to microarray technology

• Technique for studying gene expression for thousands of genes simultaneously.

• Study gene regulation, effects of treatments, differences between healthy and diseased cells...

• Comparative Genomic Hybridization (CGH):

- gene content in related strains/species

- gene dosage in cancer cells

• Microarray: glass slide with spots, each containing DNA from one gene

Two-colour spotted microarrays

Spot = PCR-product (~500 bp) from one gene or long oligonucleotide (~50 bp)

Differential expression (two samples compared)

Experimental procedure:

1. Isolate RNA from 2 samples (experiment and control).

2. Reverse transcribe to cDNA with fluorescently labelled nucleotides, e.g. Cy3-dCTP (control) or Cy5-dCTP (experiment).

3. Mix and hybridize to microarray.

4. Laser scan: measure fluorescent intensities

In principle...

Red spot: up-regulated gene, ratio >1

Green spot: down-regulated gene, ratio <1

Yellow spot: no differential expression, ratio =1

Red and green images superimposed:

[Figure: RNA for gene A from the sample (e.g. heat shock) is reverse transcribed (RT) with red dye, and RNA from the control with green dye. Equal amounts of cDNA are mixed and competitively hybridized to the microarray. A red dot in the image indicates up-regulation.]

Why differential expression?

Fluorescent intensities do not directly correspond to mRNA concentrations, due to:

• different shapes and densities of spots

• different hybridization properties between genes

• different amounts of dye incorporation between genes

Instead, compare intensities (expression) for the same spot between two samples: spot- and gene-specific effects largely cancel in the ratio.

Data processing and analysis

1. Image analysis

Locate spots in image

Quantify fluorescence intensity (spot + background)

Mean / median of pixel intensities

2. Background correction

– local background for each spot, or global for whole array

– assuming additive background:

Spot intensity = True intensity + Background
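The additive model above can be sketched in a few lines of Python. This is only an illustration: the function name `spot_signal`, the pixel values, the use of the median, and the clamping of negative results to zero are all assumptions made for the example, not part of the original slides.

```python
from statistics import median

def spot_signal(spot_pixels, background_pixels):
    """Background-corrected intensity for one spot in one channel.

    Additive model: spot intensity = true intensity + background,
    so true intensity = spot intensity - background.
    """
    foreground = median(spot_pixels)        # measured spot (signal + background)
    background = median(background_pixels)  # local background around the spot
    # Assumption for this sketch: clamp at zero, since a negative
    # intensity has no meaning and cannot be log-transformed later.
    return max(foreground - background, 0)

# Hypothetical pixel intensities for one spot and its surroundings
spot = [520, 540, 510, 530, 525]
local_bg = [100, 110, 95, 105, 100]
corrected = spot_signal(spot, local_bg)
print(corrected)  # 425
```

For a global correction, the same function could be called with one background list shared by all spots on the array.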

Output

Cy5 (R) and Cy3 (G) intensities

Ratio = R/G

~ [mRNA_experiment] / [mRNA_control]

Up-regulated genes: ratio > 1

Down-regulated genes: ratio between 0 and 1

Asymmetry! Up-regulation spans 1 to infinity, but down-regulation is squeezed into 0 to 1.

Use logarithm!

M = log2(ratio) is symmetrically distributed around 0

Up-regulated 2-fold: ratio = 2, M = 1

Down-regulated 2-fold: ratio = 0.5, M = -1
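A small sketch of the log transformation, using made-up intensities chosen to reproduce the 2-fold cases above (the function name `log_ratio` is an assumption for this example):

```python
import math

def log_ratio(r, g):
    """M = log2(R/G) for background-corrected intensities R (Cy5) and G (Cy3)."""
    return math.log2(r / g)

# Hypothetical (R, G) pairs: 2-fold up, 2-fold down, unchanged
for r, g in [(2000, 1000), (1000, 2000), (1000, 1000)]:
    print(f"ratio = {r / g}, M = {log_ratio(r, g)}")
```

A 2-fold change in either direction lies at the same distance from 0 on the M scale, which is the symmetry the raw ratio lacks.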

3. Normalization: correction of systematic errors (dye bias)

• different amounts of control and experiment samples

• different fluorescent intensities of Cy3 and Cy5

• different labelling and detection efficiencies

Dye bias: Most genes seem to be upregulated (higher Cy5 than Cy3 intensity).

Plot of Cy5 intensity (R) vs Cy3 intensity (G):

Corrected for by multiplying all Cy5 values by total_Cy3 / total_Cy5 (global normalization).

Assumes most genes unaffected by treatment.
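The scaling step might look like the following sketch. The function name `global_normalize` and the toy intensities (Cy5 uniformly twice as bright as Cy3) are assumptions for illustration only.

```python
def global_normalize(red, green):
    """Scale Cy5 (red) values by total_Cy3 / total_Cy5.

    Returns the scaled red values and the scaling factor.
    Relies on the assumption that most genes are unaffected
    by the treatment, so total signal should be equal.
    """
    factor = sum(green) / sum(red)
    return [r * factor for r in red], factor

# Toy data: every Cy5 value is twice the Cy3 value (pure dye bias)
red = [400.0, 800.0, 1600.0, 1200.0]
green = [200.0, 400.0, 800.0, 600.0]

red_scaled, factor = global_normalize(red, green)
print(factor)      # 0.5
print(red_scaled)  # [200.0, 400.0, 800.0, 600.0]
```

After scaling, the two channels have the same total intensity, so a gene that is truly unchanged ends up with a ratio near 1.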

Dye bias may depend on total spot intensity A, where A = ½(log2 R + log2 G), as well as on position on the array, print-tip…

Intensity-dependent dye bias

Correction:

M_normalized = M - M_trend(A), where M_trend(A) is the trend of M as a function of A, fitted to the M-A plot
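One way to sketch this correction is shown below. The slides do not say how the trend is fitted; a smoother such as lowess is a common choice, but to stay dependency-free this example substitutes a running median of M over spots with similar A. The function names, the `window` parameter, and the toy data are all assumptions.

```python
import math
from statistics import median

def ma_values(red, green):
    """Return (M, A) lists: M = log2(R/G), A = 0.5 * (log2 R + log2 G)."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    return m, a

def normalize_intensity_dependent(m, a, window=5):
    """M_normalized = M - M_trend(A).

    M_trend(A) is estimated here as the median M of the `window`
    spots closest in rank when sorted by A (a stand-in for lowess).
    """
    order = sorted(range(len(a)), key=lambda i: a[i])  # spot indices by increasing A
    half = window // 2
    normalized = [0.0] * len(m)
    for rank, i in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        trend = median(m[j] for j in order[lo:hi])
        normalized[i] = m[i] - trend
    return normalized

# Toy data: Cy5 is twice Cy3 at every intensity, i.e. a constant bias of M = 1
green = [2.0 ** k for k in range(6, 14)]
red = [2.0 * g for g in green]

m, a = ma_values(red, green)
m_norm = normalize_intensity_dependent(m, a, window=3)
print(m_norm)  # all zeros: the bias has been removed
```

With real data the trend would vary with A, and the window (or smoother span) would need to be wide enough that truly regulated genes do not pull the trend toward themselves.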

Identify differentially expressed genes

• Simple: cutoff (e.g. |M| > 1)

• Better: statistical test, e.g. t-test (replicate spots or repeated experiments) => significance

– Unstable mRNAs may have high ratios, and high variation!

– Weak spots: a small difference in signal may be a big relative difference (high ratio).
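To illustrate why a test on replicates can be more informative than a plain cutoff, here is a sketch of a one-sample t statistic on replicate M values (null hypothesis: mean M = 0). The function name, the replicate values, and the use of the two-sided 5% critical value for 4 degrees of freedom (2.776) are assumptions for the example.

```python
import math
from statistics import mean, stdev

def one_sample_t(m_values):
    """t statistic for H0: mean M = 0 (no differential expression).

    Needs at least two replicates with non-zero variance.
    """
    n = len(m_values)
    return mean(m_values) / (stdev(m_values) / math.sqrt(n))

# Hypothetical replicate M values for two genes, both with mean M = 1
stable = [1.1, 0.9, 1.0, 1.2, 0.8]   # consistent 2-fold up-regulation
noisy = [3.0, -1.0, 2.5, -0.5, 1.0]  # same mean, high variation

T_CRITICAL = 2.776  # two-sided 5% level, 4 degrees of freedom

for name, values in [("stable", stable), ("noisy", noisy)]:
    t = one_sample_t(values)
    print(name, round(t, 2), "significant" if abs(t) > T_CRITICAL else "not significant")
```

Both genes would be near the |M| > 1 cutoff on average, but only the stable one gives a large t statistic (roughly 14 versus roughly 1.3 here).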

Affymetrix GeneChips

Spots = 25-base oligonucleotides

Several probe pairs per gene: a perfectly matching probe + a probe with 1 mismatch

One sample per array

Fluorescent labelling (biotin-labelled target stained with a fluorescent streptavidin conjugate)

Expression level computed from the difference in intensity between matching and mismatching probes
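As a rough sketch of the idea, the expression level can be approximated by averaging the perfect-match minus mismatch difference over a gene's probe pairs. This is a deliberately simplified illustration with made-up intensities; the vendor's actual algorithms are more elaborate, and the function name is an assumption.

```python
from statistics import mean

def average_difference(pm, mm):
    """Mean of (perfect match - mismatch) over the probe pairs of one gene."""
    if len(pm) != len(mm):
        raise ValueError("each perfect-match probe needs a mismatch partner")
    return mean(p - m for p, m in zip(pm, mm))

# Hypothetical intensities for four probe pairs of one gene
pm = [1200, 1500, 900, 1100]  # perfect-match probes
mm = [300, 400, 350, 250]     # mismatch probes (non-specific signal)

expression = average_difference(pm, mm)
print(expression)  # 850
```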

Expression profiles

Plot expression over a series of experiments (e.g. time series)

[Figure: expression profiles of Gene_A and Gene_B. x-axis: Time (0 to 6); y-axis: M = log2(R/G) (-4 to 3).]

Clustering expression profiles

Analyze multiple experiments to identify common patterns of gene expression

Similar function – similar expression (co-regulation)

Goals:

• Identify regulatory motifs

• Infer function of unknown genes

• Distinguish cell types, e.g. tumors (cluster arrays)

Hierarchical clustering

Expression profile -> vector

Compute similarity between expression profiles (e.g. correlation coefficient)

Successively join the most similar genes to clusters, and clusters to superclusters
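The three steps above can be sketched as a naive agglomerative procedure. This example assumes Pearson correlation as the similarity and average linkage for comparing clusters; the gene names and profiles are invented, and the quadratic search is only suitable for small toy inputs.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient (undefined for constant profiles)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles):
    """Join the most similar clusters until one remains.

    profiles: dict mapping gene name -> expression vector.
    Returns the merges in order as (cluster_a, cluster_b, similarity).
    """
    clusters = [(name,) for name in profiles]  # start: one gene per cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean similarity over all cross-cluster pairs
                sim = mean(pearson(profiles[a], profiles[b])
                           for a in clusters[i] for b in clusters[j])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        sim, i, j = best
        merges.append((clusters[i], clusters[j], sim))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical M values over four time points
profiles = {
    "gene_A": [0.0, 1.0, 2.0, 3.0],
    "gene_B": [0.1, 1.1, 1.9, 3.2],
    "gene_C": [3.0, 2.0, 1.0, 0.0],
    "gene_D": [2.9, 2.2, 0.8, 0.1],
}

merges = average_linkage(profiles)
for a, b, sim in merges:
    print(a, "+", b, f"(similarity {sim:.3f})")
```

In this toy data the two rising genes are joined first, then the two falling genes, and the final merge of the two superclusters has a negative average correlation.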

Serum stimulation of human fibroblasts, time series.

A: cholesterol biosynthesis

B: cell cycle

C: immediate-early response

D: signaling and angiogenesis

E: wound healing

from: Eisen et al., 1998, PNAS 95(25): 14863-14868

Distance: correlation coefficient

Agglomeration: average linkage

Clustering of arrays: classification of cancer cells.

From Chen et al. (2002). Mol Biol Cell 13(6):1929-39

Exercise:

Normalization (Excel):

R-G plot

M-A plot

most up- and downregulated genes