array platforms


Upload: redford

Post on 22-Jan-2016


TRANSCRIPT

Page 1: Array Platforms

Array Platforms

• 16K Agilent inkjet printed cDNA arrays

– The recently developed inkjet printing method (Agilent Technologies) produces more uniform spots than pin spotting techniques

– Array includes cDNAs selected from the RIKEN FANTOM collection supplemented by cDNAs from AfCS protein list

• Affymetrix GeneChip system

– U74A v.2 chip (represents approx. 13,000 mouse genes)

• 16K Agilent inkjet printed oligonucleotide arrays (in preparation)

– Operon 70mers (13,443) and Compugen 65mers (2,304)

Page 2: Array Platforms

Ligand Screen Transcript Analysis

• B cell samples prepared by Cell Lab.

• Cultured for different time periods (0.5, 1, 2, and 4 hr) in the presence or absence of ligands before harvesting for total RNA isolation.

• Treated and untreated time-course samples hybridized against a spleen reference.

• After removing the common spleen denominator, comparison to 0 time point data reflects the changes in mRNA levels due to ligand treatment and/or time in culture.

• All of the experiments were done in triplicate. Including controls, more than 450 arrays were used.

Page 3: Array Platforms

Molecular Biology Laboratory

Microarray & Analysis

Sangdun Choi

Xiaocui Zhu

Rebecca Hart

Anna Cao

Mi Sook Chang

Jong Woo Kim

Sun Young Lee

Page 4: Array Platforms

a. Calculate gene expression value: compute log2(Treated/0hr) = log2(Treated/Spleen) – log2(0hr/Spleen) using processedSignalIntensity.

b. Hierarchical cluster: use genes showing >= 2-fold change in at least one condition, while keeping ligands in alphabetical/time-course order.

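[Heat map layout: rows are genes (Gene 1, Gene 2, Gene 3, …); columns are conditions in time-course order (30 min, 1 hr, 2 hr, 4 hr; 30 min 2MA, 1 hr 2MA, 2 hr 2MA, 4 hr 2MA; 30 min AIG, 1 hr AIG, 2 hr AIG, 4 hr AIG; …). Annotations: "Average of triplicates" and "Average of 6-23 replicates". 5281 genes, 132 conditions.]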

Clustering Analysis of Gene Expression Profile Using log2Ratio (Treated/0hr)
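The calculation in steps a and b can be sketched in a few lines of Python. This is only an illustration, not the pipeline used for the slides: the function names, the example ratios, and the reading of ">= 2 fold change" as a change in either direction are assumptions.

```python
import math

def log2_treated_vs_0hr(treated_over_spleen, zero_hr_over_spleen):
    """log2(Treated/0hr) from two ratios measured against the shared spleen reference."""
    return math.log2(treated_over_spleen) - math.log2(zero_hr_over_spleen)

def passes_fold_change(log2_ratios, min_fold=2.0):
    """True if the gene changes by at least min_fold (up or down) in any condition.

    Treating the cutoff as two-sided is an assumption of this sketch.
    """
    threshold = math.log2(min_fold)
    return any(abs(r) >= threshold for r in log2_ratios)

# Hypothetical values for one gene: Treated/Spleen per condition and 0hr/Spleen.
treated = {"AIG_30min": 1.0, "AIG_1hr": 2.4, "AIG_2hr": 4.8, "AIG_4hr": 1.2}
zero_hr = 1.2

profile = {cond: log2_treated_vs_0hr(v, zero_hr) for cond, v in treated.items()}
keep = passes_fold_change(profile.values())
```

Because both ratios share the spleen denominator, subtracting the two log ratios cancels the reference, which is why samples hybridized on different arrays can be compared.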

Page 5: Array Platforms

[Heat map: rows are genes (clustered); columns are ligands in time-course order (i.e. medium: 30 min, 1 hr, 2 hr, 4 hr; 2MA: 30 min, 1 hr, 2 hr, 4 hr; …)]

Page 6: Array Platforms

Genes up regulated in AIG, CD40L, IL4, LPS and CpG

[Heat map: 317 features; image contrast: 1.07. Columns: IL4, LPS, CD40L, AIG, CpG, None. Labeled genes: Ccnd2, Cdk4, Caspase 4, Bax, Ak2, Hk2, Atf, cdk6, Ifrd2.]

Page 7: Array Platforms

Genes down regulated in AIG, CD40L, IL4, LPS and CpG

[Heat map: 319 features; image contrast: 1.07. Columns: IL4, LPS, CD40L, AIG, CpG, None. Labeled genes: id3, Bnip3l, Gnai2, Gprk6, Bcap31, cAMP-GEFII.]

Page 8: Array Platforms

Genes showing AIG & CD40L specific changes

[Heat map: 235 features; image contrast: 1.16. Columns: IL4, LPS, CD40L, AIG, CpG, None. Labeled genes: Par-6, Gadd45b, Dagk1, Mapk12, IL3ra, IL10ra.]

Page 9: Array Platforms

Genes up regulated in IL4

[Heat map: 42 features; image contrast: 1.14. Columns: IL4, LPS, CD40L, AIG, CpG, None. Labeled genes: Socs1, Caspase 6, Xbp1, Rgs14, Dapp1.]

Page 10: Array Platforms

Genes showing AIG specific changes

[Heat map: 65 features; image contrast: 1.54. Columns: IL4, LPS, CD40L, AIG, CpG, None. Labeled genes: Stress induced protein, Bak1, apolipoprotein E, Bcl2l11, LTb.]

Page 11: Array Platforms

Madhusudan Natarajan

Rama Ranganathan

Page 12: Array Platforms

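[Figure: the basal value and the observed value, with spreads $\sigma_{\text{basal}}$ and $\sigma_{\text{obs}}$]

$$z = \frac{\text{obs} - \text{basal}}{\sigma_{\text{basal}}}$$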

Clustering Analysis of Gene Expression Profile Using Z Score

Z score: a measurement of the distance between an observed value and the mean of a population

Page 13: Array Platforms

a. Calculate gene expression metric, x: for each gene i on a given chip j, $x_{ij}$ = {rMedianIntensity (treated) / gMedianIntensity (spleen)} / $\bar{x}_j$, where $\bar{x}_j$ is the mean intensity ratio of all genes on chip j.

c. Calculate the mean and standard deviation of gene expression in 27 sets of 0hr untreated data: for each gene i, calculate the mean ($\mu_i$) and the standard deviation ($\sigma_i$) of expression on the 27 0hr chips.

d. Calculate Z score as a measurement of differential expression from the 0hr condition: for each gene i on a given chip j, $Z_{ij} = (x_{ij} - \mu_i) / \sigma_i$.

f. Cluster genes and ligands using Z score: use genes whose Z > 2 in any of the ligands.

Clustering Analysis of Gene Expression Profile Using Z Score
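A minimal Python sketch of the Z-score steps above may help make them concrete. It is not the code used for the slides: the function names and the toy data are hypothetical, three baseline chips stand in for the 27 0hr chips, and the use of the sample (n-1) standard deviation is an assumption.

```python
import statistics

def normalize_chip(ratios):
    """Divide each gene's treated/spleen ratio by the chip-wide mean ratio."""
    chip_mean = statistics.fmean(ratios.values())
    return {gene: r / chip_mean for gene, r in ratios.items()}

def z_scores(chip, baseline_chips):
    """Z_ij = (x_ij - mu_i) / sigma_i, with mu_i and sigma_i taken from 0hr chips."""
    scores = {}
    for gene, x in chip.items():
        baseline = [b[gene] for b in baseline_chips]
        mu = statistics.fmean(baseline)
        sigma = statistics.stdev(baseline)  # sample standard deviation (assumption)
        scores[gene] = (x - mu) / sigma
    return scores

# Hypothetical, already normalized values; the slides use 27 0hr chips.
baseline = [
    {"g1": 1.0, "g2": 2.0},
    {"g1": 1.2, "g2": 2.1},
    {"g1": 0.8, "g2": 1.9},
]
chip = {"g1": 1.6, "g2": 2.05}

z = z_scores(chip, baseline)
selected = [gene for gene, score in z.items() if score > 2]  # Z > 2 filter
```

Scaling by each gene's own 0hr variability means a noisy gene needs a larger change than a stable gene to pass the same cutoff, which a fixed fold-change threshold does not account for.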

Page 14: Array Platforms

Clustering ligand based on Z scores

Page 15: Array Platforms

AfCS Data Analysis- Microarray

Dennis Mock

UC Principal Statistician

University of California, San Diego

Director: Shankar Subramaniam

Acknowledgment: Eugene Ke, Bob Sinkovits, Brian Saunders

Page 16: Array Platforms

Two-way hierarchical clustering (unsupervised): ligands (n=33)

(0hr, .5h, 1h, 2h, 4h)

Note: the ligands cluster according to early vs. late conditions with 90-100% accuracy

(metrics: sample = Euclidean; gene = Pearson)

[Cluster labels on the figure: 0 hr; early (.5-1 hr); late (2-4 hr); non-mitogenic; mitogenic; interleukins]

Dennis Mock - UCSD
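The two metrics named on the clustering slide (Euclidean between samples, Pearson between genes) can be sketched as follows. The slide does not say how the Pearson correlation was turned into a distance, so the 1 - r form below is an assumption, as are the function names.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two sample (column) vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_distance(a, b):
    """1 - Pearson correlation between two gene (row) profiles.

    Using 1 - r as the distance is an assumption of this sketch.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)
```

Pearson distance groups genes whose profiles have the same shape regardless of magnitude, while Euclidean distance between samples also reflects how large the overall response is.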

Page 17: Array Platforms

Significance analysis of microarrays* (SAM) (R. Tibshirani, G. Chu 2002)

Objective: The replicated expression values for each gene at the 4hr time condition (untreated vs ligand) are used to determine whether the gene is statistically differentially up- or down-regulated.

The t-statistics for all the genes are ordered and noted. The labels are then permuted and the t-statistics are calculated again. After many iterations, the cumulative t-statistic is averaged for each gene. Finally, for a given false positive rate (called the "False Discovery Rate" or FDR), the significant genes are selected.

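For each gene, define the adjusted "t-statistic" as follows:

$$\text{adjusted } t = \frac{\bar{x}_{\text{treated}} - \bar{x}_{\text{untreated}}}{\sigma + \text{adjustment factor}}$$

where $\bar{x}$ is the mean of replicates and $\sigma$ is the standard deviation for the gene.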
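The procedure described above can be sketched in plain Python. This is a simplified illustration and not the SAM software: the function names are hypothetical, the adjustment factor `s0` is passed in by hand rather than estimated from the data, the pooled standard error is used for σ (an assumption), and the final step of choosing a cutoff for a target FDR is left out.

```python
import random
import statistics

def adjusted_t(treated, untreated, s0):
    """(mean difference) / (pooled standard error + adjustment factor s0)."""
    n1, n2 = len(treated), len(untreated)
    m1, m2 = statistics.fmean(treated), statistics.fmean(untreated)
    ss = sum((x - m1) ** 2 for x in treated) + sum((x - m2) ** 2 for x in untreated)
    pooled_se = ((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / (pooled_se + s0)

def expected_order_statistics(treated_by_gene, untreated_by_gene, s0, n_perm, seed=0):
    """Average the sorted permutation statistics over n_perm label shuffles."""
    rng = random.Random(seed)
    genes = sorted(treated_by_gene)
    n_t = len(treated_by_gene[genes[0]])
    n_total = n_t + len(untreated_by_gene[genes[0]])
    sums = [0.0] * len(genes)
    for _ in range(n_perm):
        order = list(range(n_total))
        rng.shuffle(order)  # one relabeling of the arrays, shared by all genes
        stats = []
        for g in genes:
            pooled = treated_by_gene[g] + untreated_by_gene[g]
            shuffled = [pooled[i] for i in order]
            stats.append(adjusted_t(shuffled[:n_t], shuffled[n_t:], s0))
        for rank, value in enumerate(sorted(stats)):
            sums[rank] += value
    return [total / n_perm for total in sums]

# Hypothetical log-expression values: 3 treated and 3 untreated replicates.
treated = {"g1": [3.0, 3.2, 2.8], "g2": [1.1, 0.9, 1.0]}
untreated = {"g1": [1.0, 1.2, 0.8], "g2": [1.0, 1.1, 0.9]}

observed = {g: adjusted_t(treated[g], untreated[g], s0=0.1) for g in treated}
expected = expected_order_statistics(treated, untreated, s0=0.1, n_perm=50)
```

Genes whose observed statistic lies far from the permutation-derived expectation at the same rank are the candidates for differential expression. The adjustment factor keeps genes with very small variance from producing very large statistics.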


Page 18: Array Platforms

Differentially expressed genes for ligands vs UNTREATED @ 4hr [ SAM ; False Discovery Rate ( ) ]

[Bar chart: number of genes (probes) differentially expressed, up-regulated and down-regulated, for each ligand at 4hr. Ligands with their false discovery rates: 40L (1%), LPS (1%), AIG (1%), IL4 (1%), CPG (1%), IFB (1.5%), GRH (1%), 2MA (18%), LPA (17%), CGS (2.9%), BOM (35%), IGF (8%), S1P (38%), PAF (2.4%), 70L (6%), NPY (10%), DIM (9%), LB4 (23%), M3A (3.5%), FML (11%), TGF (2.5%), TER (35%), IL10 (20%), ELC (26%), PGE (11%), BAFF (11%), BLC (57%), NGF (42%), TNF (33%), SDF (20%), IFG (25%), NEB (25%), SLC (NA).]

Page 19: Array Platforms

Concordance of significantly up (+) or down (-) regulated genes among mitogenic ligands (FDR = 1%)

[Mosaic plot: numbers of "down-regulated" (-) and "up-regulated" (+) matches between pairs of mitogenic ligands]

Example: CD40L had 756 down-regulated and 1082 up-regulated genes. Those which were similarly regulated in AIG: 337 down, 578 up.

Discordance matrix

        40L  AIG  CPG  LPS  IL4  IL10
40L      -   17    0    0    9    0
AIG      4    -    0    1    3    0
CPG      0    6    -    0    2    0
LPS      1   17    0    -   11    1
IL4      3    3    0    0    -    0
IL10     0    0    0    0    0    -
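Concordance and discordance counts of this kind can be computed with simple set logic. The sketch below is hypothetical: the gene identifiers are made up, and because the slide does not define how the rows and columns of the discordance matrix are oriented, the two opposite-direction cases are simply reported separately.

```python
def concordance(calls_a, calls_b):
    """Compare significant calls ('+' or '-') for two ligands, gene by gene.

    calls_a and calls_b map gene -> '+' or '-' and contain only the genes
    called significant for that ligand.
    """
    shared = calls_a.keys() & calls_b.keys()
    counts = {"up_match": 0, "down_match": 0, "a_up_b_down": 0, "a_down_b_up": 0}
    for gene in shared:
        a, b = calls_a[gene], calls_b[gene]
        if a == b == "+":
            counts["up_match"] += 1
        elif a == b == "-":
            counts["down_match"] += 1
        elif a == "+":
            counts["a_up_b_down"] += 1
        else:
            counts["a_down_b_up"] += 1
    return counts

# Hypothetical significant-gene calls for two ligands.
cd40l = {"g1": "+", "g2": "+", "g3": "-", "g4": "-"}
aig = {"g1": "+", "g2": "-", "g3": "-", "g5": "+"}

result = concordance(cd40l, aig)
```

Genes significant for only one of the two ligands (g4 and g5 here) are counted in neither category.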

Page 20: Array Platforms

Beyond Clustering

• How can we obtain biological information from array data at the level of individual genes and correlations in expression between genes?

• Can we use the correlations to build a connection network that reflects correlations in expression? Is there biological significance to this?
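One simple way to picture such a connection network is to link every pair of genes whose expression profiles are strongly correlated. The sketch below is only an illustration of that idea: the threshold of 0.9, the gene names, and the profiles are all hypothetical, and nothing in the slides says this is the method that was used.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_network(profiles, min_abs_r=0.9):
    """Return edges (gene1, gene2, r) for gene pairs with |r| >= min_abs_r."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= min_abs_r:
            edges.append((g1, g2, r))
    return edges

# Hypothetical profiles across four conditions.
profiles = {
    "gA": [0.0, 1.0, 2.0, 3.0],
    "gB": [0.0, 2.0, 4.0, 6.0],
    "gC": [3.0, 1.0, 2.0, 0.0],
}
edges = correlation_network(profiles)
```

An edge in such a graph shows only that two genes vary together across conditions. Whether it reflects a direct regulatory link is exactly the biological question the slide raises.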

Page 21: Array Platforms

Two-way hierarchical cluster:

mean ratio (vs control) of phosphoprotein levels and ligand

Note: the ligands that elicit an ERK response (chemokines + AIG, CD40L) clustered together.

Page 22: Array Platforms

The transcription factor encoded by fos is stabilized by ERK and continues to affect other IE genes such as jun.

From Nature Cell Biology, August 2002, vol. 4, issue 8

Page 23: Array Platforms

Observations

To a first approximation the resting B-cells behave as if they have evolved to respond with massive transcriptional changes to a very specific subset of ligands. Some common pathways are activated and some gene expression changes are restricted to individual ligands.

Caveats:

a) The experimental design was optimized to allow comparison of many data sets by using a spleen reference. Would direct comparisons between treated and untreated time-matched samples have allowed small transient changes in gene expression to be readily seen? Some comparative experiments done by Cell Lab (UTSWM) are being analyzed.

b) There is more information in this data set. The analysis thus far de-emphasizes time-series data, and supervised analysis methods suggest that changes correlated with some of the apparently less active ligands can be unearthed. See posters by D. Mock (UCSD) and R. Scheurmann (UTSWM).

c) These resting B-cells are beginning to undergo apoptosis and thus the experiments were done over a short time period. The full articulation of the response to the active ligands is not observed. Experiments done with Bcl-2 resting B-cells: B. Seaman and T. Roach (UCSF).

d) Transcription changes the "state of the cell". Thus double ligand experiments will need to account for the order of addition of ligands as well as concentration. Some preliminary experiments done by Cell Lab (UTSWM) are being analyzed.

Page 24: Array Platforms

A clear lesson that we must implement as soon as possible is to decrease the cycle time from experimental design, through data collection, data analysis, and conclusions and models, to experimental redesign. In the past the rate-limiting step has been data analysis.

Page 25: Array Platforms

[Diagram labels: Input Signals; Signal Processing; Translocation; Gene Expression; Cytoskeleton; Transcription; Translation]