Statistical techniques for temporal microarray data analysis

Ritesh Krishna, Department of Computer Science

WPCCS July 1, 2008

Why should you listen to my talk?

• Systems Biology is everybody’s playground in this room: image processing, algorithms, parallel processing, etc.

Importance of Systems Biology in today’s context:
• Agriculture
• Energy sources (biofuels)
• Gene therapy
• Waste clean-up

Use of Computational Techniques

• Massive data generated by molecular biology experiments
• Need to analyse output files produced in various formats, facilitate storage of bulk data, enable quick and precise retrieval, and, most importantly, understand the behaviour and patterns in the data

How are these experiments performed?

• Major revolution in the world of molecular biology
• No longer limited to one gene per experiment
• Possible to monitor expression levels of thousands of genes simultaneously

An example - Arabidopsis thaliana
• Popular in plant biology as a model plant
• One of the smallest plant genomes
• First plant genome to be sequenced

Present Study
• The present study is about understanding the leaf senescence process in Arabidopsis.
• Senescence refers to the biological processes of a living organism approaching an advanced age; in plants it is caused by age and stress.
• It is a programmed event that responds to a wide range of external and internal signals and is tightly regulated by different genes and proteins.

Experimental Design

Issues with data

• Biological variations vs. technical variations
• Technical variations: sample bias, dye bias, slide bias, variations in experimental conditions, scanning and imaging errors, human errors
• Massive dataset with ~31,000 genes
• Goal is to understand the functioning of certain sets of genes (needle in the haystack)

Step one – Clean the raw data using Normalization

• To assess different sources of technical bias
• To remove the correlations between replicates to make them independent of each other
• Fitting a multivariate error model: Normal distribution with mean zero and constant variance for the residuals associated with genes
• Propose statistical tests for evaluating the effects of normalization
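
To make the normalization idea concrete, here is a minimal sketch. It uses per-array median centering, which is a much simpler stand-in for the multivariate error model named above, not the method used in the study. The data values and the function names (median_center, gene_residuals) are hypothetical. It shows how an additive array-level bias appears in the residuals and shrinks after normalization.

```python
from statistics import mean, median

def median_center(arrays):
    """Subtract each array's median log-ratio so every array is centred on zero."""
    centred = []
    for arr in arrays:
        m = median(arr)
        centred.append([x - m for x in arr])
    return centred

def gene_residuals(arrays):
    """Residual of each replicate value from the per-gene mean across arrays."""
    n_genes = len(arrays[0])
    gene_means = [mean(arr[g] for arr in arrays) for g in range(n_genes)]
    return [[arr[g] - gene_means[g] for g in range(n_genes)] for arr in arrays]

# Hypothetical log2 ratios: 3 replicate arrays x 5 genes.
# The second array carries an additive bias of about +1.0
# (standing in for a dye or slide effect).
raw = [
    [0.0, 1.0, -1.0, 2.0, 0.5],
    [1.0, 2.0, 0.0, 3.0, 1.5],
    [0.1, 0.9, -1.1, 2.1, 0.4],
]

normalized = median_center(raw)

# Mean residual per array: far from zero when an array is systematically shifted.
bias_before = [mean(r) for r in gene_residuals(raw)]
bias_after = [mean(r) for r in gene_residuals(normalized)]

print("mean residual per array before:", [round(b, 3) for b in bias_before])
print("mean residual per array after: ", [round(b, 3) for b in bias_after])
```

A formal test of whether the remaining residuals look like zero-mean, constant-variance noise would be the next step; this sketch only compares the mean residuals before and after.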

Step two – Clustering
• Reduce the data dimension
• Similar genes sit in the same cluster.
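
As an illustration of how similar genes end up in the same cluster, the sketch below z-scores each gene's time course and runs a plain k-means. The slides do not say which clustering algorithm was used, so k-means is an assumption, and the profiles and starting centroids are hypothetical toy values.

```python
from statistics import mean, pstdev

def zscore(profile):
    """Scale one gene's time course to mean 0 and unit variance."""
    mu, sd = mean(profile), pstdev(profile)
    return [(x - mu) / sd if sd > 0 else 0.0 for x in profile]

def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, centroids, n_iter=20):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = [list(c) for c in centroids]  # do not modify the caller's lists
    labels = []
    for _ in range(n_iter):
        # Assign each profile to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in profiles
        ]
        # Recompute each centroid as the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:
                centroids[k] = [mean(col) for col in zip(*members)]
    return labels

# Hypothetical expression values: 4 genes x 5 time points.
raw_profiles = [
    [1, 2, 3, 4, 5],       # rising, low amplitude
    [10, 20, 30, 40, 50],  # rising, high amplitude
    [5, 4, 3, 2, 1],       # falling
    [9, 7, 5, 3, 1],       # falling
]

profiles = [zscore(p) for p in raw_profiles]
# Start from one rising and one falling profile (an arbitrary choice for the demo).
labels = kmeans(profiles, centroids=[profiles[0], profiles[2]])
print(labels)
```

Z-scoring first means genes are grouped by the shape of their time course rather than by absolute expression level, so the two rising genes share a cluster despite a tenfold difference in scale.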

Step three – Causal Network inference

[Figure: plot against time (day) with legend entries CCA1 + LHY and ELF4 + TOC1; network diagram with nodes CCA1, LFY, ERS2, ETR2, ETR1, EIL1, EIL2, EIL3, EIL4, EIL5 and PDF1.2]
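
The slides do not state which network inference method was used. Purely as an illustration of how a directed edge can be read out of time-series data, the sketch below applies a crude lag-1 correlation screen: an edge src -> dst is proposed when src at time t correlates strongly with dst at time t + 1. The gene names, values, and threshold are hypothetical, and such a screen suggests candidate edges only; it does not establish causality.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def lagged_edges(series, threshold=0.9, lag=1):
    """Propose src -> dst when src at time t correlates with dst at time t + lag."""
    edges = []
    for src, x in series.items():
        for dst, y in series.items():
            if src == dst:
                continue
            r = pearson(x[:-lag], y[lag:])
            if abs(r) >= threshold:
                edges.append((src, dst, round(r, 3)))
    return edges

# Hypothetical time courses over 8 time points.
# geneB repeats geneA's values one time step later; geneC is flat.
series = {
    "geneA": [1, 4, 2, 6, 3, 3, 5, 0],
    "geneB": [2, 1, 4, 2, 6, 3, 3, 5],
    "geneC": [2, 2, 2, 2, 2, 2, 2, 2],
}

edges = lagged_edges(series)
for src, dst, r in edges:
    print(f"{src} -> {dst} (lag-1 correlation {r})")
```

On this toy input the screen proposes only geneA -> geneB, because the reverse direction and the flat gene fall below the threshold.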

More information
• Affymetrix Inc. (http://www.affymetrix.com/index.affx)
• Agilent Technologies (http://www.chem.agilent.com)
• Gibson G (2003) Microarray Analysis. PLoS Biol 1(1): e15
