Gene Expression And Regulation Bioinformatics January 11, 2006 D. A. McClellan ([email protected])



Gene Expression

• Expressed in the transcriptome

• Every eukaryotic genome contains between 5,000 and 60,000 protein-coding genes

• Only a small subset of those genes is transcribed in any given sample

Gene expression is regulated in several basic ways

• by region (e.g. brain versus kidney)

• in development (e.g. fetal versus adult tissue)

• in dynamic response to environmental signals (e.g. immediate-early response genes)

• in disease states

• by gene activity

Page 157

[Figure: DNA → RNA → protein → phenotype, with cDNA derived from RNA]

Page 159

Central Dogma of Molecular Biology

[Figure: two DNA → RNA → protein pathways, each with cDNA derived from RNA; labels: UniGene, SAGE, microarray]

Fig. 6.2, Page 159

Expression Databases & Analyses

• UniGene: for the comparison of cDNA libraries

– Goals: (1) create one unique entry for each gene, (2) collect all the ESTs associated with each gene

• SAGE: Serial Analysis of Gene Expression library

• DNA microarrays

Fig. 6.3, Page 161

[Figure: genomic DNA (5’ to 3’) with exon 1, intron, exon 2, intron, exon 3 undergoes transcription, RNA splicing (removal of introns), polyadenylation (AAAAA tail at the 3’ end), and export to the cytoplasm]

Relationship of mRNA to genomic DNA for RBP4

Fig. 6.4, Page 162
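The processing steps named in the figure (transcription, splicing, polyadenylation) can be sketched in a few lines of Python. This is only an illustration: the sequence, the exon coordinates, and the function names are made up for the example and are not the real RBP4 gene structure.

```python
def transcribe(coding_strand):
    """Return the RNA copy of a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def splice(pre_mrna, exons):
    """Join exon slices and drop everything in between (the introns).

    `exons` is a list of (start, end) pairs, 0-based and end-exclusive.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


def polyadenylate(mrna, tail_length=5):
    """Append a poly(A) tail to the 3' end (tail length is arbitrary here)."""
    return mrna + "A" * tail_length


# Hypothetical toy gene: three exons separated by two introns.
exon1, intron1 = "ATGGCC", "GTAAGTCAG"
exon2, intron2 = "TTTGGA", "GTGAGTTAG"
exon3 = "CCCTAA"
gene = exon1 + intron1 + exon2 + intron2 + exon3
exon_coords = [(0, 6), (15, 21), (30, 36)]

pre_mrna = transcribe(gene)
mature = polyadenylate(splice(pre_mrna, exon_coords))
print(mature)
```

Comparing `mature` with `gene` shows why a cDNA (made from mRNA) is shorter than, and not collinear with, the genomic sequence it derives from.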

Analysis of gene expression in cDNA libraries

A fundamental approach to studying gene expression is through cDNA libraries.

• Isolate RNA (always from a specific organism, region, and time point)

• Convert RNA to complementary DNA

• Subclone into a vector

• Sequence the cDNA inserts. The resulting sequences are expressed sequence tags (ESTs)

Pages 162-163

[Figure: cloning vector carrying a cDNA insert]

UniGene: unique genes via ESTs

• Find UniGene at NCBI: www.ncbi.nlm.nih.gov/UniGene

• UniGene clusters contain many ESTs

• UniGene data come from many cDNA libraries. Thus, when you look up a gene in UniGene you get information on its abundance and its regional distribution.

Page 164

Cluster sizes in UniGene

This is a gene with 1 EST associated; the cluster size is 1

Page 164 & Fig. 2.3, Page 23

Cluster sizes in UniGene

This is a gene with 10 ESTs associated; the cluster size is 10

Page 164

Cluster sizes in UniGene (human)

Cluster size      Number of clusters
1                 10,400
2                 7,100
3-4               6,800
5-8               5,300
9-16              3,800
17-32             3,100
...
500-1000          1,500
2000-4000         130
8000-16,000       12
16,000-30,000     3

UniGene build 186, 9/05

Page 164
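A cluster-size table of this kind can be produced by counting the ESTs assigned to each cluster and then grouping the counts into doubling bins. The sketch below assumes an EST-to-cluster assignment already exists; the identifiers and function names are hypothetical, and the bins use exact powers of two (1, 2, 3-4, 5-8, ...) rather than the rounded ranges shown for the largest clusters.

```python
from collections import Counter


def cluster_sizes(est_to_cluster):
    """Count how many ESTs were assigned to each cluster."""
    return Counter(est_to_cluster.values())


def size_bin(size):
    """Return the doubling bin label for a cluster size: 1, 2, 3-4, 5-8, ..."""
    if size < 1:
        raise ValueError("cluster size must be at least 1")
    if size <= 2:
        return str(size)
    upper = 1 << (size - 1).bit_length()  # smallest power of two >= size
    lower = upper // 2 + 1
    return f"{lower}-{upper}"


def size_histogram(est_to_cluster):
    """Number of clusters falling into each size bin."""
    return Counter(size_bin(n) for n in cluster_sizes(est_to_cluster).values())


# Hypothetical assignments: six ESTs in three clusters.
example = {
    "est1": "cluster_A",
    "est2": "cluster_B", "est3": "cluster_B",
    "est4": "cluster_C", "est5": "cluster_C", "est6": "cluster_C",
}
print(size_histogram(example))
```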

Ten largest human UniGene clusters

Cluster size      Gene
22,925            eukary. translation EF (Hs. 522463)
22,320            eukary. translation EF (Hs. 4395522)
16,562            actin, gamma 1 (Hs.514581)
16,309            GAPDH (Hs.169476)
16,231            actin, beta (Hs.520640)
11,076            ribosomal prot. L3 (Hs.119598)
10,517            dehydrin (Hs.524390)
10,087            enolase 1 (alpha) (Hs.517145)
9,973             ferritin (Hs.433670)
8,966             metastasis associated (Hs.187199)

UniGene build 186, 9/05

Table 6.2, Page 165

UniGene brain libraries

UniGene lung libraries

Fig. 6.7, Page 167

Brain versus lung

• n-sec1 up-regulated in brain

• CamKII up-regulated in brain

• surfactant up-regulated in lung

Page 167

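Fisher’s exact test provides a p value

Digital differential display (DDD) results in UniGene are assessed for significance using Fisher’s exact test to generate a p value.

\[
p = \frac{N_A!\; N_B!\; c!\; C!}{(N_A + N_B)!\; g_{1A}!\; g_{1B}!\; (N_A - g_{1A})!\; (N_B - g_{1B})!}
\]

Here g_{1A} and g_{1B} are the numbers of sequences assigned to gene 1 in libraries A and B, N_A and N_B are the total numbers of sequences in each library, c = g_{1A} + g_{1B}, and C = (N_A - g_{1A}) + (N_B - g_{1B}).

The null hypothesis (that gene 1 is not differentially regulated in a comparison of two libraries) is rejected when p < 0.05/G (where G = the number of UniGene clusters analyzed).

Page 165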
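As a worked illustration, the factorial expression for p can be evaluated with binomial coefficients, since NA! NB! c! C! / ((NA + NB)! g1A! g1B! (NA – g1A)! (NB – g1B)!) equals C(NA, g1A) · C(NB, g1B) / C(NA + NB, c) with c = g1A + g1B. The sketch below is not the UniGene implementation; the function names are made up, and the two-sided sum over all tables at most as likely as the observed one is one common convention for turning the table probability into a p value, assumed here rather than taken from the slides.

```python
from math import comb


def table_probability(g1a, g1b, n_a, n_b):
    """Probability of the observed 2x2 table under the null hypothesis.

    g1a, g1b: sequences for gene 1 in libraries A and B.
    n_a, n_b: total sequences in libraries A and B.
    """
    c = g1a + g1b
    return comb(n_a, g1a) * comb(n_b, g1b) / comb(n_a + n_b, c)


def fisher_two_sided(g1a, g1b, n_a, n_b):
    """Sum the probabilities of all tables no more likely than the observed one."""
    c = g1a + g1b
    observed = table_probability(g1a, g1b, n_a, n_b)
    lowest = max(0, c - n_b)
    highest = min(c, n_a)
    total = 0.0
    for k in range(lowest, highest + 1):
        p = table_probability(k, c - k, n_a, n_b)
        if p <= observed * (1 + 1e-9):  # small tolerance for float rounding
            total += p
    return min(total, 1.0)


def is_significant(p, n_clusters, alpha=0.05):
    """Apply the p < 0.05/G cutoff, where G is the number of clusters analyzed."""
    return p < alpha / n_clusters


# Hypothetical counts: 30 of 10,000 brain sequences versus 2 of 12,000 lung sequences.
p = fisher_two_sided(30, 2, 10_000, 12_000)
print(p, is_significant(p, n_clusters=1000))
```

Dividing 0.05 by G is a Bonferroni-style correction: with thousands of clusters tested, an uncorrected 0.05 cutoff would flag many genes by chance alone.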

Pitfalls in interpreting cDNA library data

• bias in library construction

• variable depth of sequencing

• library normalization

• error rate in sequencing

• contamination (chimeric sequences)

Pages 166-168

Fig. 6.8, Pages 168-169

http://mgc.nci.nih.gov

Serial analysis of gene expression (SAGE)

• 9 to 11 base “tags” correspond to genes

• measure of gene expression in different biological samples

• SAGE tags can be compared electronically

Page 169

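[Figure: a single tag (Tag 1) can map to several clusters (Cluster 1, Cluster 2, Cluster 3), and several tags (Tag 1, Tag 2, ... Tag n) can map to a single cluster (Cluster 1)]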

SAGE tags are mapped to UniGene clusters

Page 169
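Comparing SAGE tags "electronically" amounts to counting each tag per sample and looking the tags up in a tag-to-cluster table. The sketch below shows that bookkeeping under stated assumptions: the 10-base tags, the cluster names, and the mapping are invented for illustration, and crediting an ambiguous tag to every cluster it matches is a simplification, not the published SAGE procedure.

```python
from collections import Counter, defaultdict


def count_tags(tags):
    """Count how often each tag was observed in one sample."""
    return Counter(tags)


def cluster_counts(tag_counts, tag_to_clusters):
    """Credit each tag's count to every cluster it maps to (unmapped tags are skipped)."""
    totals = defaultdict(int)
    for tag, n in tag_counts.items():
        for cluster in tag_to_clusters.get(tag, ()):
            totals[cluster] += n
    return dict(totals)


def ambiguous_tags(tag_to_clusters):
    """Tags that map to more than one cluster and so cannot be assigned uniquely."""
    return {tag for tag, clusters in tag_to_clusters.items() if len(clusters) > 1}


# Hypothetical mapping: two tags share one cluster, one tag matches two clusters.
tag_to_clusters = {
    "AAACCCGGGT": ["cluster_1"],
    "TTTGGGCCCA": ["cluster_1"],
    "ACGTACGTAC": ["cluster_2", "cluster_3"],
}
brain_tags = ["AAACCCGGGT"] * 3 + ["TTTGGGCCCA"] * 2 + ["ACGTACGTAC"]

brain_counts = cluster_counts(count_tags(brain_tags), tag_to_clusters)
print(brain_counts)
print(ambiguous_tags(tag_to_clusters))
```

Running the same functions on a second sample gives two per-cluster count tables that can be compared directly.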