Gene expression Guy Nimrod

Post on 20-Dec-2015


Gene expression

Guy Nimrod

Microarrays

• Microarray technology aims to measure the gene expression profile of a cell.

• This is done by measuring the mRNA levels of different genes in the cell.

• The method can be applied to thousands of genes and complete genomes simultaneously

DNA chips

• DNA chips are arrays of different DNA fragments attached at specific locations on glass slides at very high density.

• Fragments at each specific location are usually designed as complementary to part of the mRNA (or its cDNA) of a certain gene.

• The use of the DNA chips is based on hybridization between the fragments attached to the glass and the mRNA (or its cDNA) from the query organism cells.

The method

[Figure: reverse transcription followed by hybridization (actual strand ~25 bases); panels A and B]

Disadvantages:

• mRNA levels do not necessarily reflect the levels of the proteins.
– Different proteins have different half-lives.
– Regulation at the protein level.

• Potential noise, e.g.:
– Imperfect hybridization and paralogs.
– Alternative splicing.

• Measurements are relative to a control specimen.

Applications

• Analysis and characterization of:
– The cell's response to different conditions.
– Cell cycle-regulated genes.
– Different expression profiles in different tissues of the organism.
– Sources and their implications in diseases.
– Mode of action of drugs.

Data analysis

• A basis for organizing gene expression data is to group together genes with similar patterns of expression.

• Define similarity, e.g.:
– Euclidean distance
– Correlation coefficient
(The data are usually log-transformed.)

• Cluster the data. This can be done by supervised or unsupervised clustering.
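The two similarity measures named above can be sketched in a few lines of Python. This is only an illustration: the function names and the example ratios are hypothetical, and base-2 logarithms are an assumption (the slides say only that the data are log-transformed).

```python
import math

def log2_profile(ratios):
    """Log-transform a list of expression ratios (sample / control)."""
    return [math.log2(r) for r in ratios]

def euclidean_distance(x, y):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_correlation(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical ratios for two genes across four experiments.
gene_a = log2_profile([1.0, 2.0, 4.0, 8.0])
gene_b = log2_profile([2.0, 4.0, 8.0, 16.0])

distance = euclidean_distance(gene_a, gene_b)
correlation = pearson_correlation(gene_a, gene_b)
```

The two genes here have the same shape but different magnitudes, so the correlation treats them as identical while the Euclidean distance does not. Which behavior is preferable depends on the question being asked.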

[Figure: expression matrix; axes: Experiments, Genes]

Hierarchical clustering

• Compute the distances between each pair of genes. Each gene is considered a node with a weight of one unit.

• Find the most similar pair of nodes and join them into one node whose expression profile is the average of the two. The weight of the new node is the sum of the weights of its components.

• Compute the distances of the new node from all the other nodes in the list, and discard the two nodes that compose it. Repeat until a single node (the root of the tree) remains.

• There are 2^(n-1) linear orderings consistent with the structure of the tree. The ordering is usually chosen according to some weight function (e.g., time of maximal induction).
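The joining procedure above could be sketched as follows. This is a minimal illustration, not the implementation used in the studies: it assumes Euclidean distance, takes the merged profile as the weight-weighted average of the two joined nodes, and recomputes all pairwise distances in every round for simplicity. The gene names and values are hypothetical.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def hierarchical_cluster(profiles):
    """Build a tree (nested tuples) from a dict of name -> expression profile."""
    # Each node is (tree, profile, weight); every gene starts with weight 1.
    nodes = [(name, list(p), 1) for name, p in profiles.items()]
    while len(nodes) > 1:
        # Find the most similar (closest) pair of nodes.
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                d = euclidean(nodes[i][1], nodes[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        tree_i, prof_i, w_i = nodes[i]
        tree_j, prof_j, w_j = nodes[j]
        # The new node's profile is the weighted average of its components
        # (an assumption), and its weight is the sum of their weights.
        w = w_i + w_j
        merged = [(w_i * a + w_j * b) / w for a, b in zip(prof_i, prof_j)]
        # Discard the two joined nodes and add the new one.
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)]
        nodes.append(((tree_i, tree_j), merged, w))
    return nodes[0][0]

# Hypothetical log-ratio profiles for four genes in two experiments.
example = {"g1": [0, 0], "g2": [0, 1], "g3": [5, 5], "g4": [5, 6]}
tree = hierarchical_cluster(example)
```

Each joined node has a left and a right child that can be swapped without changing the tree, which is why many linear orderings of the leaves are consistent with one tree.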

[Figure: clustered expression matrix; axes: Experiments, Genes]

K-means

Objective: divide the objects into K clusters such that some metric (e.g., variance) relative to the centroids of the clusters is minimized.

Example of a simple version of K-means (assume K=3):

1. Place K points into the space. These points represent the initial group centroids.

2. Assign each object to the group that has the closest centroid.

3. Recalculate the positions of the K centroids.

4. Repeat steps 2 and 3 until the centroids no longer move (or move less than a certain cutoff).

• K must be chosen in advance.

• A global minimum is not guaranteed (and, because the assignments are discrete, not necessarily even a local minimum). The result depends on the starting point.
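Steps 1 to 4 can be sketched in plain Python as below. The sketch assumes Euclidean distance and takes the initial centroids as an argument, since the slides do not say how they are placed. The points, the cutoff `tol`, and the handling of empty clusters are illustrative choices.

```python
import math

def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def kmeans(points, initial_centroids, max_iter=100, tol=1e-9):
    """Simple K-means; K is the number of initial centroids supplied."""
    centroids = [list(c) for c in initial_centroids]  # step 1
    clusters = []
    for _ in range(max_iter):
        # Step 2: assign each object to the group with the closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: dist(p, centroids[k]))
            clusters[nearest].append(p)
        # Step 3: recalculate the position of each centroid as the mean
        # of its members (an empty cluster keeps its old centroid).
        new_centroids = []
        for k, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centroids.append(
                    [sum(m[d] for m in members) / len(members)
                     for d in range(dim)])
            else:
                new_centroids.append(centroids[k])
        # Step 4: stop once no centroid moves by more than the cutoff.
        shift = max(dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids, clusters

# Hypothetical two-dimensional data with three obvious groups (K=3).
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
centroids, clusters = kmeans(points, [(0, 0), (5, 5), (10, 0)])
```

Running the function again with different initial centroids may give a different partition, which is the dependence on the starting point noted above.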

Example: K-means

2. amino-acid biosynthesis.
7. genes induced as part of the environmental stress response.
14. mitochondrial protein synthesis.
39. genes involved in nitrogen utilization.
45. oxidative phosphorylation and respiration components.
53. specific amino-acid transporters.
67. glycolysis genes.
72. secretion, protein synthesis, and membrane synthesis genes.
73. genes repressed as part of the environmental stress response.
80. amino-acid biosynthesis genes.
86. histone genes.

[Figure: 91 centroids, 91 clusters]

(Gasch et al., 2002)

Response of yeast cells to environmental changes *

• Cells require specific internal conditions for optimal growth.

• Unicellular organisms such as yeast (S. cerevisiae) have evolved mechanisms for adapting to drastic environmental changes.

• The following research explores the genomic expression pattern in the complete genome of the yeast, in response to diverse environmental transitions.

*Gasch et al., 2000

Methods

• Yeast:
– Unicellular organism that requires rapid recovery and adjustment to new surroundings.
– Available 'whole genome' microarrays, each containing ~6200 known or predicted genes.
– One of the most researched organisms, with many annotated genes.

Methods

• The expression pattern of the genes was examined in response to a variety of extreme environments, e.g.:
– Heat shock
– Amino-acid starvation
– Nitrogen depletion
– Hyper-osmotic shock
– Progression into stationary phase

• Expression was measured relative to an unstressed culture or to the beginning of the experiment.
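As an illustration of a relative measurement, the sketch below expresses each time point of a time course as a log ratio against the first time point (the beginning of the experiment). The function name, the intensity values, and the choice of base 2 are assumptions made for the example, not details taken from the study.

```python
import math

def relative_to_start(time_course):
    """Log2 ratio of every measurement to the first (reference) measurement."""
    reference = time_course[0]
    return [math.log2(x / reference) for x in time_course]

# Hypothetical signal for one gene at four time points after a stress.
induced_gene = relative_to_start([100.0, 400.0, 200.0, 100.0])
repressed_gene = relative_to_start([100.0, 25.0, 50.0, 100.0])
```

On this scale a four-fold induction (+2) and a four-fold repression (-2) are symmetric around zero, and zero means no change from the reference.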

Results: Hierarchical clustering

• Two major clusters (F & P) showed reciprocal but nearly identical profiles.

• These ~900 genes (~15% of the genome) responded to almost all of the examined stress conditions; together they form the environmental stress response (ESR).

• Some other clusters are of genes that respond to specific extreme conditions.

The Environmental Stress Response (ESR)

• ~600 repressed genes
– Growth-related processes
– Nucleotide biosynthesis
– Ribosomal genes

• These genes seem to be coregulated, and promoter analysis revealed two novel, conserved motifs upstream of the genes.

The Environmental Stress Response (ESR)

• ~300 induced genes (60% uncharacterized)
– Carbohydrate metabolism
– Detoxification of reactive oxygen
– DNA damage repair
– Metabolite transport
– Intracellular signaling

• Many of these genes have previously been proposed to function in cellular protection against stress.

Regulation of the genes induced in the ESR

• A set of ~50 genes induced by a variety of stress conditions through a stress response element (STRE) was previously known. The STRE is recognized by the transcription factors Msn2p and Msn4p.

• Half of those genes are induced in the identified ESR.

• Sub-clusters within the induced ESR genes suggest differences in the regulation of those genes.

Genes dependent on Msn2/Msn4p

• A- Partially dependent on Msn2/Msn4p in response to both stresses.

• B- Largely dependent on Msn2/Msn4p in response to both stresses.

• C- Dependent on Msn2/Msn4p in response to heat shock.

• A substantial fraction of the ESR genes was unaffected by overexpression or deletion of Msn2/Msn4p.

[Figure panels: H2O2, Heat]

Course of the reaction

• ESR genes responded immediately with large changes.

• However, over time a new steady state of transcript levels is reached that differs only slightly from the initial steady state.
– Maintaining new levels?
– Partial recovery from the stress?

• Duration and amplitude of the transient changes varied with the magnitude of the environmental change.

Isozymes

• Isozymes are enzymes having similar structure that catalyze the same reaction.

• Analysis showed differential expression of some isozymes.
– Different properties of the isozymes (localization, affinity, substrate specificity, etc.)
– Divergence of regulation

[Figure labels: (74% id), (78% id)]

Reciprocal metabolic roles

• Among the genes induced in the ESR were many whose products play reciprocal metabolic roles.
– E.g., enzymes that synthesize glycogen and their precursors, as well as catabolic enzymes for degrading glycogen.

• The activity of many of these enzymes is controlled at the post-translational level.

• Induction of enzymes acting in both directions enhances the cell's ability to rapidly manage osmotic instability and energy reserves.

What triggers the ESR?

• Hypothesis: the ESR is initiated in response to any extreme change in the cell's environment.

• 25°C to 37°C: massive and transient changes in ESR expression.

• 37°C to 25°C: reciprocal response; a simple transition to the gene expression program characteristic of steady-state growth at 25°C.

• The ESR seems to respond to conditions that enhance the environmental stress.

Identification of cell cycle-regulated genes in Yeast *

• Cell cycle- The sequence of events from one division of a cell to the next.

• Cyclins- proteins that control the cell cycle.

• CDKs- cyclin-dependent protein kinases.

• G1 - growth and preparation of the chromosomes for replication.

• S - synthesis of DNA.

• G2 - preparation for mitosis.

• M - mitosis.

• G0 - cell leaves the cell cycle, temporarily or permanently.

*Spellman et al., 1998

Methods:

• In an untreated culture of cells, the cells are in various stages of the cell cycle.

• The experiments tracked cell cultures synchronized by three different methods (another experiment set was taken from Cho et al., 1998).
– Applying different independent methods was essential to diminish artifacts characteristic of any one method.
– Cultures were considered synchronized for the 2-3 cycles following synchronization.

• An unsynchronized culture was used as a control.

Extracting cell cycle-regulated genes

• Two factors were used to score the periodicity of each gene:
– A measure of the periodicity of the gene compared to the period of the yeast cell cycle (~80 min).
– A measure of the correlation between the gene and each of five different profiles, each representing a gene known to be expressed at a certain stage.

• The 800 genes with the highest combined score were chosen:
– Maximizes the number of known cell cycle-regulated genes in the list (95/104 known at that time).
– Minimizes false positives (~3% false positives measured in random data).
– The cutoff is somewhat arbitrary.
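One plausible way to compute two such factors is sketched below. It is not the authors' scoring formula: the Fourier-style magnitude at an 80-minute period, the use of Pearson correlation, the sampling times, and the two reference profiles are all assumptions for illustration. How the two factors are combined into one score is not stated here, so the sketch returns them separately.

```python
import math

def fourier_score(times, values, period=80.0):
    """Magnitude of the signal component at the given period (in minutes)."""
    omega = 2 * math.pi / period
    a = sum(v * math.sin(omega * t) for t, v in zip(times, values))
    b = sum(v * math.cos(omega * t) for t, v in zip(times, values))
    return math.sqrt(a * a + b * b)

def correlation(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def best_profile_match(values, reference_profiles):
    """Return (name, correlation) of the best-correlated reference profile."""
    scored = [(name, correlation(values, ref))
              for name, ref in reference_profiles.items()]
    return max(scored, key=lambda item: item[1])

# Hypothetical sampling every 10 minutes over two 80-minute cycles.
times = [10.0 * k for k in range(16)]
periodic_gene = [math.sin(2 * math.pi * t / 80.0) for t in times]
flat_gene = [1.0] * len(times)

# Hypothetical reference profiles peaking at opposite points of the cycle.
references = {
    "early": [math.sin(2 * math.pi * t / 80.0) for t in times],
    "late": [-math.sin(2 * math.pi * t / 80.0) for t in times],
}

periodic_score = fourier_score(times, periodic_gene)
flat_score = fourier_score(times, flat_gene)
match_name, match_r = best_profile_match(periodic_gene, references)
```

A gene that oscillates with the cell-cycle period gets a high Fourier score, while a constant gene scores close to zero. The best-matching reference profile indicates the stage at which the gene is likely to peak.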

Example of periodic and non-periodic genes

Results:

• The periodic genes were ordered by the time at which they reach peak expression.
– G1: 300 genes
– S: 71 genes
– G2: 121 genes
– M: 195 genes
– M/G1: 113 genes

• Most genes are involved in cell cycle control, DNA replication, DNA repair, budding, nuclear division, glycosylation, mitosis, etc.

• Many genes needed for replication and repair reach peak expression just before they are needed.
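Ordering genes by the time of peak expression could be sketched as follows. The gene names, sampling times, and values are hypothetical, and taking the single highest measurement as the peak is a simplifying assumption.

```python
def order_by_peak_time(times, profiles):
    """Sort gene names by the time at which each profile is maximal."""
    def peak_time(values):
        peak_index = max(range(len(values)), key=lambda i: values[i])
        return times[peak_index]
    return sorted(profiles, key=lambda name: peak_time(profiles[name]))

# Hypothetical expression profiles sampled at four time points (minutes).
times = [0, 20, 40, 60]
profiles = {
    "geneA": [0.1, 0.2, 1.5, 0.3],
    "geneB": [1.2, 0.4, 0.1, 0.0],
    "geneC": [0.0, 0.9, 0.2, 0.1],
}
ordered = order_by_peak_time(times, profiles)
```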

Hierarchical clustering

• Many known groups of genes were clustered together:

• The histone cluster formed the tightest cluster, with a very high peak at S phase.

• The histones have three known modes of regulation:
– Repressing elements
– Activating transcription
– Destabilization of the mRNA

• Cdc28 seems to cause some artifacts in the expression pattern here.

Gene expression

• Microarray technology, along with bioinformatics methods and the sequencing of complete genomes, provides a revolutionary new view of processes in the cell.

• The studies presented here demonstrate the ability to:
– Discover genes with a certain pattern of regulation.
– Suggest functions for unannotated genes.
– Refine the characterization of regulatory elements.
– Propose new regulatory elements.
– Better understand pathways in the cell.
– And many others…