Biology 210

GENETICS

20 March, 1998

Chapter 10a

Gene Expression:

Transcription

Brief Outline

1. The flow of Genetic Information

2. Synthesizing Proteins from the Instructions of DNA

3. The Genetic Code

4. RNA: Intermediary in Protein Synthesis

1. The flow of Genetic Information:

DNA -> RNA -> protein

How does the sequence of a strand of DNA correspond to the amino acid sequence of

a protein? This concept is explained by

The Central Dogma of

Molecular Biology:

The Relationship between Genes and Proteins

Most genes encode the information for the synthesis of a protein

The sequence of bases in DNA codes for the sequence of amino acids in proteins

Shown below is an illustration of the flow of information from DNA to RNA to protein,

which forms the backbone of molecular biology.

LEGEND

DNA codes for the production of RNA.

RNA codes for the production of protein.

Protein does not code for the production of protein, RNA or DNA.

The end.

Or in the words of Francis Crick:

Once information has passed into protein, it cannot get out again.

This was taken from Genetech's homepage:

However, the "Central Dogma" has had to be revised a bit. It turns out that you

CAN go back from RNA to DNA, and that RNA can also make copies of itself. It is

still not possible to go from Proteins back to RNA or DNA, and no known mechanism

has yet been demonstrated for proteins making copies of themselves.

Try it for yourself on the "DNA Workshop" (from PBS).

Click HERE for a link to a nice historical review of The Central Dogma.

2. Synthesizing Proteins from the Instructions of DNA

Genetic information flows in a cell from:

DNA -> RNA -> Protein

In a prokaryotic cell, transcription and translation happen at the same time:

However, in a eukaryotic cell, transcription and translation occur in different places (transcription in the nucleus, translation in the cytoplasm):

3. The Genetic Code

The Genetic Code uses three bases to specify each amino acid

4. RNA: Intermediary in Protein Synthesis

Why would the cell want to have an intermediate

between DNA and the proteins it encodes?

The DNA can then stay pristine and protected, away from the caustic chemistry

of the cytoplasm.

Gene information can be amplified by having many copies of an RNA made from

one copy of DNA.

Regulation of gene expression can be effected by having specific controls at

each element of the pathway between DNA and proteins. The more elements

there are in the pathway, the more opportunities there are to control it in different

circumstances.

What is RNA?

RNA has the same primary structure as DNA. It consists of a sugar-phosphate

backbone, with bases attached to the 1' carbon of the sugar. The differences

between DNA and RNA are that:

1. RNA has a hydroxyl group on the 2' carbon of the sugar (thus, the

difference between deoxyribonucleic acid and ribonucleic acid).

2. Instead of using the base thymine, RNA uses another base

called uracil:

Because of the extra hydroxyl group on the sugar, RNA cannot adopt the B-form double

helix typical of DNA. RNA usually exists as a single-stranded molecule. However,

regions of double helix can form where there is some base-pair complementarity

(U and A, G and C), resulting in hairpin loops. The RNA molecule with its

hairpin loops is said to have a secondary structure.

In addition, because the RNA molecule is not restricted to a rigid

double helix, it can form many different tertiary structures. Each RNA

molecule, depending on the sequence of its bases, can fold into a stable

three-dimensional structure.

From http://motif.stanford.edu/thesis/tRNA.html.

Transcription produces RNA molecules that are complementary copies of one strand of DNA

Three types of RNA cooperate in protein synthesis

The Genetic Code

How does an mRNA specify amino acid sequence? The answer lies in the genetic code. It would be

impossible for each amino acid to be specified by one nucleotide, because there are only 4 nucleotides and

20 amino acids. Similarly, two-nucleotide combinations could specify only 16 (4 x 4) amino acids. Three

nucleotides give 64 (4 x 4 x 4) combinations, which is more than enough, so each amino acid is specified by a particular combination of three nucleotides, called a

codon:

Note the degeneracy of the genetic code. Each amino acid might have up to six codons that specify it. It is

also interesting to note that different organisms have different frequencies of codon usage. A giraffe might use

CGC for arginine much more often than CGA, and the reverse might be true for a sperm whale. Another

interesting point is that some species depart from the codon assignments shown above and use some

codons for different amino acids. In general, however, the code depicted can be relied upon.
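The counting argument and the degeneracy noted above can be checked with a short Python sketch. It assumes the standard genetic code, written here with DNA letters as a compact 64-letter string in T, C, A, G order (`*` marks a stop codon). The variable names are illustrative and are not part of the original notes.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# How many distinct "words" can be spelled with 1, 2 or 3 bases?
words_per_length = {n: len(list(product(BASES, repeat=n))) for n in (1, 2, 3)}

# Map every three-base codon to its amino acid (or "*" for stop)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degeneracy: how many codons specify each amino acid?
codons_per_amino_acid = Counter(CODON_TABLE.values())

print(words_per_length)
print(codons_per_amino_acid)
```

Only three-base words give enough combinations for 20 amino acids, and the surplus is what makes the code degenerate: leucine, serine and arginine each have six codons, while methionine and tryptophan have one.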

How do tRNAs recognize which codon calls for their amino acid? The tRNA has an anticodon on its

mRNA-binding end that is complementary to the codon on the mRNA. Each tRNA binds only the appropriate

amino acid for its anticodon.
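As a small illustration of codon-anticodon complementarity, the Python sketch below derives the anticodon for a given codon. It assumes strict A-U and G-C pairing (real tRNAs tolerate some "wobble" at the third position, which is ignored here), and the function name is hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon):
    """Return the anticodon (written 5'->3') that pairs with an mRNA codon (5'->3')."""
    # The two RNA strands pair antiparallel, so complement each base and reverse the order.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))

print(anticodon_for("AUG"))  # CAU
```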

From http://motif.stanford.edu/thesis/tRNA.html.

Central Dogma, Part 1: Transcription

link to Kimball biology page.

How does the sequence information from DNA get transferred to mRNA so that it can

be carried to the ribosomes in the cytoplasm? This process, called transcription, is

highly analogous to DNA replication. Of course, there are different effectors, or

proteins, that direct transcription. Primary among these is the RNA polymerase

holoenzyme, an agglomeration of many different factors that together direct the

synthesis of mRNA on a DNA template.

As mentioned above, transcription (like ANY polymerisation process) is divided into

three parts:

1. Initiation of Transcription

RNA polymerase must be able to recognize the beginning of a gene so that it knows

where to start synthesizing an mRNA. It is directed to the start site of transcription by

the affinity of one of its subunits for a particular DNA sequence that appears at the beginning

of genes. This sequence is called a promoter. It is a unidirectional sequence on one

strand of the DNA that tells the RNA polymerase both where to start and in which

direction (that is, on which strand) to continue synthesis. The bacterial promoter almost

always contains some version of the following elements:

The two sequences shown in red are known

as the "-35" (TTGACA) and "-10" (TATAAT)

sites, based on their positions from the start

of transcription. These two sequences

represent the CONSENSUS, based on

comparison of several different sequences

aligned at the transcription start site.
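A consensus of this kind can be sketched in Python by taking the most common base at each aligned position. The five example sites below are made up for illustration and are not real promoter data; the function name is hypothetical.

```python
from collections import Counter

def consensus(aligned_sequences):
    """Return the most common base at each column of equal-length aligned sequences."""
    columns = zip(*aligned_sequences)
    return "".join(Counter(column).most_common(1)[0][0] for column in columns)

# Made-up example sites, already aligned; each differs from the consensus at most once
example_sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT", "GATAAT"]
print(consensus(example_sites))  # TATAAT
```

Note that a consensus need not be identical to every individual sequence in the alignment; it only records the majority base at each position.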

Another way of representing this consensus is by the application of information theory

to sequence analysis. One currently used method is "sequence logos" (this is based

on "Shannon information", for those of you who are interested; see Schneider, T.D.,

Stephens, R.M., "Sequence logos: a new way to display consensus sequences", Nucleic Acids

Research, 18:6097-6100 (1990)). The sequence logo, based on the promoter region of

167 different genes (aligned by their transcriptional start site), is shown below:

The sequence logo for the -10 "TATA" box for 60 human promoters, aligned on the

TATA box, is shown below:
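The height of each stack in a sequence logo reflects how much information that position carries. A simplified Python sketch of the idea follows: for DNA, the information at a position is taken as 2 bits minus the Shannon entropy of the bases observed there. This is an assumption-laden simplification that leaves out the small-sample correction used in the published method, and the function names are hypothetical.

```python
import math
from collections import Counter

def column_information(column):
    """Bits of information at one aligned position: 2 minus the Shannon entropy."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy

def logo_heights(aligned_sequences):
    """Information content, in bits, for every column of an alignment."""
    return [column_information(column) for column in zip(*aligned_sequences)]

print(column_information("AAAA"))  # a fully conserved position
print(column_information("ACGT"))  # a position where all four bases are equally common
```

A fully conserved position scores the maximum of 2 bits, while a position where all four bases are equally frequent scores 0 bits and would barely show in the logo.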

2. Elongation of Transcription

The RNA polymerase then stretches open the double helix at that point in the DNA and

begins synthesis of an RNA strand complementary to one of the strands of DNA. We

call the strand from which it copies the antisense or template strand, and the other

strand, whose sequence the RNA matches (with U in place of T), the sense or coding strand.
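The relationship between the two DNA strands and the RNA can be sketched in Python. The short coding sequence is made up for illustration, and the function names are hypothetical; all sequences are written 5' to 3'.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}

def template_strand(coding_5to3):
    """Return the template (antisense) strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(coding_5to3))

def transcribe(template_5to3):
    """Build the RNA 5'->3' by reading the template strand in its 3'->5' direction."""
    return "".join(DNA_TO_RNA_PAIR[base] for base in reversed(template_5to3))

coding = "ATGGCCTAA"                # made-up coding (sense) strand
template = template_strand(coding)  # TTAGGCCAT
rna = transcribe(template)          # AUGGCCUAA

print(template)
print(rna)
```

The RNA is complementary to the template strand, and so it carries the same sequence as the coding strand with U substituted for T.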

The RNA polymerase recruits rNTPs (ribonucleoside triphosphates) in the

same way that DNA polymerase recruits dNTPs.

However, since synthesis is single stranded and only

proceeds in the 5' to 3' direction, there is no need for

Okazaki fragments.

It is important to note that synthesis once again

proceeds in a unidirectional fashion, because of the

reasons outlined in the previous section.

3. Termination of Transcription

How does RNA polymerase know when to stop transcribing a gene? This system has

been elucidated in prokaryotes. It is important to know that since there is no nucleus in

prokaryotes, ribosomes can begin making protein from an mRNA immediately upon its

synthesis. At the end of many bacterial genes, the sequence of the mRNA allows it to form a hairpin

loop, usually followed by a run of U residues. The hairpin makes the RNA polymerase pause, and the

weak pairing between the run of U's and the DNA template lets the transcript come loose. The

RNA polymerase then falls off the DNA and transcription ceases.

Gene Expression: Transcription

The majority of genes are expressed as the proteins they encode. The process occurs in two steps:

Transcription = DNA -> RNA

Translation = RNA -> protein

Taken together, they make up the "central dogma" of biology: DNA -> RNA -> protein. Here is an

overview.

This page examines the first step:

Gene Transcription: DNA -> RNA

DNA serves as the template for the synthesis of RNA much as it does for its own replication.

The Steps

several protein transcription factors bind to promoter sites, usually on the 5' side of the gene to be transcribed

an enzyme, RNA polymerase, binds to the complex of transcription factors

working together, they open the DNA double helix

RNA polymerase proceeds down one strand moving in the 3' -> 5' direction

as it does so, it assembles ribonucleotides (supplied as triphosphates, e.g., ATP) into a strand of RNA

each ribonucleotide is inserted into the growing RNA strand following the rules of base pairing. Thus for each C encountered on the DNA strand, a G is inserted in the RNA; for each G, a C; and for each T, an A. However, each A on the DNA guides the insertion of the pyrimidine uracil (U, from uridine triphosphate, UTP). There is no T in RNA.

synthesis of the RNA proceeds in the 5' -> 3' direction.

as each nucleoside triphosphate is brought in to add to the 3' end of the growing strand, the two terminal phosphates are removed

Note that at any place in a DNA molecule, either strand may be serving as the template; that is, some genes "run" one way, some the other (and in a few remarkable cases, the same segment of double helix contains genetic information on both strands!). In all cases, however, RNA polymerase proceeds along a strand in its 3' -> 5' direction.

Types of RNA

Several types of RNA are synthesized:

messenger RNA (mRNA). This will later be translated into a polypeptide.

ribosomal RNA (rRNA). This will be used in the building of ribosomes: machinery for synthesizing proteins by translating mRNA.

transfer RNA (tRNA). RNA molecules that carry amino acids to the growing polypeptide.

small nuclear RNA (snRNA). DNA transcription of the genes for mRNA, rRNA, and tRNA produces large precursor molecules ("primary transcripts") that must be processed within the nucleus to produce the functional molecules for export to the cytosol. Some of these processing steps are mediated by snRNAs.

Ribosomal RNA (rRNA)

There are 4 kinds. In eukaryotes, these are

18S rRNA. One of these molecules, along with some 30 different protein molecules, is used to make the small subunit of the ribosome.

28S, 5.8S, and 5S rRNA. One each of these molecules, along with some 45 different proteins, are used to make the large subunit of the ribosome.

The name given each type of rRNA reflects the rate at which the molecules sediment in the ultracentrifuge. The larger the number, the larger the molecule (but not proportionally).

The 28S, 18S, and 5.8S molecules are produced by the processing of a single primary transcript from a cluster of identical copies of a single gene. The 5S molecules are produced from a different cluster of identical genes.

Transfer RNA (tRNA)

There are some 32 different kinds of tRNA in a typical eukaryotic cell.

each is the product of a separate gene

they are small (~4S), containing 73-93 nucleotides

many of the bases in the chain pair with each other forming sections of double helix

the unpaired regions form 3 loops

each kind of tRNA carries (at its 3' end) one of the 20 amino acids (thus most amino acids have more than one tRNA responsible for them)

at one loop, 3 unpaired bases form an anticodon

base pairing between the anticodon and the complementary codon on a mRNA molecule brings the correct amino acid into the growing polypeptide chain. Further details of this process are described in the discussion of translation.

Messenger RNA (mRNA)

Messenger RNA comes in a wide range of sizes reflecting the size of the polypeptide it encodes. Most cells produce small amounts of thousands of different mRNA molecules, each to be translated into a peptide needed by the cell. Many mRNAs are common to most cells, encoding "housekeeping" proteins needed by all cells (e.g., the enzymes of glycolysis). Other mRNAs are specific for only certain types of cells. These encode proteins needed for the function of that particular cell (e.g., the mRNA for hemoglobin in the precursors of red blood cells).

Small Nuclear RNA (snRNA)

Approximately a dozen different genes for snRNAs, each present in multiple copies, have been identified. The snRNAs have various roles in the processing of the other classes of RNA. For example, several snRNAs are part of the spliceosome that participates in converting pre-mRNA into mRNA by excising the introns and splicing the exons.

The RNA polymerases

The RNA polymerases are huge multi-subunit protein complexes. Three kinds are found in eukaryotes.

RNA polymerase I (Pol I). It transcribes the rRNA genes for the precursor of the 28S, 18S, and 5.8S molecules (and is the busiest of the RNA polymerases).

RNA polymerase II (Pol II). It transcribes the mRNA and snRNA genes.

RNA polymerase III (Pol III). It transcribes the 5S rRNA genes and all the tRNA genes.

RNA Processing: pre-mRNA -> mRNA

All the primary transcripts produced in the nucleus must undergo processing steps to produce functional RNA

molecules for export to the cytosol. We shall confine ourselves to a view of the steps as they occur in the

processing of pre-mRNA to mRNA.

Synthesis of the cap. This is a stretch of three modified nucleotides attached to the 5' end of the

pre-mRNA.

Synthesis of the poly(A) tail. This is a stretch of adenine nucleotides attached to the 3' end of the

pre-mRNA.

Step-by-step removal of introns present in the pre-mRNA and splicing of the remaining exons. This

step is required because most eukaryotic genes are split.

Split Genes

Most eukaryotic genes are split into segments. In decoding the open reading frame of a gene for a known

protein, one usually encounters periodic stretches of DNA calling for amino acids that do not occur in the

actual protein product of that gene. Such stretches of DNA, which get transcribed into RNA but not translated

into protein, are called introns. Those stretches of DNA that do code for amino acids in the protein are called

exons. Examples:

the gene for one type of collagen found in chickens is split into 52 separate exons

the gene for dystrophin, which is mutated in boys with muscular dystrophy, has 79 exons

even the genes for rRNA and tRNA are split.

The cutting and splicing of mRNA must be done with great precision. If even one nucleotide is left over from

an intron or one is removed from an exon, the reading frame from that point on will be shifted, producing

new codons specifying a totally different sequence of amino acids from that point to the end of the molecule

(which often ends prematurely anyway when the shifted reading frame generates a STOP codon).
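The effect of a single leftover nucleotide on the reading frame can be seen in a short Python sketch. The mRNA sequence is made up for illustration and the function name is hypothetical.

```python
def codons(mrna):
    """Split an mRNA into successive three-base codons, dropping any incomplete tail."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]

correct = "AUGGCCAAGGUUUAA"  # made-up, correctly spliced message
# One stray nucleotide (here a G) left behind just after the first codon
faulty = correct[:3] + "G" + correct[3:]

print(codons(correct))  # ['AUG', 'GCC', 'AAG', 'GUU', 'UAA']
print(codons(faulty))   # ['AUG', 'GGC', 'CAA', 'GGU', 'UUA']
```

Every codon after the extra base changes, and in this example the original UAA stop codon is no longer read in frame.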

The removal of introns and splicing of exons is done with the spliceosome. This is a complex of several

snRNA molecules and several proteins. The introns in most pre-mRNAs begin with a GU and end with an

AG. Presumably these short sequences are essential for guiding the spliceosome.
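Intron removal can be sketched in Python as cutting out stretches of the pre-mRNA and joining what remains. This is only a toy model: the intron coordinates are supplied by hand rather than found by a spliceosome, the sequence is made up, and the function name is hypothetical. The sketch checks the GU...AG boundaries mentioned above.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) index pairs (end exclusive) and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Most introns begin with GU and end with AG; reject anything else
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)

# Made-up pre-mRNA: exon 1, one intron, exon 2
pre_mrna = "AUGGCC" + "GUAAGUCCUAG" + "AAGUAA"
mature = splice(pre_mrna, [(6, 17)])
print(mature)  # AUGGCCAAGUAA
```

Shifting either coordinate by a single position makes the boundary check fail in this example, which mirrors how little room for error the real process has.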

Alternate Splicing

The processing of pre-mRNA for many proteins proceeds along various paths in different cells or under

different conditions. For example, early in the differentiation of a B cell (a lymphocyte that synthesizes an

antibody) the cell first uses an exon that encodes a transmembrane domain that causes the molecule to be

retained at the cell surface. Later, the B cell switches to using a different exon whose domain enables the

protein to be secreted from the cell as a circulating antibody molecule.

So, whether a particular segment of RNA will be retained as an exon or excised as an intron can vary under

different circumstances. Clearly the switching to an alternate splicing pathway must be closely regulated.

Why split genes?

Perhaps during evolution, eukaryotic genes have been assembled from smaller, primitive genes - today's

exons. Some proteins, like the antibodies mentioned in the previous section, are organized in a set of separate

sections or domains each with a special function to perform in the complete molecule. Each domain is

encoded by a separate exon. Having the different functional parts of the antibody molecule encoded by

separate exons makes it possible to use these units in different combinations. Thus a set of exons in the

genome may be the genetic equivalent of the various modular pieces in a box of "Lego" for children to

assemble in whatever forms they wish.

But the boundaries of other exons do not seem to correspond to domain boundaries of the protein. Furthermore,

rRNA and tRNA genes are also split, and these do not encode proteins. So perhaps some introns are simply

"junk" DNA that was inserted into the gene at some point in evolution without causing any harm.

Summary

Gene expression occurs in two steps:

transcription of the information encoded in DNA into a molecule of RNA (described here) and

translation of the information encoded in the nucleotides of mRNA into a defined sequence of amino

acids in a protein (discussed in Gene Translation: RNA -> Protein).

Last modified on: 4 February, 2000 by Dave Ussery