Characterization of mouse H3.3-like histone genes



Gene. 59 (1987) 29-39

Elsevier


GEN 02151

Characterization of mouse H3.3-like histone genes

(Recombinant DNA; pseudogene; evolution; reverse transcription; phage λ vector; alternate polyadenylation site)

Susan E. Wellman, Peter J. Casano, Duane R. Pilch*, William F. Marzluff* and Donald B. Sittman

Department of Biochemistry, The University of Mississippi Medical Center, Jackson, MS 39216 (U.S.A.) and * Department of Chemistry, Florida State University, Tallahassee, FL 32306 (U.S.A.) Tel. (904)644-5282

Received 22 June 1987 Accepted 29 July 1987

SUMMARY

We designed a strategy to select genomic clones of mouse replication-independent H3.3 histone genes. We obtained three clones which met our selection criteria for being H3.3 genes. Upon sequencing two of these clones we found that they were unlike previously isolated chicken H3.3 clones: they code for several unpredicted amino acid substitutions and contain no introns in the coding regions. We showed by S1 nuclease assays that these genes are protected by mRNAs that have expression characteristics of H3.3 mRNA. The protection data and nucleotide sequence analysis show that the H3.3 transcripts can be processed at one of four cleavage/polyadenylation sites. We show that these genes probably evolved through reverse transcription intermediates, and are processed pseudogenes which are no longer under selective pressure. The 5’ and 3’ transcribed, nontranslated sequences show extensive homology to those of a human cDNA clone, and we suggest that these sequences may be required for appropriate regulation of expression of H3.3 genes.

Histone proteins are highly conserved interspecifically, but can show considerable intraspecific variation. Primary sequence variants of H1, H2a, H2b and H3 have been identified in the somatic tissues of mammals. Spermatocyte-specific H1, H2a, H2b, and H3 proteins have also been identified (Lennox and Cohen, 1984; Zweidler, 1984).

The cell-cycle dependence of histone variants has been studied extensively by Zweidler (1984). From these studies, four expression classes of histones have been defined. (1) Replication-dependent variants. This class predominates in rapidly dividing tissues. Synthesis of these histones is tightly linked to DNA synthesis. (2) Partially replication-dependent variants. Synthesis of proteins in this class is induced at the onset of DNA synthesis, but is not completely repressed at the cessation of DNA synthesis in some differentiating cells. (3) Replication-independent variants. The synthesis of these proteins is not linked to DNA synthesis. This class is also called replacement or basal. (4) Tissue-specific variants. In spermatocytes a unique set of proteins is synthesized, but little is known about the regulation of these or other tissue-specific variants (Zweidler, 1984).

Correspondence to: Dr. D.B. Sittman, Department of Biochemistry, The University of Mississippi Medical Center, 2500 N. State Street, Jackson, MS 39216-4504 (U.S.A.) Tel. (601)98i-1513.

Abbreviations: aa, amino acid(s); bp, base pair(s); MEL, murine erythroleukemia; nt, nucleotide(s); SSC, 0.15 M NaCl, 0.015 M Na3·citrate, pH 7.6.

0378-1119/87/$03.50 © 1987 Elsevier Science Publishers B.V. (Biomedical Division)

To understand the mechanism of regulation of these different classes of histones, we and others have tried to isolate and identify representative clones of each expression type. Until recently, predominantly replication-dependent histone genes and their cell-cycle regulation have been studied (reviewed in Maxson et al., 1983; Old and Woodland, 1984; see also Sittman et al., 1983b; Marzluff and Graves, 1984; Daily et al., 1986). However, fully replication-dependent and partially replication-dependent histone genes have now been distinguished (Brown et al., 1985b). Several groups are studying mammalian tissue-specific spermatocyte variants (Shires et al., 1976; Kumaroo and Irvin, 1980; Seyedin and Kistler, 1980; Trostle-Weige et al., 1984; Zweidler, 1984; Cole et al., 1986) and avian red blood cell-specific H5 variants (Yaguchi et al., 1979; Kreig et al., 1983; Coles and Wells, 1985; Affolter and Ruiz-Carillo, 1986). Genomic clones for chicken replication-independent H3 genes (Engel et al., 1982; Brush et al., 1985) and cDNA clones for replication-independent H3 genes from human subjects (Wells and Kedes, 1985) have been obtained. A number of histone genes coding for proteins with unpredicted amino acids and no known function have been isolated (Sittman et al., 1981; Moorman et al., 1981; Tonjes and Doenecke, 1985; see also references within Wells, 1986).

We describe in this paper our strategy for obtaining mouse genomic H3.3 clones, and the analysis of two of the clones. They have several unpredicted amino acid changes but nevertheless are H3.3-like. We show that these genes are probably processed pseudogenes, and we present evidence that mouse H3.3 transcripts can be polyadenylated at four sites. Comparison of these cloned mouse genes to a human H3.3 cDNA clone has shown us an interesting evolutionarily conserved feature which may be involved in the different expression characteristics of the replication-independent histone genes.

MATERIALS AND METHODS

(a) Materials

Enzymes were purchased from Bethesda Research Labs, New England Biolabs or International Biotechnologies, Inc. The deoxynucleotides and dideoxynucleotides used in sequencing were from Pharmacia-P-L Biochemicals. [α-32P]dCTP (>3000 Ci/mmol) and [γ-32P]ATP (>7000 Ci/mmol) were from ICN Radiochemicals. [α-35S]dATP was from New England Nuclear. DNA synthesis reagents were from Applied Biosystems. All other chemicals were reagent grade or better.

(b) Genomic library

A library of EcoRI-digested Balb/c mouse spleen DNA in the bacteriophage λCharon4A was a gift from R. Perlmutter and L. Hood.

(c) Hybridization probes

A subclone of a sea urchin H3 gene, p3H-1, and a human H3.3 gene, HH3-1, were gifts from R. Cohn and L. Kedes. Subclones of the chicken H4 and H1 genes, pCH3dr8 and pCHlaBlBK1, were from J.D. Engel (Sugarman et al., 1983). Subclones of mouse histone genes, pMH3.2, pMH2a, and pMH2b, have been described (Sittman et al., 1981; 1983a; Graves et al., 1985).

(d) Selection of histone genes

The genomic library was screened with probes derived from the genes listed above, singly or in combination. Phage DNA was transferred to nitrocellulose filters using a procedure similar to that of Benton and Davis (1977). The denaturing solution was 0.5 M NaOH and 1.5 M NaCl, the neutralizing solution was 1 M Tris·HCl, pH 7.5, 3 M NaCl, and a rinse in 2 × SSC was added. Duplicate transfers were made. Hybridizations were done as described (Sittman et al., 1981) but in 1.5 × SSC and with Escherichia coli DNA at 50 µg/ml.

(e) Sequencing

Nucleotide sequences were determined by chain termination of elongation reactions with dideoxynucleotides (Sanger et al., 1977). Products were labeled with [α-32P]dCTP or [α-35S]dATP. Samples were analyzed on 8% or 6-8% gradient polyacrylamide sequencing gels. Oligodeoxynucleotide primers were synthesized using an Applied Biosystems DNA synthesizer.

(f) Preparation of RNA

MEL cell RNA was prepared as described (Brown et al., 1985b). Myeloma cell RNA was prepared as described (Graves et al., 1985).

(g) S1 nuclease and exonuclease VII assays

S1 nuclease mapping was performed as described (Sittman et al., 1983a). DNA-RNA hybrids, made as for S1 assays, were digested with exonuclease VII at 45°C for 1 h at 10 units/ml.

(h) Nucleotide sequence analysis

The computer programs used in nucleotide sequence analysis were either written by James Pustell and purchased from International Biotechnologies, Inc., or were those of Bionet.

RESULTS AND DISCUSSION

(a) Isolation of mouse H3.3-like histone genes

We screened a mouse genomic library of EcoRI-digested DNA in the vector λCharon4A with the probe HH3-1. This probe clone, recently described as a human H3.3 pseudogene (Wells et al., 1986), differs from a human H3.3 cDNA clone by 3 bp resulting in 2 aa changes. Otherwise this clone codes for a protein with the amino acids characteristic of H3.3: a serine at aa 31, an alanine at aa 87, an isoleucine at aa 89, a glycine at aa 90 and a serine at aa 96 (Zweidler, 1984; Brush et al., 1985). We have previously shown that HH3-1 was highly complementary to a mouse H3.3 RNA: it protected both mouse and human RNA in an S1 nuclease assay. The RNA protected in the S1 assay is constitutively expressed during the cell cycle (Sittman et al., 1983b) and throughout differentiation (Brown et al., 1985b), as is expected of the replication-independent H3.3. We therefore considered this human H3.3-like clone to be a satisfactory probe for a mouse genomic H3.3 clone.

Nine different recombinant phages were selected with HH3-1. We tested these clones for the presence of a conserved BglII site within the presumed gene, for the presence of conserved H3.3-specific 5’-flanking sequences, and for their ability to protect RNA in an S1 nuclease protection assay. Three of these clones were positive for these predicted features of an H3.3 gene (Wellman, 1986): MH611, MH321, and MH921. A 6.1-kb EcoRI fragment from MH611 containing the H3 gene was subcloned into the plasmid vector pUC8. A 4.7-kb EcoRI fragment from MH321 was subcloned into pUC9. The 5’ end of MH321, on a 2.1-kb HindIII fragment, and the 3’ end, on a 1-kb HindIII fragment, were cloned into the phage vectors M13mp8 and M13mp9. A 2.6-kb HindIII fragment from MH921 was subcloned into pUC9, and into M13mp18 and M13mp19.

(b) Sequence of two mouse H3.3-like genes and their homology to other mammalian H3.3 genes

The two genes which showed the largest protected fragments in an S1 assay (MH321 and MH921) were sequenced. The sequencing strategies for these two clones are shown in Fig. 1. Fig. 2 shows the nucleotide sequences and deduced amino acid sequences of MH321 and MH921, as well as the sequences of two other H3.3-like genes, relative to the human H3.3 cDNA sequence (HH3B-2; Wells and Kedes, 1985). Each gene codes for a protein with the amino acids characteristic of H3.3; these are indicated with asterisks. However, each has some additional amino acid replacements, as indicated.

Fig. 1. Nucleotide sequencing strategies for clones MH921 and MH321. The heavy arrows show the coding regions and the direction of transcription of the genes. The positions of the oligomers used to prime dideoxy sequencing reactions in M13 subclones and the sequence obtained with those oligomers are indicated by the thin arrows. The symbols beside the arrows are the names of the oligomers. Oligomers were from 14 to 17 nt in length. Numbers refer to bp counted from the ATG codon. Only a portion of each subclone is shown.

The gene in MH321 has 4 aa differences and the MH921 gene has 2 aa differences from the predicted H3.3 sequence. One of these changes is identical in each gene, cysteine at aa 128. The two mouse genes show striking homology to each other and to the human H3.3 cDNA clone. The nucleotide sequences of the MH921 and MH321 genes are 98% homologous in the coding region. Comparison of the human cDNA sequence to MH921 shows a 96% sequence conservation in the coding region. In the 3’-flanking region, 33 to 50 bp from the termination codon, there is a region of variation: in the human gene, HH3B-2, this region is primarily poly(A), while in MH321 and MH921 there are poly(A)/poly(T) sequences which differ slightly from one another in length. MH321 and MH921 are otherwise identical for 365 bp in the 3’-flanking region. MH921 and HH3B-2 are 93% homologous for 500 bp of the 3’ end, excluding the poly(A)/poly(T) region. This homology is over the entire 3’ transcribed, nontranslated region of the human cDNA.

MH921 and MH321 have only 6 nt changes in 91 bp 5’ to the ATG codon, for 93% homology. MH921 and HH3B-2 show less homology in the 5’ end, 86% homology for 103 bp 5’ to the ATG codon.
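The percentages quoted in this section follow from simple counts of changed positions. The sketch below is only an illustration of that arithmetic, not the authors' software; the function names and the toy alignment are assumptions introduced here.

```python
def percent_identity(n_positions: int, n_changes: int) -> float:
    """Percentage of compared positions that are unchanged."""
    if n_positions <= 0:
        raise ValueError("n_positions must be positive")
    return 100.0 * (n_positions - n_changes) / n_positions


def identity_from_alignment(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# 6 nt changes in 91 bp 5' to the ATG codon (MH921 vs. MH321)
print(round(percent_identity(91, 6)))  # 93

# Toy alignment (hypothetical sequences): 3 of 4 positions match
print(identity_from_alignment("ACGT", "ACGA"))  # 75.0
```

Gaps and insertions are not handled here; a real comparison of flanking regions would need an alignment step first.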

A surprising feature is the extent of homology between the mouse and human genes in the 5’ and 3’ transcribed, nontranslated flanking sequences. The interspecific conservation of such extensive stretches of 5’ and 3’ transcribed flanking sequences has not been reported for other RNA polymerase II genes. Replication-dependent histone genes show little homology beyond the coding sequences, within a species (Taylor et al., 1986). The implication of this level of homology in noncoding sequences of the H3.3 histone genes is that these flanking sequences serve a function which cannot tolerate a high degree of change.

It is possible that there are multiple classes of H3.3 genes and that only those within a class are highly conserved. The two chicken H3.3 genes that have been isolated are not at all similar in their 5’- and 3’-flanking sequences (Brush et al., 1985), and are only 82% homologous in their coding sequence, even though they code for identical proteins. The mouse and human H3.3 genes are more similar than these two chicken genes. The mouse H3.3-like and human H3.3 genes were isolated with, and are therefore homologous to, the mouse H3.3-like gene, H3-6 (Sittman et al., 1981). It is possible that H3.3 genes of other classes have gone undetected in man and mouse.

Fig. 2. Comparison of H3.3 and H3.3-like coding sequences from human (HH3B-2, HH3-1) and mouse (MH921, MH321, H3-6) genes with a mouse H3.3 consensus sequence (MH33CN). The 5’- and 3’-flanking sequences of MH921, MH321, and HH3B-2 are also compared. Nucleotides are numbered only in flanking sequences whereas amino acids are numbered in coding regions. The asterisks above the amino acid sequence indicate amino acids which are characteristic of H3.3 proteins (see RESULTS AND DISCUSSION, section a). Dots indicate that the bases are the same as those of HH3B-2, and dashes indicate deletions. Bases above or below lines, set off by slashes, are insertions, relative to the sequence of HH3B-2.

(c) Analysis of 5’ and 3’ S1 nuclease

The size of the product when the gene within clone MH611 was used in a 5’ S1 assay, as calculated from the data shown in Fig. 3A, was approx. 300 nt, indicating that the mRNA protects this fragment for about 60 nt beyond the ATG codon. The size of the product when the gene within MH321 was used in an S1 assay indicated that the mRNA protected this fragment to approx. 100 nt beyond the ATG codon (Fig. 3B). The gene within clone MH921 showed the longest protected fragment in an S1 assay (Fig. 3C, lanes 1 through 8). Protection of the probe extended about 110 nt beyond the ATG codon. S1 analysis was repeated, using an NcoI fragment which spans the ATG codon of the MH921 gene. A smaller protected fragment, which could be more accurately sized, was produced. These results are shown in Fig. 3C, lanes 9 through 11. Protection to approx. 110 nt beyond the ATG codon was seen. The transcripts complementary to these genes, like those complementary to the human HH3-1 gene, are constitutively expressed during the cell cycle and through differentiation, providing an explanation for the mechanism of H3.3 replacement of the replication-dependent H3 histones (Sittman et al., 1983b; Brown et al., 1985b; Wellman, 1986).
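The step from a protected fragment size to "nt beyond the ATG codon" can be sketched as a subtraction. The snippet below is a rough illustration under stated assumptions: the 5’ probes were labeled at the BglII site at codon 81 (Fig. 3 legend), the label is taken to lie at the end of that codon (the exact cut position within the codon is not given here), and the function name is hypothetical.

```python
CODON_LENGTH = 3  # nt per codon


def protected_beyond_atg(fragment_nt: int, labeled_codon: int) -> int:
    """Estimate how far 5' of the ATG a protected fragment extends.

    Assumes the labeled end sits at the last nt of `labeled_codon`,
    counting the A of ATG as nt 1.
    """
    label_offset = labeled_codon * CODON_LENGTH  # nt from ATG to the label
    return fragment_nt - label_offset


# MH611: protected fragment of approx. 300 nt, label at codon 81
print(protected_beyond_atg(300, 81))  # 57, i.e. about 60 nt beyond the ATG
```

Because gel sizing of a roughly 300-nt fragment is itself approximate, the result is only good to within several nucleotides, which is consistent with the text quoting "about 60 nt".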

Using MH921 as a probe, 3’ S1 nuclease analysis resulted in a major band at a position indicating protection to the stretch of A’s and T’s 33-50 nt beyond the termination codon (Fig. 3D). This is the only region in the 3’-flanking sequences which shows a significant loss of homology with the human gene HH3B-2. The human gene is known to be transcribed well beyond this point (Wells and Kedes, 1985). It seems likely that, because of the high homology of the mouse genes with the human cDNA, either MH321 and MH921 or the gene from which they were derived is transcribed beyond this point. It is possible that the A(n)T(n) sequences form an S1-sensitive structure in RNA-DNA hybrids. Faint but reproducible bands could be seen, indicating protection by longer transcripts (Fig. 3E, lane 5). To avoid S1-sensitive areas, we used exonuclease VII in a protection assay of MH921 to map the 3’ end (Fig. 3E). Four protected fragments were seen (Fig. 3E, lanes 3 and 6) which comigrated with the faint bands seen in the S1 analysis (Fig. 3E, lane 5). (In Fig. 3E, lane 6, the upper two bands appear as one due to the low resolution of the gel.) These bands correspond to 3’ termini at nt 521, 499, 367, and 303. Examination of the MH921 and HH3B-2 sequences in three of these regions reveals consensus cleavage/polyadenylation signals (Birnstiel et al., 1985). These are shown in Table I. Sequences in the region around nt 499 are also shown. Except for the TA at nt 500, a possible site for poly(A) addition, the consensus cleavage/polyadenylation sequences are not present in this region.

Fig. 3. Mapping of H3.3 mRNA with S1 nuclease and exonuclease VII. The 5’ ends (panels A, B, and C) and 3’ ends (panels D and E) of H3.3 mRNA were mapped with S1 nuclease (panels A-D and E, lane 5) and exonuclease VII (panel E, lanes 1-4 and lane 6). Fragments with unique restriction sites within the genes were labeled and hybridized to mouse RNA. The duplexes were digested with either S1 nuclease or exonuclease VII and the products were separated on an 8% polyacrylamide gel. (A) 5’ S1 analysis of MH611. The BglII site at codon 81 was labeled: (1) HinfI-digested pBR322, (2) protection from S1 digestion by 10 µg MEL cell RNA, (3) protection by 20 µg MEL cell RNA, and (4) protection by 10 µg tRNA. (B) 5’ S1 analysis of MH321. The BglII site at codon 81 was labeled: (1) HinfI-digested pBR322, (2) protection by 10 µg tRNA, (3) protection by 10 µg MEL cell RNA, and (4) protection by 20 µg MEL cell RNA. (C) 5’ S1 analysis of MH921. The BglII site at codon 81 (lanes 1-4, 7 and 8), or the NcoI site that spans the ATG codon (lanes 10 and 11) was labeled: (1) protection by 10 µg MEL cell RNA, (2) protection by 20 µg MEL cell RNA, (3) protection by 10 µg tRNA, (4) end-labeled fragment alone, (5) HinfI-digested pBR322, (6) HpaII-digested pUC18, (7) protection by 0.6 µg polyadenylated myeloma cell RNA, (8) protection by 10 µg tRNA, (9) HpaII-digested pUC18, (10) protection by 0.6 µg myeloma cell RNA, and (11) protection by 10 µg tRNA. (D) 3’ S1 analysis of MH921. The HpaII site at codon 11 was labeled: (1) protection by 10 µg tRNA, (2) protection by 0.6 µg polyadenylated myeloma cell RNA, (3) HpaII-digested pUC18, and (4) protection by 0.6 µg polyadenylated myeloma cell RNA. (E) 3’ exonuclease VII analysis of MH921. The HpaII site at codon 11 was labeled. Marker fragments, from HpaII digestion of pUC18, are not shown but their sizes (504, 489, 404 and 351 bp) are indicated: (1) a 780-bp end-labeled BglII-HindIII fragment from pLvU1.1 (Brown et al., 1985a) used as a marker, (2) protection by 10 µg tRNA, (3) protection by 0.6 µg polyadenylated myeloma cell RNA, (4) the 943-bp end-labeled probe fragment alone, (5) protection from S1 nuclease by 0.6 µg polyadenylated myeloma cell RNA and (6) protection from exonuclease VII by 0.6 µg polyadenylated myeloma cell RNA. The arrows beside lane 5 show the positions of the faint bands from S1 analysis that comigrate with the exonuclease VII products.

TABLE I

Consensus cleavage/polyadenylation sites in mouse H3.3 genes

Hexamer (a)

=sAATAAA
343 AG(G)TAAA
=‘ATTAAA

Addition site (b)

29’ GTT_eAAATTTTTCmACAAT>CCAGCATTTGGA
365 ACmAATGGTGTTTGTAGCATTTTTATCATACAC
490 GCTATTAAAATACATTAAACTATA...
5’8 TAG...

(a) The sequences AATAAA, AGTAAA, ATTAAA, or a related hexamer are found 10 to 30 nt upstream from the polyadenylation site in many genes (Birnstiel et al., 1985).

(b) The usual site of poly(A) addition is the dinucleotide TA or CA. Downstream from the addition site, T + G-rich clusters often occur (Birnstiel et al., 1985). The dinucleotides at which addition may occur and downstream T/G clusters that occur are underlined.
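The kind of search summarized in Table I can be approximated by scanning a sequence for the three hexamers named in footnote a and keeping those that lie 10 to 30 nt upstream of a candidate addition site. The sketch below is an illustration only: the function names are hypothetical, the distance is measured from the first base of the hexamer (an assumption), and the example string is the legible 24-nt stretch printed in the table, indexed from 0 rather than by the paper's coordinates.

```python
SIGNALS = ("AATAAA", "AGTAAA", "ATTAAA")  # hexamers listed in footnote a


def find_signals(seq, signals=SIGNALS):
    """Return (start_index, hexamer) for every signal hexamer in seq."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 5):
        hexamer = seq[start:start + 6]
        if hexamer in signals:
            hits.append((start, hexamer))
    return hits


def signals_upstream_of(seq, site, min_gap=10, max_gap=30):
    """Keep hexamers whose first base lies min_gap..max_gap nt 5' of `site`."""
    return [(s, h) for s, h in find_signals(seq) if min_gap <= site - s <= max_gap]


region = "GCTATTAAAATACATTAAACTATA"  # legible stretch from Table I
print(find_signals(region))  # [(3, 'ATTAAA'), (13, 'ATTAAA')]

# Toy case (hypothetical sequence): a hexamer 26 nt upstream of a TA dinucleotide
toy = "ATTAAA" + "C" * 20 + "TA"
print(signals_upstream_of(toy, site=26))  # [(0, 'ATTAAA')]
```

Related hexamers that deviate from these three would need a mismatch-tolerant search, which this sketch does not attempt.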

(d) Are the mouse H3.3-like genes pseudogenes?

S1 nuclease results like those shown in Fig. 3 are normally considered evidence that the genes used as probes are expressed. However, MH921 and MH321 have several features characteristic of processed pseudogenes which have arisen through reverse-transcription intermediates (Wilde, 1986). They lack introns, whereas the H3.3 genes from chicken have introns. At the point in the 3’-flanking sequence where MH921 and MH321 lose homology, the MH321 gene has a stretch of 16 A residues, 2 T’s and 5 more A’s, which are possibly the remnants of a poly(A) tail. This sequence is then followed by a short sequence (TTAAAAATGGGTAAT) which has a nearly identical direct repeat sequence (TTAAAAATAGGGTTAT) at the 5’ end adjacent to the point at which this gene loses homology with MH921 (Fig. 2). The gene from MH921 has a region of 13 A’s at nt 520, the position of the poly(A) tail in HH3B-2. The A’s are followed by the sequence GAATGACTGG, which is directly repeated in the 5’-flanking region at the point where MH921 and HH3B-2 lose homology. MH321 and MH921 are likely processed pseudogenes, and these regions are remnants of poly(A) tails. The presence of these poly(A) sequences and the exonuclease VII results indicate that mammalian H3.3 genes can use multiple sites for polyadenylation.
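One way to put a number on "nearly identical" for the direct repeats flanking MH321 is an edit distance between the two printed sequences. The sketch below uses a standard Levenshtein dynamic program; it is offered as an illustration and is not the comparison method used in the paper. The repeat sequences are taken from the text with the stray space removed.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


repeat_3prime = "TTAAAAATGGGTAAT"   # follows the poly(A) remnant in MH321
repeat_5prime = "TTAAAAATAGGGTTAT"  # 5'-flanking copy in MH321

print(edit_distance(repeat_3prime, repeat_5prime))  # 2
```

Two edits over 15 to 16 nt (one inserted A and one A/T difference) is the sort of near-identity expected of target-site duplications that have drifted slightly since insertion.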

These mouse H3.3-like histone genes are probably pseudogenes that originated as reverse transcripts; however, it is possible that they are expressed. Transcriptionally active pseudogenes have been found (Soares et al., 1985; Stein et al., 1983; Wilde, 1986; Gruskin et al., 1987) and it is possible that all of the replication-dependent histone genes arose through such a process. Variant H3 genes have been isolated from other organisms: an H3 gene from duck which codes for a protein with 10 aa replacements from the H3.1 sequence has been isolated (Tonjes and Doenecke, 1985). A variant H3 gene has been isolated from Xenopus laevis which has 5 unpredicted aa codon replacements (Moorman et al., 1981). All of these genes appear capable of coding for complete H3 proteins and none have accumulated termination codons. The unpredicted amino acid replacements do not necessarily argue that these genes are silent. The only amino acid sequences of isolated histones that are available are of the abundant variants, and these genes may code for rare variants.

(e) Analysis of base substitutions

Gojobori et al. (1982) have previously shown in a detailed analysis of eight pseudogenes and three functional genes that most base changes of genes no longer under selective pressure are due to losses of C’s and G’s, with the preferred changes being C to T and G to A. In Table II we present the relative substitution frequencies of MH921, MH321, H3-6 and HH3-1. These data are compiled from the sequences presented in Fig. 2. It is clear that for these histone genes C to T and G to A changes are preferred, too. There is a higher frequency of G to T changes than Gojobori et al. (1982) observed, 13.8% vs. 7%. This may be due to a low number of data points. Of all the nucleotide changes, 73% result in an amino acid change. This is a reasonable frequency since changes in the first two bases of most codons would cause an amino acid replacement.

TABLE II

Relative substitution frequencies in H3.3-like histone genes (a)

Original base    Replacement base                                                   Total
                 A                T                C               G
A                -                0                0               5.2 (2/108)      5.2
T                6.4 (2/87)       -                3.2 (1/87)      0                9.6
C                5.1 (2/109)      28.1 (11/109)    -               5.1 (2/109)      38.3
G                30.3 (11/101)    13.8 (5/101)     2.8 (1/101)     -                46.9
Total            41.8             41.9             6.0             10.3

(a) The substitution frequencies presented were calculated by the method of Gojobori et al. (1982) and are given as percentages. They correct for the base composition. The actual proportions of base substitutions are presented in parentheses. The data are taken from the nt changes of MH921, MH321, H3-6 and HH3-1 as compared to the mouse H3.3 consensus sequence (Fig. 2).
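The percentages in Table II can be recomputed from the proportions given in parentheses. The sketch below assumes that the correction for base composition means dividing each count by the number of occurrences of the original base and then normalizing all nine rates to sum to 100; this reading is an inference from the table footnote, although it does appear to reproduce the printed values. The variable and function names are introduced here for illustration.

```python
# (original base, replacement base) -> number of observed changes (Table II)
counts = {
    ("A", "G"): 2,
    ("T", "A"): 2, ("T", "C"): 1,
    ("C", "A"): 2, ("C", "T"): 11, ("C", "G"): 2,
    ("G", "A"): 11, ("G", "T"): 5, ("G", "C"): 1,
}
# Occurrences of each base in the compared sequences (Table II denominators)
base_totals = {"A": 108, "T": 87, "C": 109, "G": 101}


def relative_frequencies(counts, base_totals):
    """Per-base substitution rates, rescaled so that all rates sum to 100."""
    rates = {pair: n / base_totals[pair[0]] for pair, n in counts.items()}
    total = sum(rates.values())
    return {pair: 100.0 * rate / total for pair, rate in rates.items()}


freqs = relative_frequencies(counts, base_totals)
for (old, new), value in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(f"{old} -> {new}: {value:.1f}%")
```

With these inputs the two largest entries are G to A (30.3%) and C to T (28.1%), matching the preference described in the text; the 37 changes in total also show how few data points underlie each cell.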

It is apparent that the C’s and G’s in CG doublets (site of C methylation) change with a higher frequency than C’s and G’s that are not in CG doublets. Gojobori et al. (1982) calculated that 74% of the CG doublets in eight pseudogenes had a base change. In the group of mouse variant genes shown in Fig. 2, 72% of the CG doublets have a base change. Compared to the mouse H3.3 consensus sequence (Fig. 2), 13/34 (38%) of the C’s and G’s in doublets which, if changed, could alter amino acid sequence, have changed in these genes. In contrast, of the 142 C’s and G’s which are not in doublets and which could cause amino acid changes, only 15 (11%) have changed, compared to the human cDNA sequence. It is likely then that the high rate of change in arginine codons which we observe in these histone genes is due to a preference for base substitutions in CG doublets and not due to a histone gene-specific selective change.
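The comparison of C’s and G’s inside and outside CG doublets can be sketched as a tally over an aligned reference and variant. The code below is a simplified illustration: it uses hypothetical toy sequences, treats every C and G alike, and does not apply the paper's restriction to positions whose change could alter the amino acid sequence.

```python
def cg_doublet_positions(ref: str) -> set:
    """Indices of every C and G that is part of a CG doublet in ref."""
    positions = set()
    for i in range(len(ref) - 1):
        if ref[i:i + 2] == "CG":
            positions.update((i, i + 1))
    return positions


def change_rates(ref: str, variant: str) -> dict:
    """Return {'in_cg': (changed, total), 'not_in_cg': (changed, total)}."""
    if len(ref) != len(variant):
        raise ValueError("sequences must be aligned to equal length")
    in_cg = cg_doublet_positions(ref)
    tallies = {"in_cg": [0, 0], "not_in_cg": [0, 0]}  # [changed, total]
    for i, (r, v) in enumerate(zip(ref, variant)):
        if r not in ("C", "G"):
            continue
        key = "in_cg" if i in in_cg else "not_in_cg"
        tallies[key][1] += 1
        if r != v:
            tallies[key][0] += 1
    return {key: tuple(pair) for key, pair in tallies.items()}


# Toy example (hypothetical sequences)
print(change_rates("ACGTCCGA", "ATGTCCAA"))
# {'in_cg': (2, 4), 'not_in_cg': (0, 1)}

# The proportions quoted in the text
print(round(100 * 13 / 34), round(100 * 15 / 142))  # 38 11
```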

There appears to be no region of MH321 and MH921 (5’-flanking, coding, or 3’-flanking) which has diverged at a significantly faster rate than another, beyond that which can be accounted for by the frequency of CG doublets. For example, the 5’-flanking ends of these genes are less homologous (93% for 90 bp from the AUG codon) than the 3’-flanking regions (100% for the 365 bp beyond the stop codon). However, all the changes in the 5’ end occur in the 10 CG doublets. The 3’-flanking end has only 3 CG doublets in a much longer stretch of DNA. As pointed out above, most of the coding region changes occur at CG doublets.

The diminution of CG doublets in these H3.3-like genes is consistent with the conclusion that they are not expressed. Recent characterization of a muscle-specific calmodulin gene from chicken showed that it probably arose through a reverse transcriptase-mediated event, but no diminution of CG doublets has occurred (Gruskin et al., 1987).

Taking into account that only a limited number of single base changes can form a termination codon (37 in the mouse H3.3 consensus sequence) and the relative substitution frequency of bases (Table II) we estimate that for the H3.3 genes there is only a 1% chance that any single base change will result in the formation of a stop codon. If transitions and transversions are the only form of mutations which occur in pseudogenes, then the absence of termination codons in any of the genes presented in Fig. 2 cannot be used to argue convincingly that they are under selective pressure. We do not know the probability of occurrence of insertions or deletions in genes not under selective pressure and if the absence of any frameshift mutations in these genes is significant. It is interesting that the only recorded deletion, in H3-6, is of a codon.
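The count of single base changes that can create a termination codon (37 for the mouse H3.3 consensus, per the text) is the kind of figure that can be obtained by enumerating all nine point substitutions of each codon. The sketch below shows that enumeration on a short hypothetical sequence, assuming the standard stop codons TAA, TAG and TGA; it does not include the weighting by substitution frequency that underlies the 1% estimate, and it is not the authors' calculation.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"


def stop_creating_changes(coding_seq: str):
    """List (codon_number, codon, mutant) for single-base changes giving a stop."""
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    changes = []
    for start in range(0, len(seq), 3):
        codon = seq[start:start + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon; nothing to create
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if mutant in STOP_CODONS:
                    changes.append((start // 3 + 1, codon, mutant))
    return changes


toy = "ATGTGGCAA"  # hypothetical: Met-Trp-Gln
for change in stop_creating_changes(toy):
    print(change)
# (2, 'TGG', 'TAG')
# (2, 'TGG', 'TGA')
# (3, 'CAA', 'TAA')
```

Weighting each listed change by the corresponding entry of Table II would move this from a raw count toward the probability discussed in the text.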

(f) Conclusions

We obtained three mouse genomic clones which met our selection criteria for being replication-independent H3.3 genes. Two of these clones were unlike previously isolated chicken H3.3 clones: they coded for proteins with several unpredicted amino acid changes and contained no introns in the coding regions. These genes were strongly protected by an mRNA that had expression characteristics of an H3.3 mRNA. (1) We discussed the probability that these genes evolved through reverse transcription intermediates and are processed pseudogenes. (2) We showed that the pattern of base changes in these genes can be explained by the nucleotide substitution frequency of genes no longer under selective pressure. (3) We presented evidence that mouse H3.3 transcripts are polyadenylated at four sites. (4) Sequence analysis of these genes showed that there is a high level of conservation of 5’ and 3’ transcribed, nontranslated sequences in the H3.3 genes of human and mouse. This interspecific conservation led us to hypothesize that the 5’ and 3’ flanking sequences are essential for mammalian H3.3 expression characteristics.

ACKNOWLEDGEMENTS

We thank David T. Brown for providing MEL cell RNA. This research was supported by grants HD17682 to D.B.S. and GM29832 to W.F.M. from the National Institutes of Health.

REFERENCES

Affolter, M. and Ruiz-Carillo, A.: Transcription unit of the chicken histone H5 gene and mapping of H5 pre-mRNA sequences. J. Biol. Chem. 261 (1986) 11496-l 1502.

Benton, D.W. and Davis, R.W.: Screening lzgt recombinant clones by hybridization to single plaques in situ. Science 196 (1977) 180-182.

Birnstiel, M.L., Busslinger, M. and Strub, K.: Transcription ter- mination and 3’ processing: the end is in site! Cell 41 (1985) 349-359.

Brown, D.T., Morris, G.F., Chodchoy, N., Sprecher, C. and Marzluff, W.F.: Structure of the sea urchin U1 RNA repeat. Nucl. Acids Res. 13 (1985a) 537-556.

Brown, D.T., Wellman, S.E. and Sittman, D.B.: Changes in the levels of three different classes of histone mRNA during murine erythroleukemia cell differentiation. Mol. Cell. Biol. 5 (1985b) 2879-2886.

Brush, D., Dodgson, J.B., Choi, O-R., Stevens, P.W. and Engel, J.D.: Replacement variant histone genes contain intervening sequences. Mol. Cell. Biol. 5 (1985) 1307-1317.

Cole, K.D., Kandala, J.C. and Kistler, W.S.: Isolation of the gene for the testis-specific H1 histone variant H1t. J. Biol. Chem. 261 (1986) 7178-7183.

Coles, L.S. and Wells, J.R.E.: An H1 histone gene-specific 5’ element and evolution of H1 and H5 genes. Nucl. Acids Res. 13 (1985) 585-594.

Daily, L., Hardy, S.M., Roeder, R.G. and Heintz, N.: Distinct transcription factors bind specifically to two regions of the human histone H4 promoter. Proc. Natl. Acad. Sci. USA 83 (1986) 7241-7245.

Engel, J.D., Sugarman, B.J. and Dodgson, J.B.: A chicken histone H3 gene contains intervening sequences. Nature 297 (1982) 434-436.

Gojobori, T., Li, W.-H. and Graur, D.: Patterns of nucleotide substitution in pseudogenes and functional genes. J. Mol. Evol. 18 (1982) 360-369.

Graves, R.A., Wellman, S.E., Chiu, I.-M. and Marzluff, W.F.: Differential expression of two clusters of mouse histone genes. J. Mol. Biol. 183 (1985) 179-194.

Gruskin, K.D., Smith, T.F. and Goodman, M.: Possible origin of a calmodulin gene that lacks intervening sequences. Proc. Natl. Acad. Sci. USA 84 (1987) 1605-1608.

Kreig, P.A., Robins, A.J., D’Andrea, R. and Wells, J.R.E.: The chicken H5 gene is unlinked to core and H1 histone genes. Nucl. Acids Res. 11 (1983) 619-627.

Kumaroo, K.K. and Irvin, J.L.: Isolation of histone TH1-xB from rat testis. Biochem. Biophys. Res. Commun. 94 (1980) 49-54.

Lennox, R.W. and Cohen, L.H.: The H1 subtypes of mammals: metabolic characteristics and tissue distribution. In Stein, G.S., Stein, J.L. and Marzluff, W.F. (Eds.), Histone Genes: Structure, Organization and Regulation. Wiley, New York, 1984, pp. 373-395.

Marzluff, W.F. and Graves, R.A.: Core histone variants of the mouse: primary structure and differential expression. In Stein, G.S., Stein, J.L. and Marzluff, W.F. (Eds.), Histone Genes: Structure, Organization and Regulation. Wiley, New York, 1984, pp. 281-315.

Maxson, R., Cohn, R., Kedes, L. and Mohun, T.: Expression and organization of mouse histone genes. Annu. Rev. Genet. 17 (1983) 239-277.

Moorman, A.F.M., De Boer, P.A.J., De Laaf, R.T.M., Van Dongen, W.M.A.M. and Destree, O.H.J.: Primary structure of the histone H3 and H4 genes and their flanking sequences in a minor histone gene cluster of Xenopus laevis. FEBS Lett. 136 (1981) 45-52.

Old, R.W. and Woodland, H.R.: Histone genes: not so simple after all. Cell 38 (1984) 624-626.

Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.

Seyedin, S.M. and Kistler, W.S.: Isolation and characterization of rat testis H1t. J. Biol. Chem. 255 (1980) 5949-5954.

Shires, A., Carpenter, M.P. and Chalkley, R.: A cysteine-containing H2B-like histone found in mature mammalian testis. J. Biol. Chem. 251 (1976) 4155-4158.

Sittman, D.B., Chiu, I.-M., Pan, C.-J., Cohn, R.H., Kedes, L.H. and Marzluff, W.F.: Isolation of two clusters of mouse histone genes. Proc. Natl. Acad. Sci. USA 78 (1981) 4078-4082.

Sittman, D.B., Graves, R.A. and Marzluff, W.F.: Structure of a cluster of mouse histone genes. Nucl. Acids Res. 11 (1983a) 6679-6697.

Sittman, D.B., Graves, R.A. and Marzluff, W.F.: Histone mRNA concentrations are regulated at the level of transcription and mRNA degradation. Proc. Natl. Acad. Sci. USA 80 (1983b) 1849-1853.

Soares, M.B., Schon, E., Henderson, A., Karathanasis, S.K., Cate, R., Zeitlin, S., Chirgwin, J. and Efstratiadis, A.: RNA-mediated gene duplication: the rat preproinsulin I gene is a functional retroposon. Mol. Cell. Biol. 5 (1985) 2090-2103.

Stein, J.P., Munjaal, R.P., Lagace, L., Lai, E.C., O’Malley, B.W. and Means, A.R.: Tissue-specific expression of a chicken calmodulin pseudogene lacking intervening sequences. Proc. Natl. Acad. Sci. USA 80 (1983) 6485-6489.

Sugarman, B.J., Dodgson, J.B. and Engel, J.D.: Genomic organization, DNA sequence, and expression of chicken embryonic histone genes. J. Biol. Chem. 258 (1983) 9005-9016.

Taylor, J.D., Wellman, S.E. and Marzluff, W.F.: Sequences of four mouse histone H3 genes: implications for evolution of mouse histone genes. J. Mol. Evol. 23 (1986) 242-249.

Tonjes, R. and Doenecke, D.: Structure of a duck H3 subtype with four cysteine residues. Gene 39 (1985) 275-279.

Trostle-Weige, P.K., Meistrich, M.L., Brock, W.A. and Nishioka, K.: Isolation and characterization of TH3, a germ cell-specific variant of histone 3 in rat testis. J. Biol. Chem. 259 (1984) 8769-8776.

Wellman, S.E.: Isolation and Characterization of Mouse Replication-dependent and Replication-independent Histone Genes. Ph.D. Dissertation, Florida State University, Tallahassee, FL, 1986.

Wells, D.E.: Compilation analysis of histones and histone genes. Nucl. Acids Res. 14 (1986) r119-r149.

Wells, D., Bains, W. and Kedes, L.: Codon usage in histone gene families of higher eukaryotes reflects functional rather than phylogenetic relationships. J. Mol. Evol. 23 (1986) 224-241.

Wells, D. and Kedes, L.: Structure of a human histone cDNA: evidence that basally expressed histone genes have intervening sequences and encode polyadenylated mRNAs. Proc. Natl. Acad. Sci. USA 82 (1985) 2834-2838.

Wilde, C.D.: Pseudogenes. Crit. Rev. Biochem. 19 (1986) 323-352.

Yaguchi, M., Roy, C. and Seligy, V.L.: Complete amino acid sequence of goose erythrocyte H5 histone and the homology between H1 and H5 histones. Biochem. Biophys. Res. Commun. 90 (1979) 1400-1406.

Zweidler, A.: Core histone variants of the mouse: primary structure and differential expression. In Stein, G.S., Stein, J.L. and Marzluff, W.F. (Eds.), Histone Genes: Structure, Organization and Regulation. Wiley, New York, 1984, pp. 339-371.

Communicated by S.T. Case.