ROn-1 SINEs: A SHORT INTERSPERSED REPETITIVE ELEMENT FROM THE GENOME OF OREOCHROMIS NILOTICUS AND ITS
SPECIES DISTRIBUTION IN CICHLID FISHES
Louis J. Bryden
Submitted in partial fulfillment of the requirements for the degree of Master of Science
Dalhousie University Halifax, Nova Scotia
February, 1997
Copyright by Louis J. Bryden, 1997
The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
To my wife, Glenda, my deepest love and thanks.
To my children, Joshua and Victoria, who are my best work and of whom I am most proud.
CONTENTS
Chapter 1. Introduction
1.1 Repetitive DNA Sequences
1.2 Tandemly Arrayed DNA Sequences
1.3 Interspersed Repeats
1.4 LINEs
1.5 SINEs
1.6 Initial Generation of SINEs
1.7 SINE Subfamilies
1.8 Mechanism of Subfamily Generation
1.9 SINE Function
1.10 Cichlid Biology
1.11 Goals of this Study
Chapter 2. Materials and Methods
2.1 Fish Samples
2.2 DNA Isolation
2.3 Subtractive Hybridization and Cloning
2.4 Transformation of pUC 18 Vectors into E. coli
2.5 Characterization of Recombinant Colonies
2.6 Plasmid Preparation
2.7 Restriction Endonuclease Digestion
2.8 Gel Electrophoresis and Southern Transfer
2.9 Recovery of Plasmid Inserts and Radiolabelling of DNA Probes
2.10 Partial Digestions
2.11 Hybridization Conditions
2.12 Isolation and Subcloning of Repetitive DNAs from Genomic Library
2.13 Plating Bacteriophage Lambda
2.14 Large Scale Bacteriophage DNA Preparation
2.15 Large Scale Plasmid Preparation
2.16 Generation of Nested Sets of Deletions
2.17 DNA Sequencing and Analysis
Chapter 3. Results and Discussion
3.1 Subtractive Hybridization and Analysis
3.2 Sequencing of Repetitive Fragments
3.3 Genomic Organization
3.4 Partial Digestion Analysis
3.5 Species Blots
3.6 Isolation of Full Length Repetitive Elements
3.7 Identification of the Repetitive Element
Chapter 4. Summary and Conclusions
LIST OF FIGURES AND TABLES
FIGURES
1. Hybridization of radiolabelled p80 insert to O. niloticus genomic DNA
2. Hybridization of radiolabelled p34 insert to O. niloticus genomic DNA
3. Hybridization of radiolabelled p43 insert to O. niloticus genomic DNA
4. Hybridization of radiolabelled p44 insert to O. niloticus genomic DNA
5. Hybridization of radiolabelled p54 insert to O. niloticus genomic DNA
6. Nucleotide sequence data for PERT clones
7. Molecular characterization of the p80 repetitive element
8. Molecular characterization of the p34 repetitive element
9. Hybridization of p80 to O. niloticus genomic DNA Pst I partial digest
10. Hybridization of p80 to O. niloticus genomic DNA Mbo I partial digest
11. Hybridization of p34 to O. niloticus genomic DNA Mbo I partial digest
12. Hybridization of p34 to O. niloticus genomic DNA Hae III partial digest
13. Hybridization of p80 to zoo blot
14. Hybridization of p34 to zoo blot
15. Hybridization of p80 to cichlid species blot
16. Hybridization of p34 to cichlid species blot
17. Complete nucleotide sequence of p80λ9.2 subclone
18. Complete nucleotide sequence of p80λ7.3 subclone
19. Partial sequence data from p80λ9.1 subclone
20. Multiple sequence alignment of p80λ9.2, p80λ7.3 and p80
21. Schematic representation of the O. niloticus ROn-1 SINE
22. Secondary structure of the ROn-1 SINE tRNA-like region
TABLE
1. List of putative open reading frames in PERT clones
ROn-1 SINEs: A short interspersed repetitive element from the genome of Oreochromis niloticus and its species-specific distribution in cichlid fishes.
While attempting to isolate sex-specific markers from Oreochromis niloticus using phenol enhanced reassociation technique (PERT) subtractive hybridization, I identified partial sequences for five novel repetitive DNAs. One highly repetitive DNA element, termed ROn-1 (retroposon O. niloticus-1), was characterized in detail. Using a partial fragment of a ROn-1 element (clone p80), we screened an O. niloticus genomic library for full length ROn-1 elements. Approximately 600 positive plaques were detected among 1.5 x 10^4 plated, indicating 6000 copies of the ROn-1 element per haploid genome. The aligned sequences from two independent clones showed that the ROn-1 element is 343 bp long and flanked by 52 bp direct repeats. Moreover, the sequence of the element revealed a tRNA-related domain with putative RNA polymerase III control boxes, a tRNA-unrelated domain and an A-rich tail, characteristic of SINE elements found in other species. Sequence and secondary structural similarities suggest that ROn-1 is derived from tRNA lysine. Southern analysis using an internal fragment spanning the tRNA-related and tRNA-unrelated regions confirmed that ROn-1 is a highly repetitive and dispersed element in cichlid fish genomes (i.e., genera Oreochromis, Tilapia, Sarotherodon, Haplochromis, Hemichromis and Pelvicachromis), but is absent in the genomes of representative noncichlid fishes and mammals.
LIST OF ABBREVIATIONS AND SYMBOLS USED
bp - base pairs
BAP - bovine alkaline phosphatase
BSA - bovine serum albumin
cpm - counts per minute
dATP - deoxyadenosine triphosphate
dCTP - deoxycytosine triphosphate
DNA - deoxyribonucleic acid
EDTA - ethylenediaminetetraacetic acid
g - gram
Kb - kilo-basepairs
LDL - low density lipoprotein
LB - Luria-Bertani
M - molar concentration
µg - microgram
°C - degrees Celsius
PEG - polyethylene glycol
pfu - plaque forming units
rpm - revolutions per minute
ROn - Retroposon Oreochromis niloticus
SDS - sodium dodecyl sulfate
TE - Tris EDTA
tRNA - transfer ribonucleic acid
U - units
UV - ultraviolet
V/cm - volts per centimeter
1.1 Repetitive DNA Sequences
Early work on the organization of eukaryotic genomes utilizing
renaturation kinetics (Britten and Kohne 1968) revealed that, generally, a
large proportion of the genome consisted of repeated DNA sequences. The
remainder consists of unique sequence or protein-coding sequences and
constitutes less than 10% of the genome. Repetitive sequences were first
observed as peaks, or "satellites", flanking the main genomic fraction in
buoyant density gradient analysis due to their biased nucleotide composition
(Kit 1961; reviewed in Miklos 1985). Reassociation experiments have indicated
that the genomic DNA of higher eukaryotes can be subdivided into three major
classes: highly repetitive, moderately repetitive and unique sequence DNA
(Britten and Kohne 1968). Repeated DNA sequences in the eukaryotic genome
fall into two types, classified according to their structure,
distribution and reiteration frequency: clustered, tandemly
repeated DNA sequences and interspersed DNA sequences (Singer 1982;
Weiner et al. 1986).
1.2 Tandemly Arrayed DNA Sequences
Tandemly repeated or arrayed sequences, commonly known as satellite
DNAs, consist of head-to-tail monomeric DNA sequence repeats that vary in
length from just a few to several hundred base pairs (Brutlag 1980; Miklos
1985). These repetitive elements have been further classified, based on the
size of the monomeric unit within the array (reviewed in Charlesworth 1994),
as satellites, minisatellites (Jeffreys et al. 1985 a, b) or microsatellites
(Dover 1989; Tautz 1989).
Satellite DNA sequences are characterized by a monomer length
generally ranging from 100 to 2000 bp. Typically they are organized
in very large clusters of up to 100 megabases and are localized in chromosomal
heterochromatic regions, especially at or near centromeres and telomeres
(reviewed in Charlesworth 1994 and references therein).
Microsatellites, also termed simple sequence repeats, consist of
tandemly arrayed stretches or tracts of nucleotide motifs about 1-10 base
pairs in length (Tautz 1989). Generally they are less than 400 bp in length and
they have been identified in vertebrate, plant and insect genomes but not in
yeast (Bruford and Wayne 1993; Stallings et al. 1991). Arrays of simple
repetitive DNA differ in length, organization and base composition and they are
widely dispersed throughout the genome, comprising up to 5% of the genomic
DNA content. Simple sequence DNA is more ubiquitous in eukaryotes than in
prokaryotes, and simple motifs of 1-3 bp in length occur with a frequency 5-10
times higher than random motifs. Motifs found in microsatellites are
generally polypyrimidine or polypurine and poly CA motifs (Frank et al. 1991).
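The definition of a microsatellite given above, a short motif reiterated in tandem, can be illustrated with a minimal sketch. It is not part of the methods used in this study; the function name `find_microsatellites`, the motif-length limit and the minimum repeat count are assumptions chosen only for illustration.

```python
import re

def find_microsatellites(seq, max_motif=3, min_repeats=4):
    """Return (start, motif, repeat_count) for tandem runs of short motifs.

    max_motif and min_repeats are illustrative thresholds (assumptions),
    not values taken from the literature cited in the text.
    """
    # A lazily matched motif of 1..max_motif bases, followed by at least
    # (min_repeats - 1) further exact copies of that same motif.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Toy sequence: a (CA)6 tract embedded in unique flanking sequence.
example = "GGT" + "CA" * 6 + "TTG"
print(find_microsatellites(example))
```

Because the motif is matched lazily, a run such as AAAAAA is reported as six copies of A rather than three copies of AA.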
Minisatellites, otherwise known as VNTRs (Variable Number Tandem
Repeats) because of the differences in the number of repeat units at a
particular locus (Jeffreys et al. 1985 a, b), are sequences of
DNA, 9-65 bp in length, that are reiterated in tandem, forming arrays up to 20
kbp long. They feature a high GC content (although this may be biased by
methods of isolation) and strand asymmetry, and the arrays are flanked on
either side by unique sequence DNA (Wright 1994). The molecular basis of the
variability of these sequences lies within the tandemly repeated structures:
minisatellites frequently show substantial allelic
variation in the number of repeat units, and sequence analysis of cloned
minisatellites has shown that the repeat units within a minisatellite are
seldom all identical but usually display some variation in sequence between
repeats. It has been shown that minisatellites exist as families, the members
of which are related by homology of the core unit of their tandem repeats and
are scattered throughout the genome (Jeffreys et al. 1985 a, b).
Tandem arrays are clustered mostly in telomeric, centromeric and
heterochromatic regions of chromosomes (Miklos 1982; Frank et al. 1991).
Tandemly repeated sequences are thought to be generated by gene duplication
events at the DNA level; possible mechanisms include unequal
crossover, replication slippage during DNA synthesis and rolling circle
replication (Denison and Weiner 1982; DiRienzo et al. 1994; Levinson and
Gutman 1987; Schlötterer and Tautz 1992; Singer and Berg 1991).
1.3 Interspersed Repeats
In addition to the highly repetitive tandemly repeated DNA sequences, the
genome also contains repeated sequences that are interspersed among single-copy
sequences and are referred to as transposable elements (Miklos 1985;
Charlesworth et al. 1994). Transposable elements are capable of inserting
copies of themselves into new genomic locations (Berg and Howe 1989) and are
classified into two major groups, based on their mode of transposition or
mechanism of action. The first group includes DNA elements that are able to
transpose directly from DNA to DNA. These elements are characterized by
small inverted terminal repeats and contain an internal sequence that encodes
a functional transposase capable of functioning in DNA-only mediated
transpositions (Finnegan 1989, 1992).
The second group comprises the retrotransposons. They are made up of
transposable elements that transpose by reverse transcription of an RNA
intermediate back into the genome. Weiner et al. (1986) classified dispersed
repetitive DNAs on the basis of their putative origin, viral or nonviral, referring
to the two retroelement subclasses. Since a number of repeated DNA
elements in eukaryotes have structural similarities to retroviruses and appear
to be repetitive DNAs of viral origin, they are referred to as class I
retrotransposons or LTR retrotransposons. They are characterized by the
presence of direct long terminal repeats (LTRs) flanking an internal sequence
that may contain one or more open reading frames. Generally, they possess
genes coding for products containing structural homology to the retroviral
gag- and pol-encoded proteins such as reverse transcriptase. The second class of
retrotransposons are the non-LTR retrotransposons, also called the non-viral
retrotransposons. This class of repetitive elements includes the Long
Interspersed Nuclear Elements (LINEs) and the Short Interspersed Nuclear
Elements (SINEs). The LINEs also encode pol-like proteins; however, they do
not possess LTRs but they do have a poly-A tail at the 3 prime terminus.
SINEs, on the other hand, do not encode a reverse transcriptase necessary for
retrotransposition. SINEs and LINEs do not have a completely random
distribution in the genome (Deininger et al. 1989; Hutchison et al. 1989).
While the distribution of repeated sequences correlates with the general
structural features of chromosomes, no retroelement has been found to be
exclusively restricted to a particular chromosomal location. However, it has
been shown that human Alu sequences or SINEs are preferentially located in
GC-rich or R (reverse G bands) banding regions of chromosomes known to
contain a high density of active genes (Wichman et al. 1992; Antequera and
Bird 1993; Chen and Manuelidis 1989). LINE L1 repetitive elements as well as
some retrovirus-like elements are preferentially located in AT-rich G banding
regions of chromosomes (Wichman et al. 1992).
SINEs are short interspersed elements, less than 500 bp in length, that
contain promoters for RNA polymerase III transcription (Deininger 1989).
LINEs, however, are much longer and are believed to be active retroposons
because they encode proteins that are thought to mediate their own
retroposition. Both LINEs and SINEs are referred to as "retroposons"
because they use RNA intermediates during amplification and they do not
possess a retrovirus-like structure. A retroposon may be defined as a
nucleotide sequence, present initially as a cellular RNA transcript, that has
been incorporated back into the genome, presumably via reverse
transcription and generation of a cDNA intermediate (Jagadeeswaran et al.
1991). Deininger et al. (1993) refined this definition, limiting
retroposons to those elements that do not code for any of the proteins required
during their duplication, and use the terms SINE and retroposon
interchangeably.
1.4 LINEs
LINEs were initially defined as a family of long repeated DNA sequences
dispersed in the genomes of humans, primates and rodents; they make up a
significant portion of the mammalian genome. They differ from SINE elements
in that they are generally much larger, greater than 5 kb in length, are usually
present at copy numbers of approximately 10^4-10^5 per haploid mammalian
genome (Hutchison et al. 1989) and are thought to be remnants of
retroposition events, via reverse transcription, of genes transcribed by RNA
polymerase II. All LINEs isolated to date belong to the L1 family of repetitive
elements that have been well characterized in mammalian genomes. Typically,
all L1 sequences are structurally similar in that they exhibit a
consensus polyadenylation sequence at their 3 prime end as well as 5 prime
and 3 prime untranslated regions that are of variable size and sequence
composition. They differ from retroviral transposons in that they lack the long
terminal repeats required for self-expression, but they contain two open
reading frames, ORF 1 and ORF 2 (Hutchison et al. 1989). ORF 2 encodes a
reverse transcriptase and is believed to code for enzymes that may play a role
in mediating their own retroposition (Mathias et al. 1991). Not all L1 elements
isolated have been full length DNA sequences, since they are often truncated
at their 5 prime ends. Truncation at the 5 prime ends of LINEs has been
attributed to incomplete reverse transcription. Other L1 elements exhibit 5
prime inversions or deletions (Hutchison et al. 1989).
1.5 SINES
SINEs are repetitive DNA elements of approximately 73-500 bp in length
and are usually present in copy numbers of approximately 10^3-10^5 per haploid
genome (Deininger et al. 1989). SINEs have a composite structure consisting
of a 5 prime region with sequence identity to transfer RNAs, which tends to be
conserved in families, followed by a tRNA-unrelated region of variable length in
the center and an A + T rich region at the 3 prime end. The A + T rich region
may not always be present but may be replaced with an A-rich motif or some
other short simple repeating unit (Okada and Ohshima 1993). SINEs are
flanked by short direct repeats, which vary in length and base sequence. The
direct repeats represent a target site duplication of genomic DNA generated by
the repair of a staggered break formed at the SINE insertion site and are not
part of the repetitive family. The tRNA-related region contains the RNA
polymerase III promoter A and B boxes required for transcription. Normally,
RNA polymerase III is responsible for the transcription of small nuclear
RNAs (snRNAs), transfer RNAs (tRNAs) and 5S ribosomal RNA (5S rRNA)
(Deininger 1989).
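The flanking direct repeats described above can be illustrated with a short sketch that, given the coordinates of an element within a longer sequence, looks for the longest stretch immediately upstream that is repeated immediately downstream. This is only an illustration under stated assumptions: the function name `flanking_direct_repeat`, the `max_len` limit and the toy sequence are hypothetical, and exact matching is required, whereas real target site duplications may contain mismatches.

```python
def flanking_direct_repeat(seq, start, end, max_len=20):
    """Return the longest exact direct repeat flanking seq[start:end].

    The repeat is the longest k-mer (k <= max_len) that lies immediately
    upstream of `start` and is found again immediately downstream of `end`.
    An empty string means no flanking repeat was found.
    """
    best = ""
    # k cannot exceed the sequence available on either side of the element.
    limit = min(max_len, start, len(seq) - end)
    for k in range(1, limit + 1):
        upstream = seq[start - k:start]
        downstream = seq[end:end + k]
        if upstream == downstream:
            best = upstream
    return best

# Toy construct (hypothetical sequences): flank + TSD + element + TSD + flank.
tsd = "ACGTTGCA"
element = "GGCTCAGTTGGTAGAGC" + "A" * 8   # arbitrary body plus an A-rich tail
genomic = "TTT" + tsd + element + tsd + "CCC"
start = 3 + len(tsd)
end = start + len(element)
print(flanking_direct_repeat(genomic, start, end))
```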
The best characterized SINE is the Alu element. The Alu elements are
repetitive DNA elements found as families in primates that contain a unique
restriction site at the 5 prime terminus. They can be transcribed in vitro by
RNA polymerase III into a snRNA which has sequence identity to the 7SL
RNA component of the signal recognition particle, which plays a key role in
intra-cellular protein transport (Weiner 1980). The Alu family shares
approximately 90% sequence identity with the 7SL RNA gene but is missing
about 150 bp of sequence seen in the middle of the 7SL RNA gene. The human
Alu family accounts for approximately 5% of the human genome and is present at
roughly 500,000 copies (Rubin
et al. 1980). The modern Alu element is about 300 bp long, contains a well
defined RNA polymerase III promoter, is flanked by direct repeats, has a 3
prime oligo(dA)-rich tail region and is composed of two related sequences
(Fuhrman et al. 1981; Deininger et al. 1981). The left and right monomers of
the Alu sequence are arranged in tandem, presenting as a dimer-like structure.
This dimeric organization is a common feature in primates. An alignment of
both halves of the Alu element indicated that there is 68% sequence identity
between the monomers, with the right half containing an extra 31 bases not
seen in the left half. The RNA polymerase III promoter is seen only in the left
monomer and it directs the transcription of the entire element (Okada 1991a).
There is no apparent promoter function in the right monomer (Deininger 1989).
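Sequence identity figures such as those quoted above for the two Alu monomers are derived from pairwise alignments. A minimal sketch of one common way to compute percent identity from two pre-aligned sequences follows. The cited sources do not state here how gaps were treated, so the convention used (columns containing a gap are excluded) is an assumption, as are the function name `percent_identity` and the toy sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither row has a gap.

    Both inputs must already be aligned and therefore of equal length.
    Excluding gapped columns is one convention among several (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns with an insertion or deletion
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Toy alignment (hypothetical): 8 ungapped columns, 7 of them identical.
print(percent_identity("ACGT-ACGTA", "ACGAAAC-TA"))
```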
The rodent B1 element, found in the mouse and other rodents, is similarly
related to the 7SL RNA gene. The B1 element, like the Alu element, is a
sequence of about 140 bp and contains an internal tandem repeat of 29 bp and
a 9 bp deletion when compared to the left Alu monomer (Krayev et al. 1980,
1982; Haynes et al. 1981). The major structural difference between the two is
that the B1 element is present as a monomeric unit, whereas Alu is dimeric in
structure. Both are thought to be derived from the 7SL RNA gene, but
each family is thought to have arisen independently and by different
mechanisms (Ullu and Tschudi 1984). Recently, free left Alu monomers
(FLAM) and free right Alu monomers (FRAM) have been detected in primates
(Quentin 1994). The free left Alu monomers are composed of at least two
subfamilies, each characterized by point mutations at diagnostic positions, and
are thought to fill the gap between the 7SL RNA gene and modern Alu
elements (Quentin 1994).
It is not uncommon for SINEs to fuse and form composite transposable
elements composed of related or unrelated subunits.
The human Alu element possesses a dimeric structure composed of two related
sequences, as discussed earlier. A SINE element has also been found in the
genome of the prosimian Galago crassicaudatus, referred to as the Galago type
II family. This element is a fusion of a 7SL-derived repeat with a tRNA-derived
repeat (Daniels and Deininger 1983). In both examples, the right monomers of
these elements appear to lack any function. Recently a family of composite,
tRNA-derived SINEs was reported (Izsvak et al. 1996). The DANA SINE
family, specific to the genus Danio, has a unique structure composed of a
tRNA-derived region followed by multiple unrelated sequence blocks. It appears that
these elements are derived from the assembly of short sequences into a
tRNA-derived element, which was subsequently amplified as a new transposable
element (Izsvak et al. 1996).
The Cp1 SINE found in chironomids also possesses a cassette type of
structure (He et al. 1995). These SINEs are polymorphic, consisting of two
sequence modules, A and B, found in different numbers and in variable orders
relative to each other. The B module contains the polymerase III promoter
boxes. They often have inverted segments at the 5 prime ends that have been
shown to be related to module elimination in this area. SINEs have also been
shown to be associated with other repetitive sequences (Izsvak et al. 1996;
Shimoda et al. 1996a; He et al. 1995; Takasaki et al. 1994).
SINEs can be assigned to either of two large superfamilies, those related
to 7SL RNA and those related to tRNAs. The Alu repeats and the rodent B1
repeats are related to 7SL RNA. Unlike the Alu elements, most SINEs are
thought to be derived from tRNAs and were often regarded as tRNA
pseudogenes (Okada 1991a). Early work in this field provided evidence for the
existence of other families of SINEs in mammals. These elements were
characterized and referred to as 'Alu-like' elements but were distinctly different from
Alu families (Okada 1991a). It was determined that these SINEs were not
simple tRNA pseudogenes but were actually composed of a tRNA-like region
followed by a tRNA-unlike region and an AT-rich region, as discussed earlier.
Furthermore, it has been shown that SINEs are widespread throughout the
animal and plant kingdoms (Deininger et al. 1989).
SINEs related to tRNAs or derived ancestrally from tRNA precursors
have been characterized in humans and referred to as mammalian-wide
interspersed repeats (MIRs) because they are ubiquitous in all placental
mammals (Smit and Riggs 1995; Jurka et al. 1995). Other tRNA-related
SINEs have been identified in vertebrates and invertebrates, and include:
in rodents, the B2 element and ID sequences (Sakamoto and Okada
1985; Lawrence et al. 1985; Daniels and Deininger 1985); the porcine SINE
(Frengen et al. 1991); canoid SINEs (Coltman and Wright 1994; Minnick et al.
1992); the equine ERE SINE family (Sakagami et al. 1994); the squid SK
family (Okada and Ohshima 1994) and the octopus OK, OR1 and OR2 families
(Oshima and Okada 1995); in higher plants, rice SINEs (Hirano et al. 1994;
Mochizuki et al. 1992), the oil seed rape Brassica napus SINE (S1Bn) (Deragon et
al. 1994) and the tobacco TS family of SINEs (Yoshioka et al. 1993); the insect
Cp1 SINE (He et al. 1995); in fish, the salmon Sma I family, the charr Fok I
family and the salmonid Hpa I family (Kido et al. 1991) and the zebrafish
DANA SINEs (Izsvak et al. 1996); the rabbit C and goat repeats (Sakamoto
and Okada 1985); and, in fungi, a SINE in Erysiphe graminis, which causes a
powdery mildew (Rasmussen et al. 1993), and the Mg-SINE in the rice blast fungus
(Kachroo et al. 1995). The presence of a given SINE is usually restricted to
relatively few related species, but the recently characterized "mermaid" family
of SINEs is the most widespread SINE currently found in the animal kingdom,
with members present in mammals, frogs and fish (Shimoda et al. 1996a).
1.6 Initial Generation of SINES
There are two contrasting theories about the initial generation of SINE
elements. Deininger and Daniels (1986) have argued that SINEs may have
arisen from tDNAs that accumulated mutations at a neutral rate but had no
effect on tRNA function. Okada (1991b) has a contrasting point of view and
suggests that the tRNA-related region of several SINES was derived from
tRNA and not DNA. He points out that a CCA motif, like that
found at the 3 prime ends of mature tRNAs, is also found at the 3 prime end of
the tRNA region of these SINEs. These contrasting points of view shed no light
on the genesis of the composite structure found in SINEs, but
recently a possible model for the initial generation of SINEs has been proposed
(Oshima et al. 1993; Okada and Oshima 1993). The discovery of large
numbers of tRNA-related SINEs allowed researchers to categorize SINEs in
terms of their relatedness to possible parent tRNA genes based on the primary
and secondary structures of the repetitive elements. This led to the emergence
of superfamilies of SINEs based on homology to parental tRNA genes. A
majority of SINEs have been categorized as being members of the
tRNA-lysine-related SINE superfamily because of their homology to tRNA-lysine
(Okada and Oshima 1993). The next most common superfamilies are the
tRNA-glycine- and tRNA-arginine-related SINEs.
Oshima et al. (1993) aligned the consensus sequences from five different
SINEs with tRNA-lysine-related sequence structures from phylogenetically
distinct species. The SINEs used were the charr Fok I family, the salmon Sma
I family, the squid SK family, the rodent type 2 (B2) family and the tortoise Pol
III/SINE family. It was found that in the tRNA-unrelated region, two
sequence motifs, GATCTG and TGG, separated by 10-11 nucleotides, are highly
conserved. The results indicate that the similarities were significant and that
the two conserved motifs may be functionally important in the genesis and
maintenance of these elements. Similar sequence motifs were also found to be
present in the U5 sequences of several mammalian retroviruses that utilize
tRNA-lysine as a primer during reverse transcription. This has led to a
proposed model for the initial generation of SINEs referred to as the strong
stop DNA model (Oshima et al. 1993; Okada and Oshima 1993). In this model
the 3 prime end of a tRNA-lysine hybridizes to the primer
binding site in the viral genome. The viral genome is reverse transcribed from
the CCA motif at the 3 prime end toward the 5 prime end of the genome. The
product is a single stranded DNA with tRNA-lysine at its 5 prime terminus.
This is referred to as the "strong stop DNA". During reverse transcription the
transcribed DNA sequence "jumps" to the 3 prime end of the viral genome due
to the presence of flanking repeats. Subsequently, through a number of
unknown processes, the primer tRNA sequence is not removed and is either copied
into DNA or inserted directly into the genome as a tRNA-DNA hybrid.
This process produces a tRNA-lysine pseudogene (Oshima et al. 1993; Okada
and Oshima 1993). The model also accounts for the presence of the CCA motif
found at the 3 prime terminus of the tRNA-related regions of most SINEs.
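The pair of conserved motifs described above for the tRNA-unrelated region lends itself to a simple pattern search. The sketch below assumes the two motifs read GATCTG and TGG and that the 10-11 nucleotide spacer may be any base; the function name `find_conserved_motifs` and the toy sequences are hypothetical, and only exact motif matches are reported.

```python
import re

# GATCTG, then a spacer of 10 or 11 arbitrary bases, then TGG.
# The motif spellings are an assumed reading of the cited description.
MOTIF_PAIR = re.compile(r"GATCTG[ACGT]{10,11}TGG")

def find_conserved_motifs(seq):
    """Return (start, matched_text) for each exact occurrence of the motif pair."""
    return [(m.start(), m.group(0)) for m in MOTIF_PAIR.finditer(seq.upper())]

# Toy sequences (hypothetical): one with a 10 nt spacer, one with a 5 nt spacer.
with_pair = "CCC" + "GATCTG" + "ACGTACGTAC" + "TGG" + "CC"
without_pair = "CCC" + "GATCTG" + "ACGTA" + "TGG" + "CC"
print(find_conserved_motifs(with_pair))
print(find_conserved_motifs(without_pair))
```

A search tolerant of mismatches would be needed for diverged copies; exact matching is used here only to keep the example short.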
Recently a family of LINEs that were very similar to those of the avian
CR1 LINE family was isolated from the turtle genome (Oshima et al. 1996).
The 3 prime end region of the turtle CR1-like LINEs was reported to be shared
with the 3 prime end of the tRNA-unrelated region of the tortoise
Pol III/SINE, suggesting that a common mechanism may be responsible for the
retroposition of both elements. These authors suggest that tRNA-derived
SINEs may have a chimeric structure, with a tRNA-related region
along with the left half of the tRNA-unrelated region, and a right half of the
tRNA-unrelated region which is homologous to the 3 prime end of a LINE.
They now suggest that SINEs may have been generated by a recombination
between a strong-stop DNA with a primer tRNA and the DNA from the 3
prime end of a LINE (Oshima et al. 1996). The isolation and characterization
of other SINEs and retroviral sequences, along with the elucidation of the
intermediate processes, would be required to validate the hypothesis.
1.7 SINE Subfamilies
Many of the major SINE families can be divided into subfamilies which
are defined in terms of common nucleotide variations at diagnostic locations
(Deininger et al. 1993). Subfamily structures have been reported for several
SINEs including the Alu repeats. Several groups have divided the Alu repeats
into different subfamilies which appear to have arisen at different times in the
primate genome. There are differences of opinion regarding the sequence
assignment of each subfamily and the exact number of Alu subfamilies. The
major Alu subfamilies include the Predicted Variant (PV) or Human Specific
(HS) subfamily, the Precise subfamily and the Major subfamily, and each
subfamily can also be divided into several subgroups (Batzer et al. 1990;
Matera et al. 1990; Okada 1991; Schmidt and Maria 1992). Each subfamily
was apparently inserted back into the genome at a different time and each
shares blocks of nucleotides that differ from the current consensus
sequence at diagnostic positions (Slagel et al. 1987; Willard et al. 1987; Britten
et al. 1988; Quentin 1988; Jurka and Smith 1988). The extent of divergence
correlates with the time of appearance, with the youngest subfamily having the most
diagnostic changes compared to the consensus sequence. The oldest Alu
subfamily is likely to be very similar to 7SL DNA at diagnostic positions while
the youngest subfamilies have diverged from it. Although older Alus differ by
mutations that mostly accumulated after retroinsertion, the analysis of
subfamilies suggests that part of the sequence diversity of young Alus is due to
diversity of source or founder genes generated by successive waves of
amplification over time. SINEs in species other than primates have also been
reported to have undergone successive waves of amplification; they include
the rodent B1 family (W b et al. 1983; Quentin 1989), the rodent B2 family
(Rogers 1985; Bains and Smith 1989), the rabbit C repeats (Krane et al.
1991), the tobacco TS family (Yoshioka et al. 1993) and the salmon families
(Kido et al. 1994). These results indicate that retroposition is not a single
discrete event but rather an ongoing process.
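The idea of assigning a SINE copy to a subfamily on the basis of diagnostic positions, as described above, can be sketched as follows. The subfamily names, positions and bases in the example are entirely hypothetical and do not correspond to any published Alu subfamily definition; the function name `classify_by_diagnostic_positions` is likewise an assumption made for illustration.

```python
def classify_by_diagnostic_positions(seq, subfamilies):
    """Score a sequence against each subfamily's diagnostic bases.

    subfamilies maps a subfamily name to {0-based position: expected base}.
    Returns (best_name, scores); ties go to the first subfamily listed.
    """
    scores = {}
    for name, diagnostic in subfamilies.items():
        scores[name] = sum(
            1
            for pos, base in diagnostic.items()
            if pos < len(seq) and seq[pos] == base
        )
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical diagnostic tables for two made-up subfamilies.
toy_subfamilies = {
    "older": {2: "A", 5: "C", 7: "G"},
    "younger": {2: "T", 5: "G", 7: "G"},
}
print(classify_by_diagnostic_positions("ACTGAGTGAA", toy_subfamilies))
```

In practice the element would first be aligned to the family consensus so that positions are comparable between copies; that step is omitted here.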
1.8 Mechanism of Subfamily Generation
The precise mechanism of SINE amplification is unknown, as are the
forces that govern the amplification of the various SINE families. However,
several plausible models have been put forward to account for retropositional
events. Deininger et al. (1992) identified several subfamilies of human Alu
sequences. Based on their observations, they hypothesized that retroposition
events are due to a single master Alu source gene. In the "Master Gene
Model," Deininger argues that a single master gene locus is responsible for the
amplification of all subfamilies of Alu sequences. However, the copies that this
master gene creates are rarely active in retroposition. The reasons why most
copies are incapable of retroposition are not completely understood; however,
Schmidt and Maria (1992) have discussed a number of potentially important
factors influencing retroposition. One of the most important implications of
the master gene model is that it predicts that the master Alu gene has a
defined function that must be maintained to ensure the survival of the
organism.
An alternative proposal was submitted by Schmidt and Maria (1992).
Based on their observations on a recently amplified Alu subfamily, they have
proposed that multiple Alu elements are or were potential sources for ongoing
retroposition, with some elements being more successful than others. In their
transposon model, they propose that the retroposition of each source element
may be affected by many factors at different levels. These factors may
include upstream elements or factors controlled by the chromatin context near
a newly transposed SINE element, methylation or mutations within the
individual elements, cis-acting and trans-acting elements that affect
transcriptional activity, RNA processing or poly A metabolism, such that the
detailed structure of the RNA transcript being generated from different copies
may also influence the efficiency of reverse transcription of the element. Since
the effects of these factors may differ among species, frequencies of
retroposition may also be different among specific lineages.
Deragon et al. (1994) support the master gene model based on the
random distribution of diagnostic mutations in the S1Bn repetitive elements in
the genome of oilseed rape, Brassica napus. On the other hand, Schmidt and
Maria (1992) support the transposon model for the formation of the Predicted
Variant and Precise Alu subfamilies. Murata et al. (1996) and Takasaki et al.
(1994, 1996) both support the hypothesis that multiple source genes are
responsible for the generation of Hpa I subfamilies in salmonids. Thus the
mechanism for the generation of SINE subfamilies still remains to be
resolved.
1.9 SINE Function
The genome of higher eukaryotes is known to contain a large amount of
what was thought to be useless interspersed and tandemly repeated
sequence elements. Our knowledge regarding the function or biological
significance of repetitive sequences is limited, but there are opposing views as
to the propagation and maintenance of repetitive DNAs within all eukaryotic
genomes. Non-functionalists maintain that repetitive DNA is parasitic or
selfish DNA (Doolittle and Sapienza 1980; Orgel and Crick 1980). They argue
that repetitive DNA exists as a result of a sequence-specific strategy whose
only function is to maintain and increase its numbers in the genome and that
this is independent of the organism's phenotype. Functionalists argue that
repetitive DNAs are maintained because they directly contribute to the
functioning of the genome. This is supported by indications that satellite DNA
is involved in the organization and function of chromatin and that transposable
elements have a significant impact on their genomes (Charlesworth et al.
1994; Wichman et al. 1992).
SINEs form a unique class of transposable elements. Although no
functional role has been demonstrated for them, the data are still incomplete.
It is likely that SINEs have a major impact on genomes, and their most
obvious effect is sequence disruption at their sites of integration.
Insertion of these elements at previously unoccupied sites has been shown to
result in the inactivation of genes, as in the case of an Alu insertion into the
intron of the NF1 gene that produced a shift in the reading frame and resulted
in neurofibromatosis type 1 (Wallace et al. 1991). SINEs have been identified in
illegitimate recombination, such as the deleted LDL receptor gene that resulted
from a recombination between two Alu sequences. The receptor lacked the
normal membrane spanning region, affecting internalization of the receptor, and
resulted in familial hypercholesterolemia (Lehrman et al. 1987). Most studies
clearly outline the mutagenic effects of repetitive elements in genomes and
suggest that these effects are consequences rather than the cause of repeated
sequences. Recently it has been shown that cell stress and translational
specifically SINEs, have been considered powerful phylogenetic markers
because they appear to be inserted irreversibly into the genome (Okada
1991a). The utility of SINEs as tools will facilitate these studies as well as
provide insight into genetic mechanisms, gene regulation and developmental
processes, and provide a greater database with which the genomes of higher
eukaryotes can be compared in comparative studies. Furthermore, they may
have the potential to elucidate possible roles of SINEs in the genomic
organization and speciation of cichlids. For example, SINEs were used to
verify the reclassification of steelhead trout from Salmo to Oncorhynchus
(Murata et al. 1993) and the Y (chromosome) Alu polymorphic element (YAP
element) was used as a marker to study human population history (Hammer
1994).
1.11 Goals Of This Study
This study originally set out to isolate, from Oreochromis niloticus,
male-specific DNA sequences from the Y chromosome that would be capable of
identifying or distinguishing genetic males and females. Since no studies to
date have been able to identify heteromorphic sex chromosomes or sex-specific
genetic markers in Oreochromis species, I used the phenol enhanced
reassociation technique (PERT) subtractive hybridization procedure to
selectively enrich for male-specific DNA sequences from Oreochromis niloticus
(Kohne et al. 1977). This technique is based on mixing a small amount of male
"tracer" DNA, which has been digested with a restriction endonuclease and
denatured, with a large excess of denatured, randomly sheared "driver" DNA
from a female, which putatively does not contain Y-specific sequences. When the
DNA is allowed to reanneal, the fraction of "tracer" DNA sequences that is
common to both sexes will be removed by the "driver" sequences, and the DNA
fraction that is unique to the "tracer" sequences will reanneal to complementary
"tracer" DNA. The result is a small fraction of male-specific DNA fragments
capable of being cloned into a suitable vector.
This procedure has yielded sex-specific markers in humans (Kunkel et al.
1976, 1985), gulls (Griffiths and Holland 1990) and chinook salmon (Devlin et
al. 1991). Devlin et al. (1991) is the only published study to have successfully
used the PERT method described by Kunkel et al. (1985) to isolate an
apparently Y-specific DNA sequence in fish. Of the 18 clones analyzed in
chinook salmon (Oncorhynchus tshawytscha), a 250 bp fragment identified an
8 kb male-specific restriction fragment in a Southern blot of Bam HI digested
genomic DNA. A number of other restriction enzymes also provided
male-specific patterns; however, homologous sequences were also seen in female
DNA. The utility of subtractive hybridization procedures to isolate sex-specific
DNA sequences in tilapias is limited by the degree of enrichment for those
sequences. Since "tracer" DNA will be enriched by less than 100-fold by
single-step enrichment techniques, it is possible for non sex-specific "tracer"
sequences to reanneal and be available for cloning (Straus and Ausubel 1990).
In this report, 21 recombinant clones were isolated and tested for
sex specificity when used to probe Southern blots of restricted O. niloticus
genomic DNA. Since no sex-specific marker was detected and five of the cloned
DNA fragments exhibited hybridization patterns of a repetitive nature,
experiments were undertaken to characterize these repetitive DNAs.
I have cloned and characterized a new family of highly repetitive DNA
elements in the genome of O. niloticus, termed ROn-1 (Retroposon O.
niloticus-1), that resemble the tRNA-derived SINEs. I also report partial
sequence data on three other unrelated, but possibly tRNA-derived, SINEs
isolated from the genome of O. niloticus.
MATERIALS AND METHODS
2.1 Fish samples
Samples of Oreochromis niloticus were obtained from breeding stocks at
Dalhousie University. Liver samples were recovered and stored frozen at
-70°C. These fish were siblings and were assumed to have homologous genes at
many loci. DNA samples from other cichlids were provided by Dr. Brendan
McAndrew at the University of Stirling, Stirling, Scotland.
2.2 DNA Isolation
To isolate high molecular weight DNA, liver tissue was pulverized in liquid N2
and then digested in 2.0 ml of proteinase K lysis buffer (10 mM Tris-HCl, 10 mM
EDTA, 400 mM NaCl, 10% sodium dodecylsulfate (SDS) and 20 mg of Proteinase
K/ml) (Sambrook et al. 1989). The mixture was incubated at 55°C for at least
four hours. The remaining protein was precipitated by the addition of NaCl to
1.5 M and the sample subjected to centrifugation at 10 000 x g for 10 minutes.
The supernatant was extracted with one volume of TE saturated phenol (10 mM
Tris-HCl, 1 mM EDTA, pH 8.0) followed by extraction with one volume of a
mixture of phenol/chloroform/isoamyl alcohol (50:50:1). DNA was
precipitated by the addition of a 10% volume of 3 M sodium acetate (pH 5.2)
and 2.5 volumes of absolute ethanol. The precipitate was collected by
centrifugation, washed in 70% ethanol, centrifuged, dried under vacuum and
dissolved in TE buffer to a concentration of approximately 1 mg/ml. DNA
concentrations were determined by spectrophotometry and the quality of the
DNA was determined by subjecting an aliquot of each sample to
electrophoresis on a 0.8% agarose gel. Gels were stained with ethidium
bromide. The DNA was visualized by placing the gel on an ultraviolet
transilluminator and photographed using Kodak Tmax 100 film.
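The spectrophotometric step above can be illustrated with a short calculation.
This is only a sketch: the thesis does not state the conversion factor or the
wavelengths used, so the common rule of thumb that one A260 unit corresponds
to roughly 50 µg/ml of double-stranded DNA (1 cm path length) is an assumption
here, and the function names are hypothetical.

```python
# Sketch: estimate double-stranded DNA concentration from an A260 reading.
# Assumption (not stated in the thesis): 1.0 A260 unit is roughly 50 ug/ml
# of double-stranded DNA in a cuvette with a 1 cm path length.

DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_concentration(a260: float, dilution_factor: float = 1.0) -> float:
    """Return the estimated stock concentration in ug/ml."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("a260 must be >= 0 and dilution_factor must be > 0")
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 ratio, a common check for protein carry-over."""
    if a280 <= 0:
        raise ValueError("a280 must be > 0")
    return a260 / a280


if __name__ == "__main__":
    # Hypothetical reading: a 1:100 dilution of the stock gives A260 = 0.2.
    print(dsdna_concentration(0.2, dilution_factor=100), "ug/ml")
```

Under this assumption, a 1:100 dilution reading A260 = 0.2 would correspond to
a stock of about 1 mg/ml, the target concentration mentioned above.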
2.3 Subtractive Hybridization and Cloning
Subtractive hybridization was performed as described by Devlin et al.
(1991) using one female and one male O. niloticus tilapia. Approximately 250
µg of high molecular weight DNA from a single individual female was randomly
sheared to a size of between 500 and 8000 bp with 10 passes through a sterile
25 gauge needle. The female DNA represented the driver DNA and was
employed to eliminate the non sex-specific sequences from male DNA (2 µg)
digested to completion with Mbo I. The quality of the DNA from both
procedures was assessed by electrophoresis and examination in a 0.8%
agarose gel. The hybridization reaction was conducted in a 10 ml glass
scintillation vial containing 2.5 ml of a solution consisting of 1.25 M NaClO4,
120 mM sodium phosphate, pH 6.8, 12% phenol, equilibrated to pH 7.5 with
Tris base, and a mixture of male and female DNAs. The DNA was boiled for 5
minutes to dissociate DNA strands prior to addition to the hybridization
mixture. The annealing reaction proceeded for 8 days at room temperature
with constant shaking on a Vortex Genie (Fisher) at setting 2, enough to
sustain a mixture possessing a milky appearance. This method was used to
produce a small fraction of double stranded DNA fragments that would possess
Mbo I ends capable of ligation into the Bam HI site of a dephosphorylated pUC
18 vector. After annealing, the products of the hybridization reaction were
extracted with an equal volume of chloroform:isoamyl alcohol (50:1). The
aqueous phase was precipitated twice with absolute ethanol and then dialyzed
against TE buffer. DNA was precipitated with 1/10 volume of 3 M sodium
acetate (pH 5.2) and 2.5 volumes of absolute ethanol, washed in 70% ethanol,
dried and resuspended in TE buffer. DNA samples were stored at -20°C.
Reannealed, putative male-specific DNA fragments containing Mbo I and
Bam HI compatible ends were recovered by ligation into a Bam HI digested,
dephosphorylated pUC 18 plasmid (Pharmacia). T4 DNA ligase (Pharmacia)
was utilized to ligate double stranded molecules with vector DNA under
conditions suitable for cloning molecules possessing compatible ends only. The
ratio of insert DNA to vector DNA was 2:1, measured in available picomole
ends.
Insert ligation conditions (adapted from Sambrook et al. 1989).

Component                       Ligation A (µl)  Ligation B (µl)  Control (µl)
pUC 18 Bam HI/BAP (50 µg/ml)    4.0              4.0              4.0
Insert DNA (0.267 mg/ml)        0.5              0.5              0
10X ligation buffer             1.0              1.0              1.0
T4 DNA ligase (8 U/µl)          0.5              0.5              0.5
dH2O                            2.5              2.5              3.5
ATP (10 mM)                     1.0              1.0              1.0
Ligation was carried out at 16°C for 22 hours. The control was used to
test the integrity of the commercial pUC 18 Bam HI/BAP vector for quality
control.
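The insert:vector ratio above is expressed in picomole ends, which can be
worked out from DNA mass and fragment length. The following is a minimal
sketch, assuming an average mass of about 660 g/mol per base pair of
double-stranded DNA (a standard approximation, not a value given in the
thesis). The function names and the example masses and insert length are
hypothetical.

```python
# Sketch: picomoles of ligatable ends for linear double-stranded DNA.
# Assumption: an average of ~660 g/mol per base pair of dsDNA.

AVG_BP_MASS_G_PER_MOL = 660.0


def pmol_ends(mass_ng: float, length_bp: float) -> float:
    """Picomoles of ends (two per linear molecule) in `mass_ng` of DNA."""
    if mass_ng < 0 or length_bp <= 0:
        raise ValueError("mass_ng must be >= 0 and length_bp must be > 0")
    pmol_molecules = mass_ng * 1000.0 / (length_bp * AVG_BP_MASS_G_PER_MOL)
    return 2.0 * pmol_molecules


def insert_mass_for_ratio(vector_ng: float, vector_bp: float,
                          insert_bp: float, ratio: float = 2.0) -> float:
    """Insert mass (ng) that gives `ratio` insert ends per vector end."""
    return ratio * vector_ng * insert_bp / vector_bp


if __name__ == "__main__":
    # Illustrative values only: 200 ng of linearized pUC 18 (2686 bp) and a
    # hypothetical 500 bp insert at a 2:1 insert:vector end ratio.
    vector_ng, vector_bp, insert_bp = 200.0, 2686.0, 500.0
    needed_ng = insert_mass_for_ratio(vector_ng, vector_bp, insert_bp, 2.0)
    print(f"vector ends: {pmol_ends(vector_ng, vector_bp):.3f} pmol")
    print(f"insert needed: {needed_ng:.1f} ng")
```

Because the reannealed fragments vary in length, a single insert length can
only approximate the true number of ends in the hybridization product.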
2.4 Transformation of pUC 18 Vectors into E. coli
Competent E. coli DH5α cells were transformed with ligated pUC 18
vector following a procedure provided by the manufacturer (Gibco-BRL).
Competent cells were thawed on ice and a 100 µl aliquot of cells was placed
into a prechilled 15 ml Falcon 2059 polypropylene tube on ice. A 1.7 µl aliquot
of a 1/10 dilution of β-mercaptoethanol was added to give a final concentration
of 25 mM. Cells were incubated on ice for ten minutes with gentle swirling every
two minutes. Approximately 50 ng of ligation reaction was added, swirled gently
and left on ice for 30 minutes. The cells were heat shocked at 42°C for 45
seconds and returned to ice. After 2 minutes, 0.9 ml of SOC medium (SOC:
20 g bactotryptone; 5 g yeast extract; 0.5 g NaCl; 10 ml of MgCl2/MgSO4·7H2O
solution; 1 ml of 2 M glucose solution. The MgCl2/MgSO4·7H2O solution is made
up of 12 g MgCl2 and 9.5 g MgSO4·7H2O.) was added and the cells were incubated
at 37°C for one hour with constant shaking at 225 rpm. Transformed cells
(10, 25, 50, 100, 200 or 400 µl amounts) were plated on prewarmed (37°C) LB
(Luria-Bertani) plates (10 g tryptone, 5 g yeast extract, 10 g NaCl, 15 g agar;
in 1 litre of distilled water). Transformants were selected by plating cells on
LB plates containing 100 µg/ml ampicillin (Sigma) and 100 µl of 2% X-gal.
Colonies were allowed to grow overnight at 37°C. Blue colonies containing
non-recombinant plasmids were discarded, while individual white transformants,
presumably containing recombinant plasmids, were picked and replated on
master LB ampicillin plates, grown overnight at 37°C and stored at 4°C.
2.5 Characterization of Recombinant Colonies
Mini-preparations of plasmid DNAs were prepared and screened for the
presence and size of inserts by restriction analysis with Bam HI or Sca I
(Pharmacia). Digestion with Bam HI cleaves the recombinant plasmid
(pUC 18 = 2686 bp), leaving a linear plasmid plus insert. Sca I is a rare cutting
enzyme that cleaves pUC 18 at only one position, thereby linearizing the
plasmid containing the insert. Insert size was assessed on 0.8% - 1.0% agarose
gels after staining with ethidium bromide solution. In samples where Bam HI
was not capable of releasing the insert or the insert was too small to visualize
accurately, Sca I was used to digest the recombinant plasmids because the
combined size of the insert and the vector allowed easier visualization on the
agarose gels and a more accurate assessment of the size of the insert.
In most cases, Bam HI was capable of cleaving out the entire insert from
recombinant plasmids; however, the insert from the p34 clone could not be
released with Bam HI because of a site change at the pUC forward sequencing
primer end of the insert. The cloning site at the other end of the insert, the 3
prime end, remained intact. The expected sequence at the forward sequencing
primer end was AGAG GATCC; however, what was observed was
AGAGATC, which indicated a G deletion and loss of the Bam HI restriction site,
possibly due to damage at the plasmid insertion site.
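The loss of the Bam HI site in the p34 junction can be checked directly from
the two sequences quoted above. The sketch below uses only plain string
matching against the Bam HI (GGATCC) and Mbo I (GATC) recognition sequences;
the helper name `has_site` is hypothetical and not part of any analysis
software used in this study.

```python
# Sketch: test whether a junction sequence still carries a recognition site.
# Both sites are palindromic, so scanning one strand is sufficient.

BAM_HI_SITE = "GGATCC"
MBO_I_SITE = "GATC"


def has_site(sequence: str, site: str) -> bool:
    """Return True if `site` occurs in `sequence` (spaces and case ignored)."""
    cleaned = "".join(sequence.split()).upper()
    return site.upper() in cleaned


if __name__ == "__main__":
    expected = "AGAG GATCC"  # junction expected at the forward primer end
    observed = "AGAGATC"     # junction observed in the p34 clone
    for label, seq in (("expected", expected), ("observed", observed)):
        print(label,
              "Bam HI:", has_site(seq, BAM_HI_SITE),
              "Mbo I:", has_site(seq, MBO_I_SITE))
```

The observed junction keeps the GATC core, so the Mbo I site survives while
the Bam HI site is lost, which is consistent with the failure of Bam HI to
release the insert.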
2.6 Plasmid Preparation
E. coli plasmid clones were grown at 37°C overnight in LB
broth (10 g tryptone, 5 g yeast extract, 10 g NaCl; in 1.0 litre of distilled
water) containing ampicillin (100 µg/ml) in 2059 Falcon tubes (Fisher)
(Sambrook et al. 1989). Using the speed prep protocol (Good and Feinstein
1992), cells were sedimented by centrifugation in 2.0 ml Eppendorf tubes, the
supernatant discarded and the cells resuspended in 200 µl of speed prep
solution A (50 mM Tris-HCl, pH 8.0, 4% Triton X-100, 2.5 M LiCl and 62.5 mM
EDTA) with vortexing. The mixture was extracted with an equal volume of
phenol/chloroform/isoamyl alcohol (50:50:1) and subjected to centrifugation
for 3 minutes at top speed in an Eppendorf microfuge. The aqueous layer was
transferred to a 1.5 ml Eppendorf tube and nucleic acid precipitated by addition
of two volumes of absolute ethanol. The DNA precipitate was subjected to
centrifugation for 6 minutes at high speed, washed in 70% ethanol, dried and
resuspended in 32 µl of TE buffer pH 8.0. DNA concentrations were
approximately 25 ng/ml.
2.7 Restriction Endonuclease Digestion
Genomic DNA (10-20 µg, from a single individual) was digested, according
to the manufacturer's instructions, with high concentrations of the appropriate
restriction endonuclease (Pharmacia) in 50 µl at 37°C for 4 hours. To ensure
complete digestion, the DNA was extracted with phenol/chloroform,
precipitated with ethanol, resuspended in TE buffer, and digested again with the
same restriction enzyme. Samples were assessed on a 1% agarose gel.
2.8 Gel Electrophoresis and Southern Transfer
DNA samples (5-10 µg) digested by various restriction endonucleases
were fractionated by gel electrophoresis in 0.8% or 1.0% agarose gels at 2.5
V/cm for approximately 24 hours in 1X TAE (40 mM Tris acetate, 2 mM
EDTA) electrophoresis buffer. Samples were run on an Owl subgel apparatus
(Fisher). Each gel contained either a Hind III lambda marker or a kilobase
marker (Gibco-BRL) for size estimation by comparison. Gels
were then stained with ethidium bromide and photographed on a UV
transilluminator after electrophoresis. For Southern transfer, DNA was
transferred from agarose gels to nylon membrane (Hybond-N, Amersham)
by vacuum blotting according to the manufacturer's instructions (Pharmacia
VacuGene apparatus). DNA was depurinated in 0.25 M HCl for 20 min, the
gels were denatured in 1.5 M NaCl, 0.5 M NaOH for 20 min, neutralized with
1.0 M Tris-HCl, 1.5 M NaCl for 20 min and transferred with 20X SSC (1X
SSC = 0.15 M NaCl, 0.015 M sodium citrate) for 60 minutes. Membranes
were rinsed in 3X SSC for 1 minute, air dried for 30 minutes and baked at 80°C
for 2 hours. Filters were stored at -20°C in plastic bags.
2.9 Recovery of Plasmid Inserts and Radiolabelling of DNA Probes
Plasmid inserts greater than 100 bp were recovered from recombinant
pUC 18 vectors by digestion with appropriate restriction endonucleases. The
DNA fragments were fractionated by electrophoresis in 0.8% low melting point
agarose gels (Sigma type VII), excised from the gel, purified and labelled by
random priming with [α-32P] dCTP (3,000 Ci/mmol) (Feinberg and Vogelstein
1983, 1984). A 1 mg/ml mixture of hexamer primer DNA, 16 µl of linear DNA
fragment in low melt gel diluted 300-fold with TE buffer, and 12 µl of water
were combined for a final volume of 32 µl. The mixture was denatured by boiling
for 5 minutes and then incubated at 37°C for at least 10 minutes. Ten
microliters of 5X oligo-labelling buffer (250 mM Tris-Cl pH 6.8, 25 mM MgCl2,
1.0 M HEPES pH 6.6, 5 mM β-mercaptoethanol, 2 mM each dATP, dGTP, dTTP),
2 µl BSA (10 mg/ml), 5 µl of 3,000 Ci/mmol [α-32P] dCTP and 2 µl of Klenow
DNA polymerase I (8 U/µl) were added to the reaction mixture. The reaction
was incubated at 37°C overnight and the radiolabelled DNA was purified on a
Sephadex G50 column hydrated with TES (TE + 1% SDS). A tracking dye (0.05%
bromophenol blue and 10% dextran blue) was used to monitor probe
purification. DNA was routinely labelled to a specific activity of 10s cpm/µg.
2.10 Partial Digestions
Genomic DNA from a single O. niloticus male fish was digested with
either Mbo I, Pst I, or Hae III. In the reaction, 16 µg of DNA was digested, in
separate tubes, with 2.5, 1.0, 0.5, 0.25, 0.10, 0.05 or 0.025 units of enzyme per
microgram of genomic DNA for three hours at 37°C. Samples were subjected
to electrophoresis overnight (5 V/cm) on a 1% agarose gel, stained with
ethidium bromide, photographed and Southern blotted to Hybond-N
(Amersham) according to Southern (1975).
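The enzyme series above is given in units per microgram, so the absolute
amount of enzyme in each tube follows by multiplying by the 16 µg of DNA.
A small sketch of that arithmetic is given below; the function name is
hypothetical and the rounding to three decimals is only for display.

```python
# Sketch: total enzyme units per tube for the partial digestion series.
# The DNA mass and units-per-microgram values are those stated in the text.

UNITS_PER_UG = [2.5, 1.0, 0.5, 0.25, 0.10, 0.05, 0.025]
DNA_UG_PER_TUBE = 16.0


def enzyme_units_per_tube(dna_ug: float, units_per_ug: list) -> list:
    """Return the total units of enzyme needed in each tube."""
    if dna_ug < 0:
        raise ValueError("dna_ug must be >= 0")
    return [round(dna_ug * u, 3) for u in units_per_ug]


if __name__ == "__main__":
    totals = enzyme_units_per_tube(DNA_UG_PER_TUBE, UNITS_PER_UG)
    for ratio, total in zip(UNITS_PER_UG, totals):
        print(f"{ratio:>6} U/ug -> {total:>5} U per tube")
```

The series therefore spans 40 units down to 0.4 units per tube, a 100-fold
range of enzyme concentration.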
2.11 Hybridization Conditions
Nylon membranes were incubated for two hours in Westneat
hybridization buffer (7% SDS, 1 mM EDTA (pH 8.0), 0.263 M Na2HPO4, 1%
BSA fraction V) at 60°C in a Hybaid hybridization oven (Westneat et al.
1988). Radiolabelled probe was added to the hybridization solution to a final
concentration of 10^6 cpm/ml and hybridization was allowed to proceed for up to
24 hours. Membranes were washed twice at room temperature in 2X SSC,
0.1% SDS for 20 minutes and once in 0.1X SSC, 0.1% SDS at 60°C for 15
minutes. Medium stringency washing conditions (four changes of O.= SSC,
0.1% SDS for 15 minutes at room temperature) were used on the species blots
(Figures 13 and 14). Each membrane was exposed to Kodak X-AR film at
-70°C with an intensifying screen for 24 hours or more. Blots that were
used repeatedly in hybridization experiments involving different probes were
stripped, after the initial probing experiment, with a boiling solution of 0.1%
SDS (sodium dodecylsulfate) with constant gentle shaking for one hour
(Sambrook et al. 1989). The procedure was repeated twice and the blots were
exposed to X-AR film at -70°C for up to a week to verify that the probe
stripping was successful. Blots were then stored at -20°C wrapped in Saran
wrap.
2.12 Isolation and Subcloning of Repetitive DNAs from a Genomic
Library
Genomic DNA from a single O. niloticus individual was partially digested
with Mbo I and was used to construct a genomic DNA library in the lambda
replacement vector EMBL 3. Aliquots of the bacteriophage library (5.1 x 10^9
pfu/ml) were plated on E. coli NM539 cells and incubated at 37°C for 12 hours
(Sambrook et al. 1989). A total of 1.5 x IO* plaque forming units were plated
out, representing 10% of one genome equivalent. The LB plates were
supplemented with 10 mM MgSO4 and 0.2% maltose. Plaques were
transferred to Hybond-N nylon membranes (Amersham) and denatured by
placing the membranes, plaque side up, on 3MM paper saturated with 0.5 M
NaOH, 1.5 M NaCl for 5 minutes (Sambrook et al. 1989). Membranes were
neutralized on 3MM paper saturated with 0.5 M Tris-HCl, pH 8.0, twice for 4
minutes each and washed for 3 minutes in 2X SSC. Membranes were air dried for
30 minutes and baked at 80°C for 2 hours. Membranes were hybridized with
the appropriate probe, washed under high stringency conditions and exposed to
Kodak X-AR film. Positive plaques were picked and purified by an additional
round of plating and hybridization. Positive plaques were placed in 1 ml of SM
medium containing 1 drop of chloroform in a polypropylene tube and stored at
4°C.
2.13 Plating Bacteriophage Lambda
A sterile 250 ml flask containing 50 ml of sterile LB medium
supplemented with 10 mM MgSO4 and 0.2% maltose
was inoculated with a single colony of NM539 cells grown on an LB plate, and
the culture was allowed to grow overnight at 37°C with constant shaking at 250
rpm. The cells were poured into a sterile 50 ml Falcon tube and sedimented by
centrifugation in a Beckman GPR centrifuge at 3000 rpm for 10 minutes. The
supernatant was discarded and the cells resuspended by vortexing in
approximately 20 ml of sterile 10 mM MgSO4. Cells were diluted to an OD600 = 2,
or approximately 1.6 x 10^9 cells/ml, and stored at 4°C until use (Sambrook et
al. 1989).
Serial (10-fold) dilutions of bacteriophage stocks were prepared in SM
medium. A 100 µl amount of each dilution was dispensed into 15 ml sterile
Falcon 2059 tubes containing 100 µl of plating (NM539) E. coli bacteria and
incubated for 20 minutes at 37°C. Three milliliters of molten LB top agarose
(45-55°C) supplemented with 10 mM MgSO4 was added and the entire mixture
was poured onto prewarmed (37°C), two day old LB plates supplemented with
10 mM MgSO4 (Sambrook et al. 1989). The plates were allowed to harden for a
few minutes and placed in an incubator for up to 12 hours. Plaques were
counted and the titer was determined to be 5.1 x 10^9 pfu/ml.
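The titer follows from the plaque count on a countable plate, the dilution of
the stock on that plate and the volume of diluted phage plated. The sketch
below shows the calculation; the plaque count and dilution in the example are
hypothetical, chosen only so that the result matches the titer reported above,
since the raw counts are not given in the text.

```python
# Sketch: bacteriophage titer from a plaque count.
# titer (pfu/ml) = plaques / (dilution * volume of diluted phage plated in ml)


def phage_titer(plaques: int, dilution: float, plated_volume_ml: float) -> float:
    """Return the titer of the undiluted stock in pfu/ml."""
    if plaques < 0 or dilution <= 0 or plated_volume_ml <= 0:
        raise ValueError("plaques must be >= 0; dilution and volume must be > 0")
    return plaques / (dilution * plated_volume_ml)


if __name__ == "__main__":
    # Hypothetical plate: 51 plaques from 100 ul of a 10^-7 dilution.
    titer = phage_titer(plaques=51, dilution=1e-7, plated_volume_ml=0.1)
    print(f"{titer:.2e} pfu/ml")
```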
2.14 Large Scale Bacteriophage DNA Preparation
Large quantities of bacteriophage DNA were isolated by infection of E.
coli NM539 cells (OD600 = 0.5) growing in 500 ml of LB broth, supplemented
with 10 mM MgSO4 and 0.2% maltose at 37°C, with 10^9 pfu of bacteriophage
lambda (Sambrook et al. 1989). Cultures were incubated for up to five hours
with constant shaking until lysis occurred. Bacteriophage was purified with
the addition of DNAase I and RNAase at a final concentration of 1 mg/ml for
30 minutes. Bacterial debris was removed by the addition of NaCl (to 1 M) and
centrifugation at 10,000 rpm in a Sorvall GSR rotor. Bacteriophage particles
were recovered by the addition of PEG (8000) to 10% final concentration and
centrifugation at 11,000 rpm for 10 minutes at 4°C. Bacteriophage particles
were resuspended in 8 ml SM medium. Polyethylene glycol was removed by
extraction with an equal volume of chloroform. Cesium chloride (0.5 g/ml) was
dissolved in the aqueous phase and the bacteriophage suspension was layered
on top of a cesium chloride gradient in 40 ml polypropylene ultracentrifuge
tubes and subjected to centrifugation in a Beckman SW 28 ultracentrifuge
rotor for 2 hours at 22,000 rpm (Sambrook et al. 1989). Bacteriophage
particles were removed, placed in SM containing cesium chloride at 1.5 g/ml,
and again subjected to centrifugation in the same rotor at 35,000 rpm for 24
hours at 4°C (Sambrook et al. 1989). The bacteriophage pellet was dissolved
in 2 ml SM, dialyzed in buffer (1 mM NaCl, 50 mM Tris-HCl pH 8.0, 10 mM
Vti50 rotor. The plasmid band was removed with an 18 gauge needle and the
ethidium bromide was extracted three times with NaCl-saturated isopropanol.
Samples were dialyzed against two changes of TE buffer, pH 8.0, for 24 hours
each. Plasmid DNA was precipitated by addition of sodium acetate, pH 5.2, to
0.3 M and 2 volumes of 100% ethanol, and then stored at -20°C (Sambrook et
al. 1989). Plasmid DNA was dissolved in 200 to 500 µl of TE and the
concentration determined by spectroscopy.
2.16 Generation of Nested Sets of Deletions
Nested deletion clones were prepared for several lambda subclones using
the Exonuclease III unidirectional deletion mapping protocol (Sambrook et al.
1989). The procedure requires that circular recombinant pUC 18 be linearized
with two restriction enzymes, each cleaving the recombinant plasmid on the
same end of the insert and within the polycloning region of the plasmid. The
restriction enzyme that cleaves the plasmid closest to the insert is required to
produce a blunt or a recessed 3 prime end only (such as Sma I) and the second
restriction enzyme needs to produce a 3 to 4 bp protruding 3 prime terminus.
Both enzymes should cleave the pUC 18 vector at only one site and not cleave
within the insert, thus generating a linear molecule. The 2.3 kb Eco RI
fragment from p80h9.2, the 1.4 kb Eco RI fragment from p34h7.1 and the 2.0
kb fragment from p80h9.1 subclones were linearized with Sma I and Sph I as
required by the protocol. Plasmids (5 µg) were digested (quality was assessed
by electrophoresis), extracted with phenol:chloroform, precipitated and
resuspended in 60 µl of 1X Exonuclease III buffer (10X Exonuclease III buffer =
0.66 M Tris-HCl (pH 8.0) and 66 mM MgCl2). Samples were deleted at 37°C with
500 U Exonuclease III and aliquots were removed at 30 second intervals,
allowing a 200 bp deletion of each plasmid subclone per interval, and placed
into separate tubes containing S1 nuclease mixture (60 U S1 nuclease, 27 µl S1
buffer {10X S1 buffer = 5 M NaCl; 3 M potassium acetate, pH 4.5; 50%
glycerol and 1 M ZnSO4} and 172 µl water) at room temperature.
Approximately 1 µl of S1 stop mixture (0.3 M Tris base, 50 mM EDTA, pH 8.0)
was added to each aliquot and samples were heated to 70°C to stop the reaction.
A portion of each aliquot was electrophoresed to assess the extent of digestion
by Exonuclease III and the appropriate aliquots were chosen for re-ligation and
transformed into E. coli cells. The p80A7.3-2.5 Sac I/Rsa I subclone derived
from the 4.5 kb Eco RI p80A7.3 subclone was deleted with Exonuclease III.
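The time course above implies a simple relationship between insert length and
the number of aliquots needed to span it. The sketch below assumes that about
200 bp are removed per 30 second interval, as described in the text, and that
digestion proceeds at a constant rate; both the constant-rate assumption and
the function names are illustrative only.

```python
# Sketch: planning an Exonuclease III deletion time course.
# Assumes ~200 bp removed per 30 s interval at a constant rate.
import math

BP_PER_INTERVAL = 200
INTERVAL_S = 30


def aliquots_needed(insert_bp: int) -> int:
    """Number of timed aliquots needed to delete across the whole insert."""
    if insert_bp <= 0:
        raise ValueError("insert_bp must be > 0")
    return math.ceil(insert_bp / BP_PER_INTERVAL)


def digestion_time_s(insert_bp: int) -> int:
    """Total digestion time, in seconds, to reach the far end of the insert."""
    return aliquots_needed(insert_bp) * INTERVAL_S


def expected_deletion_bp(elapsed_s: int) -> int:
    """Approximate deletion size for an aliquot taken at `elapsed_s`."""
    return BP_PER_INTERVAL * (elapsed_s // INTERVAL_S)


if __name__ == "__main__":
    # Insert sizes of the three subclones named in the text.
    for size in (2300, 1400, 2000):
        print(size, "bp:", aliquots_needed(size), "aliquots,",
              digestion_time_s(size), "s")
```

In practice the rate depends on temperature and enzyme lot, which is why a
portion of each aliquot was checked by electrophoresis before re-ligation.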
2.17 DNA Sequencing and Analysis
Double stranded recombinant pUC 18 templates were sequenced by the
dideoxy chain termination method (Sanger et al. 1977) using [α-35S] dATP
(1000 Ci/mmol, Dupont) and a T7 sequencing kit (Pharmacia). The universal
primer was used to sequence in the forward direction and the M13 reverse
primer was used to sequence in the reverse direction. The reaction products
were fractionated by electrophoresis on either 5% or 8% polyacrylamide ionic
wedge gels. The electrophoresis buffer was 1X TBE (Sambrook et al. 1989). DNA
sequence data were also obtained by sequencing some nested deletion
subclones on the LICOR automated DNA sequencer at the National Research
Council Institute of Marine Biosciences, Halifax. Regions not spanned by the
deletion subclones were sequenced using oligonucleotide primers. Primers Ttr-a
(5'CAGATCACTGATCCACC3') and Ttr-b (WAGACTTGTGTACAGCC3')
were both derived from a consensus sequence obtained from an alignment of
the p80 sequence and the 2.3 kb Eco RI fragment from the p80h9.2 subclone. This
region represented the tRNA-unlike region of the SINE element. The Ttr-a
primer generated sequence data; however, the Ttr-b primer did not.
Sequences were aligned with the CLUSTAL V multiple alignment program for the
Macintosh operating system (Higgins and Sharp 1988) and in every instance
manual alignments were necessary.
Sequence data obtained from the plasmid inserts were analyzed using a
number of programs available on the World Wide Web capable of DNA or
protein sequence comparison and analysis. Nucleotide sequences were
submitted to the National Center for Biotechnology Information (NCBI) and
compared to the GenBank and EMBL nucleotide databases using the basic
local alignment search tool program BLAST (Altschul et al. 1990). Nucleotide
sequences were also analyzed with a suite of programs available on the World
Wide Web at the Baylor College of Medicine (BCM) web site
(www.bcm.tmc.edu & www.dotimgen.bcm.tmc.edu). Their search launcher and
genefinder programs were utilized for restriction mapping, multiple nucleotide
and amino acid alignments, nucleotide and amino acid database searches,
motif searches, repetitive element analysis and gene feature searches.
Each insert sequence was analyzed in all six frames for the presence of open
reading frames (ORFs) of at least 20 codons containing no stop codons.
ORFs were translated to protein sequences manually and were submitted to
NCBI and compared to the SWISS-PROT database using the BLAST program.
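The six-frame ORF criterion described above (at least 20 consecutive codons
with no stop codon) can be expressed as a short scan. This is a sketch of the
criterion only, not the Baylor search launcher or genefinder programs, whose
internals are not described in the text. It assumes the standard stop codons
TAA, TAG and TGA and does not require a start codon; the function names are
hypothetical.

```python
# Sketch: find stop-free codon runs of a minimum length in all six frames.
# Assumes the standard stop codons; no start codon is required.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def open_runs(seq: str, min_codons: int = 20):
    """Yield (strand, frame, start, end) for each qualifying stop-free run.

    Coordinates are 0-based and end-exclusive on the strand that was scanned.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            pos = frame
            while pos + 3 <= len(s):
                if s[pos:pos + 3] in STOP_CODONS:
                    # A stop closes the current run; report it if long enough.
                    if (pos - run_start) // 3 >= min_codons:
                        yield (strand, frame, run_start, pos)
                    run_start = pos + 3
                pos += 3
            # Report a run that extends to the end of the sequence.
            if (pos - run_start) // 3 >= min_codons:
                yield (strand, frame, run_start, pos)


if __name__ == "__main__":
    # Synthetic example: 25 alanine codons, a stop, then 5 more codons.
    example = "GCT" * 25 + "TAA" + "GCT" * 5
    for hit in open_runs(example):
        print(hit)
```

Short repetitive inserts will often pass such a permissive filter by chance,
which is one reason candidate ORFs were also compared against a protein
database.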
RESULTS AND DISCUSSION
3.1 Subtractive Hybridization and Analysis
DNA from the subtractive hybridization reaction was used in two
ligation reactions using the plasmid pUC 18 Bam HI/BAP. Plasmids were
transformed into E. coli DH5α cells using the standard techniques described
earlier and recombinant colonies were isolated. The first ligation experiment
(ligation A) produced 97 white colonies, labelled A1-97, and the second
experiment (ligation B) produced 83 white colonies, labelled B1-83. Blue and
white color selection was employed in the transformation assay. The white
colonies were the recombinant clones. All of the colonies were analyzed for the
presence of suitably sized inserts using restriction enzymes. Of the 180 clones
isolated, only 21 clones contained plasmids with inserts greater than 100 bp.
This was based on Sca I and/or Bam HI digested plasmid preparations analyzed on
0.8-1% agarose gels.
Most clones contained inserts of varying sizes, and clones containing no
insert or inserts less than 100 bp were discarded. In both experiments the
control plate, containing plasmid and no foreign DNA, had a number of blue
colonies. This indicated a possible quality control problem with Pharmacia's
pUC 18, since this result indicates that the dephosphorylation reaction was
incomplete. This may have decreased the number of ligations of
O. niloticus DNA because plasmids re-ligated without inserts.
Each clone was cleaved with the appropriate restriction endonuclease(s)
and inserts from all suitable clones were purified from LMP agarose gels and
labelled with [α-32P] dCTP by random priming (Feinberg and Vogelstein 1984).
These labelled DNA fragments were used as probes and hybridized to Southern
blots containing Bam HI digested genomic DNA from male and female O.
niloticus. Figures 1-5 show autoradiographs of the hybridization of radiolabelled
plasmid inserts for clones p34, p80, p43, p44 and p54 to O. niloticus total
genomic DNA from both male and female tilapia. Although most probes failed
to provide clear hybridization patterns on autoradiographs, seven clones
hybridized to numerous distinct bands; no sex-specific patterns were identified.
The complexity of the band profiles did not appear to vary between individuals,
with the exception of p43, and band profiles appeared to vary with different
probes. The variable numbers of bands presumably correspond to members of
highly repetitive DNA families. Variation did not appear to be a function of
stringency since I did not see a difference between blots washed at room
temperature or at the hybridization temperature for 15 minutes.
Based on the results of the hybridization experiments, I decided to
abandon further screening of recombinants for sex-specific markers and focus
on the analysis of those fragments that exhibited a repetitive pattern. The
recombinant plasmids containing inserts of interest include: A34, A80, B40,
B43, B44, B54 and B63. The labelled insert of the plasmid designated B63
hybridized to only one locus per individual and showed no variation between
individuals, so it was omitted from further study. The other plasmids, now
designated p34, p80, p40, p43, p44 and p54, were sequenced.
Figure 1. Hybridization of radiolabelled plasmid p80 insert to Oreochromis niloticus genomic DNA.
Lanes 1, 2 & 6 are Bam HI digested male O. niloticus genomic DNA (10 µg) samples.
Lanes 3, 4 & 5 are Bam HI digested female O. niloticus genomic DNA (10 µg) samples.
Figure 2. Hybridization of radiolabelled plasmid p34 insert to Oreochromis niloticus genomic DNA.
Lanes 1, 2 & 6 are Bam HI digested male O. niloticus genomic DNA (10 µg) samples.
Lanes 3, 4 & 5 are Bam HI digested female O. niloticus genomic DNA (10 µg) samples.
Figure 3. Hybridization of radiolabelled plasmid p43 insert to Oreochromis niloticus genomic DNA.
Lanes 1, 2 & 6 are Bam HI digested male O. niloticus genomic DNA (10 µg) samples.
Lanes 3, 4 & 5 are Bam HI digested female O. niloticus genomic DNA (10 µg) samples.
Figure 4. Hybridization of radiolabelled plasmid p44 insert to Oreochromis niloticus genomic DNA.
Lanes 1, 2 & 6 are Bam HI digested male O. niloticus genomic DNA (10 µg) samples.
Lanes 3, 4 & 5 are Bam HI digested female O. niloticus genomic DNA (10 µg) samples.
Figure 5. Hybridization of radiolabelled plasmid p54 insert to Oreochromis niloticus genomic DNA.
Lanes 1, 2 & 6 are Bam HI digested male O. niloticus genomic DNA (10 µg) samples.
Lanes 3, 4 & 5 are Bam HI digested female O. niloticus genomic DNA (10 µg) samples.
3.2 Sequencing of Repetitive Fragments
All six plasmid inserts were sequenced completely on both strands. Figure 6
shows the nucleotide sequence data compiled for clones p34 (180 bp), p40 (199
bp), p43 (164 bp), p44 (149 bp), p54 (387 bp) and p80 (219 bp). The nucleotide
sequence of each insert was compared to the nucleotide databases at NCBI
using the program BLASTn. In 1994 and 1995, all sequences were submitted
to the GenBank/NCBI nucleotide database for comparison. In most cases, BLAST
results indicated no sequence identity with any sequence in the database.
However, the sequence of p80 showed identity (28 of 31 bases shared, 90%
identity, out of the 219 bp submitted) with S. scrofa mRNA for protein
phosphatase (accession #SSP2A55B). I considered this information to be of
limited value as no clear identity could be positively determined based on this
result. BLAST results for the clone p43 insert indicated sequence identity with
18S rRNA from a wide range of species. Based on this result I suspended work
on clone p43 because it was likely to be a fragment of 18S rRNA from
O. niloticus.
Each insert sequence was analyzed for the presence of open reading frames
(ORFs). Sequences were analyzed in all six frames for stretches of at least 20
codons containing no stop codons. ORFs were translated to protein sequences
manually and were submitted to NCBI and compared to the SWISS-PROT
database using BLAST. Sequences that I considered too short for proper
assessment were not submitted because short protein sequences may provide
misleading matches.
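The six-frame scan described above can be sketched as follows. This is only an illustration of the stated criterion (a run of at least 20 codons with no stop codon, on either strand); the function names are hypothetical, the standard stop codons TAA, TAG and TGA are assumed, and the original analysis was done by hand rather than with this code.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def stop_free_runs(seq: str, min_codons: int = 20) -> List[Tuple[str, int, int, int]]:
    """Scan all six frames for stop-free runs of at least min_codons codons.

    Returns (strand, frame, start, end) tuples; start and end are 1-based
    positions on the strand that was scanned, end inclusive.
    """
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            # Index just past the last complete codon in this frame.
            frame_end = frame + 3 * ((len(s) - frame) // 3)
            for i in range(frame, frame_end, 3):
                if s[i:i + 3] in STOP_CODONS:
                    if (i - run_start) // 3 >= min_codons:
                        hits.append((strand, frame + 1, run_start + 1, i))
                    run_start = i + 3
            # A run may also extend to the end of the insert.
            if (frame_end - run_start) // 3 >= min_codons:
                hits.append((strand, frame + 1, run_start + 1, frame_end))
    return hits
```

Runs that pass this filter would then be translated and compared against a protein database; shorter runs are dropped, in line with the concern above that short peptides give misleading matches.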
In all amino acid submissions no significant matches were seen and no
sex-specific markers were identified with either amino acid or nucleotide
comparisons of the appropriate databases. There were no amino acid matches,
Figure 6. Nucleotide sequence data for PERT clones.
A. Clone p80 insert.
1 CTGCAGAGAGTC CTGTCCTCCC TGTGATGGAG GACTGACGTC A ( x x m x r m 51 GTAAACAAAG ACCACTCCCT CCAACCTCCC AGCTGCTITC TATAAAGATG 101 GCTCCCTCAT CAGGAAGCAG CCTACAGGTC ACATGACCAT CCAGCATGTT 15 1 TCCAGGTCTG ATGAAGGCCT CTACAAGTGT GACATCAGCG GTCATGGAGA 201 GTCTCCATCC AGCTGGATC
B. Clone p34 insert.
1 GATCCTGGAG CCACTCGCCT AGCIITGGGAG TCACCGCACC TAGTGCTCCC 51 GATTACCACG GGGACCACCG TCACCTL'CAC CCTCCACATC CTCTCAAGCT 101 CTTCTCmAG G C m T A T TI'CTCCAGCT TCTCGTGTIY3 CITCiTCCTG 15 1 ATATI'GCTGT CATTCGGAAC TGGCTACATC
C. Clone p40 insert.
1 GATCCAGTGA TGAGAGCCGT TI'CCCCCATG TCCAGTCCTI' GTACTAAGCT 51 AGGCTAAACA TGTCCAGTGC C T G T C m T GCAGC'I'L'CrA ATCTGATGTT 101 AATAAGGTAG AAATGGAGGA AGAAGAAAAT GACTICATCA ATCAACATGC 15 1 AGCTACCAAC CCAAATGGTG AGGTGGTGGT TAGTTA'ITAC TGmITGA
D. Clone p43 insert.
1 CTGATITAAT GAGCCA'ITCG CAGTTTCACT GTACCGGCCG CGTGTACTTACTI'A 51 GACCTGCATG GC'lTAAlC'lT TGAGACAAGACAAGC ATATMTACT GGCAGGATCA 101 GTACCATTAA ACAAGTACGC AGAGAAAGAC AGCAAMCAA AAGATATGAC 151 CAAA'ITATCT CTCC
E. Clone p44 insert.
1 GATCTGkAAA TGTLYfTGTAA TACTACACC TGAAATGACA ATTAGATKX 5 1 GTITGCCTGT ACAGGAGATA AAGGMMAGA A'ITTAACAGC AGCTGTGAAA 10 1 TGGAATAGTC CAAATCACCA AAGTAAACTG GACCTGGGTT T G D A G G
F. Clone p54 insert.
CGGAGCTIGT TL'CTAACACC AGCACCAG-TG TGGTGG'ITAG CATCATE-CC TCACAGCAAG A A G G m T G A GTITGAATCC AGGC'ITCCTC CCACAGTCCA CAGGCATGCT GTTACTAACA AAGCAGCAAA AGCAAAATAA CACGCTTACA AAGITTATGT ACTGAGCET TCAAAGGAAC TACAGCTCAA ATGCACCTAC ACA'IWCAGT 'IWXTKTI'c TTUTITAGC CTACAGCCTT TIY='lTIGTAC TACCTGGTGA TTAAGGATGA GGAAGGAAAG ACAGATTACT AATGAGCATT GACTGCTGAA AGTGGACACA TACAGCATAA GCAGCAGATT GAATGTCGTT AAATAATKT GTTTTTCTTC TGTACGTGAA ATTGATC
Table 1. List of putative open reading frames in PERT clones.
Clone   Strand   ORF (bp)   Amino acid sequence
p34     +        62-130     MWPESLLELSYTSHPHCHQQHK
        - (a)    18-104     M T A i S G ~ S ~ G S E E C S ~ G C G G
(a) Reverse-complementary strand. ORF = open reading frame.
with the SWISS-PROT database, that could be associated with repetitive element
proteins, including any viral sequence in part or in whole. Table 1 provides a list
of the open reading frames derived from each plasmid insert. At this point I
decided to suspend work on all clones except p34 and p80 until these were both
fully characterized. In reviewing the sequence data for this report in 1996,
clones p34, p40, p44, p54 and p80 were again compared to the nucleotide
databases using the BLASTn program. The results were unremarkable;
however, portions of the insert from clone p54 showed significant sequence
identity with several segments of recently deposited GenBank submissions
corresponding to the Danio rerio retroposons and zebrafish mermaid repeats
(Izsvak et al. 1996; Shimoda et al. 1996a). An alignment of clone p54 with
Danio rerio clone DANA-16 DANA retroposon (accession number L42294) or the
zebrafish mermaid repeat gene (accession number D78162)
indicated up to 82% sequence identity in the region corresponding to the
tRNA-like region and not to the "mermaid" specific domain. Analysis of the p54
repetitive sequence indicates that it is likely a full length novel retroposon on
the basis of two identifiable flanking 5 base pair direct repeats (TGTTT), a
tRNA-like region containing identifiable POL III A and B boxes and a 3 prime
polyA tail. To gain further insight into the nature and origin of this sequence,
the isolation and characterization of several other full length repeats would be
required. No further analysis was conducted on this new family of tilapiine
SINE-like elements termed ROn-2.
3.3 Genomic Organization
The genomic organization of repetitive elements was investigated for
cloned elements designated p80 and p34. High molecular weight genomic DNA
from a single O. niloticus individual was digested to completion by a number of
different restriction enzymes. Southern blot and hybridization to radiolabelled
inserts from either p80 or p34 were used to determine the restriction enzyme
recognition profile of the repetitive elements in the O. niloticus genome
(Figures 7 and 8, respectively).
The labelled p80 insert detected a single Mbo I band at approximately
220 bp on a multiple restriction enzyme blot (Figure 7). In addition to Mbo I, the
repetitive element was detected as a single fragment in Pst I digested genomic
DNA, as a broad band at approximately 300 bp, and the restriction enzyme Pvu
II produced two faint bands of 330 bp and 130 bp. In contrast, digestion with
other enzymes, including Eco RI, Hind III, Bam HI, Ava II, Hinc II, Sty I, Hae
III and Ban II, resulted in a broad range of bands or a smear on the
autoradiogram, indicating that restriction sites for these enzymes are probably
located in the unique flanking DNA. Restriction map data for the p80 insert
indicate that there are single restriction sites for the enzymes Mbo I, Pst I and
Hae III. There are two restriction sites for Pvu II and this sequence
corresponds to the 130 bp band (b) seen in Figure 7.
The labelled p34 insert detected a distinct Mbo I (and Sau 3A) doublet at
approximately 300 and 390 bp against a background smear on the
autoradiograph, and a Hae III doublet at 380 and 560 bp (Figure 8). The probe also
detected a single band with Ban II at 2.0 kb against a background smear, while
the enzyme Sty I provided a band at 280 bp and a fainter band at 1.1 kb. Two
bands were evident in the Ava II digest at 260 bp and 1500 bp. Digestion with
Eco RI, Hind III, Bam HI, Hinc II or Pvu II resulted in a smear on the
autoradiograph. Restriction map data based on the p34 insert indicated that
there are single restriction sites for the enzymes Mbo I, Hae III, Ava II, and
Figure 7. Molecular characterization and specificity of the p80 repetitive element. Hybridization of radiolabelled p80 insert to O. niloticus genomic DNA digested with multiple enzymes. Genomic DNA (10 µg) was digested with Mbo I (lane 1); Eco RI (lane 2); Hind III (lane 3); Hae III (lane 4); Pst I (lane 5); Ban II (lane 6); Pvu II (lane 7); Sau 3A (lane 8); Ava II (lane 9); Bam HI (lane 10); Hinc II (lane 11); Sty I (lane 12). Numbers on the left indicate the relative position of the molecular weight markers (kb).
Figure 8. Molecular characterization and specificity of the p34 repetitive element. Hybridization of radiolabelled p34 insert to O. niloticus genomic DNA digested with multiple enzymes. Genomic DNA (10 µg) was digested with Mbo I (lane 1); Eco RI (lane 2); Hind III (lane 3); Hae III (lane 4); Pst I (lane 5); Ban II (lane 6); Pvu II (lane 7); Sau 3A (lane 8); Ava II (lane 9); Bam HI (lane 10); Hinc II (lane 11); Sty I (lane 12). Numbers on the left indicate the relative position of the molecular weight markers (kb).
Sty I and no restriction site for Ban II. The limited data for the p34 repetitive
element suggest that this element may represent a fragment of a very large
repetitive element.
In all digestions no higher order periodicity was evident that would suggest
a tandemly-arrayed element for p80 or p34; rather, the repetitive elements
are likely dispersed in the O. niloticus genome. To verify this, partial digestion of
genomic DNA followed by hybridization of labelled insert is required. The p80
data correlate with the full length repetitive element restriction map data for
the bacteriophage lambda subclone p80λ9.2.
3.4 Partial Digestion Analysis
When genomic DNA was digested to completion with Mbo I, a single 220
bp band was prominent on the autoradiograph, as was a single band at 300
bp in Pst I-digested DNA. Partial digestion of O. niloticus genomic DNA with
either Mbo I or Pst I, followed by Southern blot and hybridization to
radiolabelled p80, failed to generate a ladder of hybridizing fragments (Figures
9-12). These patterns were, therefore, inconsistent with the organization of
these repetitive elements in tandem arrays. The evidence clearly indicates
that the repetitive elements are not tandemly arrayed in the genome but are
dispersed throughout the genome and possess internal Mbo I and Pst I
restriction sites for p34 and internal Hae III and Mbo I sites in p80.
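The reasoning behind the ladder test can be made concrete with a small sketch. If a repeat with one internal site per unit were tandemly arrayed, partial digestion would release fragments at integer multiples of the unit length (monomer, dimer, trimer and so on). The helper names and the 10% sizing tolerance below are illustrative assumptions, not values taken from the blots.

```python
from typing import List


def tandem_ladder(unit_bp: int, max_units: int) -> List[int]:
    """Fragment sizes expected from partial digestion of a tandem array."""
    return [unit_bp * n for n in range(1, max_units + 1)]


def has_ladder(band_sizes_bp: List[float], unit_bp: int,
               tolerance: float = 0.10, min_rungs: int = 3) -> bool:
    """True if bands match at least min_rungs consecutive multiples of unit_bp.

    Rungs are counted from the monomer upward; a band matches a rung when it
    lies within `tolerance` (as a fraction) of the expected size.
    """
    rungs = 0
    for n in range(1, len(band_sizes_bp) + 1):
        expected = n * unit_bp
        if any(abs(b - expected) <= tolerance * expected for b in band_sizes_bp):
            rungs += 1
        else:
            break
    return rungs >= min_rungs
```

Under this sketch, a single 220 bp band with no 440 bp or 660 bp rungs fails the test, which is the pattern expected for a dispersed rather than tandemly arrayed element.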
Figure 9. Hybridization of radiolabelled p80 insert to O. niloticus genomic DNA partially digested with Pst I. Aliquots of genomic DNA (10 µg) from a single O. niloticus male fish were digested with Mbo I (lanes from left to right: 2.5, 1.0, 0.5, 0.25, 0.10, 0.05 and 0.025 units of enzyme per microgram of genomic DNA) for three hours at 37°C. Markers shown on the left margins are in kilobases.
Figure 10. Hybridization of radiolabelled p80 insert to O. niloticus genomic DNA partially digested with Mbo I. Aliquots of genomic DNA (10 µg) from a single O. niloticus male fish were digested with Mbo I (lanes from left to right: 5.0, 2.5, 1.0, 0.5, 0.25, 0.10, 0.05 and 0.025 units of enzyme per microgram of genomic DNA) for three hours at 37°C. Markers shown on the left margins are in kilobases.
Figure 11. Hybridization of radiolabelled p34 insert to O. niloticus genomic DNA partially digested with Mbo I. Aliquots of genomic DNA (10 µg) from a single O. niloticus male fish were digested with Mbo I (lanes from left to right: 5.0, 2.5, 1.0, 0.5, 0.25, 0.10, 0.05 and 0.025 units of enzyme per microgram of genomic DNA) for three hours at 37°C. Markers shown on the left margins are in kilobases.
Figure 12. Hybridization of radiolabelled p34 insert to O. niloticus genomic DNA partially digested with Hae III. Aliquots of genomic DNA (10 µg) from a single O. niloticus male fish were digested with Mbo I (lanes from left to right: 2.5, 1.0, 0.5, 0.25, 0.10, 0.05 and 0.025 units of enzyme per microgram of genomic DNA) for three hours at 37°C. Markers shown on the left margins are in kilobases.
3.5 Species Blots
To determine if the repetitive DNAs detected by p80 and p34 are present
in phylogenetically related or unrelated species, Mbo I digests from several
representatives of the families Cichlidae, Salmonidae and Gadidae and several
members of mammalian orders were screened. In Figure 13, the p80 insert
hybridizes only to orthologous sequences in the family Cichlidae, except
Etroplus maculatus, and failed to hybridize to the DNA from any other species.
This indicates conservation of this element within most members of the family
Cichlidae but does not provide evidence that the element is present
at more than one locus. Hybridization by the labelled p34 insert to the same
Southern blot (Figure 14) indicates strong hybridization to members of the
family Cichlidae; however, there is hybridization to the DNA of other species of
fish. Weak hybridization with members of mammalian genera was seen. The
Southern blots for these figures were washed under medium stringency
conditions and exposed to Kodak X-AR film for up to ten days.
Species represented on the blot include three species from the family
Gadidae: Gadus morhua (Atlantic cod), Melanogrammus aeglefinus (haddock) and
Urophycis tenuis (white hake); cichlids from the tilapiine genera
Oreochromis, Sarotherodon and Tilapia; two salmonids: Oncorhynchus mykiss
(rainbow trout) and Salmo salar (Atlantic salmon); and several species from
mammalian orders, including Phoca vitulina concolour (harbour seal),
Lagomorpha (rabbit), Canis familiaris (domestic dog) and the cetacean
Physeter macrocephalus (sperm whale).
Figures 15 and 16 show a Southern blot-hybridization analysis of Mbo I
digested genomic DNA from the family Cichlidae. DNA was digested from fish
of the three tilapiine genera, Oreochromis, Sarotherodon and Tilapia, and
representatives from Old World cichlids. Species represented on the blot, other
than from the tilapiine tribe, include two haplochromine fishes, Haplochromis
moori and Haplochromis auratus; the West African cichlid Hemichromis
bimaculatus; a member of the chromidotilapiine tribe, Pelvicachromis pulcher;
and an Asian cichlid, Etroplus maculatus. The cichlid species blot indicates that
the p80 repetitive element was detected in all cichlid species except the
Asian cichlid. The intensity of hybridization to members of the tilapiine
lineage was the greatest; it was less intense in the closely related
haplochromine lineage, very faint in the West African samples, and signal was
absent in the Asian cichlid. Similar profiles were evident in the cichlid species
blot probed with the p34 insert.
As shown in Figures 13 and 15, the data indicate that the p80
repetitive element was detected in most cichlid species but was absent from
other phylogenetically distant species, indicating the family-specific nature of
this repetitive element. Characterizing homologous repetitive
elements in other cichlid species, and treating SINE insertions as irreversible
events, may provide informative markers for reconstructing phylogenetic
relationships among cichlids. Since the p80 insert represents the
tRNA-unrelated region of this retroposon, the data are characteristic of a retroposon.
The p34 repetitive element is also an interspersed element; however, its exact
nature remains to be determined. The reverse complement sequence of p34
appears to contain the putative POL III A and B boxes and may represent the
tRNA-like region of a SINE element, which would account for the hybridization
patterns seen in the species blots; however, multiple enzyme digestion analysis
indicates that this element is part of a larger repetitive element.
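A search for promoter-like boxes on the reverse complement strand can be sketched as below. The consensus strings used here (A box TGGCNNAGTGG, B box GGTTCGANNCC) are commonly quoted RNA polymerase III box A and box B consensus sequences and are an assumption of this sketch, not motifs reported in this study; exact matching is also stricter than a real analysis, which would tolerate mismatches. All function names are hypothetical.

```python
import re
from typing import List

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

A_BOX = "TGGCNNAGTGG"  # assumed consensus, for illustration only
B_BOX = "GGTTCGANNCC"  # assumed consensus, for illustration only


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str) -> List[int]:
    """Return 1-based start positions of every (overlapping) motif match."""
    pattern = "".join(IUPAC[c] for c in motif.upper())
    # The lookahead lets overlapping occurrences be reported.
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", seq.upper())]
```

A typical use would be `find_motif(reverse_complement(insert), B_BOX)`, with `insert` standing for a cloned sequence such as p34.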
Figure 13. Southern blot hybridization analysis of Mbo I digested genomic DNA from phylogenetically different species with labelled p80 insert. DNA (10 µg) was digested with Mbo I to completion and electrophoresed on a 0.9% agarose gel. Autorad exposure was 10 days. Lane 1 = Oncorhynchus mykiss (rainbow trout), Lane 2 = Melanogrammus aeglefinus (haddock), Lane 3 = Gadus morhua (Atlantic cod), Lane 4 = Lophius americanus (goosefish), Lane 5 = Urophycis tenuis (white hake), Lane 6 = Pollachius virens (pollock), Lane 7 = Salmo salar (Atlantic salmon), Lane 8 = Oreochromis niloticus male, Lane 9 = Oreochromis niloticus female, Lane 10 = Oreochromis aureus, Lane 11 = Oreochromis mossambicus, Lane 12 = Tilapia zillii, Lane 13 = Tilapia rendalli, Lane 14 = Sarotherodon galilaeus, Lane 15 = Phoca vitulina concolour (harbour seal), Lane 16 = Lagomorpha (rabbit), Lane 17 = Canis familiaris (domestic dog), Lane 18 = Physeter macrocephalus (sperm whale). Markers shown on the left margins are in kilobases.
Figure 14. Southern blot hybridization analysis of Mbo I digested genomic DNA from phylogenetically different species with labelled p34 insert. DNA (10 µg) was digested with Mbo I to completion and electrophoresed on a 0.9% agarose gel. Autorad exposure was 10 days. Lane 1 = Oncorhynchus mykiss (rainbow trout), Lane 2 = Melanogrammus aeglefinus (haddock), Lane 3 = Gadus morhua (Atlantic cod), Lane 4 = Lophius americanus (goosefish), Lane 5 = Urophycis tenuis (white hake), Lane 6 = Pollachius virens (pollock), Lane 7 = Salmo salar (Atlantic salmon), Lane 8 = Oreochromis niloticus male, Lane 9 = Oreochromis niloticus female, Lane 10 = Oreochromis aureus, Lane 11 = Oreochromis mossambicus, Lane 12 = Tilapia zillii, Lane 13 = Tilapia rendalli, Lane 14 = Sarotherodon galilaeus, Lane 15 = Phoca vitulina concolour (harbour seal), Lane 16 = Lagomorpha (rabbit), Lane 17 = Canis familiaris (domestic dog), Lane 18 = Physeter macrocephalus (sperm whale). Markers shown on the left margins are in kilobases.
Figure 15. Southern blot hybridization analysis of Mbo I digested genomic DNA from the family Cichlidae with radiolabelled p80 insert. DNA (10 µg) was digested with Mbo I to completion and electrophoresed on a 0.9% agarose gel. Lane 1 = Oreochromis aureus, Lane 2 = Oreochromis mossambicus, Lane 3 = Tilapia rendalli, Lane 4 = Sarotherodon galilaeus, Lane 5 = Tilapia zillii, Lane 6 = Oreochromis niloticus, Lane 7 = Oreochromis placidus, Lane 8 = Haplochromis auratus, Lane 9 = Oreochromis hornorum, Lane 10 = Haplochromis moori, Lane 11 = Hemichromis bimaculatus, Lane 12 = Pelvicachromis pulcher, Lane 13 = Etroplus maculatus. Markers shown on the left margins are in kilobases.
Figure 16. Southern blot hybridization analysis of Mbo I digested genomic DNA from the family Cichlidae with radiolabelled p34 insert. DNA (10 µg) was digested with Mbo I to completion and electrophoresed on a 0.9% agarose gel. Lane 1 = Oreochromis aureus, Lane 2 = Oreochromis mossambicus, Lane 3 = Tilapia rendalli, Lane 4 = Sarotherodon galilaeus, Lane 5 = Tilapia zillii, Lane 6 = Oreochromis niloticus, Lane 7 = Oreochromis placidus, Lane 8 = Haplochromis auratus, Lane 9 = Oreochromis hornorum, Lane 10 = Haplochromis moori, Lane 11 = Hemichromis bimaculatus, Lane 12 = Pelvicachromis pulcher, Lane 13 = Etroplus maculatus. Markers shown on the left margins are in kilobases.
3.6 Isolation of Full Length Repetitive Elements
To determine the molecular composition of the full length repetitive
element associated with each repeat, p34 and p80, the inserts from these
plasmids were purified from low melt agarose gels, radiolabelled and used to
screen an O. niloticus EMBL 3 genomic library. DNA from a single individual
was partially digested with Mbo I and cloned into the EMBL 3 bacteriophage
lambda vector. Approximately 1.5 × 10^4 plaque forming units were plated out
and the resulting plaques blotted onto nylon membranes and then hybridized to
either the radiolabelled p80 or p34 inserts. The proportion of the genome
represented by the p80 fragment repetitive sequence was determined by
probing the EMBL 3 bacteriophage lambda genomic library with the labelled
p80 plasmid insert and estimating the SINE copy number based on the
number of positive clones. There were 600 positive plaques seen on a two day
autoradiograph exposure. Assuming that the average length of the bacteriophage
insert is 15 kb, that the repetitive element is evenly distributed in the O. niloticus
genome, that there is only one repetitive element homologue per positive
plaque and that the fraction cloned was random, it was determined that there
are 6000 copies of the full length repetitive element per haploid genome based
on a 'C' value of 1 pg (Majumdar and McAndrew 1986). The repetitive element,
therefore, represents 0.4% of the genome. The p34 fragment repetitive
sequence was determined to be present in approximately 20000 copies,
representing 1.2% of the genome.
I selected five plaques from each library that hybridized strongly to each
probe on the assumption that each EMBL 3 bacteriophage lambda clone
contained the full length repetitive element. The plaques selected from the p80
library were bacteriophage lambda clones 7.3, 9.1, 9.2, 14.1 and 15.2. The
digits prior to the period represent the plate number and the digit after the
period represents the plaque number. The plaques selected from the p34
library were bacteriophage lambda clones 5.5, 6.2, 7.1, 11.5 and 14.1. Each
plaque was assigned a library designation such as p80λ9.2, indicating that this
particular clone was bacteriophage lambda clone two from plate number nine
in the p80 library.
Large quantities of DNA were isolated from each bacteriophage lambda
clone using the large scale liquid lysate protocol and purified by cesium chloride
centrifugation (Sambrook et al. 1989). Concurrently, all bacteriophage lambda
clones were digested with a number of restriction enzymes in order to map the
repetitive element within each bacteriophage lambda clone. Each clone was
digested with Bam HI, Sma I, Hind III, Sal I and Kpn I, and in double digestions
with combinations of these restriction enzymes. These enzymes were chosen
because they were six base pair restriction enzymes with known sites in the
EMBL 3 vector, and were expected to cut the insert DNA infrequently. I expected
this to produce a number of vector and insert fragments that could be easily
identified on agarose gels. Eco RI, Bam HI and Sal I all cut within the cloning
site of the vector and produced DNA fragments of approximately 19.95 kbp
and 8.78 kbp as well as other fragments representing the insert. The other
enzymes were used to facilitate the mapping of the ends of the insert and
provide a proper orientation for effective mapping. Electrophoresis of both
Kpn I and Sma I digested bacteriophage lambda did not yield reproducible
results, in that both generated many more DNA fragments than could be
accounted for considering the size of lambda phage and a 15 kb insert. This
limited the possibility of producing detailed restriction maps of each
bacteriophage lambda clone. However, I was able to subclone a number of
small DNA fragments from several bacteriophage lambda clones into pUC 18
vectors for further mapping or for sequencing as nested deletion fragments
(data not shown). I subcloned a 2.3 kb Eco RI fragment from p80λ9.2 into
pUC 18 Eco RI/BAP. Also subcloned into pUC 18 Eco RI/BAP vectors were a
2.0 kb fragment from p80λ9.1, a 4.5 kb Eco RI fragment from p80λ7.3, a 1 kb
and a 6 kb Eco RI fragment from p34λ5.5 and a 1.4 kb and a 4.0 kb Eco RI
fragment from p34λ7.1. The DNA fragments chosen for subcloning
represented the smallest DNA fragments thought to contain the entire
repetitive element from each library, as assessed by hybridization with the
appropriate probe.
Nested deletion clones were prepared for several of these bacteriophage
lambda subclones using the Exonuclease III unidirectional deletion mapping
protocol (Sambrook et al. 1989). Exonuclease III reactions were performed on
the 1.4 kb Eco RI subclone from p34λ7.1, the 2.3 kb Eco RI subclone from
p80λ9.2 and the 2.0 kb subclone from p80λ9.1. The 4.5 kb Eco RI subclone
from p80λ7.3 was mapped with Eco RI, Sac I, Sph I and Rsa I and a 2.5 kb
DNA fragment from this clone was subcloned into pUC 18. The p80λ7.3-2.5
Sac I/Rsa I subclone was deleted with Exonuclease III, generating several
deletions for sequencing.
Several nested deletions derived from bacteriophage lambda subclones
p80λ9.1 (2.0 kb Eco RI) and p34λ7.1 (1.4 kb Eco RI fragment) were partially
sequenced using a T7 sequencing kit (Pharmacia). Initial sequencing results
indicated that neither clone contained the full length repetitive element.
Unidirectional nested deletion mutants (15 deletion clones with insert size
decreasing in steps of 150-200 bp) derived from the bacteriophage lambda subclone
p80λ9.2 (2.3 kb Eco RI fragment) were sequenced and a contiguous sequence
was determined (Figure 17). An alignment of the p80λ9.2 (2.3 kb Eco RI
fragment) sequence (2276 bp) with the 219 bp p80 clone sequence indicated
that this subclone contains a sequence identical to the p80 insert, but I was not
able to define the boundaries of the full length repetitive element based on this
limited information. This subclone was sequenced again on the LICOR
automated sequencing apparatus at the National Research Council Institute
for Marine Biosciences to confirm the accuracy of the sequence data. I
suspended work on the p34 library clones until work on the p80 repetitive
element was completed.
The bacteriophage lambda subclone p80λ9.1 (2.0 kb Eco RI) was not
sequenced in its entirety for two reasons. First, I only had six deletion clones
that did not span the entire insert and this meant having to obtain other
deletion clones. Second, and more specifically, a Southern blot containing these
six deletion clones restricted with Eco RI and probed with a labelled p80 insert
did not hybridize to the same extent as deletion clones from the p80λ7.3-2.5
Sac I/Rsa I subclone. Although approximately 2 µg of plasmid DNA for each
deletion clone was blotted, the signal intensity for the p80λ7.3-2.5 Sac I/Rsa I
deletion clone was four-fold greater. Based on this result, I assumed that the
bacteriophage lambda subclone p80λ9.1 (2.0 kb Eco RI) may not contain the
entire repetitive element, or may contain sequences with a great deal of
sequence identity to the probe but not to the same extent seen with the
p80λ9.2-2.3 kb Eco RI subclone. Three p80λ7.3-2.5 Sac I/Rsa I deletion
clones, differing in insert size by 500 bp and providing a strongly hybridizing
signal when probed with a labelled p80 insert, were sequenced on the LICOR
automated sequencer. The alignment of the p80 sequence with sequence data
obtained from the p80λ9.2-2.3 kb Eco RI insert and the p80λ7.3-2.5 Sac
I/Rsa I sequence provided enough data to characterize the full length p80
repetitive element. Note that approximately 500 bp of sequence were not
obtained from the 5 prime end of the p80λ7.3-2.5 Sac I/Rsa I clone because
sufficient data were obtained to define the repetitive element (see Figure 18 for
the nucleotide sequence of this subclone).
A 294 bp sequence derived from the 3 prime end of the insert of the
p80λ9.1 (2.0 kb Eco RI) bacteriophage lambda subclone was compared to the
nucleotide databases using BLASTn, which
indicated identity with the lake trout (Salvelinus namaycush) Hpa SINE 16
element. There was 79% identity with 69 bp of the 5 prime end of the
trout SINE, which was derived from a phenylalanine tRNA (accession
numbers U27087 and U27090) (Reed and Philips, 1995). Refer to Figure 19
for the sequence of the p80λ9.1 (2.0 kb Eco RI) bacteriophage lambda
subclone. Comparison with the trout SINE indicates that the bacteriophage
lambda subclone contains the tRNA-unrelated region and that the
phenylalanine tRNA region was lost during restriction and subcloning. This
information may be useful in obtaining a possible full length phe-tRNA SINE
from O. niloticus.
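Percent identity figures such as the 79% quoted above, or the 28 of 31 bases reported for p80 in section 3.2, are simply matched positions divided by aligned positions. A minimal sketch for two already-aligned, equal-length strings follows; the function name is hypothetical, and treating any gap column as a mismatch is an assumption of this sketch rather than a description of how BLAST scores alignments.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Both inputs must already be aligned and of equal length; columns where
    either sequence has a gap ("-") are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)
```

For instance, 28 matching columns out of 31 gives about 90.3%, which rounds to the 90% reported for the p80 match.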
Figure 17. Complete nucleotide sequence data of bacteriophage lambda subclone p80λ9.2, 2.276 kb Eco RI fragment (ROn-1a). The dashes represent sequence homologous to the p80 insert, and restriction enzyme recognition sites for Mbo I, Pst I and Pvu II are underlined.
Figure 18. Nucleotide sequence data of bacteriophage lambda subclone p80λ7.3, 2.3 kb Eco RI fragment (ROn-1b). This sequence represents 1241 bp of sequence derived from the 3 prime end of this insert in the plasmid pUC 18 polycloning site. The dashes represent sequence homologous to the p80 insert, and restriction enzyme recognition sites for Mbo I, Pst I and Pvu II are underlined.
Figure 19. Nucleotide sequence of p80λ9.1 (2.0 kb Eco RI) bacteriophage lambda subclone derived from the 3 prime end of the pUC 18 plasmid insert.
ATATAGAATT AGGTGGTAAC CAAAAAAAAT GTGAAATAAC TCAAAACATG 50
TTTTATATTT TATATTCTTC AAAGTAGCTG CCCTTTGACC TCATAAGGT 100
AGTCACCTGA AATTGTTTTC CAACAGTCTT AAAGGAGTTA CCGGAGATGC 150
TGGG;AACTTC TTGGCTCTTT TTCCTTCACT CTGCGGTCCA TCTCATCCCA 200
AACTATCTCG ACTGGGTTAG TTCACATGAC TGTGGAGGTC AGGCCATCTG 250
GTGGAGCACT TCATCACTCA TCTTCTGGTCM
The underlined sequence represents 79% identity with the trout (Salvelinus namaycush) phe-tRNA SINE element shown in 3 prime to 5 prime direction.
The DNA repeat sequences from the p80λ9.2 2.3 kb Eco RI subclone
and the p80λ7.3 2.5 kb Sac I/Sma I subclone were compared against the
EMBL/GenBank DNA database using the BLASTn program. The results were the
same as those of the GenBank analysis of the p80 insert: no significant
sequence identity was obtained with any other sequence in that database. Each
insert sequence was analyzed for the presence of open reading frames (ORFs).
Sequences were analyzed in all six frames for stretches of at least 20 codons
containing no stop codons. ORFs were translated to protein sequences manually,
submitted to NCBI and compared to the SWISSPROT database using
BLAST. There are 13 ORFs in the p80λ9.2 Eco RI sequence, ranging in length
from 25 to 109 amino acids. The results indicated no sequence similarity
with any known ORF from any repetitive element, including reverse
transcriptase or transposase-encoding genes.
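The ORF criterion described above (a run of at least 20 codons with no stop codon, in any of the six reading frames) can be illustrated with a minimal Python sketch. The function names and the coordinate convention are assumptions made for illustration; this is not the procedure used in the thesis, where translation was done manually.

```python
# Minimal sketch: list stop-free runs of codons in all six reading frames.
# Names (open_frames, min_codons) are hypothetical, chosen for illustration.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def open_frames(seq, min_codons=20):
    """Return (strand, frame, start, codon_count) for each stop-free run.

    'start' is a 0-based offset on the strand being read, so positions on
    the '-' strand refer to the reverse-complemented sequence.
    """
    results = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start, count = frame, 0
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in STOP_CODONS:
                    if count >= min_codons:
                        results.append((strand, frame, run_start, count))
                    run_start, count = i + 3, 0
                else:
                    count += 1
            # A run that reaches the end of the sequence also counts.
            if count >= min_codons:
                results.append((strand, frame, run_start, count))
    return results
```

Each reported run could then be translated and searched against a protein database, as was done with SWISSPROT in the text.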
Restriction mapping data derived from bacteriophage lambda subclone
p80λ9.2 (2.3 kb Eco RI fragment) indicate that there are four Mbo I sites in
the entire 2276 bp sequence, separated by 490, 7 and 220 bp. The 220 bp
fragment is consistent with that seen on the multiple restriction enzyme blot
(Figure 7) hybridized with the labelled p80 insert. The enzyme Pst I cleaves
this sequence three times, at sites separated by 300 and 614 bp, with the
300 bp fragment corresponding to the hybridization data (Figure 7). Pvu II
also has three restriction sites, separated by 333 and 130 bp. Both Pvu II
fragments are recognized in Figure 7 as bands 'a' and 'b', respectively.
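The site spacings quoted above can be derived from a sequence by locating each recognition site and taking the distance between neighbouring sites. The sketch below is a minimal illustration, not the mapping procedure used in the thesis. It relies on the standard recognition sequences GATC (Mbo I), CTGCAG (Pst I) and CAGCTG (Pvu II); the function names and the toy sequence are assumptions.

```python
# Minimal sketch: locate restriction sites and report site-to-site spacings.
# All three recognition sequences are palindromic, so scanning one strand
# is sufficient. Function names are hypothetical.
SITES = {"MboI": "GATC", "PstI": "CTGCAG", "PvuII": "CAGCTG"}


def site_positions(seq, site):
    """Return 0-based start positions of every occurrence of 'site'."""
    seq = seq.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i)
        i = seq.find(site, i + 1)
    return positions


def site_spacings(seq, site):
    """Return distances between consecutive sites (fragment lengths in bp)."""
    pos = site_positions(seq, site)
    return [b - a for a, b in zip(pos, pos[1:])]


# Toy sequence with three Mbo I sites, 10 and 20 bp apart.
toy = "AA" + "GATC" + "T" * 6 + "GATC" + "C" * 16 + "GATC"
mbo_spacings = site_spacings(toy, SITES["MboI"])
```

Because each enzyme cuts at the same offset within its site, the spacing between site starts equals the length of the internal fragment released by complete digestion.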
Restriction mapping data derived from bacteriophage lambda subclone
p80λ7.3 (2.3 kb Eco RI fragment; Figure 18) indicate that there
are five Mbo I sites in the 1241 bp sequence, separated by 221, 376, 7 and 220
bp. The 220 bp fragment is consistent with the multiple restriction
enzyme blot results (Figure 7). There are two Pst I sites in the p80λ7.3
partial sequence, separated by 256 bp. This is 44 bp smaller than expected;
however, the second Pst I site is found outside the SINE element, at the
3 prime end, within the repetitive DNA into which the SINE has been inserted.
The restriction enzyme Pvu II cuts the p80λ7.3 sequence twice, not three
times as expected. Hybridization data (Figure 7) indicate that a 300 bp
Pvu II fragment should be present along with a 130 bp fragment. A Pvu II site
at the 5 prime end of the SINE element is missing. A likely explanation is a
G to A transition mutation at position 420 of the p80λ7.3 clone. The
polymorphism seen in the repetitive sequences flanking the SINE element is
characteristic of each cloned element, since each clone represents one of
about six thousand copies, and comparison with other homologues would greatly
enhance our understanding of this repetitive element. Restriction mapping
indicates that the repetitive DNA flanking the SINE element is an intrinsic
part of the SINE element. This may indicate either that the SINE has been
inserted into specific sites within the genome or that the flanking
repetitive DNA was amplified with the SINE element (Figure 18). More
information is required to fully characterize this element, including
comparison with orthologous sequences within this species and with
homologous sequences in other species.
3.7 Identification of the Repetitive Element
DNA sequences were aligned using the multiple sequence alignment
program CLUSTAL V (Higgins and Sharp, 1988) followed by manual
optimization, as seen in Figure 20. Comparison of the p80λ9.2 (ROn-1a) and
the p80λ7.3 (ROn-1b) sequences, with ROn-1 designating Retroposon O.
niloticus-1, allowed us to characterize the repetitive element. The cloned
ROn-1 members share a similar composite structure: they contain an internal
region that is 343 bp in length, delimited by the two flanking direct
repeats seen in the p80λ9.2 (ROn-1a) sequence. The internal repeat is 90%
identical in nucleotide sequence between the two clones and its GC content is
50.33%. Percent identity was calculated according to Sakagami et al. (1994).
In the p80λ9.2 (ROn-1a) clone this 343 bp sequence is flanked by a 52 bp
direct repeat. This direct repeat is seen at the 3 prime end of the p80λ7.3
(ROn-1b) clone but is missing at the 5 prime end. The direct repeat is
unusually long. Direct repeats in SINEs are generally 7-31 bp in length, but
others of up to 60 bp have been reported (Weiner et al. 1986;
Deininger 1989). Target site duplications appear not to be an
essential characteristic of retroposons, since they are not seen in the
tortoise Pol III SINE. The internal 343 bp region flanked by the 52 bp direct
repeats is embedded in a much larger repetitive element. The p80λ9.2 (ROn-1a)
repetitive element has an overall size of 611 bp and is flanked by a 6 bp
(CTTCAC) direct repeat. The p80λ7.3 repetitive element has an overall size of
603 bp and is flanked by a CTCAC direct repeat. Since flanking direct
repeats are regarded as hallmarks of mobile sequences that have been
integrated into the genome via duplication at the insertion site, I presume
that the internal repeat originated from a transposition event. The presence
of the short direct repeats flanking the full length sequences remains to be
explained. Explaining it would require analysis of other cloned members, and
it may have implications for the amplification and dispersion of this SINE in
the tilapiine genome.
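The two summary statistics quoted above, percent identity between aligned sequences and GC content, can be sketched as follows. This is a minimal illustration under stated assumptions: the exact formula of Sakagami et al. (1994) is not reproduced here, and the choice to skip alignment columns containing a gap is an assumption, as are the function names.

```python
# Minimal sketch: percent identity over an existing pairwise alignment and
# GC content of a single sequence. Gap handling is an assumption: columns
# where either sequence has a gap are excluded from the comparison.
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical bases over the gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def gc_content(seq):
    """Percent of G and C among the A, C, G and T characters of 'seq'."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)
```

Counting gap columns as mismatches instead would lower the identity value, so the convention should be stated alongside any reported percentage.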
Figure 20. Multiple sequence alignment of bacteriophage lambda clones p80λ9.2 (ROn-1a), p80λ7.3 (ROn-1b) and p80 based on the primary data. The tRNA-like region of the ROn-1a sequence is underlined. Dashes represent deletions and asterisks represent sequence identity. The dashed arrow (--->) represents short direct repeats and the solid arrow (->) represents the 52 bp repeat. The Pol A and Pol B regions are labelled in the alignment.
The 'generic' SINE sequence is 73-500 bp in length and consists of three
domains. The 5 prime region, or tRNA-like region, has sequence identity to
transfer RNAs and contains internal RNA polymerase III promoter A and
B boxes separated by 17-60 bp. This domain ends with a characteristic CCA
motif. The central region, or tRNA-unlike region, is family specific and
variable in length. It is followed by an A+T rich or poly A region at the 3
prime end. The A-rich region varies in length from 8 bp to greater than 50 bp,
and simple-sequence repeats are often found in this region (Deininger, 1989).
The internal repeat of the ROn-1 element is also composed of three
domains (Figures 20 and 21). Figure 21 outlines the composite structure of the
ROn-1 SINE. The 5 prime domain, the tRNA-like region, is 126 bp in length
and contains two stretches of nucleotides similar to the RNA polymerase III
promoter A-box and B-box consensus sequences, separated by 64 bp.
The CCA triplet is retained 11 bp downstream from the putative B-box and
marks the end of the tRNA-like region. Since the CCA sequence is present in
tRNA molecules, but not in their tDNAs, this suggests that a tRNA
molecule was the precursor of the tRNA-like region. Although the sequence
between the putative A and B box promoter sites is longer than the usual
spacing of these elements, it shows similarities with the D-loop and the T
pseudouridine loop, and a complex potential secondary structure can be
drawn (Figure 22). The extent of similarity of the tRNA-like region to other
tRNAs is low, and a computer-assisted search for sequence identity failed to
determine precisely the parental tRNA. It is not unusual to find ancient SINEs
with tRNA regions so diverged that a secondary structure cannot be
constructed (Reed and Phillips 1995). No similarities were found with other
sequences present in the EMBL database.
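A search for putative A and B boxes of the kind described above can be sketched as a degenerate-consensus scan that tolerates mismatches. The consensus strings used below (TRGCNNAGTGG for the A box and GGTTCGANNCC for the B box) are forms commonly quoted in the literature and are assumed here; the thesis does not state which consensus was used, and the function names are hypothetical.

```python
# Minimal sketch: find the best (fewest-mismatch) match to a degenerate
# consensus and report the spacing between putative A and B boxes.
# The consensus strings are assumptions, not taken from the thesis.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}
A_BOX = "TRGCNNAGTGG"
B_BOX = "GGTTCGANNCC"


def mismatches(window, consensus):
    """Count positions where the base is not allowed by the IUPAC code."""
    return sum(base not in IUPAC[code]
               for base, code in zip(window, consensus))


def best_match(seq, consensus, start=0):
    """Return (position, mismatch_count) of the best match, or None."""
    seq = seq.upper()
    best = None
    for i in range(start, len(seq) - len(consensus) + 1):
        m = mismatches(seq[i:i + len(consensus)], consensus)
        if best is None or m < best[1]:
            best = (i, m)
    return best


def box_spacing(seq, a_box=A_BOX, b_box=B_BOX):
    """Return (a_pos, b_pos, bp between the end of A and the start of B)."""
    a = best_match(seq, a_box)
    if a is None:
        return None
    a_end = a[0] + len(a_box)
    b = best_match(seq, b_box, start=a_end)
    if b is None:
        return None
    return a[0], b[0], b[0] - a_end
```

In a diverged element such as ROn-1 the best match may carry several mismatches, so the mismatch count returned by `best_match` should be inspected before a candidate box is accepted.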
Figure 21. Schematic representation of the O. niloticus ROn-1 SINE.
5' - Pol A - Pol B - CCA - GATCTG - TGG - Poly A region - 3'
Composite structure of the O. niloticus ROn-1 SINE outlining the locations of conserved motifs: the polymerase III A and B boxes, the terminal CCA and the two conserved motifs in the tRNA-unrelated regions of tRNA-lysine SINEs. Adapted from Ohshima et al. 1996.
Figure 22. Potential secondary structure of the ROn-1 SINE tRNA-like region. The sequence is shown as DNA with regions of base-pairing indicated by '*'. The 5 prime and 3 prime termini, the aminoacyl stem (I), the dihydrouridine loop (II), the variable loop (III), the anticodon loop (IV) and the pseudouridine loop (V) are indicated. The putative RNA polymerase III promoter regions (Pol III A and B boxes) are shown in bold type.
The tRNA-like sequence is followed by a tRNA-unrelated sequence that is
173 bp in length. This region normally contains DNA sequences that are
specific to a particular species, genus or family. An important observation in
the ROn-1 elements is the presence of the GATCTG and TGG motifs. Okada and
Ohshima (1993) and Ohshima et al. (1993) aligned the consensus sequences from
the rodent B2 element, the tortoise Pol III SINE, the salmon Sma I family,
the charr Fok I family and the squid SK family and showed that all had two
conserved sequence motifs in the tRNA-unrelated region. This has only been
observed in the tRNA-lysine superfamily of related SINEs, and sequences
similar to these motifs are also seen in the U5 regions of several
retroviruses. Though normally found 7-33 bp after the CCA motif, these motifs
are found 118 bp after the CCA motif of the tRNA-like region in the ROn-1
elements. The motifs are usually separated by 10-11 bp; in this case
they are separated by 9 bp, which is within an acceptable range. It has also
been shown that position 34 after the first motif is a G and position 38 is a
T. This structure is consistent with my data.
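The spacing checks described above can be sketched as a small lookup of the two conserved motifs downstream of the CCA triplet. The second motif is taken as TGG, as drawn in Figure 21. How the thesis counts the distances is not stated, so the conventions below (bases between the end of CCA and the start of GATCTG, and bases between the end of GATCTG and the start of TGG) are assumptions, as is the function name.

```python
# Minimal sketch: locate the two conserved motifs after the CCA triplet and
# report their spacing. Counting conventions are assumptions (see lead-in).
def motif_layout(seq, cca_end, first="GATCTG", second="TGG"):
    """Return spacing of the two motifs downstream of 'cca_end', or None.

    'cca_end' is the 0-based index of the first base after the CCA triplet.
    """
    seq = seq.upper()
    i = seq.find(first, cca_end)
    if i == -1:
        return None
    first_end = i + len(first)
    j = seq.find(second, first_end)
    if j == -1:
        return None
    return {"after_cca": i - cca_end, "gap": j - first_end}


# Toy element: CCA, 7 bp spacer, GATCTG, 9 bp spacer, TGG.
toy = "CCA" + "A" * 7 + "GATCTG" + "A" * 9 + "TGG"
layout = motif_layout(toy, cca_end=3)
```

Because TGG is short and occurs often by chance, a real analysis would also restrict the search window for the second motif rather than accept the first occurrence.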
The 3 prime region of most SINEs is of variable length and is
characterized by the presence of an A-rich (poly A) or A+T rich region and/or a
simple repeating unit such as (TTG)n. The 3 prime region of the ROn-1
element is 42 bp long and contains a short poly A region. These structural
properties are similar to those of SINE sequences found in
a variety of other organisms.
4 SUMMARY AND CONCLUSIONS
In this report I have described the characteristics of a short repetitive
element from Oreochromis niloticus that has primary and potential secondary
structural similarities with tRNA-derived SINEs. The ROn-1 element shares a
number of conserved features with mammalian, fish and plant retroposons.
These properties include the presence of a tRNA-like region containing a split
RNA Pol III promoter (A-box and B-box); a primary and secondary structural
identity with tRNAs; a tRNA-unrelated region that is normally family, genus
or species specific; a 3 prime region of variable length characterized by the
presence of an A-rich region; and flanking direct repeats.
The ROn-1 retroelements are present in the O. niloticus genome at about
6000 copies per haploid genome, an estimate similar to that of SINEs in other
species. Hybridization to genomic DNA from representatives of a wide range of
cichlids, other teleosts and mammals has confirmed that the ROn-1
elements are unique to the family Cichlidae and are dispersed throughout the
genome. The p80 insert corresponds to the tRNA-unlike region of the
retroposon. This eliminated the possibility of cross hybridization with other
homologues, as would be the case if the tRNA-like region were part of the
probe, and so ensured that the hybridization reflects the family specific
nature of this retroposon.
The tRNA-related region of the ROn-1 SINE is 126 bp in length and
contains two putative RNA Pol III promoter sequences, the A-box and B-box
consensus sequences, separated by 64 bp. The B box is well conserved;
however, the A box, which starts 24 bp from the 5 prime end of the element,
is less conserved, possibly owing to the old age of the element. The sequence
between the putative Pol III A and Pol III B boxes is longer than the usual
spacing between these domains in tRNAs and other SINEs. Figure 22 represents
the potential secondary structure of the ROn-1 repetitive element. It
presents structural similarities with other tRNA-lysine derived SINEs in that
the D-loop contains the Pol III A box and the T pseudouridine loop retains
the Pol III B box. The CCA triplet is retained 11 bp downstream from the
putative B-box, marking the end of the tRNA-like region, and the anticodon
corresponds to lysine. The extent of similarity of the tRNA-like region to
other tRNAs is low, which is not unusual in older retroposons, but the
element is nevertheless a tRNA-lysine SINE (Shimoda et al. 1996a).
The tRNA-unrelated region contains DNA sequences that have been
shown to hybridize to genomic DNA of specific families, genera or species.
The presence of the GATCTG and TGG motifs, as described by Okada and Ohshima
(1993) and Ohshima et al. (1993), in this region of the retroposon indicates
that ROn-1 may be derived from the tRNA-lysine superfamily of related SINEs.
The diagnostic nucleotides G and T at positions 34 and 38 after the second
motif are characteristic of the tRNA-lysine superfamily of retroposons. Among
the progenitor tRNAs of vertebrate SINEs, tRNA-lysine is the most common
tRNA species. The 3 prime region of most SINEs is characterized by the
presence of an A-rich region. The short direct repeats flanking retroposons
are most likely target site duplications of genomic DNA generated by
repairing a staggered break formed at the insertion point and are hallmarks
of retroposition. These characteristics support the characterization of the
ROn-1 element as a retroposon.
Since no studies have been able to identify heteromorphic sex
chromosomes or sex-specific genetic markers in Oreochromis species, I
originally set out to isolate male-specific DNA sequences from Oreochromis
niloticus using the phenol enhanced reassociation technique (PERT)
subtractive hybridization procedure (Kohne et al. 1977). The utility of
subtractive hybridization procedures for isolating sex-specific DNA sequences
in tilapias is limited by the degree of enrichment for those sequences. Since
"tracer" DNA is enriched less than 100 fold by single step enrichment
techniques, it is possible for non sex-specific "tracer" sequences to
reanneal and be available for cloning, as seen in this report (Straus and
Ausubel 1990). The PERT technique is based on mixing a small amount of male
"tracer" DNA with an excess of "driver" DNA from a female, which putatively
does not contain Y-specific sequences. Although the sex determining system in
O. niloticus is generally described as XX female and XY male, with a total
diploid chromosome number of 44, it is still not clear whether the
sex-switching system is multifactorial or based on sex differentiated
chromosomes (Majumdar and McAndrew 1986). Utilizing DNA from a YY male
tilapia to selectively enrich for male specific DNA sequences using the PERT
protocol may increase the probability of obtaining male specific sequences
(Scott et al. 1989). However, failure to isolate sex-specific sequences using
subtractive hybridization procedures is no indication that the sex
chromosomes in O. niloticus are dispensable or that sex-switching is
multifactorial. Further research is required.
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. J., 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.
Antequera, F. and Bird, A., 1993. Number of CpG islands and genes in humans and mouse. Proc. Natl. Acad. Sci. USA 90:11995-11999.
Bains, W. and Temple-Smith, K., 1989. Similarity and divergence among rodent repetitive DNA sequences. J. Mol. Evol. 28:191-19.
Berg, D. E. and Howe, M. M., eds., 1989. Mobile DNA. American Society for Microbiology, Washington, DC.
Britten, R. J., Baron, W. F., Stout, D. B. and Davidson, E. H., 1988. Sources and evolution of the human Alu repeated sequences. Proc. Natl. Acad. Sci. 85:4770-4774.
Britten, R. J. and Kohne, D. E., 1968. Repeated sequences in DNA. Science 161:529-540.
Bruford, M. W. and Wayne, R. K., 1993. Curr. Opin. Genet. Dev. 3:939-943.
Brutlag, D. L., 1980. Molecular arrangement and evolution of heterochromatic DNA. Annu. Rev. Genet. 14:314-331.
Charlesworth, B., Sniegowski, P. and Stephan, W., 1994. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371(15):215-220.
Chen, T. L. and Manuelidis, L., 1989. SINEs and LINEs cluster in distinct DNA fragments of Giemsa band size. Chromosoma 98:309-315.
Chu, W., Liu, W. and Schmid, W. C., 1995. RNA polymerase III promoter and terminator elements affect Alu RNA expression. Nucleic Acids Research 23(10):1750-1757.
Coltman, D. W. and Wright, J. M., 1994. Can-SINEs: A family of tRNA-derived retroposons specific to the superfamily Canoidea. Nucleic Acids Research 22(14):2726-2730.
Daniels, G. R. and Deininger, P. L., 1983. A second major class of Alu family repeated DNA sequences in a primate genome. Nucleic Acids Research 11:7595-7610.
Daniels, G. R. and Deininger, P. L., 1985. Repeat sequence families derived from mammalian tRNA genes. Nature (London) 317:819-822.
Deininger, P. L. and Batzer, M. A., 1993. Evolution of retroposons. In: Max K. Hecht et al. (eds.) Evolutionary Biology, Vol. 27. Plenum Press, New York, pp. 157-196.
Deininger, P. L. and Daniels, G. R., 1986. The recent evolution of mammalian repetitive DNA elements. Trends in Genetics 2:76-80.
Deininger, P. L., Batzer, M. A., Hutchison, C. A. and Edgell, M. H., 1992. Master genes in mammalian repetitive DNA amplification. Trends in Genetics 8(9):307-311.
Deininger, P. L., Jolly, D. J., Rubin, C. M., Friedmann, T. and Schmidt, C. W., 1981. Base sequence studies of 300 nucleotide renatured repeated human DNA clones. J. Mol. Biol. 151: 17-33.
Deininger, P. L., SINEs: Short interspersed repeated DNA elements in higher eukaryotes. Chapter 27. Berg, D. E. and Howe, M. M. (eds.) Mobile DNA (American Society for Microbiology, Washington, DC, 1989).
Denison, R. A. and Weiner, A. M., 1982. Human U1 RNA pseudogenes may be generated by both DNA- and RNA-mediated mechanisms. Mol. Cell. Biol. 2:815-828.
Deragon, J. M., Landry, B. S., Pelissier, T., Tutois, S., Tourmente, S. and Picard, G., 1994. An analysis of retroposition in plants based on a family of SINEs from Brassica napus. J. Mol. Evol. 39:378-386.
Devlin, R. H., McNeil, B. K., Groves, T. D. D. and Donaldson, E. M., 1991. Isolation of a Y-chromosomal DNA probe capable of determining genetic sex in Chinook Salmon (Oncorhynchus tshawytscha). Can. J. Fish. Aquat. Sci. 48:1606-1612.
Di Rienzo, A., Peterson, A. C., Garza, J. C., Valdes, A. M., Slatkin, M. and Freimer, N., 1994. Mutational processes of simple sequence repeat loci in human populations. Proc. Natl. Acad. Sci. USA 91:3166-3170.
Doolittle, W. F. and Sapienza, C., 1982. Selfish DNA, The phenotype paradigm and genome evolution. Nature 284:601-603.
Dover, G. A., 1989. DNA fingerprints: victims or perpetrators of DNA turnover. Nature 342:347-348.
Duffy, A. J., Coltman, D. W. and Wright, J. M., 1995. Microsatellites at a common site in the second ORF of L1 elements in mammalian genomes. Mammalian Genome 7:386-387.
Feinberg, A. P. and Vogelstein, B., 1983. A technique for radiolabelling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132:6-13.
Feinberg, A. P. and Vogelstein, B., 1984. A technique for radiolabelling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 137:266-267.
Fink, G. R., Boeke, J. D. and Garfinkel, D. J., 1986. The mechanism and consequences of retroposition. Trends in Genetics. May 1986:118-123.
Finnegan, D. J., 1989. Eukaryotic transposable elements and genome evolution. Trends in Genetics. 5(4):103-106.
Finnegan, D. J., 1992. Transposable elements. Current Opinion in Genetics and Development 2:861-867.
Frank, J. P. C., Harris, A. S., Bentzen, P., Wright, E. M. and Wright, J. M., 1991. Organization and evolution of satellite, minisatellite and microsatellite DNAs in teleost fish. In MacLean, N. (ed.), Oxford Surveys on Eukaryotic Genes, Oxford University Press, pp. 51-82.
Frengen, E., Thompsen, P., Kristensen, T., Kran, S., Miller, R. and Davies, W., 1991. Porcine SINEs: Characterization and use in species specific amplification. Genomics 10:949-956.
Fryer, G. and Iles, T. D., 1972. The Cichlid fishes of the Great Lakes of Africa: Their biology and evolution. Oliver and Boyd, Edinburgh.
Fuhrman, S. A., Deininger, P. L., LaPorte, P., Friedman, T. and Geiduschek, E. P., 1981. Analysis of transcription of the human Alu family ubiquitous repeating element by eukaryotic RNA polymerase III. Nucleic Acids Research 9:6439-6456.
Goode, B. L. and Feinstein, C., 1992. "Speedprep" purification of templates for double-stranded DNA sequencing. Biotechniques 12:374-375.
Griffiths, R. and Holland, P. W. H., 1990. A novel avian W chromosome DNA repeat sequence in the lesser black-backed gull (Larus fuscus). Chromosoma 99:243-250.
Hammer, M. F., 1994. A recent insertion of an Alu element on the Y chromosome is a useful marker for human population studies. Mol. Biol. Evol. 11(5):749-761.
Haynes, S. R., Toomey, T. P., Leinwand, L. and Jelinek, W. R., 1981. Mol. Cell. Biol. 1:573-583.
He, H., Rovira, C., Pimentel, S., Liao, C. and Edstrom, J., 1995. Polymorphic SINEs and Chironomids with DNA derived from the insertion site. J. Mol. Biol. 245:34-42.
Higgins, D. G. and Sharp, P. M., 1988. CLUSTAL: A package for performing multiple sequence alignment on a microcomputer. Gene 73:237-244.
Hirano, H. Y., Mochizuki, K., Umeda, M., Ohtsubo, E. and Sano, Y., 1994. Retrotransposition of a plant SINE into the wx locus during evolution of rice. J. Mol. Evol. 38:132-137.
Hutchison, C. A., Hardies, S. C., Loeb, D. D., Shehee, W. R. and Edgell, M. H., LINEs and related retroposons: Long interspersed repeated sequences in the eukaryotic genome. In: Mobile DNA, D. E. Berg and M. M. Howe, eds. (American Society for Microbiology, Washington DC, 1989). pp. 593-617.
Izsvak, Z., Ivics, Z., Estefania, D., Fahrenkrug, S. C. and Hackett, P. B., 1996. DANA elements: A family of composite, tRNA derived short interspersed DNA elements associated with mutational activities in Zebrafish. Proc. Natl. Acad. Sci. USA 93:1077-1081.
Jagadeeswaran, P., Forget, B. G. and Weissman, S. M., 1981. Short interspersed repetitive DNA elements in eukaryotes: transposable DNA elements generated by reverse transcription of RNA polymerase III transcripts. Cell 26:141-142.
Jeffreys, A. J., Wilson, V. and Thein, S. L., 1985a. Hypervariable "minisatellite" regions in human DNA. Nature 314:67-73.
Jeffreys, A. J., Wilson, V. and Thein, S. L., 1985b. Individual specific "fingerprints" of human DNA. Nature 316:76-79.
Jelinek, W. R. and Schmid, C. W., 1982. Repetitive sequences in eukaryotic DNA and their expression. Ann. Rev. Biochem. 51:813-844.
Joomyeong, K., Martignetti, J. A., Shen, M. R., Brosius, J. and Deininger, P., 1994. Rodent BC1 RNA gene as a master gene for ID element amplification. Proc. Natl. Acad. Sci. USA 91:3607-3611.
Jurka, J. and Smith, 1988. A fundamental division in the human Alu family of repeated sequences. Proc. Natl. Acad. Sci. USA 85:475-478.
Jurka, J., Zietkiewicz, E. and Labuda, D., 1995. Ubiquitous mammalian-wide interspersed repeats (MIRs) are molecular fossils from the Mesozoic era. Nucleic Acids Research 23(1):170-175.
Kachroo, P., Leong, S. A. and Chattoo, B. B., 1995. Mg-SINE: A short interspersed nuclear element from the rice blast fungus, Magnaporthe grisea. Proc. Natl. Acad. Sci. USA 92:11125-11129.
Kalb, V.F., Glasser, S., King, D. and Lingrel, J. B., 1983. A cluster of repetitive elements within a 700bp region in the mouse genome. Nucleic Acids Research 11:2177-2184.
Kaukinen, J. and Varvio, S., 1992. Artiodactyl retroposons: Association with microsatellites and use in SINE morph detection by PCR. Nucleic Acids Research 20(12):2955-2958.
Kido, Y., Ono, M., Yamaki, T., Matsumoto, K., Murata, S., Saneyoshi, M. and Okada, N., 1991. Shaping and reshaping of salmonid genomes by amplification of t-RNA-derived retroposons during evolution. Proc. Natl. Acad. Sci. USA 88:2326-2330.
Kido, Y., Himberg, M., Takasaki, N. and Okada, N., 1994. Amplification of distinct subfamilies of short interspersed elements during evolution of the Salmonidae. J. Mol. Biol. 241:633-644.
Kim, J., Martignetti, J. A., Shen, M. R., Brosius, J. and Deininger, P., 1994. Rodent BC1 RNA gene as a master gene for ID element amplification. Proc. Natl. Acad. Sci. USA 91:3607-3611.
Kit, S., 1961. Equilibrium centrifugation in density gradients of DNA preparations from animal tissues. J. Mol. Biol. 3:711-716.
Kohne, D. E., Levinson, S. A. and Byers, M. J., 1977. Room temperature method for increasing the rate of DNA reassociation by many thousand fold: the phenol emulsion reassociation technique. Biochemistry 16(24):5329-5341.
Krane, D. E., Clark, A. G., Cheng, J. F. and Hardison, R. C., 1991. Subfamily relationships and clustering of rabbit C repeats. Mol. Biol. Evol. 8:1-30.
Krayev, A. S., Kramerov, D. A., Skryabin, K. G., Ryskov, A. P., Bayev, A. A. and Georgiev, G. P., 1980. Nucleic Acids Research 8:1201-1215.
Krayev, A. S., Markusheva, T. V., Kramerov, D. A., Ryskov, A. P., Skryabin, K. G., Bayev, A. A. and Georgiev, G. P., 1982. Nucleic Acids Research 10:7416-7475.
Kunkel, L. M., Monaco, A. P., Middleworth, H. D., Ochs, H. D. and Latt, S. A., 1985. Specific cloning of DNA fragments absent from the DNA of a male patient with an X chromosome deletion. Proc. Natl. Acad. Sci. USA 82:4778-4782.
Kunkel, L. M., Smith, K. D. and Boyer, S. H., 1976. Human Y-chromosome specific reiterated DNA. Science 191:1189-1190.
Lawrence, C. B., McDonnell, D. P. and Ramsey, W. J., 1985. Analysis of repetitive sequence elements containing tRNA-like sequences. Nucleic Acids Research 13:4239-4252.
Lehrman, M. A., Goldstein, J. L., Russell, D. W. and Brown, M. S., 1987. Duplication of seven exons in the LDL receptor gene caused by Alu-Alu recombination in a subject with familial hypercholesterolemia. Cell 48:827-835.
Levinson, G. and Gutman, G. A., 1987. Slipped strand mispairing: a major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4:203-221.
Liu, W. and Schmid, 1993. Proposed roles for the DNA methylation in Alu transcriptional repression and mutational inactivation. Nucleic Acids Research 21(6):1331-1359.
Liu, W., Chu, W., Choudary, P. V. and Schmid, W. C., 1995. Cell stress and translational inhibitors transiently increase the abundance of mammalian SINE transcripts. Nucleic Acids Research 23(10):1758-1765.
Majumdar, K. C. and McAndrew, B. J., 1986. Relative DNA content of somatic nuclei and chromosomal studies in three genera, Tilapia, Sarotherodon, and Oreochromis of the tribe Tilapiini (Pisces, Cichlidae). Genetica 68:175-188.
Mathias, S. L., Scott, A. F., Kazazian, H. H. Jr., Boeke, J. D. and Gabriel, A., 1991. Reverse transcriptase encoded by a human transposable element. Science 254:1808-1810.
McAndrew, B. J. and Majumdar, K. C., 1983. Tilapia stock identification using electrophoretic markers. Aquaculture 30:249-261.
McConnell, S. K. J., Frank, J. P. C. and Wright, J. M., 1997. Molecular genetic markers and their application to the Tilapias. In: Reviews in Applied Genetic of Tilapias, G. Mair, ed., International Centre for Living Aquatic Resource Management, Manila, Philippines. In press.
Miklos, G. L. C. Sequencing and manipulating highly repeated DNA. In Dover, G. A. and Flavell, R. B. (Eds.) Genome Evolution and Phenotypic Variations. Academic Press, London, 1982, pp. 41-68.
Miklos, G. L. C. Localized highly repetitive DNA sequences in vertebrate and invertebrate genomes. In MacIntyre, R. J. (Ed.), Molecular Evolutionary Genetics. Plenum, New York, 1985, pp. 241-321.
Minnick, M. F., Stillwell, L. C., Heineman, J. M. and Stiegler, G. L., 1992. A highly repetitive DNA sequence possibly unique to Canids. Gene 110:235-238.
Mochizuki, K., Umeda, M., Ohtsubo, H. and Ohtsubo, E., 1992. Characterization of a plant SINE, p-SINE1, in rice genomes. Japanese Journal of Genetics 57:155-166.
Murata, S., Takasaki, N., Saitoh, M. and Okada, N., 1993. Determination of the phylogenetic relationships among Pacific salmonids by using short interspersed elements (SINEs) as temporal landmarks of evolution. Proc. Natl. Acad. Sci. USA 90:6995-6999.
Murata, S., Takasaki, N., Saitoh, M., Tachida, H. and Okada, N., 1996. Details of retropositional genome dynamics that provide a rationale for a generic division: The distinct branching of all the Pacific salmon and trout (Oncorhynchus) from the Atlantic salmon and trout (Salmo). Genetics 142:915-926.
Ohshima, K., Hamada, M., Terai, Y. and Okada, N., 1996. The 3 prime ends of tRNA-derived short interspersed repetitive elements are derived from the 3 prime ends of long interspersed repetitive elements. Mol. Cell. Biol. 16(7):3756-3764.
Ohshima, K. and Okada, N., 1994. Generality of the tRNA origin of short interspersed repetitive elements (SINEs). J. Mol. Biol. 243:25-37.
Ohshima, K., Koishi, R., Matsuo, M. and Okada, N., 1993. Several short interspersed repetitive elements (SINEs) in distant species may have originated from a common ancestral retrovirus: Characterization of a squid SINE and a possible mechanism for generation of tRNA derived retroposons. Proc. Natl. Acad. Sci. USA 90:6260-6264.
Okada, N. and Ohshima, K., 1993. A model for the mechanism of initial generation of short interspersed elements (SINEs). J. Mol. Evol. 37:167-170.
Okada, N. and Ohshima, K., 1995. Evolution of tRNA-derived SINEs, p. 61-79. In R. J. Maraia (ed.). The impact of short interspersed elements (SINEs) on the host genome. R. G. Landes Co., Austin, Texas.
Okada, N., 1991a. SINEs. Current Opinion in Genetics and Development. 1:498-504.
Okada, N., 1991b. SINEs: Short Interspersed Repeated Elements of the Eukaryotic Genome. TREE 6(11):358-361.
Orgel, L. E. and Crick, F. H. C., 1980. Selfish DNA: The ultimate parasite. Nature 284:604-607.
Quentin, Y., 1988. The Alu family developed through successive waves of fixation closely connected with primate lineage history. J. Mol. Evol. 27:194-202.
Quentin, Y., 1989. Successive waves of fixation of B1 variants in rodent lineage history. J. Mol. Evol. 28:299-305.
Quentin, Y., 1994. A master sequence related to a free left Alu monomer (FLAM) at the origin of the B1 family in rodent genomes. Nucleic Acids Research 22(12):2222-2227.
Rasmussen, N., Rossen, L. and Giese, H., 1993. SINE-like properties of a highly repetitive element in the genome of the obligate parasitic fungus Erysiphe graminis f.sp. hordei. Mol. Gen. Genet. 239:298-303.
Reed, K. M. and Phillips, R. B., 1995. Molecular characterization and cytogenetic analysis of highly repeated DNAs of lake trout, Salvelinus namaycush. Chromosoma 104(4):242-251.
Rogers, J. H., 1985. The origin and evolution of retroposons. Internat. Rev. Cytol. 93:187-279.
Rubin, C. M., Leeflang, E. P., Rinehart, F. P. and Schmidt, C. W., 1993. Paucity of novel short interspersed repetitive element (SINE) families in human DNA and isolation of a novel MER repeat. Genomics 18:322-328.
Rubin, C. M., Houck, C. M., Deininger, P. L., Friedmann, T. and Schmidt, C. W., 1980. Partial nucleotide sequence of the 300 nucleotide interspersed repeated human DNA sequences. Nature 284:372-374.
Sakagami, M., Ohshima, K., Mukoyama, H., Yasue, H. and Okada, N., 1994. A novel tRNA species as an origin of Short Interspersed repetitive Elements (SINEs). J. Mol. Biol. 239:731-735.
Sakamoto, K. and Okada, N., 1985. Rodent type 2 Alu family, Rat Identifier sequence, Rabbit C Family and Bovine or Goat 73-bp repeat may have evolved from tRNA genes. J. Mol. Evol. 22:134-140.
Sambmok, J., Fritsch, E. F., and Maniatis, T.,1989. Molecular Clonhg A Laboratory Manual, Second Edition, Cold Spring Harbour Laboratory, Cold Spring Harbour, New York.
Sanger, F., Nicklen, S. and Coulson, A. R., 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. U.S.A. 74:5463-5467.
Schlotterer, C. and Tautz, D., 1992. Slippage synthesis of simple sequence DNA. Nucleic Acids Research 20(2):211-215.
Schmid, C. and Maraia, T., 1992. Transcriptional regulation and transpositional selection of active SINE sequences. Curr. Opin. Gen. Dev. 2:874-882.
Scott, A. G., Penman, P. J., Beardmore, J. A., and Skibinski, D. O. F., 1989. The "YY" supermale in Oreochromis niloticus (L.) and its potential in aquaculture. Aquaculture 78:237-251.
Shimoda, N., Chevrette, M., Kikuchi, Y., Hotta, Y. and Okamoto, H., 1996a. Mermaid: A family of short interspersed repetitive elements widespread in vertebrates. Biochem. Biophys. Res. Comm. 220:226-232.
Shimoda, N., Chevrette, M., Kikuchi, Y., Hotta, Y. and Okamoto, H., 1996b. Mermaid: A family of short interspersed repetitive elements is useful for zebrafish genome mapping. Biochem. Biophys. Res. Comm. 220:233-237.
Singer, M. F., 1982. SINEs and LINEs: Highly repeated short and long interspersed sequences in mammalian genomes. Cell 28:433.
Singer, M. F. and Berg, P., 1991. Genes and genomes. A changing perspective. Blackwell, Oxford.
Slagel, V., Flemming, E., Traina-Dorge, V., Bradshaw, H. and Deininger, P. L., 1987. Clustering and subfamily relationships of the Alu family in the human genome. Mol. Biol. Evol. 4:19-29.
Smit, A. F. A. and Riggs, A. D., 1995. MIRs are classic, tRNA-derived SINEs that amplified before the mammalian radiation. Nucleic Acids Research 23(1):98-102.
Southern, E. M., 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98:503-517.
Stallings, R. L., Ford, A. F., Nelson, D., Torney, D. C., Hildebrand, C. E. and Moyzis, R. K., 1991. Evolution and distribution of (GT)n repetitive sequences in mammalian genomes. Genomics 10:807-815.
Stiassny, M. L. J., 1991. Phylogenetic interrelationships of the family Cichlidae: an overview. p. 1-35. In Cichlid Fishes: Behavior, Ecology and Evolution. Edited by M. H. A. Keenleyside. Chapman and Hall, London.
Straus, D., and Ausubel, F. M., 1990. Genomic subtraction for cloning DNA corresponding to deletion mutations. Proc. Natl. Acad. Sci., USA 87:1889-1893.
Tachida, H. and Iizuka, M., 1993. A population genetic study of the evolution of SINEs. I. Polymorphism with regard to the presence or absence of an element. Genetics 133:1023-1030.
Takasaki, N., Murata, S., Saitoh, M., Kobayashi, T., Park, L. and Okada, N., 1994. Species-specific amplification of tRNA-derived short interspersed repetitive elements (SINEs) by retroposition: A process of parasitization of entire genomes during the evolution of salmonids. Proc. Natl. Acad. Sci., USA 91:10153-10157.
Takasaki, N., Park, L., Kaeriyama, M., Gharrett, A. J. and Okada, N., 1996. Characterization of species-specifically amplified SINEs in three Salmonid species - Chum Salmon, Pink Salmon and Kokanee: The local environment of the genome may be important for the generation of a dominant source gene at a newly retroposed locus. J. Mol. Evol. 42:103-116.
Tautz, D., 1989. Hypervariability of simple sequences as a general source for polymorphic DNA markers. Nucleic Acids Research 17:6463-6471.
Trewavas, E., 1982. Generic groupings of Tilapiini used in aquaculture. Aquaculture 27:79-81.
Ullu, E., and Tschudi, C., 1984. Alu sequences are processed 7SL RNA genes. Nature 312:171.
Van der Vlugt, H. H. J. and Lenstra, J. A., 1995. SINE elements of carnivores. Mammalian Genome 6:49-51.
Wallace, M. R., Anderson, L. B., Saulino, A. M., Gregory, P. E., Glover, T. W. and Collins, F. S., 1991. A de novo Alu insertion results in neurofibromatosis type 1. Nature 353:864-866.
Weiner, A. M., 1980. An abundant cytoplasmic 7S RNA is complementary to the dominant interspersed middle repetitive DNA sequence family in the human genome. Cell 22:209-218.
Weiner, A. M., Deininger, P. L. and Efstratiadis, A., 1986. Nonviral retroposons: Genes, Pseudogenes, and transposable elements generated by the reversed flow of genetic information. Ann. Rev. Biochem. 55:631-661.
Westneat, D. F., Noon, W. A., Reeve, H. H., and Aquadro, C. F., 1988. Improved hybridization conditions for DNA "fingerprints" probed with M13. Nucleic Acids Research 16:4161.
Wichman, H. A., Van Den Bussche, R. A., Hamilton, M. J. and Baker, R. J., 1992. Transposable elements and the evolution of genome organization in mammals. Genetica 86:287-293.
Willard, C., Nguyen, H. T., and Schmid, C. W., 1987. Existence of at least three distinct Alu subfamilies. J. Mol. Evol. 26:180.
Wright, J. M., 1994. Mutation at VNTRs: Are minisatellites the evolutionary progeny of microsatellites? Genome 37:345-347.
Wright, J. M., DNA fingerprinting of fishes. In Biochemistry and Molecular Biology of Fishes. Vol. 2. Edited by P. Hochachka and T. Mommsen. Elsevier, New York, 1993, pp. 57-91.
Yoshioka, Y., Matsumoto, S., Kojima, S., Oshima, K., Okada, N. and Machida, Y., 1993. Molecular characterization of a short interspersed repetitive element from tobacco that exhibits sequence homology to specific tRNAs. Proc. Natl. Acad. Sci. 90:6562-6566.