katsu reptilian genome cgr 2009


  • 8/2/2019 Katsu Reptilian Genome CGR 2009


    Genome Composition

    Cytogenet Genome Res 2009;127:79–93

    DOI: 10.1159/000297715

    From Reptilian Phylogenomics to Reptilian Genomes: Analyses of c-Jun and DJ-1 Proto-Oncogenes

    Y. Katsu (a, b), E.L. Braun (c), L.J. Guillette, Jr. (c), T. Iguchi (a)

    (a) Division of Molecular Environmental Endocrinology, Okazaki Institute for Integrative Bioscience, National Institute for Basic Biology, National Institutes of Natural Sciences, and Department of Basic Biology, School of Life Science, The Graduate University for Advanced Studies, Okazaki; (b) Department of Biosystems Science, Graduate School of Life Science, Hokkaido University, Sapporo, Japan; (c) Department of Biology, University of Florida, Gainesville, Fla., USA

    Key Words

    Compositional convergence · CpG island · Isochore · Phylogenetic shadowing · Taxon sampling · Transversion analyses

    Abstract

    Genome projects have revolutionized our understanding of both molecular biology and evolution, but there has been a limited collection of genomic data from reptiles. This is surprising given the pivotal position of reptiles in vertebrate phylogeny and the potential utility of information from reptiles for understanding a number of biological phenomena, such as sex determination. Although there are many potential uses for genomic data, one important and useful approach is phylogenomics. Here we report cDNA sequences for the c-Jun (JUN) and DJ-1 (PARK7) proto-oncogenes from 3 reptiles (the American alligator, Nile crocodile, and Florida red-belly turtle), show that both genes are expressed in the alligator, and integrate them into analyses of their homologs from other organisms. With these taxa it was possible to conduct analyses that include all major vertebrate lineages. Analyses of c-Jun revealed an unexpected but well-supported frog-turtle clade while analyses of DJ-1 revealed a topology largely congruent with expectation based upon other data. The conflict between the c-Jun topology and expectation appears to reflect the overlap between c-Jun and a CpG island in most taxa, including crocodilians. This CpG island is absent in the frog and turtle, and convergence in base composition appears to be at least partially responsible for the signal uniting these taxa. Noise reduction approaches can eliminate the unexpected frog-turtle clade, demonstrating that multiple signals are present in the c-Jun alignment. We used phylogenetic methods to visualize these signals; we suggest that examining both historical and non-historical signals will prove important for phylogenomic analyses.

    Copyright © 2010 S. Karger AG, Basel

    Published online: March 17, 2010

    Edward L. Braun, Department of Biology, University of Florida, Gainesville, FL 32611 (USA). Tel. +1 352 846 1124, Fax +1 352 392 3704, E-Mail ebraun68@ufl.edu

    © 2010 S. Karger AG, Basel 1424–8581/09/1274–0079$26.00/0

    Accessible online at: www.karger.com/cgr

    Y.K. and E.L.B. contributed equally to this paper.

    Advances in sequence data acquisition and bioinformatics have revolutionized many areas of biology and accelerated research focused on many groups of organisms. This has resulted in many complete genome sequences [Janssen et al., 2005], including complete drafts of vertebrate genomes [Thomas, 2008; Hubbard et al., 2009]. However, reptiles have largely missed the genomic revolution. This is surprising given the pivotal place occupied

    by reptiles in vertebrate evolution, with both mammals and birds having reptile-like ancestors (fig. 1). The first genome sequence obtained from a member of the reptile clade was actually from a bird, the chicken [International Chicken Genome Sequencing Consortium, 2004], rather than a traditional reptile. However, the chicken has an evolutionary distance from mammals that appears to be nearly ideal to highlight functional elements in genome comparisons, suggesting information from other members of the reptile clade will also prove useful. Thus, reptilian genomes may be especially informative for comparative genomics and they are likely to be critical for understanding the evolution of vertebrate genomes.

    There are many approaches to comparative genomics, ranging from studies of structural variation in chromosomes to fine-scale comparisons of specific sequences. Phylogenomics is the field within comparative genomics focused on using evolutionary history to predict gene function [Eisen et al., 1997; Eisen, 1998]. Phylogenomics has expanded to include phylogenetic estimation using large-scale (or even genome-scale) alignments [e.g. Delsuc et al., 2005; Hackett et al., 2008; Wiens et al., 2008]. Although these 2 different uses of the term phylogenomics appear very different, they actually have the potential to be complementary. Ultimately, the conclusions of phylogenomic analyses (in the original sense of the term) reflect the comparison of individual gene trees to the species tree [Eisen, 1998] and large-scale phylogenetic datasets have the greatest potential to provide an accurate species tree estimate (assuming appropriate analyses) [see Jeffroy et al., 2006]. Indeed, large-scale datasets could be especially important in reptiles, where substantial debate regarding relationships remains (see fig. 1).

    Fig. 1. Reptilian phylogeny. A Traditional phylogeny [e.g. Zardoya and Meyer, 2001] with approximate divergence times [Benton and Donoghue, 2007]. Patterns of temporal fenestration traditionally used to classify amniotes are illustrated using stylized skulls. Extinct groups, like the non-mammalian synapsids (mammal-like reptiles), are included to illustrate the pivotal position of reptiles in vertebrate phylogeny. Some extinct groups (the non-mammalian synapsids, extinct parareptilia, and non-avian dinosaurs) are paraphyletic with respect to the extant taxa, but they are shown as a single lineage in the interest of simplicity. This topology is supported by morphology [Gauthier et al., 1988; Lee, 2001], combined morphological and molecular evidence [Eernisse and Kluge, 1993; Lee, 2001], and information about development [Werneburg and Sánchez-Villagra, 2009]. Hypothesized changes in isochore structure (see text) are marked on the tree for 2 alternative models. The single-transition model postulates a shift to GC-rich isochore structure at the base of the amniotes followed by a reversion to the GC-poor state in turtles [see Chojnowski and Braun, 2008], while the 2-transition model postulates 2 independent shifts to a GC-rich isochore structure. B Alternative topology supported by many molecular studies [e.g. Rest et al., 2003; Iwabe et al., 2005; Hugall et al., 2007; Jiang et al., 2007] that was also advocated by Løvtrup [1977]. Most recent analyses support topologies A or B, so we view these topologies as the best available estimates of the species tree. C Alternative topology with a turtle-lepidosaur clade supported by a morphological study [deBraga and Rieppel, 1997]. D A turtle-crocodilian clade is supported by some molecular studies [e.g. Hedges and Poling, 1999; Mannen and Li, 1999] and clustering of genomic signatures [Shedlock et al., 2007]. E A bird-turtle clade was suggested in an analysis of gene duplication and loss [Cotton and Page, 2002], although support for the clade was limited.

    Phylogenomic analyses can reveal genes that have undergone functional changes by highlighting gene duplications, gene losses, and major changes in evolutionary rates. It is even possible to extract functional information about specific gene products when the gene tree is congruent with the species tree by focusing on specific sites to identify those that have undergone evolutionary rate shifts or positive selection [e.g. Gaucher et al., 2002; Georgelis et al., 2008]. However, many individual gene trees show differences from the species tree and these differences arise for a number of reasons. Several of these reasons for gene tree-species tree incongruence, such as gene duplication and loss [Maddison, 1997; Page and Charleston, 1997], apply even when the estimate of the gene tree is accurate. However, the incongruence could also reflect biased gene tree estimation [see Jeffroy et al., 2006] or sampling error due to the limited amount of information available in alignments of individual genes [see Braun and Grotewold, 2001; Chojnowski et al., 2008]. Sampling error may be especially important to consider, since fairly long sequences are necessary to have a high probability of recovering trees that match the true tree exactly [Chojnowski et al., 2008]. However, bias is potentially interesting, since it can arise for several reasons, including both genome-wide (e.g. global acceleration of evolutionary rates that might lead to long-branch attraction or genome-wide convergence in base composition) and locus-specific effects. It is unclear whether locus-specific effects are similar for functionally related genes, but if functionally related genes do exhibit similar biases then analyses focused on revealing those biases could provide another source of phylogenomic information.

    Reptilian Genomes and Vertebrate Phylogenomics

    Reptilian genomics is likely to be especially useful for vertebrate phylogenomics, in part because divergences among the major reptilian lineages are more ancient than those among the major avian and mammalian lineages [Benton and Donoghue, 2007]. Thus, adding reptilian sequences adds more diversity than adding avian and mammalian sequences. Increased taxon sampling (the inclusion of more sequences) is generally thought to have a positive impact on phylogenomic analyses [Pollock et al., 2000; Heath et al., 2008]. However, adding reptilian sequence data may be especially interesting given the diverse patterns of chromosomal and genomic evolution they exhibit [see Olmo, 2008; Organ et al., 2008]. For example, the diversity of reptilian sex determination systems [see Modi and Crews, 2005; Organ and Janes, 2008] will provide insights into the mechanisms of environmental sex determination and illuminate sex chromosome evolution. Reptiles could have also retained more ancestral features in their genomes, since mammalian and avian development and morphology are highly modified relative to their reptilian (or reptile-like) ancestors. The final argument assumes that rates of morphological and molecular evolution are related, although support for this hypothesis remains unclear [for contrasting views see Omland, 1997; Bromham et al., 2002]. Furthermore, the fact that squamates appear to exhibit a high rate of evolution [Gorr et al., 1998; Hughes and Mouchiroud, 2001] and turtles exhibit highly modified development and morphology raises further questions about the existence of a correlation between the rates of morphological and molecular evolution in reptiles. However, it will be impossible to test the hypothesis without obtaining additional reptilian genome sequences. Even if reptilian genomes have not retained a larger number of ancestral genomic features, it is clear that reptilian genomics will provide unique information about the evolution and function of vertebrate genomes.

    Despite these reasons to suspect that reptilian genome sequences are likely to be especially informative, reptilian genomics has languished. Major exceptions include the green anole (Anolis carolinensis) genome project [Losos et al., 2005] and the inclusion of a turtle (the painted turtle, Chrysemys picta) in the Evolution of the Proteome initiative [Gerhart et al., 2005]. However, many questions about reptilian genome evolution remain, and it will only be possible to answer these questions by obtaining more reptilian genome sequences. Such an expanded set of reptilian genome sequences was recently proposed [Genome 10K Community of Scientists, 2009], although it is far from clear when the bioinformatic infrastructure necessary to assemble and annotate thousands of vertebrate genome sequences will exist.


    Toward Reptilian Phylogenomics – Analyses of Proto-Oncogenes

    Genomic approaches have already had an important impact on our understanding of reptiles despite the availability of a single reptilian genome. Additional reptilian genome sequences will no doubt be generated, but expressed sequence tags (ESTs) from members of all major reptilian lineages [e.g. Qinghua et al., 2006; Chojnowski et al., 2007; Chojnowski and Braun, 2008; Leo et al., 2009] have already provided substantial information. Although ESTs are less informative than a draft genome sequence (or ESTs and the genome combined) they can greatly facilitate gene discovery. The power to identify genes using database searches rather than the laborious methods used in the past has made previously inconceivable research relatively straightforward. To illustrate this, we searched American alligator (Alligator mississippiensis) ESTs for proto-oncogene sequences, an important class of genes with a number of distinct functions in the biology of healthy organisms that play a major role in cancer [Strausberg et al., 2004], although specific oncogene functions may differ among clades. Many approaches can be used to study oncogenes and proto-oncogenes; here we use a phylogenomic approach focused on 2 novel reptilian proto-oncogenes.

    cDNAs encoding the c-Jun (JUN) and DJ-1 (PARK7) proto-oncogenes were identified by searching alligator ESTs collected by our group. Both proto-oncogenes play a role in sexual development in other vertebrates, which is important because mechanisms of sexual development vary among reptiles (alligators exhibit temperature-dependent sex determination, TSD, like all crocodilians [Lang and Andrews, 1994]). However, c-Jun and DJ-1 play different roles in mammalian sexual development and responses to steroid hormones. c-Jun plays a role in estrogen responses, since estrogen receptors (ERs) can act through activating protein-1 (AP-1) complexes [Cheung et al., 2005], transcription factors that include Jun proteins [Vogt, 2001; Eferl and Wagner, 2003]. In contrast, DJ-1 plays a role in androgen receptor (AR) regulation [Takahashi et al., 2001; Niki et al., 2003]. However, both genes are multifunctional, with DJ-1 also playing a role in resistance to oxidative stress [Kim et al., 2005] and c-Jun acting as a transcriptional regulator for many different genes [Vogt, 2001].

    To complement this alligator gene discovery effort, we also obtained c-Jun and DJ-1 homologs from 2 other reptiles exhibiting TSD, the Nile crocodile (Crocodylus niloticus) and the Florida red-belly turtle (Pseudemys nelsoni), and we examined patterns of mRNA accumulation for both proto-oncogenes in the adult female alligator. We present phylogenomic analyses, using methods such as phylogenetic networks [Huson and Bryant, 2006] to highlight conflicting signals in sequence alignments. The persistent difficulties associated with establishing a robust reptilian phylogeny, especially the position of the turtles (fig. 1), also make it important to consider the possibility that trees differ from expectation due to sampling error. Thus, we examined the evidence for conflicting signals in these proto-oncogene sequences and attempted to explain their origin in the context of the complete genome.

    Materials and Methods

    American alligator and Florida red-belly turtle tissues were collected on Lake Woodruff, FL, USA and Mr. Albert Pretorius generously provided Nile crocodile tissues. The alligator and turtle tissues were collected with permits from the Florida Fish and Wildlife Conservation Commission and the US Fish and Wildlife Service, and crocodile tissues were collected under permit on the Thaba Kwena Crocodile Farm in the Republic of South Africa in 2004.

    Full-length c-Jun and DJ-1 clones were identified by TBLASTN [Altschul et al., 1997] searches of alligator ESTs (those from Chojnowski et al. [2007] and additional unpublished ESTs) and sequenced in their entirety. Turtle and crocodile c-Jun and DJ-1 homologs were amplified from total ovarian RNA (2.5 μg) isolated using the RNeasy kit (QIAGEN, Chatsworth, Calif., USA). The RNA was reverse transcribed using SuperScript II (Invitrogen, Carlsbad, Calif., USA) and an oligo(dT) primer and the relevant genes were amplified with JUN-1 (5′-ATGASTGCAAAGATGGAGCCTACYTTC-3′), JUN-2 (5′-TCAAAACGTYTGCAACTGYTGTGTGAG-3′), DJ-1 (5′-ATGKCYKCRAAAAGAGCSTTRGTRATTCTG-3′), and DJ-2 (5′-TYARTCTTTCARTATWAGKGGWGMCTTCAC-3′) under standard conditions (32 cycles; 94 °C for 30 s, 50 °C for 30 s, 72 °C for 1 min). Amplicons were cloned into pGEM-T Easy (Promega, Madison, Wisc., USA) and multiple clones sequenced with BigDye™ Terminator Cycle Sequencing mix (Applied Biosystems, Foster City, Calif., USA) using an ABI PRISM 377 automatic sequencer (Applied Biosystems).
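The amplification primers listed above contain IUPAC ambiguity codes (S, Y, K, R, W, M), so each primer stands for a pool of oligonucleotides rather than a single sequence. As a rough illustration, and not part of the authors' protocol, the following Python sketch counts and enumerates the sequences that a degenerate primer represents. The function names `degeneracy` and `expand` are hypothetical.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes (standard definitions).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    jun1 = "ATGASTGCAAAGATGGAGCCTACYTTC"  # JUN-1 primer as given in the text
    print(jun1, "encodes", degeneracy(jun1), "sequences")
    for seq in expand(jun1):
        print(seq)
```

Enumerating the pool is only practical for modestly degenerate primers, since the count grows multiplicatively with each ambiguous position.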

    Transcript accumulation in alligator tissues was examined by isolating total RNA from adult female alligator stomach, kidney, intestine, gonad, heart, brain and lung with RNeasy. Total RNA (2.5 μg) was reverse transcribed as described previously and transcripts were amplified using primers for β-actin (actin-1, 5′-GCAAAAGAGGTATCCTGACCCTG-3′; actin-2, 5′-CCTGACCATCAGGGAGTTCATAG-3′), c-Jun (Jun-1, 5′-CTCATCATCCAGTCCAGCAA-3′; Jun-2, 5′-CCAATCTGGCAATCCTTTCC-3′) and DJ-1 (DJ-1, 5′-ATGAGAAGGGCTGGAATCAAG-3′; DJ-2, 5′-GTTTTCAGAGTACTTGTAGTG-3′) under standard conditions (28 cycles; 94 °C for 30 s, 55 °C for 30 s, 72 °C for 1 min) and visualized on 1.5% agarose gels.

    Novel sequences were deposited in DDBJ with accession numbers AB195248–AB195250 (c-Jun) and AB195251–AB195253 (DJ-1). Sequences were aligned by eye using MacClade 4.0 [Maddison and Maddison, 2002]. PAUP* 4.0b10 [Swofford, 2003] was used to identify optimal trees under the maximum-parsimony (MP) criterion and to conduct distance analyses (the generation of distance matrices and neighbor-joining trees). Phylogenetic networks were generated by the NeighborNet [Bryant and Moulton, 2004] method as implemented in SplitsTree4 [Bryant and Moulton, 2004]. RAxML 7.0.4 [Stamatakis, 2006] was used to identify optimal trees under the maximum-likelihood (ML) criterion using standard (stationary) models of sequence evolution and nhPhyML (http://pbil.univ-lyon1.fr/software/nhphyml) was used to conduct ML analyses using the non-stationary Galtier and Gouy [1998] model. To approximate a non-stationary codon model we divided data into first, second, and third codon positions and calculated the likelihood for each partition assuming the GG98+ model.
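Partitioning by codon position, as described above, amounts to taking every third site of an in-frame alignment. The following is a minimal Python sketch of that step for a single sequence, assuming the ORF starts at the first base of a codon. The helper names are illustrative and are not taken from PAUP*, RAxML, or nhPhyML.

```python
from collections import Counter


def split_codon_positions(orf):
    """Return the 1st, 2nd and 3rd codon-position sites of an in-frame ORF."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # drop any trailing partial codon
    return tuple(orf[i:usable:3] for i in range(3))


def base_composition(sites):
    """Percent A, C, G and T among the unambiguous sites."""
    counts = Counter(b for b in sites.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {b: 0.0 for b in "ACGT"}
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


if __name__ == "__main__":
    # Toy input: the first codons of the alligator DJ-1 ORF shown in fig. 3.
    orf = "ATGTCTGCAAAAAGAGCCTTAGTAATT"
    labels = ("1st", "2nd", "3rd")
    for label, sites in zip(labels, split_codon_positions(orf)):
        print(label, sites, base_composition(sites))
```

Applied to each sequence in an alignment, the three site sets can be written out as separate partitions and summarized by base composition, which is the kind of summary reported in table 1.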

    Results and Discussion

    Identification of Alligator c-Jun and DJ-1 cDNAs

    Since the availability of sequence data from traditional reptiles remains limited, we searched alligator ESTs for proto-oncogene homologs. We obtained full-length clones of c-Jun and DJ-1 proto-oncogenes, which had a 992-bp insert with an open reading frame (ORF) able to encode a 320-aa protein (fig. 2) and a 1,020-bp insert with an ORF able to encode a 189-aa protein (fig. 3), respectively. Both alligator genes are orthologs of the human c-Jun and DJ-1 proto-oncogenes. Phylogenetic analyses of the protein encoded by the alligator cDNA placed it within the c-Jun clade, 1 of 4 major clades in the vertebrate Jun tree (c-Jun, JunB, JunD, and F-Jun) [Cottage et al., 2003]. DJ-1 homologs that have been identified have a 1:1 relationship, suggesting that few, if any, gene duplications have occurred and indicating that orthology is very likely.

    Alligator c-Jun ORF exhibited a biased nucleotide composition and the degree of compositional bias showed variation along the sequence as the 5′ end showed a strong bias toward GC nucleotides, whereas the 3′ end was somewhat less biased (fig. 2). Both avian and mammalian c-Jun orthologs show a similar spatial pattern in GC content. Hyder et al. [1995] reported a functional estrogen response element (ERE) in the coding region of rat c-Jun; the alligator sequence differs from the rat by only a single transition in the region homologous to the putative ERE (JUN5 in fig. 2). The rat JUN5 ERE has ER-binding activity in a gel shift assay (JUN5 was the only sequence in the rat c-Jun region with this activity) and it confers estrogen-dependent transcriptional activation in a yeast reporter system [Hyder et al., 1995]. The conservation of this sequence in the alligator suggests it may play a role in c-Jun regulation in other organisms. Similar transcriptional regulatory elements have not been reported within the DJ-1 transcript.

    Fig. 2. Alligator c-Jun overlaps with a CpG island. CpG dinucleotides are shaded and the start codon is indicated in gray. Segments homologous to the rat JUN5 and JUN6 ERE-like sequences (the first of which appears to have ERE function) [see Hyder et al., 1995] are shown above the sequence. The alligator sequence has a single difference from the rat sequence in these regions; those differences are highlighted in gray.

    CTCCCCGCGGTGGCGGCCGCTCTAGAACTAGTGGATCCCCCGGGCTGCAGGAATTCGGCA 60
    CGAGGCCTCGTGCCGAATTCGGCACGAGGAAAACATGTCTGCAAAAAGAGCCTTAGTAAT 120
    M S A K R A L V I 9
    TCTGGCTAAAGGAGCAGAGGAGATGGAAACTGTTATCCCCACTGATCTTATGAGAAGGGC 180
    L A K G A E E M E T V I P T D L M R R A 29
    TGGAATCAAGGTCACAGTTGCAGGCCTAACAGGAAAAGAGCCAGTGCAGTGCAGCCGAGA 240
    G I K V T V A G L T G K E P V Q C S R D 49
    TGTCTTTGTTTGTCCTGACACCAGTTTGGAAGACGCCAGAAAAGAGGGGCCTTATGATGT 300
    V F V C P D T S L E D A R K E G P Y D V 69
    GGTGGTCCTACCAGGAGGTAATCTAGGAGCTCAGAACTTGTCAGAGTCCTCTGCTGTTAA 360
    V V L P G G N L G A Q N L S E S S A V K 89
    AGACATCCTGAAGGACCAGGAGATGAGGAAAGGCTTGATTGCTGCCATTTGTGCAGGTCC 420
    D I L K D Q E M R K G L I A A I C A G P 109
    AACTGCCCTTCTGGCACATGGAATAGGGTTTGGGAGCAAAGTTACTACACATCCTTTGGC 480
    T A L L A H G I G F G S K V T T H P L A 129
    TAAAGATAAAATGATGAATGGGGAACACTACAAGTACTCTGAAAACCGGGTTGAGAAGGA 540
    K D K M M N G E H Y K Y S E N R V E K D 149
    TGGGAACATTCTTACCAGTCGTGGTCCAGGCACCAGTTTCGAGTTTGGCTTGGCTATTAT 600
    G N I L T S R G P G T S F E F G L A I I 169
    TGAAACGCTAATGGGGAAGGAGGTGTCTGACCAAGTGAAGTCTCCCCTTATACTGAAAGA 660
    E T L M G K E V S D Q V K S P L I L K D 189
    TTAATACNCAGTTGTAATTGGTGAACAGGAGAAGAAAACATCATGAGATAGAAAGTCCTC 720
    *
    AGGNACTTATCTCCCTTCCTGGAAACAACAAGAAAGCNAGCTGTATCTTGTATACAATCT 780
    CATTGTGAGGCAAATGCCTTTCATACAATATTGATGTGAACTAGAATAATTCATAAACCA 840
    ATGGGTTTTGTTTATAGGGAAACAGATACAGATGTGTCTGCTAACTCATGACGCCATGAG 900
    AGAGTTAAACAGCTGTGAATAGGGAAAAAAACTAATAGAGTTTATTCATGTGATTTGAAG 960
    CTTGCTTGTTTAAATAAAAATGTAGTAACCCTGAAAAAAAAAAAAAAAAAAAAAAAAAAA 1020

    Fig. 3. Alligator DJ-1 has CpG dinucleotides in the 5′ UTR. The alligator DJ-1 sequence is presented in a manner identical to the c-Jun sequence (fig. 2), with the addition of gray nucleotides to indicate bases that flank introns based upon the intron-exon structure in the chicken genome. A non-coding (5′ UTR) segment of the alligator sequence identical to a sequence in the chicken genome (GI:118109532) is underlined.
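The conceptual translation printed beneath the DJ-1 sequence in fig. 3 can be reproduced with the standard genetic code. The Python sketch below is illustrative only: the function name `translate_orf` is hypothetical, it makes the simplifying assumption that the ORF begins at the first ATG, and the test fragment is the start of the alligator DJ-1 ORF copied from fig. 3.

```python
BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_orf(dna):
    """Translate from the first ATG to the first in-frame stop codon.

    Codons containing ambiguous bases (e.g. N) are written as 'X'.
    Returns an empty string if the sequence contains no ATG.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    fragment = "GAAAACATGTCTGCAAAAAGAGCCTTAGTAATT"
    print(translate_orf(fragment))  # MSAKRALVI
```

For real cDNAs the first ATG is not always the true start codon, so a fuller treatment would compare candidate ORFs against known homologs.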

    Both Proto-Oncogenes Are Associated with Apparent CpG Islands

    Vertebrate genomes exhibit 2 types of base compositional heterogeneity that have been proposed to have changed during the transitions from reptiles to birds and mammals (the latter is more properly the transition from non-mammalian synapsids to mammals, see fig. 1). Both c-Jun and DJ-1 exhibit complex patterns of base compositional heterogeneity that should be considered in light of these genome-wide patterns.

    The first type of heterogeneity is the division of vertebrate genomes into isochores [Bernardi, 2000], which are long (1100 kb) DNA segments with relatively homogeneous within-segment GC-content that differ from other isochore segments [Cuny et al., 1981]. Isochores were recently reported to be absent [e.g. International Human Genome Sequencing Consortium, 2001], but the controversies regarding their existence reflected debate regarding the amount of within-isochore heterogeneity that should be tolerated given the prefix "iso" [see Clay and Bernardi, 2001; Li et al., 2003]. The more important evolutionary questions are why there have been major changes to the GC-content of specific isochores (or isochore-like regions) in vertebrate genomes, when the changes in base composition occurred, and what impact the changes had upon the genes embedded in the regions.

    Both birds and mammals have GC-rich isochores with higher GC-contents than those found in amphibians or fish, indicating there has been at least one increase in GC-content within the tetrapods. However, the actual number of transitions to this GC-rich isochore structure is unclear, and it has been suggested to reflect either a single transition at the base of the amniotes [Duret et al., 2002] or 2 transitions correlated with the change from cold-blooded to warm-blooded physiology [Bernardi, 2000]. Given the diversity of thermoregulatory mechanisms within reptiles, use of the terms "cold-blooded" and "warm-blooded" may be problematic [see Chojnowski et al., 2007]. Furthermore, it is not clear whether thermal physiology has a major impact on genomic composition and, if it does, what aspects of thermoregulation are important. Combining information based upon reptilian sequences [e.g. Hughes et al., 1999; Hamada et al., 2002; Chojnowski et al., 2007; Chojnowski and Braun, 2008] with density gradient centrifugation data [e.g. Thiery et al., 1976; Hughes et al., 2002] reveals a confusing picture, with the turtles appearing especially problematic [see Chojnowski and Braun, 2008].

    The most parsimonious models of isochore evolution given the sequence data involve either a single transition to a GC-rich isochore structure early in amniotes followed by a reversion to a less GC-rich isochore structure in turtles (the single-transition model; fig. 1), or 2 transitions to a GC-rich isochore structure (1 in mammals and 1 in archosaurs; the 2-transition model; fig. 1). Density gradient centrifugation data [Thiery et al., 1976; Hughes et al., 2002] and analyses of the anole genome sequence [Costantini et al., 2009] suggest squamates have a GC-poor isochore structure, supporting the 2-transition model. However, there is variation within squamates based upon the density gradient centrifugation data, suggesting that there may be more changes to reptilian isochore structure. Regardless, either model is consistent with very GC-rich genes in the alligator genome, like the c-Jun gene.

    The other important compositional feature of avian and mammalian genomes is the presence of CpG islands, regions 1 kb in size (much smaller than isochores) located 5′ of GC-rich genes that often include regulatory sequences [Bird, 1987]. CpG islands have many unmethylated CpG doublets and they appear to be relatively rare in reptiles [Aïssani and Bernardi, 1991a, b]. Amphibians and teleosts have CpG islands [Brunner et al., 2000; Stancheva et al., 2002; Yamakoshi and Shimoda, 2003], suggesting that reptiles either have fewer CpG islands than birds and mammals or reptilian CpG islands have a lower GC-content, making them difficult to detect as GC-rich HpaII tiny fragments (the assay used by Aïssani and Bernardi [1991a]). Aïssani and Bernardi [1991b] suggested the latter possibility, which has a precedent in teleosts where CpG islands are less GC-rich than those in mammals [Cross et al., 1991].

    The 5′ end of the alligator c-Jun coding region appears to overlap with a GC-rich CpG island (fig. 2), like the human c-Jun gene. The human c-Jun CpG island appears functional based upon hydrazine cleavage of unmethylated cytosines [Rozek and Pfeifer, 1993], although only the c-Jun upstream region was examined. Nevertheless, the large numbers of CpG dinucleotides near the 5′ end of the c-Jun ORF in many organisms (e.g. fig. 1) suggest the CpG island extends into the ORF. This could reflect the existence of 1 or more transcriptional regulatory elements in the coding region [like the JUN5 ERE; Hyder et al., 1995]. Contrary to suggestions that reptiles have less GC-rich CpG islands than birds or mammals, the GC-content of alligator c-Jun (61.6% GC) is similar to that of the human and chicken c-Jun genes (64.4% GC and 57.6% GC, respectively).

    The DJ-1 coding region does not contain many CpG dinucleotides. However, a number of CpG dinucleotides are present in the GC-rich (66% GC) DJ-1 5′ UTR (fig. 3). Part of this 5′ UTR corresponds to a highly conserved non-coding region, since the chicken genome has a 54-bp segment identical to this alligator sequence (the chicken sequence was not assigned to a specific chromosome in the current genome build, making it unclear whether it is near chicken DJ-1). These analyses emphasize that CpG islands are present in the alligator genome, although the mean GC-content of alligator CpG islands remains unclear. Thus, it remains possible that these CpG islands are exceptionally GC-rich. Regardless, our analyses of both proto-oncogenes indicate that GC-rich CpG islands are present in the alligator genome.
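Apparent CpG islands such as those discussed above are often screened for by computing GC-content together with the ratio of observed to expected CpG dinucleotides. The Python sketch below uses the widely cited Gardiner-Garden and Frommer thresholds (at least 200 bp, GC-content above 50%, observed/expected CpG above 0.6) purely as an assumption. The text does not state which formal criteria, if any, were applied to the alligator sequences, and the function names are hypothetical.

```python
def gc_percent(seq):
    """GC-content as a percentage of unambiguous bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (n_c * n_g)


def looks_like_cpg_island(seq, min_len=200, min_gc=50.0, min_oe=0.6):
    """Apply the assumed length, GC-content and CpG O/E thresholds."""
    return (
        len(seq) >= min_len
        and gc_percent(seq) > min_gc
        and cpg_obs_exp(seq) > min_oe
    )


if __name__ == "__main__":
    toy = "CGGCGCGATCGCCGTACGCG" * 10  # synthetic 200-bp example, not real data
    print(round(gc_percent(toy), 1), round(cpg_obs_exp(toy), 2))
    print(looks_like_cpg_island(toy))
```

In practice such statistics are computed in sliding windows along a sequence, which also makes a 5′ to 3′ decline in GC-content, like the one described for c-Jun, easy to visualize.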

    Isochore structure is fundamental to genome organization and CpG islands are associated with many genes. The GC-rich isochores of birds and mammals have been called the "genome core", and Saccone et al. [2002] found that genome core-chromatin has an open structure in interphase nuclei. Jabbari et al. [2003] postulated that the high GC-content stabilizes the avian and mammalian genome core whereas the orthologous regions in cold-blooded vertebrates do not require stabilization due to the lower mean body temperatures and higher degree of DNA methylation [methylation can stabilize DNA; e.g. Marcourt et al., 1999]. However, mammals, birds, and reptiles have similar amounts of 5-methylcytosine (5mC) whereas fish and amphibians have a higher amount of 5mC [Jabbari et al., 1997]. This suggests a single transition to a lower amount of genomic 5mC at the base of the amniotes and is not consistent with a model postulating 2 independent changes in 5mC amounts during the shift from cold-blooded reptiles to warm-blooded birds and mammals. Larger datasets, including complete genomes, will be necessary to place all of these results in a unified framework. However, it is clear that GC-rich genes are present in some reptiles and that these genes can have a GC-content similar to avian and mammalian genes.

    Table 1. c-Jun sequences exhibit distinct patterns of nucleotide composition

                                   A, %    C, %    G, %    T, %
    Mean for vertebrate c-Jun sequences (excluding turtle and frog)
      5′ region of gene
        1st codon positions        31.1    29.0    27.6    12.3
        2nd codon positions        32.3    29.3    12.3    26.1
        3rd codon positions         6.3    46.6    38.2     8.8
      3′ region of gene
        1st codon positions        33.1    31.0    25.9    10.1
        2nd codon positions        37.1    22.8    14.9    25.1
        3rd codon positions        15.0    29.3    48.1     7.6
    Turtle c-Jun sequence
      5′ region of gene
        1st codon positions        35.2    26.6    27.3    10.9
        2nd codon positions        34.1    27.1    12.4    26.4
        3rd codon positions        18.6    23.3    22.5    35.7
      3′ region of gene
        1st codon positions        36.2    25.9    25.9    12.1
        2nd codon positions        37.1    23.3    14.7    25.0
        3rd codon positions        28.4    19.8    36.2    15.5
    Frog c-Jun sequence
      5′ region of gene
        1st codon positions        33.3    24.6    28.6    13.5
        2nd codon positions        31.5    29.1    12.6    26.8
        3rd codon positions        13.4    38.6    26.8    21.3
      3′ region of gene
        1st codon positions        37.1    24.1    26.7    12.1
        2nd codon positions        37.1    22.4    15.5    25.0
        3rd codon positions        31.0    19.0    32.8    17.2

    Contrasting Phylogenetic Signals in c-Jun and DJ-1

    Since the alligator sequences appeared to be orthologs of the c-Jun and DJ-1 proto-oncogenes, we designed primers to amplify those genes from 2 other reptiles (the crocodile and turtle). The cDNAs were similar in size to those from the alligator and the crocodile c-Jun sequence had a similar GC-content (59.7% GC). In contrast, turtle c-Jun exhibited a slight AT-bias (47.6% GC) and did not appear to overlap with a CpG island, resembling the frog c-Jun sequence from the standpoint of base composition (table 1). Standard phylogenetic analyses (e.g., using the MP criterion) suggest a frog-turtle clade (fig. 4A), in sharp contrast to prior expectation (fig. 1). In principle, any gene tree can be reconciled with the species tree by assuming a sufficient number of gene duplications along with gene losses and/or the failure to identify specific paralogs in key taxa [see Maddison, 1997; Page and Charleston, 1997]. However, the fact that turtle and frog c-Jun have similar base compositions, especially at the most variable positions, suggests biased phylogenetic estimation is more likely.
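The contrast in base composition can be made concrete using the third-codon-position values reported in table 1. The following minimal sketch sums the C and G percentages (GC3) for each sequence and gene region. The dictionary layout and the function name `gc3` are illustrative assumptions and are not part of the original analysis.

```python
# Third-codon-position base composition (A, C, G, T in %) copied from table 1.
# The data structure and names below are illustrative assumptions.
THIRD_POSITIONS = {
    ("vertebrate mean", "5' region"): (6.3, 46.6, 38.2, 8.8),
    ("vertebrate mean", "3' region"): (15.0, 29.3, 48.1, 7.6),
    ("turtle", "5' region"): (18.6, 23.3, 22.5, 35.7),
    ("turtle", "3' region"): (28.4, 19.8, 36.2, 15.5),
    ("frog", "5' region"): (13.4, 38.6, 26.8, 21.3),
    ("frog", "3' region"): (31.0, 19.0, 32.8, 17.2),
}


def gc3(composition):
    """Return GC-content (%) from an (A, C, G, T) percentage tuple."""
    _a, c, g, _t = composition
    return round(c + g, 1)


for (taxon, region), comp in THIRD_POSITIONS.items():
    print(f"{taxon:16s} {region}: GC3 = {gc3(comp):.1f}%")
```

With these values, GC3 in both regions is lower for the turtle and frog sequences than for the mean of the other vertebrates, which is consistent with the convergence in base composition discussed above.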

    Base compositional convergence has been studied extensively from empirical and theoretical perspectives [e.g., Lockhart et al., 1994; Foster and Hickey, 1999; Conant and Lewis, 2001], and it is clear that base compositional convergence can generate a non-historical signal that can obscure evolutionary history. There are a number of ways to extract historical signals despite the existence of base compositional convergence, but the simplest approaches are noise reduction by RY-coding (transversion analyses) [see Braun and Kimball, 2002; Phillips and Penny, 2003] or translation to amino acid sequences [Loomis and Smith, 1990]. Indeed, the unexpected frog-turtle clade separates when the data are either subjected to RY-coding (fig. 4B) or translated to protein sequences (fig. 4C). Reducing noise by either of these methods also reduces signal, and it is clear that signal was reduced since the number of equally parsimonious trees increased when either alternative coding of the data was used.

    Fig. 4. Estimates of c-Jun phylogeny. Bootstrap support from 1,000 replicates is presented adjacent to branches (for values ≥50%). A The single MP tree obtained using c-Jun nucleotide sequences. B Strict consensus of 18 MP trees obtained after RY-coding c-Jun sequences. C Strict consensus of 40 MP trees obtained using c-Jun amino acid sequences. Similar results were obtained using ML and distance analyses (data not shown).


    Another potential problem with these putative noise reduction techniques is the retention of non-historical signal even after excluding data. In fact, base composition after RY-coding may not be stationary [Braun and Kimball, 2002] and convergence in base composition has even been shown to drive amino acid compositional convergence [Foster et al., 1997; Foster and Hickey, 1999]. Analyses of amino acid sequences may be especially problematic since the development of models of protein evolution has lagged behind the development of nucleotide models and amino acid evolution has been suggested to have greater potential for homoplasy than the evolution of nucleotides [Simmons et al., 2004]. Although the increased congruence between the c-Jun trees after noise reduction and the probable species tree (fig. 1) suggests that the noise reduction methods are increasing the amount of historical signal, it would be preferable to retain all of the data. Regardless, the noise reduction analyses indicate that multiple signals are present in the c-Jun alignment and suggest that the signal that unites the frog-turtle clade is not historical.

    Fig. 5. Conflicting signals in the c-Jun alignment. A Neighbor-joining tree generated using LogDet distances. Bootstrap support from 1,000 replicates is shown above branches. A similar topology was obtained when GTR+Γ+inv sites distances were used, and bootstrap support from that analysis is presented below branches. An asterisk below the branch was used to indicate branches with 100% bootstrap support in both analyses, otherwise bootstrap support was only presented for values ≥50%. Despite the topological congruence of neighbor-joining analyses, NeighborNet [Bryant and Moulton, 2004] networks generated using B LogDet distances and C GTR+Γ+inv sites distances were able to reveal conflicting signals.

    Other methods useful for phylogenetic estimation despite base compositional convergence include LogDet distances [Lockhart et al., 1994] and models for ML and Bayesian Markov chain Monte Carlo analyses [e.g. the GG98 model of Galtier and Gouy, 1998]. These approaches retain all of the data (unlike noise reduction approaches) while still accommodating some potential for convergence in base composition. However, the tree generated by neighbor-joining of LogDet distances includes the unexpected frog-turtle clade (fig. 5A), like the MP tree (fig. 4A). Nevertheless, LogDet distances appear better than distances calculated using a model that assumes stationary base composition, since a LogDet network (fig. 5B) exhibits substantially more conflict than one based on the stationary model (the GTR+Γ+inv sites model; fig. 5C). Indeed, the GTR+Γ+inv sites network was remarkably similar to a network based on base composition alone (data not shown). Although the unexpected frog-turtle clade is strongly influenced by base compositional convergence, the situation actually appears more complex since the anole c-Jun sequence groups with the frog-turtle clade despite having a base composition similar to the alligator and other taxa. This may reflect the generalized acceleration of molecular evolution in squamates [e.g., Gorr et al., 1998; Hughes and Mouchiroud, 2001]. Examining this in more detail is likely to require more extensive sampling of both turtles and squamates.
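To show what a LogDet distance computes, the sketch below builds the 4 x 4 divergence matrix F of joint base frequencies for a pair of aligned sequences and applies a commonly used normalized form, d = -(1/4) ln[det F / sqrt(det Πx · det Πy)], where Πx and Πy are diagonal matrices of the base frequencies of each sequence. This normalization is an assumption of the example; the text does not state which variant or software was used. The sequences are hypothetical and the function names are illustrative.

```python
import math

BASES = "ACGT"


def determinant(matrix):
    """Determinant of a square matrix by Gaussian elimination with pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def logdet_distance(seq_x, seq_y):
    """Normalized LogDet distance between two aligned DNA sequences.

    Sites where either sequence has a character other than A/C/G/T are skipped.
    """
    counts = [[0] * 4 for _ in range(4)]
    for x, y in zip(seq_x.upper(), seq_y.upper()):
        if x in BASES and y in BASES:
            counts[BASES.index(x)][BASES.index(y)] += 1
    n = sum(sum(row) for row in counts)
    if n == 0:
        raise ValueError("no comparable sites")
    f = [[count / n for count in row] for row in counts]
    freq_x = [sum(row) for row in f]                                # row sums
    freq_y = [sum(f[i][j] for i in range(4)) for j in range(4)]     # column sums
    det_f = determinant(f)
    norm = math.sqrt(math.prod(freq_x) * math.prod(freq_y))
    if det_f <= 0 or norm <= 0:
        raise ValueError("LogDet distance is undefined for this pair")
    return -0.25 * math.log(det_f / norm)


# Hypothetical 40-site alignment: seq_b differs from seq_a at four sites.
seq_a = "ACGT" * 10
seq_b = "GCGTATGTACATACGCACGT" + "ACGT" * 5

print(round(logdet_distance(seq_a, seq_a), 6))
print(round(logdet_distance(seq_a, seq_b), 6))
```

Because the distance depends on the full joint frequency matrix rather than on a single shared base composition, it does not require the two sequences to have the same composition, which is the property the text relies on.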

    Like the LogDet analyses, ML using the GG98+Γ model (which allows GC-content to vary across the tree) [Galtier and Gouy, 1998] did not recover a tree congruent with the expected species tree (data not shown). This could reflect the limited amount of historical signal retained by c-Jun after eliminating the non-historical signals, but it is also possible that the GG98+Γ model is a relatively poor approximation to actual patterns of c-Jun sequence evolution. Sanderson and Kim [2000] suggested that large differences in fit (i.e. likelihood or Akaike information criterion) for models that differ by few parameters may indicate that all models tested fit the sequence alignment poorly. Large differences in model fit were observed when comparing the GG98+Γ model applied independently to each codon position to the same model applied to all sites. There are also biochemical reasons to expect available models to fit the c-Jun alignment poorly, since strong neighboring nucleotide effects are likely, given the large number of CpG dinucleotides. Although models incorporating neighboring nucleotide effects have been described [e.g. Arndt and Hwa, 2005], it is not clear if they are practical for large-scale phylogenomic analyses. It is also far from clear whether any available model would be able to accommodate all of the patterns of sequence evolution observed for c-Jun. However, it is possible to extract information about the conflicting signals in the c-Jun alignment using phylogenetic networks (e.g. fig. 5) even if the best available models prove to have an inadequate fit to the data.

    In sharp contrast to c-Jun, the DJ-1 sequences from different taxa do not exhibit major differences in base composition and phylogenetic analyses of DJ-1 do not include the unexpected frog-turtle clade (fig. 6). In fact, the arrangement of birds, turtles and crocodilians in the DJ-1 tree is consistent with the current consensus based upon other molecular data (fig. 1B), although there is limited support (probably reflecting the fact that DJ-1 is relatively short). However, the position of the anole was unexpected, possibly due to the elevated rate of sequence evolution in squamates [Gorr et al., 1998; Hughes and Mouchiroud, 2001] and long-branch attraction to the relatively distant outgroup. Surprisingly, ML analyses using the same taxa result in a similar topology and increased support for the unexpected position (98.9% bootstrap support for a clade including all tetrapods except the lizard when the GTR+Γ model was used). As with the anole-turtle-frog group in c-Jun analyses (e.g. fig. 5), adding more lepidosaur sequences should shed light on this unexpected relationship in the DJ-1 tree.

    Fig. 6. Estimate of DJ-1 phylogeny. Strict consensus of 2 MP trees with bootstrap support from 1,000 replicates is indicated near branches (for values ≥50%). The topology was similar in other analyses.

    The Putative Exonic ERE in c-Jun Is Not Exceptionally Conserved

    Since both proto-oncogenes were chosen in part because of their roles in sexual development, the putative exonic ERE in c-Jun (fig. 2) is especially interesting. The existence of a sequence similar to the exonic ERE characterized in the rat prompted us to examine rates of evolution at the synonymous sites, an approach conceptually identical to phylogenetic shadowing [Boffelli et al., 2003]. The most common functional sequences in coding regions that are not directly related to the encoded protein are exonic splicing enhancers (ESEs), and these sequences do limit synonymous divergence [Parmley et al., 2006]. The number of transcriptional regulatory elements that are located in coding regions is unclear, so the impact of these elements on the rates of synonymous site evolution is also unclear. The intronless c-Jun gene would seem to be ideal for this analysis (because ESEs would be absent) but the variation in nucleotide composition along the c-Jun sequence (table 1) makes available codon models problematic. Thus, MP was used to estimate minimum numbers of changes at each site (data not shown), revealing that the number of third codon position changes in the JUN5 ERE, which appears functional in the rat [Hyder et al., 1995], was similar to the number that occurred in a second ERE-like sequence (JUN6) that did not appear functional. Furthermore, all third codon positions in JUN5 underwent at least 1 substitution during vertebrate evolution, including sites that appear to abolish either ER binding or estrogen-dependent transcriptional activation when mutated [see Hyder et al., 1995], although it is difficult to interpret these mutations since multiple bases were changed in the oligonucleotides. Taken as a whole, these results raise questions about the biological significance of the JUN5 sequence.
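The idea of a parsimony-based minimum number of changes at a single site can be sketched with Fitch's small-parsimony algorithm, a standard method for a fixed rooted binary tree. The text does not say which software or tree was used for the MP estimates, so the topology, the observed states, and the function name below are all hypothetical and serve only as an illustration.

```python
def fitch_min_changes(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    tree   -- a taxon name (leaf) or a 2-tuple of subtrees (binary node)
    states -- dict mapping each taxon name to its observed base
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_min_changes(left, states)
    right_set, right_cost = fitch_min_changes(right, states)
    shared = left_set & right_set
    if shared:
        # The children can agree on a state, so no extra change is needed.
        return shared, left_cost + right_cost
    # The children disagree, so at least one change occurred on this split.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical rooted topology and third-codon-position states (illustration only).
TREE = ("frog", ("human", ("turtle", ("alligator", "chicken"))))
SITE = {"frog": "A", "human": "G", "turtle": "A", "alligator": "G", "chicken": "G"}

_, min_changes = fitch_min_changes(TREE, SITE)
print(min_changes)
```

Summing such per-site minima over the third codon positions of a region gives a count of the kind that could be compared between the JUN5 and JUN6 sequences.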

    Several models are consistent with the apparent lack of conservation in the JUN5 sequence. A straightforward hypothesis would be that JUN5 is not a bona fide ERE; this would suggest that the activities identified by Hyder et al. [1995] are not relevant to the in vivo estrogen regulation of c-Jun. Alternatively, c-Jun might exhibit different patterns of regulation in various organisms. However, estrogen regulation of c-Jun appears to be ancestral in amniotes, since estrogens induce c-Jun mRNA in mammals [e.g. Morishita et al., 1995; Fujimoto et al., 1996] and in some chicken tissues [Lau et al., 1990]. Nevertheless, regulation of c-Jun by an ERE in the coding region could be a derived character state of a group less inclusive than all amniotes, whether this less inclusive group is the mammals, the rodent-primate clade (Euarchontoglires) [see Murphy et al., 2001], rodents, or even a subset of rodents. Estrogen regulation of chicken c-Jun involves transcriptional and post-transcriptional mechanisms [Lau et al., 1991] and estrogen induction of c-Jun in human endometrial fibroblasts and cancer cell lines requires protein kinase C [Fujimoto et al., 1996]. Thus, estrogen regulation of c-Jun requires more than ER binding to an ERE in some systems. Finally, mechanisms of estrogen regulation could also vary among cell types, although the absence of strong conservation of the putative coding region ERE suggests that JUN5 is unlikely to have a critical role in the regulation of c-Jun by estrogen in reptiles.

    c-Jun and DJ-1 mRNAs Can Be Detected in the Alligator

    Reverse transcriptase (RT)-PCR was used to examine relative c-Jun and DJ-1 transcript accumulation in adult female alligator tissues. Both mRNAs could be detected in all tissues examined (fig. 7), although c-Jun mRNA accumulation appeared lower in kidney, stomach, and intestine than in the other tissues examined. These sequences provide tools to examine patterns of mRNA accumulation for these proto-oncogenes in the alligator and other reptiles.

    Fig. 7. Proto-oncogene mRNA accumulation in alligator tissues. RT-PCR was used to detect c-Jun, DJ-1 and β-actin (positive control) transcripts in total brain, gonad, heart, intestine, kidney, lung and stomach RNA of a female alligator.

    The Coming Impact of Reptilian Genomes on Phylogenomics

    Despite the advances in sequence data acquisition that have led to the genomics revolution, reptilian sequence data remain relatively scarce. In fact, the number of proto-oncogene sequences available from crocodilians and turtles is quite limited, including a few sequences such as WT1 [Kent et al., 1995], c-Ski [Hughes et al., 1999], c-Myc [Harshman et al., 2003], and c-Mos [Saint et al., 1998]. Although these sequences have contributed to our knowledge of the evolutionary history of crocodilians and turtles, it is clear that more sequence data are needed. At present, there are no complete crocodilian or turtle genome sequences (although painted turtle genomic information will be available soon) [Gerhart et al., 2005] and there is only a single squamate genome sequence (the green anole). Large-scale sequence data will make it possible to go far beyond studies based upon limited numbers of genes, like those reported here. We focused on c-Jun and DJ-1 because they have roles in sex determination and reptiles exhibit diverse patterns of sex determination [Crews, 2003; Sarre et al., 2004; Modi and Crews, 2005; Organ and Janes, 2008]. Large-scale sequence datasets will allow a more complete definition of the set of genes that exhibit differential regulation during sex determination in various reptiles and allow the common features of amniote sex determination to be established.

    Reptilian phylogeny has been the subject of substantial debate (e.g. fig. 1), but the availability of complete reptilian genomes is likely to firmly establish the dominant phylogenetic signal present in reptilian genomes. Identification of this dominant signal will shift the goal of sequence analyses back to the original goal of phylogenomics, extracting functional information from phylogenetic analyses [Eisen, 1998]. In that context, the conflicting signals evident in the c-Jun tree are likely to be of interest. Non-historical signals can arise in sequence alignments in many ways, including convergence in base composition [Lockhart et al., 1994; Conant and Lewis, 2001], long-branch attraction due to changes in the rate of evolution [e.g. Fares et al., 2006], changes in the rate of evolution at specific sites [e.g. Ruano-Rubio and Fares, 2007], and natural selection for specific molecular variants [e.g. Castoe et al., 2009]. These phenomena may even overlap (e.g. there may be selection for changes in base composition). Regardless of the reasons for conflicting phylogenetic signals, phylogenetic networks may make it possible to extract information from all signals in alignments. It is not clear whether the NeighborNet method is the best possible approach to examine conflicting signals, but it would be interesting to determine whether functionally related genes exhibit similar non-historical signals regardless of the method used to examine them.

    There are several potential levels at which non-historical signals in molecular sequences can arise, and reptilian sequence data are likely to be especially important for examining some of these levels. The simplest case involves locus-specific effects, and additional data from specific groups of organisms (like reptiles) are less likely to be especially informative if locus-specific effects are dominant. However, it does appear that a related approach, based on comparing vectors of branch length estimates, is effective for detecting protein-protein interactions [Waddell et al., 2007]. Although this suggests that comparisons of branch length estimates are reasonable, it is unclear whether including estimates of the magnitude of non-historical signals would improve the method. Sequence data from specific groups, such as reptiles, are likely to be informative regarding higher levels of organization in vertebrate genomes, like the isochores that genes are embedded within and even the genome as a whole; and signals may reflect changes in patterns of evolution at any of these levels. It has been suggested that there have been major changes in these aspects of genome structure during the transition from reptile-like ancestors to mammals and birds (fig. 1), and additional data suggest transitions within reptiles [e.g. Chojnowski and Braun, 2008]. Examining biases at the genomic level may be simpler than examining the impact of isochore context, since examining patterns of genome evolution may be possible using genome survey sequences [e.g. Shedlock et al., 2007] while examining isochore-specific patterns is likely to require relatively high-quality assemblies. It is unclear when such high-quality assemblies of reptilian genomes will be available, but we believe they are inevitable given their value and ongoing improvements to sequencing technologies and bioinformatics. When they become available, it will be possible to place analyses of individual genes similar to those presented here in the context of both the whole genome and the evolutionary history of the taxa examined.


    Acknowledgements

    We thank Alan Woodward and the FL Fish and Wildlife Conservation Commission for field assistance and long-term research support. We also thank Mr. Albert Pretorius of the Thaba Kwena Crocodile Farm, Republic of South Africa, for access to his farm, donation of crocodile tissues, and the assistance from his workers during sample collection, as well as Dr. Jan Myburgh, University of Pretoria, School of Veterinary Medicine, who made tissue collection possible. The manuscript was improved by comments from 2 anonymous reviewers. This work was supported or facilitated by grants to L.J.G. and E.L.B. (from the University of Florida Opportunity Fund), to E.L.B. and collaborators (NSF DEB-0228682 and DUE-0920151) and to Y.K. and T.I. (Core Research for Evolutionary Science and Technology, Japan Science and Technology; Grant-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology, Japan and grants from the Ministry of the Environment, Japan).

    References

    Aïssani B, Bernardi G: CpG islands: Features and distribution in the genome of vertebrates. Gene 106: 173–183 (1991a).

    Aïssani B, Bernardi G: CpG islands, genes, isochores in the genome of vertebrates. Gene 106: 185–195 (1991b).

    Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, et al: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402 (1997).

    Arndt P, Hwa T: Identification and measurement of neighbor-dependent nucleotide substitution processes. Bioinformatics 21: 2322–2328 (2005).

    Benton MJ, Donoghue PCJ: Paleontological evidence to date the tree of life. Mol Biol Evol 24: 26–53 (2007).

    Bernardi G: Isochores and the evolutionary genomics of vertebrates. Gene 241: 3–17 (2000).

    Bird A: CpG islands as gene markers in the vertebrate nucleus. Trends Genet 3: 342–347 (1987).

    Boffelli D, McAuliffe J, Ovcharenko D, Lewis KD, Ovcharenko I, et al: Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science 299: 1391–1394 (2003).

    Braun EL, Grotewold E: Fungal Zuotin proteins evolved from MIDA1-like factors by lineage-specific loss of MYB domains. Mol Biol Evol 18: 1401–1412 (2001).

    Braun EL, Kimball RT: Examining basal avian divergences with mitochondrial sequences: Model complexity, taxon sampling, and sequence length. Syst Biol 51: 614–625 (2002).

    Bromham L, Wolfit M, Lee MSY, Rambaut A: Testing the relationship between morphological and molecular rates of change along phylogenies. Evolution 56: 1921–1930 (2002).

    Brunner B, Grutzner F, Yaspo ML, Ropers HH, Haaf T, Kalscheuer VM: Molecular cloning and characterization of the Fugu rubripes MEST/COPG2 imprinting cluster and chromosomal localization in Fugu and Tetraodon nigroviridis. Chromosome Res 8: 465–476 (2000).

    Bryant D, Moulton V: NeighborNet: An agglomerative algorithm for the construction of phylogenetic networks. Mol Biol Evol 21: 255–265 (2004).

    Castoe TA, de Koning AP, Kim HM, Gu W, Noonan BP, et al: Evidence for an ancient adaptive episode of convergent molecular evolution. Proc Natl Acad Sci USA 106: 8986–8991 (2009).

    Cheung E, Acevedo ML, Cole PA, Kraus WL: Altered pharmacology and distinct coactivator usage for estrogen receptor-dependent transcription through activating protein-1. Proc Natl Acad Sci USA 102: 559–564 (2005).

    Chojnowski JL, Braun EL: Turtle isochore structure is intermediate between amphibians and other amniotes. Int Comp Biol 48: 454–462 (2008).

    Chojnowski JL, Franklin J, Katsu Y, Iguchi T, Guillette Jr LJ, et al: Patterns of vertebrate isochore evolution revealed by comparison of expressed mammalian, avian, and crocodilian genes. J Mol Evol 65: 259–266 (2007).

    Chojnowski JL, Kimball RT, Braun EL: Introns outperform exons in analyses of basal avian phylogeny using clathrin heavy chain genes. Gene 410: 89–96 (2008).

    Clay O, Bernardi G: Compositional heterogeneity within and among isochores in mammalian genomes. II. Some general comments. Gene 276: 25–31 (2001).

    Conant GC, Lewis PO: Effects of nucleotide composition bias on the success of the parsimony criterion in phylogenetic inference. Mol Biol Evol 18: 1024–1033 (2001).

    Costantini M, Cammarano R, Bernardi G: The evolution of isochore patterns in vertebrate genomes. BMC Genomics 10: 146 (2009).

    Cottage AJ, Edwards YJK, Elgar G: AP1 genes in Fugu indicate a divergent transcriptional control from that of mammals. Mamm Genome 14: 514–525 (2003).

    Cotton JA, Page RDM: Going nuclear: Gene family evolution and vertebrate phylogeny reconciled. Proc R Soc Lond Ser B 269: 1555–1561 (2002).

    Crews D: Sex determination: Where environment and genetics meet. Evol Dev 5: 50–55 (2003).

    Cross S, Kovarik P, Schmidtke J, Bird A: Non-methylated islands in fish genomes are GC-poor. Nucleic Acids Res 19: 1469–1474 (1991).

    Cuny G, Soriano P, Macaya G, Bernardi G: The major components of the mouse and human genomes. 1. Preparation, basic properties and compositional heterogeneity. Eur J Biochem 115: 227–233 (1981).

    deBraga M, Rieppel O: Reptile phylogeny and the interrelationships of turtles. Zool J Linn Soc 120: 281–354 (1997).

    Delsuc F, Brinkmann H, Philippe H: Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet 6: 361–375 (2005).

    Duret L, Semon M, Piganeau G, Mouchiroud D, Galtier N: Vanishing GC-rich isochores in mammalian genomes. Genetics 162: 1837–1847 (2002).

    Eernisse DJ, Kluge AG: Taxonomic congruence versus total evidence, and amniote phylogeny inferred from fossils, molecules, and morphology. Mol Biol Evol 10: 1170–1195 (1993).

    Eferl R, Wagner EF: AP-1: A double-edged sword in tumorigenesis. Nat Rev Cancer 3: 859–868 (2003).

    Eisen JA: Phylogenomics: Improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Res 8: 163–167 (1998).

    Eisen JA, Kaiser D, Myers RL: Gastrogenomic delights: A movable feast. Nat Med 3: 1076–1078 (1997).

    Fares MA, Byrne KP, Wolfe KH: Rate asymmetry after genome duplication causes substantial long-branch attraction artifacts in the phylogeny of Saccharomyces species. Mol Biol Evol 23: 245–253 (2006).

    Foster PG, Hickey DA: Compositional bias may affect both DNA-based and protein-based phylogenetic reconstructions. J Mol Evol 48: 284–290 (1999).

    Foster PG, Jermiin LS, Hickey DA: Nucleotide composition bias affects amino acid content in proteins coded by animal mitochondria. J Mol Evol 44: 282–288 (1997).

    Fujimoto J, Hori M, Ichigo S, Morishita S, Tamaya T: Estrogen induces expression of c-fos and c-jun via activation of protein kinase C in an endometrial cancer cell line and fibroblasts derived from human uterine endometrium. Gynecol Endocrinol 10: 109–118 (1996).


    Galtier N, Gouy M: Inferring pattern and process: Maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis. Mol Biol Evol 15: 871–879 (1998).

    Gaucher E, Gu X, Miyamoto M, Benner S: Predicting functional divergence in protein evolution by site-specific rate shifts. Trends Biochem Sci 27: 315–321 (2002).

    Gauthier J, Kluge AG, Rowe T: Amniote phylogeny and the importance of fossils. Cladistics 4: 105–209 (1988).

    Genome 10K Community of Scientists: Genome 10K: A proposal to obtain whole-genome sequences for 10,000 vertebrate species. J Hered 100: 659–674 (2009).

    Georgelis N, Braun EL, Hannah LC: Duplication and functional divergence of ADP-glucose pyrophosphorylase genes in plants. BMC Evol Biol 8: 232 (2008).

    Gerhart J, Bronner-Fraser M, Edwards S, Holland P: Evolution of the human proteome: Completing the chordate nodes. NIH White Paper available from http://www.genome.gov under The Large-Scale Genome Sequencing Program (2005).

    Gorr TA, Mable BK, Kleinschmidt T: Phylogenetic analysis of reptilian hemoglobins: trees, rates, and divergences. J Mol Evol 47: 471–485 (1998).

    Hackett SJ, Kimball RT, Reddy S, Bowie RCK, Braun EL, et al: A phylogenomic study of birds reveals their evolutionary history. Science 320: 1763–1768 (2008).

    Hamada K, Horiike T, Kanaya S, Nakamura H, Ota H, et al: Changes in body temperature pattern in vertebrates do not influence the codon usages of alpha-globin genes. Genes Genet Syst 77: 197–207 (2002).

    Harshman J, Huddleston CJ, Bollback JP, Parsons TJ, Braun MJ: True and false gharials: A nuclear gene phylogeny of crocodylia. Syst Biol 52: 386–402 (2003).

    Heath TA, Hedtke SM, Hillis DM: Taxon sampling and the accuracy of phylogenetic analyses. J Syst Evol 46: 239–257 (2008).

    Hedges SB, Poling LL: A molecular phylogeny of reptiles. Science 283: 998–1001 (1999).

    Hubbard TJP, Aken BL, Ayling S, Ballester B, Beal K, et al: Ensembl 2009. Nucleic Acids Res 37: D690–697 (2009).

    Hugall AF, Foster R, Lee MSY: Calibration choice, rate smoothing, and the pattern of tetrapod diversification according to the long nuclear gene RAG-1. Syst Biol 56: 543–563 (2007).

    Hughes S, Mouchiroud D: High evolutionary rates in nuclear genes of squamates. J Mol Evol 53: 70–76 (2001).

    Hughes S, Zelus D, Mouchiroud D: Warm-blooded isochore structure in Nile crocodile and turtle. Mol Biol Evol 16: 1521–1527 (1999).

    Hughes S, Clay O, Bernardi G: Compositional patterns in reptilian genomes. Gene 295: 323–329 (2002).

    Huson DH, Bryant D: Application of phylogenetic networks in evolutionary studies. Mol Biol Evol 23: 254–267 (2006).

    Hyder SM, Nawaz Z, Chiappetta C, Yokoyama K, Stancel GM: The protooncogene c-jun contains an unusual estrogen-inducible enhancer within the coding sequence. J Biol Chem 270: 8506–8513 (1995).

    International Chicken Genome Sequencing Consortium: Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432: 695–716 (2004).

    International Human Genome Sequencing Consortium: Initial sequencing and analysis of the human genome. Nature 409: 860–921 (2001).

    Iwabe N, Hara Y, Kumazawa Y, Shibamoto K, Saito Y, et al: Sister group relationship of turtles to the bird-crocodilian clade revealed by nuclear DNA-coded proteins. Mol Biol Evol 22: 810–813 (2005).

    Jabbari K, Cacciò S, Païs de Barros JP, Desgrès J, Bernardi G: Evolutionary changes in CpG and methylation levels in the genome of vertebrates. Gene 205: 109–118 (1997).

    Jabbari K, Clay O, Bernardi G: GC3 heterogeneity and body temperature in vertebrates. Gene 317: 161–163 (2003).

    Janssen P, Goldovsky L, Kunin V, Darzentas N,Ouzounis CA: Genome coverage, literallyspeaking. The challenge of annotating 200genomes with 4 mi llion publications. EMBORep 6:397399 (2005).

    Jeffroy O, Brinkmann H, Delsuc F, Philippe H:Phylogenomics: the beginni ng of incongru-ence? Trends Genet 22: 225231 (2006).

    Jiang ZJ, Castoe TA, Austin CC, Burbrink FT,Herron MD, et al: Comparative mitochon-drial genomics of snakes: Extraordinary

    substitution rate dynamics and functionalityof the duplicate control region. BMC EvolBiol 7: 123 (2007).

    Kent J, Coriat AM, Sharpe PT, Hastie ND,van Heyningen V: The evolution of WT1sequence and expression pattern in the ver-tebrates. Oncogene 11: 17811792 (1995).

    Kim RH, Smith PD, Aleyasin H, Hayley S, MountMP, et al: Hypersensitivity of DJ-1-deficientmice to 1-methyl-4-phenyl-1,2,3,6-tetrahy-dropyrindine (MPTP) and oxidative stress.Proc Natl Acad Sci USA 102: 52155220(2005).

    Lang JW, Andrews HV: Temperature-dependentsex determination in crocodilians. J ExpZool 270: 2844 (1994).

    Lau CK, Subramaniam M, Rasmussen K, Spels-berg TC: Rapid inhibition of the c-jun proto-oncogene expression in avian oviduct byestrogen. Endocrinology 127: 25952597(1990).

    Lau CK, Subramaniam M, Rasmussen K, Spels-berg TC: Rapid induction of the c-jun proto-oncogene in the avian oviduct by the anties-trogen tamoxifen. Proc Natl Acad Sci USA88: 829833 (1991).

    Leo LI, Ho PL, Junqueira-de-Azevedo I de LM:Transcriptomic basis for an antiserumagainst Micrurus corallinus (coral snake)venom. BMC Genomics 10: 112 (2009).

    Lee MSY: Molecules, morphology, and themonophyly of diapsid reptiles. Contrib Zool70: 122 (2001).

    Li W, Bernaola-Galvan P, Carpena P, Oliver JL:Isochores merit the prefix iso. Comput BiolChem 27: 510 (2003).

    Lockhart PJ, Steel MA, Hendy MD, Penny D: Recovering evolutionary trees under a more realistic model of sequence evolution. Mol Biol Evol 11: 605-612 (1994).

    Loomis WF, Smith DW: Molecular phylogeny of Dictyostelium discoideum by protein sequence comparison. Proc Natl Acad Sci USA 87: 9093-9097 (1990).

    Losos J, Braun E, Brown D, Clifton S, Edwards S, et al: Proposal to sequence the first reptilian genome: The green anole lizard, Anolis carolinensis. NIH White Paper available from http://www.genome.gov under The Large-Scale Genome Sequencing Program (2005).

    Løvtrup S: The Phylogeny of Vertebrata (John Wiley, London 1977).

    Maddison D, Maddison W: MacClade (Sinauer Associates, Sunderland 2002).

    Maddison WP: Gene trees in species trees. Syst Biol 46: 523-536 (1997).

    Mannen H, Li SS-L: Molecular evidence for a clade of turtles. Mol Phylogenet Evol 13: 144-148 (1999).

    Marcourt L, Cordier C, Couesnon T, Dodin G: Impact of C5-cytosine methylation on the solution structure of d(GAAAACGTTTTC)2. An NMR and molecular modelling investigation. Eur J Biochem 265: 1032-1042 (1999).

    Modi WS, Crews D: Sex chromosomes and sex determination in reptiles. Curr Opin Genet Dev 15: 660-665 (2005).

    Morishita S, Niwa K, Ichigo S, Hori M, Murase T, et al: Overexpressions of c-fos/jun mRNA and their oncoproteins (Fos/Jun) in the mouse uterus treated with three natural estrogens. Cancer Lett 97: 225-231 (1995).

    Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, et al: Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 294: 2348-2351 (2001).

    Niki T, Takahashi-Niki K, Taira T, Iguchi-Ariga SMM, Ariga H: DJBP: A novel DJ-1-binding protein, negatively regulates the androgen receptor by recruiting histone deacetylase complex, and DJ-1 antagonizes this inhibition by abrogation of this complex. Mol Cancer Res 1: 247-261 (2003).

    Olmo E: Trends in the evolution of reptilian chromosomes. Integr Comp Biol 48: 486-493 (2008).

    Omland KE: Correlated rates of molecular and morphological evolution. Evolution 51: 1381-1393 (1997).

    Organ CL, Janes DE: Evolution of sex chromosomes in Sauropsida. Integr Comp Biol 48: 512-519 (2008).


    Organ CL, Moreno RG, Edwards SV: Three tiers of genomic evolution in reptiles. Integr Comp Biol 48: 494-504 (2008).

    Page RDM, Charleston MA: From gene to organismal phylogeny: Reconciled trees and the gene tree/species tree problem. Mol Phylogenet Evol 7: 231-240 (1997).

    Parmley JL, Chamary JV, Hurst LD: Evidence for purifying selection against synonymous mutations in mammalian exonic splicing enhancers. Mol Biol Evol 23: 301-309 (2006).

    Phillips MJ, Penny D: The root of the mammalian tree inferred from whole mitochondrial genomes. Mol Phylogenet Evol 28: 171-185 (2003).

    Pollock DD, Eisen JA, Doggett NA, Cummings MP: A case for evolutionary genomics and the comprehensive examination of sequence biodiversity. Mol Biol Evol 17: 1776-1788 (2000).

    Qinghua L, Xiaowei Z, Wei Y, Chenji L, Yijun H, et al: A catalog for transcripts in the venom gland of the Agkistrodon acutus: identification of the toxins potentially involved in coagulopathy. Biochem Biophys Res Commun 341: 522-531 (2006).

    Rest JS, Ast JC, Austin CC, Waddell PJ, Tibbetts EA, et al: Molecular systematics of primary reptilian lineages and the tuatara mitochondrial genome. Mol Phylogenet Evol 29: 289-297 (2003).

    Rozek D, Pfeifer GP: In vivo protein-DNA interactions at the c-jun promoter: preformed complexes mediate the UV response. Mol Cell Biol 13: 5490-5499 (1993).

    Ruano-Rubio V, Fares MA: Artifactual phylogenies caused by correlated distribution of substitution rates among sites and lineages: the good, the bad, and the ugly. Syst Biol 56: 68-82 (2007).

    Saccone S, Federico C, Bernardi G: Localization of the gene-richest and the gene-poorest isochores in the interphase nuclei of mammals and birds. Gene 300: 169-178 (2002).

    Saint KM, Austin CC, Donnellan SC, Hutchinson MN: c-mos, a nuclear marker useful for squamate phylogenetic analysis. Mol Phylogenet Evol 10: 259-263 (1998).

    Sanderson MJ, Kim J: Parametric phylogenetics? Syst Biol 49: 817-829 (2000).

    Sarre SD, Georges A, Quinn A: The ends of a continuum: Genetic and temperature-dependent sex determination in reptiles. BioEssays 26: 639-645 (2004).

    Shedlock AM, Botka CW, Zhao S, Shetty J, Zhang T, et al: Phylogenomics of nonavian reptiles and the structure of the ancestral amniote genome. Proc Natl Acad Sci USA 104: 2767-2772 (2007).

    Simmons MP, Carr TG, O'Neill K: Relative character-state space, amount of potential phylogenetic information, and heterogeneity of nucleotide and amino acid characters. Mol Phylogenet Evol 32: 913-926 (2004).

    Stamatakis A: RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688-2690 (2006).

    Stancheva I, El-Maarri O, Walter J, Niveleau A, Meehan RR: DNA methylation at promoter regions regulates the timing of gene activation in Xenopus laevis embryos. Dev Biol 243: 155-165 (2002).

    Strausberg RL, Simpson AJG, Old LJ, Riggins GJ: Oncogenomics and the development of new cancer therapies. Nature 429: 469-474 (2004).

    Swofford D: PAUP*: Phylogenetic Analysis Using Parsimony (*and other methods), ver. 4 (Sinauer Associates, Sunderland 2003).

    Takahashi K, Taira T, Niki T, Seino C, Iguchi-Ariga SMM, Ariga H: DJ-1 positively regulates the androgen receptor by impairing the binding of PIASxα to the receptor. J Biol Chem 276: 37556-37563 (2001).

    Thiery JP, Macaya G, Bernardi G: An analysis of eukaryotic genomes by density gradient centrifugation. J Mol Biol 108: 219-235 (1976).

    Thomas JW: Comparative vertebrate genomics, in Brown JR (ed): Comparative Genomics: Basic and Applied Research (CRC Press, Boca Raton 2008).

    Vogt PK: Jun, the oncoprotein. Oncogene 20: 2365-2377 (2001).

    Waddell PJ, Kishino H, Ota H: Phylogenetic methodology for detecting protein interactions. Mol Biol Evol 24: 650-659 (2007).

    Werneburg I, Sánchez-Villagra MR: Timing of organogenesis support basal position of turtles in the amniote tree of life. BMC Evol Biol 9: 82 (2009).

    Wiens JJ, Kuczynski CA, Smith SA, Mulcahy DG, Sites JWJ, et al: Branch lengths, support, and congruence: testing the phylogenomic approach with 20 nuclear loci in snakes. Syst Biol 57: 420-431 (2008).

    Yamakoshi K, Shimoda N: De novo DNA methylation at the CpG island of the zebrafish no tail gene. Genesis 37: 195-202 (2003).

    Zardoya R, Meyer A: The evolutionary position of turtle revised. Naturwissenschaften 88: 193-200 (2001).