Current Biology, Volume 27
Supplemental Information
Phylogenomics and Morphology
of Extinct Paleognaths
Reveal the Origin and Evolution of the Ratites
Takahiro Yonezawa, Takahiro Segawa, Hiroshi Mori, Paula F. Campos, Yuichi Hongoh, Hideki Endo, Ayumi Akiyoshi, Naoki Kohno, Shin Nishida, Jiaqi Wu, Haofei Jin, Jun Adachi, Hirohisa Kishino, Ken Kurokawa, Yoshifumi Nogi, Hideyuki Tanabe, Harutaka Mukoyama, Kunio Yoshida, Armand Rasoamiaramanana, Satoshi Yamagishi, Yoshihiro Hayashi, Akira Yoshida, Hiroko Koike, Fumihito Akishinonomiya, Eske Willerslev, and Masami Hasegawa
Figure S1
Figure S1 | Maximum likelihood tree of Aves as inferred from concatenated sequences of
multiple nuclear genes (nuc) and mitochondrial genomes (mt) (Related to Figure 1). The numbers
on the branches indicate the bootstrap probabilities (nuc+mt/nuc/mt). Several nodes do not exist
because of missing taxa in each dataset. The bootstrap probabilities in such nodes are indicated as NA
(not applicable). “--” indicates nodes not supported by the maximum likelihood tree of the respective
dataset. This tree topology was used in estimating the divergence times, ancestral body (or egg)
weights and geographic distributions, except that the basal position of Anhimidae was assumed within
Anseriformes following previous studies [S1, S2], as shown in the sub-tree in the dashed
rectangle. Although the sister relationship between Anhimidae and Anseranas was preferred by our
analysis, it may be an analytical artefact caused by the very limited number of shared sites within
Anseriformes in our supermatrix. The nodal numbers correspond to the numbers in “Table S2, S3 and
S5”. The black nodes were calibrated using the fossil evidence for the divergence time estimations. The
grey nodes were optionally calibrated to test the hypothesis of the divergence times.
Figure S2
Figure S2 | Effects of taxon sampling and fossil calibrations on the divergence time estimates of
Palaeognathae (Related to Figure 1). The black branches indicate divergence time estimates without
any calibration between Neognathae and Palaeognathae. The time tree with blue branches was
calibrated with the fossil of Ichthyornis (Neognathae/Palaeognathae split was younger than 86.5 mya),
and the time tree with red branches was calibrated with the fossil of Enaliornis
(Neognathae/Palaeognathae split was younger than 100.5 mya). The images on the left indicate taxon
sampling. The first time tree was based on all taxa including ratites, tinamous, neognaths and reptiles
(outgroup). The second time tree was based on the Aves including ratites, tinamous and neognaths (no
root in this tree). The third time tree was based on ratites, tinamous and reptiles. The last time tree was
based on ratites, neognaths and reptiles.
Figure S3
Figure S3 | Reconstruction of the ancestral geographic distribution areas (Related to Figure 3
and Figure 4). The colours of the branches and the nodes indicate the geographic distribution areas.
Fossil species without molecular data are indicated by †. The species indicated by circles were used for
the reconstruction of the ancestral geographic distribution. The species indicated by squares are fossil
species and were not involved in the analysis. (A) Reconstructed ancestral geographic distribution
areas based on the 2 states (Northern Hemisphere and Southern Hemisphere) model by the parsimony
method. (B) Reconstructed ancestral geographic distribution areas based on the 7 states (Palearctic,
Nearctic, Afrotropical, South America, Australia, Madagascan, and Zealandia) model by the Bayesian
method. Since no paleognaths, including fossil species, are known from the Indomalay region, Indomalay was
merged with the Palearctic region. (C) Reconstructed ancestral geographic distribution areas based on the 5
states (Palearctic+Nearctic, Afrotropical, South America+Australia+Antarctica, Madagascan, and
Zealandia) model by the parsimony method. (D) Reconstructed ancestral geographic distribution areas
based on the 5 states (Palearctic+Nearctic, Afrotropical, South America+Australia+Antarctica,
Madagascan, and Zealandia) model by the Bayesian method. The regions in a pie chart on a node are
proportional to the posterior probabilities.
Table S1: List of ancient elephant bird samples analyzed in this study (Related to Figure 1)
Lab. No. | Location | Excavation age | Genus | Part | GLa | PBb | PDc | Date BP
07AEP05 | Beloha | | Aepyornis | Tarsometatarsus | 354.5 | 152.8 | 131.2 | 1580 ± 80
07AEP06 | Beloha | | Mullerornis | Tarsometatarsus | 292.9 | 74.8 | 73.6 | 1290 ± 90
07APE07 | Belo-sur-Mer | | Mullerornis | Tarsometatarsus | 306.3 | 78.4 | 73.2 |
07APE08 | Beloha | | Mullerornis | Tarsometatarsus | 292.2 | 71.9 | 70.6 |
07APE09 | Beloha | | Mullerornis | Tarsometatarsus | 339 | 81.6 | 78 |
07APE10 | E25B | | Aepyornis | Tarsometatarsus | 338.5 | 120.4 | 128.1 |
07APE11 | Antseirab | | Mullerornis | Tarsometatarsus | 267.8 | 98.7 | 94.9 |
08AEP07 | Beloha | 1914 | Aepyornis | Tibiotarsus | 618 | 591 | 472 | 1581 ± 23*
08APE01 | Antsirabe | 1914 | Aepyornis? | Tarsometatarsus | 261.8 | 100.1 | 98.2 |
08APE03 | ? | | Aepyornis | Tarsometatarsus | 334 | 132.7 | -- |
Index refers to the index used while constructing Illumina DNA libraries.
14C dates were determined by AMS at the MALT facility, the University of Tokyo.
*Graphite preparation was performed at the 14C Dating Laboratory, the University Museum, the University of Tokyo and 14C dates were measured at Paleo Lab Co., Ltd.
aGL: Greatest length (mm)
bPB: Breadth of the proximal end (mm)
cPD: Breadth of the distal end (mm)
Table S2: Fossil calibrations used for the divergence time estimations (related to Figure 1)
nodal number divergence event maximum time minimum time Reference
node23 Crocodilia - Aves 259 mya 243 mya Haddrath & Baker [S3], Benton et al. [S4], Muller & Reisz [S8]
node12 Casuarius - Dromaius 35 mya 25 mya Haddrath & Baker [S3], Boles [S6]
node24 Procellariiformes - Sphenisciformes 62 mya 60 mya Haddrath & Baker [S3], Slack et al. [S12]
node25 Rostratulidae - Jacanidae* 32 mya 30 mya Haddrath & Baker [S3], Rasmussen et al. [S11]
node26 Anatidae - Anseranatidae 68 mya 66 mya Haddrath & Baker [S3], Clarke et al. [S7]
node27 Caiman - Alligator 71 mya 66 mya Haddrath & Baker [S3], Muller & Reisz [S8]
node1 Neognathae - Palaeognathae 100.5 mya Bell & Chiappe [S5], O’Connor [S9]
86.5 mya Benton et al. [S4]
node14 Dinornithidae - Tinamidae 66 mya Parris & Hope [S10]
Nodal numbers correspond to those in Figure S1.
Basic calibrations
Optional calibrations
Optional calibrations were used for evaluating the stability of the divergence times and for testing the evolutionary hypothesis.
*This calibration was originally used for the Scolopacidae-Jacanidae split [S3]. However, since this calibration is based on the oldest fossil of Jacanidae, it should be
applied to the split between Jacanidae and its closest relative (Rostratulidae in this case) rather than to the Scolopacidae-Jacanidae split. To evaluate the
appropriateness of this calibration, we optionally excluded it, but this had only a limited effect (see Table S3).
Table S4: Species identification of the eggshells from Madagascar (related to Figure 2)
The numbers of base differences between the nucleotide sequences from egg shells and those of known species are shown.
All ambiguous sites were removed from this analysis.
Sequence | Aepyornis GU799601 | Aepyornis GU799600 | Mullerornis GU799591
Aepyornis maximus 07AEP05 (this study) | 0 | 1 | 10
Aepyornis maximus 08AEP07 (this study) | 0 | 1 | 10
Aepyornis hildebrandti KJ749824 | 1 | 2 | 10
Mullerornis 07AEP06 (this study) | 5 | 6 | 0
Mullerornis agilis AY016018 | 5 | 6 | 0
Mullerornis agilis KJ749825 | 5 | 6 | 0
Apteryx haastii AF338708 | 12 | 13 | 19
Apteryx owenii GU071052 | 13 | 14 | 20
Casuarius casuarius AF338713 | 14 | 15 | 13
Dromaius novaehollandiae AF338711 | 14 | 13 | 14
Dinornis giganteus AY016013 | 9 | 8 | 16
Anomalopteryx didiformis NC_002779 | 10 | 9 | 16
Emeus crassus AY016015 | 9 | 8 | 17
Eudromia elegans AF338710 | 14 | 14 | 24
Tinamus major AF338707 | 15 | 14 | 23
Rhea americana AF090339 | 8 | 9 | 15
Pterocnemia pennata AF338709 | 7 | 8 | 16
Struthio camelus AF338715 | 15 | 14 | 16
Total sequence length | 82 bp | 82 bp | 230 bp
sequences from egg shell
Table S3: Stability of the estimated divergence times (Related to Figure 1) [separate file]
Table S5: Body and egg weights of the extant and ancestral Aves (Related to Figure 2 and Figure 4) [separate file] [S13-S25]
Supplemental Experimental Procedures
Ancient DNA
Information on the specimens
The sub-fossils of elephant birds used for this study were kept in the specimen room of the
paleontological laboratory of the Department of Biological Anthropology and Palaeontology, Faculty of
Science, University of Antananarivo (Antananarivo, Madagascar). We transferred the specimens between
27 and 30 August 2007 and between 16 and 18 September 2008 with the permission of the government of
Madagascar. Although detailed information on the sampling area and date is not archived, locality names
and numbers were written on the surface of the bones. For example, bones inscribed with Beloha or
Antsirabe and 1914 were assumed to have been collected at Beloha or Antsirabe in 1914. The specimens
from Beloha were morphologically classified into Aepyornis maximus (abbreviated as AEP) and
Mullerornis sp. (abbreviated as MUL). To reduce the possibility of sampling more than once from the
same individual, only the tarsometatarsus was used for the analysis, except for specimen 08AEP07, for
which the tibiotarsus was used. Approximately 0.4 g of bone was sampled from a section of
tarsometatarsus or tibiotarsus with a Volvere GX dental drill (Nakanishi) using a diamond disk55 dental
disk (Shofu Inc.), and the outermost layers were abraded to remove contaminating material. Detailed
information on these specimens is summarized in Table S1. Dates before present (BP) of the AEP and
MUL bone samples were measured by 14C analysis at the laboratory of carbon dating, University
Museum, University of Tokyo (07AEP05, 1580 ± 80 BP; 07AEP06, 1290 ± 90 BP; 08AEP07, 1581 ± 23
BP; Table S1).
DNA extraction from fossil species
DNA extraction from 10 ancient elephant bird bone samples (Table S1) was performed at the Centre for
GeoGenetics, Denmark. DNA extraction was performed as described previously [S26]. In brief,
approximately 0.15 g of bone powder generated by drilling was incubated overnight at 55°C in 1 ml of
buffer containing 0.5 M EDTA (pH 8.0) (Life Technologies) and 0.1 mg/ml proteinase K (Roche). After
centrifugation at 470 relative centrifugal force for 5 minutes at room temperature, supernatant was
transferred to 30-kDa Amicon filter units (Millipore) and centrifuged at 4000 g. Approximately 250 µl of
concentrate was recovered from each sample and further purified using a MinElute PCR Purification Kit
(Qiagen) following the manufacturer’s instructions, except that in the elution step the spin column was
incubated with 45 μl of EB buffer at 37°C for 10 minutes. The same procedure without bone powder was
conducted as a negative control and the resulting eluate was examined to identify contaminating DNA.
Genome library construction
A-tailed libraries were prepared with 16 μl of DNA sample using a NEBNext Quick DNA Library
Prep Master Mix Set for 454 (E6090; New England BioLabs) according to the manufacturer’s protocol but
without the DNA fragmentation process. Illumina sequencing adapters were added to the end-repair
reaction at a final concentration of 0.25 μM together with 1 unit of Quick T4 DNA ligase. DNA fragments
in the reaction mixture were purified using the MinElute PCR Purification Kit (Qiagen) and eluted in 20
μl of EB buffer at 37°C for 10 minutes.
Three microlitres of the eluate were subjected to PCR amplification in 50 μl of reaction mixture
containing 1× High Fidelity PCR buffer, 2 mM MgSO4, 0.2 mM of each dNTP, 1 unit Platinum Taq
DNA polymerase High Fidelity (Life Technologies), 0.2 μM of the Illumina Multiplexing PCR primer
InPE1.0 and 0.004 µM of InPE2.0, and 0.2 µM of PCR primer with an index [S27]. PCR conditions
consisted of initial denaturation for 4 minutes at 94°C, 12 cycles of 30 seconds at 94°C, 30 seconds at
60°C and 45 seconds at 68°C, and a final extension for 7 minutes at 72°C. Three independent reaction
mixtures were combined after PCR and purified using a NucleoSpin Gel and PCR Clean-up Kit
(Macherey-Nagel) with a 10-minute incubation at 37°C for the elution step. A second-round PCR consisting
of 18 cycles was performed under the same conditions except that only a modified InPE1.0 primer and the
indexed primer were used. The PCR products were separated by 3% agarose gel electrophoresis and
purified as described above.
Blunt-end libraries were constructed according to Orlando et al. [S28], using 21.25 µl of DNA
sample and the NEBNext DNA Library Prep Master Mix Set for 454 (E6070; New England BioLabs).
DNA purification after adapter ligation was performed as described above. PCR conditions were the same
as above except that 10 instead of 12 cycles were used in the second-round PCR.
Illumina sequencing
Single-end and paired-end reads were generated on an Illumina MiSeq platform using the MiSeq Reagent
Kit v2 or v3 (Illumina) at the NIPR. Read files (fastq.gz) were generated using MiSeq Reporter software
version 2.3.32 (Illumina). As a result, raw sequence reads (208,414,418 for Aepyornis maximus and
176,042,008 for Mullerornis sp.) were generated.
PCR amplification and Sanger sequencing of D-loop and tRNA (Pro-Thr) regions of mitochondrial
genomes
Mitochondrial genome regions not recovered by Illumina sequencing were amplified by PCR using
the primers shown in DataBaseTable2 (http://aepyornis.paleogenome.jp/) and the p5 or p7 region of the
Illumina adapter sequences. PCR amplification was performed in 25 μl of reaction mixture containing 1 μl
of genome library DNA, 1.25 U Platinum Taq DNA polymerase High Fidelity (Life Technologies), 1×
High Fidelity PCR buffer, 2 mM MgSO4, 0.25 mM of each dNTP and 0.25 µM of each primer. PCR
conditions were as follows: 3 minutes of initial denaturation at 94°C, 30 cycles of 45 seconds of
denaturation at 94°C, 45 seconds of annealing at 53°C, and 60 seconds extension at 68°C, with a final
extension for 10 minutes at 72°C. A control reaction without the DNA template was performed for each
PCR attempt. PCR products were sequenced using a BigDye Terminator Cycle Sequencing Kit v3.1 and
an ABI 3130xl Genetic Analyser (Applied Biosystems).
Bioinformatics
Quality filtering of Illumina sequence data
We discarded the Illumina MiSeq platform reads that contained ambiguous nucleotides or were mapped to
the PhiX genome sequence, using Bowtie 2 version 2.1.0 [S29] with default parameters. After that, we
removed the adapter sequences in the reads using Cutadapt version 1.2.1 [S30] and low quality regions in
the 3' end of the reads with a Phred-like quality score <17. In addition, we discarded the reads that
contained <50 bp or were associated with an average Phred-like quality score <25.
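As an illustration only, the read-level filters described above can be sketched in Python. The sketch assumes that reads have already been screened against PhiX and adapter-trimmed (done in the study with Bowtie 2 and Cutadapt), that 3' trimming removes trailing bases one at a time while their quality is below the threshold, and that the average quality is computed after trimming. None of these details is stated in the text, and the function names are hypothetical.

```python
def trim_3prime(seq, quals, min_q=17):
    """Strip bases from the 3' end while their Phred-like quality is below min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def filter_read(seq, quals, min_q=17, min_len=50, min_mean_q=25):
    """Return the trimmed (seq, quals), or None if the read is discarded.

    Assumed order of operations: ambiguity check, 3' quality trimming,
    then the length and average-quality cut-offs.
    """
    # Discard reads that contain ambiguous nucleotides (anything but A, C, G, T).
    if any(base not in "ACGT" for base in seq.upper()):
        return None
    seq, quals = trim_3prime(seq, quals, min_q)
    # Discard reads shorter than 50 bp after trimming.
    if len(seq) < min_len:
        return None
    # Discard reads whose average quality is below 25.
    if sum(quals) / len(quals) < min_mean_q:
        return None
    return seq, quals
```

Here a 60-base read whose last five bases have quality 10 is trimmed to 55 bases and kept, whereas the same read with fifteen low-quality trailing bases falls below 50 bp and is discarded.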
Identification of Aepyornis maximus and Mullerornis sp. reads
The MiSeq reads that were derived from Aepyornis maximus or Mullerornis sp. genomes were identified
as follows. (i) An in-house nucleotide database comprising the genome sequences of paleognaths and their
relatives (designated as in-house Palaeognathae DB) was constructed by combining previously reported
sequence data [S3, S31-32] (DataBaseDataset1: http://aepyornis.paleogenome.jp/ ). (ii) All of the high
quality reads were subjected to BLASTN searches against the in-house Palaeognathae DB with E-value
<0.001. (iii) The reads that matched the sequences in the in-house Palaeognathae DB were subjected to
BLASTN searches against the GenBank-nt database (January 2014) with E-value <1e−4. (iv) The reads
that matched the sequences from birds or reptiles in the GenBank-nt database were regarded as genome
fragments of Aepyornis maximus or Mullerornis sp.
Analyses of DNA fragmentation and nucleotide misincorporation patterns
Although the MiSeq sequencing conditions used in this study can generate reads with >250 bp, the
average and median of the actual lengths of the reads identified as Aepyornis maximus or Mullerornis sp.
mitochondrial sequences were only 75–89 bp (DataBaseTable3: http://aepyornis.paleogenome.jp/ ).
We used mapDamage version 0.3.3 [S33] to analyse DNA fragmentation and nucleotide
misincorporation patterns across all of the identified mitochondrial reads from four samples (two from
Aepyornis maximus and two from Mullerornis sp.). In each sample, the reads were mapped to the
Aepyornis maximus or Mullerornis sp. mitochondrial genome sequence with BWA version 0.6.1 [S34],
and analysed using mapDamage [S33] with parameters (-l 300 -u).
The patterns of DNA fragmentation and nucleotide misincorporation analysed using mapDamage are
shown in DataBaseFigure5 (http://aepyornis.paleogenome.jp/). The results showed patterns typical of
ancient DNA: (a) an increase in misincorporation of thymine instead of cytosine residues
at the 5′-end regions of the reads, paralleled by an increase of adenine instead of guanine residues at
the 3′-end regions, and (b) a higher frequency of guanine and adenine at the nucleotide site adjacent to the
5′-end of the reads. These patterns suggest that the former reflects cytosine deamination and that the latter reflects
strand breaks resulting from depurination [S35].
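To make pattern (a) concrete, the following Python sketch tabulates how often a given reference base is read as another base at each position from one end of the reads. It is a simplified stand-in for the kind of summary mapDamage reports, not its implementation. It assumes ungapped read/reference pairs of equal length, and the function name and input format are hypothetical.

```python
def substitution_frequency(pairs, ref_base, read_base, n_positions=5, from_3prime=False):
    """Per-position frequency of ref_base being read as read_base.

    pairs: iterable of (reference, read) strings, ungapped and of equal length.
    Positions are counted from the 5' end, or from the 3' end if from_3prime is True.
    Returns a list of fractions; None marks positions where ref_base never occurs.
    """
    totals = [0] * n_positions
    hits = [0] * n_positions
    for ref, read in pairs:
        if from_3prime:
            ref, read = ref[::-1], read[::-1]
        for i, (r, q) in enumerate(zip(ref, read)):
            if i >= n_positions:
                break
            if r == ref_base:
                totals[i] += 1
                if q == read_base:
                    hits[i] += 1
    return [h / t if t else None for h, t in zip(hits, totals)]


# Toy data: two of three reference cytosines at the first 5' position read as thymine.
toy = [("CACG", "TACG"), ("CCGT", "CCGT"), ("CAGT", "TAGT"), ("ACGC", "ACGC")]
print(substitution_frequency(toy, "C", "T", n_positions=3))
```

Calling the same function with ref_base="G", read_base="A" and from_3prime=True gives the complementary 3′-end profile.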
Mitochondrial genome reconstruction
MiSeq reads identified as mitochondrial genome fragments of Aepyornis maximus or Mullerornis sp.
were used for the reconstruction of the nearly complete mitochondrial genome sequences of each species
by the following procedure: (i) We conducted the initial assembly by combining the identified
mitochondrial reads from two samples per species and mapping them against the mitochondrial genome
sequence of Dromaius novaehollandiae (AF338711) using MIRA version 4.0 [S36] with parameters (-
NW:cnfs=warn -NW:cmrnl=no -AS:nop=1 -SK:bph=10 -CL:pecbph=20 SOLEXA_SETTINGS -
CO:msr=no -AS:epoq=no -AS:mrpc=2 -SK:pr=80 -AL:ms=10 -AL:mo=15). (ii) An iterative refinement
of the assembly was performed using MITObim version 1.7 [S37] with a maximum of 20 rounds of
iteration. (iii) For each gap region on the mitochondrial genome, candidate reads that filled the gap were
identified by a BLASTN search using all of the high quality reads before taxonomic assignment and
Sanger sequencing reads obtained as described above, as queries against the initial assembly of the
mitochondrial genome sequence with E-value <0.1, identity >80%, and alignment length >20 bp. (iv) To
remove partially matched false positive reads, we constructed a multiple alignment of the candidate reads
and the initial assembly of the mitochondrial genome sequence using MAFFT version 7.13 [S38] with
parameters (--localpair --maxiterate 1000). (v) The reads that overlapped gap regions and almost
completely aligned with the initial assembly of the mitochondrial genome sequence were used for manual
gap filling. The assembled sequences of Aepyornis maximus and Mullerornis sp. were included in the
phylogenomic analyses. Because coverage of the Aepyornis maximus and Mullerornis sp. genomes is not
extensive, their ancestral mitochondrial genome sequences were reconstructed from the combined data of
two individuals for each species using PAML ver. 4.7 [S39] with the codon substitution+Γ model for
protein coding genes and the GTR+Γ model for RNA genes and introns. The composite genomes of
Aepyornis maximus and Mullerornis sp. were aligned with the nucleotide sequences of other avian species
and reptilian species.
The numbers of MiSeq raw reads and high quality reads are summarized in DataBaseTable4.
DataBaseTable5 and DataBaseTable6 (http://aepyornis.paleogenome.jp/) present the results of automated
identification of the Aepyornis maximus and Mullerornis sp. reads. The total number of contigs and
singleton reads for each nuclear gene after manual sequence validation is presented in DataBaseTable7
(http://aepyornis.paleogenome.jp/). The assembly results of Aepyornis maximus and Mullerornis sp.
mitochondrial genomes are summarized in DataBaseTable8 (http://aepyornis.paleogenome.jp/). The
Aepyornis maximus and Mullerornis sp. mitochondrial genome sequences were deposited with accession
numbers AP014697 and AP014698, respectively. The nucleotide alignments are available from
http://aepyornis.paleogenome.jp/.
Reconstruction of nuclear genes
MiSeq reads identified as a part of nuclear genes of Aepyornis maximus or Mullerornis sp. were used
for the reconstruction of each gene sequence as follows. (i) The identified reads were independently
assembled for each gene using CAP3 software [S40]. (ii) Assembled contigs and singletons were multiply
aligned with reference gene sequences in the in-house Palaeognathae DB using MAFFT [S38] with
parameters (--adjustdirectionaccurately --genafpair --maxiterate 1000). (iii) Partially matched false
positive reads were manually removed by checking the multiple alignment result with the UGENE viewer
[S41]. (iv) Each gene sequence of Aepyornis maximus and Mullerornis sp. was manually reconstructed
from the multiple alignment result. The minimum read depth required for reconstruction of a nuclear
sequence was one read.
As shown in DataBaseTable5, the numbers of high quality MiSeq reads are
187,903,977 (Aepyornis maximus in total) and 150,210,376 (Mullerornis sp. in total). The
numbers of MiSeq reads identified as nuclear gene (locus) sequences are shown in DataBaseTable6:
30,174 reads were identified as Aepyornis maximus and 29,134 reads as
Mullerornis sp. Therefore, 0.016% (Aepyornis maximus) to 0.019% (Mullerornis sp.) of the high quality
reads were identified as endogenous DNA. Since our in-house database (DataBaseTable9) consists of
871,499 sites (≈0.87 Mbp), if the avian genome size is assumed to be 1 Gbp, the database covers about
1/1,250 of the whole genome.
Therefore, theoretically, about 20% of the MiSeq reads are endogenous DNA. However, considering that the
reconstructed genome fragments of Aepyornis maximus and Mullerornis sp. comprise only 89,203
sites (10.2% of the in-house database) and 49,480 sites (5.68% of the in-house database), respectively, it is
plausible that the proportion of endogenous DNA is much smaller.
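The arithmetic behind these percentages can be retraced with a short Python sketch. All counts are the ones quoted in the paragraph above; the 1 Gbp genome size is the assumption stated there, and the function name is hypothetical. With the exact database size the database-to-genome ratio evaluates to roughly 1/1,150, so the extrapolated fractions come out near 18% and 22%, in line with the "about 20%" given in the text.

```python
# Counts quoted in the text.
HIGH_QUALITY_READS = {"Aepyornis maximus": 187_903_977, "Mullerornis sp.": 150_210_376}
NUCLEAR_READS = {"Aepyornis maximus": 30_174, "Mullerornis sp.": 29_134}
RECOVERED_SITES = {"Aepyornis maximus": 89_203, "Mullerornis sp.": 49_480}
DATABASE_SITES = 871_499
GENOME_SIZE = 1_000_000_000  # assumption from the text: 1 Gbp avian genome


def endogenous_summary(species):
    """Return the three fractions discussed in the text for one species."""
    # Fraction of high quality reads that matched a nuclear locus in the database.
    matched = NUCLEAR_READS[species] / HIGH_QUALITY_READS[species]
    # Fraction of the genome represented in the in-house database.
    db_fraction = DATABASE_SITES / GENOME_SIZE
    return {
        "matched_fraction": matched,
        # Naive extrapolation from the database to the whole genome.
        "extrapolated_fraction": matched / db_fraction,
        # Share of database sites actually recovered for this species.
        "database_coverage": RECOVERED_SITES[species] / DATABASE_SITES,
    }


for species in HIGH_QUALITY_READS:
    s = endogenous_summary(species)
    print(
        f"{species}: matched {s['matched_fraction']:.3%}, "
        f"extrapolated {s['extrapolated_fraction']:.1%}, "
        f"database coverage {s['database_coverage']:.2%}"
    )
```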
Evolutionary Analyses
Molecular phylogenetic inference
Our data consisted of mitochondrial genomes (13 protein-coding regions, 2 ribosomal RNA
genes, 22 transfer RNA genes: 15,977 bp in total), and multiple nuclear genes including the following
three datasets: those of Haddrath and Baker [S3] (10 nuclear loci; exons: 9,888 bp), Harshman et al.
[S32](19 nuclear loci; exons and introns: 26,775 bp) and Smith et al. [S42] (40 nuclear loci; introns:
25,131 bp). These sequences were separately aligned using MAFFT [S38] and MUSCLE [S43], and were
carefully checked visually. Detailed information on these genes is presented in DataBaseTable9
(http://aepyornis.paleogenome.jp/). The alignment was deposited on the web site
http://aepyornis.paleogenome.jp/. Taking account of the differences in taxon sampling, the partitions of
the mitochondrial genomes and the three nuclear gene datasets were analysed separately. Moreover, the
exon regions and intron regions in the dataset of Harshman et al. [S32] were dealt with separately. Then,
the best partition among loci within each data set was chosen using PartitionFinder [S44] under the
GTR+I+Γ model with the unlinked branch length model. The phylogenetic tree was inferred by the
maximum likelihood method using RAxML ver. 7.8.1 [S45]. The MTMAM+F+I+Γ model was used for
the amino acid sequences of protein-coding genes of the mitochondrial genomes. The GTR+I+Γ model
was used for the other regions. Taking account of the difference among codon positions, the first, second
and third positions of nuclear exons were partitioned. The confidence limits of the internal branches were
evaluated using the rapid bootstrap algorithm. The branch lengths of each partition were independently
estimated (unlinked branch length model).
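As a small illustration of the codon-position partitioning mentioned above, the following Python sketch splits an in-frame exon alignment into first, second and third codon positions. It assumes that the alignment starts at a first codon position and contains no frame-shifting gaps. The function name and the toy sequences are hypothetical and are not taken from the study's data.

```python
def split_codon_positions(alignment):
    """Split an in-frame exon alignment into three codon-position partitions.

    alignment: dict mapping taxon name to aligned sequence (all the same length).
    Returns a list of three dicts: first, second and third codon positions.
    """
    return [
        {name: seq[offset::3] for name, seq in alignment.items()}
        for offset in range(3)
    ]


# Toy alignment of two codons for two hypothetical taxa.
toy_alignment = {"taxon_a": "ATGGCC", "taxon_b": "ATGGCA"}
first, second, third = split_codon_positions(toy_alignment)
print(first, second, third)
```

Each partition can then be given its own substitution model parameters and branch lengths, as in the unlinked branch length model described above.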
The maximum likelihood trees as inferred from mitochondrial genomes (mt), multiple nuclear
genes (nuc), and combined sequence data (mt+nuc) are shown in Figure S1. The mt and nuc data support
an essentially identical topology within the Palaeognathae. A sister group relationship of the elephant bird
and the kiwi was supported by the mitochondrial genome [S46], and also by the combined analysis of
multiple nuclear genes and the mitochondrial genome, although the support value from the nuclear genes
alone was not high and was dependent on the model (partition model) and taxon sampling (data not
shown). The maximum likelihood tree as inferred from the combined sequence data (mt+nuc) gives strong
support to the basal position of the ostrich (100% BP). The sister group relationships between the elephant
bird and the kiwi (99% BP), between the cassowary and the emu (100% BP) and between the tinamou
and the moa (100% BP) were strongly supported, and that between the elephant bird–kiwi clade and the
cassowary–emu clade was moderately supported (80% BP). However, the position of the rhea was only
weakly supported (33% BP), and was highly affected by the model and taxon sampling (data not shown).
Morphological phylogenetic inference
To clarify the phylogenetic positions of the fossil paleognaths that became extinct in the
Palaeogene and the Neogene, such as Lithornis, Palaeotis and Emuarius, we reanalysed the morphological
data matrix created by Mayr [S47]. Mitchell et al. [S46] and Mayr [S47] carried out phylogenetic analyses
of the Palaeognathae based on total evidence from molecular and morphological data, and identified a
monophyletic relationship between Lithornis and the Tinamiformes. However, according to the results of
the pseudo-extinct analysis by Springer et al. [S48], in which they treated particular extant taxa (for
example, Carnivora and Afrotheria in Eutherian mammals) as if extinct and assumed that only osteologic
morphology data were available, phylogenetic analyses based on total evidence often failed to reconstruct
the well-established tree.
The major difficulty in phylogenetic inference based on morphologic data is that there is a
considerable amount of convergence among morphologic characters, but the current substitution models do
not take convergent evolution into account [S49, S50]. Consequently, independent lineages that possess
convergently acquired characters (for example, two distantly related groups such as Xenarthra and
Pholidota), or groups retaining morphologically primitive (symplesiomorphic) characters (for example,
polyphyletic or paraphyletic groups such as the so called “Lipotyphla” and “Artiodactyla”) are often
recognized erroneously as a monophyletic group (see the molecular and morphologic trees in O’Leary et
al. [S51]). To overcome this difficulty, the morphologic characters without homoplasy (convergence or
parallelism) should be used in phylogenetic reconstruction. Such characters were selected by the following
procedures. First, the morphologic characters of the extant species (including recently extinct elephant
birds and moas) were selected from the matrix in Mayr [S47]. Subsequently, the characters with
convergent, parallel or reversal evolution were excluded under the tree topology as inferred from the
molecular data (Figure S1). If a morphologic character has N states, the characters with ≤N – 1 steps were
selected. This procedure was carried out using Mesquite version 3.3 (http://mesquiteproject.org [S52])
with the parsimony criterion. Although the matrix presented by Mayr [S47] is limited to the Neornithes
(Palaeognathae + Neognathae), cases with information on basal fossil lineages of Aves were also taken
into account (characters 180–243 in Mayr’s [S47] matrix correspond to those in the matrix of Johnston et
al. [S53], including the basal fossil lineages of the Aves). Finally, the selected characters of the fossil
species were added to the data matrix.
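The selection rule (keep a character with N states only if it needs at most N − 1 steps on the molecular tree) can be sketched as follows. The study used Mesquite with the parsimony criterion. This sketch instead assumes unordered (Fitch) parsimony on a rooted binary tree written as nested tuples, treats "?" as missing data, and uses hypothetical function names and a toy four-taxon tree rather than the topology of Figure S1.

```python
def fitch_steps(tree, states):
    """Minimum number of state changes for one character (Fitch parsimony).

    tree: nested 2-tuples with taxon names (str) at the tips.
    states: dict mapping taxon name to a state symbol, "?" for missing data.
    """
    observed = {s for s in states.values() if s != "?"}
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):
            # A missing tip is compatible with every observed state.
            return set(observed) if states[node] == "?" else {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common
        steps += 1  # no shared state: one change is required at this node
        return left | right

    visit(tree)
    return steps


def is_homoplasy_free(tree, states):
    """True if a character with N observed states needs at most N - 1 steps."""
    n_states = len({s for s in states.values() if s != "?"})
    return fitch_steps(tree, states) <= n_states - 1


toy_tree = (("A", "B"), ("C", "D"))
print(is_homoplasy_free(toy_tree, {"A": "0", "B": "0", "C": "1", "D": "1"}))  # one change
print(is_homoplasy_free(toy_tree, {"A": "0", "B": "1", "C": "0", "D": "1"}))  # two changes
```

In the toy example the first character changes once on the tree and is retained, while the second requires two changes for two states and is excluded as homoplastic.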
The phylogenetic tree including both extant and extinct species was inferred from the selected
morphologic characters by the ML method using RAxML 7.2.6 [S45] and MrBayes ver. 3.2 [S54]
with the BIN+Γ model. The tree topology as inferred from the molecular data was given as the constraint
(Figure 3). Because the BIN model is a binary state version of the JC model, each multistate character (N
≥3) was converted into multiple binary state characters using the following procedures. If a character
changed from state X to state Y, and then changed from state Y to another state Z, we regarded it as a
nested character. If a character started from state X, and independently evolved into different states Y and
Z in two different lineages, we regarded it as an independent character. The final data matrix is presented
in DataBaseTable10 (http://aepyornis.paleogenome.jp/).
After morphological characters with homoplasy were excluded, 34 characters remained
(DataBaseTable10: http://aepyornis.paleogenome.jp/). There are many missing data for Palaeotis [S47],
and only seven characters remained in this taxon. All of these seven characters were commonly shared
with the ostrich and the rhea. Therefore, Palaeotis was excluded from the subsequent phylogenetic
analyses. The maximum likelihood tree as inferred from the morphologic data is shown in Figure 3A. The
most basal position of Lithornis among the Palaeognathae was supported by this analysis (77% BP/0.90
PP). The basal position of Lithornis and the ostrich among the Palaeognathae (irrespective of whether they
are monophyletic or paraphyletic) was also supported by a relatively high bootstrap value (92% BP/0.99
PP). Another extinct fossil taxon, Emuarius, was clustered together with the emu (99% BP/0.98 PP).
This method is effective in excluding homoplastic characters, but it does not exclude them completely. As far as the
extant species alone are concerned, character 12 (Os basisphenoidale [os parasphenoidale], position of
proc. basipterygoidei) of Mitchell et al. [S46] is not homoplastic. However, character state 1 (anterior to
basitemporal platform on the caudal end of rostrum parasphenoidale and widely separated) seems to have
convergently evolved in Anhimidae (Anseriformes) and Lithornis independently. When this character was
excluded from the analysis, Lithornis formed a monophyletic group with the ostrich with a low bootstrap
value (17% BP), but the basal position of Lithornis + ostrich was still supported by a relatively high
bootstrap value (91% BP). Although Palaeotis was not included in the phylogenetic inference, Palaeotis
shared a common state with the ostrich and the rhea (character 89). The possible phylogenetic positions of
Palaeotis based on this finding are also shown in Figure 3A (indicated by arrows).
Lithornis is a member of the family Lithornithidae together with Paracathartes and
Pseudocrypturus. According to Houde [S55], the ratites evolved polyphyletically from the
Lithornithidae, and he suggested the Lithornis-cohort as the ancestral stock of the ratites.
Therefore, it is also important to elucidate the phylogenetic positions of Paracathartes and
Pseudocrypturus among the Palaeognathae. Houde [S56] carried out an extensive phylogenetic analysis of
the Palaeognathae based on morphologic data of extant and extinct species including three species of
Lithornithidae (Lithornis, Paracathartes and Pseudocrypturus) as well as Palaeotis and Diogenornis.
Since he did not provide the character matrix, we reconstructed the character matrix based on his
phylogenetic tree (Figure 39 in Houde [S56]). In this process, we assumed there are no missing data.
Subsequently, we excluded morphologic characters with homoplasy under the assumption of the tree
topology of (((rhea, tinamou, (kiwi, cassowary)), Lithornis, ostrich), Neognathae) following Figure 3A.
Finally, 10 morphologic characters remained (DataBaseTable10: http://aepyornis.paleogenome.jp/) and the
maximum likelihood tree and the Bayesian tree were reconstructed under the constraint of the same
topology used in selecting the morphologic characters. The tree is shown in Figure 3B. Pseudocrypturus
was placed in the most basal position of the Palaeognathae, and the family Lithornithidae was recognized
as a paraphyletic group. Palaeotis and the ostrich form a monophyletic group. Therefore, all the fossil
Palaeognathae species distributed in the Northern Hemisphere were placed in basal positions among the
Palaeognathae. An essentially identical topology with the basal position of Lithornis+Paracathartes
among the Palaeognathae was also supported from the morphological character matrix selected from
Worthy et al. [S57] (92 characters remained; DataBase Table 10: http://aepyornis.paleogenome.jp/). The tree
is shown in Figure 3C.
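The text does not state how homoplastic characters were identified on the constraint topology. One common way to do this, sketched here as an assumption rather than as the procedure actually used, is to count the minimum number of state changes a character requires on the tree and to keep the character only if that count equals the number of observed states minus one. The function names and the two example characters below are hypothetical; only the topology is taken from the text.

```python
# Constraint topology quoted in the text, written as nested tuples (tips are strings).
CONSTRAINT = ((("rhea", "tinamou", ("kiwi", "cassowary")), "Lithornis", "ostrich"), "Neognathae")

def parsimony_steps(node, states):
    """Return (candidate states, minimum number of changes) for the subtree at `node`.

    Uses Hartigan's bottom-up pass, which extends Fitch parsimony to multifurcations.
    """
    if isinstance(node, str):
        return {states[node]}, 0
    counts, steps = {}, 0
    for child in node:
        child_states, child_steps = parsimony_steps(child, states)
        steps += child_steps
        for s in child_states:
            counts[s] = counts.get(s, 0) + 1
    best = max(counts.values())
    # Children that lack a best-supported state each cost one change.
    return {s for s, n in counts.items() if n == best}, steps + len(node) - best

def is_homoplasy_free(states, tree=CONSTRAINT):
    """A character is homoplasy-free if it changes exactly (number of states - 1) times."""
    _, steps = parsimony_steps(tree, states)
    return steps == len(set(states.values())) - 1

# Hypothetical binary characters, for illustration only (not taken from any published matrix).
char_a = {"rhea": 1, "tinamou": 1, "kiwi": 1, "cassowary": 1,
          "Lithornis": 0, "ostrich": 0, "Neognathae": 0}
char_b = {"rhea": 0, "tinamou": 1, "kiwi": 0, "cassowary": 0,
          "Lithornis": 0, "ostrich": 1, "Neognathae": 0}

print(is_homoplasy_free(char_a))  # True: a single change on the constraint tree
print(is_homoplasy_free(char_b))  # False: state 1 must arise twice
```

The sketch treats the polytomies in the constraint tree as hard multifurcations; a different treatment of unresolved nodes could change which characters are retained.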
On the other hand, Diogenornis discovered from Brazil clustered together with the rhea. As
mentioned above, Emuarius can be recognized as a sister group of the emus (Figure 3A), and all the fossil
species involved in this analysis from the Southern Hemisphere can be recognized as members of the
Notopalaeognathae, which includes the order Rheiformes (the rhea), the clade Novaeratitae (the kiwi,
emu, cassowary, and elephant bird), the order Tinamiformes (the tinamou) and the extinct order
Dinornithiformes (the moa) [S58].
Divergence time estimations
The divergence times in the evolution of the Palaeognathae were estimated by the relaxed molecular clock
method [S59, S60] using MCMCTREE implemented in PAML ver. 4.7 [S39]. To reduce the
computational burden, we applied the two-stage procedure of the Laplace method [S59]. Because most of
the information on divergence times and evolutionary rates is contained in the branch lengths, we
considered the likelihood of the branch lengths. In the first stage, we obtained maximum likelihood
estimates of the branch lengths and the Fisher information (Hessian matrix) from the sequence data
without the constraint of a molecular clock. The likelihood of the branch lengths was then approximated by a
multivariate normal distribution whose mean was the maximum likelihood estimate obtained above and
whose variance-covariance matrix was the inverse of the negative Hessian matrix.
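In its standard form, this approximation is a second-order expansion of the log-likelihood around the maximum likelihood estimate. The notation below is introduced here only for illustration and is not taken from the original analysis: $\mathbf{b}$ is the vector of branch lengths, $\hat{\mathbf{b}}$ its maximum likelihood estimate, and $\mathbf{H}$ the Hessian of the log-likelihood evaluated at $\hat{\mathbf{b}}$.

```latex
\ln L(\mathbf{b}) \;\approx\; \ln L(\hat{\mathbf{b}})
  + \tfrac{1}{2}\,(\mathbf{b}-\hat{\mathbf{b}})^{\mathsf T}\,\mathbf{H}\,(\mathbf{b}-\hat{\mathbf{b}}),
\qquad
\mathbf{H} = \left.\frac{\partial^{2}\ln L}{\partial\mathbf{b}\,\partial\mathbf{b}^{\mathsf T}}
  \right|_{\mathbf{b}=\hat{\mathbf{b}}}
```

Up to an additive constant, the right-hand side is the log-density of a multivariate normal distribution with mean $\hat{\mathbf{b}}$ and covariance matrix $(-\mathbf{H})^{-1}$.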
The fossil records used as the calibration points are summarized in Table S2 [S3-S12].
The prior distribution of the root rate was the gamma distribution with the shape parameter α set to 2 and
the scale parameter β set to 5.33. The prior distribution of σ² was the gamma distribution with the shape
parameter α set to 1 and the scale parameter β set to 0.3. These prior distributions were roughly estimated
from the genomic data of Baker et al. [S31] assuming the root of the tree, that is the divergence between
the Archosauria and Lepidosauria, as 300 mya (because we used one billion years as one time unit, 300
million years is 0.3 in this case) under the strict clock. For the Markov Chain Monte Carlo (MCMC)
analysis, the first 500,000 generations were discarded as the burn-in, and then 250,000 samples were taken,
one every 20 generations. Each analysis was run at least twice to confirm consistency between runs.
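The settings above imply a chain length and a unit conversion that can be checked with simple arithmetic. The following is a minimal sketch, assuming that sampling starts immediately after the burn-in; the variable and function names are illustrative only.

```python
# Settings quoted in the text; the names are illustrative only.
BURN_IN = 500_000          # generations discarded as burn-in
N_SAMPLES = 250_000        # samples retained after the burn-in
SAMPLE_INTERVAL = 20       # generations between retained samples
MYR_PER_TIME_UNIT = 1_000  # one time unit = one billion years = 1,000 Myr

def total_generations(burn_in, n_samples, interval):
    """Total chain length, assuming sampling starts right after the burn-in."""
    return burn_in + n_samples * interval

def to_time_units(age_myr):
    """Convert an age in millions of years to the time unit used for the priors."""
    return age_myr / MYR_PER_TIME_UNIT

print(total_generations(BURN_IN, N_SAMPLES, SAMPLE_INTERVAL))  # 5500000
print(to_time_units(300))                                      # 0.3
```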
The mitochondrial genomes, the three datasets of nuclear loci used in reconstructing the tree and the
genome scale dataset of Baker et al. [S31] (594 nuclear loci; exon: 795,492 bp) were used for this
analysis. Taking account of the difference of the taxon sampling, these five data sets were treated as
different partitions. Moreover, the first, second and third codon positions of the protein-encoding genes,
introns and RNAs were partitioned. Concerning Baker et al. [S31]’s genomic data, the partitions were
determined by the following two strategies: The first strategy is the optimization by the PartitionFinder
[S44]. Baker et al. [S31]’s genomic data were separated into 236 optimised partitions. However, this
genomic data could not be separated into three codon positions due to a computational problem. In this
framework, our 5 data sets were finally separated into 248 partitions [3 partitions (three codon positions)
for [S3], 4 partitions (three codon positions and intron) for [S32], 1 partition (intron) for [S42], 4
partitions (three codon positions and RNAs) for mitogenome, and 236 partitions for [S31]]. The GTR+Γ
model was used for the nucleotide substitutions. However, since this model could not be applied to Baker
et al. [S31]’s data because of a computational problem, we applied Tamura’s three parameter model [S61]
(without Γ) for Baker et al. [S31]’s data. The second strategy is simply separating Baker et al. [S31]’s
genomic data into three codon positions (three partitions) because the time estimation based on the
optimised 248 partitions is computationally too expensive. In this second framework, our sequence data
sets were simply separated into 15 partitions taking account of the taxon samplings of each data set, as
well as three codon positions, coding and non-coding, mitochondrial protein coding genes and nuclear
coding genes [15 partitions: 3 partitions (three codon positions) for [S3], 4 partitions (three codon
positions and intron) for [S32], 1 partition (intron) for [S42], 4 partitions (three codon positions and
RNAs) for mitogenome, and 3 partitions (three codon positions) for [S31]]. The results are shown in
Table S3. The divergence time estimates based on the partitions optimized by PartitionFinder
(248 partitions) are generally larger than, but close to, the estimates based on the simpler empirical
partitioning (15 partitions), and the differences between the two analyses are within the standard errors.
Although Nishihara et al. [S62] demonstrated the sensitivity of phylogenomic inference to the
evolutionary model, especially the partition model, differences in partitioning had only a limited effect on
divergence time estimations in the present study. The results are shown in DataBase Figure 6
(http://aepyornis.paleogenome.jp/). However, the 248-partition strategy imposed a huge computational burden,
and the calculation of the likelihood function did not work well. In addition, as mentioned above, the
248-partition strategy requires several compromises, such as the use of a homogeneous model (that is, without
the Γ model and without codon partitions). Therefore, we mainly used the results of the empirical 15-partition
strategy for this study.
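The partition totals of the two strategies follow directly from the per-dataset counts listed above. A minimal sketch of that bookkeeping is given below; the dictionary and function names are illustrative, and the counts are those stated in the text.

```python
# Partition counts per data set, as listed in the text.
SHARED = {"[S3]": 3, "[S32]": 4, "[S42]": 1, "mitogenome": 4}
OPTIMISED = {**SHARED, "[S31]": 236}  # PartitionFinder scheme for the data of [S31]
EMPIRICAL = {**SHARED, "[S31]": 3}    # three codon positions for the data of [S31]

def total_partitions(scheme):
    """Sum the number of partitions over all data sets in a scheme."""
    return sum(scheme.values())

print(total_partitions(OPTIMISED))  # 248
print(total_partitions(EMPIRICAL))  # 15
```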
The time tree of the Palaeognathae is shown in Figure 1. Our estimates are significantly younger
than those of Haddrath & Baker [S3], but slightly older than those of Mitchell et al. [S46]. The divergence
time between the Palaeognathae and the Neognathae was estimated to be about 110 mya. The emergence
of the crown Palaeognathae (the divergence time between the ostrich and the others) was about 80 mya.
Then, the rhea, the tinamou–moa clade, the cassowary–emu clade and the elephant bird–kiwi clade
successively diverged during 71-62 mya, which roughly coincides with the K-Pg boundary. As mentioned
before, the phylogenetic relationships among these four clades are not strongly supported (<80% BP), and
in particular, the position of the rhea remains controversial. This suggests rapid successive divergence
events among the paleognaths. The divergences between the elephant bird (Madagascar) and the kiwi
(New Zealand), and between the tinamou (South America) and the moa (New Zealand) occurred about 62
(58.9-64.7) mya and about 53 (50.2-56.1) mya, respectively. The divergence events among the taxa that
are currently distributed in different landmasses ended during this stage. The emergence times of the
current crown taxa are as follows: the elephant bird (about 35 (29.9-39.9) mya, Madagascar), the tinamou
(about 30 (26.6-33.0) mya, South America), the cassowary–emu (about 32 (27.4-35.3) mya, Australia–
New Guinea), the moa (about 13 (9.7-15.8) mya, New Zealand), the kiwi (about 12 (9.0-15.8) mya, New
Zealand) and the rhea (about 9 (6.9-12.1) mya, South America).
Evaluation of stability of divergence time estimations
To evaluate the stability of the divergence time estimations, the effects of taxon sampling and fossil
calibrations were investigated in detail. For the effect of taxon sampling, estimates based on the dataset of
all species (ratites, tinamous, neognaths, reptiles), the avian dataset (ratites, tinamous, neognaths), the
excl-tinamous dataset (ratites, neognaths, reptiles) and the excl-neognaths dataset (ratites, tinamous,
reptiles) were compared following Mitchell et al. [S46].
For the effect of fossil constraints, we focused on the divergence time between the Palaeognathae and
Neognathae. The first assumption had no constraints on the divergence time between the Palaeognathae
and Neognathae. The second assumption is that the divergence time between the Palaeognathae and
Neognathae was after the emergence of the first fossil record of Ichthyornis (86.5 mya [S4]). The third
assumption is that the divergence time between the Palaeognathae and Neognathae was after the
emergence of the first fossil record of Enaliornis (100.5 mya [S5, S9]).
In contrast to the findings of Mitchell et al. [S46], whose estimates were highly dependent on the
taxon sampling, the effect of taxon sampling was negligible when the sequence data of about 873 Kbp
were analysed as in this work (Table S3, Figure S2).
Our estimates are also stable against different assumptions of fossil constraints. It is generally
difficult to give the maximum ages (the older limit) of the nodes. As mentioned above, we used three
types of assumptions on the divergence of the Palaeognathae and the Neognathae; that is, (a) no maximum
limitation, (b) later than 86.5 mya based on the oldest fossil record of Ichthyornis, and (c) later than 100.5
mya based on the oldest fossil record of Enaliornis. Because the soft boundary method [S63, S64] was
applied in this study, if the molecular data strongly support a divergence time earlier than the maximum
constraints given by fossils such as Ichthyornis or Enaliornis, the posterior mean of the divergence time can be
earlier than these constraints (Figure 1B, Figure S2 and Table S3). No matter which constraints were
applied, the estimated divergence times were stable. Only when the out-group (reptiles) was
excluded (Aves data set) and the split between the Palaeognathae and the Neognathae was
constrained by the ages of Ichthyornis (86.5 mya) or of Enaliornis (100.5 mya) was the estimated time
significantly later than in the other cases. In particular, the divergence time between the
Palaeognathae and the Neognathae was strongly affected by this calibration. Probably, this is due to the
rooting problem as discussed in detail in the main text. However, the estimates (about 104 mya) were still
much earlier than the given constraints. It is also notable that, although most of the calibration points
were in the Neognathae clade or in the non-Avian outgroup in our major analyses, the excl-Neognathae
data, in which one calibration within the Palaeognathae and two calibrations within the reptiles were used,
yielded nearly the same divergence time estimates as the data including all taxa (Figure S2 and Table S3).
The Aves data (because the non-Avian outgroup was not used, the calibrations were limited to the
Aves) also showed very close estimated times.
The only fossil calibration commonly used in this taxon sampling sensitivity analysis is Emuarius
(25-35 mya) for the split of Casuarius – Dromaius. The divergence time of Casuarius – Dromaius without
this calibration was estimated to be about 32 mya (27.1-38.1 mya), and very close to this fossil calibration
(Table S3).
A recently developed method that jointly uses molecular and morphological data for tip-dating [e.g.,
S65] might provide a novel framework for divergence time estimation. The 92 morphological characters
selected from Worthy et al. [S57] were applied to the time estimation among the crown Aves. The results are
shown in Table S3. The first splits within the crown Aves (about 110 mya) are consistent with the
traditional calibration method. However, the divergence times within the Palaeognathae and the
Neognathae were generally estimated to be older than those of the traditional calibration method. The first
splits of the Palaeognathae and the Notopalaeognathae were estimated to be about 84.0 mya (about 79 mya
by the traditional internal node calibration method) and 76 mya (about 70 mya by the traditional internal
node calibration method), respectively, and the first splits of the Neognathae, the Galloanserae, and the
Neoaves were about 91 mya (about 87 mya by the traditional method), 75 mya (about 74 mya by the
traditional method), and 77 mya (about 72 mya by the traditional method), respectively.
In addition to the issues of taxon sampling and fossil calibrations, the stability and
sensitivity of the divergence time estimates with respect to the uncertainties of the tree topologies, the
saturation of the fast evolving genes, and the use of clock-like genes were further examined. Taking into
account the instability of the phylogenetic position of the rhea, we also estimated the divergence times
assuming alternative positions of the rhea (the rhea is closer to the elephant bird–kiwi–cassowary–emu
clade or the rhea is closer to the moa–tinamou clade). Alternative placements of the rhea had a very
limited effect on the time estimations (Table S3).
Since several fast evolving gene loci such as the mitochondrial genome or introns were included in
this study, saturation of these loci might have had a considerable effect on our time
estimation because of the long evolutionary time scale of the crown Aves (~110 mya) or Archosauria
(~300 mya). To address this issue, the effect of fast evolving loci on the time estimations
was evaluated by excluding such loci. Among the 15 simple empirical partitions in this study, the
mitochondrial 3rd codon positions are the fastest evolving sites. Exclusion of the mitochondrial 3rd codon
positions had only a limited effect on the time estimation (DataBase Figure 7). The 3rd codon positions of
the nuclear protein coding genes, the introns, and the entire mitochondrial genome were also successively
excluded, but these exclusions had no considerable effect (DataBase Figure 7). Finally, only the
1st and 2nd codon positions of the nuclear genes remained. Because there were no remarkable differences
among the estimates from these data sets, the effect of saturation, if any, in a limited number of gene loci
(e.g., mitochondrial 3rd codon positions) is negligible in this analysis.
The effects of the clock-like and non-clock-like genes were also evaluated by a similar method. The
standard deviations of the root-tip lengths within the crown Aves were estimated for each partition, and
they were normalized by dividing by the mean root-tip length of each partition. The non-coding regions
such as introns and RNAs tended to show more clock-like evolution than the protein coding
genes. The results from the top 5 clock-like partitions, the top 10 clock-like partitions, and all 15 partitions
were compared. The estimated divergence times were as follows (for the 15, 10, and 5 partitions,
respectively, in every case): the first splits within the crown Aves [110.0 (105.0-115.8) mya,
106.8 (101.5-112.7) mya, and 105.7 (98.6-113.7) mya], the first splits within the crown Palaeognathae
[79.6 (77.0-83.9) mya, 78.3 (75.5-81.4) mya, and 76.5 (70.2-79.8) mya], the first
splits within the Notopalaeognathae [70.6 (68.5-74.6) mya, 70.2 (67.8-72.9) mya, and 69.2 (64.0-71.4)
mya], the first splits within the crown Neognathae [90.2 (87.1-93.6) mya, 88.3 (85.1-91.9) mya,
and 87.8 (83.5-92.6) mya], the first splits within the Galloanseres [75.3 (73.3-77.6) mya, 74.9
(72.7-77.4) mya, and 74.1 (70.3-77.6) mya], and the first splits within the Neoaves [72.0 (69.9-74.4) mya,
71.4 (69.5-74.1) mya, and 69.7 (67.9-71.4) mya]. The exclusion of the partitions with large normalized
standard deviations resulted in slightly younger estimates, which did not differ very much from the others.
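The clock-likeness measure described above (the standard deviation of the root-tip lengths divided by their mean) can be sketched as follows. This is only an illustration under stated assumptions: the tree representation, the use of the sample standard deviation, the function names, and the two toy partitions are all hypothetical and are not taken from the original analysis.

```python
from statistics import mean, stdev

def root_tip_lengths(node, dist=0.0):
    """Yield root-to-tip path lengths; a node is (branch_length, [children]) and a tip has no children."""
    length, children = node
    dist += length
    if not children:
        yield dist
    for child in children:
        yield from root_tip_lengths(child, dist)

def normalized_sd(tree):
    """Standard deviation of root-tip lengths divided by their mean (smaller = more clock-like)."""
    tips = list(root_tip_lengths(tree))
    return stdev(tips) / mean(tips)

# Hypothetical three-tip trees for two partitions (branch lengths in substitutions per site).
PARTITIONS = {
    "partition_A": (0.0, [(0.1, [(0.2, []), (0.2, [])]), (0.3, [])]),  # tips equidistant from the root
    "partition_B": (0.0, [(0.1, [(0.1, []), (0.5, [])]), (0.3, [])]),  # unequal root-tip lengths
}

# Rank partitions from most to least clock-like.
ranked = sorted(PARTITIONS, key=lambda name: normalized_sd(PARTITIONS[name]))
print(ranked)  # ['partition_A', 'partition_B']
```

With 15 partitions ranked in this way, the first 5 or 10 entries of the ranked list would correspond to the "top 5" and "top 10" clock-like subsets compared in the text.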
The mitochondrial mutation rate and flightlessness
Because mitochondria play an important role in aerobic metabolism, it is expected that
the mitochondrial evolution is highly correlated with the evolution of flight capability, which requires a
high metabolic rate. In this study, we focused on the mitochondrial synonymous substitution rate. This
rate is likely to be correlated with the flight capability, because it has a negative correlation with body size
[S66] and it is plausible that it also has a strong correlation with the amount of free radicals generated by
the high metabolism associated with volant behaviour.
Stability of divergence time estimates is essential for the reliable estimation of
molecular substitution rates. In this respect, our stable time estimation (Figure 1B) has a crucial advantage.
The substitution rates of the third codon positions estimated by the MCMCTREE program with the
GTR+Γ model were used as approximations of the synonymous substitution rates (Table S5). The
substitution rates of volant birds such as songbirds, hummingbirds and tinamous (harmonic mean of the
substitution rates in the terminal branches of volant birds, 4.64%/site/million years) are significantly
higher than those of flightless birds such as ratites and penguins (harmonic mean of the substitution rates
in the terminal branches of flightless birds: 1.22%/site/million years; t test, P = 1.21×10⁻⁵) (Figure 2B). The
substitution rate in the branch of the common ancestral crown Aves was estimated to be
2.75%/site/million years, and the substitution rates of the ancestral branches of paleognaths (the branches
coloured in red seen in Figure 1A, Figure S1) were estimated to be 3.71–5.03%/site/million years (Table
S5). The harmonic mean was 4.18%/site/million years, which is significantly higher than the substitution
rates for flightless birds such as ratites and penguins (t test, P = 5.9×10⁻⁸), but not significantly different
from those for the volant birds (P = 0.19; Figure 2B). Because it is generally more difficult to reconstruct
the ancestral character states (volant or flightless) of the internal branches than those of the terminal
branches, only the substitution rates of the terminal branches were analysed in this study. However, if only
one species represents an old lineage, such as the extant ostrich representing Struthionidae, then even for the
terminal branch, the substitution rate along the branch is actually an average of the ancestral volant form
(with a higher substitution rate) and the descendant flightless form (with a lower substitution rate). Therefore,
compared with other terminal branches of flightless birds whose sister-groups are also flightless, it is
likely that the branches of the ostrich and the penguin (only one mitogenome was involved in this
analysis) have higher substitution rates. Indeed, the substitution rates of these branches are high among the
flightless birds (Table S5 and DataBase Figure S2). For this reason, the mean substitution rate of the
flightless birds might be overestimated. To examine this possibility, we excluded the branches of the
ostrich and penguin. However, there was no considerable effect (harmonic mean of the substitution rates
in the terminal branches of flightless birds excluding ostrich and penguin: 1.14%/site/million years; t test
(flightless birds vs. volant birds), P = 1.99×10⁻⁵).
A scatter diagram between body weight and substitution rate in the mitochondrial third
codon positions along the terminal branches is shown in Figure 2C and Database Figure 2
(http://aepyornis.paleogenome.jp/). There is a negative correlation between substitution rate and body
weight. Although the coefficient of determination is not high (R² = 0.364), the negative correlation is
significant (PP > 0.99). Since the mitochondrial substitution rate of the ancestral paleognaths was estimated
to be 3.71–5.03%/site/million years as mentioned above, their body masses can be roughly inferred from
this approximate prediction. They were estimated to be 1,500–2,800 g.
Among the volant birds, substitution rates of songbirds (9.69%/site/million years), parakeets
(6.49%/site/million years) and hummingbirds (7.22%/site/million years) are especially high, probably
because of their small body size and high metabolic rate associated with powerful flight behaviour.
Although the reason is unclear, the substitution rate of the tinamou is extraordinarily high (harmonic
mean, 13.15%/site/million years). If we exclude the tinamou, the correlation between body weight and
substitution rate becomes stronger (R² = 0.59), and the difference between the ancestral paleognaths and
the extant volant birds becomes smaller (P = 0.65). The body weight of the ancestral paleognaths as
inferred from this approximate prediction is 500–1,700 g.
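The harmonic mean used to summarize substitution rates in this section can be illustrated with the four group rates quoted above. This sketch does not attempt to reproduce the value of 4.64%/site/million years reported for all volant terminal branches, which requires the full set of rates in Table S5; it only shows how the summary is computed and how much a single extreme value (the tinamou) shifts it. The variable names are illustrative.

```python
from statistics import harmonic_mean, mean

# Rates (%/site/million years) quoted in the text for four volant groups.
RATES = {"songbirds": 9.69, "parakeets": 6.49, "hummingbirds": 7.22, "tinamou": 13.15}

all_rates = list(RATES.values())
without_tinamou = [rate for group, rate in RATES.items() if group != "tinamou"]

hm_all = harmonic_mean(all_rates)
hm_without_tinamou = harmonic_mean(without_tinamou)

print(round(hm_all, 2))              # harmonic mean of the four quoted rates
print(round(mean(all_rates), 2))     # arithmetic mean, for comparison
print(round(hm_without_tinamou, 2))  # harmonic mean after excluding the tinamou
```

The harmonic mean never exceeds the arithmetic mean for positive values, so it is less dominated by an unusually high rate such as that of the tinamou.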
Ancestral state reconstruction of body weight
Body weight data for extant species were collected from the literature (Table S5 [S13-S25]).
Because genomic sequences include abundant information on the rates and the patterns of molecular
evolution, the precision of the ancestral state reconstruction of phenotypic traits could be significantly
improved, especially when there is a strong correlation between the values of the phenotypic traits and the
rate of molecular evolution [S67]. Lartillot & Delsuc [S67] reported a negative correlation between the
synonymous substitution rate and the body weight. This is probably an indirect correlation mediated by
generation intervals. Larger animals generally have later female sexual maturation, which results in a
longer generation interval than for smaller animals. Lartillot & Delsuc [S67] estimated the body weights
of the ancestral Eutherian mammals by using their method taking account of the correlation of the
synonymous substitution rates and the body weights (correlation model), and compared these estimates
with the traditional method without taking into account the correlations (uncorrelation model). The body
weights of the ancestral Eutherian mammals estimated by either method were generally very close, but
there were remarkable differences in Cetartiodactyla, especially in Whippomorpha (hippos + cetaceans). The
estimate by the correlation model was 6.0–376.1 kg (geometric mean, 47.5 kg), and the estimate by the
uncorrelation model was 32.3–2086.1 kg (geometric mean, 259.6 kg). Lartillot & Delsuc [S67] argued
that the early cetacean species probably had small body sizes, such as Himalayacetus and Pakicetus (wolf
size) or Ichthyolestes (fox size) [S68]. The extant hippopotamus and the extant cetaceans seem to have
gained large body size convergently through independent aquatic adaptations. In such a case, if the
ancestral body weight is estimated only from the body weights of the extant species, it may give an
overestimation of the ancestral body weight.
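The geometric means quoted above agree with the geometric mean of the two limits of each range, which can be checked with a short calculation. This is a minimal sketch, assuming that the quoted ranges are the lower and upper interval limits; the function name is illustrative.

```python
from math import sqrt

def interval_geometric_mean(lower, upper):
    """Geometric mean of the two limits of an interval (equivalent to averaging on the log scale)."""
    return sqrt(lower * upper)

# Whippomorpha body weight ranges (kg) quoted in the text.
correlation_model = interval_geometric_mean(6.0, 376.1)
uncorrelation_model = interval_geometric_mean(32.3, 2086.1)

print(round(correlation_model, 1))    # 47.5
print(round(uncorrelation_model, 1))  # 259.6
```

The two point estimates differ by a factor of about five, which is the discrepancy discussed in the text.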
To take account of the correlation between body weight and the synonymous substitution rate in
avian evolution, we applied the Ancov program of Coevol [S69]. We adopted mitochondrial genomes as a
material to support the ancestral state reconstruction of body weight. However, the current version of
Coevol does not incorporate variable evolutionary rates among sites (e.g., the gamma model), and this
may have a crucial effect on phylogenetic inference [S70]. To take account of the possible among-sites
variability of the rate of synonymous substitutions, we adopted the following two-stage procedure. First,
using the above MCMC sample of the divergence times as the constraints on the divergence times of the
internal nodes, we applied MCMCTREE [S39] to the data of the third codon positions of the
mitochondrial genomes with the discrete gamma model of variable rates among sites. Second, we treated the
estimated evolutionary rates at the internal and the external nodes as measured values of a phenotypic trait
to reconstruct the ancestral state of body weight. The synonymous substitution rate and body weight were
negatively correlated (Figure 2C); the posterior mean of the correlation was −0.848 and the posterior
probability of negative correlation was 0.99997. The model assumed by default that the log-transformed
values follow a Brownian process. Accordingly, we calculated the geometric mean of the upper and
lower limits of the 95% credibility intervals to represent the ancestral values of body weight. The
estimated substitution rates of the mitochondrial 3rd codon positions, the body weights of the Aves (and
the non-Avian outgroups) reported by the previous studies, and the estimated body weights of the
ancestral Aves are summarized in Table S5.
As shown in Figure 4 and Table S5, the body weight of the ancestral paleognaths
was estimated to be about 5,000 g (3,800–5,500 g). Although this estimate is relatively large, it is still
within the range of the volant birds [S15]. It is also notable that these estimates are close to the estimated
size of the fossil Lithornithidae. According to Mayr [S71], “Lithornithid species significantly vary in size,
with the turkey-sized P. howardae being about twice as large as P. cercanaxius” (the body weight of the
turkey is about 6,000 g). When the correlation between body weight and substitution rate was not taken
into account, the estimates of body weight became grossly larger (data not shown). This is probably
because the conventional substitution model does not take account of convergence [S50]. Although
circumstantial, this work provides the first reconstruction of the ancestral state of the paleognaths.
However, we feel that there is room for improvement in the methodologies used in the present
study. Because the mitochondrial substitution rate of birds, especially the synonymous rate, is very high, it
is sometimes difficult to estimate an accurate number of substitutions along a branch, particularly for a
deep ancestral branch. It is plausible that the substitution rates inferred in this study (Figure 2B, Table S5)
are still underestimates. If this is the case, the ancestral body weights may be overestimated. For example,
although the body weight of the ancestral crown Aves was estimated to be 3,500 g, it is larger than the
body weights of Mesozoic birds estimated from fossil evidence (32–1,779 g) [S72]. Furthermore, to test
the tendency of estimated body weight, the weights of several extant species were assumed as missing
data. The estimated mean body weight of extant species using this test was generally higher than the actual
body weight, although it was still within the 95% confidence intervals (data not shown). Improvements in the
substitution model are important to resolve this issue.
Species identification of eggshells and ancestral state reconstruction of egg weight
One of the remarkable features of Aepyornis maximus is that it laid the largest eggs
known among animals, including dinosaurs; these eggs often exceeded 9 kg. A huge number of
eggshell fragments has been excavated from the west to the southwest coast of Madagascar, and the dune
of Faux Cap is known as the site with the most abundant eggshell deposits. However, somewhat
surprisingly there is no direct evidence that these huge eggs were laid by Aepyornis, because there are no
reports of sympatric excavation sites of Aepyornis bones and these eggshell fragments (but see also the
embryological study by Balanoff & Rowe [S73]). Oskam et al. [S74] first reported DNA sequences from
eggshells identified as Aepyornis and Mullerornis based on their thickness. However, there were no
sequence data of homologous nucleotide sites from Aepyornis bones available at that time. Although
Mitchell et al. [S46] reported nearly complete mitochondrial genome sequences of these two genera, they
did not focus on the species identification of the eggs. In the present study, we confirmed that the eggs
putatively identified as Aepyornis and Mullerornis are from these genera (Table S2). The nucleotide
sequence data (GU799601: 82 bp) from thick egg shells from Meanderare Estuary (25°09'S, 46°26'E),
Toriala, Madagascar were exactly identical with the mitochondrial DNA data of Aepyornis maximus
sequenced in this study, but one base substitution was observed from the comparison with
Aepyornis hildebrandti reported by Mitchell et al. [S46]. Based on the findings presented above, here we
discuss the evolution of reproductive strategy in the ratites.
There is a strong correlation between body weight (log-scale) and the egg size in the Aves,
especially in the Palaeognathae (Figure 2A, DataBase Figure S1). It is noteworthy that the eggs of
Aepyornis and the kiwi are extraordinarily large compared with their body sizes. The correlation between
body and egg size is very high among the extant Palaeognathae (R² = 0.873). Although the egg size of
Aepyornis lies within the 95% CI for the extant Palaeognathae, it is noteworthy that Aepyornis laid huge
eggs not only in terms of absolute size, but also in terms of size relative to
body mass. If the kiwis are excluded, the correlation becomes higher (R² = 0.983), and these two taxa are
clearly outside this range (Figure 2A, DataBase Figure 1: http://aepyornis.paleogenome.jp/). Endo et al.
[S75] indicated an anatomical similarity between the coxa of Aepyornis and that of the kiwi, suggesting similar
reproductive strategies. Molecular phylogenetic studies (Mitchell et al. [S46] and the present study)
strongly support the sister group relationship of these two taxa. Does this mean that the similarity in the
enormous relative size of the egg is a synapomorphy among these taxa? To address this issue, absolute
ancestral egg sizes were estimated using the Ancov program, taking into account the correlation with
molecular evolutionary rate. The rate of synonymous substitutions was negatively correlated with egg
weight; the posterior mean of the correlation was −0.703 and the posterior probability of negative
correlation was 0.9964. Our result suggests that the common ancestor of Aepyornis and the kiwi laid larger
eggs relative to body size compared with the other extant paleognaths. Although the common ancestral
branch between Aepyornis and the kiwi is short, this suggests that the gigantism of the egg had already
started to evolve during this stage. The huge egg is one of the typical characters seen in the K strategists
that are adaptive in a stable environment. It should be noted that the extinct insular family Dinornithidae
also showed a tendency to egg gigantism (Figure 2A and DataBase Figure 1:
http://aepyornis.paleogenome.jp/). The K strategists might have advantages on remote isolated islands with
fewer migrants and small carrying capacities, such as Madagascar and New Zealand. Although fossil
records of the elephant bird and the kiwi are limited and no fossils have been reported in the Paleogene
and Neogene, except for a recently described fossil of a kiwi from the Early Miocene [S76], these
ancestral state reconstructions suggest evolutionary traits in terms of morphology, ethology and ecology.
The ancestral geographic distributions of Palaeognathae
The ancestral geographic distribution of the Palaeognathae was inferred by the Bayesian method
using BayesTraits [S77]. The time-calibrated tree shown in Figure 1A was used in this analysis.
In addition, the maximum parsimony method was applied using Mesquite [S52]. The
fossil paleognaths, Lithornis and Palaeotis, were also included. Because the number of morphologic
characters used to estimate the phylogenetic positions of these two taxa was small, the divergence times of
these taxa from the other paleognaths were not estimated, but it was assumed that (a) Lithornis branched
out from the other paleognaths at the midpoint of the common ancestral branch of the extant paleognaths,
and (b) the relationships among the ostrich, Palaeotis, and the other extant paleognaths
(Notopalaeognathae) were represented by a trifurcation. The geological ages of Lithornis and Palaeotis were
assumed to be 61 mya [S56, S78] and 48 mya [S79], respectively. The geographic distribution of Lithornis
and Palaeotis was in the Northern Hemisphere, while that of the Notopalaeognathae is in the Southern
Hemisphere. We assumed the ostrich to be distributed in both the Northern and the Southern Hemispheres
because, although the extant ostriches are found only in Africa, they were also distributed in Eurasia until
recently [S80, S81] as mentioned in the main text.
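BayesTraits models discrete characters with a continuous-time Markov process [S77]. As a minimal sketch of that idea for the two-state hemisphere coding described above, and not the configuration actually used in this study, the closed-form transition probabilities along a single branch can be written as follows. The function name, the rate values, and the branch length are hypothetical and chosen only for illustration.

```python
import math


def two_state_transition_probs(q_ns: float, q_sn: float, t: float):
    """Transition matrix P(t) of a two-state continuous-time Markov chain.

    States: index 0 = Northern Hemisphere, index 1 = Southern Hemisphere.
    q_ns is the instantaneous rate North -> South, q_sn the rate South -> North,
    and t is the branch length in the same time unit as the rates.
    """
    total = q_ns + q_sn
    if total == 0.0:
        # No change is possible when both rates are zero.
        return [[1.0, 0.0], [0.0, 1.0]]
    decay = math.exp(-total * t)
    p_ns = (q_ns / total) * (1.0 - decay)  # P(end in South | start in North)
    p_sn = (q_sn / total) * (1.0 - decay)  # P(end in North | start in South)
    return [[1.0 - p_ns, p_ns], [p_sn, 1.0 - p_sn]]


if __name__ == "__main__":
    # Hypothetical rates (per million years) and branch length (million years).
    for row in two_state_transition_probs(q_ns=0.02, q_sn=0.01, t=10.0):
        print(row)
```

Longer branches push each row toward the stationary frequencies, which is why a short ancestral branch leaves little opportunity for a change of hemisphere under such a model.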
The inferred ancestral geographic distribution of the Palaeognathae is shown in
Figure 3D (Bayesian method) and Figure S3A (parsimony method). Our results indicated that the common
ancestor of all paleognaths including fossil species was distributed in the Northern Hemisphere, with the
highest posterior probability (100% PP). The distribution of the common ancestor of the ostrich, Palaeotis
and the Notopalaeognathae was also inferred to be in the Northern Hemisphere with the highest posterior
probability (100% PP). The common ancestor of the Notopalaeognathae was inferred to be in the Southern
Hemisphere with a high posterior probability (97.9% PP). The geographic distribution reconstructed by
the maximum parsimony method is also consistent with this result (Figure S3A). The Bayesian method
takes all possible ancestral states into account. Therefore, a Northern Hemisphere
distribution at several ancestral nodes within the Notopalaeognathae (e.g., the common
ancestor of the Apterygidae and the Aepyornithidae) was not completely excluded. In contrast, the
ancestral reconstructions of the maximum parsimony method are more clear-cut (Figure S3A). A Northern
Hemisphere origin of the Palaeognathae and a single migration event to the Southern Hemisphere by the
common ancestor of the Notopalaeognathae were strongly supported by both analyses.
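To illustrate how a parsimony reconstruction behaves on the backbone assumed above, the following is a minimal sketch of a Fitch-style down-pass extended to polytomies (Hartigan's counting rule). It is not the Mesquite implementation. Collapsing the Notopalaeognathae into a single Southern Hemisphere tip and treating the ostrich's two-hemisphere coding as "either state" are simplifying assumptions of this sketch, and all names in the code are hypothetical.

```python
def hartigan_downpass(node, tip_states):
    """Return (state_set, cost) for `node` using a parsimony down-pass.

    `node` is either a tip name (str) or a tuple of child nodes, so
    polytomies are simply tuples with more than two children.
    For an internal node, the retained states are those present in the
    largest number of child state sets; each child lacking such a state
    adds one change to the cost.
    """
    if isinstance(node, str):
        return set(tip_states[node]), 0
    child_results = [hartigan_downpass(child, tip_states) for child in node]
    cost = sum(child_cost for _, child_cost in child_results)
    counts = {}
    for states, _ in child_results:
        for state in states:
            counts[state] = counts.get(state, 0) + 1
    best = max(counts.values())
    node_states = {state for state, n in counts.items() if n == best}
    cost += len(child_results) - best
    return node_states, cost


# Backbone assumed in the text: Lithornis outside a trifurcation of
# the ostrich, Palaeotis, and the Notopalaeognathae.
TREE = ("Lithornis", ("ostrich", "Palaeotis", "Notopalaeognathae"))
TIP_STATES = {
    "Lithornis": {"N"},
    "Palaeotis": {"N"},
    "ostrich": {"N", "S"},          # coded for both hemispheres
    "Notopalaeognathae": {"S"},     # collapsed to one tip (assumption)
}

if __name__ == "__main__":
    root_states, changes = hartigan_downpass(TREE, TIP_STATES)
    print(root_states, changes)
```

On this simplified backbone the root state set contains only the Northern Hemisphere and a single change is required, in line with the single southward migration described above. The down-pass sets at internal nodes are preliminary; a full reconstruction would add an up-pass.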
Taking into account the uncertainty of the phylogenetic position of Palaeotis, the ancestral
geographic distribution of the root of all paleognaths was also estimated based on seven possible
topologies (there are seven possible alternative positions of Palaeotis; Figure 3A). Equal prior
probabilities among these seven tree topologies were assumed in this analysis. Again, the Northern
Hemisphere origin of the paleognaths was strongly supported with a high Bayesian posterior probability
(100% PP; data not shown).
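One simple way to combine root-state estimates across alternative topologies is to weight the per-topology posteriors by the topology probabilities. The sketch below assumes that the equal prior weights are used directly as the averaging weights, which is a simplification: a full Bayesian treatment would weight each topology by its posterior probability. The function name and the numerical values are hypothetical placeholders, not results from this study.

```python
def average_over_topologies(per_topology_posteriors, weights=None):
    """Weighted average of state probabilities across tree topologies.

    `per_topology_posteriors` is a list of dicts mapping a state label to
    its probability at the node of interest under one topology.
    If `weights` is None, all topologies receive equal weight.
    """
    n = len(per_topology_posteriors)
    if weights is None:
        weights = [1.0 / n] * n
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    averaged = {}
    for weight, posterior in zip(weights, per_topology_posteriors):
        for state, prob in posterior.items():
            averaged[state] = averaged.get(state, 0.0) + weight * prob
    return averaged


if __name__ == "__main__":
    # Hypothetical placeholder values for seven alternative topologies.
    placeholders = [{"N": 0.9, "S": 0.1}] * 4 + [{"N": 0.8, "S": 0.2}] * 3
    print(average_over_topologies(placeholders))
```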
To draw a more detailed picture of how the current geographic
distribution of the paleognaths was established, the seven zoogeographic regions (Nearctic, Palearctic, Afrotropical, South
America, Australia, Madagascan, and Zealandia) used by Claramunt and Cracraft [S1] were further
applied. The Indomalayan region was not considered because no extant or fossil paleognaths
are recorded there. The Northern Hemispheric origin of the paleognaths was again supported (Figure S3B).
Although the Palearctic region rather than the Nearctic region was supported as the origin of the paleognaths, this
may be an analytical artefact because Nearctic species such as Paracathartes and Pseudocrypturus were
not included in this analysis. Therefore, the origin of the paleognaths may be either the Palearctic or the Nearctic
region. In either case, the ancestor of the notopaleognaths migrated to South America via North America, and
its descendants then spread to the other zoogeographic regions, such as Australia, Madagascan, and Zealandia. Since the
number of species is small in the paleognaths including fossil species (only 24 species were included in
this analysis), it is plausible that the data do not have sufficient power to resolve the order of migrations
among the seven zoogeographic regions.
Because, as discussed in the main text, the timing of their intercontinental migrations is limited to the Late
Cretaceous to Paleocene, the seven zoogeographic regions were merged into five zoogeographic
regions, taking account of the continental positions at that time, namely (1) North Hemisphere
(Nearctic+Palearctic), (2) Afrotropical, (3) South America+Australia+Antarctica, (4) Madagascan, and (5)
Zealandia. The ancestral geographic distributions were estimated based on these five zoogeographic
regions. The results are shown in Figure S3. Both the MP analyses (Figure S3C) and the Bayesian
analyses (Figure S3D) supported the following: the Palaeognathae originated in the North Hemisphere;
the first split of the crown Palaeognathae also occurred in the North Hemisphere; the Notopalaeognathae
originated in South America+Australia+Antarctica; and the Dinornithidae migrated to Zealandia from
South America+Australia+Antarctica. The geographic distribution of the common ancestor of Apterygidae
(Zealandia) and Aepyornithidae (Madagascan) is unclear.
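The merging of the seven regions into five can be expressed as a simple lookup. The sketch below only restates the grouping given in the text; the dictionary and function names are hypothetical, and the example coding of the ostrich as Afrotropical plus Palearctic is an illustrative assumption based on its African and former Eurasian range.

```python
# Mapping from the seven zoogeographic regions to the five merged regions.
# Antarctica belongs to merged region (3) but is not one of the seven inputs.
SEVEN_TO_FIVE = {
    "Nearctic": "North Hemisphere",
    "Palearctic": "North Hemisphere",
    "Afrotropical": "Afrotropical",
    "South America": "South America+Australia+Antarctica",
    "Australia": "South America+Australia+Antarctica",
    "Madagascan": "Madagascan",
    "Zealandia": "Zealandia",
}


def merge_regions(regions):
    """Recode a taxon's seven-region distribution into the five merged regions."""
    return sorted({SEVEN_TO_FIVE[region] for region in regions})


if __name__ == "__main__":
    print(merge_regions(["Zealandia"]))                    # e.g. Apterygidae
    print(merge_regions(["Madagascan"]))                   # e.g. Aepyornithidae
    print(merge_regions(["Afrotropical", "Palearctic"]))   # ostrich (assumed coding)
```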
REFERENCES:
S1. Claramunt, S., and Cracraft, J. (2015). A new time tree reveals Earth history’s imprint on the
evolution of modern birds. Sci. Adv. 1, e1501005. doi: 10.1126/sciadv.1501005
S2. Prum, R.O., Berv, J.S., Dornburg, A., Field, D.J., Townsend, J.P., Lemmon, E.M., and Lemmon,
A.R. (2015). A comprehensive phylogeny of birds (Aves) using targeted next-generation DNA
sequencing. Nature 526, 569-573.
S3. Haddrath, O., and Baker, A.J. (2012). Multiple nuclear genes and retroposons support vicariance
and dispersal of the palaeognaths, and an Early Cretaceous origin of modern birds. Proc. Biol.
Sci. 279, 4617-4625.
S4. Benton, M.J., Donoghue, P.C.J., and Asher, R.J. (2009). Calibrating and constraining molecular
clocks. In The Timetree of Life, (Oxford: Oxford University Press).
S5. Bell, A., and Chiappe, L.M. (2015). A species-level phylogeny of the Cretaceous
Hesperornithiformes (Aves: Ornithuromorpha): Implications for body size evolution amongst the
earliest diving birds. J. Syst. Palaeontol. 14, 239-251.
S6. Boles, W.E. (1992). Revision of Dromaius gidju Patterson and Rich 1987 from Riversleigh,
northwestern Queensland, Australia, with a reassessment of its generic position. Nat. Hist. Mus.
Los Angeles Cty. Sci. Ser. 36, 195-208.
S7. Clarke, J.A., Tambussi, C.P., Noriega, J.I., Erickson, G.M., and Ketcham, R.A. (2005). Definitive
fossil evidence for the extant avian radiation in the Cretaceous. Nature 433, 305-308.
S8. Muller, J., and Reisz, R.R. (2005). Four well-constrained calibration points from the vertebrate
fossil record for molecular clock estimates. Bioessays 27, 1069-1075.
S9. O'Connor, J.K., and Zhou, Z. (2013). A redescription of Chaoyangia beishanensis (Aves) and a comprehensive
phylogeny of Mesozoic birds. J. Syst. Palaeontol. 11, 889–906.
S10. Parris, D.C., and Hope, S. (2002). New interpretations of birds from the Hornerstown and
Navesink formations, New Jersey. In Proceedings of the 5th International Meeting of the Society
of Avian Paleontology and Evolution. (Beijing: Science Press).
S11. Rasmussen, D.T., Olson, S.L., and Simmons, E.L. (1987). Fossil birds from the Oligocene Jebel
Qatrani Formation, Fayum Province, Egypt. Smithsonian Contrib. Paleobiol. 62, 1-20.
S12. Slack, K.E., Jones, C.M., Ando, T., Harrison, G.L., Fordyce, R.E., Arnason, U., and Penny, D.
(2006). Early penguin fossils, plus mitochondrial genomes, calibrate avian evolution. Mol. Biol.
Evol. 23, 1144-1155.
S13. Davies, S.J.J.F. (2003). Ostriches. Grzimek's Animal Life Encyclopedia 8, (Farmington Hills:
Gale Group).
S14. Dickison, M.R. (2007). The Allometry of Giant Flightless Birds. PhD thesis, (Durham: Duke
University).
S15. Dunning, J.B.J. (2008). CRC Handbook of Avian Body Masses, Second Edition, (Boca Raton:
CRC Press).
S16. Hauber, M.E. (2014). The book of eggs. A life size guide to the eggs of six hundred of the
world’s bird species, (Lewes: The Ivy Press).
S17. Higgins, P.J., Peter, J.M., and Steele, W.K. (2001). Handbook of Australian, New Zealand &
Antarctic birds. Volume 5: Tyrant-flycatchers to Chats, (Melbourne: Oxford University Press).
S18. Huynen, L., Gill, B.J., Millar, C.D., and Lambert, D.M. (2010). Ancient DNA reveals extreme
egg morphology and nesting behavior in New Zealand's extinct moa. Proc. Natl. Acad. Sci. USA
107, 16201-16206.
S19. Johnsgard, P.A. (1981). The plovers, sandpipers, and snipes of the world, (Lincoln: University of
Nebraska Press).
S20. Johnsgard, P.A. (1999). The pheasants of the world, (Oxford).
S21. Kear, J. (2005). Ducks, Geese and Swans: Volume 1: general chapters, species accounts (Anhima
to Salvadorina), (Oxford: Oxford University Press).
S22. Williams, T.D. (1995). The penguins: Spheniscidae, (Oxford: Oxford University Press).
S23. Yoshimura, T., and Suzuki, M. (2014). Tori to Tamago to Su no Daizukan, (Tokyo: Bookman
press).
S24. Brooke, M. (2004). Albatrosses and Petrels across the world, (Oxford: Oxford University Press).
S25. Cramp, S., Perrins, C.M., and Brooks, D.J. (1977). Handbook of the birds of Europe the Middle
East and North Africa: the birds of the western Palearctic. volume I: Ostrich to Ducks, (Oxford:
Oxford University Press).
S26. Campos, P.F., Willerslev, E., Sher, A., Orlando, L., Axelsson, E., Tikhonov, A., Aaris-Sørensen,
K., Greenwood, A.D., Kahlke, R.D., Kosintsev, P., et al. (2010). Ancient DNA analyses exclude
humans as the driving force behind late Pleistocene musk ox (Ovibos moschatus) population
dynamics. Proc. Natl. Acad. Sci. USA 107, 5675-5680.
S27. Kampmann, M.L., Fordyce, S.L., Avila-Arcos, M.C., Rasmussen, M., Willerslev, E., Nielsen,
L.P., and Gilbert, M.T. (2011). A simple method for the parallel deep sequencing of full
influenza A genomes. J. Virol. Methods 178, 243-248.
S28. Orlando, L., Ginolhac, A., Zhang, G., Froese, D., Albrechtsen, A., Stiller, M., Schubert, M.,
Cappellini, E., Petersen, B., Moltke, I., et al. (2013). Recalibrating Equus evolution using the
genome sequence of an early Middle Pleistocene horse. Nature 499, 74-78.
S29. Langmead, B., and Salzberg, S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nat.
Methods 9, 357-359.
S30. Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads.
EMBnet.journal 17, 10–12.
S31. Baker, A.J., Haddrath, O., McPherson, J.D., and Cloutier, A. (2014). Genomic support for a moa-
tinamou clade and adaptive morphological convergence in flightless ratites. Mol. Biol. Evol. 31,
1686-1696.
S32. Harshman, J., Braun, E.L., Braun, M.J., Huddleston, C.J., Bowie, R.C., Chojnowski, J.L.,
Hackett, S.J., Han, K.L., Kimball, R.T., Marks, B.D., et al. (2008). Phylogenomic evidence for
multiple losses of flight in ratite birds. Proc. Natl. Acad. Sci. USA 105, 13462-13467.
S33. Ginolhac, A., Rasmussen, M., Gilbert, M.T., Willerslev, E., and Orlando, L. (2011).
mapDamage: testing for damage patterns in ancient DNA sequences. Bioinformatics 27, 2153-
2155.
S34. Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler
transform. Bioinformatics 25, 1754-1760.
S35. Lindahl, T. (1993). Instability and decay of the primary structure of DNA. Nature 362, 709-715.
S36. Chevreux, B., Pfisterer, T., Drescher, B., Driesel, A.J., Muller, W.E., Wetter, T., and Suhai, S.
(2004). Using the miraEST assembler for reliable and automated mRNA transcript assembly and
SNP detection in sequenced ESTs. Genome Res. 14, 1147-1159.
S37. Hahn, C., Bachmann, L., and Chevreux, B. (2013). Reconstructing mitochondrial genomes
directly from genomic next-generation sequencing reads--a baiting and iterative mapping
approach. Nucleic Acids Res. 41, e129.
S38. Katoh, K., and Standley, D.M. (2013). MAFFT multiple sequence alignment software version 7:
improvements in performance and usability. Mol. Biol. Evol. 30, 772-780.
S39. Yang, Z. (2007). PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24,
1586-1591.
S40. Huang, X., and Madan, A. (1999). CAP3: A DNA sequence assembly program. Genome Res. 9,
868-877.
S41. Okonechnikov, K., Golosova, O., Fursov, M., and team, U. (2012). Unipro UGENE: a unified
bioinformatics toolkit. Bioinformatics 28, 1166-1167.
S42. Smith, J.V., Braun, E.L., and Kimball, R.T. (2013). Ratite nonmonophyly: independent evidence
from 40 novel Loci. Syst. Biol. 62, 35-49.
S43. Edgar, R.C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high
throughput. Nucleic Acids Res. 32, 1792-1797.
S44. Lanfear, R., Calcott, B., Ho, S.Y., and Guindon, S. (2012). Partitionfinder: combined selection of
partitioning schemes and substitution models for phylogenetic analyses. Mol. Biol. Evol. 29,
1695-1701.
S45. Stamatakis, A., Hoover, P., and Rougemont, J. (2008). A rapid bootstrap algorithm for the
RAxML Web servers. Syst. Biol. 57, 758-771.
S46. Mitchell, K.J., Llamas, B., Soubrier, J., Rawlence, N.J., Worthy, T.H., Wood, J., Lee, M.S., and
Cooper, A. (2014). Ancient DNA reveals elephant birds and kiwi are sister taxa and clarifies
ratite bird evolution. Science 344, 898-900.
S47. Mayr, G. (2014). The middle Eocene European “ratite” Palaeotis (Aves, Palaeognathae) restudied
once more. Paläontol. Z., doi: 10.1007/s12542-014-0248-y.
S48. Springer, M.S., Meredith, R.W., Eizirik, E., Teeling, E., and Murphy, W.J. (2008). Morphology
and placental mammal phylogeny. Syst. Biol. 57, 499-503.
S49. Yonezawa, T., and Hasegawa, M. (2010). Was the universal common ancestry proved? Nature
468, E9; discussion E10.
S50. Yonezawa, T., and Hasegawa, M. (2012). Some problems in proving the existence of the
universal common ancestor of life on Earth. ScientificWorldJournal 2012, 479824.
S51. O'Leary, M.A., Bloch, J.I., Flynn, J.J., Gaudin, T.J., Giallombardo, A., Giannini, N.P., Goldberg,
S.L., Kraatz, B.P., Luo, Z.X., Meng, J., et al. (2013). The placental mammal ancestor and the
post-K-Pg radiation of placentals. Science 339, 662-667.
S52. Maddison, W.P., and Maddison, D.R. (2015). MESQUITE: a modular system for evolutionary
analysis. Volume Version 3.03.
S53. Johnston, P. (2011). New morphological evidence supports congruent phylogenies and
Gondwana vicariance for palaeognathous birds. Zool. J. Linn. Soc. 163, 959–982.
S54. Ronquist, F., Teslenko, M., van der Mark, P., Ayres, D.L., Darling, A., Höhna, S., Larget, B.,
Liu, L., Suchard, M.A., and Huelsenbeck, J.P. (2012). MrBayes 3.2: efficient Bayesian phylogenetic
inference and model choice across a large model space. Syst. Biol. 61, 539-542. doi:
10.1093/sysbio/sys029.
S55. Houde, P. (1986). Ancestors of ostriches found in the Northern Hemisphere suggest a new
hypothesis for origin of ratites. Nature 324, 563–565.
S56. Houde, P. (1988). Paleognathous Birds from the Early Tertiary of the Northern Hemisphere,
(Cambridge MA: Publications of the Nuttall Ornithological Club).
S57. Worthy, T.H., Mitri, M., Handley, W.D., Lee, M.S.Y., Anderson, A., and Sand, C. (2016). Osteology
supports a stem-galliform affinity for the giant extinct flightless bird Sylviornis neocaledoniae
(Sylviornithidae, Galloanseres). PLoS ONE 11, e0150871. doi: 10.1371/journal.pone.0150871
S58. Yuri, T., Kimball, R.T., Harshman, J., Bowie, R.C., Braun, M.J., Chojnowski, J.L., Han, K.L.,
Hackett, S.J., Huddleston, C.J., Moore, W.S., et al. (2013). Parsimony and model-based analyses
of indels in avian nuclear genes reveal congruent and incongruent phylogenetic signals. Biology
(Basel) 2, 419-444.
S59. Thorne, J.L., Kishino, H., and Painter, I.S. (1998). Estimating the rate of evolution of the rate of
molecular evolution. Mol. Biol. Evol. 15, 1647-1657.
S60. Thorne, J.L., and Kishino, H. (2002). Divergence time and evolutionary rate estimation with
multilocus data. Syst. Biol. 51, 689-702.
S61. Tamura, K. (1992). Estimation of the number of nucleotide substitutions when there are strong
transition-transversion and G+C-content biases. Mol. Biol. Evol. 9, 678-687.
S62. Nishihara, H., Okada, N., and Hasegawa, M. (2007). Rooting the eutherian tree: the power and
pitfalls of phylogenomics. Genome Biol. 8, R199.
S63. Inoue, J., Donoghue, P.C., and Yang, Z. (2010). The impact of the representation of fossil
calibrations on Bayesian estimation of species divergence times. Syst. Biol. 59, 74-89.
S64. Yang, Z., and Rannala, B. (2006). Bayesian estimation of species divergence times under a
molecular clock using multiple fossil calibrations with soft bounds. Mol. Biol. Evol. 23, 212-226.
S65. Zhang, C., Stadler, T., Klopfstein, S., Heath, T.A., and Ronquist, F. (2015). Total-Evidence
Dating under the Fossilized Birth-Death Process. Syst. Biol. doi: 10.1093/sysbio/syv080
S66. Nabholz, B., Lanfear, R., and Fuchs, J. (2016). Body mass-corrected molecular rate for bird
mitochondrial DNA. Mol. Ecol. 25, 4438–4449.
S67. Lartillot, N., and Delsuc, F. (2012). Joint reconstruction of divergence times and life-history
evolution in placental mammals using a phylogenetic covariance model. Evolution 66, 1773-
1787.
S68. Thewissen, J.G., Williams, E.M., Roe, L.J., and Hussain, S.T. (2001). Skeletons of terrestrial
cetaceans and the relationship of whales to artiodactyls. Nature 413, 277–281.
S69. Lartillot, N. (2014). A phylogenetic Kalman filter for ancestral trait reconstruction using
molecular data. Bioinformatics 30, 488-496.
S70. Yang, Z. (1994). Maximum likelihood phylogenetic estimation from DNA sequences with
variable rates over sites: approximate methods. J. Mol. Evol. 39, 306-314.
S71. Mayr, G. (2009). Paleogene Fossil Birds, (New York: Springer Publishing).
S72. Serrano, F.J., Palmqvist, P., and Sanz, J.L. (2015). Multivariate analysis of neognath skeletal
measurements: implications for body mass estimation in Mesozoic birds. Zool. J. Linn. Soc. 173,
929-955.
S73. Balanoff, A.M., and Rowe, T. (2007). Osteological description of an embryonic skeleton of the
extinct elephant bird, Aepyornis (Palaeognathae: Ratitae). J. Vert. Paleontol. Memoir 9, 27, 53.
S74. Oskam, C.L., Haile, J., McLay, E., Rigby, P., Allentoft, M.E., Olsen, M.E., Bengtsson, C.,
Miller, G.H., Schwenninger, J.L., Jacomb, C., et al. (2010). Fossil avian eggshell preserves
ancient DNA. Proc. Biol. Sci. 277, 1991-2000.
S75. Endo, H., Akishinonomiya, F., Yonezawa, T., Hasegawa, M., Rakotondraparany, F., Sasaki, M.,
Taru, H., Yoshida, A., Yamasaki, T., Itou, T., et al. (2012). Coxa morphologically adapted to
large egg in aepyornithid species compared with various palaeognaths. Anat. Histol. Embryol.
41, 31-40.
S76. Worthy, T.H., Worthy, J.P., Tennyson, A.J.D., Salisbury, S.W., Hand, S.J., and Scofield, R.P.
(2013). Miocene fossils show that kiwi (Apteryx, Apterygidae) are probably not phyletic
dwarves. In Proceedings of the 8th International Meeting of the Society of Avian Paleontology and
Evolution.
S77. Pagel, M., Meade, A., and Barker, D. (2004). Bayesian estimation of ancestral character states on
phylogenies. Syst. Biol. 53, 673-684.
S78. Jarvis, E.D., Mirarab, S., Aberer, A.J., Li, B., Houde, P., Li, C., Ho, S.Y., Faircloth, B.C.,
Nabholz, B., Howard, J.T., et al. (2014). Whole-genome analyses resolve early branches in the
tree of life of modern birds. Science 346, 1320-1331.
S79. Lenz, O.K., Wilde, V., Mertz, D.F., and Riegel, W. (2014). New palynology-based astronomical
and revised 40Ar/39Ar ages for the Eocene maar lake of Messel (Germany). Int. J. Earth Sci. 21.
doi:10.1007/s00531-014-1126-2
S80. Hou, L., Zhou, Z., Zhang, F., and Wang, Z. (2005). A Miocene ostrich fossil from Gansu
Province, northwest China. Chin. Sci. Bull. 50, 1808–1810.
S81. Janz, L., Elston, R.G., and Burr, G.S. (2009). Dating North Asian surface assemblages with
ostrich eggshell: implications for palaeoecology and extirpation. J. Archaeol. Sci. 36, 1982-1989.