molecular data from orthonectid worms show they are highly ... · 1 2 title: molecular data from...
TRANSCRIPT
1 Title: Molecular data from Orthonectid worms show they 2
are highly degenerate members of phylum Annelida not 3
phylum Mesozoa. 4
Authors: Philipp H. Schiffer1, Helen E. Robertson1, Maximilian J. Telford1* 5
6
Affiliations: 1 Centre for Life’s Origins and Evolution, Department of Genetics, Evolution 7
and Environment, University College London, Gower Street, London, WC1E 6BT, UK 8
*Correspondence to: [email protected] 9
10
11
Summary: 12
The Mesozoa are a group of tiny, extremely simple, vermiform endoparasites of various 13
marine animals (Fig. 1). There are two recognised groups within the Mesozoa: the 14
Orthonectida (Fig. 1a,b; with a few hundred cells including a nervous system made up of just 15
10 cells [1]) and the Dicyemids (Fig. 1c; with at most 42 cells [2]). They are 16
classic ’Problematica’ [3] - the name Mesozoa suggests an evolutionary position 17
intermediate between Protozoa and Metazoa (animals) [4] and implies their simplicity is a 18
primitive state, but molecular data have shown they are members of Lophotrochozoa within 19
Bilateria [5-8] which would mean they derive from a more complex ancestor. Their precise 20
phylogenetic affinities remain uncertain, however, and ascertaining this is complicated by the 21
very fast evolution observed in genes from both groups, leading to the common systematic 22
error of Long Branch Attraction (LBA) [9]. Here we use mitochondrial and nuclear gene 23
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
sequence data, and show beyond doubt that both dicyemids and orthonectids are members of 24
the Lophotrochozoa. Carefully addressing the effects of systematic errors due to unequal 25
rates of evolution, we show that the phylum Mesozoa is polyphyletic. While the precise 26
position of dicyemids remains unresolved within Lophotrochozoa, we unequivocally identify 27
orthonectids as members of the phylum Annelida. This result reveals one of the most extreme 28
cases of body plan simplification in the animal kingdom; our finding makes sense of an 29
annelid-like cuticle in orthonectids [1] and suggests the circular muscle cells repeated along 30
their body [10] may be segmental in origin. 31
32
Results 33
Using a new assembly of available genomic and transcriptomic sequence data we identified an 34
almost complete mitochondrial genome from Intoshia linei (2 ribosomal RNAs, 20 transfer 35
RNAs and all protein coding genes apart from atp8) and recovered 9 individual mitochondrial 36
gene containing contigs from Dicyema japonicum and from a second unidentified species 37
(Dicyema sp.; cox1, 2, 3; cob; and nad1, 2, 3, 4, 5). Cob, nad3, nad4, and nad5 had not 38
previously been identified in any Dicyma species. All protostomes studied possess a unique, 39
derived combination of amino acid signatures and conserved deletions in their mitochondrial 40
NAD5 genes. Comparing the NAD5 protein coding regions of Intoshia and Dicyema to those 41
of other Metazoa shows that both share almost all of the conserved protostome signatures [11] 42
(Fig. 2a). This signature is significantly more complex than the two amino acids of the 43
Lox5/DoxC signature from Dicyema previously published [11-13] and shows beyond doubt 44
that both groups are protostomes. 45
It has been suggested that mesozoans are derived from the parasitic neodermatan flatworms. If 46
this were correct mesozoans would be expected to share two changes in mitochondrial genetic 47
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
code that unite all rhabditophoran flatworms, where the triplet AAA codes for Asparagine (N) 48
rather than the normal Lysine (K) and ATA codes for Isoleucine (I) rather than the usual 49
Methionine (M) [14]. We inferred the mitochondrial genetic codes for Dicyema and Intoshia. 50
Both groups have the standard invertebrate mitochondrial code arguing against a relationship 51
with the parasitic rhabditophoran platyhelminths (table S1). 52
We next aligned the mitochondrial genes of Intoshia and three species of Dicyema with 53
orthologs from a diversity of other Metazoans and concatenated these to produce a matrix of 54
2,969 reliably aligned amino acids from 69 species. Phylogenetic analyses of this 55
comparatively small data set is not expected to be as reliable as a much larger set of nuclear 56
genes and aspects of the topology and observed branch lengths suggest it was affected by LBA 57
(Fig. 2b). To reduce the effects of LBA on the inference of the affinities of the mesozoans we 58
removed the taxa with the longest branches and considered the position of the dicyemids and 59
orthonectid separately (as both are very long branched). We were unable to resolve the position 60
of the dicyemids (although they are clearly lophotrochozoans), but found some support for 61
placing the orthonectid Intoshia linei with the annelids (Fig. 2c and figures S1, 2). Intoshia 62
linei has a unique mitochondrial gene order although the order of the genes nad1, nad6, and 63
cob match that seen in the Lophotrochozoan ground plan and the early branching annelid 64
Owenia (Fig. 2d). 65
We next assembled a data set of 469 orthologous genes, 227,187 reliably aligned amino acids, 66
from 45 species of animals including Intoshia linei and two species of Dicyema. After 67
removing positions in the concatenated alignment with less than 50% occupancy we had an 68
alignment length of 190,027 amino acids and average completeness of ~68%. Intoshia linei 69
was 65% complete, while Dicyema japonicum and Dicyema sp. were 77% and 43% complete 70
respectively (table S2). We conducted a bayesian phylogenetic analyses of these data with the 71
site heterogeneous CAT+G4 model in Phylobayes [15]. To provide an additional, conservative 72
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
estimate of clade support and to enable further analyses in a practical time frame, we also used 73
jackknife subsampling. For each jackknife analysis we took 50 random subsamples of 30,000 74
amino acids each and ran 2,000 cycles (phylobayes CAT+G4) per sample. All 50 subsamples 75
were summarised into a single tree with the first 1800 trees from each excluded as ‘burnin’ 76
[16]. 77
We observed strong support for a clade of Lophotrochozoa (excluding Rotifers) including both 78
dicyemids and the orthonectid (Bayesian Posterior Probability (PP) = 1.0; Jackknife Proportion 79
(JP) = 0.97) (Fig. 3a). The dicyemids and orthonectids were not each other’s closest relatives; 80
the position of the dicyemids within the Lophotrochozoa was not resolved; they were not the 81
sister group of the platyhelminths nor of the gastrotrichs in our analysis. The position of the 82
orthonectid Intoshia, in contrast, was resolved as being within the clade of annelids (Fig. 2a PP 83
= 0.97; JP = 0.74). 84
We next asked whether there was any effect from long branched dicyemids on the strength of 85
support for inclusion of Intoshia within the Annelida - Intoshia also being a long-branched 86
taxon. Repeating our jackknife analyses with dicyemids excluded increased the support for 87
Intoshia as an annelid from JP = 0.74 to JP = 0.86 (Fig. 3b) showing that when the expected 88
LBA between Dicyema spp and Intoshia is prevented, there is stronger support for including 89
the orthonectid in Annelida. An equivalent analysis omitting Intoshia did not help to resolve 90
the position of dicyemids (figure S3). 91
To test further the support for Intoshia being a member of Annelida, we reasoned that an 92
analysis restricted to genes showing the strongest signal supporting monophyletic Annelida 93
should give stronger support to Intoshia within Annelida but only if it is indeed a member of 94
the clade; if not, support should decrease when using this subset of genes. We first removed all 95
mesozoan sequences from each individual gene alignment and reconstructed a tree for each 96
gene. We ranked these gene trees according to the proportion of all annelids present in a given 97
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
gene data set that were observed united in a clade. We concatenated the genes (now including 98
mesozoans) from strongest supporters of monophyletic Annelida to weakest. We repeated our 99
jackknife analyses using the best quarter of genes. An analysis of the genes that most strongly 100
support monophyletic Annelida results in an increase support for inclusion of Intoshia within 101
Annelida from JP = 0.74 to JP = 0.94 (Fig. 3c). 102
Our results suggest that recent findings of a close relationship between Intoshia and Dicyema 103
and the linking of both these taxa to rapidly evolving gastrotrichs and platyhelminths [7,8] is 104
due to long branch attraction. To test this prediction we exaggerated the expected effects of 105
LBA on our own data set by using less well fitting models. We first conducted cross validation 106
comparing the site heterogeneous CAT+G4 model we have used to the site homogenous 107
LG+G4 and show that LG+G4 is a significantly less good fit to our data (CAT+G4 is better 108
than LG+G4: ΔlnL = 9787 +/- 249.265). We used the less well fitting LG+G4 model to 109
reanalyse the jackknife replicates of a data set including our four most complete annelids. We 110
observed a topology clearly influenced by LBA in which long branched taxa including 111
flatworms, annelids, rotifers and nematodes were grouped. We also observed within this ‘LBA 112
assemblage’ the two longest branched clades, dicyemids and the orthonectid as each other’s 113
closest relatives. As a further test we reanalysed the published data set [8] which had linked 114
orthonectid and dicyemid with platyhelminths and gastrotrichs. When we removed the most 115
obvious source of LBA - the long branched dicyemid - we found that the orthonectid Intoshia 116
was, as expected, found not with platyhelminths or gastrotrichs but with the two annelids 117
present in this data set, again providing evidence of the effects of long branch attraction (Fig 118
4). 119
Discussion 120
We have analysed the first, almost complete mitochondrial genome sequence of an orthonectid 121
mesozoan and added to the known mitochondrial genes of Dicyemida to provide two powerful 122
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
rare genomic changes. Our analyses of mitochondrial NAD5 gene sequences show 123
unequivocally that both Dicyemida and Orthonectida are members of the protostomes and the 124
absence of rhabditophoran flatworm mitochondrial genetic code changes rejects existing ideas 125
that either group might be derived from parasitic flatworms. Both groups show unusually high 126
rates of evolution and this required steps to test for and avoid the possible effects of long branch 127
attraction, not least between the orthonectids and dicyemids. 128
Our mitochondrial data set and our large, taxonomically broad set of nuclear genes with a low 129
percentage of missing data, analysed with well fitting, site heterogeneous models of sequence 130
evolution, do not support the close relationship between orthonectids and dicyemids. 131
Orthonectids are annelids and not members of the Mesozoa and the phylum Mesozoa sensu 132
lato is an unnatural polyphyletic assemblage. We were unable to place the dicyemids more 133
precisely and they may be considered a phylum in their own right. Experiments manipulating 134
the expected effects of LBA strongly suggest previous phylogenies were affected by this 135
important source of systematic error. Finding the orthonectids and dicyemids not closely 136
associated demonstrates a remarkable instance of convergent evolution in two unrelated, 137
miniaturised parasites. 138
The finding that the orthonectid Intoshia is a member of the Annelida shows that it has evolved 139
its extraordinary simplicity by drastic simplification from a much more complex annelid 140
common ancestor. Our phylogenetic analyses could not more precisely place Intoshia within 141
the annelids, however, a short stretch of mitochondrial genes (nad1, nad6, cob) that are found 142
in the same order as in the lophotrochozoan ancestor and in the early branching annelid Owenia 143
fusiformis but not in the pleistoannelid ground plan argues for a position outside of the 144
Pleistoannelida [17] (Fig 2d). Possible evidence of an ancestral segmented body plan is still 145
apparent in the series of circular muscles regularly spaced along the antero-posterior axis of 146
Intoshia (Fig 1b), along with similarly repeated bands of cilia (Fig 1 and ref [18]). Further 147
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
analysis of the genome, embryology and morphology of Intoshia or other orthonectids are 148
predicted to show additional clues as to their cryptic annelidan ancestry. 149
Methods 150 151
Genome and transcriptome assemblies. 152
We downloaded genomic (Intoshia linei: SRR4418796, SRR4418797) and transcriptomic 153
(Dicyema sp.: SRR827581; Dicyema japonicum: DRR057371) data from the NCBI Short 154
Read Archives and DDBJ, and used Trimmomatic [19] to clean residual adapter sequences 155
from the sequencing reads and to remove low quality bases. We used the clc assembly cell 156
(clcBIO/Qiagen; v.5.0) to re-assemble the I. linei genome and the Trinity pipeline [20] 157
(v.2.3.2) to assemble the Dicyema. sp. and D. japonicum transcriptomes using default 158
settings. We additionally assembled transcriptomes for Phascolopsis gouldii, 159
Spiochaetopterus sp., Arenicola marina, Sabella pavonina, Magelona pitelkai, 160
Pharyngocirrus tridentiger and Bonellia viridis from SRA datasets (SRR1654498, 161
SRR1224605, SRR2005653, SRR2005708, SRR2015609, SRR2016714, SRR2017645) 162
using the same approach. 163
164
Identifying mitochondrial genome fragments. 165
Using mitochondrial gene protein coding sequences from flatworms as queries [21] we used 166
tblastn [22] and blastp to search for Dicyema sp. and D. japonicum mitochondrial fragments 167
in the Trinity RNA-Seq assemblies, and screened the I. linei genome re-assembly in a similar 168
way. Positively identified ORFs were then blasted against NCBI nr to detect possible 169
contamination from host species in the RNA-Seq data. For each Dicyema sp. gene-bearing 170
contig, we also found additional contigs which had strongly matching blast hits to Octopus or 171
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
other cephalopods (or in some cases to the gastropod mollusc Aplysia) and we discarded 172
these as likely contaminations. 173
174
Annotating mitochondrial genomes. 175
Using blast we identified a 14.2kb mitochondrial contig in the assembled I. linei genome, 176
which we annotated using MITOS [23]. The location of protein-coding genes were manually 177
verified from MITOS prediction, and inferred to start from the first in-frame start codon 178
(ATN, GTG, TTG, or GTT). The C-terminal of the protein-coding genes was inferred to be 179
the first in-frame stop codon (TAA, TAG or TGA). We aligned the Intoshia and Dicyema 180
NAD5 genes with those from 5 protostomes, 4 deuterostomes, and 2 non-bilaterian species in 181
the Geneious software to visualise Protostome specific signatures in the sequence. 182
183
Mitochondrial Phylogenetics 184
We grouped the mesozoan mitochondrial protein coding genes with their orthologs from 65 185
other species selected to cover the diversity of the Metazoa including diploblasts, 186
deuterostomes and ecdysozoans but with an emphasis on the diversity of Lophotrochozoa. 187
We aligned each set of orthologs using Muscle [24] v3.8.31 using default parameters and 188
trimmed these alignments to exclude unreliably aligned positions using TrimAl [25] (version 189
1.2 rev 59 using default settings). Finally, we concatenated the trimmed alignments of all 190
genes into a supermatrix of 2969 positions. We inferred a phylogeny with phylobayes (4.1b) 191
under the CAT+G4 model. We ran 10 independent chains for 10,000 cycles each. We 192
summarised all ten chains (bpcomp) discarding the first 8,000 trees from each as burnin. We 193
reconstructed additional mitochondrial phylogenies omitting (i) the long branching flatworm 194
species, (ii) all long branch taxa and also Intoshia, and (iii) long branch taxa and the Dicyema 195
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
species. Here and elsewhere we visualised and edited phylogenetic trees with FigTree 196
(v1.4.3; http://tree.bio.ed.ac.uk/software/figtree/). 197
198
Nuclear gene orthology determination 199
We chose to add the mesozoan data to sets of orthologous genes that were previously 200
successfully used to infer lophotrochozoan phylogeny [26,27]. We first used Orthofinder [28] 201
(v.1.0.8) to calculate orthologous relationships between the genes predicted for I. linei in the 202
recent genome paper [8] and our Dicyema sp. gene predictions. To ensure robustness of the 203
analysis we included several outgroup species (Supplementary table 2) In particular, as we 204
were concerned about potential contamination by the hosts of the parasitic Dicyema we 205
included the Octopus bimaculoides proteome. Since the published phylogenomic studies 206
included few annelid species we added our own Trinity assemblies of several additional 207
species (see above). We then extracted all orthologous groups containing the Octopus and the 208
two mesozoan taxa from the Orthofinder output and inserted these sequences into the original 209
alignments. This resulted in 590 orthologous groups. With the aid of OMA [29] and custom 210
Perl scripts we filtered these groups to contain single copy orthologs of all species. We re-211
aligned each set of orthologs using clustal-omega [30]; we removed unreliably aligned 212
positions from each alignment using TrimAl; finally we constructed individual gene trees 213
from these trimmed alignments using phyml [31] (v20160207). Using Python code and the 214
ETE3 toolkit we checked each tree for instances where sequences from Octopus and Dicyema 215
sp. were each other’s closest relatives (suggesting the sequence is an Octopus contaminant) 216
and removed the 5 alignments where the trees had this topology from our set. We 217
concatenated all single trimmed alignments of 45 taxa into a supermatrix of 227,646 218
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
positions. We used a custom script to eliminate all positions in the alignment with less than 219
50% occupancy. 220
221
Nuclear Gene Phylogenomic analyses 222
Using the mpi version of phylobayes (in v.1.7) run over four independent chains for 5000 223
cycles and discarding the first 4500 trees as burnin we reconstructed a phylogeny using this 224
alignment under both the CAT+G4 model of molecular evolution. To provide a conservative 225
measure of clade support and to test different data samples in a reasonable time we also 226
reconstructed trees using 50 jackknife sub-samples of 30,000 positions each from the 227
supermatrix. We used phylobayes 4.1c with the aid of the gnu-parallel command line tool 228
[32] and the UCL HPC cluster. We used the CAT+G4 model, and also compared results 229
from LG+G4. We ran phylobayes for 2000 cycles per jackknife sample which consistently 230
resulted in a plateauing of the likelihood score. We summarised all 50 of these phylobayes 231
analyses per model (using bpcomp) discarding the first 1800 sampled trees per jackknife as 232
burnin. We also tested the effect of different species compositions in our dataset by 233
performing phylobayes jackknife sampling with different subsets of taxa. 234
235
Cross validation 236
We compared the fit of CAT+G4 and LG+G4 models to our data using cross validation as 237
described in the phylobayes user manual. We ran 10 replicates and for each replicate we used 238
a randomly selected 30,000 positions of the data as a training set and 10,000 randomly 239
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
selected positions as the test set. Log likelihood scores were averaged over the ten replicates 240
using the sumcv command. 241
242
Ranking genes according to support for monophyletic Annelida. 243
We first removed all Intoshia and Dicyema sequences from each individual gene alignment. 244
For each individual gene, we reconstructed a tree from the aligned protein coding sequences 245
using Ninja [33]. Each tree was parsed using a custom script to find the proportion of 246
annelids in the data set present in the largest clade of annelids found. The tree was given a 247
score which was calculated as the number of annelids in the largest clade/total number of 248
annelids on the tree. Trees with larger monophyletic annelid clades scored highest. The 249
genes were then concatenated in order of their score. We took the first 25% of positions from 250
this concatenation (those genes with the strongest signal supporting monophyletic annelids) 251
and analysed jackknife replicates as before. 252
253 Acknowledgements: 254
255 The authors are grateful to Prof George Slyusarev (Saint Petersburg State University, Russia) 256
for providing images of Intoshia linei and for sharing an unpublished book chapter and to Dr 257
Tsai-Ming Lu and colleagues (Okinawa Institute of Science and Technology, Japan) for 258
sharing data. The authors also thank Fraser Simpson (UCL) for help in editing Figure 1b. The 259
research was funded by ERC grant (ERC-2012-AdG 322790) to MJT. Alignments have been 260
deposited with Zenodo (XXX) and phylogenetic trees are available through treebase.org 261
(XXX). 262
263
264
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
Author Contributions 265
Conceived the study: MJT. Planned the study: MJT and PHS. Assembled the data sets: PHS. 266
Analysed the data: PHS and MJT. Drafted the manuscript: MJT and PHS. Analysed 267
mitochondrial data: HER. 268
269
Declaration of Interests: 270
The authors declare no competing financial interests. 271 272 273 Figure Legends 274 275 Fig. 1: The mesozoans Intoshia variabili and Dicyema typus 276 277 A. Differential Interference contrast micrograph of an Intoshia variabili female showing 278 repeated bands of ciliated cells. Picture G. Slyusarev (St Petersburg State University, 279 Russia). 280 B. Confocal image of a phalloidin stained female specimen of Intoshia linei reveals repeated 281 set of circular muscles. Picture G. Slyusarev (St Petersburg State Univ.). 282 C. Rhombogen stage of a dicyemid (Dicyema typus from the Octopus) adapted from Hyman 283 L.H. The Invertebrates: Protozoa through Ctenophora McGraw-Hill, New York 1940(19). 284 Anterior to right in all images. 285 286 Fig. 2: Analyses of the phylogenetic positions of Dicyema and Intoshia based on 287 mitochondrial gene sequences. 288 289 A. Alignment of the mitochondrial NAD5 gene from selected protostomes, deuterostomes, 290 and outgroups, highlighting derived substitutions and amino acid deletions shared by the 291 orthonectids, dicyemids, and other protostomes. 292 B. A mitochondrial bayesian phylogeny based on 2969 positions places orthonectids and 293 dicyemids inside Lophotrochozoa, but the unlikely assemblage of Intoshia linei and 294 flatworms with annelids suggest this is affected by systematic error. 295 C. Mitochondrial bayesian phylogeny omitting the long branching taxa including Dicyema 296 gives some support for a position of Intoshia within Annelida.D. Order of the Intoshia nad1, 297 nad6, and cob mitochondrial genes in comparison to the early branching annelid Owenia 298 fusiformis, the pleistoannelid ground plan and the lophotrochozoan ground plan (see ref [17]). 299 300
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
Fig. 3: Analyses of the phylogenetic positions of Dicyema and Intoshia based on nuclear 301 gene sequences. 302
A. A bayesian phylogeny reconstructed from 190,027 aligned amino acid positions analysed 303 under the CAT+G4 model. Support values are from bayesian posterior probabilities (PP) and 304 from 50 jackknifed sub-samples of 30,000 residues (JP support values in brackets). Both 305 analyses reveal Mesozoa to be polyphyletic and place Intoshia linei in Annelida (see Supp 306 Fig 4a for support values). 307
B. A repeat of the jackknife analysis omitting the long-branching Dicyema species eliminates 308 the potential for LBA between Intoshia and Dicyema. This leads to an increase in the support 309 for a position of Intoshia within Annelida from JP 0.74 to JP 0.86 JP. (Only lophotrochozoan 310 part of the tree shown, see Supplementary Fig 4c for full tree). 311
C. Bayesian jackknife using CAT+G4 model using the best quarter of genes supporting 312 monophyletic annelids leads to increased support for Intoshia within Annelida to JP 0.94 313 even with the inclusion of the Dicyema species. (Only lophotrochozoan part of the tree 314 shown, see Supplementary Fig 4d for full tree). 315
316
Fig. 4: Reanalysis of a published data set addressing potential LBA between mesozoans 317 supports annelid affinity for Intoshia. 318
Repeating the analyses on a previously published data set [8] excluding the long branching 319 Dicyema leads to Intoshia being placed with the annelids, showing the likely effect of LBA 320 on the original analysis. Support values are bayesian posterior probabilities (PP). 321
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
A B
C
Fig. 1: The mesozoans Intoshia variabili and Dicyema typus A. Differential Interference contrast micrograph of an Intoshia variabili female showing repeated bands of ciliated cells. Picture G. Slyusarev (St Petersburg State University, Russia). B. Confocal image of a phalloidin stained female specimen of Intoshia linei reveals repeated set of circular muscles. Picture G. Slyusarev (St Petersburg State Univ.). C. Rhombogen stage of a dicyemid (Dicyema typus from the Octopus) adapted from Hyman L.H. The Invertebrates: Protozoa through Ctenophora McGraw-Hill, New York 1940(19). Anterior to right in all images.
author/funder. All rights reserved. N
o reuse allowed w
ithout permission.
The copyright holder for this preprint (w
hich was not peer-review
ed) is the.
https://doi.org/10.1101/235549doi:
bioRxiv preprint
A
B C
0.1
Nematostella_sp
Aplysia_californica
Doliolum_nationalisBranchiostoma_floridae
Acropora_tenuis
Siphonodentalium_lobatum
Lampetra_fluviatilis
Asymmetron_inferum
Loligo_bleekeri
Fasciola_hepaticaTaenia_crassiceps
Sepioteuthis_lessoniana
Tribolium_castaneum
Biomphalaria_tenagophila
Sepia_officinalis
Locusta_migratoria
Decipisagitta_decipiens
Limulus_polyphemus
Schistosoma_mansoni
Xenoturbella_bocki
Sepia_esculenta
Bugula_neritina
Terebratulina_retusa
Pupa_strigosa
Microstomum_lineare
Echinococcus_multilocularis
Marphysa_sanguinea
Orbinia_latreillii
Sagitta_inflata
Dicyema_japonicum
Saccoglossus_kowalevskii
Strongyloides_stercoralis
Escarpia_spicata
Urechis_caupo
Trichinella_spiralis
Lineus_alborostratusNautilus_macromphalus
Steinernema_carpocapsae
Mytilus_trossulus
Spadella_cephaloptera
Thais_clavigera
Sipunculus_nudus
Echinococcus_shiquicus
Intoshia_linei
Ornithorhynchus_anatinus
Priapulus_caudatus
Perionyx_excavatus
Strongylocentrotus_pallidus
Myxine_glutinosaSalmo_salar
Amynthas_jiriensis
Homo_sapiens
Balanoglossus_carnosus
Aurelia_aurita
Octopus_vulgaris
Trichoplax_adhaerens
Octopus_ocellatus
Laqueus_rubellus
Ciona_intestinalis
Dosidicus_gigas
Myzostoma_seymourcollegiorum
Clymenella_torquata
Flustrellidra_hispida
Dicyema_spDicyema_misakiense
Schmidtea_mediterranea
Lumbricus_terrestris
Daphnia_pulex
Platynereis_dumerilii
0.331
0.57
0.41
0.3
0.68
0.87
1
1
0.980.96
0.62
0.1
Aurelia_auritaStrongylocentrotus_pallidus
Dosidicus_gigas
Flustrellidra_hispidaBugula_neritina
Limulus_polyphemus
Biomphalaria_tenagophila
Asymmetron_inferumTribolium_castaneum
Decipisagitta_decipiens
Octopus_vulgaris
Branchiostoma_floridae
Xenoturbella_bocki
Priapulus_caudatus
Thais_clavigera
Amynthas_jiriensis
Ornithorhynchus_anatinus
Orbinia_latreillii
Daphnia_pulexLocusta_migratoria
Doliolum_nationalis
Octopus_ocellatus
Salmo_salar
Sagitta_inflata
Saccoglossus_kowalevskii
Mytilus_trossulus
Myxine_glutinosa
Homo_sapiens
Myzostoma_seymourcollegiorum
Ciona_intestinalis
Intoshia_linei
Trichoplax_adhaerens
Loligo_bleekeri
Terebratulina_retusa
Spadella_cephaloptera
Urechis_caupo
Lampetra_fluviatilis
Aplysia_californica
Siphonodentalium_lobatum
Sepia_esculenta
Laqueus_rubellus
Platynereis_dumerilii
Sepioteuthis_lessoniana
Nautilus_macromphalus
Nematostella_sp
Escarpia_spicataPerionyx_excavatus
Marphysa_sanguinea
Balanoglossus_carnosus
Acropora_tenuis
Clymenella_torquata
Sipunculus_nudus
Lineus_alborostratus
Pupa_strigosa
Sepia_officinalis
Lumbricus_terrestris
0.80.69
0.39
0.94
0.92
1
0.83
1
0.741
Annelida
DicyemaIntoshiaOctopusPhonorisTaeniaBrugiaDrosophilaXenopusPongoXenoturbellaHydraParamecium
180 190 260 270 280 290
Fig. 2: Analyses of the phylogenetic positions of Dicyema and Intoshia based on mitochondrial gene sequences. A. Alignment of the mitochondrial NAD5 gene from selected protostomes, deuterostomes, and outgroups, highlighting derived substitutions and amino
acid deletions shared by the orthonectids, dicyemids, and other protostomes. B. A mitochondrial bayesian phylogeny based on 2969 positions places orthonectids and dicyemids inside Lophotrochozoa, but the unlikely assemblage
of Intoshia linei and flatworms with annelids suggest this is affected by systematic error. C. Mitochondrial bayesian phylogeny omitting the long branching taxa including Dicyema gives some support for a position of Intoshia within Annelida. D. Order of the Intoshia nad1, nad6, and cob mitochondrial genes in comparison to the early branching annelid Owenia fusiformis, the pleistoannelid ground plan and the lophotrochozoan ground plan (see ref [17]).
-nad4 -cox1 nad1 nad6 cob nad3 rrnL
nad5 nad2 nad1 nad6 cob nad4L nad4
cox2 atp8 cox3 nad6 cob atp6 nad5
rrnS rrnL nad1 nad6 cob nad4L nad4D
Pleistoannelida
Intoshia linei
Owenia fusiformis
Lophotrochozoa
author/funder. All rights reserved. N
o reuse allowed w
ithout permission.
The copyright holder for this preprint (w
hich was not peer-review
ed) is the.
https://doi.org/10.1101/235549doi:
bioRxiv preprint
A
AnnelidaB
0.5
Maritigrella_crozieri
Macrodasys_sp
Cerebratulus_sp
Macrostomum_lignano
Tubulanus_polymorphus
Sabella_pavonina
Helobdella_robustaIntoshia_linei
Cephalothrix_linearis
Pharyngocirrus_tridentiger
Lymnea_stagnalis
Lepidodermella_squamata
Octopus_bimaculoides
Golfingia_vulgaris
Adineta_ricciaeMesodasys_laticaudatus
Neomenia_megatrapezata
Schistosoma_mansoni
Bonellia_viridisAlvinella_pompejana
Macracanthorhynchus_hirudinaceus
Dicyema_japonicum
Lottia_gigantea
Magelona_pitelkai
Monocelis_sp
Stenostomum_sp
Capitella_tellata
Dicyema_sp
Dendrocoelum_lacteum
C
0.5
Lepidodermella_squamata
Helobdella_robusta
Magelona_pitelkai
Intoshia_linei
Tubulanus_polymorphus
Stenostomum_sp
Macracanthorhynchus_hirudinaceus
Alvinella_pompejana
Macrodasys_sp
Monocelis_sp
Sabella_pavonina
Adineta_ricciae
Bonellia_viridis
Macrostomum_lignano
Lottia_gigantea
Octopus_bimaculoides
Golfingia_vulgaris
Lymnea_stagnalis
Dendrocoelum_lacteum
Neomenia_megatrapezata
Cephalothrix_linearis
Capitella_tellata
Schistosoma_mansoni
Pharyngocirrus_tridentiger
Maritigrella_crozieri
Cerebratulus_sp
Mesodasys_laticaudatus
0.86 0.94
Annelida0.5 0.5
0.1
Pediculus_humanus
Lymnea_stagnalis
Macrostomum_lignano
Monocelis_sp
Alvinella_pompejana
Tribolium_castaneum
Patiria_miniata
Macracanthorhynchus_hirudinaceus
Tubulanus_polymorphus
Schistosoma_mansoni
Helobdella_robusta
Adineta_ricciae
Strongylocentrotus_purpuratus
Dendrocoelum_lacteum
Homo_sapiens
Hydra_vulgaris
Trichinella_spiralis
Rhabdopleura_sp
Nematostella_vectensis
Capitella_tellata
Priapulus_caudatus
Neomenia_megatrapezata
Maritigrella_crozieri
Magelona_pitelkai
Cephalothrix_linearis
Sycon_coactum
Lepidodermella_squamata
Trichoplax_adhaerens
Dicyema_sp
Amphimedon_queenslandica
Stenostomum_sp
Bonellia_viridis
Macrodasys_sp
Lottia_gigantea
Intoshia_linei
Saccoglossus_kowalevskii
Branchiostoma_floridae
Cerebratulus_sp
Pharyngocirrus_tridentiger
Daphnia_pulex
Mesodasys_laticaudatus
Golfingia_vulgaris
Octopus_bimaculoides
Sabella_pavonina
Dicyema_japonicum
0.1
0.97 (0.74)
1 (0.97)
Fig. 3: Analyses of the phylogenetic positions of Dicyema and Intoshia based on nuclear gene sequences. A. A bayesian phylogeny reconstructed from 190,027 aligned amino acid positions analysed under the CAT+G4 model. Support values are from bayesian posterior probabilities (PP) and from 50 jackknifed sub-samples of 30,000 residues (JP support values in brackets). Both analyses reveal Mesozoa to be polyphyletic and place Intoshia linei in Annelida (see Supp Fig 4a for support values). B. A repeat of the jackknife analysis omitting the long-branching Dicyema species eliminates the potential for LBA between Intoshia and Dicyema. This leads to an increase in the support for a position of Intoshia within Annelida from JP 0.74 to JP 0.86 JP. (Only lophotrochozoan part of the tree shown, see Supplementary Fig 4c for full tree). C. Bayesian jackknife using CAT+G4 model using the best quarter of genes supporting monophyletic annelids leads to increased support for Intoshia within Annelida to JP 0.94 even with the inclusion of the Dicyema species. (Only lophotrochozoan part of the tree shown, see Supplementary Fig 4d for full tree).
author/funder. All rights reserved. N
o reuse allowed w
ithout permission.
The copyright holder for this preprint (w
hich was not peer-review
ed) is the.
https://doi.org/10.1101/235549doi:
bioRxiv preprint
Fig. 4: Reanalysis of a published data set addressing potential LBA between mesozoans supports annelid affinity for Intoshia.
Repeating the analyses on a previously published data set [7] excluding the long branching Dicyema leads to Intoshia being placed with the annelids, showing the likely effect of LBA on the original analysis. Support values are bayesian posterior probabilities (PP).
Annelida
0.1
Daphnia_pulex
Bothrioplana_semperi
Lepidodermella_squamata
Helobdella_robusta
Homo_sapiens
Stenostomum_leucops
Schistosoma_mansoni
Mesodasys_laticaudatus
Geocentrophora_applanata
Macrodasys_sp
Stenostomum_sthenum
Octopus_bimaculatus
Prostheceraeus_vittatus
Branchiostoma_floridae
Monocelis_fusca
Tribolium_castaneum
Austrognathia_sp
Ciona_intestinalis
Drosophila_melanogaster
Brachionus_koreanusMacracanthorhynchus_hirudinaceus
Gnathostomula_paradoxa
Lottia_gigantea
Microstomum_lineare
Diuronotus_aspetos
Intoshia_linei
Limnognathia_maerski
Capitella_teleta0.980.93
1
1
1
1
1
1
1
11
1
1
0.98
1
1
1
1
1
1
0.99
0.94
0.911
0.96
1
author/funder. All rights reserved. N
o reuse allowed w
ithout permission.
The copyright holder for this preprint (w
hich was not peer-review
ed) is the.
https://doi.org/10.1101/235549doi:
bioRxiv preprint
Supplementary Tables 322 323
Supplementary Table 1 324 Predicted correspondence of nucleotide triplets to amino acids in Intoshia and three Dicyema 325 species. For each triplet, the amino acid corresponding to the triplet in the standard 326 invertebrate mitochondrial code is shown, the number of observations of the triplet to 327 prediction is based on, the predicted amino acid and its score and finally the second highest 328 scoring amino acid prediction. The triplets AAA and ATA are highlighted in green and likely 329 errors highlighted in blue. Likely errors are mostly associated with very low numbers of 330 observed GC rich triplets in these very AT rich mitochondrial genomes. 331
332
Supplementary Table 2 333 List of species used in the final phylogenetic analysis, data sources, and representation in the 334 final alignment. 335
336 337
Supplementary Figures: 338 339
Figure S1. Related to Figure 2. 340 Phylogram and corresponding cladogram of a Bayesian analysis of our mitochondrial data set 341 omitting the long-branching flatworm species. Phylobayes CAT+G4 model was run in 10 342 independent runs for 10,000 cycles each on an alignment with 2969 positions and 8000 trees 343 were discarded as burnin. 344
345
Figure S2. Related to Figure 2. 346
Phylogram and corresponding cladogram of a Bayesian analysis of our mitochondrial data set 347 omitting Intoshia linei. Phylobayes CAT+G4 model was run in 10 independent runs for 348 10,000 cycles each on an alignment with 2969 positions and 8000 trees were discarded as 349 burnin. 350
351
Figure S3. Related to Figure 3. 352
A phylogram based on our analysis of the jackknifed dataset omitting Intoshia linei. Contrary 353 to the improvement in placing I. linei observed when excluding the Dicyema species, the 354 exclusion of I. linei does not lead to a better resolution of the Dicyema species’ position. This 355 can be seen as further evidence for the non-affiliation of orthonectids and dicyemids and the 356 correct inference that orthonectids are part of Annelida. 357
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
Figure S4. Related to Figure 3. 358 A. Cladogram corresponding to Fig 3a showing all PP support values for the CAT+G4 359 phylogeny based on the full alignment of 190,027 amino acid positions. 360 361 B. A cladogram including JP support values based on 50 jackknife subsamples of 30,000 362 amino acid positions each independently analysed for 2000 cycles under the CAT+G4 model 363 in phylobayes and summarised with the bpcomp command setting 1800 as burnin. As in the 364 analysis of the full dataset I. linei is found within the annelids and phylum Mesozoa is found 365 as an unnatural assemblage. 366 367 C. Cladogram corresponding to Fig 3b showing all support values. 368 369 D. Cladogram corresponding to Fig 3c showing all support values. 370
371 References: 372
373 [1] Slyusarev GS, Starunov VV. The structure of the muscular and nervous systems of 374
the female Intoshia linei (Orthonectida). Org Divers Evol 2015;16:65–71. 375 [2] Furuya H, Hochberg FG, Tsuneki K. Cell number and cellular composition in 376
infusoriform larvae of dicyemid mesozoans (Phylum Dicyemida). Zool Sci 377 2004;21:877–89. 378
[3] Nielsen C. Animal Evolution. Oxford University Press; 2011. 379 [4] Dodson EO. A note on the systematic position of the Mesozoa. Syst Zoo 1956;5:37. 380 [5] Suzuki TG, Ogino K, Tsuneki K, Furuya H. Phylogenetic analysis of dicyemid 381
mesozoans (phylum Dicyemida) from innexin amino acid sequences: Dicyemids are 382 not related to Platyhelminthes. J Parasitol 2010;96:614–25. 383
[6] Hanelt B, Van Schyndel D, Adema CM, Lewis LA, Loker ES. The phylogenetic 384 position of Rhopalura ophiocomae (Orthonectida) based on 18S ribosomal DNA 385 sequence analysis. Mol Biol Evol 1996;13:1187–91. 386
[7] Lu T-M, Kanda M, Satoh N, Furuya H. The phylogenetic position of dicyemid 387 mesozoans offers insights into spiralian evolution. Zool Letts 2017;3:419. 388
[8] Mikhailov KV, Slyusarev GS, Nikitin MA, Logacheva MD, Penin AA, Aleoshin VV, et 389 al. The genome of Intoshia linei affirms orthonectids as highly simplified spiralians. 390 Curr Biol 2016;26:1768–74. 391
[9] Philippe H, Brinkmann H, Copley RR, Moroz LL, Nakano H, Poustka AJ, et al. 392 Acoelomorph flatworms are deuterostomes related to Xenoturbella. Nature 393 2011;470:255–8. 394
[10] Slyusarev GS. Fine structure and development of the cuticle of Intoshia variabili 395 (Orthonectida). Acta Zool 2000;81:1–8. 396
[11] Telford MJ, Copley RR. Improving animal phylogenies with genomic data. Trends 397 Genet 2011;27:186–95. 398
[12] Telford MJ. Turning Hox “signatures” into synapomorphies. Evol Dev 2000;2:360–4. 399 [13] Kobayashi M, Furuya H, Holland PW. Dicyemids are higher animals. Nature 400
1999;401:762–2. 401 [14] Telford MJ, Herniou EA, Russell RB, Littlewood DT. Changes in mitochondrial 402
genetic codes as phylogenetic characters: two examples from the flatworms. P Natl 403 Acad Sci Usa 2000;97:11359–64. 404
[15] Lartillot N, Blanquart S, Lepage T. PhyloBayes 3.3 a Bayesian software for 405 phylogenetic reconstruction and molecular dating using mixture models. 2012. 406
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
[16] Simion P, Philippe H, Baurain D, Jager M, Richter DJ, Di Franco A, et al. A large 407 and consistent phylogenomic dataset supports sponges as the sister group to all 408 other animals. Curr Biol 2017;27:958–67. 409
[17] Weigert A, Golombek A, Gerth M, Schwarz F, Struck TH, Bleidorn C. Evolution of 410 mitochondrial gene order in Annelida. Mol Phyl Evol 2016;94:196–206. 411
[18] Slyusarev GS, Kristensen RM. Fine structure of the ciliated cells and ciliary rootlets 412 of Intoshia variabili (Orthonectida). Zoomorphology 2002;122:33–9. 413
[19] Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina 414 sequence data. Bioinformatics 2014;30:2114–20. 415
[20] Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, et al. De 416 novo transcript sequence reconstruction from RNA-seq using the Trinity platform for 417 reference generation and analysis. Nat Protoc 2013;8:1494–512. 418
[21] Robertson HE, Lapraz F, Egger B, Telford MJ, Schiffer PH. The mitochondrial 419 genomes of the acoelomorph worms Paratomella rubra, Isodiametra pulchra and 420 Archaphanostoma ylvae. Sci Rep 2017;7:1847. 421
[22] Altschul SF, Gish W, Miller W, Myers EW. Basic local alignment search tool. Journal 422 of Mol Biol 1990;215:403–10. 423
[23] Bernt M, Donath A, Jühling F, Externbrink F, Florentz C, Fritzsch G, et al. MITOS: 424 improved de novo metazoan mitochondrial genome annotation. Mol Phyl Evol 425 2013;69:313–9. 426
[24] Edgar RC. MUSCLE: a multiple sequence alignment method with reduced time and 427 space complexity. BMC Bioinformatics 2004;5:113. 428
[25] Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. TrimAl: a tool for automated 429 alignment trimming in large-scale phylogenetic analyses. Bioinformatics 430 2009;25:1972–3. 431
[26] Egger B, Lapraz F, Tomiczek B, Müller S, Dessimoz C, Girstmair J, et al. A 432 transcriptomic-phylogenomic analysis of the evolutionary relationships of flatworms. 433 Curr Biol 2015;25:1347–53. 434
[27] Struck TH, Wey-Fabrizius AR, Golombek A, Hering L, Weigert A, Bleidorn C, et al. 435 Platyzoan paraphyly based on phylogenomic data supports a noncoelomate 436 ancestry of spiralia. Mol Biol Evol 2014;31:1833–49. 437
[28] Emms DM, Kelly S. OrthoFinder: solving fundamental biases in whole genome 438 comparisons dramatically improves orthogroup inference accuracy. Genome Biol 439 2015;16:E9–13. 440
[29] Altenhoff AM, Škunca N, Glover N, Train C-M, Sueki A, Piližota I, et al. The OMA 441 orthology database in 2015: function predictions, better plant support, synteny view 442 and other improvements. Nucleic Acids Res 2015;43:D240–9. 443
[30] Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, et al. Fast, scalable 444 generation of high-quality protein multiple sequence alignments using Clustal 445 Omega. Mol Syst Biol 2011;7:1–6. 446
[31] Morrison DA. Increasing the efficiency of searches for the maximum likelihood tree 447 in a phylogenetic analysis of up to 150 nucleotide sequences. Systematic Biol 448 2007;56:988–1010. 449
[32] Tange O. Gnu parallel-the command-line power tool. The USENIX Magazine; 2011. 450 [33] Wheeler TJ. Large-Scale Neighbor-Joining with NINJA. Algorithms in Bioinformatics, 451
vol. 5724, Berlin, Heidelberg: Springer Berlin Heidelberg; 2009, pp. 375–89. 452
~50µm
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
0.1
Saccoglossus_kowalevskii
Lumbricus_terrestrisPerionyx_excavatusAmynthas_jiriensis
Strongylocentrotus_pallidus
Thais_clavigera
Marphysa_sanguinea
Lineus_alborostratus
Ciona_intestinalis
Siphonodentalium_lobatum
Xenoturbella_bocki
Spadella_cephaloptera
Homo_sapiens
Octopus_vulgaris
Daphnia_pulex
Myzostoma_seymourcollegiorum
Lampetra_fluviatilis
Biomphalaria_tenagophilaPupa_strigosa
Terebratulina_retusa
Loligo_bleekeri
Flustrellidra_hispida
Escarpia_spicata
Tribolium_castaneum
Aplysia_californica
Platynereis_dumerilii
Sagitta_inflata
Myxine_glutinosa
Asymmetron_inferum
Locusta_migratoria
Dicyema_japonicum
Ornithorhynchus_anatinus
Mytilus_trossulus
Octopus_ocellatus
Dosidicus_gigas
Nautilus_macromphalus
Branchiostoma_floridae
Intoshia_linei
Priapulus_caudatus
Laqueus_rubellus
Clymenella_torquata
Balanoglossus_carnosus
Limulus_polyphemus
Urechis_caupo
Dicyema_sp
Sepia_esculentaSepia_officinalis
Doliolum_nationalis
Decipisagitta_decipiens
Orbinia_latreillii
Trichoplax_adhaerens
Dicyema_misakiense
Acropora_tenuis
Salmo_salar
Bugula_neritina
Nematostella_sp
Sepioteuthis_lessoniana
Aurelia_aurita
Sipunculus_nudus
0.74
1
0.69
1
0.99
1
0.63
0.64
0.92
1
1
0.87
1
0.87
10.94
0.32
0.47
0.99
0.97
0.94
0.96
0.57
1
0.38
0.96
0.74
1
0.42
0.39
0.431
0.89
0.99
0.94
0.39
0.41
0.44
0.97
0.99
1
0.94
0.55
1
0.63
0.83
0.32
0.990.97
0.58
0.85
0.78
1
0.73
0.98
0.1
Saccoglossus_kowalevskii
Lumbricus_terrestrisPerionyx_excavatusAmynthas_jiriensis
Strongylocentrotus_pallidus
Thais_clavigera
Marphysa_sanguinea
Lineus_alborostratus
Ciona_intestinalis
Siphonodentalium_lobatum
Xenoturbella_bocki
Spadella_cephaloptera
Homo_sapiens
Octopus_vulgaris
Daphnia_pulex
Myzostoma_seymourcollegiorum
Lampetra_fluviatilis
Biomphalaria_tenagophilaPupa_strigosa
Terebratulina_retusa
Loligo_bleekeri
Flustrellidra_hispida
Escarpia_spicata
Tribolium_castaneum
Aplysia_californica
Platynereis_dumerilii
Sagitta_inflata
Myxine_glutinosa
Asymmetron_inferum
Locusta_migratoria
Dicyema_japonicum
Ornithorhynchus_anatinus
Mytilus_trossulus
Octopus_ocellatus
Dosidicus_gigas
Nautilus_macromphalus
Branchiostoma_floridae
Intoshia_linei
Priapulus_caudatus
Laqueus_rubellus
Clymenella_torquata
Balanoglossus_carnosus
Limulus_polyphemus
Urechis_caupo
Dicyema_sp
Sepia_esculentaSepia_officinalis
Doliolum_nationalis
Decipisagitta_decipiens
Orbinia_latreillii
Trichoplax_adhaerens
Dicyema_misakiense
Acropora_tenuis
Salmo_salar
Bugula_neritina
Nematostella_sp
Sepioteuthis_lessoniana
Aurelia_aurita
Sipunculus_nudus
Figure S1:Phylogram and corresponding cladogram of a Bayesian analysis of our mitochondrial data set omitting the long-branching flatworm species. Phylobayes CAT+G4 model was run in 10 independent runs for 10,000 cycles each on an alignment with 2969 positions and 8000 trees were discarded as burnin.
Figure S1. Related to Figure 2.
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
0.1
Salmo_salar
Dosidicus_gigas
Perionyx_excavatus
Doliolum_nationalis
Sagitta_inflata
Sepia_officinalis
Nematostella_sp
Sepioteuthis_lessoniana
Decipisagitta_decipiens
Amynthas_jiriensis
Acropora_tenuis
Homo_sapiensAsymmetron_inferum
Loligo_bleekeri
Aurelia_aurita
Priapulus_caudatus
Pupa_strigosa
Trichoplax_adhaerens
Clymenella_torquata
Ornithorhynchus_anatinus
Thais_clavigera
Ciona_intestinalis
Xenoturbella_bocki
Sepia_esculenta
Dicyema_misakiense
Sipunculus_nudus
Urechis_caupo
Octopus_vulgaris
Strongylocentrotus_pallidus
Myxine_glutinosa
Branchiostoma_floridae
Biomphalaria_tenagophila
Saccoglossus_kowalevskii
Lampetra_fluviatilis
Dicyema_japonicum
Siphonodentalium_lobatum
Platynereis_dumeriliiMarphysa_sanguinea
Spadella_cephaloptera
Bugula_neritina
Nautilus_macromphalus
Dicyema_sp
Mytilus_trossulus
Myzostoma_seymourcollegiorum
Terebratulina_retusaAplysia_californica
Tribolium_castaneum
Flustrellidra_hispida
Limulus_polyphemus
Octopus_ocellatus
Laqueus_rubellus
Balanoglossus_carnosus
Locusta_migratoriaDaphnia_pulex
Orbinia_latreillii
Lumbricus_terrestris
Lineus_alborostratus
Escarpia_spicata
1
0.990.66
0.970.92
0.81
0.761
0.97
1
0.99
0.57
0.91
0.98
0.71
0.56
0.93
1
0.89 0.99
0.72
0.96
1
0.56
0.59
1
1
0.98
0.95
0.88
0.97
1
0.85
0.8
0.64
0.96
0.91
0.99
1
0.97
0.68
1
1
0.761
1
0.74
0.98
0.1
Salmo_salar
Dosidicus_gigas
Perionyx_excavatus
Doliolum_nationalis
Sagitta_inflata
Sepia_officinalis
Nematostella_sp
Sepioteuthis_lessoniana
Decipisagitta_decipiens
Amynthas_jiriensis
Acropora_tenuis
Homo_sapiensAsymmetron_inferum
Loligo_bleekeri
Aurelia_aurita
Priapulus_caudatus
Pupa_strigosa
Trichoplax_adhaerens
Clymenella_torquata
Ornithorhynchus_anatinus
Thais_clavigera
Ciona_intestinalis
Xenoturbella_bocki
Sepia_esculenta
Dicyema_misakiense
Sipunculus_nudus
Urechis_caupo
Octopus_vulgaris
Strongylocentrotus_pallidus
Myxine_glutinosa
Branchiostoma_floridae
Biomphalaria_tenagophila
Saccoglossus_kowalevskii
Lampetra_fluviatilis
Dicyema_japonicum
Siphonodentalium_lobatum
Platynereis_dumeriliiMarphysa_sanguinea
Spadella_cephaloptera
Bugula_neritina
Nautilus_macromphalus
Dicyema_sp
Mytilus_trossulus
Myzostoma_seymourcollegiorum
Terebratulina_retusaAplysia_californica
Tribolium_castaneum
Flustrellidra_hispida
Limulus_polyphemus
Octopus_ocellatus
Laqueus_rubellus
Balanoglossus_carnosus
Locusta_migratoriaDaphnia_pulex
Orbinia_latreillii
Lumbricus_terrestris
Lineus_alborostratus
Escarpia_spicata
Figure S2:
Phylogram and corresponding cladogram of a Bayesian analysis of our mitochondrial data set omitting Intoshia linei. Phylobayes CAT+G4 model was run in 10 independent runs for 10,000 cycles each on an alignment with 2969 positions and 8000 trees were discarded as burnin.
Figure S2. Related to Figure 2.
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
0.1
Octopus_bimaculoides
Nematostella_vectensis
Maritigrella_crozieri
Pharyngocirrus_tridentiger
Priapulus_caudatusTrichinella_spiralis
Dicyema_japonicum
Bonellia_viridis
Branchiostoma_floridae
Tubulanus_polymorphusMacrostomum_lignano
Schistosoma_mansoni
Magelona_pitelkai
Adineta_ricciae
Homo_sapiens
Amphimedon_queenslandica
Saccoglossus_kowalevskii
Cephalothrix_linearis
Stenostomum_sp
Cerebratulus_sp
Dendrocoelum_lacteum
Helobdella_robusta
Sycon_coactum
Hydra_vulgaris
Mesodasys_laticaudatus
Dicyema_spLottia_gigantea
Trichoplax_adhaerens
Strongylocentrotus_purpuratus
Monocelis_sp
Sabella_pavonina
Alvinella_pompejana
Macrodasys_sp
Lymnea_stagnalis
Rhabdopleura_spPatiria_miniata
Capitella_tellata
Golfingia_vulgaris
Neomenia_megatrapezata
Tribolium_castaneum
Macracanthorhynchus_hirudinaceusPediculus_humanusDaphnia_pulex
0.93
0.94
1
0.9
0.96
0.980.98
1
0.97
1
1
0.79
0.87
1
1
0.98
1
1
0.98
1
1
1
1
0.53
0.98
10.99
0.84
0.82
1
1
1
1
0.51
1
0.1
Octopus_bimaculoides
Nematostella_vectensis
Maritigrella_crozieri
Pharyngocirrus_tridentiger
Priapulus_caudatusTrichinella_spiralis
Dicyema_japonicum
Bonellia_viridis
Branchiostoma_floridae
Tubulanus_polymorphusMacrostomum_lignano
Schistosoma_mansoni
Magelona_pitelkai
Adineta_ricciae
Homo_sapiens
Amphimedon_queenslandica
Saccoglossus_kowalevskii
Cephalothrix_linearis
Stenostomum_sp
Cerebratulus_sp
Dendrocoelum_lacteum
Helobdella_robusta
Sycon_coactum
Hydra_vulgaris
Mesodasys_laticaudatus
Dicyema_spLottia_gigantea
Trichoplax_adhaerens
Strongylocentrotus_purpuratus
Monocelis_sp
Sabella_pavonina
Alvinella_pompejana
Macrodasys_sp
Lymnea_stagnalis
Rhabdopleura_spPatiria_miniata
Capitella_tellata
Golfingia_vulgaris
Neomenia_megatrapezata
Tribolium_castaneum
Macracanthorhynchus_hirudinaceusPediculus_humanusDaphnia_pulex
Figure S3:
A phylogram based on our analysis of the jackknifed dataset omitting Intoshia linei. Contrary to the improvement in placing I. linei observed when excluding the Dicyema species, the exclusion of I. linei does not lead to a better resolution of the Dicyema species’ position. This can be seen as further evidence for the non-affiliation of orthonectids and dicyemids and the correct inference that orthonectids are part of Annelida.
Figure S3. Related to Figure 3.
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint
0.1
Strongylocentrotus_purpuratusPatiria_miniata
Adineta_ricciae
Macrodasys_sp
Macrostomum_lignano
Pharyngocirrus_tridentiger
Daphnia_pulex
Branchiostoma_floridae
Golfingia_vulgaris
Lottia_gigantea
Hydra_vulgaris
Lymnea_stagnalis
Neomenia_megatrapezata
Macracanthorhynchus_hirudinaceus
Bonellia_viridis
Stenostomum_sp
Schistosoma_mansoni
Homo_sapiens
Sycon_coactum
Magelona_pitelkai
Cerebratulus_sp
Dicyema_sp
Capitella_tellata
Alvinella_pompejana
Nematostella_vectensis
Tribolium_castaneum
Mesodasys_laticaudatus
Lepidodermella_squamata
Sabella_pavonina
Rhabdopleura_sp
Dicyema_japonicum
Trichinella_spiralis
Intoshia_linei
Dendrocoelum_lacteum
Trichoplax_adhaerens
Helobdella_robusta
Amphimedon_queenslandica
Saccoglossus_kowalevskii
Tubulanus_polymorphusCephalothrix_linearis
Octopus_bimaculoides
Pediculus_humanus
Maritigrella_crozieriMonocelis_sp
Priapulus_caudatus
0.82
1
1
1
0.93
0.86
0.96
0.99
0.5
1
0.98
1
0.97
1
1
1
1
1
0.92
1
0.98
0.75
0.88
0.96
1
1
0.74
0.6
1
1
0.94
0.990.77
0.98
BA
0.1
Pediculus_humanus
Lymnea_stagnalis
Macrostomum_lignano
Monocelis_sp
Alvinella_pompejana
Tribolium_castaneum
Patiria_miniata
Macracanthorhynchus_hirudinaceus
Tubulanus_polymorphus
Schistosoma_mansoni
Helobdella_robusta
Adineta_ricciae
Strongylocentrotus_purpuratus
Dendrocoelum_lacteum
Homo_sapiens
Hydra_vulgaris
Trichinella_spiralis
Rhabdopleura_sp
Nematostella_vectensis
Capitella_tellata
Priapulus_caudatus
Neomenia_megatrapezata
Maritigrella_crozieri
Magelona_pitelkai
Cephalothrix_linearis
Sycon_coactum
Lepidodermella_squamata
Trichoplax_adhaerens
Dicyema_sp
Amphimedon_queenslandica
Stenostomum_sp
Bonellia_viridis
Macrodasys_sp
Lottia_gigantea
Intoshia_linei
Saccoglossus_kowalevskii
Branchiostoma_floridae
Cerebratulus_sp
Pharyngocirrus_tridentiger
Daphnia_pulex
Mesodasys_laticaudatus
Golfingia_vulgaris
Octopus_bimaculoides
Sabella_pavonina
Dicyema_japonicum
1
1
1
1
1
0.75
1
0.75
1
1
0.75
1
1
1
1
1
1
1
1
1
1
0.58
0.75
1
1
1
0.75
1
1
1
1
0.75
1
1
1
1
1
0.97
1
1
Figure S4: A. Cladogram corresponding to Fig 3a showing all PP support values for the CAT+G4 phylogeny based on the full alignment of 190,027 amino acid positions. B. A cladogram including JP support values based on 50 jackknife subsamples of 30,000 amino acid positions each independently analysed for 2000 cycles under
the CAT+G4 model in phylobayes and summarised with the bpcomp command setting 1800 as burnin. As in the analysis of the full dataset I. linei is found within the annelids and phylum Mesozoa is found as an unnatural assemblage.
C. Cladogram corresponding to Fig 3b showing all support values. D. Cladogram corresponding to Fig 3c showing all support values.
0.1
Sycon_coactum
Tribolium_castaneum
Macracanthorhynchus_hirudinaceus
Macrostomum_lignano
Hydra_vulgaris
Amphimedon_queenslandica
Helobdella_robusta
Pediculus_humanus
Cerebratulus_sp
Lymnea_stagnalis
Nematostella_vectensis
Macrodasys_sp
Pharyngocirrus_tridentiger
Dendrocoelum_lacteum
Priapulus_caudatus
Daphnia_pulex
Monocelis_spSchistosoma_mansoni
Lepidodermella_squamata
Saccoglossus_kowalevskii
Trichinella_spiralis
Stenostomum_sp
Golfingia_vulgaris
Neomenia_megatrapezata
Maritigrella_crozieri
Sabella_pavonina
Magelona_pitelkai
Strongylocentrotus_purpuratus
Cephalothrix_linearis
Octopus_bimaculoides
Alvinella_pompejana
Patiria_miniata
Adineta_ricciae
Trichoplax_adhaerens
Rhabdopleura_sp
Tubulanus_polymorphus
Branchiostoma_floridae
Lottia_gigantea
Capitella_tellata
Mesodasys_laticaudatus
Intoshia_linei
Homo_sapiens
Bonellia_viridis
0.9
0.99
0.81
0.97
0.86
1
0.95
1
1
0.91
1
0.511
1
0.97
0.55
0.72
1
0.65
0.8
1
0.86
0.99
0.921
1
0.99
0.97
1
0.99
1
1
1
1
C
0.1
Dicyema_sp
Alvinella_pompejana
Dicyema_japonicum
Amphimedon_queenslandica
Tubulanus_polymorphus
Tribolium_castaneum
Mesodasys_laticaudatus
Strongylocentrotus_purpuratus
Cerebratulus_sp
Cephalothrix_linearis
Capitella_tellata
Hydra_vulgaris
Rhabdopleura_sp
Patiria_miniata
Macracanthorhynchus_hirudinaceus
Priapulus_caudatus
Pharyngocirrus_tridentiger
Daphnia_pulex
Homo_sapiens
Adineta_ricciae
Monocelis_sp
Helobdella_robusta
Sabella_pavonina
Saccoglossus_kowalevskii
Stenostomum_sp
Sycon_coactum
Branchiostoma_floridae
Golfingia_vulgaris
Neomenia_megatrapezata
Macrostomum_lignano
Schistosoma_mansoni
Intoshia_linei
Trichinella_spiralis
Macrodasys_sp
Octopus_bimaculoidesLymnea_stagnalis
Dendrocoelum_lacteum
Lottia_gigantea
Pediculus_humanus
Bonellia_viridis
Trichoplax_adhaerens
Lepidodermella_squamata
Nematostella_vectensis
Maritigrella_crozieri
Magelona_pitelkai
1
1
0.97
1
1
0.59
0.54
0.68
0.54
1
1
0.96
0.66
0.99
0.67
0.99
1
0.99
1
0.88
1
1
1
0.52
0.97
0.94
0.8
0.99
1
1
0.94
10.82
1
1
0.99
1
1
0.99
1
0.70.43
D
Figure S4. Related to Figure 3.
author/funder. All rights reserved. No reuse allowed without permission. The copyright holder for this preprint (which was not peer-reviewed) is the. https://doi.org/10.1101/235549doi: bioRxiv preprint