
TITLE: The Pattern and Dynamics of Genome and Chromosome Size Variation

Xianran Li,1,* Chengsong Zhu,1,* Zhongwei Lin,1, * Yun Wu,1 Dabao Zhang,2 Weixing Song,1

Guihua Bai,3 Mike Scanlon,4 Min Zhang,2,† and Jianming Yu1,†

1Kansas State University, Manhattan, KS 66506.

2Purdue University, West Lafayette, IN 47907.

3USDA-ARS, Manhattan, KS 66506.

4Cornell University, Ithaca, NY 14853.

*These authors contributed equally to this work.

† To whom correspondence should be addressed. E-mail: [email protected] or [email protected]

ONE-SENTENCE SUMMARY: Systematic analyses of the lengths of 886 chromosomes in

basepair from 68 eukaryotic species with sequenced genomes revealed a striking conserved

boundary of chromosome size variation that prevails across a wide taxonomic range.


ABSTRACT:

We examined genome complexity by coupling knowledge of evolutionary mechanisms and

genome sequence information, which revealed a general increase of genome size, chromosome

size, and variability of chromosome characteristics from prokaryotes to unicellular eukaryotes,

invertebrates, vascular plants, and vertebrates. Using available genome sequence information

from various species, systematic analyses and computer simulations revealed that chromosome

size expansion along the evolutionary trajectory followed a stochastic process but there is an

upper limit to chromosome size variation in many diploid eukaryotic genomes. Despite the

dramatic differences in cellular and organismal complexity, the common pattern of chromosome

size variation in different eukaryotic genomes suggested a conserved constraint for chromosome

size evolution.

Genome sequencing has revealed detailed genome and chromosome information for more than a

hundred species across different phyla (1), making it possible not only to answer questions on

metagenomics of environmental samples and molecular and evolutionary basis of species

formation, but also to ask new questions in biology and evolution (2-5). Although the genome

sizes of eukaryotes vary over five orders of magnitude, the distribution is skewed toward small

values (6). Overall genome size and complexity clearly have increased during evolution from

archaea and bacteria to eukaryota (7), but detailed mechanisms of many competing processes

that either expand or shrink the genome remain to be discovered. Previous efforts to elucidate

genome size variation usually relied on extrapolated estimates and focused on the whole

genome. Now that a relative abundance of completed genome sequences is available, we can

address the evolution and variation of individual chromosomes across species (1).


The transition from circular to linear chromosomes is one prerequisite for increases in individual

chromosome size and chromosome number (8). In eukaryota, DNA repeats increase

chromosome size, as do intron size and gene duplication (7). Changes in chromosome number

reflect the balance between forces that increase chromosome number (such as chromosome

fission, allopolyploidization, or autopolyploidization) and forces that decrease this number (such

as chromosome fusion). Some of these evolutionary events also lead to changes in chromosome

size. Here, we show that the rate of evolution in genome size from prokaryotes to vertebrates is

proportional to the average genome size; that chromosomes within a species do not show

dramatic fluctuations in mobile genetic elements as they expand from unicellular eukaryotes to

vertebrates; and that there is an upper limit to chromosome size variation in eukaryota despite

their plasticity.

We tabulated genome size, chromosome number, individual chromosome size, repeat-masked

chromosome size, and common name groupings for 128 species with sequenced genomes (1).

For all sequenced prokaryotic and diploid eukaryotic species, genome size increased as a result

of changes in chromosome number and chromosome size and from variability in these two

chromosome characteristics from prokaryotes to unicellular eukaryotes, invertebrates, vascular

plants, and vertebrates. Although genome size varies considerably among species with similar

levels of cellular and organismal complexity, a general increase from prokaryotes to unicellular

eukaryotes to multicellular eukaryotes was observed (Fig. 1) (1). In addition, the continuities in

the scale of genome size across different groups of organisms indicate that the primary forces


driving the evolution of genomic architecture are unlikely to be organismal differences in

cell/tissue anatomical structure or metabolism (7).

Using these basepair data for genome size, we further tested whether the rate of genome size

evolution within each group is proportional to the average genome size of each group (6). The rate

of genome size evolution was measured by the standard deviation of genome size within each

group, a statistic similar to, but more general than, the absolute value of contrast within a group.

For the five broadly defined common name groups (i.e., prokaryotes, unicellular eukaryotes,

invertebrates, vascular plants, and vertebrates), the rate of genome size evolution correlated

positively with the average genome size (Fig. 1). After removing the dependency with Log10

transformation, the rate of genome size evolution within each group showed no correlation to the

average genome size. Obviously, groups with a larger average genome size also had a larger

variation in genome size. Our finding with genome size data in basepair confirmed previous

research in which the estimated genome size based on 18S rDNA was examined across 20

eukaryotic clades (6).
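The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group names and genome sizes are hypothetical placeholders, not data from this study, and they are constructed so that every group has the same proportional spread. The sketch shows why the standard deviation tracks the mean on the original scale but not after Log10 transformation.

```python
import math
import statistics

def group_summary(sizes_bp):
    """Return mean and standard deviation on the original and Log10 scales."""
    logs = [math.log10(s) for s in sizes_bp]
    return {
        "mean": statistics.mean(sizes_bp),
        "sd": statistics.stdev(sizes_bp),
        "log_mean": statistics.mean(logs),
        "log_sd": statistics.stdev(logs),
    }

# Hypothetical genome sizes (bp): the same multipliers are applied to a
# different typical size per group, so the proportional spread is identical.
multipliers = [0.5, 0.8, 1.0, 1.5, 2.5]
typical_size = {"group_small": 2e6, "group_medium": 1e8, "group_large": 1e9}
groups = {name: [base * m for m in multipliers]
          for name, base in typical_size.items()}

summaries = {name: group_summary(sizes) for name, sizes in groups.items()}
for name, s in summaries.items():
    print(f"{name}: mean={s['mean']:.3g} sd={s['sd']:.3g} "
          f"sd/mean={s['sd'] / s['mean']:.3f} log10_sd={s['log_sd']:.3f}")
```

In this toy setting the raw standard deviation grows with the group mean, whereas the Log10-scale standard deviation is the same for every group.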

To further examine the role of repeats on genome size and chromosome size, repeat masking of

the genome was obtained from either the original publications of the sequenced genomes or the

repeat masking analysis (1, 9, 10). In general, the repeat proportion of the genome increased

from prokaryotes (mean: 0.04) to unicellular eukaryotes (0.08), invertebrates (0.14), vascular

plants (0.35), and vertebrates (0.38), following the same trend as genome size (r = 0.98; P =

4.6 × 10⁻³) (Fig. 1). For vascular plants, the skewed distribution of the repeat proportion was due


to maize (82.5%) and sorghum (60.9%). Overall, repeat proportion of chromosomes increases

during evolution from prokaryotes to vertebrates.

Following logic similar to that of the genome size analysis, we tested whether the standard deviation of

chromosome size in basepair within each species is proportional to the mean of chromosome size.

Because of the difference in response to repeat accumulation between circular and linear

chromosomes, we considered only eukaryotes with linear chromosomes in this analysis. Clearly,

a significant positive correlation existed between standard deviation of chromosome size and the

average chromosome size of a species (Fig. 2). After removing the magnitude effects with Log10

transformation, however, the standard deviation of chromosome size for all eukaryotic species

was bounded in a much smaller region than the prokaryotic species. Given that 68 diploid

eukaryotic species were used and the strong signal of this relationship (P = 1.4 × 10⁻³⁸) between

standard deviation and average chromosome size, we then derived the regression slope (0.3512)

of standard deviation on average chromosome size across species. This regression slope provided

an ad hoc estimate of a common coefficient of variation (CV = standard deviation/mean) for the

underlying distributions of chromosome sizes in different species. The results suggested that

although large differences existed in average chromosome size and in the standard deviation of

chromosome size across species, the ratio between them

approached a constant. A plot of the coefficient of variation further supported this; some

deviation is expected because the CV calculated for each individual species is a

sample estimate (1).
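As a minimal sketch of how a regression slope can serve as a common CV estimate, the snippet below regresses standard deviation on mean across species. Two assumptions are ours: the line is forced through the origin (the text does not say whether an intercept was fitted), and the per-species values are hypothetical placeholders rather than data from this study.

```python
def common_cv_slope(means, sds):
    """Least-squares slope of SD on mean for a line through the origin."""
    sxy = sum(m * s for m, s in zip(means, sds))
    sxx = sum(m * m for m in means)
    return sxy / sxx

# Hypothetical per-species summaries in bp (not values from the study).
mean_sizes = [1.2e6, 2.5e7, 6.0e7, 1.3e8]
sd_sizes = [0.40e6, 0.90e7, 2.0e7, 4.7e7]

slope = common_cv_slope(mean_sizes, sd_sizes)
per_species_cv = [s / m for m, s in zip(mean_sizes, sd_sizes)]
print(f"slope (common CV estimate): {slope:.4f}")
print("per-species CV:", [round(cv, 3) for cv in per_species_cv])
```

With a through-origin fit, the slope is a weighted average of the per-species CVs, with weights proportional to the squared mean, so species with large chromosomes dominate the estimate.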


Similar to the findings in chromosome size, the standard deviation of non-repeat size was

proportional to the average non-repeat size in original scale and so was the standard deviation of

repeat size to the average repeat size of a species. Although the mechanisms whereby non-repeat

and repeat sequences expanded in eukaryotic genomes are complicated (10), our results

suggested that the rate of expansion among chromosomes is proportional to the preceding

chromosome size, indicating a stochastic process (Fig. 2). Previous estimations of repeat

proportions of the genomes have been either species-specific or based on extrapolation for a

smaller number of species (7, 10) than the current study. Our general approach to examining

repeat evolution across species with genome sequence data lays the groundwork for detailed

studies to follow the evolution of different classes of repeats and their composition among

chromosomes, genomes, and taxonomic groups.

Next, we examined the chromosome size variation in eukaryotes in detail because the available

data on chromosome length across the sequenced genomes permitted systematic modeling of

chromosome size globally. In addition to the common CV of chromosome size in eukaryotes, we

noted that basepair sizes of the chromosomes within individual species usually have the same

order of magnitude, inspiring further investigation of chromosome size variation. Two

transformations made the modeling process statistically possible and biologically sound: relative

chromosome size, obtained by dividing the chromosome size in basepair by the average

chromosome size of the individual species; and chromosome index, obtained by dividing the

ascending rank of the chromosome by the total chromosome number of that particular

species (1). Using average chromosome size as the unit of measure standardized the original

chromosome size in basepair in different orders of magnitude for different species into


comparable numbers. Chromosome index is bounded between 0 and 1, permitting the modeling

of chromosome size across species with different chromosome numbers. Amazingly, the plot of

chromosome size against chromosome index revealed a clear pattern and strongly suggested a

common curve similar to a cubic function: the incremental change in chromosome size is larger at

both ends of the curve but smaller in the middle (Fig. 3) (1).
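The two transformations can be illustrated with a short Python sketch. The function name and the chromosome lengths are hypothetical, chosen only for illustration; the definitions follow the text (relative size = size divided by the species average; index = ascending rank divided by chromosome number).

```python
def relative_size_and_index(chrom_sizes_bp):
    """Return (chromosome index, relative chromosome size) pairs for one species."""
    ordered = sorted(chrom_sizes_bp)      # ascending rank 1..n
    n = len(ordered)
    mean_size = sum(ordered) / n
    return [(rank / n, size / mean_size)
            for rank, size in enumerate(ordered, start=1)]

# Hypothetical chromosome lengths in bp for one species (not study data).
example_species = [23_500_000, 30_400_000, 18_600_000, 26_900_000, 19_700_000]
pairs = relative_size_and_index(example_species)
for index, rel in pairs:
    print(f"index={index:.2f}  relative size={rel:.3f}")
```

By construction the relative sizes average to one within a species and the index of the largest chromosome is one, which is what makes species with different chromosome numbers and genome sizes directly comparable.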

Further investigation into the potential distribution from which the chromosome sizes (samples)

were drawn suggested that a Gamma distribution is a more plausible candidate than other

distributions (Fig. 3) (1). Gamma distribution is widely used in engineering and science to

model continuous variables that are non-negative but have right-skewed probability densities

(11). A Gamma distribution approximated a histogram of all chromosome sizes (with a mean of

1 and skewness of 1.0046) better than a Normal distribution. Histograms generated from data of

individual species and from the pooled data of species with the same total number of

chromosomes corroborated this finding. We then theoretically derived the approximate

relationship function between chromosome size and chromosome index as an inverse of Gamma

cumulative distribution function, $G^{-1}(\alpha, 1/\alpha)$, where $\alpha$ is the parameter (1). Because no closed form

exists for this nonlinear function, we used an iterative procedure (iteratively reweighted least

squares) that minimizes the influence of variance heterogeneity to obtain the parameter estimate

$\hat{\alpha} = 7.0438$, with a 95% confidence interval for $\hat{\alpha}$ of (6.6609, 7.4267). Model fitting statistics

indicated a better fit with the Gamma distribution than either other distributions or the intuitive

cubic function (1). Notice that the CV of Gamma (7.0438, 1/7.0438) is 0.3768, close to the previous ad hoc CV

estimate 0.3512 obtained through simple regression analysis. Based on Gamma (7.0438, 1/7.0438), 95% of the


chromosomes in a species are expected to have a basepair length between 0.4035 and 1.8626 times

the average chromosome length, an interval for chromosomes in diploid eukaryotic species.
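The CV quoted for the fitted distribution follows directly from the standard moments of a Gamma distribution with shape $\alpha$ and scale $1/\alpha$. The short derivation below uses $\alpha$ as our label for the fitted parameter (7.0438) and is a worked check rather than part of the original analysis:

```latex
\begin{aligned}
E[X] &= \alpha \cdot \tfrac{1}{\alpha} = 1, \\
\operatorname{Var}[X] &= \alpha \cdot \tfrac{1}{\alpha^{2}} = \tfrac{1}{\alpha}
  = \tfrac{1}{7.0438} \approx 0.1420, \\
\mathrm{CV} &= \frac{\sqrt{\operatorname{Var}[X]}}{E[X]} = \frac{1}{\sqrt{\alpha}}
  = \frac{1}{\sqrt{7.0438}} \approx 0.3768 .
\end{aligned}
```

Because the mean is fixed at one by the relative-size transformation, the single parameter controls only the spread, and a larger value corresponds to more uniform chromosome sizes.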

Based on this discovery, the chromosome sizes for a given species can be predicted by its

chromosome number. Furthermore, given either genome size or average chromosome basepair

length (genome size = average chromosome size × total chromosome number), we can predict

the size range of all chromosomes of that species in basepair (Fig. 2). Chromosome size

proportion was obtained by dividing relative chromosome size by chromosome number; the sum of

chromosome size proportions equaled one. For example, for a species with 15 chromosomes, the

shortest chromosome was expected to account for 2.87% of the genome and the longest

chromosome for 11.99% of the genome. The predicted ratio of the longest to the shortest

chromosome for a given species was 1.68 for a species with 2 chromosomes and 5.70 for a

species with 38 chromosomes. We used this general prediction to confirm the cases where

exceptions occurred for a few outlier species (12) with known reasons (1).
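The prediction described above can be sketched with the Python standard library alone. The sketch assumes the prediction rule given in the Fig. 3 legend (the inverse Gamma cumulative distribution function evaluated at (j - 0.5)/n for the j-th ordered chromosome of n) and the fitted value 7.0438. The series-plus-bisection implementation of the inverse function and all function names are ours, not the authors' procedure, and no extra normalization is applied, so the predicted shares sum only approximately to one.

```python
import math

ALPHA = 7.0438  # fitted Gamma parameter reported in the text

def gamma_cdf(x, shape):
    """Regularized lower incomplete gamma P(shape, x), via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    k = 1
    while term > total * 1e-16:
        term *= x / (shape + k)
        total += term
        k += 1
    return total * math.exp(shape * math.log(x) - x - math.lgamma(shape))

def gamma_quantile(p, shape, scale):
    """Inverse Gamma CDF by bisection, for p strictly between 0 and 1."""
    lo, hi = 0.0, shape + 20.0 * math.sqrt(shape) + 20.0  # bracket on scale-1 axis
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * scale

def predicted_proportions(n_chrom, shape=ALPHA):
    """Predicted genome share of each ordered chromosome (ascending)."""
    rel = [gamma_quantile((j - 0.5) / n_chrom, shape, 1.0 / shape)
           for j in range(1, n_chrom + 1)]
    return [r / n_chrom for r in rel]

props = predicted_proportions(15)
print(f"15 chromosomes: shortest={props[0]:.4f} longest={props[-1]:.4f} "
      f"sum={sum(props):.4f}")
two = predicted_proportions(2)
print(f"2 chromosomes: longest/shortest={two[-1] / two[0]:.2f}")
```

The printed values should land close to the figures quoted in the text; small differences are possible because the exact normalization used in the original calculation is not stated here.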

To show the robustness of the prediction and ensure that we had used an adequate number of

genomes (68 diploid eukaryotic genomes), a series of cross-validation experiments was

performed using different proportions of the observed data for function derivation and the rest of

the data for validation. The plots of mean square of prediction error and parameter estimate

indicated that the original sample size was large enough to derive a robust prediction function (1).

In addition, simulation results showed that the numbers drawn from the Gamma distribution with

the identified parameter Gamma (7.0438, 1/7.0438) can reproduce the pattern of the observed


data, indicating that the Gamma distribution adequately describes the chromosome size variation

observed (1).
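A simple version of such a simulation can be written with Python's standard `random.gammavariate`, which takes a shape and a scale argument. This is a sketch under our own assumptions (seed, sample size, and the order-statistic quantile helper are arbitrary choices), not the simulation code used in the study.

```python
import random

ALPHA = 7.0438                    # shape reported in the text; scale is 1/ALPHA
rng = random.Random(2010)         # fixed seed so the sketch is repeatable
draws = sorted(rng.gammavariate(ALPHA, 1.0 / ALPHA) for _ in range(200_000))

def empirical_quantile(sorted_values, p):
    """Simple order-statistic quantile of an ascending list."""
    idx = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[idx]

mean = sum(draws) / len(draws)
sd = (sum((x - mean) ** 2 for x in draws) / (len(draws) - 1)) ** 0.5
lower = empirical_quantile(draws, 0.025)
upper = empirical_quantile(draws, 0.975)
print(f"mean={mean:.3f}  CV={sd / mean:.3f}  95% range=({lower:.3f}, {upper:.3f})")
```

The simulated mean, CV, and central 95% range can then be compared with the values reported for the fitted distribution (mean 1.0, CV 0.3768, and the interval 0.4035 to 1.8626).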

Many evolutionary alterations affect chromosome number and chromosome size, including

reciprocal translocations, deletions and insertions, unequal crossover, dispersion of repetitive

sequences, and chromosome fusion and fission (8). Among these factors, reciprocal

translocations have been considered a major force in shaping chromosome size variation (8, 13-

16). To test whether this is the case, as suggested in previous evolutionary modeling studies (13-15, 17),

we ran computer simulations to determine whether a similar pattern of chromosome size

variation can be generated (Fig. 4) (1). To our surprise, simulated chromosome sizes based on

the reciprocal translocation model showed a much greater variation than what we observed in

these sequenced genomes. Our results suggested that a more comprehensive modeling approach

that considers other evolutionary alterations besides reciprocal translocation may resolve this

discrepancy (8, 18). Unlike previous works in which modeling was conducted for individual

species and only a few species were involved, the current study with empirical data analyses and

computer simulations established a benchmark for future evolutionary modeling research in

chromosome size.
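To make the kind of simulation discussed above concrete, the following Python sketch applies random reciprocal translocations to a set of chromosome lengths. The model details are our assumptions and are not taken from the paper or its supporting material: two chromosomes are chosen at random, each is broken at a uniformly random position, and the distal segments are exchanged; an event is rejected if a product falls outside optional lower or upper size thresholds (loosely mirroring the four settings in Fig. 4). The starting genome, event count, and thresholds are hypothetical.

```python
import random

def simulate_translocations(sizes, n_events, rng, lower=None, upper=None):
    """Apply random reciprocal translocations to a list of chromosome lengths.

    Assumed model (not from the paper): pick two chromosomes, break each at a
    uniform random position, and swap the distal segments. An event is skipped
    if either product violates the optional size thresholds.
    """
    sizes = list(sizes)                       # work on a copy
    for _ in range(n_events):
        i, j = rng.sample(range(len(sizes)), 2)
        cut_i = rng.randint(0, sizes[i])
        cut_j = rng.randint(0, sizes[j])
        new_i = cut_i + (sizes[j] - cut_j)
        new_j = cut_j + (sizes[i] - cut_i)
        if lower is not None and min(new_i, new_j) < lower:
            continue
        if upper is not None and max(new_i, new_j) > upper:
            continue
        sizes[i], sizes[j] = new_i, new_j
    return sizes

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return var ** 0.5 / mean

rng = random.Random(4)
start = [1000] * 20                           # hypothetical genome, 20 equal chromosomes
free = simulate_translocations(start, 5000, rng)
bounded = simulate_translocations(start, 5000, rng, lower=400, upper=1900)
print(f"CV without constraints: {coefficient_of_variation(free):.3f}")
print(f"CV with both thresholds: {coefficient_of_variation(bounded):.3f}")
```

Each event conserves total genome length and chromosome number, so only the distribution of sizes changes; the resulting CV can be compared with the value of about 0.38 implied by the fitted Gamma distribution.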

Genome and chromosome complexity have been addressed from different perspectives including

population genetics and evolution (6, 7), molecular biology and cytogenetics (8), and

evolutionary modeling (13, 19). Here we systematically studied the dynamics of genome and

chromosome size variation. Using a combination of bioinformatics and statistics approaches and

available genome sequences across the evolutionary spectrum, we examined genome size


evolution, repeat size evolution, chromosome size variation, and evolutionary modeling. We

found that chromosome size tends to center around the average chromosome length within a

species for most diploid eukaryotes and chromosome size variation across species can be

adequately modeled with a Gamma distribution. Our findings are in agreement with the long-standing

karyotyping practice in which chromosomes are usually displayed in descending order of size (13) if the

higher-order structures of linear DNA sequence do not lead to a different pattern from chromatin

size in basepair (5). In a cell cycle, the synchrony of chromosome separation must be precisely

controlled to correctly separate homologous chromosomes or sister chromatids. Although the

exact mechanism of such synchrony is not clear, chromosome size variation as a basic feature of

chromosome architecture deserves more attention. Uniform chromosome length may facilitate

the cell achieving a more synchronized DNA replication time with the same number of

replication forks, chromosome configuration on the equatorial plate, and finally correct migration of

homologous chromosomes or sister chromatids to opposite poles (5, 20).

Thus, an upper limit to chromosome size variation may provide better evolutionary fitness in

overall energy savings, given the number of cells and the mitosis events in an organism, because

ATP molecules are required to sustain chromosome velocity (21). Temporal control of

kinetochore–microtubule dynamics may be a mechanism for maintaining genome stability (22, 23).

Depolymerization of kinetochore microtubules (MTs) may partly power chromosome movement

during mitosis (24). Under normal conditions, chromosome velocity is about the same in

anaphase for chromosomes of different sizes in a single cell (21, 25). Large variations in

chromosome length may decrease the evolutionary fitness of an organism, because overly long

chromosomes would delay the separation of homologous chromosomes and sister chromatids


during mitosis and meiosis, resulting in cell cycle prolongation, sterility, or even death (8).

Moreover, meiotic recombination was experimentally demonstrated to depend on chromosome

size in Saccharomyces cerevisiae (26) and in humans (27). Therefore, chromosome size variation

is a vital factor in cell biology and evolution.

To date, genome sequences of polyploid species have not been reported. After resolving the

assembly hurdle, further sequencing of polyploid genomes would allow us to extend this

hypothesis beyond the diploid genomes. Many current diploid species may have undergone a

process of polyploidization and diploidization. Detailed examination of available genomes may

also reveal the evolutionary significance of ancient genome duplications (28). In addition, the

locations of chromosome centromeres have been studied in only a few species (29). Interestingly,

although the chromosome segregation machinery is highly conserved across all eukaryotes, the

DNA and protein components specific to centromeric chromatin evolve rapidly, which hampers

the identification of centromeres in non-model species. Once the positions of centromeres have

been identified in a wide range of species, further study of length variation of the chromosome

arm may allow us to understand both the fine control and variation in chromosome segregation

machinery.


References and Notes

1. Materials and methods are available as supporting material on Science Online.

2. D. C. Presgraves, Nat Rev Genet (Jan 6, 2010).

3. S. G. Tringe, E. M. Rubin, Nat Rev Genet 6, 805 (Nov, 2005).

4. M. L. Metzker, Nat Rev Genet 11, 31 (Jan, 2010).

5. T. Misteli, Cell 128, 787 (Feb 23, 2007).

6. M. J. Oliver, D. Petrov, D. Ackerly, P. Falkowski, O. M. Schofield, Genome Res 17, 594

(May, 2007).

7. M. Lynch, J. S. Conery, Science 302, 1401 (Nov 21, 2003).

8. I. Schubert, Curr Opin Plant Biol 10, 109 (Apr, 2007).

9. A. F. A. Smit, R. Hubley, P. Green. (http://www.repeatmasker.org, verified on Feb 28,

2010).

10. E. Lerat, Heredity (Nov 25, 2009).

11. O. Schabenberger, F. J. Pierce, Contemporary Statistical Models for the Plant and Soil

Sciences (CRC Press, Boca Raton, FL, 2002), pp. 304-312.

12. W. C. Warren et al., Nature 453, 175 (May 8, 2008).

13. D. Sankoff, V. Ferretti, Genome Res 6, 1 (Jan, 1996).

14. A. De, M. Ferguson, S. Sindi, R. Durrett, J. Appl. Prob. 38, 324 (2001).

15. M. Mazowita, L. Haque, D. Sankoff, J Comput Biol 13, 554 (Mar, 2006).

16. W. A. Bickmore, P. Teague, Chromosome Res 10, 707 (2002).

17. H. T. Imai, Y. Satta, N. Takahata, J Theor Biol 210, 475 (Jun 21, 2001).

18. The Chimpanzee Sequencing and Analysis Consortium. Nature 437, 69 (Sep 1, 2005).

19. J. Ma et al., Proc Natl Acad Sci U S A 105, 14254 (Sep 23, 2008).


20. D. J. Sharp, G. C. Rogers, J. M. Scholey, Nature 407, 41 (Sep 7, 2000).

21. R. B. Nicklas, J Cell Biol 25, SUPPL:119 (Apr, 1965).

22. S. F. Bakhoum, G. Genovese, D. A. Compton, Curr Biol 19, 1937 (Dec 1, 2009).

23. S. F. Bakhoum, S. L. Thompson, A. L. Manning, D. A. Compton, Nat Cell Biol 11, 27

(Jan, 2009).

24. M. I. Molodtsov, E. L. Grishchuk, A. K. Efremov, J. R. McIntosh, F. I. Ataullakhanov,

Proc Natl Acad Sci U S A 102, 4353 (Mar 22, 2005).

25. A. Raj, C. S. Peskin, Proc Natl Acad Sci U S A 103, 5349 (Apr 4, 2006).

26. D. B. Kaback, V. Guacci, D. Barber, J. W. Mahon, Science 256, 228 (Apr 10, 1992).

27. E. S. Lander et al., Nature 409, 860 (Feb 15, 2001).

28. Y. Van de Peer, S. Maere, A. Meyer, Nat Rev Genet 10, 725 (Oct, 2009).

29. S. Henikoff, K. Ahmad, H. S. Malik, Science 293, 1098 (Aug 10, 2001).

30. This work is supported by the National Science Foundation (DBI-0820610; IIS-0844945),

the National Research Initiative (NRI) Plant Genome Program of the USDA Cooperative

State Research, Education and Extension Service (CSREES) (2006-35300-17155), the

National Institutes of Health (NIH/NCI U01-CA128535-01), the Department of Defense

(W81XWH-08-1-0065), the Targeted Excellence Program of Kansas State University,

and the Purdue University Discovery Park Seed Grant.

Supporting Online Material

www.sciencemag.org

Materials and Methods


Figs. S1, S2, S3, S4, S5, S6

Tables S1, S2


LIST OF FIGURES

Fig. 1. (A) Genome size in Mb of sequenced prokaryotes, unicellular eukaryotes, invertebrates,

vascular plants, and vertebrates. (B) Boxplot of genome size in Log10. The F test for genome size

in Log10 scale among groups is highly significant (P = 2.3 × 10⁻⁵⁷) and all pairwise group

comparisons are significant. (C) The rate of genome size evolution as measured by standard

deviation of genome size within each group positively correlates with genome size (r = 0.98; P =

4.6 × 10⁻³). Values are in Log10 scale for plotting. (D) After the dependency of evolutionary rate

variance on preceding genome size is removed with Log10 transformation, the rate of genome

size evolution within the groups shows no correlation (r = -0.05; P = 0.93) with genome size. (E)

Boxplot of the repeat proportions of genomes. The overall F test for repeat proportions among

groups is highly significant (P = 3.0 × 10⁻²⁶), and all pairwise group comparisons are significant

except prokaryotes-unicellular eukaryotes and vascular plants-vertebrates.

Fig. 2. (A) The chromosome size variation as measured by standard deviation of chromosome

size within species correlates positively with average chromosome size (r = 0.96, P = 1.4 × 10⁻³⁸).

Values are in Log10 scale for plotting. Estimate of a common coefficient of variation in original

scale is 0.3512. (B) The absolute nonrepeat size variation (r = 0.97, P = 1.8 × 10⁻⁴¹). (C) The

absolute repeat size variation (r = 0.94, P = 2.8 × 10⁻³²). (D) After the dependency of absolute

chromosome size variation on preceding chromosome size is removed with Log10 transformation,

chromosome size variation within species shows no correlation (r = -0.10, P = 0.43) with

average chromosome size. (E) Prior Log10 transformed nonrepeat size variation (r = -0.11, P =

0.37). (F) Prior Log10 transformed repeat size variation (r = 0.02; P = 0.89). Prokaryotic


chromosomes are not included in the correlation calculation. Each color-coded dot represents the

value for individual species.

Fig. 3. (A) Model fitting of chromosome size on chromosome index across 885 chromosomes

from 68 diploid eukaryotic species. The blue dotted line is the fitted cubic function and the red

line is the fitted inverse of Gamma cumulative distribution function

$\hat{Z}_{i(j)} = G^{-1}_{\hat{\alpha}}\left(\frac{j - 0.5}{n_i}\right) / \hat{\alpha} = G^{-1}_{7.0438}\left(\frac{j - 0.5}{n_i}\right) / 7.0438$,

where $\hat{Z}_{i(j)}$ is the predicted chromosome size for the

j-th ordered chromosome of a species i with a total of $n_i$ chromosomes; and $G^{-1}_{\hat{\alpha}}$ is the inverse of

Gamma cumulative distribution function with parameter $\hat{\alpha}$. (B) Histogram of chromosome size

distribution with the overlaid probability density functions of Gamma (7.0438, 1/7.0438) and

Normal (1.0000, 0.1371). The histogram has a mean of 1.0 and a skewness of 1.0046. Gray bars

represent approximately 95% of the chromosome size between 0.3851 and 1.8608, and black

bars represent the remaining 5% on both ends. Gamma (7.0438, 1/7.0438) has a mean of 1.0

and a variance of 0.1420. Of the chromosome size from Gamma (7.0438, 1/7.0438), 95% lies

between 0.4035 and 1.8626. (C) Predicted chromosome size proportion versus observed

chromosome size proportion. (D) Predicted chromosome size proportion for a species with a

given number of chromosomes. Predictions are plotted for the low hinge, median, and high hinge

of the boxplot of each common name group: unicellular eukaryotes, invertebrates, vascular

plants, and vertebrates.

Fig. 4. Simulation using the reciprocal translocation model to test whether it partly explains

observed (red line) chromosome size variations. (A) No constraints on chromosome size. (B) A

lower threshold. (C) An upper threshold. (D) Both lower and upper thresholds. Chromosome size


values are not expected to form a single line because the reciprocal translocation model predicts

the chromosome sizes independently for different total numbers of chromosomes.


Supporting online material

Materials and Methods

Genomes and chromosomes

Genome and chromosome data of 128 genomes (68 eukaryotes and 60 prokaryotes) with

multiple chromosomes were obtained from different databases including GenBank, Ensembl,

JGI, and Phytozome or individual species’ genome databases (Supplementary Table 1).

Sequences unanchored to chromosomes were not included in tabulating the basepair length. For

species with more than one strain sequenced, we randomly selected one strain to represent the

species. Chromosome sizes within each species were listed in ascending order in basepair units.

Common name groups were assigned using the literature and database information. Accession

number or version of genome assembly was provided. The sex chromosomes of 14 species were

excluded from the analysis because of their unique evolutionary processes (1). For species

without pre-masked genome sequence information, we identified the repetitive sequences with

RepeatMasker 3.2.8 using the library identified by RepeatScout 1.0.5 to mask the repetitive

regions (2). Because our focus was to obtain the general pattern of repeat proportion of the

genomes and chromosomes, not exact values for a certain species, we chose this more

extensively used library-based program (3). Repeat and nonrepeat regions of chromosomes were

obtained after the masking process (Supplementary Table 1).

The common theme of the current study is to examine genome size and chromosome size across

different species. Variations of genome size increased as average genome size increased across

different common name groups: prokaryotes, unicellular eukaryotes, invertebrates, vascular

plants, and vertebrates. For chromosome size in diploid eukaryotes, it was further demonstrated


that the standard deviation of chromosome size increased as the average chromosome size

increased and a common coefficient of variation existed. Further model fitting and computer

simulations indicated that the common distribution of chromosome size variation can be modeled

with a Gamma distribution (Supplementary Fig. 1).

Data analysis and statistical modeling

Data of genome size and chromosome size were analyzed with SAS and R, following the

standard procedures of correlation, regression, and plotting (Fig. 1 and Fig. 2; Supplementary

Fig. 2-3). Because circular chromosomes in prokaryotes have different mechanisms for

replication and separation in cell cycles (4), we focused only on eukaryotes with linear

chromosomes. We used two approaches to conduct statistical modeling of chromosome size

variation. In the first approach, an intuitive cubic function was fit to capture the relationship

between chromosome size and chromosome index. The chromosome size was calculated as the ratio of the basepair length of a chromosome to the average basepair length of the chromosomes of the species,

$$Z_{i(j)} = L_{i(j)} / \bar{L}_i,$$

where $L_{i(j)}$ is the basepair length of the j-th chromosome of species i; $\bar{L}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} L_{i(j)}$; $n_i$ is the total chromosome number; and i = 1, 2, · · · , n species. The chromosome index was calculated as $(j - 0.5)/n_i$. The fitted function was

$$\hat{Z}_{i(j)} = 0.3920 + 2.2890 \left( \frac{j - 0.5}{n_i} \right) - 3.9141 \left( \frac{j - 0.5}{n_i} \right)^2 + 3.0753 \left( \frac{j - 0.5}{n_i} \right)^3,$$

where $\hat{Z}_{i(j)}$ is the predicted chromosome size for the j-th chromosome of species i and $n_i$ is the total chromosome number. Subtracting 0.5 in the chromosome index was justified because we used a continuous distribution to model the discrete chromosome number; this is a standard practice.
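As an illustration only, the cubic function can be evaluated in a few lines of Python. The function names below are hypothetical and are not part of the original analysis, which used SAS and R; the coefficients are those reported in the text, and the example lengths are the three Bigelowiella natans chromosomes listed in Table S1.

```python
# Minimal sketch (not the authors' code): evaluate the fitted cubic function.
# Function names are illustrative assumptions; coefficients come from the text.

CUBIC_COEFFS = (0.3920, 2.2890, -3.9141, 3.0753)


def chromosome_index(j, n_i):
    """Return (j - 0.5) / n_i for the j-th smallest of n_i chromosomes."""
    if not 1 <= j <= n_i:
        raise ValueError("j must satisfy 1 <= j <= n_i")
    return (j - 0.5) / n_i


def predicted_size(j, n_i, coeffs=CUBIC_COEFFS):
    """Predicted relative chromosome size from the cubic function."""
    p = chromosome_index(j, n_i)
    b0, b1, b2, b3 = coeffs
    return b0 + b1 * p + b2 * p ** 2 + b3 * p ** 3


def observed_sizes(lengths_bp):
    """Observed relative sizes: each length divided by the species mean."""
    ordered = sorted(lengths_bp)
    mean_length = sum(ordered) / len(ordered)
    return [length / mean_length for length in ordered]


if __name__ == "__main__":
    lengths = [98136, 134144, 140590]  # Bigelowiella natans, Table S1
    for j, z in enumerate(observed_sizes(lengths), start=1):
        print(j, round(z, 3), round(predicted_size(j, len(lengths)), 3))
```

Because every observed size is divided by the species mean, the observed values average to 1 within each species, which is what allows species with very different genome sizes to share one curve.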


The second approach was more systematic and aimed to model chromosome size variation from

statistical distributions. An iteratively reweighted least squares method was used to derive the

parameter estimates. Four distributions commonly used in biology were considered: Gamma

distribution, Normal distribution, Truncated Normal distribution (truncation at zero), and

Lognormal distribution. The Gamma distribution was chosen for four reasons. First, the $Z_{i(j)}$ were all

non-negative. Second, the histogram of $Z_{i(j)}$ was right-skewed, a shape that can be modeled by a Gamma

distribution. Third, unlike the Lognormal distribution, the Gamma distribution is a member of the

exponential family and permits a generalized linear model (5). Fourth, model fitting showed that

the Gamma distribution had the best model fit. Model fitting statistics were calculated for mean

square error (MSE), R2, and Akaike's information criterion (AIC) (Supplementary Table 2).

$$MSE = \sum_{k=1}^{n} (Z_k - \hat{Z}_k)^2 / (n - p),$$

where $Z_k$ is the k-th observed data point; $\hat{Z}_k$ is the predicted value; k = 1, … , n; n = 886 chromosomes; and p is the number of parameters in the model. The original definition of R2 was used, i.e., $R^2 = 1 - SSE/SST$, where $SSE = \sum_{k=1}^{n} (Z_k - \hat{Z}_k)^2$ and $SST = \sum_{k=1}^{n} (Z_k - 1.0)^2$; and $AIC = n \ln(SSE) - n \ln(n) + 2p$.
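The three fit statistics can be computed directly from observed and predicted values. The sketch below is only an illustration of the definitions given here; the function name `fit_statistics` and the toy numbers in the usage note are assumptions, not the authors' code or data.

```python
import math


def fit_statistics(observed, predicted, n_params):
    """Return MSE, R2 and AIC as defined in the text.

    SST is taken around 1.0 because relative chromosome sizes have mean 1.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    n = len(observed)
    sse = sum((z - z_hat) ** 2 for z, z_hat in zip(observed, predicted))
    sst = sum((z - 1.0) ** 2 for z in observed)
    return {
        "MSE": sse / (n - n_params),
        "R2": 1.0 - sse / sst,
        "AIC": n * math.log(sse) - n * math.log(n) + 2 * n_params,
    }
```

For example, `fit_statistics([0.5, 1.0, 1.5], [0.6, 1.0, 1.4], 1)` evaluates all three statistics for a three-point toy data set and a one-parameter model.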

While it is not possible to prove statistically that chromosome size must follow a Gamma

distribution, our analysis showed that it is the best candidate among the distributions examined. We

present the modeling steps for the Gamma distribution in the following section; similar steps were

derived for the three other distributions (details available upon request).

1) Conjectures:


C1. For each species, the chromosome lengths follow a Gamma distribution. That is, let $X_{ij}$ be the length of the j-th chromosome of species i, then $X_{ij} \overset{ind}{\sim} \mathrm{Gamma}(\alpha_i, \beta_i)$, where $\alpha_i$ is the shape parameter and $\beta_i$ is the scale parameter, j = 1, 2, · · · , $n_i$.

C2. The Gamma distributions for different species share the same coefficient of variation (CV). That is, $CV_i \equiv CV$ for all species i.

2) Statistics:

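S1. Let $X_{ij} \overset{ind}{\sim} \mathrm{Gamma}(\alpha_i, \beta_i)$ with $\alpha_i > 0$ and $\beta_i > 0$. Then $E[X_{ij}] = \alpha_i \beta_i$, $\mathrm{var}(X_{ij}) = \alpha_i \beta_i^2$,

$$CV_i = \frac{\sqrt{\mathrm{var}(X_{ij})}}{E[X_{ij}]} = \frac{\sqrt{\alpha_i \beta_i^2}}{\alpha_i \beta_i} = \frac{1}{\sqrt{\alpha_i}},$$

and it has the probability density function:

$$f(x) = \frac{1}{\Gamma(\alpha_i) \beta_i^{\alpha_i}} x^{\alpha_i - 1} e^{-x/\beta_i}, \quad x > 0.$$

S2. Let $X_{ij} \overset{ind}{\sim} \mathrm{Gamma}(\alpha_i, \beta_i)$ and define $Y_{ij} = X_{ij}/E[X_{ij}]$. With $CV_i \equiv CV$, $\alpha_i \equiv \alpha$ and $Y_{ij} \overset{iid}{\sim} \mathrm{Gamma}(\alpha, 1/\alpha)$.

S3. Assume $Y_i \overset{iid}{\sim} f(\cdot)$ with the corresponding cumulative distribution function F(·), i = 1, 2, · · · , n. Denote the order statistics of {$Y_i$, i = 1, 2, · · · , n} as $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)}$. Then

$$Y_{(k)} \overset{asy}{\sim} N\left( F^{-1}(p_k), \frac{p_k (1 - p_k)}{n \{ f(F^{-1}(p_k)) \}^2} \right), \quad p_k = \frac{k - 0.5}{n}.$$

S4. The cumulative distribution function of Gamma ($\alpha$, 1) is defined as

$$G_\alpha(y) = \frac{1}{\Gamma(\alpha)} \int_0^y t^{\alpha - 1} e^{-t} \, dt.$$

For $p = G_\alpha(y)$, we define $G_\alpha^{-1}(p) = y$.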

3) Modeling Chromosome Size


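M1. Following $Y_{ij} \overset{iid}{\sim} \mathrm{Gamma}(\alpha, 1/\alpha)$, we can define $Z_{ij} = X_{ij} / \bar{X}_{i\cdot}$ with $\bar{X}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}$. When $n_i$ is sufficiently large, we can roughly assume $Z_{ij} \overset{approx}{\sim} \mathrm{Gamma}(\alpha, 1/\alpha)$. (Note: In reality, we have cases with small $n_i$ (e.g., $n_i$ = 2). In such cases, the approximation may be poor.)

M2. Let {$Y_{i(j)}$, j = 1, 2, · · · , $n_i$} be the order statistics of {$Y_{ij}$, j = 1, 2, · · · , $n_i$}. Then $E[Y_{i(j)}] \approx F^{-1}\left( \frac{j - 0.5}{n_i} \right)$, following S3 and M1. Here F(·) is the cumulative distribution function of Gamma ($\alpha$, 1/$\alpha$). That is, $F^{-1}\left( \frac{j - 0.5}{n_i} \right) = G_\alpha^{-1}\left( \frac{j - 0.5}{n_i} \right) / \alpha$.

M3. Let {$Z_{i(j)}$, j = 1, 2, · · · , $n_i$} be the order statistics of {$Z_{ij}$, j = 1, 2, · · · , $n_i$}. Following S3, M1, and M2, we can model

$$Z_{i(j)} = G_\alpha^{-1}\left( \frac{j - 0.5}{n_i} \right) / \alpha + \varepsilon_{ij},$$

where $\varepsilon_{ij} \overset{approx}{\sim} N\left( 0, \sigma^2\left( \frac{j - 0.5}{n_i} \right) \right)$. Let $p = (j - 0.5)/n_i$, then

$$\sigma^2(p) = \frac{p (1 - p) \, \Gamma^2(\alpha)}{n_i \, \alpha^2 \, (G_\alpha^{-1}(p))^{2\alpha - 2} \exp\{ -2 G_\alpha^{-1}(p) \}}.$$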
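The mean part of the model in M3 needs only the inverse cumulative distribution function of Gamma ($\alpha$, 1). The sketch below is one possible way to evaluate it with the Python standard library: a series expansion of the regularized incomplete gamma function followed by bisection. The numerical method and the function names are assumptions made for illustration; the authors' SAS and R code is not shown in the source, and a statistical library's gamma quantile function would normally be used instead.

```python
import math


def gamma_cdf(x, shape):
    """CDF of Gamma(shape, 1), via the series for the lower incomplete gamma."""
    if x <= 0:
        return 0.0
    term = 1.0
    total = 1.0
    n = 0
    # P(a, x) = x^a e^(-x) / Gamma(a + 1) * sum_n x^n / ((a + 1) ... (a + n))
    while abs(term) > 1e-15 * abs(total) and n < 10000:
        n += 1
        term *= x / (shape + n)
        total += term
    log_prefactor = shape * math.log(x) - x - math.lgamma(shape + 1.0)
    return min(1.0, math.exp(log_prefactor) * total)


def gamma_quantile(p, shape):
    """Inverse of gamma_cdf by bisection, for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while gamma_cdf(hi, shape) < p:  # expand the bracket until it contains p
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def expected_sizes(n_chrom, alpha=7.0438):
    """Model mean of Z_i(j): quantile at (j - 0.5) / n_i, divided by alpha."""
    return [gamma_quantile((j - 0.5) / n_chrom, alpha) / alpha
            for j in range(1, n_chrom + 1)]
```

Calling `expected_sizes(7)`, for instance, returns the seven predicted relative sizes for a species with seven chromosomes, ordered from smallest to largest.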

For cross validation, the observed data were randomly split into two parts: model fitting and

validation. The model fitting portion of the data increased from 10% to 90%, and the validation

portion decreased from 90% to 10% accordingly. Mean square prediction error (MSPE) for cross validation was calculated as

$$MSPE = \sum_{j=1}^{k} (Z_j - \hat{Z}_j)^2 / (k - p),$$

where $Z_j$ is the j-th observed data point in the validation portion of the data; j = 1, … , k; $\hat{Z}_j$ is the predicted value for these k points based on the model derived from the model fitting portion of the data; and p is the number of parameters in the model. From all species, we sampled the set of chromosome data points

belonging to each species. The proportion of species sampled for function derivation varied from

10% to 90%, and the rest of the data was used for validation. The mean squared prediction error


(MSPE) over 1000 experiments decreased as more data points were used to derive the prediction

function. Likewise, the parameter estimate ($\hat{\alpha}$) approached the value from the whole data set

(Supplementary Fig. 4). In addition, the plots of MSPE and $\hat{\alpha}$ indicated that the original sample size

was large enough to derive a robust prediction function. With about 50% of the data (~35

species), both MSPE and $\hat{\alpha}$ started to level off. Smaller sample sizes (e.g., 10% or 7 species)

yielded upward-biased $\hat{\alpha}$ estimates and large MSPE. Because the inverse of $\alpha$ is the variance of

the chromosome size distribution, small sample sizes yielded narrower chromosome size

variation.
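The species-level splitting scheme can be sketched as follows. To keep the example short and self-contained, a straight line fitted by ordinary least squares stands in for the Gamma-based prediction function; that substitution, the function names, and the small demonstration set (four species copied from Table S1) are all assumptions made for illustration, not the procedure or data set used in the study.

```python
import random

# Chromosome lengths (bp) of four species copied from Table S1.
DEMO_SPECIES = {
    "Bigelowiella natans": [98136, 134144, 140590],
    "Babesia bovis": [1729419, 2593320],
    "Zygosaccharomyces rouxii": [881646, 1114666, 1388208, 1464093,
                                 1496342, 1554288, 1865392],
    "Dictyostelium discoideum": [3602379, 4923596, 5125352, 5450249,
                                 6357299, 8484197],
}


def to_points(species_lengths):
    """Flatten {species: lengths} into (chromosome index, relative size) pairs."""
    points = []
    for lengths in species_lengths.values():
        ordered = sorted(lengths)
        n = len(ordered)
        mean_length = sum(ordered) / n
        points += [((j - 0.5) / n, length / mean_length)
                   for j, length in enumerate(ordered, start=1)]
    return points


def fit_line(points):
    """Least-squares line Z = a + b * index (stand-in for the Gamma model)."""
    n = len(points)
    mean_p = sum(p for p, _ in points) / n
    mean_z = sum(z for _, z in points) / n
    sxx = sum((p - mean_p) ** 2 for p, _ in points)
    slope = sum((p - mean_p) * (z - mean_z) for p, z in points) / sxx
    return mean_z - slope * mean_p, slope


def mspe_for_split(species_lengths, train_fraction, rng, n_params=2):
    """Hold out whole species, fit on the rest, and return the MSPE."""
    names = sorted(species_lengths)
    rng.shuffle(names)
    n_train = max(1, min(len(names) - 1, round(train_fraction * len(names))))
    train = {name: species_lengths[name] for name in names[:n_train]}
    held_out = {name: species_lengths[name] for name in names[n_train:]}
    intercept, slope = fit_line(to_points(train))
    validation = to_points(held_out)
    sse = sum((z - (intercept + slope * p)) ** 2 for p, z in validation)
    return sse / (len(validation) - n_params)


def mean_mspe(species_lengths, train_fraction, n_rep=1000, seed=0):
    """Average the MSPE over repeated random species-level splits."""
    rng = random.Random(seed)
    values = [mspe_for_split(species_lengths, train_fraction, rng)
              for _ in range(n_rep)]
    return sum(values) / n_rep
```

Sampling whole species rather than individual chromosomes keeps all chromosomes of a species on the same side of the split, which matches the description of sampling "the set of chromosome data points belonging to each species".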

To further verify that the Gamma distribution adequately describes chromosome size and that numbers

drawn from the Gamma distribution with the identified parameter, Gamma (7.0438, 1/7.0438),

can reproduce the pattern in the observed data, we conducted computer simulations. Numbers

representing chromosome sizes were drawn from Gamma distributions with specific parameters

for species having a chromosome number from 2 to 38. The simulated data were then overlaid on

the scatter plot of the observed data (Supplementary Fig. 5). As expected, simulated data better

approximated observed data when the specified parameter in simulation was closer to the

identified parameter value from the observed data. Both the dispersion of the scattered points and

the fitted curves of the simulated and observed data confirmed that the pattern discovered was

indeed reproducible. In addition, when the simulation was repeated 1000 times for each specific

Gamma distribution parameter value, the fitted lines of the simulated and observed data showed

the smallest average MSE around the identified parameter value.
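A minimal version of this simulation can be written with the Python standard library, whose `random.gammavariate(alpha, beta)` takes the shape and the scale as its two arguments. The function names are hypothetical, and dividing each simulated karyotype by its own mean is an assumption that mirrors the definition of relative chromosome size in M1; the source does not state whether the simulated values were rescaled in this way.

```python
import random


def simulate_karyotype(n_chrom, alpha, rng):
    """Draw n_chrom sizes from Gamma(alpha, 1/alpha) and rescale to mean 1."""
    draws = sorted(rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_chrom))
    mean_size = sum(draws) / n_chrom
    return [size / mean_size for size in draws]


def simulate_range(alpha=7.0438, n_min=2, n_max=38, seed=0):
    """Simulate one karyotype for each chromosome number from n_min to n_max."""
    rng = random.Random(seed)
    return {n: simulate_karyotype(n, alpha, rng)
            for n in range(n_min, n_max + 1)}
```

Changing `alpha` to values such as 2.00, 5.00, or 16.00 widens or narrows the simulated spread, which is the comparison made in Supplementary Fig. 5.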

Reciprocal translocation


Among many evolutionary events, reciprocal translocation is a good starting point for

understanding the dynamics of chromosome size variation through modeling (4, 6-9).

Simulations tested whether reciprocal translocation is partly responsible for observed

chromosome size variation. Let $l_1, \ldots, l_k$ be the lengths of the k chromosomes of a karyotype, where $l_1 \le \cdots \le l_k$ and $\sum_{i=1}^{k} l_i = 1$. Two of the chromosomes are chosen at random at time t according to a probability distribution, p(i, j), under one of these two models. The proportional model states that the probability of chromosomes being chosen, p(i, j), is proportional to their lengths, $l_i$ and $l_j$, with

$$p(i, j) = l_i l_j \left( \frac{1}{1 - l_i} + \frac{1}{1 - l_j} \right).$$

A breakpoint was picked at random on each of the two chromosomes. The first chromosome was broken into two segments, $u l_i$ and $(1 - u) l_i$, and the second chromosome into $v l_j$ and $(1 - v) l_j$, where u and v are random numbers from (0, 1). The left half of the first chromosome was paired with the right half of the second chromosome to form a new karyotype at time t+1 containing the chromosomes of lengths $l_1, \ldots, u l_i + v l_j, (1 - u) l_i + (1 - v) l_j, \ldots, l_k$, which were reordered so that the lengths of the chromosomes were in monotone non-decreasing order. For each chromosome number k = 2, …, 38, simulations were carried out 100,000 times (equilibrium), and the averages were calculated for $l_1, \ldots, l_k$. These numbers were

then plotted against the chromosome index to show whether the resulting line approximates the

predicted line from the inverse of Gamma cumulative distribution function (Fig. 4).
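The unconstrained proportional model can be sketched as below. Drawing two chromosomes one after the other without replacement, each with probability proportional to its length, yields the pair probability p(i, j) given above. The function names, the choice to start from equal lengths, and averaging over every step without a burn-in period are assumptions made for illustration; the thresholds used in the constrained schemes are not implemented.

```python
import random


def pick_pair(lengths, rng):
    """Pick two distinct chromosomes, each with probability proportional to length."""
    indices = range(len(lengths))
    i = rng.choices(indices, weights=lengths)[0]
    remaining = [0.0 if k == i else length for k, length in enumerate(lengths)]
    j = rng.choices(indices, weights=remaining)[0]
    return i, j


def translocate(lengths, rng):
    """Apply one reciprocal translocation and return the re-sorted karyotype."""
    i, j = pick_pair(lengths, rng)
    u, v = rng.random(), rng.random()
    first = u * lengths[i] + v * lengths[j]
    second = (1.0 - u) * lengths[i] + (1.0 - v) * lengths[j]
    untouched = [length for k, length in enumerate(lengths) if k not in (i, j)]
    return sorted(untouched + [first, second])


def average_karyotype(k, n_steps=100000, seed=0):
    """Average ordered relative lengths over n_steps translocations."""
    rng = random.Random(seed)
    lengths = [1.0 / k] * k  # assumed starting point: k equal chromosomes
    totals = [0.0] * k
    for _ in range(n_steps):
        lengths = translocate(lengths, rng)
        for rank, length in enumerate(lengths):
            totals[rank] += length
    return [total / n_steps for total in totals]
```

Each translocation conserves the total length, so the averaged lengths still sum to 1; multiplying them by k puts them on the same scale as the relative chromosome sizes used elsewhere in the analysis.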

Four simulation schemes were carried out: 1) no constraints on chromosome size, 2) a lower

threshold, 3) an upper threshold, 4) both lower and upper thresholds (6-9). We incorporated

constraints on the smallest and largest chromosomes in the modeling process because (1)

chromosome size below a certain threshold will prevent any translocation events; (2) at the


cytogenetic level, viable and functional chromosomes must, on a purely structural basis, contain at least a centromere and two

telomeres; and (3) each chromosome must have a length

sufficient for at least one crossover among the four aligned sister chromatids in meiosis.

Moreover, as shown experimentally, if one arm of the chromosome is more than 21.7% of the

total length of all chromosomes, most offspring are sterile (4, 6-9). The lower threshold was set

for the smallest observed chromosome size (6), and the upper threshold was set using a fitness

function (7).

Confirmation of outlier species with known causes

To show the deviation of chromosome size variation from our general predictions for five known

outlier species, we plotted their chromosome sizes and chromosome size proportions against the

fitted lines (Supplementary Fig. 6). Chicken (Gallus gallus) (10) and Zebra finch (Taeniopygia

guttata) (11) genomes are examples of avian species that typically have high chromosome

numbers and complex karyotypes consisting of several macrochromosomes and numerous

microchromosomes. For twenty assembled chromosomes of platypus (Ornithorhynchus anatinus)

(12), the differentiation between macrochromosomes and microchromosomes was also obvious.

Platypus and its genome exhibit a fascinating combination of characters of reptiles and mammals.

Mycosphaerella graminicola is a haploid hemibiotrophic ascomycete (13) and Cyanothece sp.

ATCC 51142 (14) is a cyanobacterium with one large circular and one linear chromosome

(Supplementary Table 1).

Supplementary Figure 1. Schematic diagram of the evolution of genome size and chromosome

size in eukaryotes. Variations of genome size and chromosome size increase as the mean of the


trait increases. Four species represent four common name groups of diploid eukaryotes with

linear chromosomes: yeast (Saccharomyces cerevisiae) for unicellular eukaryotes (brown curve),

fruitfly (Drosophila melanogaster) for invertebrates (blue curve), rice (Oryza sativa) for vascular

plants (green curve), and human (Homo sapiens) for vertebrates (red curve). As the cell size,

metabolic rate, and individual size increase, the population size decreases (shown with the

yellow gradient). The probability density curve of individual species shows the concept of same

coefficient of variation for chromosome size. Standardizing the chromosome size value in

basepair with the mean of each species gives a common distribution, enabling a global model

fitting for chromosome size variation.

Supplementary Figure 2. Chromosome size and chromosome number variation across five

different common name groups. (A) Coefficient of variation (CV = Standard deviation/Mean) of

chromosome size of different species and the corresponding total chromosome number. Notice

that CV varied dramatically for the prokaryotic genomes but remained between 0.2 and 0.5 for

most eukaryotic genomes. (B) Average chromosome size and chromosome number show no

correlation.

Supplementary Figure 3. The observed chromosome size proportion for 68 diploid eukaryotic

species with different numbers of chromosomes.

Supplementary Figure 4. Cross validation of the model prediction for chromosome size by

sampling different proportions of the data for model training and validating with the rest of the

data. (A) Parameter estimates ($\hat{\alpha}$) for the Gamma distributions. (B) Mean square of prediction


errors (MSPE) between predicted and observed values of the validating data. Each dot represents

the average across 1,000 repetitions. The MSPE decreased to 0.0236, and the estimate $\hat{\alpha}$ approached

7.0438 when 90% of the data was used as the training set and the remaining data for validation.

We started with 10% of the data as the training set to determine whether the original data provide an

ample sample size so the curves of MSPE start to level off after a certain point (~50% in this

case).

Supplementary Figure 5. Simulation of different Gamma ($\alpha$, 1/$\alpha$) from which chromosome

sizes were drawn to demonstrate that the same pattern in observed data can be reproduced

through simulation. (A) Probability density functions of different Gamma distributions with the

same mean but different variances. (B) $\alpha$ = 2.00. (C) $\alpha$ = 5.00. (D) $\alpha$ = 7.04. (E) $\alpha$ = 16.00. Red

dots are observed chromosome sizes, and the yellow line represents fitted values. Black dots are

simulated chromosome sizes, and the green line represents fitted values. (F) Mean square of

error (MSE) between the observed data and the fitted line of the simulated data with varied

Gamma ($\alpha$, 1/$\alpha$). Each dot represents the average of 1000 repetitions of simulations. The bar

represents the confidence interval of the parameter estimate (red dot) from the empirical data.

Supplementary Figure 6. Observed chromosome size and chromosome size proportion of three

outlier species to show the deviation of these values from the general prediction. (A) Chicken

(Gallus gallus) known to contain macrochromosomes and microchromosomes. (B) Platypus

(Ornithorhynchus anatinus) known to have a genome with characters of both reptiles and

mammals. (C) Cyanothece sp. ATCC 51142 known to have one large circular and one linear

chromosome. (D) Chromosome size proportion for three species against the fitted lines.


Supplementary Table 1. Genome and chromosome information for the species used in the

current study. (pdf file attached)

Supplementary Table 2. Model fitting statistics for different distributions and the cubic

function. Values in the parentheses are calculated based on the parameter estimates.

Model | Number of parameters | Parameter estimate | MSE | R2 | AIC
Gamma | 1 | $\hat{\alpha}$ = 7.0438 (variance = 0.1420) | 0.0229 | 0.8311 | -3343.3
Lognormal | 1 | $\hat{\mu}_{\ln}$ = -0.0720 (variance = 0.1564) | 0.0231 | 0.8298 | -3336.1
Normal | 1 | variance = 0.1371 | 0.0264 | 0.8059 | -3220.0
Truncated Normal | 2 | $\hat{\mu}_t$ = 0.9756, $\hat{\sigma}^2_t$ = 0.1400 (variance = 0.1392) | 0.0265 | 0.8051 | -3214.3
Cubic function | 4 | $\hat{\beta}_0$ = 0.3920, $\hat{\beta}_1$ = 2.2890, $\hat{\beta}_2$ = -3.9141, $\hat{\beta}_3$ = 3.0753 (variance† = 0.1359) | 0.0244 | 0.8209 | -3285.0

MSE, mean square of error; AIC, Akaike's information criterion.

† denotes the sample variance because cubic function does not assume any distribution.

Supporting references and notes

1. D. Charlesworth, B. Charlesworth, G. Marais, Heredity 95, 118 (Aug, 2005).

2. A. F. A. Smit, R. Hubley, P. Green. (http://www.repeatmasker.org, verified on Feb 28, 2010).

3. E. Lerat, Heredity 2010 (online Nov 25, 2009).

4. I. Schubert, Curr Opin Plant Biol 10, 109 (Apr, 2007).

5. O. Schabenberger, F. J. Pierce, Contemporary Statistical Models for the Plant and Soil

Sciences (CRC Press, Boca Raton, FL, 2002), pp. 304-312.


6. D. Sankoff, V. Ferretti, Genome Res 6, 1 (Jan, 1996).

7. A. De, M. Ferguson, S. Sindi, R. Durrett, J Appl Prob 38, 324 (2001).

8. M. Mazowita, L. Haque, D. Sankoff, J Comput Biol 13, 554 (Mar, 2006).

9. H. T. Imai, Y. Satta, N. Takahata, J Theor Biol 210, 475 (Jun 21, 2001).

10. International Chicken Genome Sequencing Consortium. Nature 432, 695 (Dec 9, 2004).

11. Y. Itoh, A. P. Arnold, Chromosome Res 13, 47 (2005).

12. W. C. Warren et al., Nature 453, 175 (May 8, 2008).

13. A. H. Wittenberg et al., PLoS One 4, e5863 (2009).

14. E. A. Welsh et al., Proc Natl Acad Sci U S A 105, 15094 (Sep 30, 2008).


Table S1. Chromosome size of each species

Species | Chromosome number | Chromosome index | Chromosome size (bp) | Size of masked | Common name | Accession/version | Source

Brucella melitensis | 2 | 1 | 1185518 | 20342 | Prokaryotes | NC_012442 | GenBank
Brucella melitensis | 2 | 2 | 2125701 | 46392 | Prokaryotes | NC_012441 | GenBank
Haloarcula marismortui | 2 | 1 | 288050 | 7040 | Prokaryotes | NC_006397 | GenBank
Haloarcula marismortui | 2 | 2 | 3131724 | 17771 | Prokaryotes | NC_006396 | GenBank
Burkholderia pseudomallei | 2 | 1 | 3181762 | 225429 | Prokaryotes | NC_007435 | GenBank
Burkholderia pseudomallei | 2 | 2 | 4126292 | 222514 | Prokaryotes | NC_007434 | GenBank
Rhodobacter sphaeroides | 2 | 1 | 1219053 | 45697 | Prokaryotes | NC_009050 | GenBank
Rhodobacter sphaeroides | 2 | 2 | 3147721 | 60664 | Prokaryotes | NC_009049 | GenBank
Pseudoalteromonas haloplanktis | 2 | 1 | 635328 | 6481 | Prokaryotes | NC_007482 | GenBank
Pseudoalteromonas haloplanktis | 2 | 2 | 3214944 | 74602 | Prokaryotes | NC_007481 | GenBank
Burkholderia sp. | 3 | 1 | 1395069 | 45904 | Prokaryotes | NC_007509 | GenBank
Burkholderia sp. | 3 | 2 | 3587082 | 113384 | Prokaryotes | NC_007511 | GenBank
Burkholderia sp. | 3 | 3 | 3694126 | 108577 | Prokaryotes | NC_007510 | GenBank
Vibrio cholerae | 2 | 1 | 1072315 | 75848 | Prokaryotes | NC_002506 | GenBank
Vibrio cholerae | 2 | 2 | 2961149 | 67476 | Prokaryotes | NC_002505 | GenBank
Cupriavidus taiwanensis | 2 | 1 | 2502411 | 113495 | Prokaryotes | NC_010530 | GenBank
Cupriavidus taiwanensis | 2 | 2 | 3416911 | 92939 | Prokaryotes | NC_010528 | GenBank
Ralstonia pickettii | 2 | 1 | 1302238 | 15316 | Prokaryotes | NC_010678 | GenBank
Ralstonia pickettii | 2 | 2 | 3942557 | 80901 | Prokaryotes | NC_010682 | GenBank
Brucella canis | 2 | 1 | 1206800 | 22257 | Prokaryotes | NC_010104 | GenBank
Brucella canis | 2 | 2 | 2105969 | 45592 | Prokaryotes | NC_010103 | GenBank
Burkholderia pseudomallei | 2 | 1 | 3100794 | 214233 | Prokaryotes | NC_009078 | GenBank
Burkholderia pseudomallei | 2 | 2 | 3988455 | 232378 | Prokaryotes | NC_009076 | GenBank
Burkholderia cenocepacia | 3 | 1 | 1224595 | 48035 | Prokaryotes | NC_010512 | GenBank
Burkholderia cenocepacia | 3 | 2 | 3213911 | 129335 | Prokaryotes | NC_010515 | GenBank
Burkholderia cenocepacia | 3 | 3 | 3532883 | 113924 | Prokaryotes | NC_010508 | GenBank
Leptospira biflexa | 2 | 1 | 277655 | 825 | Prokaryotes | NC_010843 | GenBank
Leptospira biflexa | 2 | 2 | 3599677 | 28539 | Prokaryotes | NC_010602 | GenBank
Photobacterium profundum | 2 | 1 | 2237943 | 271018 | Prokaryotes | NC_006371 | GenBank
Photobacterium profundum | 2 | 2 | 4085304 | 301993 | Prokaryotes | NC_006370 | GenBank
Burkholderia mallei | 2 | 1 | 2284095 | 191078 | Prokaryotes | NC_008835 | GenBank
Burkholderia mallei | 2 | 2 | 3458208 | 294703 | Prokaryotes | NC_008836 | GenBank
Ochrobactrum anthropi | 2 | 1 | 1895911 | 42481 | Prokaryotes | NC_009668 | GenBank
Ochrobactrum anthropi | 2 | 2 | 2887297 | 66334 | Prokaryotes | NC_009667 | GenBank
Burkholderia phytofirmans | 2 | 1 | 3625999 | 69258 | Prokaryotes | NC_010676 | GenBank
Burkholderia phytofirmans | 2 | 2 | 4467537 | 93344 | Prokaryotes | NC_010681 | GenBank


Vibrio splendidus | 2 | 1 | 1675519 | 33268 | Prokaryotes | NC_011744 | GenBank
Vibrio splendidus | 2 | 2 | 3299302 | 70072 | Prokaryotes | NC_011753 | GenBank
Brucella abortus | 2 | 1 | 1162204 | 22023 | Prokaryotes | NC_006933 | GenBank
Brucella abortus | 2 | 2 | 2124241 | 43758 | Prokaryotes | NC_006932 | GenBank
Leptospira borgpetersenii | 2 | 1 | 317336 | 25673 | Prokaryotes | NC_008509 | GenBank
Leptospira borgpetersenii | 2 | 2 | 3614446 | 332345 | Prokaryotes | NC_008508 | GenBank
Paracoccus denitrificans | 2 | 1 | 1730097 | 41095 | Prokaryotes | NC_008687 | GenBank
Paracoccus denitrificans | 2 | 2 | 2852282 | 64149 | Prokaryotes | NC_008686 | GenBank
Vibrio parahaemolyticus | 2 | 1 | 1877212 | 20855 | Prokaryotes | NC_004605 | GenBank
Vibrio parahaemolyticus | 2 | 2 | 3288558 | 83347 | Prokaryotes | NC_004603 | GenBank
Leptospira interrogans | 2 | 1 | 358943 | 34150 | Prokaryotes | NC_004343 | GenBank
Leptospira interrogans | 2 | 2 | 4332241 | 483688 | Prokaryotes | NC_004342 | GenBank
Vibrio sp. | 2 | 1 | 1829445 | 14047 | Prokaryotes | NC_013457 | GenBank
Vibrio sp. | 2 | 2 | 3259580 | 84681 | Prokaryotes | NC_013456 | GenBank
Thermobaculum terrenum | 2 | 1 | 1074634 | 16985 | Prokaryotes | NC_013526 | GenBank
Thermobaculum terrenum | 2 | 2 | 2026947 | 8016 | Prokaryotes | NC_013525 | GenBank
Deinococcus radiodurans | 2 | 1 | 412348 | 5899 | Prokaryotes | NC_001264 | GenBank
Deinococcus radiodurans | 2 | 2 | 2648638 | 91144 | Prokaryotes | NC_001263 | GenBank
Vibrio fischeri | 2 | 1 | 1418848 | 54540 | Prokaryotes | NC_011186 | GenBank
Vibrio fischeri | 2 | 2 | 2905029 | 64578 | Prokaryotes | NC_011184 | GenBank
Burkholderia phymatum | 2 | 1 | 2697374 | 43977 | Prokaryotes | NC_010623 | GenBank
Burkholderia phymatum | 2 | 2 | 3479187 | 63336 | Prokaryotes | NC_010622 | GenBank
Halorubrum lacusprofundi | 2 | 1 | 525943 | 52892 | Prokaryotes | NC_012028 | GenBank
Halorubrum lacusprofundi | 2 | 2 | 2735295 | 75577 | Prokaryotes | NC_012029 | GenBank
Burkholderia multivorans | 3 | 1 | 919806 | 52805 | Prokaryotes | NC_010087 | GenBank
Burkholderia multivorans | 3 | 2 | 2473162 | 134944 | Prokaryotes | NC_010805 | GenBank
Burkholderia multivorans | 3 | 3 | 3448466 | 153995 | Prokaryotes | NC_010084 | GenBank
Vibrio harveyi | 2 | 1 | 2204018 | 339550 | Prokaryotes | NC_009784 | GenBank
Vibrio harveyi | 2 | 2 | 3765351 | 463218 | Prokaryotes | NC_009783 | GenBank
Agrobacterium tumefaciens | 2 | 1 | 2075577 | 61360 | Prokaryotes | NC_003063 | GenBank
Agrobacterium tumefaciens | 2 | 2 | 2841580 | 58165 | Prokaryotes | NC_003062 | GenBank
Sphaerobacter thermophilus | 2 | 1 | 1252731 | 36575 | Prokaryotes | NC_013524 | GenBank
Sphaerobacter thermophilus | 2 | 2 | 2741033 | 62307 | Prokaryotes | NC_013523 | GenBank
Brucella ovis | 2 | 1 | 1164220 | 29961 | Prokaryotes | NC_009504 | GenBank
Brucella ovis | 2 | 2 | 2111370 | 63000 | Prokaryotes | NC_009505 | GenBank
Agrobacterium radiobacter | 2 | 1 | 2650913 | 72330 | Prokaryotes | NC_011983 | GenBank


Agrobacterium radiobacter | 2 | 2 | 4005130 | 67171 | Prokaryotes | NC_011985 | GenBank
Burkholderia pseudomallei | 2 | 1 | 3173005 | 219666 | Prokaryotes | NC_006351 | GenBank
Burkholderia pseudomallei | 2 | 2 | 4074542 | 222031 | Prokaryotes | NC_006350 | GenBank
Brucella microti | 2 | 1 | 1220319 | 22600 | Prokaryotes | NC_013118 | GenBank
Brucella microti | 2 | 2 | 2117050 | 51996 | Prokaryotes | NC_013119 | GenBank
Vibrio cholerae | 2 | 1 | 1086784 | 57256 | Prokaryotes | NC_012667 | GenBank
Vibrio cholerae | 2 | 2 | 3149584 | 139120 | Prokaryotes | NC_012668 | GenBank
Rhodobacter sphaeroides | 2 | 1 | 1297647 | 66616 | Prokaryotes | NC_011958 | GenBank
Rhodobacter sphaeroides | 2 | 2 | 3152792 | 95283 | Prokaryotes | NC_011963 | GenBank
Ralstonia metallidurans | 2 | 1 | 2580084 | 138901 | Prokaryotes | NC_007974 | GenBank
Ralstonia metallidurans | 2 | 2 | 3928089 | 120357 | Prokaryotes | NC_007973 | GenBank
Burkholderia mallei | 2 | 1 | 2325379 | 199163 | Prokaryotes | NC_006349 | GenBank
Burkholderia mallei | 2 | 2 | 3510148 | 301459 | Prokaryotes | NC_006348 | GenBank
Vibrio cholerae | 2 | 1 | 1108250 | 63490 | Prokaryotes | NC_009456 | GenBank
Vibrio cholerae | 2 | 2 | 3024069 | 84466 | Prokaryotes | NC_009457 | GenBank
Ralstonia pickettii | 2 | 1 | 1323321 | 13692 | Prokaryotes | NC_012857 | GenBank
Ralstonia pickettii | 2 | 2 | 3647724 | 53461 | Prokaryotes | NC_012856 | GenBank
Variovorax paradoxus | 2 | 1 | 1128644 | 57956 | Prokaryotes | NC_012792 | GenBank
Variovorax paradoxus | 2 | 2 | 5626353 | 201634 | Prokaryotes | NC_012791 | GenBank
Brucella suis | 2 | 1 | 1400844 | 34341 | Prokaryotes | NC_010167 | GenBank
Brucella suis | 2 | 2 | 1923763 | 39895 | Prokaryotes | NC_010169 | GenBank
Ralstonia eutropha | 2 | 1 | 2726152 | 102713 | Prokaryotes | NC_007348 | GenBank
Ralstonia eutropha | 2 | 2 | 3806533 | 75751 | Prokaryotes | NC_007347 | GenBank
Aliivibrio salmonicida | 2 | 1 | 1206461 | 155902 | Prokaryotes | NC_011313 | GenBank
Aliivibrio salmonicida | 2 | 2 | 3325165 | 358498 | Prokaryotes | NC_011312 | GenBank
Leptospira interrogans | 2 | 1 | 350181 | 25969 | Prokaryotes | NC_005824 | GenBank
Leptospira interrogans | 2 | 2 | 4277185 | 451675 | Prokaryotes | NC_005823 | GenBank
Burkholderia cenocepacia | 3 | 1 | 1196094 | 54747 | Prokaryotes | NC_008062 | GenBank
Burkholderia cenocepacia | 3 | 2 | 2788459 | 102179 | Prokaryotes | NC_008061 | GenBank
Burkholderia cenocepacia | 3 | 3 | 3294563 | 117812 | Prokaryotes | NC_008060 | GenBank
Burkholderia mallei | 2 | 1 | 2352693 | 199062 | Prokaryotes | NC_009079 | GenBank
Burkholderia mallei | 2 | 2 | 3495687 | 298765 | Prokaryotes | NC_009080 | GenBank
Burkholderia xenovorans | 3 | 1 | 1471779 | 31301 | Prokaryotes | NC_007953 | GenBank
Burkholderia xenovorans | 3 | 2 | 3363523 | 94840 | Prokaryotes | NC_007952 | GenBank
Burkholderia xenovorans | 3 | 3 | 4895836 | 163330 | Prokaryotes | NC_007951 | GenBank
Vibrio vulnificus | 2 | 1 | 1857073 | 15487 | Prokaryotes | NC_005140 | GenBank


Vibrio vulnificus | 2 | 2 | 3354505 | 134417 | Prokaryotes | NC_005139 | GenBank
Burkholderia ambifaria | 3 | 1 | 1127947 | 36259 | Prokaryotes | NC_010557 | GenBank
Burkholderia ambifaria | 3 | 2 | 2769414 | 68405 | Prokaryotes | NC_010552 | GenBank
Burkholderia ambifaria | 3 | 3 | 3443583 | 92566 | Prokaryotes | NC_010551 | GenBank
Burkholderia vietnamiensis | 3 | 1 | 1241007 | 84084 | Prokaryotes | NC_009254 | GenBank
Burkholderia vietnamiensis | 3 | 2 | 2411759 | 118333 | Prokaryotes | NC_009255 | GenBank
Burkholderia vietnamiensis | 3 | 3 | 3652814 | 156520 | Prokaryotes | NC_009256 | GenBank
Burkholderia mallei | 4 | 1 | 1734922 | 225765 | Prokaryotes | NC_008784 | GenBank
Burkholderia mallei | 4 | 2 | 2325379 | 292706 | Prokaryotes | NC_006349 | GenBank
Burkholderia mallei | 4 | 3 | 3495687 | 416419 | Prokaryotes | NC_009080 | GenBank
Burkholderia mallei | 4 | 4 | 3510148 | 405581 | Prokaryotes | NC_006348 | GenBank
Burkholderia cepacia | 3 | 1 | 1281472 | 52977 | Prokaryotes | NC_008392 | GenBank
Burkholderia cepacia | 3 | 2 | 2646969 | 75349 | Prokaryotes | NC_008391 | GenBank
Burkholderia cepacia | 3 | 3 | 3556545 | 110869 | Prokaryotes | NC_008390 | GenBank
Burkholderia glumae | 2 | 1 | 2827355 | 231907 | Prokaryotes | NC_012721 | GenBank
Burkholderia glumae | 2 | 2 | 3906529 | 209050 | Prokaryotes | NC_012724 | GenBank
Burkholderia thailandensis | 2 | 1 | 2914771 | 156089 | Prokaryotes | NC_007650 | GenBank
Burkholderia thailandensis | 2 | 2 | 3809201 | 202513 | Prokaryotes | NC_007651 | GenBank
Vibrio fischeri | 2 | 1 | 1418848 | 54540 | Prokaryotes | NC_011186 | GenBank
Vibrio fischeri | 2 | 2 | 2905029 | 64578 | Prokaryotes | NC_011184 | GenBank
Agrobacterium vitis | 2 | 1 | 1283187 | 37857 | Prokaryotes | NC_011988 | GenBank
Agrobacterium vitis | 2 | 2 | 3726375 | 112171 | Prokaryotes | NC_011989 | GenBank
Bigelowiella natans | 3 | 1 | 98136 | 22680 | Unicellular eukaryotes | NC_010006 | GenBank
Bigelowiella natans | 3 | 2 | 134144 | 24243 | Unicellular eukaryotes | NC_010005 | GenBank
Bigelowiella natans | 3 | 3 | 140590 | 24920 | Unicellular eukaryotes | NC_010004 | GenBank
Zygosaccharomyces rouxii | 7 | 1 | 881646 | 49405 | Unicellular eukaryotes | NC_012994 | GenBank
Zygosaccharomyces rouxii | 7 | 2 | 1114666 | 61081 | Unicellular eukaryotes | NC_012990 | GenBank
Zygosaccharomyces rouxii | 7 | 3 | 1388208 | 34248 | Unicellular eukaryotes | NC_012991 | GenBank
Zygosaccharomyces rouxii | 7 | 4 | 1464093 | 33012 | Unicellular eukaryotes | NC_012992 | GenBank
Zygosaccharomyces rouxii | 7 | 5 | 1496342 | 32427 | Unicellular eukaryotes | NC_012993 | GenBank
Zygosaccharomyces rouxii | 7 | 6 | 1554288 | 41133 | Unicellular eukaryotes | NC_012995 | GenBank
Zygosaccharomyces rouxii | 7 | 7 | 1865392 | 35865 | Unicellular eukaryotes | NC_012996 | GenBank
Dictyostelium discoideum | 6 | 1 | 3602379 | 1394231 | Unicellular eukaryotes | NC_007092 | GenBank
Dictyostelium discoideum | 6 | 2 | 4923596 | 2018766 | Unicellular eukaryotes | NC_007087 | GenBank
Dictyostelium discoideum | 6 | 3 | 5125352 | 1918819 | Unicellular eukaryotes | NC_007091 | GenBank
Dictyostelium discoideum | 6 | 4 | 5450249 | 1934751 | Unicellular eukaryotes | NC_007090 | GenBank

Table S1. Chromosome size of each species

Columns: Species, Chromosome number, Chromosome index, Chromosome size, Size of masked, Common name, Accession/version, Source

Dictyostelium discoideum 6 5 6357299 2395824 Unicellular eukaryotes NC_007089 GenBank
Dictyostelium discoideum 6 6 8484197 3395450 Unicellular eukaryotes NC_007088 GenBank
Babesia bovis 2 1 1729419 154768 Unicellular eukaryotes NC_010574 GenBank
Babesia bovis 2 2 2593320 171247 Unicellular eukaryotes NC_010575 GenBank
Ostreococcus lucimarinus 21 1 149386 13164 Unicellular eukaryotes NC_009372 GenBank
Ostreococcus lucimarinus 21 2 154676 14161 Unicellular eukaryotes NC_009373 GenBank
Ostreococcus lucimarinus 21 3 321799 25312 Unicellular eukaryotes NC_009375 GenBank
Ostreococcus lucimarinus 21 4 366173 35994 Unicellular eukaryotes NC_009371 GenBank
Ostreococcus lucimarinus 21 5 428333 31993 Unicellular eukaryotes NC_009370 GenBank
Ostreococcus lucimarinus 21 6 468366 31854 Unicellular eukaryotes NC_009369 GenBank
Ostreococcus lucimarinus 21 7 528469 43935 Unicellular eukaryotes NC_009367 GenBank
Ostreococcus lucimarinus 21 8 538963 35625 Unicellular eukaryotes NC_009366 GenBank
Ostreococcus lucimarinus 21 9 549133 43071 Unicellular eukaryotes NC_009374 GenBank
Ostreococcus lucimarinus 21 10 593542 59200 Unicellular eukaryotes NC_009365 GenBank
Ostreococcus lucimarinus 21 11 613585 45255 Unicellular eukaryotes NC_009364 GenBank
Ostreococcus lucimarinus 21 12 670853 49139 Unicellular eukaryotes NC_009363 GenBank
Ostreococcus lucimarinus 21 13 701771 50067 Unicellular eukaryotes NC_009362 GenBank
Ostreococcus lucimarinus 21 14 708927 62284 Unicellular eukaryotes NC_009368 GenBank
Ostreococcus lucimarinus 21 15 783246 53604 Unicellular eukaryotes NC_009361 GenBank
Ostreococcus lucimarinus 21 16 818664 52950 Unicellular eukaryotes NC_009360 GenBank
Ostreococcus lucimarinus 21 17 847696 56563 Unicellular eukaryotes NC_009359 GenBank
Ostreococcus lucimarinus 21 18 895087 61990 Unicellular eukaryotes NC_009356 GenBank
Ostreococcus lucimarinus 21 19 930724 61763 Unicellular eukaryotes NC_009358 GenBank
Ostreococcus lucimarinus 21 20 982987 62361 Unicellular eukaryotes NC_009357 GenBank
Ostreococcus lucimarinus 21 21 1152508 77658 Unicellular eukaryotes NC_009355 GenBank
Kluyveromyces thermotolerans 8 1 687718 14596 Unicellular eukaryotes NC_013077 GenBank
Kluyveromyces thermotolerans 8 2 893706 10757 Unicellular eukaryotes NC_013078 GenBank
Kluyveromyces thermotolerans 8 3 999246 15648 Unicellular eukaryotes NC_013079 GenBank
Kluyveromyces thermotolerans 8 4 1423902 43727 Unicellular eukaryotes NC_013084 GenBank
Kluyveromyces thermotolerans 8 5 1513537 24560 Unicellular eukaryotes NC_013080 GenBank
Kluyveromyces thermotolerans 8 6 1521774 21226 Unicellular eukaryotes NC_013081 GenBank
Kluyveromyces thermotolerans 8 7 1632914 25913 Unicellular eukaryotes NC_013082 GenBank
Kluyveromyces thermotolerans 8 8 1720065 17231 Unicellular eukaryotes NC_013083 GenBank
Theileria annulata 4 1 1842271 219216 Unicellular eukaryotes NC_011098 GenBank
Theileria annulata 4 2 1898942 316540 Unicellular eukaryotes NC_011100 GenBank
Theileria annulata 4 3 1979170 236340 Unicellular eukaryotes NC_011099 GenBank

Theileria annulata 4 4 2632137 343409 Unicellular eukaryotes NC_011129 GenBank
Plasmodium knowlesi 14 1 726886 119010 Unicellular eukaryotes NC_011903 GenBank
Plasmodium knowlesi 14 2 785142 117595 Unicellular eukaryotes NC_011905 GenBank
Plasmodium knowlesi 14 3 838594 81310 Unicellular eukaryotes NC_011902 GenBank
Plasmodium knowlesi 14 4 973297 157999 Unicellular eukaryotes NC_011904 GenBank
Plasmodium knowlesi 14 5 1053090 133650 Unicellular eukaryotes NC_011907 GenBank
Plasmodium knowlesi 14 6 1324984 201270 Unicellular eukaryotes NC_011906 GenBank
Plasmodium knowlesi 14 7 1486039 237102 Unicellular eukaryotes NC_011911 GenBank
Plasmodium knowlesi 14 8 1496036 188033 Unicellular eukaryotes NC_011908 GenBank
Plasmodium knowlesi 14 9 1770351 234817 Unicellular eukaryotes NC_011909 GenBank
Plasmodium knowlesi 14 10 2147124 269565 Unicellular eukaryotes NC_011910 GenBank
Plasmodium knowlesi 14 11 2200295 278094 Unicellular eukaryotes NC_011914 GenBank
Plasmodium knowlesi 14 12 2372884 387518 Unicellular eukaryotes NC_011912 GenBank
Plasmodium knowlesi 14 13 3128370 315305 Unicellular eukaryotes NC_011913 GenBank
Plasmodium knowlesi 14 14 3159095 341420 Unicellular eukaryotes NC_011915 GenBank
Thalassiosira pseudonana 20 1 297349 34574 Unicellular eukaryotes NC_012087 GenBank
Thalassiosira pseudonana 20 2 454954 19894 Unicellular eukaryotes NC_012086 GenBank
Thalassiosira pseudonana 20 3 659924 33526 Unicellular eukaryotes NC_012080 GenBank
Thalassiosira pseudonana 20 4 800234 86751 Unicellular eukaryotes NC_012083 GenBank
Thalassiosira pseudonana 20 5 827053 29615 Unicellular eukaryotes NC_012081 GenBank
Thalassiosira pseudonana 20 6 931268 40290 Unicellular eukaryotes NC_012078 GenBank
Thalassiosira pseudonana 20 7 998643 37242 Unicellular eukaryotes NC_012077 GenBank
Thalassiosira pseudonana 20 8 1052196 33245 Unicellular eukaryotes NC_012076 GenBank
Thalassiosira pseudonana 20 9 1057565 86877 Unicellular eukaryotes NC_012085 GenBank
Thalassiosira pseudonana 20 10 1105668 82538 Unicellular eukaryotes NC_012073 GenBank
Thalassiosira pseudonana 20 11 1128382 102275 Unicellular eukaryotes NC_012075 GenBank
Thalassiosira pseudonana 20 12 1191060 40821 Unicellular eukaryotes NC_012072 GenBank
Thalassiosira pseudonana 20 13 1267198 39621 Unicellular eukaryotes NC_012071 GenBank
Thalassiosira pseudonana 20 14 1992434 70221 Unicellular eukaryotes NC_012070 GenBank
Thalassiosira pseudonana 20 15 2071480 73911 Unicellular eukaryotes NC_012069 GenBank
Thalassiosira pseudonana 20 16 2305972 44136 Unicellular eukaryotes NC_012068 GenBank
Thalassiosira pseudonana 20 17 2402323 59468 Unicellular eukaryotes NC_012067 GenBank
Thalassiosira pseudonana 20 18 2440052 52635 Unicellular eukaryotes NC_012066 GenBank
Thalassiosira pseudonana 20 19 2707195 110339 Unicellular eukaryotes NC_012065 GenBank
Thalassiosira pseudonana 20 20 3042585 79381 Unicellular eukaryotes NC_012064 GenBank
Phaeodactylum tricornutum 33 1 87967 33213 Unicellular eukaryotes NC_011701 GenBank

Phaeodactylum tricornutum 33 2 157053 11673 Unicellular eukaryotes NC_011700 GenBank
Phaeodactylum tricornutum 33 3 258242 74530 Unicellular eukaryotes NC_011699 GenBank
Phaeodactylum tricornutum 33 4 317207 48517 Unicellular eukaryotes NC_011698 GenBank
Phaeodactylum tricornutum 33 5 384256 58328 Unicellular eukaryotes NC_011697 GenBank
Phaeodactylum tricornutum 33 6 387582 16631 Unicellular eukaryotes NC_011696 GenBank
Phaeodactylum tricornutum 33 7 404298 44422 Unicellular eukaryotes NC_011695 GenBank
Phaeodactylum tricornutum 33 8 441226 64169 Unicellular eukaryotes NC_011694 GenBank
Phaeodactylum tricornutum 33 9 497271 43662 Unicellular eukaryotes NC_011693 GenBank
Phaeodactylum tricornutum 33 10 511739 83055 Unicellular eukaryotes NC_011692 GenBank
Phaeodactylum tricornutum 33 11 512847 31071 Unicellular eukaryotes NC_011691 GenBank
Phaeodactylum tricornutum 33 12 591336 37383 Unicellular eukaryotes NC_011690 GenBank
Phaeodactylum tricornutum 33 13 662217 33358 Unicellular eukaryotes NC_011689 GenBank
Phaeodactylum tricornutum 33 14 683011 56393 Unicellular eukaryotes NC_011688 GenBank
Phaeodactylum tricornutum 33 15 690427 75249 Unicellular eukaryotes NC_011687 GenBank
Phaeodactylum tricornutum 33 16 702471 86674 Unicellular eukaryotes NC_011686 GenBank
Phaeodactylum tricornutum 33 17 703943 87315 Unicellular eukaryotes NC_011685 GenBank
Phaeodactylum tricornutum 33 18 764225 76107 Unicellular eukaryotes NC_011684 GenBank
Phaeodactylum tricornutum 33 19 814910 47594 Unicellular eukaryotes NC_011683 GenBank
Phaeodactylum tricornutum 33 20 829358 84273 Unicellular eukaryotes NC_011682 GenBank
Phaeodactylum tricornutum 33 21 887524 71291 Unicellular eukaryotes NC_011681 GenBank
Phaeodactylum tricornutum 33 22 901853 105936 Unicellular eukaryotes NC_011680 GenBank
Phaeodactylum tricornutum 33 23 945026 79695 Unicellular eukaryotes NC_011679 GenBank
Phaeodactylum tricornutum 33 24 976485 48152 Unicellular eukaryotes NC_011678 GenBank
Phaeodactylum tricornutum 33 25 1002813 63848 Unicellular eukaryotes NC_011677 GenBank
Phaeodactylum tricornutum 33 26 1007773 70023 Unicellular eukaryotes NC_011676 GenBank
Phaeodactylum tricornutum 33 27 1029019 64887 Unicellular eukaryotes NC_011675 GenBank
Phaeodactylum tricornutum 33 28 1035082 63800 Unicellular eukaryotes NC_011674 GenBank
Phaeodactylum tricornutum 33 29 1098047 62793 Unicellular eukaryotes NC_011673 GenBank
Phaeodactylum tricornutum 33 30 1360148 67110 Unicellular eukaryotes NC_011672 GenBank
Phaeodactylum tricornutum 33 31 1460046 129896 Unicellular eukaryotes NC_011671 GenBank
Phaeodactylum tricornutum 33 32 1497954 53345 Unicellular eukaryotes NC_011670 GenBank
Phaeodactylum tricornutum 33 33 2535400 149383 Unicellular eukaryotes NC_011669 GenBank
Saccharomyces cerevisiae 16 1 230208 10167 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 2 270148 6070 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 3 316617 7894 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 4 439885 13271 Unicellular eukaryotes release 3 Ensembl Fungi

Saccharomyces cerevisiae 16 5 562643 11592 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 6 576869 10890 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 7 666454 14947 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 8 745741 14449 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 9 784333 14885 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 10 813178 15191 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 11 924429 18514 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 12 948062 17745 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 13 1078175 21308 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 14 1090947 20684 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 15 1091289 20106 Unicellular eukaryotes release 3 Ensembl Fungi
Saccharomyces cerevisiae 16 16 1531919 31596 Unicellular eukaryotes release 3 Ensembl Fungi
Trypanosoma brucei 10 1 1064672 372931 Unicellular eukaryotes NC_008409 GenBank
Trypanosoma brucei 10 2 1193948 473421 Unicellular eukaryotes NC_005063 GenBank
Trypanosoma brucei 10 3 1590432 272395 Unicellular eukaryotes NC_007277 GenBank
Trypanosoma brucei 10 4 1608198 237174 Unicellular eukaryotes NC_007278 GenBank
Trypanosoma brucei 10 5 1618915 395432 Unicellular eukaryotes NC_007279 GenBank
Trypanosoma brucei 10 6 1653225 238712 Unicellular eukaryotes NC_007276 GenBank
Trypanosoma brucei 10 7 2205233 287860 Unicellular eukaryotes NC_007280 GenBank
Trypanosoma brucei 10 8 2481190 308262 Unicellular eukaryotes NC_007281 GenBank
Trypanosoma brucei 10 9 3057547 600228 Unicellular eukaryotes NC_007282 GenBank
Trypanosoma brucei 10 10 4054025 270464 Unicellular eukaryotes NC_007283 GenBank
Schizosaccharomyces pombe 3 1 2452883 202792 Unicellular eukaryotes NC_003421 GenBank
Schizosaccharomyces pombe 3 2 4539804 203322 Unicellular eukaryotes NC_003423 GenBank
Schizosaccharomyces pombe 3 3 5579133 205978 Unicellular eukaryotes NC_003424 GenBank
Yarrowia lipolytica 6 1 2303261 189296 Unicellular eukaryotes NC_006067 GenBank
Yarrowia lipolytica 6 2 3066374 113698 Unicellular eukaryotes NC_006068 GenBank
Yarrowia lipolytica 6 3 3272609 176616 Unicellular eukaryotes NC_006069 GenBank
Yarrowia lipolytica 6 4 3633272 158520 Unicellular eukaryotes NC_006070 GenBank
Yarrowia lipolytica 6 5 4003362 151477 Unicellular eukaryotes NC_006072 GenBank
Yarrowia lipolytica 6 6 4224103 150134 Unicellular eukaryotes NC_006071 GenBank
Candida dubliniensis 8 1 1022435 158876 Unicellular eukaryotes NC_012866 GenBank
Candida dubliniensis 8 2 1073895 110505 Unicellular eukaryotes NC_012865 GenBank
Candida dubliniensis 8 3 1245899 128238 Unicellular eukaryotes NC_012864 GenBank
Candida dubliniensis 8 4 1641709 141325 Unicellular eukaryotes NC_012863 GenBank
Candida dubliniensis 8 5 1863824 159279 Unicellular eukaryotes NC_012862 GenBank

Candida dubliniensis 8 6 2267510 139607 Unicellular eukaryotes NC_012867 GenBank
Candida dubliniensis 8 7 2289089 170672 Unicellular eukaryotes NC_012861 GenBank
Candida dubliniensis 8 8 3214061 196306 Unicellular eukaryotes NC_012860 GenBank
Pichia stipitis 8 1 979380 95613 Unicellular eukaryotes NC_009048 GenBank
Pichia stipitis 8 2 1114415 109921 Unicellular eukaryotes NC_009047 GenBank
Pichia stipitis 8 3 1724953 79910 Unicellular eukaryotes NC_009046 GenBank
Pichia stipitis 8 4 1725948 66164 Unicellular eukaryotes NC_009045 GenBank
Pichia stipitis 8 5 1803401 122129 Unicellular eukaryotes NC_009044 GenBank
Pichia stipitis 8 6 1841851 113528 Unicellular eukaryotes NC_009043 GenBank
Pichia stipitis 8 7 2740984 145111 Unicellular eukaryotes NC_009042 GenBank
Pichia stipitis 8 8 3510247 198757 Unicellular eukaryotes NC_009068 GenBank
Cryptosporidium parvum 8 1 875659 37281 Unicellular eukaryotes NC_006980 GenBank
Cryptosporidium parvum 8 2 985969 43496 Unicellular eukaryotes NC_006981 GenBank
Cryptosporidium parvum 8 3 1080900 74067 Unicellular eukaryotes NC_006984 GenBank
Cryptosporidium parvum 8 4 1099352 47510 Unicellular eukaryotes NC_006982 GenBank
Cryptosporidium parvum 8 5 1104417 63269 Unicellular eukaryotes NC_006983 GenBank
Cryptosporidium parvum 8 6 1278458 36308 Unicellular eukaryotes NC_006986 GenBank
Cryptosporidium parvum 8 7 1332857 68524 Unicellular eukaryotes NC_006985 GenBank
Cryptosporidium parvum 8 8 1344712 62653 Unicellular eukaryotes NC_006987 GenBank
Aspergillus fumigatus 8 1 1833124 77932 Unicellular eukaryotes NC_007201 GenBank
Aspergillus fumigatus 8 2 2058334 129292 Unicellular eukaryotes NC_007200 GenBank
Aspergillus fumigatus 8 3 3778736 211007 Unicellular eukaryotes NC_007199 GenBank
Aspergillus fumigatus 8 4 3923705 152090 Unicellular eukaryotes NC_007197 GenBank
Aspergillus fumigatus 8 5 3948441 88168 Unicellular eukaryotes NC_007198 GenBank
Aspergillus fumigatus 8 6 4079167 166459 Unicellular eukaryotes NC_007196 GenBank
Aspergillus fumigatus 8 7 4844472 168566 Unicellular eukaryotes NC_007195 GenBank
Aspergillus fumigatus 8 8 4918979 113279 Unicellular eukaryotes NC_007194 GenBank
Encephalitozoon cuniculi 11 1 194439 13840 Unicellular eukaryotes NC_003230 GenBank
Encephalitozoon cuniculi 11 2 197426 22385 Unicellular eukaryotes NC_003229 GenBank
Encephalitozoon cuniculi 11 3 209982 63705 Unicellular eukaryotes NC_003242 GenBank
Encephalitozoon cuniculi 11 4 211018 19249 Unicellular eukaryotes NC_003232 GenBank
Encephalitozoon cuniculi 11 5 218329 26623 Unicellular eukaryotes NC_003231 GenBank
Encephalitozoon cuniculi 11 6 220294 27143 Unicellular eukaryotes NC_003233 GenBank
Encephalitozoon cuniculi 11 7 226576 20288 Unicellular eukaryotes NC_003234 GenBank
Encephalitozoon cuniculi 11 8 238147 29844 Unicellular eukaryotes NC_003235 GenBank
Encephalitozoon cuniculi 11 9 251002 15485 Unicellular eukaryotes NC_003238 GenBank

Encephalitozoon cuniculi 11 10 262797 24419 Unicellular eukaryotes NC_003236 GenBank
Encephalitozoon cuniculi 11 11 267509 26226 Unicellular eukaryotes NC_003237 GenBank
Cyanidioschyzon merolae 20 1 422618 NA Unicellular eukaryotes AP006483 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 2 457015 NA Unicellular eukaryotes AP006484 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 3 481793 NA Unicellular eukaryotes AP006485 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 4 513457 NA Unicellular eukaryotes AP006486 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 5 528684 NA Unicellular eukaryotes AP006487 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 6 536165 NA Unicellular eukaryotes AP006488 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 7 584454 NA Unicellular eukaryotes AP006489 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 8 739755 NA Unicellular eukaryotes AP006490 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 9 810153 NA Unicellular eukaryotes AP006491 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 10 839709 NA Unicellular eukaryotes AP006492 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 11 852729 NA Unicellular eukaryotes AP006496 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 12 852851 NA Unicellular eukaryotes AP006493 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 13 859121 NA Unicellular eukaryotes AP006494 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 14 866985 NA Unicellular eukaryotes AP006495 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 15 902902 NA Unicellular eukaryotes AP006497 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 16 908487 NA Unicellular eukaryotes AP006498 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 17 1232260 NA Unicellular eukaryotes AP006499 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 18 1253089 NA Unicellular eukaryotes AP006500 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 19 1282941 NA Unicellular eukaryotes AP006501 http://merolae.biol.s.u-tokyo.ac.jp/
Cyanidioschyzon merolae 20 20 1621619 NA Unicellular eukaryotes AP006502 http://merolae.biol.s.u-tokyo.ac.jp/
Pichia pastoris 4 1 1778296 9467 Unicellular eukaryotes NC_012966 GenBank
Pichia pastoris 4 2 2245428 13623 Unicellular eukaryotes NC_012965 GenBank
Pichia pastoris 4 3 2394163 9995 Unicellular eukaryotes NC_012964 GenBank
Pichia pastoris 4 4 2798491 14217 Unicellular eukaryotes NC_012963 GenBank
Cryptococcus neoformans 14 1 767333 85323 Unicellular eukaryotes NC_009190 GenBank
Cryptococcus neoformans 14 2 806693 42731 Unicellular eukaryotes NC_009189 GenBank
Cryptococcus neoformans 14 3 932832 29425 Unicellular eukaryotes NC_009188 GenBank
Cryptococcus neoformans 14 4 1063760 38692 Unicellular eukaryotes NC_009187 GenBank
Cryptococcus neoformans 14 5 1086008 39902 Unicellular eukaryotes NC_009186 GenBank
Cryptococcus neoformans 14 6 1131068 39994 Unicellular eukaryotes NC_009185 GenBank
Cryptococcus neoformans 14 7 1177453 40032 Unicellular eukaryotes NC_009184 GenBank
Cryptococcus neoformans 14 8 1411173 77571 Unicellular eukaryotes NC_009182 GenBank
Cryptococcus neoformans 14 9 1414206 59088 Unicellular eukaryotes NC_009183 GenBank
Cryptococcus neoformans 14 10 1461964 43872 Unicellular eukaryotes NC_009181 GenBank

Cryptococcus neoformans 14 11 1786351 71716 Unicellular eukaryotes NC_009180 GenBank
Cryptococcus neoformans 14 12 2079976 69839 Unicellular eukaryotes NC_009179 GenBank
Cryptococcus neoformans 14 13 2283892 47513 Unicellular eukaryotes NC_009178 GenBank
Cryptococcus neoformans 14 14 2297073 60251 Unicellular eukaryotes NC_009177 GenBank
Plasmodium vivax 14 1 755035 42293 Unicellular eukaryotes NC_009907 GenBank
Plasmodium vivax 14 2 830022 40884 Unicellular eukaryotes NC_009906 GenBank
Plasmodium vivax 14 3 876622 57760 Unicellular eukaryotes NC_009909 GenBank
Plasmodium vivax 14 4 1011127 56228 Unicellular eukaryotes NC_009908 GenBank
Plasmodium vivax 14 5 1033388 56499 Unicellular eukaryotes NC_009911 GenBank
Plasmodium vivax 14 6 1370936 77405 Unicellular eukaryotes NC_009910 GenBank
Plasmodium vivax 14 7 1419739 73600 Unicellular eukaryotes NC_009915 GenBank
Plasmodium vivax 14 8 1497819 69653 Unicellular eukaryotes NC_009912 GenBank
Plasmodium vivax 14 9 1678596 77390 Unicellular eukaryotes NC_009913 GenBank
Plasmodium vivax 14 10 1923364 87797 Unicellular eukaryotes NC_009914 GenBank
Plasmodium vivax 14 11 2031768 88027 Unicellular eukaryotes NC_009918 GenBank
Plasmodium vivax 14 12 2067354 91887 Unicellular eukaryotes NC_009916 GenBank
Plasmodium vivax 14 13 3004884 125151 Unicellular eukaryotes NC_009917 GenBank
Plasmodium vivax 14 14 3120417 147077 Unicellular eukaryotes NC_009919 GenBank
Chlamydomonas reinhardtii 17 1 2734619 497275 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 2 3005669 512573 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 3 3066274 731591 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 4 3366352 583184 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 5 4114342 675931 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 6 4189090 696675 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 7 4733070 741036 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 8 6170768 965613 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 9 6579462 908730 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 10 6588689 984938 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 11 6617689 958916 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 12 6673064 1029061 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 13 7695193 1086209 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 14 7773333 1125593 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 15 9355449 1278588 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 16 9975745 1412004 Unicellular eukaryotes Version 5.0 phytozome
Chlamydomonas reinhardtii 17 17 9982135 1359453 Unicellular eukaryotes Version 5.0 phytozome
Aspergillus niger 8 1 3050620 50082 Unicellular eukaryotes release 3 Ensembl Fungi

Aspergillus niger 8 2 3374671 47888 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus niger 8 3 3379275 37848 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus niger 8 4 4185054 92782 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus niger 8 5 4496558 85155 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus niger 8 6 4584456 66419 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus niger 8 7 5211011 101370 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus niger 8 8 6124123 90035 Unicellular eukaryotes release 3 Ensembl Fungi
Hemiselmis andersenii 3 1 179593 40072 Unicellular eukaryotes NC_009979 GenBank
Hemiselmis andersenii 3 2 184755 27991 Unicellular eukaryotes NC_009978 GenBank
Hemiselmis andersenii 3 3 207524 51579 Unicellular eukaryotes NC_009977 GenBank
Ostreococcus tauri 20 1 159539 7744 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 2 182106 2018 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 3 323548 9059 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 4 397082 11721 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 5 474136 12644 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 6 486289 12334 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 7 516206 15110 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 8 521582 14320 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 9 531849 13378 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 10 573623 16914 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 11 578475 15452 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 12 683751 17632 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 13 689208 16832 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 14 739027 17152 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 15 789111 18905 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 16 817477 19774 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 17 884654 22813 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 18 975458 24412 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 19 1056936 12740 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Ostreococcus tauri 20 20 1076297 28497 Unicellular eukaryotes ver 2.0 http://genome.jgi-psf.org/
Kluyveromyces lactis 6 1 1062590 23112 Unicellular eukaryotes NC_006037 GenBank
Kluyveromyces lactis 6 2 1320834 25466 Unicellular eukaryotes NC_006038 GenBank
Kluyveromyces lactis 6 3 1715506 64351 Unicellular eukaryotes NC_006040 GenBank
Kluyveromyces lactis 6 4 1753957 34158 Unicellular eukaryotes NC_006039 GenBank
Kluyveromyces lactis 6 5 2234072 29908 Unicellular eukaryotes NC_006041 GenBank
Kluyveromyces lactis 6 6 2602197 34015 Unicellular eukaryotes NC_006042 GenBank

Theileria parva 2 1 1971884 163418 Unicellular eukaryotes NC_007345 GenBank
Theileria parva 2 2 2540030 171254 Unicellular eukaryotes NC_007344 GenBank
Micromonas sp.RCC299 17 1 214782 5526 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 2 608929 23205 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 3 739136 23492 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 4 832468 28139 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 5 1011177 29345 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 6 1084119 29896 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 7 1145873 34133 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 8 1160640 37434 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 9 1260462 38791 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 10 1276107 62048 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 11 1394110 36753 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 12 1431126 43910 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 13 1518631 46481 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 14 1584431 50743 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 15 1759951 55127 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 16 1914325 77674 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Micromonas sp.RCC299 17 17 2053059 34482 Unicellular eukaryotes ver 3.0 http://genome.jgi-psf.org/
Leishmania braziliensis 35 1 235333 17448 Unicellular eukaryotes NC_009294 GenBank
Leishmania braziliensis 35 2 332537 74399 Unicellular eukaryotes NC_009295 GenBank
Leishmania braziliensis 35 3 388728 43319 Unicellular eukaryotes NC_009296 GenBank
Leishmania braziliensis 35 4 459355 59118 Unicellular eukaryotes NC_009298 GenBank
Leishmania braziliensis 35 5 466080 31582 Unicellular eukaryotes NC_009304 GenBank
Leishmania braziliensis 35 6 470112 92314 Unicellular eukaryotes NC_009300 GenBank
Leishmania braziliensis 35 7 475968 71247 Unicellular eukaryotes NC_009297 GenBank
Leishmania braziliensis 35 8 526431 40818 Unicellular eukaryotes NC_009276 GenBank
Leishmania braziliensis 35 9 556710 35756 Unicellular eukaryotes NC_009301 GenBank
Leishmania braziliensis 35 10 558624 66227 Unicellular eukaryotes NC_009303 GenBank
Leishmania braziliensis 35 11 586128 59526 Unicellular eukaryotes NC_009299 GenBank
Leishmania braziliensis 35 12 606948 37046 Unicellular eukaryotes NC_009307 GenBank
Leishmania braziliensis 35 13 612479 135034 Unicellular eukaryotes NC_009302 GenBank
Leishmania braziliensis 35 14 632473 45497 Unicellular eukaryotes NC_009305 GenBank
Leishmania braziliensis 35 15 654078 63432 Unicellular eukaryotes NC_009306 GenBank
Leishmania braziliensis 35 16 657844 68225 Unicellular eukaryotes NC_009314 GenBank
Leishmania braziliensis 35 17 694732 55134 Unicellular eukaryotes NC_009309 GenBank

Leishmania braziliensis 35 18 696353 83951 Unicellular eukaryotes NC_009308 GenBank
Leishmania braziliensis 35 19 708397 70773 Unicellular eukaryotes NC_009310 GenBank
Leishmania braziliensis 35 20 732538 46154 Unicellular eukaryotes NC_009313 GenBank
Leishmania braziliensis 35 21 754788 74721 Unicellular eukaryotes NC_009311 GenBank
Leishmania braziliensis 35 22 797353 74021 Unicellular eukaryotes NC_009315 GenBank
Leishmania braziliensis 35 23 847824 61621 Unicellular eukaryotes NC_009316 GenBank
Leishmania braziliensis 35 24 923860 83791 Unicellular eukaryotes NC_009317 GenBank
Leishmania braziliensis 35 25 1008282 50007 Unicellular eukaryotes NC_009318 GenBank
Leishmania braziliensis 35 26 1173595 67627 Unicellular eukaryotes NC_009320 GenBank
Leishmania braziliensis 35 27 1190324 103200 Unicellular eukaryotes NC_009321 GenBank
Leishmania braziliensis 35 28 1217395 266046 Unicellular eukaryotes NC_009319 GenBank
Leishmania braziliensis 35 29 1359875 144881 Unicellular eukaryotes NC_009322 GenBank
Leishmania braziliensis 35 30 1504429 143629 Unicellular eukaryotes NC_009325 GenBank
Leishmania braziliensis 35 31 1579615 100530 Unicellular eukaryotes NC_009324 GenBank
Leishmania braziliensis 35 32 1602271 188541 Unicellular eukaryotes NC_009323 GenBank
Leishmania braziliensis 35 33 1668259 178140 Unicellular eukaryotes NC_009312 GenBank
Leishmania braziliensis 35 34 2012684 189804 Unicellular eukaryotes NC_009326 GenBank
Leishmania braziliensis 35 35 2686796 180306 Unicellular eukaryotes NC_009327 GenBank
Aspergillus oryzae 8 1 2930238 57576 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus oryzae 8 2 3395394 99426 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus oryzae 8 3 4303536 106592 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus oryzae 8 4 4533892 76148 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus oryzae 8 5 4887243 93368 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus oryzae 8 6 5123747 132846 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus oryzae 8 7 6264705 101080 Unicellular eukaryotes release 3 Ensembl Fungi
Aspergillus oryzae 8 8 6513434 150734 Unicellular eukaryotes release 3 Ensembl Fungi
Debaryomyces hansenii 7 1 1249565 33131 Unicellular eukaryotes NC_006043 GenBank
Debaryomyces hansenii 7 2 1349926 49436 Unicellular eukaryotes NC_006044 GenBank
Debaryomyces hansenii 7 3 1592360 58523 Unicellular eukaryotes NC_006045 GenBank
Debaryomyces hansenii 7 4 1602771 57291 Unicellular eukaryotes NC_006046 GenBank
Debaryomyces hansenii 7 5 2037969 42787 Unicellular eukaryotes NC_006047 GenBank
Debaryomyces hansenii 7 6 2051428 44173 Unicellular eukaryotes NC_006049 GenBank
Debaryomyces hansenii 7 7 2336804 67445 Unicellular eukaryotes NC_006048 GenBank
Ashbya gossypii 7 1 691920 7268 Unicellular eukaryotes NC_005782 GenBank
Ashbya gossypii 7 2 867696 9525 Unicellular eukaryotes NC_005783 GenBank
Ashbya gossypii 7 3 907057 9210 Unicellular eukaryotes NC_005784 GenBank

Table S1. Chromosome size of each species

Species | Chromosome number | Chromosome index | Chromosome size | Size of masked | Common name | Accession/version | Source

Ashbya gossypii | 7 | 4 | 1466912 | 14571 | Unicellular eukaryotes | NC_005785 | GenBank
Ashbya gossypii | 7 | 5 | 1476507 | 13456 | Unicellular eukaryotes | NC_005788 | GenBank
Ashbya gossypii | 7 | 6 | 1519138 | 13881 | Unicellular eukaryotes | NC_005786 | GenBank
Ashbya gossypii | 7 | 7 | 1813154 | 22837 | Unicellular eukaryotes | NC_005787 | GenBank
Aspergillus nidulans | 8 | 1 | 2887738 | 35003 | Unicellular eukaryotes | release 3 | Ensembl Fungi
Aspergillus nidulans | 8 | 2 | 3403833 | 58921 | Unicellular eukaryotes | release 3 | Ensembl Fungi
Aspergillus nidulans | 8 | 3 | 3407944 | 48427 | Unicellular eukaryotes | release 3 | Ensembl Fungi
Aspergillus nidulans | 8 | 4 | 3470996 | 47204 | Unicellular eukaryotes | release 3 | Ensembl Fungi
Aspergillus nidulans | 8 | 5 | 3759208 | 39518 | Unicellular eukaryotes | release 3 | Ensembl Fungi
Aspergillus nidulans | 8 | 6 | 4070061 | 69440 | Unicellular eukaryotes | release 3 | Ensembl Fungi
Aspergillus nidulans | 8 | 7 | 4550219 | 54780 | Unicellular eukaryotes | release 3 | Ensembl Fungi
Aspergillus nidulans | 8 | 8 | 4934093 | 58855 | Unicellular eukaryotes | release 3 | Ensembl Fungi
Guillardia theta | 3 | 1 | 174133 | 53125 | Unicellular eukaryotes | NC_002751 | GenBank
Guillardia theta | 3 | 2 | 180915 | 49522 | Unicellular eukaryotes | NC_002753 | GenBank
Guillardia theta | 3 | 3 | 196216 | 53304 | Unicellular eukaryotes | NC_002752 | GenBank
Candida glabrata | 13 | 1 | 485192 | 26930 | Unicellular eukaryotes | NC_005967 | GenBank
Candida glabrata | 13 | 2 | 502101 | 9460 | Unicellular eukaryotes | NC_005968 | GenBank
Candida glabrata | 13 | 3 | 558804 | 18930 | Unicellular eukaryotes | NC_006026 | GenBank
Candida glabrata | 13 | 4 | 651701 | 15438 | Unicellular eukaryotes | NC_006027 | GenBank
Candida glabrata | 13 | 5 | 687501 | 20482 | Unicellular eukaryotes | NC_006028 | GenBank
Candida glabrata | 13 | 6 | 927101 | 12033 | Unicellular eukaryotes | NC_006029 | GenBank
Candida glabrata | 13 | 7 | 992211 | 21648 | Unicellular eukaryotes | NC_006030 | GenBank
Candida glabrata | 13 | 8 | 1050361 | 22756 | Unicellular eukaryotes | NC_006031 | GenBank
Candida glabrata | 13 | 9 | 1089401 | 35969 | Unicellular eukaryotes | NC_006032 | GenBank
Candida glabrata | 13 | 10 | 1192501 | 24919 | Unicellular eukaryotes | NC_006033 | GenBank
Candida glabrata | 13 | 11 | 1302002 | 16765 | Unicellular eukaryotes | NC_006034 | GenBank
Candida glabrata | 13 | 12 | 1400893 | 14997 | Unicellular eukaryotes | NC_006036 | GenBank
Candida glabrata | 13 | 13 | 1440588 | 31762 | Unicellular eukaryotes | NC_006035 | GenBank
Apis mellifera | 16 | 1 | 5631066 | 560228 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 2 | 7856270 | 667491 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 3 | 8318479 | 703443 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 4 | 8929068 | 607865 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 5 | 9182753 | 706274 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 6 | 9974240 | 800933 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 7 | 10282195 | 815635 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 8 | 10796202 | 864353 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 9 | 11440700 | 940717 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 10 | 11452794 | 1004144 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 11 | 12341916 | 975384 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 12 | 12576330 | 1165475 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 13 | 13386189 | 1181538 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 14 | 14465785 | 1113401 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 15 | 14581788 | 1116006 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Apis mellifera | 16 | 16 | 25854376 | 2026304 | Invertebrates | Amel_1.2 | http://genome.ucsc.edu/
Caenorhabditis elegans | 5 | 1 | 13783682 | 3560038 | Invertebrates | release 3 | Ensembl Metazoa
Caenorhabditis elegans | 5 | 2 | 15072421 | 3791671 | Invertebrates | release 3 | Ensembl Metazoa
Caenorhabditis elegans | 5 | 3 | 15279324 | 3287902 | Invertebrates | release 3 | Ensembl Metazoa
Caenorhabditis elegans | 5 | 4 | 17493784 | 3576111 | Invertebrates | release 3 | Ensembl Metazoa
Caenorhabditis elegans | 5 | 5 | 20924143 | 4365232 | Invertebrates | release 3 | Ensembl Metazoa
Tribolium castaneum | 10 | 1 | 10877635 | 998770 | Invertebrates | version 3 | http://beetlebase.org/
Tribolium castaneum | 10 | 2 | 11386040 | 2714935 | Invertebrates | version 3 | http://beetlebase.org/
Tribolium castaneum | 10 | 3 | 13176827 | 2856528 | Invertebrates | version 3 | http://beetlebase.org/
Tribolium castaneum | 10 | 4 | 13894384 | 2129500 | Invertebrates | version 3 | http://beetlebase.org/
Tribolium castaneum | 10 | 5 | 18021898 | 3770606 | Invertebrates | version 3 | http://beetlebase.org/
Tribolium castaneum | 10 | 6 | 19135781 | 2455203 | Invertebrates | version 3 | http://beetlebase.org/
Tribolium castaneum | 10 | 7 | 20218415 | 2840230 | Invertebrates | version 3 | http://beetlebase.org/
Tribolium castaneum | 10 | 8 | 20532854 | 2716654 | Invertebrates | version 3 | http://beetlebase.org/
Tribolium castaneum | 10 | 9 | 22083773 | 4098602 | Invertebrates | version 3 | http://beetlebase.org/
Tribolium castaneum | 10 | 10 | 38791480 | 8318253 | Invertebrates | version 3 | http://beetlebase.org/
Anopheles gambiae | 2 | 1 | 95164119 | 12543353 | Invertebrates | release 3 | Ensembl Metazoa
Anopheles gambiae | 2 | 2 | 110909430 | 14036087 | Invertebrates | release 3 | Ensembl Metazoa
Drosophila melanogaster | 2 | 1 | 44158252 | 4132861 | Invertebrates | release 56 | Ensembl
Drosophila melanogaster | 2 | 2 | 52448610 | 3997380 | Invertebrates | release 56 | Ensembl
Oryza sativa | 12 | 1 | 23011239 | 9278025 | Vascular plants | release 3 | Ensembl Plants
Oryza sativa | 12 | 2 | 23134759 | 9846188 | Vascular plants | release 3 | Ensembl Plants
Oryza sativa | 12 | 3 | 27497214 | 11759026 | Vascular plants | release 3 | Ensembl Plants
Oryza sativa | 12 | 4 | 28439308 | 11781663 | Vascular plants | release 3 | Ensembl Plants
Oryza sativa | 12 | 5 | 28512666 | 11762131 | Vascular plants | release 3 | Ensembl Plants
Oryza sativa | 12 | 6 | 29696629 | 11806881 | Vascular plants | release 3 | Ensembl Plants
Oryza sativa | 12 | 7 | 29894789 | 11907898 | Vascular plants | release 3 | Ensembl Plants
Oryza sativa | 12 | 8 | 31246789 | 12163055 | Vascular plants | release 3 | Ensembl Plants
Oryza sativa | 12 | 9 | 35278225 | 14983280 | Vascular plants | release 3 | Ensembl Plants
Oryza sativa | 12 | 10 | 35930381 | 11538468 | Vascular plants | release 3 | Ensembl Plants
Oryza sativa | 12 | 11 | 36406689 | 11095915 | Vascular plants | release 3 | Ensembl Plants
Oryza sativa | 12 | 12 | 43268879 | 14113097 | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 1 | 6060117 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 2 | 10599685 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 3 | 12003701 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 4 | 12525049 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 5 | 12805987 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 6 | 13101108 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 7 | 13470992 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 8 | 13661513 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 9 | 14142880 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 10 | 14699529 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 11 | 15120528 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 12 | 16228216 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 13 | 16625654 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 14 | 17991592 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 15 | 18519121 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 16 | 19129466 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 17 | 21101489 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 18 | 24482572 | NA | Vascular plants | release 3 | Ensembl Plants
Populus trichocarpa | 19 | 19 | 35571569 | NA | Vascular plants | release 3 | Ensembl Plants
Glycine max | 20 | 1 | 37397385 | 15369516 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 2 | 39172790 | 15500803 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 3 | 40113140 | 17142180 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 4 | 41906774 | 16371661 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 5 | 41936504 | 18430250 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 6 | 44408971 | 12028821 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 7 | 44683157 | 18310711 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 8 | 46773167 | 22001791 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 9 | 46843750 | 19494500 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 10 | 46995532 | 15880423 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 11 | 47781076 | 20770716 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 12 | 49243852 | 23434235 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 13 | 49711204 | 25117580 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 14 | 50589441 | 23682927 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 15 | 50722821 | 20276260 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 16 | 50939160 | 22532681 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 17 | 50969635 | 21858382 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 18 | 51656713 | 21884981 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 19 | 55915595 | 28110586 | Vascular plants | Version 5.0 | phytozome
Glycine max | 20 | 20 | 62308140 | 29742899 | Vascular plants | Version 5.0 | phytozome
Zea mays | 10 | 1 | 149686045 | 124359456 | Vascular plants | AGPv1 | http://www.maizesequence.org
Zea mays | 10 | 2 | 152350485 | 124833224 | Vascular plants | AGPv1 | http://www.maizesequence.org
Zea mays | 10 | 3 | 169254300 | 139412678 | Vascular plants | AGPv1 | http://www.maizesequence.org
Zea mays | 10 | 4 | 170974187 | 141749061 | Vascular plants | AGPv1 | http://www.maizesequence.org
Zea mays | 10 | 5 | 174515299 | 143555046 | Vascular plants | AGPv1 | http://www.maizesequence.org
Zea mays | 10 | 6 | 216915529 | 178020165 | Vascular plants | AGPv1 | http://www.maizesequence.org
Zea mays | 10 | 7 | 230558137 | 191038936 | Vascular plants | AGPv1 | http://www.maizesequence.org
Zea mays | 10 | 8 | 234752839 | 192414682 | Vascular plants | AGPv1 | http://www.maizesequence.org
Zea mays | 10 | 9 | 247095508 | 206959994 | Vascular plants | AGPv1 | http://www.maizesequence.org
Zea mays | 10 | 10 | 300239041 | 245410064 | Vascular plants | AGPv1 | http://www.maizesequence.org
Cucumis sativus | 7 | 1 | 17451012 | 2834067 | Vascular plants | | http://cucumber.genomics.org.cn
Cucumis sativus | 7 | 2 | 22074575 | 3825441 | Vascular plants | | http://cucumber.genomics.org.cn
Cucumis sativus | 7 | 3 | 22393233 | 3263355 | Vascular plants | | http://cucumber.genomics.org.cn
Cucumis sativus | 7 | 4 | 26004697 | 4502911 | Vascular plants | | http://cucumber.genomics.org.cn
Cucumis sativus | 7 | 5 | 26767080 | 3472343 | Vascular plants | | http://cucumber.genomics.org.cn
Cucumis sativus | 7 | 6 | 28472644 | 5613919 | Vascular plants | | http://cucumber.genomics.org.cn
Cucumis sativus | 7 | 7 | 34138124 | 5292404 | Vascular plants | | http://cucumber.genomics.org.cn
Lotus japonicus | 6 | 1 | 62850758 | NA | Vascular plants | | http://www.kazusa.or.jp/lotus/
Lotus japonicus | 6 | 2 | 68429861 | NA | Vascular plants | | http://www.kazusa.or.jp/lotus/
Lotus japonicus | 6 | 3 | 72958068 | NA | Vascular plants | | http://www.kazusa.or.jp/lotus/
Lotus japonicus | 6 | 4 | 81008927 | NA | Vascular plants | | http://www.kazusa.or.jp/lotus/
Lotus japonicus | 6 | 5 | 87261079 | NA | Vascular plants | | http://www.kazusa.or.jp/lotus/
Lotus japonicus | 6 | 6 | 89987728 | NA | Vascular plants | | http://www.kazusa.or.jp/lotus/
Arabidopsis thaliana | 5 | 1 | 18585056 | 3204453 | Vascular plants | Version 5.0 | phytozome
Arabidopsis thaliana | 5 | 2 | 19698289 | 3531081 | Vascular plants | Version 5.0 | phytozome
Arabidopsis thaliana | 5 | 3 | 23459830 | 3731299 | Vascular plants | Version 5.0 | phytozome
Arabidopsis thaliana | 5 | 4 | 26975502 | 3692374 | Vascular plants | Version 5.0 | phytozome
Arabidopsis thaliana | 5 | 5 | 30427671 | 3677058 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 1 | 7693613 | 1010455 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 2 | 8158851 | 1147740 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 3 | 9647040 | 1287749 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 4 | 10186927 | 1079990 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 5 | 13059092 | 1442234 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 6 | 13936303 | 1669210 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 7 | 14071813 | 1787559 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 8 | 15191948 | 1730361 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 9 | 15233747 | 1472677 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 10 | 15630816 | 1760556 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 11 | 16532244 | 2296979 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 12 | 17603400 | 2091740 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 13 | 18540817 | 2420756 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 14 | 19293076 | 2252161 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 15 | 19480434 | 2359440 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 16 | 19691255 | 2127811 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 17 | 21557227 | 2391938 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 18 | 23428299 | 2938153 | Vascular plants | Version 5.0 | phytozome
Vitis vinifera | 19 | 19 | 24148918 | 2954886 | Vascular plants | Version 5.0 | phytozome
Medicago truncatula | 8 | 1 | 23064727 | 12769870 | Vascular plants | MT3 | http://www.medicago.org
Medicago truncatula | 8 | 2 | 32618758 | 16800964 | Vascular plants | MT3 | http://www.medicago.org
Medicago truncatula | 8 | 3 | 33214707 | 12795208 | Vascular plants | MT3 | http://www.medicago.org
Medicago truncatula | 8 | 4 | 35192239 | 19028184 | Vascular plants | MT3 | http://www.medicago.org
Medicago truncatula | 8 | 5 | 37356654 | 16376022 | Vascular plants | MT3 | http://www.medicago.org
Medicago truncatula | 8 | 6 | 41977796 | 19371038 | Vascular plants | MT3 | http://www.medicago.org
Medicago truncatula | 8 | 7 | 43208671 | 24944520 | Vascular plants | MT3 | http://www.medicago.org
Medicago truncatula | 8 | 8 | 44345767 | 22954162 | Vascular plants | MT3 | http://www.medicago.org
Brachypodium distachyon | 5 | 1 | 28444383 | 4207118 | Vascular plants | release 3 | Ensembl Plants
Brachypodium distachyon | 5 | 2 | 48648102 | 6239395 | Vascular plants | release 3 | Ensembl Plants
Brachypodium distachyon | 5 | 3 | 59328898 | 7061965 | Vascular plants | release 3 | Ensembl Plants
Brachypodium distachyon | 5 | 4 | 59892396 | 7367832 | Vascular plants | release 3 | Ensembl Plants
Brachypodium distachyon | 5 | 5 | 74834646 | 8262237 | Vascular plants | release 3 | Ensembl Plants
Arabidopsis lyrata | 8 | 1 | 19320864 | 2658532 | Vascular plants | release 30 | Gramene
Arabidopsis lyrata | 8 | 2 | 21221946 | 3487537 | Vascular plants | release 30 | Gramene
Arabidopsis lyrata | 8 | 3 | 22951293 | 3576316 | Vascular plants | release 30 | Gramene
Arabidopsis lyrata | 8 | 4 | 23328337 | 3859388 | Vascular plants | release 30 | Gramene
Arabidopsis lyrata | 8 | 5 | 24464547 | 3822636 | Vascular plants | release 30 | Gramene
Arabidopsis lyrata | 8 | 6 | 24649197 | 3412201 | Vascular plants | release 30 | Gramene
Arabidopsis lyrata | 8 | 7 | 25113588 | 3688102 | Vascular plants | release 30 | Gramene
Arabidopsis lyrata | 8 | 8 | 33132539 | 5414106 | Vascular plants | release 30 | Gramene
Sorghum bicolor | 10 | 1 | 55460251 | 36649989 | Vascular plants | Version 5.0 | phytozome
Sorghum bicolor | 10 | 2 | 59635592 | 37510085 | Vascular plants | Version 5.0 | phytozome
Sorghum bicolor | 10 | 3 | 60981646 | 37541262 | Vascular plants | Version 5.0 | phytozome
Sorghum bicolor | 10 | 4 | 62208784 | 41899294 | Vascular plants | Version 5.0 | phytozome
Sorghum bicolor | 10 | 5 | 62352331 | 41165852 | Vascular plants | Version 5.0 | phytozome
Sorghum bicolor | 10 | 6 | 64342021 | 43361627 | Vascular plants | Version 5.0 | phytozome
Sorghum bicolor | 10 | 7 | 68034345 | 38973699 | Vascular plants | Version 5.0 | phytozome
Sorghum bicolor | 10 | 8 | 73840631 | 32855405 | Vascular plants | Version 5.0 | phytozome
Sorghum bicolor | 10 | 9 | 74441160 | 44115093 | Vascular plants | Version 5.0 | phytozome
Sorghum bicolor | 10 | 10 | 77932606 | 47655731 | Vascular plants | Version 5.0 | phytozome
Gasterosteus aculeatus | 21 | 1 | 11912779 | 580210 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 2 | 12455587 | 631668 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 3 | 14846527 | 707665 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 4 | 15500569 | 788661 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 5 | 15918398 | 915417 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 6 | 16468744 | 804482 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 7 | 16554095 | 931137 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 8 | 16984487 | 977598 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 9 | 17078482 | 836829 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 10 | 17368403 | 853656 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 11 | 18417718 | 893501 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 12 | 18707752 | 895093 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 13 | 19691516 | 1002263 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 14 | 20060939 | 1022719 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 15 | 20417849 | 924290 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 16 | 20578005 | 1074920 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 17 | 20586971 | 1147873 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 18 | 23683913 | 1203386 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 19 | 28403068 | 1435607 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 20 | 28655680 | 1551586 | Vertebrates | release 56 | Ensembl
Gasterosteus aculeatus | 21 | 21 | 33176831 | 1926017 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 1 | 46535552 | 14905191 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 2 | 48394510 | 15719376 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 3 | 60714840 | 29554854 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 4 | 62736349 | 29298016 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 5 | 73212453 | 32798058 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 6 | 77800216 | 35413328 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 7 | 94050890 | 34576633 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 8 | 99152023 | 37270321 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 9 | 108868599 | 43591898 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 10 | 117095149 | 45431952 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 11 | 132107971 | 62658596 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 12 | 133410057 | 60245521 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 13 | 135191526 | 54652270 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 14 | 136387465 | 66451024 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 15 | 153482349 | 70724961 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 16 | 157549271 | 72153940 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 17 | 174210431 | 80513987 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 18 | 183952662 | 86453400 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 19 | 198332218 | 94802024 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 20 | 202140232 | 94814409 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 21 | 229942017 | 109613037 | Vertebrates | release 56 | Ensembl
Pongo pygmaeus | 22 | 22 | 248028950 | 111821154 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 1 | 48129895 | 17307014 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 2 | 51304566 | 17759462 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 3 | 59128983 | 33255288 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 4 | 63025520 | 30707360 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 5 | 78077248 | 35867759 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 6 | 81195210 | 39140404 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 7 | 90354753 | 40811440 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 8 | 102531392 | 40846827 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 9 | 107349540 | 45035542 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 10 | 115169878 | 46759363 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 11 | 133851895 | 68679734 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 12 | 135006516 | 68048404 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 13 | 135534747 | 65197252 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 14 | 141213431 | 61626897 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 15 | 146364022 | 73346742 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 16 | 159138663 | 79527395 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 17 | 171115067 | 83947003 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 18 | 180915260 | 90618850 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 19 | 191154276 | 97311995 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 20 | 198022430 | 99363582 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 21 | 243199373 | 117536732 | Vertebrates | release 56 | Ensembl
Homo sapiens | 22 | 22 | 249250621 | 116296536 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 1 | 64391591 | 27153659 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 2 | 73567989 | 30801461 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 3 | 78773432 | 33361967 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 4 | 88221753 | 33420073 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 5 | 94452569 | 40448286 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 6 | 94855758 | 41592536 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 7 | 106505843 | 45084614 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 8 | 110119387 | 49283503 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 9 | 133002572 | 57301904 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 10 | 133323859 | 56892348 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 11 | 134511895 | 62523721 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 12 | 138028943 | 59246528 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 13 | 147794981 | 65754900 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 14 | 167655696 | 74539591 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 15 | 169801366 | 74965994 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 16 | 178205221 | 81262830 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 17 | 182086969 | 84317781 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 18 | 189746636 | 85529450 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 19 | 196418989 | 82091205 | Vertebrates | release 56 | Ensembl
Macaca mulatta | 20 | 20 | 228252215 | 102602959 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 1 | 26897727 | 8985686 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 2 | 29542582 | 10601983 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 3 | 33840356 | 11537590 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 4 | 33915115 | 11325733 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 5 | 34424479 | 12584215 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 6 | 41731424 | 16581676 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 7 | 42029645 | 15377197 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 8 | 42263495 | 14876989 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 9 | 43206070 | 16418056 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 10 | 44191819 | 15354188 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 11 | 44831629 | 17457462 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 12 | 45128234 | 15710761 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 13 | 48908698 | 19122467 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 14 | 50763139 | 19352454 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 15 | 54024781 | 22063208 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 16 | 54563659 | 20696873 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 17 | 55389570 | 21948502 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 18 | 56771304 | 22506270 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 19 | 58872314 | 21845328 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 20 | 61280721 | 22948046 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 21 | 62570175 | 23060064 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 22 | 63938239 | 23994531 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 23 | 64401119 | 23479609 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 24 | 64418924 | 22227365 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 25 | 66182471 | 25029211 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 26 | 67211953 | 26514773 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 27 | 67347617 | 25243269 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 28 | 72488556 | 27179968 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 29 | 75515492 | 29706769 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 30 | 77315194 | 30189543 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 31 | 77416458 | 30313813 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 32 | 80642250 | 31309156 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 33 | 83999179 | 32088946 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 34 | 88410189 | 34318504 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 35 | 91483860 | 35267524 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 36 | 91976430 | 33633391 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 37 | 94715083 | 36932751 | Vertebrates | release 56 | Ensembl
Canis familiaris | 38 | 38 | 125616256 | 48286780 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 1 | 46782294 | 17803196 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 2 | 55268282 | 21431472 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 3 | 59218465 | 20992658 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 4 | 87265094 | 34865832 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 5 | 87759784 | 35979061 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 6 | 90238779 | 35068142 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 7 | 97296363 | 37642740 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 8 | 109758846 | 44146916 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 9 | 110718848 | 39035935 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 10 | 111154910 | 49794100 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 11 | 112194335 | 45263347 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 12 | 113440463 | 44683542 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 13 | 129041809 | 50572244 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 14 | 143002779 | 57899931 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 15 | 147636619 | 59213662 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 16 | 171063335 | 66958384 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 17 | 173096209 | 73774946 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 18 | 187126005 | 78660372 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 19 | 258207540 | 112116263 | Vertebrates | release 56 | Ensembl
Rattus norvegicus | 20 | 20 | 267910886 | 108914730 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 1 | 61342430 | 23398425 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 2 | 90772031 | 36767774 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 3 | 95272651 | 39650232 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 4 | 98319150 | 39765092 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 5 | 103494974 | 41710558 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 6 | 120284312 | 49636179 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 7 | 121257530 | 49733254 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 8 | 121843856 | 46889597 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 9 | 124076172 | 50297410 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 10 | 125194864 | 51779602 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 11 | 129993255 | 54146030 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 12 | 131738871 | 51147641 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 13 | 149517037 | 62897400 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 14 | 152524553 | 65012401 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 15 | 152537259 | 62680184 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 16 | 155630120 | 67512083 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 17 | 159599783 | 69981834 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 18 | 181748087 | 72993988 | Vertebrates | release 56 | Ensembl
Mus musculus | 19 | 19 | 197195432 | 84195582 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 1 | 24984650 | 8845277 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 2 | 30062385 | 11830741 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 3 | 33091231 | 12460263 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 4 | 33672925 | 12950838 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 5 | 39536964 | 15526785 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 6 | 39960074 | 15261397 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 7 | 41866177 | 15330286 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 8 | 42578167 | 17102728 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 9 | 46177339 | 18488085 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 10 | 46749900 | 17869246 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 11 | 49946797 | 20489901 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 12 | 55726280 | 21785924 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 13 | 57723302 | 21357169 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 14 | 59975221 | 23336644 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 15 | 61308211 | 22635829 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 16 | 64166202 | 24911456 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 17 | 80757907 | 31094294 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 18 | 82527541 | 32089849 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 19 | 83561422 | 34364106 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 20 | 83980604 | 33536116 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 21 | 84719076 | 32096651 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 22 | 87365405 | 34875557 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 23 | 91571448 | 36281913 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 24 | 93904894 | 37180356 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 25 | 94057673 | 35988209 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 26 | 98542428 | 40194615 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 27 | 99680356 | 39375286 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 28 | 108569075 | 41484314 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 29 | 119479920 | 47049332 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 30 | 120857687 | 48760922 | Vertebrates | release 56 | Ensembl
Equus caballus | 31 | 31 | 185838109 | 73691659 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 1 | 23451325 | 5674725 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 2 | 24050845 | 6617420 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 3 | 24165179 | 5932995 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 4 | 24728221 | 6352401 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 5 | 25865442 | 6514741 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 6 | 26576615 | 7172127 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 7 | 27595823 | 7138248 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 8 | 28810691 | 7623521 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 9 | 29412213 | 7614985 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 10 | 29492121 | 7751325 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 11 | 29908082 | 7624037 | Vertebrates | release 56 | Ensembl

Oryzias latipes | 24 | 12 | 30000224 | 7436043 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 13 | 30014384 | 7606795 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 14 | 30041894 | 8174482 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 15 | 31118443 | 6457700 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 16 | 31848461 | 8694029 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 17 | 31883787 | 8434951 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 18 | 33213694 | 8513601 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 19 | 33409148 | 9034877 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 20 | 33607196 | 7670608 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 21 | 33792114 | 8777955 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 22 | 34636364 | 9306821 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 23 | 36623554 | 9568359 | Vertebrates | release 56 | Ensembl
Oryzias latipes | 24 | 24 | 39973033 | 9232618 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 1 | 38768535 | 20617827 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 2 | 40403431 | 20936303 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 3 | 41415389 | 21719042 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 4 | 43467561 | 22819133 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 5 | 44116856 | 23010149 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 6 | 44714728 | 23475745 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 7 | 46853116 | 24304795 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 8 | 47237297 | 24639688 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 9 | 47572505 | 25122334 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 10 | 48708673 | 25705576 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 11 | 49271716 | 26331775 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 12 | 49469313 | 25587761 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 13 | 50748729 | 26144623 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 14 | 51884995 | 27275236 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 15 | 51890894 | 27437560 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 16 | 52930158 | 28153498 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 17 | 54736511 | 27851215 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 18 | 55568185 | 29219669 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 19 | 58009534 | 30923681 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 20 | 59305620 | 29810724 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 21 | 60907308 | 31890338 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 22 | 61647013 | 31842833 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 23 | 71658100 | 36442692 | Vertebrates | release 56 | Ensembl


Danio rerio | 25 | 24 | 74451498 | 38789671 | Vertebrates | release 56 | Ensembl
Danio rerio | 25 | 25 | 76918211 | 39652101 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 1 | 54314914 | 20507684 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 2 | 57436344 | 22827936 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 3 | 64400339 | 27207639 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 4 | 66741929 | 27284914 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 5 | 77440658 | 31017871 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 6 | 79819395 | 30700760 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 7 | 100521970 | 41205573 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 8 | 119990671 | 48630104 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 9 | 123310171 | 51201606 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 10 | 123604780 | 50308337 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 11 | 132473591 | 53265465 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 12 | 134546103 | 52847068 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 13 | 136259946 | 55816880 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 14 | 136414062 | 54857749 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 15 | 140138492 | 58621549 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 16 | 145240301 | 59720719 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 17 | 148515138 | 61829418 | Vertebrates | release 56 | Ensembl
Sus scrofa | 18 | 18 | 295534705 | 121258987 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 1 | 44060403 | 18399229 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 2 | 46084206 | 20112685 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 3 | 48749334 | 19642239 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 4 | 51750746 | 21330165 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 5 | 51998940 | 22074747 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 6 | 53376148 | 21157873 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 7 | 61848140 | 25293552 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 8 | 65020233 | 26764479 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 9 | 65312493 | 24992200 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 10 | 66141439 | 26588657 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 11 | 69173390 | 30559240 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 12 | 75796353 | 33275751 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 13 | 76506943 | 33455537 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 14 | 77906053 | 33268073 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 15 | 81345643 | 35646570 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 16 | 84419198 | 35223295 | Vertebrates | release 56 | Ensembl


Bos taurus | 29 | 17 | 84633453 | 38371189 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 18 | 85358539 | 36705489 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 19 | 106383598 | 46351905 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 20 | 108145351 | 47433629 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 21 | 110171769 | 47002482 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 22 | 112078216 | 49211769 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 23 | 116942821 | 52028224 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 24 | 122561022 | 55452230 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 25 | 124454208 | 54023641 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 26 | 125847759 | 55840979 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 27 | 127923604 | 57838407 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 28 | 140800416 | 62566711 | Vertebrates | release 56 | Ensembl
Bos taurus | 29 | 29 | 161106243 | 72362486 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 1 | 46489110 | 15177344 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 2 | 50165558 | 15703477 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 3 | 62293572 | 28589641 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 4 | 64473437 | 29954174 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 5 | 77261746 | 33784519 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 6 | 83384210 | 35225908 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 7 | 90682376 | 36845361 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 8 | 100063422 | 36748448 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 9 | 107349158 | 41743456 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 10 | 114460064 | 50184514 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 11 | 115868456 | 40719720 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 12 | 134204764 | 60510676 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 13 | 135001995 | 59589537 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 14 | 135371336 | 65185416 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 15 | 138509991 | 52937127 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 16 | 145085868 | 67938792 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 17 | 160261443 | 73304574 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 18 | 173908612 | 78328581 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 19 | 183994906 | 84980788 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 20 | 194897272 | 92749395 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 21 | 203962478 | 94663731 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 22 | 229974691 | 106527844 | Vertebrates | release 56 | Ensembl
Pan troglodytes | 23 | 23 | 248603653 | 59379589 | Vertebrates | release 56 | Ensembl


Monodelphis domestica | 8 | 1 | 260857928 | 73211492 | Vertebrates | release 56 | Ensembl
Monodelphis domestica | 8 | 2 | 292091736 | 86063930 | Vertebrates | release 56 | Ensembl
Monodelphis domestica | 8 | 3 | 304825324 | 88330774 | Vertebrates | release 56 | Ensembl
Monodelphis domestica | 8 | 4 | 312544902 | 89156483 | Vertebrates | release 56 | Ensembl
Monodelphis domestica | 8 | 5 | 435153693 | 127165175 | Vertebrates | release 56 | Ensembl
Monodelphis domestica | 8 | 6 | 527952102 | 154689147 | Vertebrates | release 56 | Ensembl
Monodelphis domestica | 8 | 7 | 541556283 | 160980954 | Vertebrates | release 56 | Ensembl
Monodelphis domestica | 8 | 8 | 748055161 | 226174860 | Vertebrates | release 56 | Ensembl

Gallus gallus¹ | 30 | 1 | 50677 | 30568 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 2 | 440200 | 217808 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 3 | 910158 | 132686 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 4 | 2065663 | 265662 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 5 | 4002184 | 522490 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 6 | 4587227 | 591216 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 7 | 4922670 | 928412 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 8 | 5187479 | 454056 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 9 | 6142921 | 641312 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 10 | 6506778 | 702646 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 11 | 7075637 | 771210 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 12 | 10105386 | 921592 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 13 | 11107349 | 1222846 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 14 | 11368902 | 1094186 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 15 | 13184302 | 1188460 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 16 | 14219339 | 1441770 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 17 | 16083127 | 1630942 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 18 | 19227133 | 1975202 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 19 | 20878966 | 2081878 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 20 | 22293564 | 2245172 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 21 | 22932373 | 2145550 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 22 | 25980258 | 2514876 | Vertebrates | release 56 | Ensembl


Gallus gallus¹ | 30 | 23 | 31182925 | 3782810 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 24 | 38023783 | 4713534 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 25 | 39024516 | 4444580 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 26 | 63276247 | 9282366 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 27 | 95800909 | 16865954 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 28 | 115552086 | 22898702 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 29 | 157454997 | 37749844 | Vertebrates | release 56 | Ensembl
Gallus gallus¹ | 30 | 30 | 204343916 | 56191352 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 1 | 16690 | 20940 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 2 | 111571 | 51244 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 3 | 898088 | 322784 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 4 | 1101542 | 560362 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 5 | 5045922 | 1556090 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 6 | 6078790 | 1324868 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 7 | 6300194 | 1934720 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 8 | 8155069 | 2562948 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 9 | 11387817 | 2142328 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 10 | 11780862 | 2052638 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 11 | 11842874 | 2239772 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 12 | 14668616 | 2239546 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 13 | 15912931 | 2700496 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 14 | 16692730 | 2535200 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 15 | 17245088 | 2273832 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 16 | 21049581 | 3034618 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 17 | 21153446 | 2321720 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 18 | 21759739 | 3041582 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 19 | 21936119 | 2829558 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 20 | 27695206 | 3724774 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 21 | 28459985 | 3517682 | Vertebrates | release 56 | Ensembl


Taeniopygia guttata¹ | 29 | 22 | 36910879 | 4547772 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 23 | 40508710 | 5119212 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 24 | 63414545 | 9285508 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 25 | 70943385 | 9628400 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 26 | 74884777 | 10980800 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 27 | 114494240 | 16312778 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 28 | 120524508 | 21212690 | Vertebrates | release 56 | Ensembl
Taeniopygia guttata¹ | 29 | 29 | 159019409 | 26114928 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 1 | 1422794 | 747432 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 2 | 1846686 | 1519896 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 3 | 2741058 | 1960350 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 4 | 3849995 | 3184290 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 5 | 5746710 | 4435498 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 6 | 6050548 | 4213908 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 7 | 6721479 | 4767920 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 8 | 6922712 | 4799388 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 9 | 11431159 | 8009026 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 10 | 16137211 | 12675776 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 11 | 16574643 | 12435462 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 12 | 25019374 | 20132736 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 13 | 28249852 | 19463458 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 14 | 40706407 | 30953684 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 15 | 46300577 | 34764364 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 16 | 48387522 | 40390866 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 17 | 55710606 | 47040256 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 18 | 59970384 | 47938090 | Vertebrates | release 56 | Ensembl
Ornithorhynchus anatinus¹ | 19 | 19 | 60574986 | 48007206 | Vertebrates | release 56 | Ensembl
Cyanothece sp.² | 2 | 1 | 429701 | 14355 | Unicellular eukaryotes | NC_010547 | GenBank
Cyanothece sp.² | 2 | 2 | 4934271 | 124664 | Unicellular eukaryotes | NC_010546 | GenBank


Mycosphaerella graminicola³ | 21 | 1 | 409213 | 124396 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 2 | 472105 | 105014 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 3 | 549847 | 141937 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 4 | 573698 | 240319 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 5 | 584099 | 151550 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 6 | 607044 | 159809 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 7 | 639501 | 223375 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 8 | 773098 | 290645 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 9 | 1185774 | 218763 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 10 | 1462624 | 226452 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 11 | 1624292 | 189929 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 12 | 1682575 | 243474 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 13 | 2142475 | 453583 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 14 | 2443572 | 439962 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 15 | 2665280 | 392752 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 16 | 2674951 | 608359 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 17 | 2861803 | 557494 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 18 | 2880011 | 480180 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 19 | 3505381 | 510740 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 20 | 3860111 | 617510 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/
Mycosphaerella graminicola³ | 21 | 21 | 6088797 | 603996 | Unicellular eukaryotes | ver 2.0 | http://genome.jgi-psf.org/

¹ Macrochromosome and microchromosome
² One circular and one linear chromosome
³ Haploid