
Post on 21-Jan-2016


In brief

• Vertical vs. Horizontal
• Homologous vs. Unequal
• Prokaryotes vs. Eukaryotes
• Mechanisms and Vectors
• Impact on Tree of Life
• Implications for prokaryotic species

Possible mechanisms for HT in Drosophila

From Heredity (2008) 100, 545–554

EVOLUTION: Genome Data Shake Tree of Life. E Pennisi - Science, 1998

The ring of life provides evidence for a genome fusion origin of eukaryotes. MC Rivera, JA Lake - Nature, 2004

The net of life: reconstructing the microbial phylogenetic network. V Kunin, L Goldovsky, N Darzentas, CA … - Genome Research, 2005

The tree of one percent. T Dagan, W Martin - Genome Biology, 2006

Uprooting the tree of life. WF Doolittle - Evolution: A Scientific American Reader, 2006

Clusters of Orthologous Groups (COGs)

Puigbò et al.

• 6,901 ML trees
• 100 taxa total

• Objective – compare topological distance between trees

• New metric called IS (inconsistency score) = fraction of the time that the splits in a given tree are found in all the trees
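As a rough illustration of the idea behind this bullet, the sketch below represents each tree by its set of splits (bipartitions) and reports, for one tree, the average fraction of trees in a collection that contain each of its splits. The function names and the toy trees are hypothetical, and every tree is assumed to cover the same taxa. This is only the raw split frequency; the published IS applies a normalization that is not reproduced here.

```python
from typing import FrozenSet, Iterable, List, Set

Split = FrozenSet[str]


def canonical(side: Iterable[str], taxa: Iterable[str]) -> Split:
    """Canonical form of a bipartition: the side that lacks the first taxon in sorted order."""
    side_set = frozenset(side)
    taxa_set = frozenset(taxa)
    anchor = min(taxa_set)
    return taxa_set - side_set if anchor in side_set else side_set


def split_support(tree: Set[Split], forest: List[Set[Split]]) -> float:
    """Mean fraction of trees in `forest` that contain each split of `tree`."""
    if not tree or not forest:
        return 0.0
    hits = sum(1 for split in tree for other in forest if split in other)
    return hits / (len(tree) * len(forest))


# Toy data (hypothetical): three unrooted five-taxon trees, two nontrivial splits each.
TAXA = {"A", "B", "C", "D", "E"}
tree1 = {canonical({"A", "B"}, TAXA), canonical({"D", "E"}, TAXA)}
tree2 = {canonical({"A", "B"}, TAXA), canonical({"C", "E"}, TAXA)}
tree3 = {canonical({"A", "C"}, TAXA), canonical({"D", "E"}, TAXA)}
forest = [tree1, tree2, tree3]

for name, tree in [("tree1", tree1), ("tree2", tree2), ("tree3", tree3)]:
    print(name, round(split_support(tree, forest), 3))
```

A tree whose splits recur across the forest gets a high value; a tree with mostly unique splits gets a low one, which is the intuition an inconsistency measure inverts.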

Many genes are not found in all taxa

Define 102 NUTs or “nearly universal trees” that include 90% of the prokaryotes under comparison.

Mostly translation and core transcription related
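The selection of nearly universal trees can be pictured as a coverage filter over gene families. The sketch below is an assumption-laden illustration: the family names, the taxon labels, and the choice of an inclusive 90% cutoff are all hypothetical, not taken from the paper's pipeline.

```python
from typing import Dict, Iterable, List


def nearly_universal(
    families: Dict[str, Iterable[str]],
    all_taxa: Iterable[str],
    min_coverage: float = 0.9,
) -> List[str]:
    """Names of gene families present in at least `min_coverage` of the taxa."""
    taxa = set(all_taxa)
    if not taxa:
        return []
    return sorted(
        name
        for name, members in families.items()
        if len(set(members) & taxa) / len(taxa) >= min_coverage
    )


# Toy data (hypothetical): ten taxa and three gene families.
all_taxa = [f"sp{i}" for i in range(1, 11)]
families = {
    "ribosomal_protein": all_taxa,           # present in 10 of 10 taxa
    "rna_polymerase_subunit": all_taxa[:9],  # present in 9 of 10 taxa
    "transporter": all_taxa[:4],             # present in 4 of 10 taxa
}

print(nearly_universal(families, all_taxa))
```

With this toy input only the two widely distributed families pass the filter, in line with the observation that the NUTs are mostly translation and core transcription genes.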

J Biol. 2009;8(6):59.

The big divide?

• Look for evidence of HGT between bacteria and archaea

• 56% of NUTs separated the groups perfectly
• 44% show at least one HGT
  – 13% from archaea to bacteria
  – 23% from bacteria to archaea
  – 8% both directions
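One way to picture "separated the groups perfectly" is to ask whether a tree contains a split with all archaea on one side and all bacteria on the other. The sketch below assumes each tree is given as a list of split sides and that every taxon belongs to one of the two groups; the function name and taxon labels are hypothetical. It also checks that the three directional percentages add up to the 44% total.

```python
from typing import Iterable, Set


def separates_perfectly(
    split_sides: Iterable[Set[str]], archaea: Set[str], bacteria: Set[str]
) -> bool:
    """True if some split has exactly the archaea (or exactly the bacteria) on one side."""
    targets = (frozenset(archaea), frozenset(bacteria))
    return any(frozenset(side) in targets for side in split_sides)


# Toy data (hypothetical): two archaea and three bacteria.
archaea = {"Ar1", "Ar2"}
bacteria = {"Ba1", "Ba2", "Ba3"}

clean_tree = [{"Ar1", "Ar2"}, {"Ba1", "Ba2"}]  # archaea form their own clade
mixed_tree = [{"Ar1", "Ba1"}, {"Ba2", "Ba3"}]  # Ar1 groups with a bacterium

print(separates_perfectly(clean_tree, archaea, bacteria))
print(separates_perfectly(mixed_tree, archaea, bacteria))

# Percentages from the slide: the three HGT categories sum to the 44% total.
hgt_percent = {"archaea_to_bacteria": 13, "bacteria_to_archaea": 23, "both": 8}
print(sum(hgt_percent.values()), 56 + sum(hgt_percent.values()))
```

A tree that fails this check has at least one taxon nested inside the other domain, which is the pattern read here as evidence of inter-domain transfer.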

The network of similarities among the nearly universal trees (NUTs). (a) Each node (green dot) denotes a NUT, and nodes are connected by edges if the similarity between the respective trees exceeds the indicated threshold. (b) The connectivity of 102 NUTs and the 14 1:1 NUTs depending on the topological similarity threshold.
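The thresholded network described in this caption can be sketched as follows: connect two trees when their pairwise similarity exceeds a cutoff, then count connected components as the cutoff changes. The similarity values, node names, and function names are hypothetical; how the paper computes topological similarity is not reproduced.

```python
from typing import Dict, List, Sequence, Set, Tuple

Pair = Tuple[str, str]


def threshold_edges(similarity: Dict[Pair, float], threshold: float) -> List[Pair]:
    """Pairs of trees whose similarity exceeds the threshold."""
    return [pair for pair, value in similarity.items() if value > threshold]


def connected_components(nodes: Sequence[str], edges: List[Pair]) -> List[Set[str]]:
    """Connected components of an undirected graph, found by depth-first search."""
    adjacency: Dict[str, Set[str]] = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in nodes:
        if start in seen:
            continue
        component: Set[str] = set()
        stack = [start]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        components.append(component)
    return components


# Toy data (hypothetical): pairwise similarities among four trees.
nodes = ["NUT1", "NUT2", "NUT3", "NUT4"]
similarity = {
    ("NUT1", "NUT2"): 0.9, ("NUT1", "NUT3"): 0.6, ("NUT2", "NUT3"): 0.7,
    ("NUT1", "NUT4"): 0.3, ("NUT2", "NUT4"): 0.2, ("NUT3", "NUT4"): 0.4,
}

for cutoff in (0.1, 0.5, 0.8):
    parts = connected_components(nodes, threshold_edges(similarity, cutoff))
    print(cutoff, len(parts))
```

Raising the cutoff fragments the network, which is the kind of dependence on the similarity threshold that panel (b) summarizes.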

The supernetwork of the NUTs. For species abbreviations see Additional File 1.
Puigbò et al. Journal of Biology 2009 8:59 doi:10.1186/jbiol159

Network representation of the 6,901 trees of the forest of life. The 102 NUTs are shown as red circles in the middle. The NUTs are connected to trees with similar topologies: trees with at least 50% of similarity with at least one NUT (P-value < 0.05) are shown as purple circles and connected to the NUTs. The rest of the trees are shown as green circles.
Puigbò et al. Journal of Biology 2009 8:59 doi:10.1186/jbiol159

Similarity of the trees in the forest of life to the NUTs. (a) For each of the 102 NUTs, the breakdown of the rest of the trees in the forest by percent similarity is shown. (b) The same breakdown for 102 random trees generated from the NUTs.
Puigbò et al. Journal of Biology 2009 8:59 doi:10.1186/jbiol159

Proc Natl Acad Sci U S A. 2005 Oct 4;102(40):14332-7.

Highways of obligate gene transfer within and among phyla and divisions of prokaryotes, based on analysis of the 22,348 protein trees for which a minimal edit path could be resolved

Beiko R G et al. PNAS 2005;102:14332-14337

©2005 by National Academy of Sciences

Ratio of observed to expected discordant bipartitions among proteins in major TIGR role category groupings

Beiko R G et al. PNAS 2005;102:14332-14337

Fig. 1. Two methods for assessing LGT in bacterial genomes, applied to available quartets of closely related, fully sequenced bacterial taxa. The reference topology, based on SSU rRNA, is shown in the upper left, with taxon names listed in the rows below. The yellow box contains the numbers of gene acquisitions in genomes A and B, as determined by parsimony in comparisons of complete genome contents. The blue box contains the numbers of orthologous genes supporting a topology that conflicts with the reference topology. "Interspecies" and "Intraspecies" comparisons represent quartets of taxa in which phylogenetic incongruence can be explained, respectively, by a transfer from another species or from another strain of the same species. For intraspecies comparisons, numbers of acquired and lost genes were not calculated because of uncertainty about the actual tree topology (nd, not determined). (B. aphidicola strains are entirely isolated in different hosts and were thus considered as different species despite having a single name. In B. aphidicola, amounts of gene loss and gene gain are similar, suggesting that LGT is overestimated due to independent losses of genes.)

Fig. 2. Relative frequencies of the three categories of alignments, i.e., those supporting the reference phylogeny (SSU rRNA), those supporting an alternate phylogeny (LGT), and those with no statistical support for any phylogeny. Points represent quartets of genomes for which orthologous genes have been inferred, aligned, and evaluated at the nucleic acid sequence level based on the SH test implemented in Puzzle 5.1 (19). The left part of the plot (in blue) represents the area where LGT predominates.

“THE” E. coli genome

Figure 1. The overall structure of the E. coli genome. The origin and terminus of replication are shown as green lines, with blue arrows indicating replichores 1 and 2. A scale indicates the coordinates both in base pairs and in minutes (actually centisomes, or 100 equal intervals of the DNA). The distribution of genes is depicted on two outer rings: The orange boxes are genes located on the presented strand, and the yellow boxes are genes on the opposite strand. Red arrows show the location and direction of transcription of rRNA genes, and tRNA genes are shown as green arrows. The next circle illustrates the positions of REP sequences around the genome as radial tick marks. The central orange sunburst is a histogram of inverse CAI (1 - CAI), in which long yellow rays represent clusters of low (<0.25) CAI. The CAI plot is enclosed by a ring indicating similarities between previously described bacteriophage proteins and the proteins encoded by the complete E. coli genome; the similarity is plotted as described in Fig. 3 for the complete genome comparisons.

Blattner et al., Science 5 September 1997 277: 1453-1462

Perna et al., Nature 409, 529-533(25 January 2001)

Outer circle shows the distribution of islands: shared co-linear backbone (blue); position of EDL933-specific sequences (O-islands) (red); MG1655-specific sequences (K-islands) (green); O-islands and K-islands at the same locations in the backbone (tan); hypervariable (purple). Second circle shows the G+C content calculated for each gene longer than 100 amino acids, plotted around the mean value for the whole genome, colour-coded like outer circle. Third circle shows the GC skew for third-codon position, calculated for each gene longer than 100 amino acids: positive values, lime; negative values, dark green. Fourth circle gives the scale in base pairs. Fifth circle shows the distribution of the highly skewed octamer Chi (GCTGGTGG), where bright blue and purple indicate the two DNA strands. The origin and terminus of replication, the chromosomal inversion and the locations of the sequence gaps are indicated. Figure created by Genvision from DNASTAR.
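The per-gene quantities plotted in the second and third circles can be sketched as below. The caption does not give formulas, so this assumes the conventional definitions: G+C content as the fraction of G and C bases, and GC skew as (G - C) / (G + C), here restricted to third codon positions. The function names and the toy sequences are hypothetical; the length filter mirrors the "longer than 100 amino acids" condition.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc3_skew(cds: str) -> float:
    """GC skew, (G - C) / (G + C), computed on third codon positions only."""
    third_positions = cds.upper()[2::3]
    g = third_positions.count("G")
    c = third_positions.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def long_enough(cds: str, min_amino_acids: int = 100) -> bool:
    """True if the coding sequence encodes more than `min_amino_acids` residues."""
    return len(cds) // 3 > min_amino_acids


# Toy coding sequences (hypothetical, far shorter than a real gene).
cds_a = "ATGGCGAAGCTGTAA"  # codons: ATG GCG AAG CTG TAA
cds_b = "ATCGACGGC"        # codons: ATC GAC GGC

print(round(gc_content(cds_a), 3), gc3_skew(cds_a))
print(round(gc_content(cds_b), 3), gc3_skew(cds_b))
print(long_enough(cds_a))
```

Positive skew values mean G outnumbers C at third positions and negative values mean the reverse, matching the two colours used in the third circle.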

Shared E. coli proteins

Welch R A et al. PNAS 2002;99:17020-17024

©2002 by National Academy of Sciences
