Bioinformatics: Genome anatomy; Comparisons of some eukaryotic genomes; Alignment of long genomic sequences; Comparative genomics; Oxford Grid; Reconstruction of ancestral mammalian karyotype (Lecture 16)

Post on 22-Dec-2015


Page 1: Bioinformatics Genome anatomy Comparisons of some eukaryotic genomes Alignment of long genomic sequences Comparative genomics Oxford Grid Reconstruction

Bioinformatics

• Genome anatomy

• Comparisons of some eukaryotic genomes

• Alignment of long genomic sequences

• Comparative genomics

• Oxford Grid

• Reconstruction of ancestral mammalian karyotype

Lecture 16

Page 2

Genome anatomy

• The anatomy of different genomes, particularly of groups as remote as eukaryotes and prokaryotes, differs very significantly.

• This includes genome size: there is a thousand-fold difference between eukaryotes and prokaryotes and a 30-fold difference among eukaryotes.

• The number of genes per genome varies less: ~30,000-35,000 in humans and ~1,500-2,000 in bacterial genomes.

• Eukaryotic genomes are full of simple repeats, numerous types of transposable elements and other sequences.

• Prokaryotes have few repeats and transposable elements, and their genomes consist mainly of genes.

Page 3

Homologous sequences: orthologous and paralogous

• Genes or sequences are orthologous if their most recent common ancestor did not undergo a gene duplication and they represent the same genetic element in a number of species.

• Paralogous genes and sequences are always a result of a duplication.

• Orthologous and paralogous sequences can be very similar, and discriminating between them is not always simple.
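The definitions above can be sketched in code. The example below assumes a gene tree whose internal nodes have already been labeled as speciation or duplication events (producing those labels is the hard part and is not shown). The `Node` class, the `classify` function, and the toy gene names are all hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A gene-tree node. Internal nodes carry the event that created them."""
    name: str
    event: Optional[str] = None  # "speciation" or "duplication"; None for leaves
    children: list = field(default_factory=list)


def path_to(node, leaf_name, path=None):
    """Return the list of nodes from `node` down to the named leaf, or None."""
    path = (path or []) + [node]
    if node.name == leaf_name:
        return path
    for child in node.children:
        found = path_to(child, leaf_name, path)
        if found:
            return found
    return None


def classify(root, gene_a, gene_b):
    """Call two genes paralogous if their most recent common ancestor
    is a duplication node, otherwise orthologous."""
    if gene_a == gene_b:
        raise ValueError("need two different genes")
    path_a = path_to(root, gene_a)
    path_b = path_to(root, gene_b)
    if path_a is None or path_b is None:
        raise ValueError("gene not found in tree")
    mrca = None
    # The most recent common ancestor is the last node shared by both paths.
    for node_a, node_b in zip(path_a, path_b):
        if node_a is node_b:
            mrca = node_a
        else:
            break
    return "paralogous" if mrca.event == "duplication" else "orthologous"


# Toy tree (hypothetical): one duplication that predates the human-mouse split.
tree = Node("dup", "duplication", [
    Node("spec_alpha", "speciation", [Node("human_alpha"), Node("mouse_alpha")]),
    Node("spec_beta", "speciation", [Node("human_beta"), Node("mouse_beta")]),
])

print(classify(tree, "human_alpha", "mouse_alpha"))
print(classify(tree, "human_alpha", "human_beta"))
```

Note that `human_alpha` and `mouse_beta` also come out as paralogous here, even though they sit in different species, which is one reason the discrimination is not always simple.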

Page 4

Taxonomic breakdown of homologous mouse proteins according to taxonomic range

Page 5

Comparisons of some eukaryotic genomes

Page 6

Comparisons of some eukaryotic genomes

Page 7

Comparative genomics

• Comparisons of very different genomes, while useful for general purposes, do not allow more detailed analysis.

• In contrast, comparing genomes that belong to a relatively similar group, such as mammals, reveals some evolutionary trends.

• Comparative genome analysis of related species provides a powerful and general approach to identifying functional elements without previous knowledge of function.

• Reconstruction of an ancestral genome for a group like mammals is within reach.

Page 8

Alignment of long genomic sequences

• Alignment of long sequences is an essential part of comparative genomics.

• PipMaker is one of a few novel programs that compute alignments of similar regions in two DNA sequences.

• The resulting alignments are summarized with a "percent identity plot", or "pip" for short.

• MultiPipMaker allows the user to see relationships among more than two sequences. All pairwise alignments with the first sequence are computed and then returned as interleaved pips.

• MultiPipMaker can be requested to compute a true multiple alignment of the input sequences and return a nucleotide-level view of the results.
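To make the idea of a percent identity plot concrete, here is a minimal sketch that takes an existing pairwise alignment, splits it into gap-free segments, and reports each segment's position in the first sequence and its percent identity. This is not PipMaker's own code; the function name `gap_free_segments` and the toy alignment are assumptions made for illustration, and computing the alignment itself is out of scope.

```python
def gap_free_segments(aligned_a, aligned_b):
    """Return (start_in_a, length, percent_identity) for every gap-free
    run of columns in a pairwise alignment ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    segments = []
    pos_a = 0  # 0-based position in the ungapped first sequence
    start, length, matches = None, 0, 0
    for char_a, char_b in zip(aligned_a, aligned_b):
        if char_a == "-" or char_b == "-":
            # A gap closes the current segment, if one is open.
            if length:
                segments.append((start, length, 100.0 * matches / length))
            start, length, matches = None, 0, 0
        else:
            if length == 0:
                start = pos_a
            length += 1
            matches += char_a.upper() == char_b.upper()
        if char_a != "-":
            pos_a += 1
    if length:
        segments.append((start, length, 100.0 * matches / length))
    return segments


# Toy alignment (hypothetical sequences).
for start, length, identity in gap_free_segments("ACGTAC--GTTA", "ACCTACGGG-TA"):
    print(f"start={start} length={length} identity={identity:.1f}%")
```

Plotting each segment as a horizontal line from `start` to `start + length` at height `identity` gives the basic shape of a pip.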

Page 9

Alignment of long genomic sequences: PipMaker

Page 10

Comparison of genome regions

Dotplot of the human-mouse comparison of the ApoE region. Note the yellow for exonic sequences and red for the upstream regulatory region.
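As a rough illustration of how a dot plot is built, the sketch below marks every position pair at which two sequences share an exact word. The word size of 3, the function name `dotplot`, and the toy sequences are assumptions; dot plots of real genomic regions use longer words or scored windows.

```python
def dotplot(seq_x, seq_y, word=3):
    """Return (i, j) pairs where seq_x[i:i+word] == seq_y[j:j+word]."""
    # Index every word of seq_y by its start positions.
    index = {}
    for j in range(len(seq_y) - word + 1):
        index.setdefault(seq_y[j:j + word], []).append(j)
    # Look up every word of seq_x in that index.
    dots = []
    for i in range(len(seq_x) - word + 1):
        for j in index.get(seq_x[i:i + word], []):
            dots.append((i, j))
    return dots


print(dotplot("ACGTAC", "TACGT"))
```

Conserved regions show up as diagonal runs of dots, such as (0, 1) followed by (1, 2) in this toy case.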

Page 11

Identification of functional genomic sequences

PipMaker representation on a scale of 50% to 100% conservation for the same region.

Page 12

Identification of functional genomic sequences

• Little is known about the actual fraction of the mammalian genome that is functional, but recent estimates based on sequence conservation patterns suggest that it is at least 5%.

• Given that the protein-coding fraction of the genome is about 1.5%, there is a lot of room for the identification of additional functional elements.

• Sequence conservation does not reveal the total fraction of the functional genome but simply the fraction of the genome that has remained functional within the group of species compared.

• It is expected therefore that an additional fraction will be species-specific or at least lineage-specific, and not conserved across large evolutionary distances such as human and mouse or across all vertebrate lineages.
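The "room" left for non-coding functional elements follows from simple arithmetic on the two percentages above. The sketch below works it through; the genome size of roughly 3 Gb is an assumption added for illustration and is not stated on the slide.

```python
GENOME_SIZE_BP = 3.0e9       # assumption: roughly 3 Gb for the human genome
functional_fraction = 0.05   # lower bound from sequence conservation
coding_fraction = 0.015      # protein-coding fraction

# Functional sequence that is not protein-coding.
noncoding_functional_fraction = functional_fraction - coding_fraction
noncoding_functional_bp = noncoding_functional_fraction * GENOME_SIZE_BP
share_of_functional = noncoding_functional_fraction / functional_fraction

print(f"non-coding functional fraction: {noncoding_functional_fraction:.3f}")
print(f"non-coding functional sequence: {noncoding_functional_bp / 1e6:.0f} Mb")
print(f"share of functional sequence that is non-coding: {share_of_functional:.0%}")
```

Under these figures, most of the conserved functional sequence would lie outside protein-coding regions.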

Page 13

Conserved non-coding sequences

PipMaker alignment of a gene-poor region of human chromosome 21. Blocks in red indicate regions of the human genome that are at least 100 bp long and at least 70% identical between human and mouse (Conserved Non-Genic sequences: CNGs).
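The two thresholds in the caption translate directly into a filter. The sketch below assumes the input is a list of gap-free aligned segments given as (start, length, percent identity) and an optional list of annotated gene intervals to exclude; the function name `conserved_non_genic`, the input layout, and the toy numbers are hypothetical.

```python
def conserved_non_genic(segments, genic=(), min_length=100, min_identity=70.0):
    """Keep segments of at least `min_length` bp and `min_identity` percent
    identity that do not overlap any genic interval.

    segments: iterable of (start, length, percent_identity)
    genic:    iterable of half-open (start, end) intervals to exclude
    """
    kept = []
    for start, length, identity in segments:
        end = start + length
        overlaps_gene = any(start < g_end and g_start < end
                            for g_start, g_end in genic)
        if length >= min_length and identity >= min_identity and not overlaps_gene:
            kept.append((start, end, identity))
    return kept


# Toy segments (hypothetical values).
toy_segments = [
    (0, 150, 82.0),      # passes both thresholds
    (400, 90, 95.0),     # too short
    (1000, 200, 65.0),   # identity too low
    (2000, 120, 70.0),   # passes, exactly at the identity threshold
    (3000, 300, 90.0),   # overlaps an annotated gene
]
print(conserved_non_genic(toy_segments, genic=[(2900, 3100)]))
```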

Page 14

Databases and programs for genome comparisons

• Several databases provide characteristics of the genome, such as genes, expressed sequence tags (ESTs), repeats, computational predictions and other information in the context of genome conservation.

• Such databases include the UCSC browser (genome.ucsc.edu), Ensembl (www.ensembl.org) and NCBI (www.ncbi.nlm.nih.gov).

• In these databases one can find information about the levels of conservation of genes and also non-genic regions between a number of species.

Page 15

Oxford Grid

• Each cell in the Oxford Grid represents a comparison of two chromosomes, one from each of the selected species.

• The number of orthologies appears inside each colored cell and the color indicates a range in the number: Grey (1), Blue (2-10), Green (11-25), Orange (26-50), Yellow (50+). Clicking on a colored cell will retrieve orthology details.

• Clicking on a mouse chromosome (blue numbers next to grid frame) will retrieve a comparative map showing all orthologies between the selected mouse chromosome and the comparison species displayed on the grid.

• Total orthologies observed between mouse and human genomes: 12,435, total mapped in both species: 12,290
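The grid described above is essentially a count matrix over chromosome pairs. The sketch below builds one from a list of orthologies and applies the colour ranges from the slide. The input layout, the function names, and the toy records are hypothetical, and because the slide lists 50 under both Orange and Yellow, treating exactly 50 as Orange is an assumption.

```python
from collections import Counter


def oxford_grid(orthologies):
    """Count orthologies per (mouse chromosome, human chromosome) cell.

    orthologies: iterable of (gene_id, mouse_chr, human_chr)
    """
    return Counter((mouse_chr, human_chr) for _, mouse_chr, human_chr in orthologies)


def cell_colour(count):
    """Map a cell count to the colour ranges listed on the slide."""
    if count <= 0:
        return None          # empty cell, not coloured
    if count == 1:
        return "Grey"
    if count <= 10:
        return "Blue"
    if count <= 25:
        return "Green"
    if count <= 50:
        return "Orange"      # assumption: exactly 50 counts as Orange
    return "Yellow"


# Toy records with placeholder gene ids (not real map positions).
toy = [("g1", "1", "2"), ("g2", "1", "2"), ("g3", "1", "1"), ("g4", "2", "20")]
grid = oxford_grid(toy)
for (mouse_chr, human_chr), count in sorted(grid.items()):
    print(f"mouse {mouse_chr} x human {human_chr}: {count} ({cell_colour(count)})")
```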

Page 16

Oxford Grid: comparison of mouse and human genomes

Page 17

Comparative maps: Mouse Chromosome 1 Linkage Map versus human

Figure labels: mouse chromosome 1; human orthologues located on different chromosomes.

Page 18

Whole chromosome paints from the tammar wallaby were hybridized to chromosomes of the swamp wallaby, which has the record lowest chromosome number in marsupials (2n=10 in females and 11 in males).

The complex origin of some chromosomes through fusions can be seen in the picture.

Page 19

Ancestral chromosome pool (numbers correspond to human homologs):

1pq, 1q, 2pter-q13, 2q13-qter, 3, 4, 5, 6, 7pq, 7q, 8p, 8q, 10p, 10q, 11, 12pq, 12q, 13, 14, 15, 16p, 16q, 17, 18, 19p, 19q, 20, 21, 22q, 22qdist, X and Y

• Whole chromosomes: 3, 5, 6, 9, 11, 13, 14, 15, 17, 18, 20, 21, X, Y

• Large segments: 1q, 1pq, 2pter-q13, 2q13-qter, 7pq, 7q, 8p, 8q, 10p, 10q, 12pq, 12q, 16p, 16q, 19p, 19q, 22q, 22qdist

• Neighboring segments: 3/21, 4/8p, 7q/16p, 14/15, 16q/19q, 12pq/22q, 12q/22qdist

Suggested ancestral eutherian karyotypes:

1. Chowdhary et al. 1998: 2n=48

2. Murphy et al. 2001: 2n=50

3. Yang et al. 2003: 2n=44

4. Richard et al. 2003: 2n=50

5. Fronicke et al. 2003: 2n=48

The likely ancestral mammalian (eutherian) karyotype represented as homologues of human chromosomes and the three major conserved components it comprises. The conserved components, viz., whole human chromosomes, large segments of human chromosomes and combination of neighboring segments of human chromosomes are commonly seen as conserved blocks in other evolutionarily diverged species.

Page 20

Reconstruction of the putative ancestral eutherian karyotype. It was assumed that the ancestor had 23 pairs of chromosomes, and human chromosomes were superimposed on them.

Page 21

Segments and blocks >300 kb in size with conserved synteny in human are superimposed on the mouse genome. Each colour corresponds to a particular human chromosome. The 342 segments are separated from each other by thin, white lines within the 217 blocks of consistent colour.
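A block of consistent colour can be thought of as a run of neighbouring mouse landmarks that all map to the same human chromosome. The sketch below groups landmarks that way. It is a simplification under stated assumptions: it ignores the 300 kb size threshold, it does not split blocks into order-preserving segments, and the function name, input layout, and toy positions are hypothetical.

```python
from itertools import groupby


def synteny_blocks(landmarks):
    """Group landmarks along one mouse chromosome into runs that map
    to the same human chromosome.

    landmarks: iterable of (mouse_position, human_chr)
    returns:   list of (human_chr, first_mouse_pos, last_mouse_pos, n_landmarks)
    """
    ordered = sorted(landmarks)  # walk along the mouse chromosome
    blocks = []
    for human_chr, run in groupby(ordered, key=lambda landmark: landmark[1]):
        run = list(run)
        blocks.append((human_chr, run[0][0], run[-1][0], len(run)))
    return blocks


# Toy landmarks (hypothetical positions in arbitrary units).
toy_landmarks = [(5, "2"), (1, "20"), (3, "20"), (9, "20"), (7, "2")]
for block in synteny_blocks(toy_landmarks):
    print(block)
```

Here human chromosome 20 contributes two separate blocks because a run mapping to chromosome 2 lies between them.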

Page 22

Dot plots of conserved syntenic segments in the three human and three mouse chromosomes

For each of three human (a–c) and mouse (d–f) chromosomes, the positions of orthologous landmarks are plotted along the x axis and the corresponding position of the landmark on chromosomes in the other genome is plotted on the y axis. Different chromosomes in the corresponding genome are differentiated with distinct colours. In a remarkable example of conserved synteny, human chromosome 20 (a) consists of just three segments from mouse chromosome 2 (d), with only one small segment altered in order. Human chromosome 17 (b) also shares segments with only one mouse chromosome (11) (e), but the 16 segments are extensively rearranged. However, most of the mouse and human chromosomes consist of multiple segments from multiple chromosomes, as shown for human chromosome 2 (c) and mouse chromosome 12 (f). Circled areas and arrows denote matching segments in mouse and human.
