BITS - Comparative genomics on the genome level
DESCRIPTION
This is the third presentation of the BITS training on 'Comparative genomics'. It reviews the basic concepts of sequence homology on the gene Thanks to Klaas Vandepoele of the PSB department.
TRANSCRIPT
![Page 1: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/1.jpg)
Comparative genomics in eukaryotes
Genome analysis
Klaas Vandepoele, PhD
Professor Ghent University
Comparative & Integrative Genomics
VIB – Ghent University, Belgium
![Page 2: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/2.jpg)
I. Genome conservation & genomic homology
Alignment of homologous regions
• Inter-genomic: aligning genomic sequences from different species
• Intra-genomic: aligning genomic sequences from the same species
Different levels of resolution
• Comparative mapping (markers)
• Synteny (~ gene content)
• Colinearity (gene content + order conservation)
• DNA-based alignments (base-to-base mapping)
![Page 3: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/3.jpg)
Human – Mouse – Rat
resolution
![Page 4: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/4.jpg)
Human – Mouse orthologous regions
Mouse chr IV
Comparative mapping
Genome translocations associated with human-mouse speciation
resolution
Human
www.ensembl.org
![Page 5: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/5.jpg)
Human genome browser
Human chr I
Mouse chr IV
Conserved gene content & order
Human gene model
EST/cDNA similarities
Genome similarities
Gene loss and insertions in orthologous segments since human-mouse speciation
resolution
![Page 6: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/6.jpg)
Human – Mouse base-to-base mapping
Blue: coding exons; GT: splice donor; AG: splice acceptor
Functional sequences (e.g. exons) evolve more slowly than non-functional ones (e.g. introns) because natural selection acts against mutations in functional regions
Consequently, functional elements, both coding and non-coding, are unusually well conserved in orthologous regions
resolution
![Page 7: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/7.jpg)
DNA substitution rates for different gene/genome regions
Molecular Evolution, Li WH
![Page 8: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/8.jpg)
Multiple species comparisons (gene-based)
PhIGs
Hedges, 2002
![Page 9: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/9.jpg)
Genome size variation in the grasses: the use of model systems
[Figure: grass phylogeny with the BEP and PACC clades; divergence times of 55, 46 and 28 MYA]
• Rice 450 Mb
• Barley ~5000 Mb
• Maize ~2400 Mb
• Sorghum ~750 Mb
Gaut 2002
![Page 10: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/10.jpg)
Grass genomes: a single genetic system?
Gale and Devos, 1998
![Page 11: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/11.jpg)
Micro-colinearity within the grasses
Bennetzen lab
![Page 12: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/12.jpg)
Yeast Gene Order Browser (YGOB)
![Page 13: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/13.jpg)
II. Computational detection of genomic homology
Synteny ~ conservation of gene content
Colinearity ~ conservation of (gene) content & order
• Macro-colinearity: marker-based
• Micro-colinearity: DNA-based or gene-based
![Page 14: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/14.jpg)
How to find evidence for gene colinearity?
[Figure: gene order along a chromosome segment through time]
Ancestor A: 1 2 3 4 5 6 7 8 9 10 11
Speciation
S1: 1 2 3 4 5 6 7 8 9 10 11
S2: 1 2 3 4 5 6 7 8 9 10 11
Gene loss, insertions, rearrangements, translocation, etc.
S1: 1 3 4 6 7 10 11
S2: 1 2 4 6 7 8 9 11
Retained orthologs (anchor points)
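The idea of retained orthologs as anchor points can be sketched in a few lines of Python. This is a minimal illustration, assuming that the numbers on the slide are orthologous gene identifiers; the function names `anchor_points` and `is_colinear` are hypothetical and not part of any tool mentioned in this presentation.

```python
# Gene orders of the two segments after gene loss, as shown on the slide.
# Assumption: equal numbers denote orthologous genes.
s1 = [1, 3, 4, 6, 7, 10, 11]
s2 = [1, 2, 4, 6, 7, 8, 9, 11]


def anchor_points(seg_a, seg_b):
    """Return the genes present in both segments, in the order of seg_a."""
    in_b = set(seg_b)
    return [gene for gene in seg_a if gene in in_b]


def is_colinear(anchors, seg_b):
    """True if the anchors occur in the same relative order in seg_b."""
    position = {gene: idx for idx, gene in enumerate(seg_b)}
    indices = [position[gene] for gene in anchors]
    return indices == sorted(indices)


anchors = anchor_points(s1, s2)
print(anchors)                   # retained orthologs shared by S1 and S2
print(is_colinear(anchors, s2))  # is their order conserved?
```

In real data the shared identifiers would come from a homology search rather than from matching numbers.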
![Page 15: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/15.jpg)
Matrix representation
[Figure: matrix with the genes of segment S1 on the horizontal axis and the genes of segment S2 on the vertical axis]
S1: 1 3 4 6 7 10 11
S2: 1 2 4 6 7 8 9 11
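As a rough sketch of the matrix representation, the two segments can be turned into a small 0/1 matrix in which a cell is set when the row gene and the column gene are the same ortholog. The helper names `homology_matrix` and `render` are hypothetical, and placing S1 on the columns and S2 on the rows is an assumption based on the axis labels.

```python
s1 = [1, 3, 4, 6, 7, 10, 11]     # columns (segment S1)
s2 = [1, 2, 4, 6, 7, 8, 9, 11]   # rows (segment S2)


def homology_matrix(cols, rows):
    """Cell [r][c] is 1 when the row gene and column gene are the same ortholog."""
    return [[1 if r == c else 0 for c in cols] for r in rows]


def render(matrix, cols, rows):
    """Return a plain-text picture of the matrix, 'X' marking an anchor point."""
    lines = ["    " + " ".join(f"{c:>2}" for c in cols)]
    for gene, row in zip(rows, matrix):
        lines.append(f"{gene:>2}  " + " ".join(" X" if v else " ." for v in row))
    return "\n".join(lines)


ghm = homology_matrix(s1, s2)
print(render(ghm, s1, s2))
```

The anchor points fall on a roughly diagonal line, which is the visual signature of colinearity.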
![Page 16: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/16.jpg)
Map-based approach
[Figure: gene homology matrix with Chromosome 1 on the horizontal axis and Chromosome 2 on the vertical axis]
• Represent chromosomes as sorted gene lists
• Identify all homologous gene pairs between chromosomes (all-against-all BLASTP*).
• Score pairs of homologues in matrix
Identifying homologous regions = identifying diagonal series of elements in the gene homology matrix (GHM).
Vandepoele et al., Genome Research 2002
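The three steps above can be sketched as follows. This is a simplified illustration, not the published method: homology is assumed to be given as shared gene-family labels instead of BLASTP hits, and diagonals are found with a greedy rule controlled by the hypothetical parameters `max_gap` and `min_len`.

```python
def homologous_pairs(chrom_1, chrom_2):
    """Positions (i, j) where gene i of chrom_1 and gene j of chrom_2 share a family label."""
    return [(i, j)
            for i, fam_1 in enumerate(chrom_1)
            for j, fam_2 in enumerate(chrom_2)
            if fam_1 == fam_2]


def diagonal_runs(points, max_gap=2, min_len=3):
    """Greedily group points into forward diagonal series.

    A point extends a run when it lies at most max_gap genes further along
    both axes than the last point of that run.
    """
    runs = []
    for point in sorted(points):
        for run in runs:
            dx = point[0] - run[-1][0]
            dy = point[1] - run[-1][1]
            if 0 < dx <= max_gap and 0 < dy <= max_gap:
                run.append(point)
                break
        else:
            runs.append([point])
    return [run for run in runs if len(run) >= min_len]


# Toy chromosomes: sorted gene lists, each gene replaced by its family label.
chrom_1 = ["a", "b", "c", "d", "x", "e"]
chrom_2 = ["q", "a", "b", "z", "c", "d", "e"]

pairs = homologous_pairs(chrom_1, chrom_2)
runs = diagonal_runs(pairs)
print(runs)
```

A production tool would also detect anti-diagonals (inversions) and test whether a run is statistically significant; both are left out here.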
![Page 17: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/17.jpg)
The map-based approach: terminology
[Figure: Gene Homology Matrix (GHM) of Chromosome 1 against Chromosome 2, with the following elements labelled]
• Homologous gene
• Tandem duplication
• Colinear segment
• Inverted colinear segment
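The terms on this slide can be tied to the geometry of points in the GHM. The sketch below is one possible reading, under the assumption that tandem duplicates show up as horizontal or vertical series, colinear segments as rising diagonals and inverted segments as falling diagonals; `classify_segment` is a hypothetical helper.

```python
def classify_segment(points):
    """Label a group of GHM points (i, j) by the shape it forms."""
    pts = sorted(points)
    if len(pts) < 2:
        return "homologous gene"
    if len({i for i, _ in pts}) == 1 or len({j for _, j in pts}) == 1:
        # One gene matching several neighbouring genes on the other axis.
        return "tandem duplication"
    js = [j for _, j in pts]
    if all(b > a for a, b in zip(js, js[1:])):
        return "colinear segment"
    if all(b < a for a, b in zip(js, js[1:])):
        return "inverted colinear segment"
    return "unclassified"


print(classify_segment([(3, 7)]))
print(classify_segment([(2, 5), (2, 6), (2, 7)]))
print(classify_segment([(0, 0), (1, 2), (3, 3)]))
print(classify_segment([(0, 9), (1, 8), (2, 6)]))
```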
![Page 18: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/18.jpg)
Detection of colinear homologous regions
Human-mouse: HsaC1 vs MmuC4
Chicken-human: HsaC1 vs GgaC23
![Page 19: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/19.jpg)
Detection of colinear homologous regions
Human-mouse: HsaC1 vs MmuC4
Human-tetraodon: HsaC1 vs TviC1
![Page 20: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/20.jpg)
MUMmer
NUCmer, PROmer
![Page 21: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/21.jpg)
And what about synteny?
[Figure: GHM of HsaC1 against HsaC9 showing an ancient duplication]
Identifying syntenic regions = identifying high homolog-density regions in the gene homology matrix (GHM).
• Application of a 2-dimensional sliding-window approach to score regions with a high density of homologous genes between 2 chromosomes
DeSyRe, Vandepoele et al. unpublished
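A 2-dimensional sliding window over the GHM can be sketched as below. DeSyRe is unpublished, so this is not its algorithm; it is only a naive illustration of the density idea, with a hypothetical function name and a window size chosen arbitrarily.

```python
def densest_window(points, n_genes_x, n_genes_y, win=4):
    """Slide a win x win window over the GHM and return the densest one.

    Returns ((x0, y0), count): the lower corner of the first window that
    holds the largest number of homologous gene pairs, and that number.
    """
    best, best_count = None, -1
    for x0 in range(max(1, n_genes_x - win + 1)):
        for y0 in range(max(1, n_genes_y - win + 1)):
            count = sum(1 for i, j in points
                        if x0 <= i < x0 + win and y0 <= j < y0 + win)
            if count > best_count:
                best, best_count = (x0, y0), count
    return best, best_count


# Homologous pairs that cluster together without forming a diagonal:
# conserved gene content (synteny) but no conserved order.
pairs = [(5, 6), (6, 5), (7, 7), (5, 8), (0, 12), (11, 1)]
corner, count = densest_window(pairs, n_genes_x=15, n_genes_y=15, win=4)
print(corner, count)
```

A realistic scoring scheme would compare each window count with the density expected by chance before calling a region syntenic.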
![Page 22: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/22.jpg)
Detection of recent and ancient large-scale duplications
Ancient duplication (HsaC1 vs HsaC9): synteny
Recent duplication (C2 vs C4): colinearity
![Page 23: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/23.jpg)
III. Whole-genome alignments
Evolutionarily constrained sequences are a good indicator of functional genome regions
Basic protocol
1. Sequence generation
2. Reconstructing homologous colinearity across related genomes
3. Multi-sequence alignment
4. Detection of sequences under purifying selection
Margulies & Birney, NRG 2008
![Page 24: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/24.jpg)
Reconstructing homologous colinearity
• Segmental duplication and other species-specific rearrangements (e.g. inversions, insertions, deletions) interfere with the accurate detection of orthologous genomic regions
![Page 25: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/25.jpg)
Tools
Mercator (Ensembl)
• coding exons as anchor points
• graph of colinearity information
• travel through graph to generate homologous regions
chains-and-nets (UCSC)
• reference-based local alignments of different genomes (BLASTZ)
• filtering highest-scoring chains
• net together chains from same locus
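The notion of a highest-scoring chain can be illustrated with a small dynamic-programming sketch. It assumes each local alignment is reduced to a hypothetical tuple (reference position, query position, score) and ignores gap penalties, strand and block lengths, so it is a teaching example rather than the UCSC implementation.

```python
def best_chain(anchors):
    """Return (score, chain) for the highest-scoring colinear chain.

    anchors: list of (ref_pos, query_pos, score). Two anchors can follow
    each other only if both positions increase.
    """
    if not anchors:
        return 0, []
    anchors = sorted(anchors)
    n = len(anchors)
    best = [a[2] for a in anchors]   # best chain score ending at each anchor
    prev = [None] * n                # back-pointers for the traceback
    for i in range(n):
        for j in range(i):
            if anchors[j][0] < anchors[i][0] and anchors[j][1] < anchors[i][1]:
                candidate = best[j] + anchors[i][2]
                if candidate > best[i]:
                    best[i], prev[i] = candidate, j
    end = max(range(n), key=best.__getitem__)
    chain, k = [], end
    while k is not None:
        chain.append(anchors[k])
        k = prev[k]
    return best[end], chain[::-1]


# Toy local alignments; (30, 50, 9) scores well but breaks colinearity.
hits = [(10, 100, 5), (20, 110, 4), (30, 50, 9), (40, 120, 6), (50, 130, 3)]
score, chain = best_chain(hits)
print(score, chain)
```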
![Page 26: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/26.jpg)
Sequence alignment & constraint detection
• PhastCons
• BinCons
• GERP
• SiPhy
![Page 27: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/27.jpg)
Whole-genome base-pair alignment
Challenges
• multi-species alignment
• long DNA sequences (reflecting homologous colinear regions)
• one-to-one mapping (with reference genome)
• various levels of sequence divergence
![Page 28: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/28.jpg)
Whole-genome base-pair alignment toolbox
MLAGAN
• CHAOS seeding algorithm (k-mer anchors)
• Dynamic programming (pairwise)
• Multiple alignment using progressive strategy
• Shuffle-LAGAN (incl. rearrangement map); VISTA
TBA / MultiZ; UCSC
• Pairwise BLASTZ alignments (local blocks)
• Merging joining blocks using MultiZ
• Complex ordering of blocks using Threaded Blockset Aligner
PECAN (Ensembl)
• Consistency alignment based on pairwise alignments (incl. outgroup information)
MAVID
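To give a feel for k-mer anchors, the sketch below finds exact k-mers shared by two sequences. It is a deliberately reduced illustration: only exact matches are used, the seeds are not chained, and `kmer_seeds` is a hypothetical name rather than part of CHAOS or LAGAN.

```python
from collections import defaultdict


def kmer_seeds(seq_a, seq_b, k=5):
    """Return sorted (pos_a, pos_b) pairs where both sequences share an exact k-mer."""
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)
    seeds = []
    for j in range(len(seq_b) - k + 1):
        for i in index.get(seq_b[j:j + k], []):
            seeds.append((i, j))
    return sorted(seeds)


seeds = kmer_seeds("ACGTTGCA", "TTACGTTG", k=5)
print(seeds)
```

Seeds like these restrict the region that pairwise dynamic programming has to explore, which is what makes long genomic sequences tractable.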
![Page 29: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/29.jpg)
From gene to DNA-based colinearity…
Pairwise approach: Human segment as reference
VISTA http://genome.lbl.gov/vista
![Page 30: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/30.jpg)
From gene to DNA-based colinearity…
![Page 31: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/31.jpg)
Input and output files
Frazer et al., 2003
PipMaker
![Page 32: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/32.jpg)
Conserved Non-coding Sequences or Elements (CNS/CNE)
[Figure: VISTA plot of the pairwise comparisons Human/dog, Human/mouse and Mouse/dog]
Blue: exons
Turquoise: UTR
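A plot of this kind is built from percent identity in a window that slides along a pairwise alignment. The sketch below assumes an existing gapped alignment and exon coordinates given in alignment columns; the window size, the threshold and both function names are illustrative choices, not the settings of VISTA.

```python
def window_identity(aln_a, aln_b, win=5):
    """Percent identity of every full window along a pairwise alignment."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    match = [1 if a == b and a != "-" else 0 for a, b in zip(aln_a, aln_b)]
    return [100.0 * sum(match[s:s + win]) / win
            for s in range(len(match) - win + 1)]


def conserved_noncoding(identity, win, threshold, exons):
    """Window starts with identity >= threshold that overlap no exon.

    exons: list of half-open (start, end) intervals in alignment columns.
    """
    hits = []
    for start, pid in enumerate(identity):
        in_exon = any(start < e_end and start + win > e_start
                      for e_start, e_end in exons)
        if pid >= threshold and not in_exon:
            hits.append(start)
    return hits


# Toy alignment: a conserved exon (columns 0-7), a diverged stretch,
# then a conserved block outside the exon.
aln_a = "ACGTACGTACGGGGGCCCCC"
aln_b = "ACGTACGTACTTTTTCCCCC"

identity = window_identity(aln_a, aln_b, win=5)
cns = conserved_noncoding(identity, win=5, threshold=80, exons=[(0, 8)])
print(cns)
```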
![Page 33: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/33.jpg)
Exercise
Explore the genome organization and conservation of your favorite locus in a set of related species.
Plants http://bioinformatics.psb.ugent.be/plaza/
Vertebrates http://teleost.cs.uoregon.edu/synteny_db/
Yeast http://wolfe.gen.tcd.ie/ygob/
![Page 34: BITS - Comparative genomics on the genome level](https://reader034.vdocuments.mx/reader034/viewer/2022051313/548531b9b4af9f9b0d8b4d96/html5/thumbnails/34.jpg)