![Page 1: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/1.jpg)
An Introduction to ENSEMBL
Cédric Notredame
![Page 2: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/2.jpg)
The Top 5 Surprises in the Human Genome Map
1. The blue gene exists in 3 genotypes: Straight Leg, Loose Fit and Button-Fly.
2. Tiny villages of Hobbits actually live in our DNA and produce minute quantities of wool, which we've been ignorantly referring to as "navel lint" and throwing away for centuries.
3. It's nearly impossible to re-fold it along the original creases.
4. Beer-drinking gene conveniently located next to bathroom-locating gene.
And the Number 1 Surprise in the Human Genome Map...
5. Now that there's a map, male scientists will attempt to cure diseases by randomly throwing stuff into beakers, stubbornly refusing to use the map or ask for directions, all the while insisting the cure is right around the next corner.
![Page 3: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/3.jpg)
ENSEMBL: Our Scope
- What is ENSEMBL?
- Searching Genes in ENSEMBL
- Viewing Genes in ENSEMBL
- Doing Research With ENSEMBL
- Where Do ENSEMBL Genes Come From?
![Page 4: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/4.jpg)
• Genome sequences are becoming available very rapidly
  – Large and difficult to handle computationally
  – Everyone expects to be able to access them immediately
• Bench Biologists
  – Has my gene been sequenced?
  – What are the genes in this region?
  – Where are all the GPCRs?
  – Connect the genome to other resources
• Research Bioinformatics
  – Give me a dataset of human genomic DNA
  – Give me a protein dataset
Accessing Genomes
![Page 5: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/5.jpg)
• Set of high quality gene predictions
  – From known human mRNAs aligned against genome
  – From similar proteins and mRNAs aligned against genome
  – From Genscan predictions confirmed via BLAST of protein, cDNA and EST databases
• Initial functional annotation from InterPro
• Integration with external resources (SNPs, SAGE, OMIM)
• Comparative analysis
  – DNA sequence alignment
  – Protein orthologs
What is It ?
![Page 6: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/6.jpg)
Mr ENSEMBL ?
Richard Durbin (ACEDB)
Ewan Birney (EBI)
![Page 7: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/7.jpg)
• Scale and data flow
  – Mainly engineering problems
• Presentation, ease of use
  – Mainly engineering problems
• Algorithmic
  – Partly engineering
  – Partly research
Challenges ?
![Page 8: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/8.jpg)
ENSEMBL Home
![Page 9: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/9.jpg)
![Page 10: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/10.jpg)
![Page 11: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/11.jpg)
![Page 12: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/12.jpg)
Help!
• context sensitive help pages - click
• access other documentation via generic home page
• email the helpdesk
HelpDesk / Suggestions
![Page 13: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/13.jpg)
Finding What You Need
![Page 14: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/14.jpg)
Human homepage
![Page 15: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/15.jpg)
Text search
![Page 16: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/16.jpg)
BLAST/SSAHA
![Page 17: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/17.jpg)
![Page 18: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/18.jpg)
BLAST/SSAHA ????
![Page 19: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/19.jpg)
Changing Angle…
![Page 20: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/20.jpg)
![Page 21: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/21.jpg)
Anchor View
Map View
![Page 22: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/22.jpg)
Detailed View: Genes, ESTs, CpG etc. (100 kb)
Overview: Genes and Markers (1 Mb)
Chromosome
Configuration
Contig View
![Page 23: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/23.jpg)
Contig View
close-up
Evidence
Transcripts: red & black (Ensembl predictions)
Customising & short cuts
Pop-up menu
![Page 24: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/24.jpg)
Cyto View
![Page 25: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/25.jpg)
Marker View
![Page 26: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/26.jpg)
SNP View
![Page 27: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/27.jpg)
Synteny View
![Page 28: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/28.jpg)
Dotter View
![Page 29: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/29.jpg)
GeneView
![Page 30: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/30.jpg)
Gene-View
![Page 31: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/31.jpg)
Gene-View
![Page 32: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/32.jpg)
Gene-View
![Page 33: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/33.jpg)
Trans View
![Page 34: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/34.jpg)
Exon-View
![Page 35: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/35.jpg)
Protein-View
![Page 36: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/36.jpg)
Protein-View
![Page 37: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/37.jpg)
Protein-View
![Page 38: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/38.jpg)
CDK-like
Family-View
![Page 39: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/39.jpg)
CDK-like
Family-View
![Page 40: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/40.jpg)
The Right View On My Gene
- Where Is My Gene? Map View, Cyto View, Contig View
- How Many Transcripts for My Gene? Gene View, Exon View
- What Is the Function of My Gene? Protein View, SNP View, Family View
- How Does My Gene Compare with Other Species? Synteny View, Dotter View
![Page 41: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/41.jpg)
Getting The Stuff Back Home
![Page 42: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/42.jpg)
Export-View
![Page 43: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/43.jpg)
• The aim of EnsMart is to integrate Ensembl data into a single, multi-species, query-optimised database
  – Requirement for cross-database joins removed.
  – Query-optimised schema improves speed of data retrieval.
• Examples
  – Coding SNPs for all novel GPCRs
  – The sequence in the 5kb upstream region of known proteases between D1S2806 and D1S2907
  – Mouse homologues of human disease genes containing transmembrane domain located between 1p23 and 1q23
Data Mining with EnsMart
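To make the idea of a query-optimised, join-free table concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `gene_mart`, its columns, and the four toy rows are all hypothetical and are not the real EnsMart schema; the point is only that a question such as "novel GPCRs with coding SNPs" becomes a single-table filter once the attributes are stored side by side.

```python
import sqlite3


def build_mart(conn):
    """Create a tiny denormalised 'mart' table (hypothetical schema, toy data)."""
    conn.execute(
        "CREATE TABLE gene_mart ("
        "gene_id TEXT, chrom TEXT, start INTEGER, family TEXT, "
        "is_known INTEGER, has_coding_snp INTEGER)"
    )
    rows = [
        ("G1", "1", 1000, "GPCR", 0, 1),      # novel GPCR with a coding SNP
        ("G2", "1", 5000, "GPCR", 1, 1),      # known GPCR
        ("G3", "2", 2000, "protease", 1, 0),  # not a GPCR
        ("G4", "3", 7000, "GPCR", 0, 0),      # novel GPCR, no coding SNP
    ]
    conn.executemany("INSERT INTO gene_mart VALUES (?, ?, ?, ?, ?, ?)", rows)


def novel_gpcrs_with_coding_snps(conn):
    """Answer the question with one WHERE clause and no joins."""
    cur = conn.execute(
        "SELECT gene_id FROM gene_mart "
        "WHERE family = ? AND is_known = 0 AND has_coding_snp = 1 "
        "ORDER BY gene_id",
        ("GPCR",),
    )
    return [row[0] for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")
build_mart(conn)
print(novel_gpcrs_with_coding_snps(conn))  # prints ['G1']
```

In a normalised layout the same question would need joins across gene, family and variation tables; the trade-off of the flat table is redundancy in exchange for simpler and faster reads.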
![Page 44: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/44.jpg)
EnsMart I
![Page 45: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/45.jpg)
EnsMart II
![Page 46: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/46.jpg)
Asking Questions With
ENSEMBL
![Page 47: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/47.jpg)
Asking Questions
1-Selecting AND Downloading Genes using Functional and Evolutionary Criteria
2-Comparing Two Pieces of Genome
![Page 48: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/48.jpg)
All The Human Genes
- Involved in Cell Death
- Associated with a Disease
- With a Homologue in Mouse and Chicken
Asking A Question with ENSMART
What Do You Want ???
![Page 49: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/49.jpg)
Which Species
![Page 50: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/50.jpg)
Select the region
Where?
What kind of Gene?
![Page 51: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/51.jpg)
Select the kind of data
Choose An Evolutionary Trace
What Kind of Function ?
![Page 52: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/52.jpg)
Select the kind of data
Control of Genetic Variation
Control of Regulatory Region
Control of Biochemical Function
![Page 53: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/53.jpg)
Human Gene, Cell Death: 1133 genes
Human Gene, Cell Death, Mouse: 1106 genes
Human Gene, Cell Death, Chicken: 880 genes
Human Gene, Cell Death, C. elegans: 338 genes
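The way each extra criterion shrinks the selection can be sketched in a few lines of Python. This is only an illustration of combining a functional filter with homology filters: the `Gene` record, the `select` helper and the four toy genes are hypothetical, and the counts it produces have nothing to do with the real numbers on the slide.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    gene_id: str
    functions: set           # functional annotations (toy labels)
    homologue_species: set   # species in which a homologue is recorded


def select(genes, function, required_species=()):
    """Keep genes with the given function and a homologue in every required species."""
    wanted = set(required_species)
    return [
        g.gene_id
        for g in genes
        if function in g.functions and wanted <= g.homologue_species
    ]


# Toy data, invented for the example.
genes = [
    Gene("g1", {"cell death"}, {"mouse", "chicken"}),
    Gene("g2", {"cell death"}, {"mouse"}),
    Gene("g3", {"transport"}, {"mouse", "chicken"}),
    Gene("g4", {"cell death"}, set()),
]

print(select(genes, "cell death"))                        # prints ['g1', 'g2', 'g4']
print(select(genes, "cell death", ["mouse"]))             # prints ['g1', 'g2']
print(select(genes, "cell death", ["mouse", "chicken"]))  # prints ['g1']
```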
![Page 54: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/54.jpg)
I would like
- Chromosome Information
- The ID of my sequences
- The corresponding OMIM ID
- The corresponding Chicken ID
Asking A Question with ENSMART
How Do You Want it Packed ???
![Page 55: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/55.jpg)
![Page 56: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/56.jpg)
![Page 57: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/57.jpg)
Come to think of it…
-I’d like to take a look at the 5’ upstream regions
Asking A Question with ENSMART
How Do You Want it Packed ???
![Page 58: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/58.jpg)
![Page 59: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/59.jpg)
![Page 60: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/60.jpg)
I Want To know if the Mouse and the Human Genome are conserved around the Human Gene SNX5
Asking A Question with ENSMART
What Do You Want ???
![Page 61: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/61.jpg)
![Page 62: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/62.jpg)
![Page 63: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/63.jpg)
![Page 64: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/64.jpg)
![Page 65: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/65.jpg)
![Page 66: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/66.jpg)
![Page 67: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/67.jpg)
![Page 68: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/68.jpg)
![Page 69: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/69.jpg)
![Page 70: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/70.jpg)
![Page 71: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/71.jpg)
Where Do ENSEMBL Genes Come From
Genebuild
![Page 72: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/72.jpg)
![Page 73: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/73.jpg)
![Page 74: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/74.jpg)
• Ensembl gene set
• Ensembl EST genes
• Ab initio predictions
• Manual curation (Vega / Sanger)
• Gene models from other groups
• Known v. novel genes
• Gene names & descriptions
Evaluating genes and transcripts
![Page 75: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/75.jpg)
The Aim…
![Page 76: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/76.jpg)
Ensembl transcript predictions
evidence
other groups’ models
manual curation
Overview…
![Page 77: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/77.jpg)
Automatic Gene Annotation
[Flowchart. Inputs: human proteins, other proteins, cDNAs, ESTs, Genscan exons, other evidence. Tools: Pmatch, Genewise, Exonerate, Est2Genome. Steps: Add UTRs, Merge. Outputs: Ensembl Genes, EST genes.]
![Page 78: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/78.jpg)
• Place all available species-specific proteins to make transcripts
• Place similar proteins to make transcripts
• Use mRNA data to add UTRs
• Build transcripts using cDNA evidence
• Build additional transcripts using Genscan + homology evidence
• Combine annotations to make genes with alternative transcripts
ENSEMBL Geneset
![Page 79: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/79.jpg)
blast and Miniseq
Human protein sequences (SwissProt/TrEMBL/RefSeq)
pmatch* v. assembly
Genewise
*R. Durbin, unpublished
Getting Genes from Known Proteins
![Page 80: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/80.jpg)
Translatable gene with UTRs
cDNAs - Est2Genome – UTRs, no phases
proteins - Genewise – phases, no UTRs
Adding the UTRs
![Page 81: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/81.jpg)
• DNA-DNA alignments don’t give translatable genes
• Protein-level alignments give:
  – Frameshifts and splice sites
• Genewise (Ewan Birney)
  – Protein-genomic alignment
  – Has splice site model
  – Penalises stop codons
  – Allows for frameshifts
Gene Build is Protein-Based
![Page 82: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/82.jpg)
• Combine results of all Genewises and Genscans:
• Group transcripts which share exons
• Reject non-translating transcripts
• Remove duplicate exons
• Attach supporting evidence
• Write genes to database
Making Genes
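The grouping and de-duplication steps can be illustrated with a small union-find sketch in Python. It is not the Ensembl implementation: the function names are made up for this example, and "sharing an exon" is assumed to mean identical (start, end) coordinates, whereas a real build may well use overlap on the same strand.

```python
from collections import defaultdict


def group_transcripts_by_shared_exons(transcripts):
    """Cluster transcripts that share at least one identical exon.

    transcripts: dict mapping transcript name -> list of (start, end) exons.
    Returns a sorted list of sorted clusters (each cluster is one 'gene').
    """
    parent = {name: name for name in transcripts}

    def find(x):
        # Walk to the cluster representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_owner = {}  # exon -> first transcript seen with that exon
    for name, exons in transcripts.items():
        for exon in exons:
            if exon in first_owner:
                union(first_owner[exon], name)
            else:
                first_owner[exon] = name

    clusters = defaultdict(list)
    for name in transcripts:
        clusters[find(name)].append(name)
    return sorted(sorted(members) for members in clusters.values())


def unique_exons(transcripts, members):
    """Remove duplicate exons across the transcripts of one cluster."""
    return sorted({exon for name in members for exon in transcripts[name]})


# Toy input, invented for the example.
transcripts = {
    "t1": [(100, 200), (300, 400)],
    "t2": [(300, 400), (500, 600)],
    "t3": [(1000, 1100)],
}
print(group_transcripts_by_shared_exons(transcripts))  # prints [['t1', 't2'], ['t3']]
print(unique_exons(transcripts, ["t1", "t2"]))         # prints [(100, 200), (300, 400), (500, 600)]
```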
![Page 83: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/83.jpg)
• NCBI 34 assembly, released Dec 2003
• Ensembl genes: 21,787 (23,762 in release 35)
• Ensembl coding transcripts: 31,609 (plus 1,744 pseudogenes)
• Ensembl exons: 225,897
• Input human seqs: 48,176 proteins; 86,918 cDNAs
• Transcripts made from:
  – Human proteins with (without) UTRs: 68% (19%)
  – Non-human proteins with (without) UTRs: 2% (9%)
  – cDNA alignment only: 0.8%
A Typical Human Release: NCBI 34 (Dec 2003)
![Page 84: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/84.jpg)
Genes
  Sensitivity: ~90% of manual genes are in the automatic set
  Specificity: ~75% of genes are in the manual sets
Exon bps
  Sensitivity: ~70% of manual bps are in exons (90% of coding bps)
  Specificity: ~80% of bps are in manual exons
Alternative transcripts per gene
  manual: 3; automatic: 1.3
Figures are for the gene build on NCBI 33 (human) and manual annotation for chromosomes 6, 14 & 14
Manual Vs Automatic Annotation
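As the figures are defined on this slide, sensitivity is the share of the manual annotation that the automatic build recovers, and specificity is the share of the automatic build that the manual annotation confirms. A minimal Python sketch of both ratios follows; it assumes the two sets can be compared by a shared identifier, which is a simplification, since real comparisons match genes by genomic overlap. The identifiers are toy values.

```python
def sensitivity(manual, automatic):
    """Fraction of manually annotated items also found in the automatic set."""
    return len(manual & automatic) / len(manual)


def specificity(manual, automatic):
    """Fraction of automatically predicted items confirmed by the manual set."""
    return len(manual & automatic) / len(automatic)


# Toy identifiers, invented for the example.
manual = {"a", "b", "c", "d"}
automatic = {"b", "c", "d", "e", "f"}

print(sensitivity(manual, automatic))  # prints 0.75
print(specificity(manual, automatic))  # prints 0.6
```

The same two ratios apply at the base-pair level if the sets hold exonic positions instead of gene identifiers.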
![Page 85: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/85.jpg)
Data availability
  Hard evidence in mouse, rat, human
  Similarity build more important for other species
Structural Issues
  Zebrafish: Many similar genes near each other; genome from different haplotypes
  C. briggsae: Very dense genome; short introns
  Mosquito: Many single-exon genes; genes within genes
Configuration files provide flexibility
Each Genebuild is a Story…
![Page 86: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/86.jpg)
| Species | Gene number | Exons/gene |
|---|---|---|
| Homo sapiens | 21787 | 8.7 |
| Mus musculus | 24948 | 8.7 |
| Rattus norvegicus | 23751 | 7.9 |
| Danio rerio (zebra fish) | 20062 | 7.9 |
| Caenorhabditis briggsae (nematode) | 11884 | 7.2 |
| Anopheles gambiae (mosquito) | 14707 | 4.0 |
Life in Release 2003
![Page 87: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/87.jpg)
• Ensembl gene set
• Ensembl EST genes
• Ab initio predictions
• Manual curation (Vega / Sanger)
• Gene models from other groups
• Known v. novel genes
• Gene names & descriptions
Evaluating genes and transcripts
![Page 88: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/88.jpg)
[Flowchart labels: human proteins, other proteins, cDNAs, ESTs; Pmatch, Genewise, Exonerate, Est2Genome; Genscan exons; Add UTRs; Merge; other evidence; Ensembl genes, EST genes]
Using ESTs
![Page 89: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/89.jpg)
EST analysis
Map to genome using Est2Genome (determine strand, splicing)
Map ESTs using Exonerate (determine coverage, % identity and location in genome)
Filter on % identity and depth (5.5 million ESTs from dbEST; mapping of about 1/3)
Using ESTs
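The filtering step can be sketched in a few lines of Python. This is only an illustration of filtering on % identity and depth: the `EstHit` fields, the threshold values and the way depth is counted (number of already-kept hits overlapping the same region) are assumptions made for the example, not the settings or data model of the Ensembl pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EstHit:
    """One EST-to-genome alignment (illustrative fields, not an Ensembl schema)."""
    est_id: str
    chrom: str
    start: int               # 1-based, inclusive
    end: int
    percent_identity: float
    coverage: float          # percentage of the EST length that is aligned


def filter_hits(hits, min_identity=97.0, min_coverage=90.0, max_depth=10):
    """Keep hits that pass the identity/coverage cut-offs, then cap the depth.

    All three thresholds are hypothetical defaults chosen for this sketch.
    """
    good = [h for h in hits
            if h.percent_identity >= min_identity and h.coverage >= min_coverage]
    # Best hits first, so that the depth cap discards the weakest alignments.
    good.sort(key=lambda h: (-h.percent_identity, -h.coverage, h.est_id))

    kept = []
    for hit in good:
        # Depth here = number of already-kept hits overlapping this one.
        overlapping = sum(
            1 for k in kept
            if k.chrom == hit.chrom and k.start <= hit.end and hit.start <= k.end
        )
        if overlapping < max_depth:
            kept.append(hit)
    return kept
```

Calling `filter_hits(hits, max_depth=2)` on a list of `EstHit` objects returns the surviving hits, best first.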
![Page 90: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/90.jpg)
[Diagram labels: Exonerate; golden path contigs; cDNA hits]
• Exonerate positions cDNA sequences to assembly contigs
• Store hits as Ensembl FeaturePairs in database
Exonerate
![Page 91: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/91.jpg)
Blast and Est2Genome
[Diagram labels: virtual contig; cDNA hits; Filter; Blast & Miniseq; Est_genome]
EST2Genome
![Page 92: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/92.jpg)
Merge ESTs according to consecutive exon overlap and set splice ends
Genomewise
Alternative transcripts with translation and UTRs
ESTs
Reconstructing Alternative Splicing
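To make "merge ESTs according to consecutive exon overlap" concrete, here is a simplified Python sketch. It is not Genomewise or the Ensembl merging code: ESTs are reduced to lists of `(start, end)` exons on one strand, the function names are invented for the example, and splice ends are not adjusted. Two exon chains are merged only if, from their first shared exon onwards, they overlap exon by exon; otherwise they are kept as alternative transcripts.

```python
def overlaps(a, b):
    """True if two (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def consecutive_overlap(chain_a, chain_b):
    """True if the chains overlap exon by exon from their first shared exon."""
    for i, exon_a in enumerate(chain_a):
        for j, exon_b in enumerate(chain_b):
            if overlaps(exon_a, exon_b):
                # One of the two chains must start at the shared exon.
                if i != 0 and j != 0:
                    return False
                return all(overlaps(x, y)
                           for x, y in zip(chain_a[i:], chain_b[j:]))
    return False


def merge_chains(chain_a, chain_b):
    """Union of two compatible exon chains; overlapping exons are fused."""
    exons = sorted(chain_a + chain_b)
    merged = [exons[0]]
    for start, end in exons[1:]:
        last_start, last_end = merged[-1]
        if start <= last_end:
            merged[-1] = (last_start, max(last_end, end))
        else:
            merged.append((start, end))
    return merged


def merge_ests(ests):
    """Greedily merge compatible ESTs; incompatible ones stay separate."""
    transcripts = []
    for est in sorted(ests):
        for idx, transcript in enumerate(transcripts):
            if consecutive_overlap(transcript, est):
                transcripts[idx] = merge_chains(transcript, est)
                break
        else:
            transcripts.append(list(est))
    return transcripts
```

With three ESTs, two of which share consecutive exons and one of which skips the middle exon, `merge_ests` returns two exon chains: the merged three-exon transcript and the exon-skipping alternative.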
![Page 93: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/93.jpg)
Human ESTs
EST transcripts
Display limited to 7 at any one point – full data accessible in the databases
Display of EST Evidence
![Page 94: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/94.jpg)
• Ensembl gene set
• Ensembl EST genes
• Ab initio predictions
• Manual curation (Vega / Sanger)
• Gene models from other groups
• Known v. novel genes
• Gene names & descriptions
Evaluating genes and transcripts
![Page 95: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/95.jpg)
Ab initio Genscan predictions
Genscan prediction
Evidence supporting Genscan exons
![Page 96: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/96.jpg)
• Ensembl gene set
• Ensembl EST genes
• Ab initio predictions
• Manual curation (Vega / Sanger)
• Gene models from other groups
• Known v. novel genes
• Gene names & descriptions
Evaluating genes and transcripts
![Page 97: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/97.jpg)
Manual Curation: VErtebrate Genome Annotation
![Page 98: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/98.jpg)
Sanger / Vega manual curation
Manual Curation: VEGA
![Page 99: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/99.jpg)
• Ensembl gene set
• Ensembl EST genes
• Ab initio predictions
• Manual curation (Vega / Sanger)
• Gene models from other groups
• Known v. novel genes
• Gene names & descriptions
Evaluating Genes and Transcripts
![Page 100: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/100.jpg)
Other models as ‘DAS sources’
Turn on DAS sources
FASTAView display
Other Gene-Models
![Page 101: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/101.jpg)
• Ensembl gene set
• Ensembl EST genes
• Ab initio predictions
• Manual curation (Vega / Sanger)
• Gene models from other groups
• Known v. novel genes
• Gene names & descriptions
Evaluating Genes and Transcripts
![Page 102: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/102.jpg)
• Naming takes place after the gene build is completed
• Transcripts/proteins mapped to SwissProt, RefSeq and SPTrEMBL entries
• If mapped, the transcript is ‘known’; if not, it is ‘novel’
• Require high sequence similarity, but allow incomplete coverage
• Note: Difficult for families of closely-related genes. Wrongly annotated pseudogenes may also cause problems
Known Vs novel transcripts
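The known/novel rule can be written as a small Python function. This is a sketch only: the tuple layout and both threshold values are assumptions made for the example, since the slide says only that similarity must be high and that coverage may be incomplete.

```python
ACCEPTED_DATABASES = {"SwissProt", "RefSeq", "SPTrEMBL"}


def classify_transcript(mappings, min_identity=95.0, min_coverage=50.0):
    """Return 'known' if any mapping is similar enough, otherwise 'novel'.

    mappings: iterable of (database, accession, percent_identity,
    percent_coverage) tuples. The identity threshold is strict, the coverage
    threshold is permissive; both values are hypothetical.
    """
    for database, _accession, identity, coverage in mappings:
        if (database in ACCEPTED_DATABASES
                and identity >= min_identity
                and coverage >= min_coverage):
            return "known"
    return "novel"
```

A transcript with a 99%-identity match that covers only 60% of the entry is still called known here, which mirrors the "allow incomplete coverage" bullet.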
![Page 103: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/103.jpg)
• Ensembl gene set
• Ensembl EST genes
• Ab initio predictions
• Manual curation (Vega / Sanger)
• Gene models from other groups
• Known v. novel genes
• Gene names & descriptions
Evaluating Genes and Transcripts
![Page 104: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/104.jpg)
Names and descriptions
• Names taken from mapped database entries
• Official HGNC (HUGO) name used if available (or equivalent for other species)
• Otherwise SwissProt > RefSeq > SPTrEMBL
• Novel transcripts have only Ensembl stable ids
• Genes named after ‘best-named’ transcript
• Gene description taken from mapped database entries (source given)
• Hints: Orthology can provide useful confirmation. If there is no description, check for any Family description
Gene Names and Descriptors
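The naming priority above (HGNC, then SwissProt > RefSeq > SPTrEMBL, with the gene named after its best-named transcript) can be sketched in Python. The data layout, function names and identifiers are made up for this example and do not reflect the Ensembl API or database schema.

```python
# Highest-priority naming source first, as listed on the slide.
SOURCE_PRIORITY = ["HGNC", "SwissProt", "RefSeq", "SPTrEMBL"]


def name_transcript(stable_id, xrefs):
    """Return (name, source) for one transcript.

    xrefs maps a source database to the mapped entry name. A transcript
    without any mapping is novel and keeps only its stable id.
    """
    for source in SOURCE_PRIORITY:
        if source in xrefs:
            return xrefs[source], source
    return stable_id, "Ensembl"


def name_gene(gene_id, transcripts):
    """Name a gene after its best-named transcript.

    transcripts maps a transcript stable id to its xrefs dictionary.
    """
    rank = {source: i for i, source in enumerate(SOURCE_PRIORITY)}
    best = None
    for stable_id in sorted(transcripts):
        name, source = name_transcript(stable_id, transcripts[stable_id])
        if source == "Ensembl":
            continue  # novel transcript, contributes no name
        if best is None or rank[source] < rank[best[1]]:
            best = (name, source)
    return best if best is not None else (gene_id, "Ensembl")
```

Returning the source together with the name keeps track of where a name or description came from, in line with the "source given" bullet.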
![Page 105: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/105.jpg)
Stability…
www.ensembl.org/Docs/wiki/html/EnsemblDocs/Answer006.html
![Page 106: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/106.jpg)
Evidence used to build the transcript
Links to ExonView
Mapping to external databases
Links to putative orthologues
Transcript name
Gene name & description
Alternative transcripts
Geneview and Exonview
![Page 107: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/107.jpg)
Compressed tracks
Expanded tracks
Evidence Tracks in ContigView
![Page 108: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/108.jpg)
• Improved pseudogene annotation, for all species
• Upstream regulatory elements - using CpG islands, Eponine predictions, motifs to aid in prediction of transcription start sites
• Improve use of cDNAs - can already use to add alternatively spliced transcripts
• Improve UTR extension
• Make use of comparative data
• Non coding RNAs - currently filtered out of build sets
Future Directions
![Page 109: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/109.jpg)
ENSEMBL
-Finding the right DATA: ENSMART and BLAST
-The central View of ENSEMBL: ContigView
-Genome Comparison: Synteny View
-ENSEMBL incorporates all the evidence into its gene models
![Page 110: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/110.jpg)
![Page 111: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/111.jpg)
![Page 112: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/112.jpg)
Genebuild overview
[Flowchart, labels only. Inputs: Human proteins, Other proteins, Human cDNAs, Human ESTs. Programs: Pmatch, Genewise, Exonerate, Est2Genome, Genebuilder, ClusterMerge, GeneCombiner. Intermediate and final sets: Genewise genes, Genewise genes with UTRs, Aligned cDNAs, cDNA genes, Aligned ESTs, Supported genscans (optional), Preliminary gene set, Core Ensembl genes, Pseudogenes, Final set + pseudogenes, Ensembl EST genes]
![Page 113: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/113.jpg)
Place all known genes
Map all AVAILABLE species-specific proteins to the genome and find gene structure using Genewise
Annotate novel genes
Use protein from other species to build new transcripts based on homology
Use AVAILABLE mRNAs to add UTRs to the built transcripts
Use further homology to proteins, mRNAs and ESTs to build transcripts using Genscan exons
Combine annotations
Annotation Stages
![Page 114: An Introduction to ENSEMBL Cédric Notredame. The Top 5 Surprises in the Human Genome Map 1.The blue gene exists in 3 genotypes: Straight Leg, Loose Fit](https://reader035.vdocuments.mx/reader035/viewer/2022062519/56649eef5503460f94bfeb8f/html5/thumbnails/114.jpg)
Gene locus level
        Sn    Sp
chr13   0.90  0.74
chr14   0.92  0.77
chr6    0.94  0.72
ENSEMBL predictions cover 90% or more of manually annotated gene structures, with around 75% of the predictions covered by a manual annotation.
Exon level (based on transcript pairs)
        Coding exons only   All exons
        Sn    Sp            Sn    Sp
chr13   0.83  0.90          0.73  0.78
chr14   0.78  0.88          0.69  0.77
chr6    0.85  0.89          0.73  0.76
UTR exon predictions are less accurate than coding exons.
92% of coding exons and 80% of all exons are exact matches.
Numbers are for NCBI33 genebuild.
Manual Vs Automatic Annotation
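For readers unfamiliar with Sn and Sp as used in gene prediction, the following Python sketch shows one common way to compute them at the exon level. It assumes exact coordinate matches between predicted and manually annotated exons and takes Sp as the fraction of predictions confirmed by the annotation, which matches the wording on the slide. It is not the evaluation code behind the figures above, and the gene locus figures are based on coverage rather than exact matches.

```python
def sensitivity_specificity(annotated, predicted):
    """Exon-level Sn and Sp from two collections of (start, end) exons.

    Sn = fraction of annotated exons that were predicted exactly.
    Sp = fraction of predicted exons that match an annotated exon exactly.
    """
    annotated, predicted = set(annotated), set(predicted)
    if not annotated or not predicted:
        raise ValueError("both exon sets must be non-empty")
    true_positives = len(annotated & predicted)
    sn = true_positives / len(annotated)
    sp = true_positives / len(predicted)
    return sn, sp
```

For instance, with four annotated exons and three predicted exons of which two match exactly, the function returns Sn = 0.5 and Sp = 2/3.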