lecture 3,4

Sucheta Tripathy Genome Sequencing Projects, Genome Size, Application of sequence information for identification of disease genes

Upload: sucheta-tripathy

Post on 18-Dec-2014


DESCRIPTION

Genome projects, Secondgen and Thirdgen genome sequencing, application of genome sequencing in predicting disease genes

TRANSCRIPT

Page 1: Lecture 3,4

Sucheta Tripathy

Genome Sequencing Projects, Genome Size, Application of sequence information for identification of disease genes

Page 2: Lecture 3,4

Complete Genome Sequencing

• Whole genome shotgun sequencing

• BAC end sequencing

• Chromosome walking

• End sealing

Page 3: Lecture 3,4

Reference: http://en.wikipedia.org/wiki/File:Genome_Sizes.png

Page 4: Lecture 3,4

Cost of Genome Sequencing

Page 5: Lecture 3,4

Nextgen sequencing methods

• 454 sequencing methods (2006)

• Principles of pyrophosphate detection (1985, 1988)

• Illumina (Solexa) genome sequencing methods (2007)

• Applied Biosystems ABI SOLiD System (2007)

• Helicos single molecule sequencing (HeliScope, 2007)

• Pacific Biosciences single-molecule real-time (SMRT) technology (2010)

• Sequenom for nanotechnology-based sequencing

• BioNanomatrix nanofluidics

• RNAP technology

Ref: http://www.ncbi.nlm.nih.gov/books/NBK20261/

Page 6: Lecture 3,4

Sequencing methods

Ref: http://www.wellcome.ac.uk/Education-resources/Teaching-and-education/Animations/DNA/WTX056046.htm

http://www.wellcome.ac.uk/Education-resources/Teaching-and-education/Animations/DNA/WTX056051.htm

http://www.wellcome.ac.uk/Education-resources/Teaching-and-education/Animations/DNA/WTDV026689.htm

Page 7: Lecture 3,4

Ion Torrent

Page 8: Lecture 3,4

SOLiD Sequencing

Page 10: Lecture 3,4

http://www.insdc.org/

http://www.ebi.ac.uk/embl/Contact/collaboration.html

Page 11: Lecture 3,4

Microbial Genome Sequencing

• JGI – IMG [http://img.jgi.doe.gov/]

• Broad [http://www.broadinstitute.org/]

• TIGR [http://www.jcvi.org/]

• WashU [http://genome.wustl.edu/]

• VBI at Virginia Tech [www.vbi.vt.edu]

Page 12: Lecture 3,4

Human Genome Project

• October 1990: Human Genome Project started

• 2000: first publication

• 2003: finished paper

• NHGRI solicited pilot proposals for ENCODE

• 2007: first report on ENCODE published

• RFAs were sought for full ENCODE

• 2012: ENCODE published

• GWAS: 90% lies outside coding regions (2005)

Page 13: Lecture 3,4

What happens next?

You have 10 million characters. What to do with them?

• Locate genes

• Determine the function of the gene:

  • By similarity search

  • By domain search

  • By predicting signal peptide

  • By locating transmembrane region

Ref: http://www.nature.com/nature/journal/v406/n6797/pdf/406799a0.pdf
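The "Locate genes" step can be illustrated with a very small open reading frame (ORF) scan. This is only a sketch under simplifying assumptions: it looks at the forward strand only, treats every ATG as a possible start, and ignores introns, so it is closer to what works on a prokaryotic sequence than to a real gene finder. The function name `find_orfs` and the `min_codons` threshold are made up for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the forward strand.

    start and end are 0-based, end is exclusive and includes the stop codon.
    min_codons counts the codons before the stop codon (hypothetical cutoff).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and keep scanning
    return orfs


if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAACCCGGGTAGTT"):
        print(f"frame {frame}: positions {start}-{end}")
```

Each candidate ORF found this way would then go to the function-prediction steps listed on the slide (similarity search, domain search, and so on).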

Page 14: Lecture 3,4

Genome Annotation

ATGAAGATAGACAGCATACTAGCAGCATAGAATAGATAAGAGATAGAAATAGAATAAATATAAGAGAGA

[Flowchart]

• Run 6 frame translation

• Run Blastp with nr

• Match found? Yes: Product found. No: Make an hmmsearch

• Match found (hmmsearch)? Yes: Product found. NO: Unknown Genes / Hypothesis

• Product found: Pathway analysis, Other analysis

• Repeat Finding, miRNA finding, tRNAscan etc.
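The first box of the flowchart, six-frame translation, can be sketched in a few lines. This is a minimal illustration, assuming the standard genetic code (NCBI translation table 1) and an input made only of A, C, G and T; any other codon is reported as "X". The names `six_frame_translation`, `translate` and `reverse_complement` are chosen for this example and are not taken from any annotation pipeline.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G; first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    dna = ("ATGAAGATAGACAGCATACTAGCAGCATAGAATAGATAAGAGATAG"
           "AAATAGAATAAATATAAGAGAGA")  # sequence shown on the slide
    for label, protein in six_frame_translation(dna).items():
        print(label, protein)
```

In a real workflow, the translated frames would be the queries passed to blastp against nr and, when no match is found, to hmmsearch, as the flowchart indicates.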

Page 15: Lecture 3,4

Genome Sizes

Gametic nuclear DNA content, represented as mass in pg (picograms) or length in megabases (Mb)

1 pg = 10^-12 g

1 Mb = 10^6 bases

1 pg = 978 Mb

Ref: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1669731/
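The conversion on this slide is simple enough to wrap in two helper functions. The sketch below only applies the factor quoted above (1 pg = 978 Mb); the function names and the 3.5 pg input are illustrative values, not figures from the lecture.

```python
MB_PER_PG = 978.0  # conversion factor quoted on the slide: 1 pg = 978 Mb


def pg_to_mb(picograms):
    """Convert DNA content in picograms to genome length in megabases."""
    return picograms * MB_PER_PG


def mb_to_pg(megabases):
    """Convert genome length in megabases to DNA content in picograms."""
    return megabases / MB_PER_PG


if __name__ == "__main__":
    c_value_pg = 3.5  # illustrative input only
    print(f"{c_value_pg} pg is about {pg_to_mb(c_value_pg):.0f} Mb")
    print(f"1000 Mb is about {mb_to_pg(1000):.3f} pg")
```

The same factor can be used in either direction when comparing entries from genome size databases that report mass with those that report length.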

Page 16: Lecture 3,4

Genome Sizes

• Database of Genome Sizes: http://www.cbs.dtu.dk/databases/DOGS/

• Plant Genome database: http://www.kew.org/genomesize/homepage.html

• Mammalian genome size database: http://www.unipv.it/webbio/dbagsdb.htm

• Animal Genome size database: www.genomesize.com

• Fungal Genome size database: www.zbi.ee/fungal-genomesize

Page 17: Lecture 3,4
Page 18: Lecture 3,4

Ref: http://www.kew.org/genomesize/homepage.html

Page 19: Lecture 3,4

Ref: http://www.genomesize.com/

Page 20: Lecture 3,4

Ref: http://www-3.unipv.it/webbio/dbagsh.htm

Page 21: Lecture 3,4

Ref: http://www.zbi.ee/fungal-genomesize/

Page 22: Lecture 3,4

Identifying Human Disease genes

Ref: http://www.ncbi.nlm.nih.gov/books/NBK7561/

• Before 1980, very few human disease genes had been identified

• Reverse Genetics: know the gene product, go back to the gene, and do a positional cloning

• Genetic Redundancy: multiple genes have the same function

Page 23: Lecture 3,4

Identification of genes through protein product

Page 24: Lecture 3,4

1000 genomes project

• 1092 genomes of different individuals sequenced

• 14 populations

• Low-coverage whole-genome and exome sequencing

• 38 million SNPs

• 1.4 million short insertions and deletions

• More than 14,000 larger deletions

Ref: http://www.nature.com/nature/journal/v491/n7422/full/nature11632.html
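"Coverage" here means average sequencing depth, which is commonly estimated as (number of reads × read length) / genome size. The sketch below shows that arithmetic; the function names and all the numbers in the usage lines are hypothetical and are not taken from the 1000 Genomes paper.

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Average depth: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    # Hypothetical example: 100-base reads on a 3,000 Mb genome.
    genome = 3_000_000_000
    print(mean_coverage(120_000_000, 100, genome))  # average depth
    print(reads_needed(30, 100, genome))            # reads for a deeper run
```

Low-coverage designs trade depth per individual for a larger number of individuals, which is why the average depth is the figure usually quoted.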