
Sequence Analysis with Artemis and Artemis Comparison Tool (ACT)

Caribbean Bioinformatics Workshop, 18th-29th January 2010

Workshop Overview

Overview of Pathogen Genomics, WTSI

Introduction to Artemis

Demonstration of Artemis

Hands-on guided exercise in Artemis

New features in Artemis

Viewing second generation sequencing data in Artemis

Introduction to Artemis Comparison Tool (ACT)

Demonstration of ACT

Hands-on guided exercise in ACT

Viewing new technology sequencing data in Artemis

The Wellcome Trust Sanger Institute

• Funded by The Wellcome Trust, a registered charity.
• Established in 1993 to begin the Human Genome Project.
• First draft (2000); complete (2004).

Data release policy: All sequence data is released immediately and is freely available via the internet in order to maximise its benefit for research.

http://www.sanger.ac.uk

ftp://ftp.sanger.ac.uk/

The Genomic Revolution

1977: Sanger and co-workers sequence bacteriophage phiX174 (5386 bp)
    1 million years to complete the human genome (~3,000 Mbp)

Late 1980s: Sanger’s techniques refined
    1,000s of years to complete the human genome

1990s: Race to sequence the human genome
    10 years to complete the human genome

2009: Novel sequencing technologies
    $1000 genome?
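The timeline above gives only the time each era would have needed to finish the human genome. As a rough illustration, the sequencing rate that each estimate implies can be derived by dividing the genome size by the number of years. The rates below are derived, not quoted from the slides, and "1,000s of years" is represented by an assumed value of 1,000.

```python
# Genome size as stated on the slide: ~3,000 Mbp.
HUMAN_GENOME_BP = 3_000 * 1_000_000

# Era -> estimated years to complete the human genome (from the slide).
# The late-1980s entry is an assumption: "1,000s of years" taken as 1,000.
years_to_complete = {
    "1977": 1_000_000,
    "late 1980s": 1_000,
    "1990s": 10,
}


def implied_bp_per_year(genome_bp: int, years: float) -> float:
    """Return the average sequencing rate implied by a completion time."""
    return genome_bp / years


for era, years in years_to_complete.items():
    rate = implied_bp_per_year(HUMAN_GENOME_BP, years)
    print(f"{era}: ~{rate:,.0f} bp per year")
```

Each step down the timeline corresponds to a rate increase of several orders of magnitude, which is the point the slide makes qualitatively.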

The Human Genome Project

Human Genome Sequence Contributors

CSHL

TIGR

UTSW

UOKNOR

SDSTDC

SHGC

UWMSC

GTC

Sanger Institute

WUGSC

WIBR

UWGC

JGI

BCM

Keio

RIKEN

Genoscope

Beijing

GBF

MPIMG

IMB

United States

United Kingdom

Japan

France

Germany

China


WHO morbidity and mortality estimates (2002)

World Health Report, 2004

Population (000): 6 224 985

Cause                                              Mortality  Morbidity (DALYs*)
                                               (000) % total       (000) % total
TOTAL                                         57 029     100   1 490 126     100
I. Communicable diseases, maternal and perinatal
   conditions and nutritional deficiencies    18 324    32.1     610 319    41.0
  Infectious and parasitic diseases           10 904    19.1     350 333    23.5
  Respiratory infections                       3 963     6.9      94 603     6.3
  Maternal conditions                            510     0.9      33 632     2.3
  Perinatal conditions                         2 462     4.3      97 335     6.5
  Nutritional deficiencies                       485     0.9      34 417     2.3
II. Noncommunicable conditions                33 537    58.8     697 815    46.8
  Malignant neoplasms                          7 121    12.5      75 545     5.1
  Other neoplasms                                149     0.3       1 749     0.1
  Diabetes mellitus                              988     1.7      16 194     1.1
  Nutritional/endocrine disorders                243     0.4       7 961     0.5
  Neuropsychiatric disorders                   1 112     1.9     193 278    13.0
  Sense organ disorders                            3     0.0      69 381     4.7
  Cardiovascular diseases                     16 733    29.3     148 190     9.9
  Respiratory diseases                         3 702     6.5      55 153     3.7
  Digestive diseases                           1 968     3.5      46 476     3.1
  Diseases of the genitourinary system           848     1.5      15 217     1.0
  Skin diseases                                   69     0.1       3 748     0.3
  Musculoskeletal diseases                       106     0.2      30 169     2.0
  Congenital abnormalities                       493     0.9      27 381     1.8
  Oral diseases                                    2     0.0       7 372     0.5
III. Injuries                                  5 168     9.1     181 991    12.2
  Unintentional                                3 551     6.2     133 112     8.9
  Intentional                                  1 618     2.8      48 879     3.3

* Disability adjusted life years

Pathogen Sequencing at the Sanger

Mycobacterium tuberculosis

Neisseria meningitidis

Salmonella typhi

Candida albicans

Aspergillus fumigatus

Flu

Dengue

Enteric phage

E. coli Inc plasmids

Tsetse fly

Sandfly

Schistosoma mansoni

Plasmodium falciparum

Leishmania major

Trypanosoma brucei

Pathogen Genomics

What do we do?

Genome sequencing of prokaryotic and eukaryotic pathogens that typically require:

• Bioinformatics tools/software development
• Integration of genome analyses and annotation, and in silico analyses
• Comparative genomics/functional genomics
• Web accessible databases

Sequencing strategy and assembly

Shotgun sequencing - strategy

[Figure labels: DNA; pUC clones; end sequences; contiguous sequence; large clone end sequence; physical gap; sequence gap]

‘Draft sequence’: 95% coverage, 4-5x depth. Order of contigs?

Finished sequence: 100% coverage, 10x depth.
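The depth figures quoted for draft and finished sequence can be related to the number of reads with a short calculation. The sketch below is not taken from the slides: it uses the standard definition of depth (total sequenced bases divided by genome size) and the idealised Lander-Waterman estimate of the fraction of the genome covered, which assumes reads land uniformly at random. The genome size (4.6 Mbp) and read length (500 bp) are hypothetical values chosen only for illustration.

```python
import math


def sequencing_depth(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average depth: total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(depth: float, read_length_bp: int, genome_size_bp: int) -> int:
    """Number of reads required to reach a target average depth."""
    return math.ceil(depth * genome_size_bp / read_length_bp)


def expected_fraction_covered(depth: float) -> float:
    """Idealised Lander-Waterman estimate: 1 - exp(-depth).

    Assumes uniformly random read placement, which real clone
    libraries do not satisfy exactly.
    """
    return 1.0 - math.exp(-depth)


# Hypothetical example values (not from the slides).
GENOME_SIZE_BP = 4_600_000
READ_LENGTH_BP = 500

for depth in (4, 5, 10):
    n = reads_needed(depth, READ_LENGTH_BP, GENOME_SIZE_BP)
    covered = expected_fraction_covered(depth)
    print(f"{depth}x depth: {n:,} reads, idealised coverage {covered:.2%}")
```

The idealised model predicts somewhat higher coverage at 4-5x than the 95% quoted for draft sequence, so it is best read as an upper bound under its uniformity assumption rather than a prediction for a real project.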

Shotgun assembly - Yersinia pestis

Annotation Strategy

Generating the complete genome sequence

Primary DNA sequence

Dotter, BlastN, BlastX, gene finders, tRNA scan

Repeats, pseudogenes, rRNA, CDSs, tRNA

Fasta, BlastP, Pfam, Prosite, Psort, SignalP, TMHMM

Preannotation

Manual curation

Annotated sequence
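To give a feel for what the "gene finders" step contributes, here is a deliberately naive open reading frame (ORF) scan. It is a toy stand-in, not the gene-finding software used in the annotation strategy above: it only looks for ATG ... stop stretches in the three forward frames and ignores the reverse strand, alternative start codons, and any statistical model. The function name and the `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int, int]]:
    """Return (start, end, frame) for forward-strand ORFs.

    start is the 0-based index of the ATG, end is exclusive and
    includes the stop codon, frame is 0, 1 or 2. min_codons counts
    codons from the ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Small made-up sequence with one ORF: ATG AAA TTT GGG TAA.
print(find_orfs("CCATGAAATTTGGGTAACC"))
```

Candidate ORFs like these would still need the later steps in the strategy (database searches, preannotation, manual curation) before being called CDSs.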
