Detecting single molecules and sequencing DNA

Rohan T. Ranasinghe

University Chemical Laboratories, Lensfield Road, Cambridge CB2 1EW

Locations and timeline

[Map, scale bar: 1 mile]

Old Cavendish Laboratory, 1953: Discovery of the structure of DNA (image: http://www.cambridge2000.com)

LMB, 1977: Sanger method for sequencing invented (image: http://www2.mrc-lmb.cam.ac.uk)

Sanger Institute, 1993: Work on the Human Genome Project at the Sanger starts (image: Genome Research Ltd.)

Chemistry department, 1997: Work on Solexa method for sequencing started (image: http://www.flickr.com/photos/shai-bl/5584629687/sizes/m/in/photostream/)

Structure of DNA

Images: http://www.themicrobiologist.com, http://www.flickr.com/photos/grahams__flickr/504365411/sizes/l/in/photostream/, http://www.flickr.com/photos/major_clanger/5881631482/sizes/o/in/photostream/

Solved in Cambridge in 1953 by James Watson and Francis Crick using data collected by Rosalind Franklin and Maurice Wilkins at King’s College London

The key to the structure was base pairing

The fidelity of the Watson-Crick base pairs and the double helix structure are the cornerstones of DNA sequencing and modern forensic science
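The base-pairing rule can be written down in a few lines. The sketch below is only an illustration: the function names are made up for this example, and the test sequence is the one shown in the Sanger sequencing slides. It shows that one strand fully determines its partner, which is what makes copying (and therefore sequencing) possible.

```python
# Watson-Crick pairing rules: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the pairing partner of each letter, position by position."""
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own direction.

    The two strands of a double helix run in opposite directions,
    so the partner is the complement read backwards.
    """
    return complement(strand)[::-1]


template = "CAGTCAGTCA"  # sequence shown in the Sanger sequencing slides
partner = reverse_complement(template)

print(complement(template))
print(partner)
# Pairing twice recovers the original strand.
print(reverse_complement(partner) == template)
```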

DNA Sequencing

Images: http://www.sikeston.k12.mo.us, © Invitrogen, Wikipedia

Why would you want to sequence DNA?

A genome contains the information required to build an organism

It’s a long book...

~3,000,000,000 (3 × 10⁹) letters in each of the ~10¹⁴ cells in a human

Distance between base pairs = 0.34 nm (0.34 × 10⁻⁹ m)

The DNA in one of your cells would be 2 m long in the B-form structure
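The 2 m figure can be checked from the two numbers on the slides. One copy of the genome comes to about 1 m; the sketch assumes each cell carries two copies (a diploid cell), which the slides do not state explicitly. The variable names are illustrative only.

```python
BASE_PAIRS_PER_GENOME = 3e9     # letters in one copy of the genome (from the slide)
RISE_PER_BASE_PAIR_M = 0.34e-9  # distance between base pairs in metres (from the slide)
COPIES_PER_CELL = 2             # assumption: a diploid cell holds two genome copies
CELLS_PER_HUMAN = 1e14          # from the slide

length_one_copy_m = BASE_PAIRS_PER_GENOME * RISE_PER_BASE_PAIR_M
length_per_cell_m = length_one_copy_m * COPIES_PER_CELL
length_per_human_m = length_per_cell_m * CELLS_PER_HUMAN

print(f"One genome copy: {length_one_copy_m:.2f} m")
print(f"DNA per cell:    {length_per_cell_m:.2f} m")
print(f"DNA per human:   {length_per_human_m:.2e} m")
```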

Sanger sequencing

Image: Genome Research Ltd.

Based on copying of DNA:

[Animation: copying of the sequence CAGTCAGTCA, one letter at a time]

Incorporation of fluorescent nucleotide terminates the copying process

Repeat ~1030 times
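A toy model of chain termination may help here. It is a sketch under stated assumptions, not the laboratory protocol: every possible stopping point is enumerated once (real reactions stop at random positions across many molecules), strand direction is ignored, and the size-sorting step stands in for the physical separation of fragments by length, which is part of the standard description of the method rather than of these slides. All function names are hypothetical.

```python
import random

# Watson-Crick pairing rules: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def terminated_fragments(template: str) -> list[str]:
    """Return one copied fragment for every possible stopping point.

    The last letter of each fragment plays the role of the
    fluorescent nucleotide that terminated the copy.
    """
    copy = "".join(PAIRS[base] for base in template)
    return [copy[:n] for n in range(1, len(copy) + 1)]


def read_sequence(fragments: list[str]) -> str:
    """Sort fragments by length and read the final letter of each."""
    return "".join(frag[-1] for frag in sorted(fragments, key=len))


template = "CAGTCAGTCA"
fragments = terminated_fragments(template)
random.Random(0).shuffle(fragments)  # the fragments start out as a mixture

copied = read_sequence(fragments)
original = "".join(PAIRS[base] for base in copied)

print("Copied sequence:  ", copied)
print("Original sequence:", original)
```

Sorting by length is what turns an unordered mixture of terminated copies into an ordered read; the labelled final letter of each fragment supplies one position of the sequence.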

Sanger sequencing

[Animation: copied sequence shown alongside original sequence]

Repeat 3 × 10⁸ times to read genome (would take another 190 years at this speed!*)

*Note: original animation took ~20 seconds
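The 190-year estimate follows from the numbers given, assuming each run of the animation reads the 10 letters of the example sequence CAGTCAGTCA. That per-run count is inferred from the slides rather than stated on them.

```python
LETTERS_IN_GENOME = 3e9      # from the earlier slide
LETTERS_PER_ANIMATION = 10   # assumption: length of the example sequence CAGTCAGTCA
SECONDS_PER_ANIMATION = 20   # from the footnote
SECONDS_PER_YEAR = 365.25 * 24 * 3600

repeats = LETTERS_IN_GENOME / LETTERS_PER_ANIMATION
years = repeats * SECONDS_PER_ANIMATION / SECONDS_PER_YEAR

print(f"Repeats needed: {repeats:.0e}")
print(f"Time needed:    {years:.0f} years")
```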

The human genome project

Images: http://www.c-spanvideo.org/program/157909-1, http://www.flickr.com/photos/93425126@N00/4394834217/in/set-72157623515077498/

Started: 1989 (in the USA)

First draft completed: 2000

‘Finished’: 2003

Cost: $3,000,000,000

UK effort on the Human Genome Project largely carried out in this building in the Sanger Centre

9 chromosomes were sequenced here (about a third of the genome)

What does it mean to detect a single molecule?

Looking for a needle in a haystack?

How many blades of grass on a football pitch?

About 200,000,000 or 2 × 10⁸

How many molecules in a vial of water?

18 mL (1 mole) of water contains Avogadro’s number of molecules: 6.02 × 10²³

So 1 mole of grass blades would cover (6.02 × 10²³) ÷ (2 × 10⁸) = 3 × 10¹⁵ football pitches

That’s a lot of haystacks...


1 mole of grass blades = 3 × 10¹⁵ football pitches = 15 × 10¹² km²

Surface area of Earth = 5 × 10⁸ km² (10¹¹ football pitches!)

Surface area of Jupiter = 6 × 10¹⁰ km²

*Lab demonstration: 180 µL (15 × 10¹⁰ km² of grass blades)

Surface area of the Sun = 6 × 10¹² km²

1 mole of grass blades would cover the surface area of about 2.5 Suns!

All images: nasa.gov
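The chain of estimates above can be reproduced directly. In this sketch the area of one football pitch is not taken from the slides; it is back-calculated from their figures (15 × 10¹² km² for 3 × 10¹⁵ pitches, i.e. about 0.005 km² each) and should be read as an assumption. The variable names are illustrative.

```python
AVOGADRO = 6.02e23         # molecules (or grass blades) per mole
BLADES_PER_PITCH = 2e8     # from the slide
PITCH_AREA_KM2 = 0.005     # assumption: implied by 15e12 km2 / 3e15 pitches

EARTH_AREA_KM2 = 5e8
JUPITER_AREA_KM2 = 6e10
SUN_AREA_KM2 = 6e12

pitches_per_mole = AVOGADRO / BLADES_PER_PITCH
# The slides round the pitch count to 3e15 before converting to an area.
mole_area_km2 = 3e15 * PITCH_AREA_KM2
suns_covered = mole_area_km2 / SUN_AREA_KM2

# Lab demonstration: 180 microlitres of water is 1/100 of the 18 mL mole.
demo_fraction = 180e-6 / 18e-3
demo_area_km2 = mole_area_km2 * demo_fraction
jupiters_covered = demo_area_km2 / JUPITER_AREA_KM2

print(f"Pitches per mole of blades: {pitches_per_mole:.2e}")
print(f"Area of one mole:           {mole_area_km2:.1e} km2")
print(f"Suns covered:               {suns_covered:.1f}")
print(f"Earth in pitches:           {EARTH_AREA_KM2 / PITCH_AREA_KM2:.0e}")
print(f"180 uL demonstration:       {demo_area_km2:.1e} km2 "
      f"({jupiters_covered:.1f} Jupiters)")
```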

Sanger sequencing

Sanger sequencing uses about 2 × 10¹⁰ molecules per 100 letters

Solexa sequencing

Invented in 1997 in this department

Developed by a spin-out company in Saffron Walden

Sold for $650,000,000 in 2006

Solexa sequencing uses about 10³ molecules to read 100 letters

About as many blades of grass as on the penalty spot

Imaging technology: lab demonstration

Image: http://thesportboys.wordpress.com/category/international/page/2/

Solexa sequencing

Image: © Royal Society of Chemistry publishing

[Animation: three DNA strands are copied in parallel, each gaining one fluorescent letter per step]

CAGTCAGTCA: GTCAG → GTCAGT

GCATATCTTC: CGTATA → CGTATAG

AACGTGCTTG: TTGCA → TTGCAC

Densely packed microscopic “islands” of DNA generate information very quickly

• “Recycled” template molecules are ready for incorporation of the next fluorescent letter

• Possible to read about 100 letters from each DNA strand, rather than 1
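The cycle described in the bullets can be sketched as a loop: in each cycle every island gains exactly one labelled letter, the letter is recorded, and the template is “recycled” for the next cycle. This is a simplified illustration, not the instrument's actual chemistry or software. The function name is hypothetical, strand direction is ignored, and the second template uses the spelling GCATATCTTC seen in the later animation frames.

```python
# Watson-Crick pairing rules: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def run_cycles(templates: list[str], n_cycles: int) -> list[str]:
    """Read one letter per island per cycle, all islands in parallel."""
    reads = ["" for _ in templates]
    for cycle in range(n_cycles):
        for island, template in enumerate(templates):
            if cycle < len(template):
                # Incorporate one fluorescent letter and record it.
                reads[island] += PAIRS[template[cycle]]
        # At this point the label would be removed so that every
        # template is ready for the next letter ("recycling").
    return reads


templates = ["CAGTCAGTCA", "GCATATCTTC", "AACGTGCTTG"]

for n in (5, 6):
    print(n, "cycles:", run_cycles(templates, n))
```

Because the number of cycles does not depend on the number of islands, adding more islands adds more reads without adding time, which is why densely packed islands generate information so quickly.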

Solexa sequencing

Images: © Royal Society of Chemistry publishing, Genome Research Ltd.

Cost to sequence a human genome: around $10,000

Time to sequence a human genome: less than a week

First African, Asian and giant panda genomes sequenced

Sanger Institute owns 37 instruments
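For scale, these figures can be set against the Human Genome Project numbers given earlier in the talk. The sketch below is a rough comparison only: it treats “less than a week” as exactly one week (so the time ratio is a lower bound) and uses the 14-year duration implied by the 1989 start and 2003 finish.

```python
HGP_COST_USD = 3e9        # from the Human Genome Project slide
HGP_YEARS = 14            # 1989 to 2003, from the slides
LETTERS_IN_GENOME = 3e9   # from the earlier slide
NEW_COST_USD = 1e4        # "around $10,000"
NEW_WEEKS = 1             # assumption: upper bound for "less than a week"

cost_per_letter_hgp = HGP_COST_USD / LETTERS_IN_GENOME
cost_ratio = HGP_COST_USD / NEW_COST_USD
time_ratio = HGP_YEARS * 365.25 / 7 / NEW_WEEKS

print(f"HGP cost per letter: ${cost_per_letter_hgp:.2f}")
print(f"Cost reduction:      {cost_ratio:.0e} times")
print(f"Speed-up (at least): {time_ratio:.0f} times")
```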

Summary

The structure of DNA, discovered in 1953, has been crucial to sequencing the human genome

The first human genome was sequenced using Fred Sanger’s method, invented in 1977. The project ran for 14 years and cost $3 billion

New methods for sequencing use single-molecule detection to dramatically accelerate the decoding process

One approach using single-molecule techniques, invented by Shankar Balasubramanian and David Klenerman in our department in 1997, is now widely used for sequencing worldwide

The cost of sequencing a human genome has fallen to around $10,000, and it takes less than a week
