
Low-cost, high accuracy, long-DNA synthesis technology

George Church, Joe Jacobson et al. Harvard & MIT

0. Killer Applications
1. Chip synthesis, fluidics
2. Multiplex assembly
3. Error correction methods
4. Software CAD-PAM
5. Proteome (in vitro) synthesis
6. Homologous recombination & selection for BACs
7. Integrases
8. Process integration, QA, timeline
9. Safety opportunities

16-Feb-2005 10 AM NHGRI

All stages: Error correction, Software, QA, safety

50-100: Chip synthesis, fluidics
100-15k: Pol-Assembly-Multiplex, Proteome synthesis
15k-100k: Annealing assembly
100k-5M: Microbial recombination
100k-200M: Mammalian recomb, integrases

Synthetic Genomes & Proteomes. Why?

• Test or engineer cis-DNA/RNA-elements
• Access to any protein (complex) including post-transcriptional modifications
• Affinity agents for the above.
• Protein design, vaccines, solubility screens
• Utility of molecular biology DNA -- RNA -- Protein

in vitro "kits" (e.g. PCR -- T7 -- Roche)

Toward these goals design a chassis:
• 115 kbp genome. 150 genes.
• Nearly all 3D structures known.
• Comprehensive functional data.

(PURE) translation utility

Removing tRNA-synthetases, translational release-factors, RNases & proteases

Allows:

Selection of scFvs[antibodies] specific for HBV DNA polymerase using ribosome display. Lee et al. 2004 J Immunol Methods. 284:147

Programming peptidomimetic syntheses by translating genetic codes designed de novo. Forster et al. 2003 PNAS 100:6353

High level cell-free expression & specific labeling of integral membrane proteins. Klammt et al. 2004 Eur J Biochem 271:568

Cell-free translation reconstituted with purified components. Shimizu et al. 2001 Nat Biotechnol. 19:751-5.

Also: membrane incompatible expression & diverse amino-acids (>21)

in vitro genetic codes

[Figure: mRNA template (5' to 3') read with a designed genetic code; codon table axes are 5' base, second base and 3' base, with assignments shown for fM, T, N, V, E and the unnatural amino acids mS, yU, eU. Plot: 3H-E dpm (0-3500) vs. time (min., 30-80); label: fM yU mS eU E.]

Forster et al. (2003) PNAS 100:6353
Zhang et al. (2004) Science 303:371

80% average yield per unnatural coupling.

eU = 2-amino-4-pentenoic acid
yU = 2-amino-4-pentynoic acid
mS = O-methylserine
gS = O-GlcNAc–serine
bK = biotinyl-lysine

Escherichia coli | Mycoplasma | 3D structure
Coliphage φ29 DNA polymerase | + | +
Coliphage P1 Cre recombinase | - | +
>Coliphage Lox/Cre recombinase site | - | +
Coliphage T7 RNA polymerase | + | +
>Coliphage T7 RNA polymerase initiation site | + | +
>Coliphage T7 RNA polymerase termination site | + | +
RNase P RNA | + | -
RNase P protein | + | +
>RNase P site/RNA primer for DNA polymerase | + | +
Small subunit 16S ribosomal RNA | + | +
All 21 small subunit ribosomal proteins (1-21) | + except 1,21 | +
Large subunit 5S ribosomal RNA | + | +
Large subunit 23S ribosomal RNA | + | +
Large subunit 23S rRNA G2445>m2G methylase: unknown | ? | -
Large subunit 23S rRNA U2449>dihydroU synthetase: unknown | ? | -
Large subunit 23S rRNA U2457>pseudoU synthetase | ? | -
Large subunit 23S rRNA C2498>Cm methylase: unknown | ? | -
Large subunit 23S rRNA A2503>m2A methylase: unknown | ? | -
Large subunit 23S rRNA U2504>pseudoU synthetase | ? | -
All 33 large subunit ribosomal proteins (1-7,9-11,13-25,27-36) | + except 25, 30 | +
Translational initiation factor 1 | + | +
Translational initiation factor 2 | + | +
Translational initiation factor 3 | + | +
Translational elongation factor Tu | + | +
Translational elongation factor Ts | + | +
Translational elongation factor G | + | +
Translational release factor 1 | + | +
Translational release factor 2 | - | +
Translational release factor Gln methylase | + | +
Translational release factor 3 | - | +
Ribosome recycling factor | + | +
33/45 Transfer RNAs (see Fig. 2) | 29/33 | +
tRNA(I) C34>lysidine synthetase | ? | +
tRNA(R) A34>I deaminase | ? | +
tRNA(ASV) U34>cmo5U (=V) synthetase: unknown | - | -
tRNA(R) U34>2sU Cys desulfurase | - | +
tRNA(R) nm5U34 methylase | ? | +
tRNA(R) U34>cmnm5U GTPase | ? | +
tRNA(R) U34>cmnm5U synthetase | ? | +
tRNA(R) cmnm5U34>nm5U,mnm5U synthetase | ? | -
tRNA(R) G37 N1-methylase | + | +
tRNA(RNIKM) A37>t6A N6-threonylcarbamoyl-A synthetase: unknown | + | -
tRNA(CLFSWY) A37>i6A synthetase | - | +
tRNA(CLFSWY) i6A37>s2i6A(ms2i6A) synthetase | - | +
All 22 aminoacyl-tRNA synthetase subunits (20 enzymes) | + except G subunit, Q | + except G subunit
Met-tRNA formyltransferase | + | +
Chaperonin DnaK | + | +
Chaperonin GroEL | + | +
Chaperonin GroES | + | +

Total genes = 150
Forster & Church

Oligos for 150 & 776 synthetic genes (for E.coli minigenome & M.mobile whole genome respectively)

Up to 760K Oligos/Chip
18 Mbp for $700 raw (6-18K genes)

<1K Oxamer: Electrolytic acid/base
8K Atactic/Xeotron/Invitrogen: Photo-Generated Acid. Sheng, Zhou, Gulari, Gao (U.Houston)
24K Agilent: Ink-jet, standard reagents
48K Febit
100K Metrigen
380K Nimblegen: Photolabile 5' protection. Nuwaysir, Smith, Albert

Tian, Gong, Church
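
The headline figure of 18 Mbp for $700 raw can be turned into a per-base cost with simple arithmetic. This is a minimal sketch using only the numbers on the slide; the variable names are illustrative, and "raw" cost is taken at face value (no assembly, error correction or labor included).

```python
# Numbers quoted on the slide: 18 Mbp of raw oligo sequence for $700,
# said to correspond to 6-18K genes.
total_bp = 18e6
raw_cost_usd = 700.0

cost_per_bp = raw_cost_usd / total_bp      # dollars per base pair
bp_per_dollar = total_bp / raw_cost_usd    # base pairs per dollar

print(f"raw cost: ${cost_per_bp:.2e} per bp ({bp_per_dollar:,.0f} bp per $)")

# Implied average gene length and raw cost per gene for the quoted gene counts.
for genes in (6_000, 18_000):
    avg_len = total_bp / genes
    print(f"{genes:>6,} genes: ~{avg_len:,.0f} bp each, ~${raw_cost_usd / genes:.3f} per gene")
```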

Improve DNA Synthesis Cost

Synthesis on chips in pools is 5000X less expensive per oligonucleotide, but amounts are low (1e6 molecules rather than the usual 1e12), & bimolecular kinetics slow with the square of the concentration decrease!
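
The kinetic penalty can be made concrete with a back-of-the-envelope calculation. The sketch below assumes simple second-order kinetics (rate proportional to the product of the two strand concentrations) and the same reaction volume in both cases; the function name is illustrative and not from the source.

```python
def relative_bimolecular_rate(molecules_chip: float, molecules_standard: float) -> float:
    """Ratio of second-order annealing rates when both strands are diluted equally.

    Assumes rate = k * [A] * [B] and an unchanged reaction volume, so the rate
    scales with the square of the concentration ratio.
    """
    dilution = molecules_chip / molecules_standard
    return dilution ** 2


dilution = 1e6 / 1e12
ratio = relative_bimolecular_rate(1e6, 1e12)
print(f"concentration ratio:     {dilution:.0e}")
print(f"relative annealing rate: {ratio:.0e}")
```

Under these assumptions a million-fold drop in amount costs roughly a 1e12-fold drop in annealing rate, which is why amplifying the oligos before assembly matters.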

Solution: Amplify the oligos then release them.

10 + 50 + 10 => ss-70-mer (chip)
20-mer PCR primers with restriction sites at the 50mer junctions
=> ds-90-mer
=> ds-50-mer

Tian, Gong, Sheng, Zhou, Gulari, Gao, Church Nature 2004
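
The lengths in this amplify-then-release scheme follow from simple bookkeeping. The sketch below is one reading of the slide, not the authors' protocol: it assumes each 20-mer primer anneals to a full 10-nt flank and adds a 10-nt overhang, and that the restriction cut falls exactly at the flank/payload junction. The function and key names are hypothetical.

```python
def amplification_lengths(flank: int = 10, payload: int = 50, primer: int = 20) -> dict:
    """Track construct length through chip synthesis, PCR and release."""
    chip_oligo = flank + payload + flank       # single-stranded oligo made on the chip
    overhang = primer - flank                  # primer bases extending past the flank (assumed)
    pcr_product = chip_oligo + 2 * overhang    # double-stranded product after PCR
    released = payload                         # after cutting at both junctions (assumed)
    return {
        "ss_chip_oligo": chip_oligo,
        "ds_pcr_product": pcr_product,
        "ds_released": released,
    }


print(amplification_lengths())
```

With the default values this reproduces the 70-mer, 90-mer and 50-mer quoted on the slide.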

Improve DNA Synthesis Accuracy via mismatch selection

Tian & Church

Other mismatch methods: MutS (&H,L)

Computer Aided Design Polymerase Assembly Multiplexing (CAD-PAM)

Moving forward:
1. Tandem, inverted and dispersed repeats (hierarchical assembly, size-selection and/or scaffolding)
2. Reduce mutations (goal <1e-6 errors) to reduce # of intermediates
3. 15kb to 5Mb by homologous recombination (Nick Reppas)
4. Phage integrase site-specific recombination, also for counters.

Stemmer et al. 1995. Gene 164:49-53; Mullis 1986 CSHSQB.

50, 75, 125, 225, 425, 825 … 100*2^(n-1)
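
The length series is consistent with each polymerase-assembly cycle joining two products that share a fixed overlap. The sketch below assumes a 25-nt overlap, which is inferred from the numbers rather than stated on the slide; under that assumption the exact closed form is 25*(2^n + 1), and the slide's 100*2^(n-1) is best read as an approximation for large n.

```python
def assembly_lengths(start=50, overlap=25, cycles=5):
    """Product length after each cycle, assuming two products join with a fixed overlap."""
    lengths = [start]
    for _ in range(cycles):
        # Two fragments of the current length anneal over `overlap` nt and are extended.
        lengths.append(2 * lengths[-1] - overlap)
    return lengths


series = assembly_lengths()
print(series)  # [50, 75, 125, 225, 425, 825]

# Closed form under the same assumption: L_n = 25 * (2**n + 1)
print([25 * (2 ** n + 1) for n in range(6)])
```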

All 30S-Ribosomal-protein DNAs (codon re-optimized)

Tian, Gong, Sheng, Zhou, Gulari, Gao, Church

[Figure labels: 1.7 kb; 0.3 kb; s19 0.3 kb; Nimblegen 95K chip; Atactic <4K chip]

Improving synthesis accuracy

Method | Bp/error | Ref.
Chip assembly (PAM) | 160 | 1
Hybridization-selection | 1,400 | 1
MutS-gel-shift | 10,000 | 2
MutHLS cleavage | 30,000 | 3 (10X better than PCR)

1. Tian, Church, et al. 2004 Nature 432:1050
2. Carr, Jacobson, et al. 2004 NAR 32:e162
3. Smith & Modrich 1997 PNAS 94:6847
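
To see what these bp/error figures mean for a finished construct, one can estimate the fraction of molecules that carry no error at all. The sketch below assumes errors are independent and uniformly distributed along the sequence, which is a simplification; the 1,000 bp construct length is an arbitrary illustrative choice, not a number from the slide.

```python
# Bp/error values as listed in the table.
BP_PER_ERROR = {
    "Chip assembly (PAM)": 160,
    "Hybridization-selection": 1_400,
    "MutS-gel-shift": 10_000,
    "MutHLS cleavage": 30_000,
}


def fraction_error_free(length_bp: int, bp_per_error: float) -> float:
    """Probability that a construct of length_bp has zero errors (independent errors assumed)."""
    per_base_error = 1.0 / bp_per_error
    return (1.0 - per_base_error) ** length_bp


length = 1_000  # illustrative construct length in bp
for method, bpe in BP_PER_ERROR.items():
    print(f"{method:<26} {fraction_error_free(length, bpe):.3f} error-free at {length} bp")
```

The steep dependence on length is the reason item 2 of the "Moving forward" list ties lower mutation rates to fewer intermediates.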

Extreme mRNA makeover for protein expression in vitro

RS-2, 4, 5, 6, 9, 10, 12, 13, 15, 16, 17, and 21 detectable initially.

RS-1, 3, 7, 8, 11, 14, 18, 19, 20 initially weak or undetectable.

Solution: Iteratively resynthesize all mRNAs with less mRNA structure.

Tian & Church

[Figure: Western blot based on His-tags; lanes 20w, 20m, 17w, 17m, 16w, 16m; 10 kd marker. w: wild-type, m: modified]

Synthetic - homologous recombination testing of DNA motifs

[Figure: RNA Ratio (motif- to wild type) for each flanking gene: 1.3, 2.4 (1.3 in argR); 1.1, 1.3; 0.7, 2.5; 0.2, 1.4; 1.4, 3.5]

Bulyk, McGuire, Masuda, Church Genome Res. 14:201–208

Safe Synthetic Biology

Church, G.M. (2004) A synthetic biohazard non-proliferation proposal.

http://arep.med.harvard.edu/SBP/Church_Biohazard04c.doc

1. Monitor oligo synthesis via expansion of Controlled substances, Select Agents, &/or Recombinant DNA

2. Computational tools are available; very small number of reagent, instrument & synthetic gene suppliers at present.

3. System modeling checks for synthetic biology projects

4. Multi-auxotroph, novel genetic code for the host genome, prevents functional transfer of DNA to other cells.

Public relations & safety

Church, G.M. A synthetic biohazard non-proliferation proposal (2004) http://arep.med.harvard.edu/SBP/Church_Biohazard04c.doc

• Monitor oligo synthesis via expanding the purview of Controlled substances, Select Agents, &/or Recombinant DNA.
• Computational tools for the above (e.g. Craic)
• System modeling for all Synthetic Biology Projects
• Avoid environmental release uses (at least initially)

Beckwith'69, Asilomar'75, AGS-Rifkin'84-6, Starlink'00, Roundup'04…
http://www.americanscientist.org/template/BookReviewTypeDetail/assetid/16207
http://www.social-ecology.org/article.php?story=2003120211014237

Jackson et al. (2001) J Virol. 75:1205-10. "immunized genetically resistant mice with the virus expressing IL-4 resulted in significant mortality due to fulminant mousepox."

Safety via blocking exchange

Can we make a cell which is resistant to all viruses and incapable of *functional* DNA exchange in or out?

One option is genetic code remapping.

Micrococcus luteus is naturally missing 6 codons: UUA(L), CUA(L), AUA(I), GUA(V), CAA(Q), AGA(R). Kowal, AK, & Oliver, JS NAR 1997, 25: 4685
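
The standard-code meaning of each listed codon can be checked against a compact codon table. The sketch below builds the standard genetic code (the usual 64-letter string with bases ordered U, C, A, G) and looks up the six codons; in that code GUA encodes valine (V). The variable names are illustrative.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD_CODE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Codons reported as missing in Micrococcus luteus.
missing_codons = ["UUA", "CUA", "AUA", "GUA", "CAA", "AGA"]
for codon in missing_codons:
    print(codon, STANDARD_CODE[codon])
```

All six codons end in A, and each has synonymous codons in the standard code, so a genome can in principle avoid them without changing its proteins.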

Remaking a genome: rE.coli

| # total | # to next | Average bp/chunk | # primer pairs | Overlap bp | Comments
rE.coli_0 | 1 | - | 4,648,882 | - | | -3017 bp bio + 12224 bp Red
H1 | 47 | 47 | 100,000 | - | 760 | 23 kanR 24 camR
H2 | 47 | 47 | 100,000 | - | 760 | 23 tetR 24 zeoR offset 50 kb from H1
T | 470 | 10 | 10,000 | 470 | 25 | AT dinuc-ends of each 25mer (needed for UDG cleavage) are constrained by genome
F | 13,045 | 28 | 400 | 470 | 40 | Test of PAM at 28-plex
C | 208,721 | 16 | 50 | 1 | 25 | 15+50+15 mers
S | 417,442 | 2 | 25 | 2 | 25 | 1Tm ~25b
QA | 208,721 | 1 | 50 | - | - | Both strands of C-oligos to assess C & F
QB | 8328 | 4% | 25 | - | - | Both strands centered on all potential mismatches 1704 AGG>AGA & 378 UAG>UAA
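
The piece counts in the rE.coli table are roughly self-consistent if the "# to next" column is read as the number of pieces at one level that make up one unit of the next-larger level. That reading is an assumption, not something the slide states. The sketch below checks it for the main hierarchy; H2, which parallels H1, is left out.

```python
# (name, # total, # to next) copied from the table; None where the table shows "-".
LEVELS = [
    ("rE.coli_0", 1, None),
    ("H1", 47, 47),
    ("T", 470, 10),
    ("F", 13_045, 28),
    ("C", 208_721, 16),
    ("S", 417_442, 2),
]


def fanout_check(levels):
    """Compare each listed total with (parent total) x (pieces per parent)."""
    rows = []
    for (_, parent_total, _), (name, total, per_parent) in zip(levels, levels[1:]):
        predicted = parent_total * per_parent
        rows.append((name, total, predicted, total / predicted))
    return rows


for name, total, predicted, ratio in fanout_check(LEVELS):
    print(f"{name:>2}: listed {total:>7,}  predicted {predicted:>7,}  ratio {ratio:.3f}")
```

Under this reading every level agrees with its parent to within about 1%, with the small gaps plausibly due to rounding of the chunk sizes.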

rE.coli Project: Free up & switch codons in vivo

UAG>A

AGG>A
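
The codon switches named in the rE.coli table (AGG>AGA and UAG>UAA) are both synonymous in the standard code: arginine stays arginine and a stop stays a stop. The sketch below applies them in frame to an RNA-style open reading frame. The function name and the toy sequence are hypothetical, and real genome recoding must also handle overlapping genes and regulatory elements, which this ignores.

```python
# Synonymous swaps taken from the rE.coli table.
SWAPS = {"AGG": "AGA", "UAG": "UAA"}


def recode(orf: str) -> str:
    """Replace in-frame AGG and UAG codons; out-of-frame matches are left untouched."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return "".join(SWAPS.get(codon, codon) for codon in codons)


toy_orf = "AUGAGGGCUAGGUAG"  # AUG AGG GCU AGG UAG (made-up example)
print(recode(toy_orf))       # AUGAGAGCUAGAUAA
```

Working codon by codon matters here: a plain string replacement would also hit the out-of-frame UAG that spans the GCU and AGG codons in the toy sequence.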

Amplifying DNA from single chromosomes

φ29 real-time amplification

No template control

Affymetrix quantitation of independent amplifications

Prochlorococcus & Escherichia

Zhang, Martiny, Chisholm, Church, unpub.

Polony Bead Sequencing Pipeline

In vitro libraries via paired tag manipulation

Bead polonies via emulsion PCR [Dre03]

Monolayered immobilization in acrylamide

Enrichment of amplified beads

SOFTWARE
Images → Tag Sequences
Tag Sequences → Genome

FISSEQ or “wobble” sequencing

Epifluorescence Scope with Integrated Flow Cell

Mitra, Shendure, Porreca, Rosenbaum, Church unpub.

Integrating with appropriate sequencing strategies

Oligo-testing | dNTP-extension | Capillary-sequencing |
1 | 2.5 | NA | bp read/cycle of 4 bases
10 | 14-200 | 800 | bp reads
3e-3 | 4e-5 | 1e-4 | non-homopolymer errors
3e-3 | 1e-1 | 1e-3 | homopolymer errors
1M | 1M | 1K | bp/$

Shendure J, Mitra R, Varma C, Church GM (May 2004) Advanced Sequencing Technologies: Methods & Goals. Nature Reviews Genetics 5, 335-344.

NHGRI Seeks Next Generation of Sequencing Technologies (Jan 2004) http://www.genome.gov/12513210

Automated homologous recombination

• Positive & Negative Selection in same gene: URA3 (yeast), ThyA (E.coli), GFP (various)

• Electroporation, viral, conjugative delivery 3 oriT regions: IncP, F, and R64 (IncI)

Valenzuela DM, et al. Nat Biotechnol. 2003 Jun;21(6):652-9. High-throughput engineering of the mouse genome coupled with high-resolution expression analysis. up to 25% targeting with BACs.

Yang Y, Seed B. Site-specific gene targeting in mouse embryonic stem cells with intact bacterial artificial chromosomes. Nat Biotechnol. 2003 21:447-51.

Schneckenburger H, et al. J Biomed Opt. 2002 Jul;7(3):410-6. Laser-assisted optoporation of single cells.

Integrase applications

(1) In vivo recombination (increase fidelity & efficiency)
Nucleofection of muscle-derived stem cells and myoblasts with phiC31 integrase. Mol Ther. 2004 10:679-87.

(2) In vitro plasmid construction (Gateway)

(3) In vivo counters allow recording & increased analog I/O through digital reuse of functions. For a 3-bit (8-state) counter:

0 0 0 lac-GFP
0 0 1 ara-GFP
0 1 0 trp-GFP
0 1 1 tet-GFP
1 0 0 etc.
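
The counter idea can be sketched as a lookup from a 3-bit state to the reporter that state switches on. This is only an illustration of the state table on the slide, not a model of integrase biochemistry; the slide names reporters for the first four states only, so the remaining states are marked as unspecified.

```python
# State-to-reporter assignments given on the slide (states 4-7 are not specified there).
REPORTERS = {0: "lac-GFP", 1: "ara-GFP", 2: "trp-GFP", 3: "tet-GFP"}


def counter_states(bits: int = 3):
    """Yield (bit pattern, reporter) for every state of an n-bit counter."""
    for count in range(2 ** bits):
        pattern = format(count, f"0{bits}b")
        yield pattern, REPORTERS.get(count, "(not specified)")


for pattern, reporter in counter_states():
    print(" ".join(pattern), reporter)
```

Three invertible elements give 2^3 = 8 distinguishable states, so each added bit doubles the number of events that can be recorded.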

Sam MD, Cascio D, Johnson RC, Clubb RT. Crystal structure of the excisionase-DNA complex from bacteriophage lambda. J Mol Biol. 2004 Apr 23;338(2):229-40.

Aihara H, Kwon HJ, Nunes-Duby SE, Landy A, Ellenberger T. A conformational switch controls the DNA cleavage activity of lambda integrase. Mol Cell. 2003 Jul;12(1):187-98.

Int/Xis contacts

Integrase specificity … diversity

Invitrogen Gateway Vectors

Parr RD, Ball JM.(2003) Plasmid 49:179. Nakayama M, Ohara O. (2003) BBRC 312:825

Potential Commercial Biology Partners / Competitors

Invitrogen: Gateway cloning
Poetic Genetics: Integrases & Gene Therapy
Regeneron: Mammalian BAC recombination 1%
Scarab Genomics: Better E. coli strains 20% of genome
Avidia/Diversa: Shuffling/selection
Ensemble: DNA catalysts
Amyris: Terpenoid pathways
Kosan Biosciences: Polyketide pathways
Big & Small Pharma

ibm.com/chips/services/foundry/partners

Mosis.org "50,000 designs… keep prototype costs low by aggregating many designs onto one mask set, sharing overhead" Fabrication Processes: AMIS, IBM, Austriamicrosystems, OMMIC/PML, Peregrine, TSMC, Vitesse

Analog Bits, Artisan Components, Cadence, eSilicon Corporation, GDA Technologies Inc., insyte, Jennic Limited, Kisel Microelectronics, Magma, MOSIS, QThink, QualCore Logic, Inc., RF Integration, Sierra Monolithics, SOCLE Technology, Synopsys, Tahoe RF Semiconductors, TelASIC, TriCN, Triscend, Virtual Silicon,

You are here

Example 2: linux.org
redhat.com Q2 $46M up 60%
