gggatttagctcagtt gggagagcgccagact gaa gat ttg gag gtcctgtgttcgatcc acagaattcgcacca Share, Search, Merge, Check, Design: e.g. 3D & Sequence alignment

Post on 20-Jan-2016


Page 1:

gggatttagctcagttgggagagcgccagactgaa gatttg gaggtcctgtgttcgatccacagaattcgcacca

Share, Search, Merge, Check, Design: e.g. 3D & Sequence alignment

Page 2:

Harvard-MIT GtL Center Goals

1 Protein Complexes : Mass Spectrometry multi-species-time-series & crosslinking

2 Regulatory Networks : RNA array quantitation

3 Microbial Communities, Biofilms : Polonies* Tagged-strain-competition, Single Cell Activities.

4 Computational Modeling: Metabolic Optimization & 4D Cell modeling* (Workshop B*)

Page 3:

CO2 100 ppmv increase

http://jan.ucc.nau.edu/~doetqp-p/courses/env470/Lectures/lec41/Lec41.htm

Page 4:

Energy & CO2 Fluxes

4x10^13 kW of sunlight hits earth per year. We consume 2 kW per person * 6x10^9 = 10^10 kW.

CO2 >370 ppm = 730x10^15 g globally, increase ~3x10^15 g/yr. Ocean productivity = ~100x10^15 g/yr.

Autotrophs: 10^25 Prochlorococcus cells globally (10^8 per liter)

Undone by Cyanophages & Heterotrophs: 2x10^28 SAR11 cells in the oceans; Pseudomonas & Caulobacter in a variety of soils & aquatic environments

http://www.gsfc.nasa.gov/gsfc/service/gallery/fact_sheets/earthsci/terra/earths_energy_balance.htm
http://clear.eawag.ch/models/optionenE.html
Morris et al. Nature 2002 Dec 19-26;420(6917):806-10.
http://hosting.uaa.alaska.edu/mhines/biol468/pages/carbon.html
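The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below only recombines numbers quoted on the slide; the variable names are illustrative and the derived ratios are not claims made by the talk.

```python
# Back-of-envelope check of the slide's figures (all inputs copied from the slide).
SUNLIGHT_KW = 4e13                   # sunlight figure quoted on the slide
PER_CAPITA_KW = 2.0                  # consumption per person
POPULATION = 6e9                     # people
CO2_STOCK_G = 730e15                 # global CO2 at >370 ppm
CO2_INCREASE_G_PER_YR = 3e15         # yearly increase
OCEAN_PRODUCTIVITY_G_PER_YR = 100e15
PROCHLOROCOCCUS_CELLS = 1e25         # global cell count
CELLS_PER_LITER = 1e8

# 2 kW * 6e9 people = 1.2e10 kW, which the slide rounds to 10^10 kW.
human_kw = PER_CAPITA_KW * POPULATION
# Human consumption as a share of the sunlight figure.
human_fraction = human_kw / SUNLIGHT_KW
# Fractional growth of the CO2 stock per year.
co2_growth_per_year = CO2_INCREASE_G_PER_YR / CO2_STOCK_G
# Yearly CO2 increase relative to ocean productivity.
increase_vs_ocean = CO2_INCREASE_G_PER_YR / OCEAN_PRODUCTIVITY_G_PER_YR
# Water volume implied by the cell count and the per-liter density.
implied_liters = PROCHLOROCOCCUS_CELLS / CELLS_PER_LITER

if __name__ == "__main__":
    print(f"human consumption: {human_kw:.2e} kW")
    print(f"share of sunlight: {human_fraction:.1e}")
    print(f"CO2 growth per year: {co2_growth_per_year:.2%}")
    print(f"increase / ocean productivity: {increase_vs_ocean:.0%}")
    print(f"implied Prochlorococcus habitat: {implied_liters:.1e} liters")
```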

Page 5:

Harvard-MIT DOE GtL Center

Collaborating PIs: Chisholm, Polz, Church, Kolter, Ausubel, Lory, Laub, Kucherlapati

C.Ting

Page 6:

DNA RNA Proteins

Metabolites

Replication rate

Environment

Biosystems Integrating Measures & Models

Microbes, Cancer & stem cells, Darwinian optima, In vitro replication, Small multicellular organisms

RNAi, Insertions, SNPs

interactions

Page 7:

Link et al. 1997 Electrophoresis 18:1259-313 (Pub)

Comparison of predicted with observed protein properties (abundance, localization, postsynthetic modifications)

E. coli

Page 8:

In vivo crosslinking DNA-binding proteins

Comparison of Quantification Methods

[Figure: log-log scatter plot. x-axis: Fractional Composition (percent - total intensity all peptides), 0.0001 to 100; y-axis: Fractional Composition (percent), 0.001 to 100. Labeled points: dps, rpoc, rpob, hns, dbha, ssb, gyrb, ihfa, lon, ihfb, top1, uvra, crp, argr, nusa, hrpa, sspa, fur]

Page 9:

(Optionally protein separation steps)

3rd 2nd

Multidimensional peptide measures

Page 10:

Prochlorococcus Proteogenomic Map

Numbers on top are in base pairs. 1700 ORFs are predicted. The Proteomic Model is based on mass spectrometry of peptides at 24 h time points. The DifferenceMap indicates new peptide regions. The 6 colors represent ORFs in the 6 reading frames. (Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)

Page 11:

Circadian time-series (Prochlorococcus) RNA & protein quantitation:

[Figure: scatter plots with linear regression; R^2 = .992, R^2 = .635, R^2 = .1; axis label: RNA (3 AM)]

(Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)

Page 12:

Goals 1 & 2: RNAs & Proteins, Next steps

1 Detect a higher fraction of peptides (currently ~ 80% proteins, 87% peptides max, 19% average)

2 Comparison of two Prochlorococcus isolates (1700 vs 2500 genes, high vs low light adapted)

3 Move from two time points to smooth series.

Page 13:

DNA RNA Proteins

Metabolites

Replication rate

Environment

Biosystems Integrating Measures & Models

Microbes, Cancer & stem cells, Darwinian optima, In vitro replication, Small multicellular organisms

RNAi, Insertions, SNPs

interactions

Page 14:

Why do we model cells?

• Tests of understanding
• Program minimal cells (100 kbp)
• Nanobiotechnology - new polymers
• Manage complex systems, e.g. stem cells & ocean ecology

Page 15:

Minimization of Metabolic Adjustment (MoMA) for the analysis of non-optimal metabolic phenotypes

Daniel Segre, Dennis Vitkup

Suboptimality of mutants: integrating growth rate & flux data

Page 16:

MoMA/FBA REFERENCES

- Haemophilus influenzae metabolism (Schilling and Palsson, J. Theor. Biol. 2000)

- Escherichia coli metabolic network and gene deletions (Edwards and Palsson, PNAS 2000, BMC Bioinf. 2000)

- Helicobacter pylori (Edwards, Schilling, Covert, Church, Palsson, J. Bact 2002)

- Escherichia coli MOMA (Segre, Vitkup, & Church, PNAS 2003)

Page 17:
Page 18:

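[Figure: metabolite Xi inside the cell membrane, with fluxes Vtrans (transport across the membrane), Vsyn, Vdeg, and Vgrowth]

Growth: c1X1 + c2X2 + ... + cmXm -> Biomass

Fluxes include transport, & a growth flux

Steady state: Xi = const., so the fluxes vj through Xi sum to 0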
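The steady-state condition on this slide can be written as a mass balance on each metabolite. The following is a sketch under an assumed sign convention (synthesis adds to X_i, degradation and growth remove it, and transport can go either way); S is the stoichiometric matrix and v the flux vector, as in the Null(S) definition on the Flux Balance Analysis slide.

```latex
\frac{dX_i}{dt}
  = V_{\mathrm{syn}} - V_{\mathrm{deg}} - V_{\mathrm{growth}} \pm V_{\mathrm{trans}}
  = \sum_j S_{ij}\, v_j = 0
  \quad \text{for every metabolite } i
  \qquad\Longleftrightarrow\qquad S\,v = 0

\text{Growth reaction:}\quad c_1 X_1 + c_2 X_2 + \dots + c_m X_m \;\longrightarrow\; \text{Biomass}
```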

Page 19:

Biomass Composition

[Figure: coefficient in growth reaction (log scale, 10^-6 to 10^2) plotted for metabolites numbered 0 to 45; labeled metabolites: ACCOA, COA, ATP, FAD, GLY, NADH, LEU, SUCCOA]

Page 20:

Flux Balance Analysis core

1. Null(S) = {v : Sv = 0}

2. Find max{Growth} using simplex
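The two steps above can be illustrated on a very small network. The sketch below is not the authors' model: the five-reaction network, its bounds, and all function names are hypothetical, and the null space is worked out by hand rather than computed. Because only two free fluxes remain, the linear program is solved by enumerating the vertices of the feasible polygon instead of running the simplex method named on the slide; both return an optimal vertex.

```python
from itertools import combinations

# Hypothetical toy network (not from the slides):
#   v1: -> A (uptake, at most 10)    v2: A -> B    v3: A -> C
#   v4: B + C -> biomass (growth)    v5: 2 B -> C  (costly bypass)
# Solving S v = 0 by hand leaves two free fluxes, g = v4 and b = v5:
#   v3 = g - b,  v2 = g + 2b,  v1 = 2g + b

def fluxes(g, b):
    """Expand the free fluxes (g, b) into the steady-state vector (v1..v5)."""
    return (2 * g + b, g + 2 * b, g - b, g, b)

# Remaining inequality constraints, each written as a*g + c*b <= rhs.
CONSTRAINTS = [
    (-1.0, 0.0, 0.0),   # g >= 0
    (0.0, -1.0, 0.0),   # b >= 0
    (-1.0, 1.0, 0.0),   # v3 = g - b >= 0
    (2.0, 1.0, 10.0),   # v1 = 2g + b <= 10
]

def feasible(g, b, tol=1e-9):
    return all(a * g + c * b <= rhs + tol for a, c, rhs in CONSTRAINTS)

def vertices():
    """Intersect every pair of constraint boundaries; keep feasible points."""
    points = []
    for (a1, c1, r1), (a2, c2, r2) in combinations(CONSTRAINTS, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < 1e-12:
            continue  # parallel boundaries never intersect
        g = (r1 * c2 - r2 * c1) / det   # Cramer's rule
        b = (a1 * r2 - a2 * r1) / det
        if feasible(g, b):
            points.append((g, b))
    return points

def max_growth():
    """Return the steady-state flux vector with the largest growth flux v4."""
    g, b = max(vertices(), key=lambda p: p[0])
    return fluxes(g, b)

if __name__ == "__main__":
    print(max_growth())
```

In this toy case the optimum uses the direct route to C and leaves the bypass idle, so all uptake is split evenly between the two biomass precursors.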

Page 21:

Can we use flux analysis to say something about suboptimal states?

Page 22:

Flux ratios at each branch point yield optimal polymer composition for replication

x,y are two of the 100s of flux dimensions

Page 23:

Projection can leave the mutant feasible space, so use quadratic programming (QP) to find the nearest point.
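The "nearest point" idea can be shown on a toy case where the mutant feasible space is a line segment, so the quadratic program has a closed-form answer: project onto the line, then clip to the segment. This is a minimal sketch, not the published MoMA implementation; the network, the knockout, and every name below are hypothetical, and a real model would need a general QP solver over hundreds of fluxes.

```python
import math

# Hypothetical toy network (not from the slides):
#   v1: -> A (uptake <= 10)   v2: A -> B   v3: A -> C
#   v4: B + C -> biomass      v5: 2 B -> C (costly bypass)
# Growth-optimal wild-type fluxes (v1..v5) for this network:
WILD_TYPE = (10.0, 5.0, 5.0, 5.0, 0.0)

# Knocking out v3 forces v5 = v4 = g and v2 = v1 = 3g, so every feasible
# mutant flux vector is g * DIRECTION with 0 <= g <= 10/3 (uptake limit).
DIRECTION = (3.0, 3.0, 0.0, 1.0, 1.0)
G_MAX = 10.0 / 3.0

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mutant_fluxes(g):
    return tuple(g * d for d in DIRECTION)

def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def fba_growth():
    """Mutant growth if the knockout re-optimizes for maximal growth."""
    return G_MAX

def moma_growth():
    """Growth at the feasible mutant point closest to the wild type.

    Minimizing |g * DIRECTION - WILD_TYPE|^2 over one variable gives the
    projection coefficient; clipping enforces 0 <= g <= G_MAX.
    """
    g = dot(WILD_TYPE, DIRECTION) / dot(DIRECTION, DIRECTION)
    return min(max(g, 0.0), G_MAX)

if __name__ == "__main__":
    for name, g in (("FBA", fba_growth()), ("MoMA", moma_growth())):
        d = distance(mutant_fluxes(g), WILD_TYPE)
        print(f"{name}: growth = {g:.3f}, distance to wild type = {d:.3f}")
```

Here the nearest-point solution grows more slowly than the re-optimized one but stays closer to the wild-type flux distribution, which is the kind of suboptimal state the preceding slides ask about.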

Page 24:

Flux Ratio Data

12C / 13C

Page 25:

Flux Data C-0.09-limited

[Figure: three scatter plots of Predicted Fluxes vs. Experimental Fluxes, each with 18 numbered fluxes. Panels: WT (LP), correlation = 0.91, p = 8e-8; pyk (LP), correlation = -0.06, p = 6e-1; pyk (QP), correlation = 0.56, p = 7e-3]

Page 26:

Flux data (MOMA & FBA)

Condition  Method     corr. 1   p-val (a)   corr. 2   p-val (c)
C-0.09     wt          0.91     8E-8
C-0.09     ko (FBA)   -0.064    6E-1        -0.36     9E-1
C-0.09     ko MoMA     0.56     7E-3         0.48     2E-2
C-0.4      wt          0.97     8E-12
C-0.4      ko (FBA)    0.77     8E-5         0.36     7E-2
C-0.4      ko MoMA     0.94     3E-9         0.74     2E-4
N-0.09     wt          0.78     7E-5
N-0.09     ko (FBA)    0.86     3E-6         0.096    4E-1
N-0.09     ko MoMA     0.73     3E-4         0.49     2E-2

p-val (b) and p-val (d) columns (row assignment not recoverable from the transcript): 1E-2, 5E-2, 2E-4; 3E-3, 3E-3, 9E-2

Page 27:

Essential         142    80    62
Reduced growth     46    24    22
Non essential     299   119   180    p = 4x10^-3

Essential         162    96    66
Reduced growth     44    19    25
Non essential     281   108   173    p = 10^-5

MOMA

FBA

Competitive growth data

χ2 p-values: 4x10^-3, 1x10^-5

Position effects Novel redundancies

On minimal media

negative small selection effect
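The p-values on this slide come from count tables, so a Pearson chi-square test is one way to sanity-check them. The sketch below is an assumption-laden reading: the transcript does not preserve the column headers or name the test, so it treats the last two numbers in each row as a 3x2 contingency table (the first number is their sum). The result need not match the slide exactly.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def p_value_two_dof(stat):
    """Upper-tail probability for chi-square with exactly 2 degrees of freedom.

    For 2 degrees of freedom the survival function is exp(-x / 2), so no
    statistics library is needed. Do not use this for other table shapes.
    """
    return math.exp(-stat / 2.0)

# Rows: Essential, Reduced growth, Non essential (counts copied from the slide).
# Column meanings are unknown; they are an assumption of this example.
FIRST_TABLE = [(80, 62), (24, 22), (119, 180)]
SECOND_TABLE = [(96, 66), (19, 25), (108, 173)]

if __name__ == "__main__":
    for name, table in (("first", FIRST_TABLE), ("second", SECOND_TABLE)):
        stat, dof = chi_square(table)
        print(f"{name} table: chi2 = {stat:.2f}, dof = {dof}, "
              f"p = {p_value_two_dof(stat):.1e}")
```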

Page 28:

Replication rate of a whole-genome set of mutants

Badarinarayana, et al. (2001) Nature Biotech. 19: 1060

Page 29:

Replication rate challenge met: multiple homologous domains

[Figure: probes against homologous domains 1-3 of thrA and metL and domains 1-2 of lysC; values shown: 1.1, 6.7; 1.8, 1.8; 10.4]

Selective disadvantage in minimal media

Page 30:

Multiple mutations per gene

Correlation between two selection experiments

Badarinarayana, et al. (2001) Nature Biotech. 19: 1060

Page 31:

Goals 3 & 4: Populations and models, Next steps

1 Generate MOMA models for autotrophs

2 Comparison of models for multiple Prochlorococcus & Pseudomonas genomes

3 Insertion & point mutant competitions for hard-to-grow species (e.g. Prochlorococcus, 24 hr doubling).

Page 32:

Harvard-MIT GtL Center Goals

1 Protein Complexes : Mass Spectrometry multi-species-time-series & crosslinking

2 Regulatory Networks : RNA array quantitation

3 Microbial Communities, Biofilms : Polonies* Tagged-strain-competition, Single Cell Activities.

4 Computational Modeling: Metabolic Optimization & 4D Cell modeling

Page 33:

DNA RNA Proteins

Metabolites

Environment

Biosystems Integrating Measures & Models

Microbes, Cancer & stem cells, In vitro replication, multicellular organisms

interactions

Polonies (CD44 & cancer)

MOMA: Darwinian (sub)optima

Arrays & Mass-spec (circadian & cell cycle)

Page 34:
Page 35:

GtL Workshop B: Experimental Technology Development and Integration Tue at 2 PM

Co-Chairs: George Church, Harvard Medical School; Ham Smith, Institute for Biological Energy Alternatives

As we attempt to understand, protect, and/or engineer environmental microbial communities, we need to ask what sorts of data would most benefit our models and how to obtain these cost-effectively. For this session, let us answer what small (or large) technological steps we are taking toward these specific challenges: (1) microscopic methods capable of tracing the chain of a small genome, (2) quantitation of “all” peptide states (either in single cells or populations), (3) sequencing at Mbp per $, and (4) automated designed genome engineering.

The framework for the discussions will be the following questions:
• What are the most useful technologies for our tasks/goals now and for the future?
• What are the major technological gaps that will need to be addressed to reach the GTL goals?
• To what extent will the technologies be developed by others?
• How can technologies best be used to complement each other and strengthen the resulting research/insights?
• How do we promote this kind of synergistic interaction among the practitioners?

Presentations by Joachim Frank (Wadsworth Center, New York State Department of Health) on Cryo-Electron Microscopy, Bob Hettich or Greg Hurst (ORNL) and Dick Smith (PNNL) on Mass spectrometry, Hoi-Ying Holman (Berkeley Lab) on FTIR imaging, and Steve Colson (PNNL) on optical imaging.

We would like to invite you to bring one viewgraph to share with the participants on your views about technologies needed to meet these challenges.

Page 36:

DNA RNA Proteins

Metabolites

Replication rate

Environment

Biosystems Integrating Measures & Models

Microbes, Cancer & stem cells, Darwinian optima, In vitro replication, Small multicellular organisms

RNAi, Insertions, SNPs

interactions

Page 37:

Improving Models & Measures

Why model?

“Killer Applications”: Share, Search, Merge, Check, Design

Page 38:

The issue is not speed, but integration.
Cost per 99.99% bp: Including Reagents, Personnel, Equipment/5yr, Overhead/sq.m
• Sub-mm scale: 1 µm = femtoliter (10^-15)
• Instruments $2-50K per CPU

Why improve measurements?

Human genomes: (6 billion)^2 = 10^19 bp
Immune & cancer genome changes: >10^10 bp per time point
RNA ends & splicing: in situ 10^12 bits/mm^3

Biodiversity: Environmental & lab evolution
Compact storage: 10^5 now to 10^17 bits/mm^3 eventually

& How? ($1K per genome, 10^8-10^13 bits/$)

Page 39:

Projected costs determine when biosystems data overdetermination is feasible.

In 1984, pre-HGP (X, pBR322, etc.): 0.1 bp/$, which would have been $30B per human genome.

In 2002 (de novo full vs. resequencing), ABI/Perlegen/Lynx: $300M vs. $3M

10^3 bp/$ (4 log improvement)

Other data I/O (e.g. video): 10^13 bits/$
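The per-genome costs on this slide follow from dividing genome size by throughput per dollar. The sketch below assumes a haploid human genome of about 3x10^9 bp, a figure not stated on the slide but consistent with its $30B and $3M numbers; the function name is illustrative.

```python
import math

# Assumption: haploid human genome size of ~3e9 bp (not stated on the slide).
HUMAN_GENOME_BP = 3e9

def cost_per_genome(bp_per_dollar, genome_bp=HUMAN_GENOME_BP):
    """Dollars needed to sequence one genome at a given throughput per dollar."""
    return genome_bp / bp_per_dollar

cost_1984 = cost_per_genome(0.1)    # 3e10 dollars, i.e. $30B
cost_2002 = cost_per_genome(1e3)    # 3e6 dollars, i.e. $3M (the resequencing figure)

# Improvement from 0.1 bp/$ to 10^3 bp/$, in powers of ten.
log_improvement = math.log10(1e3 / 0.1)

# Throughput needed to reach the $1K-per-genome goal at the same genome size.
bp_per_dollar_for_1k_genome = HUMAN_GENOME_BP / 1_000

if __name__ == "__main__":
    print(f"1984: ${cost_1984:.1e} per genome")
    print(f"2002: ${cost_2002:.1e} per genome")
    print(f"improvement: {log_improvement:.0f} logs")
    print(f"$1K genome needs {bp_per_dollar_for_1k_genome:.0e} bp/$")
```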

Page 40:

Why single molecules?

Integration from cells/genomes/RNAs to data

Geometric constraints: Who’s “in cis” on a molecule, complex, or cell. e.g. DNA Haplotypes & RNA splice-forms

Page 41:

Polymerase colonies (Polonies) along a DNA or RNA molecule

Page 42:

Polymerase colony (polony) PCR in a gel

[Figure: template strands and primers labeled A, A', and B. Panels: Single Molecule From Library; 1st Round of PCR; Primer is Extended by Polymerase]

Primer A has 5’ immobilizing Acrydite

Mitra & Church Nucleic Acids Res. 27: e34

Page 43:

Sequence polonies by sequential, fluorescent single-base extensions

• Hybridize Universal Primer
• Add Red (Cy3) dTTP. Wash.
• Add Green (FITC) dCTP
• Wash; Scan

[Figure: two templates annealed to the universal primer (B / B’, 3’ to 5’); sequences shown: AGT. with TC, and GCG.. with C]

Page 44:

$1K per diploid human sequence

Input: Buccal cells, blood, or forensic samples.
Output: Prioritized list of deviant bps (e.g. non-conservative).

Raw data rate: 16 pixels/bp, 1 Mpixel per 6 sec/CPU = 24 CPU days.
Amortization: 5 yr for camera/CPU/transport @ $50K total = $200 per 10^11 bp
Overhead: $200/sq ft/yr * 40 sq.ft (400 cu.ft) = $40
Reagents: At 20 µm per (5 µm) polony and 40 bp reads means 10000 cm^2 area, 800 ml of fluor dNTP, $100/mg = $40; 5 ml PCR reactions = $200
Disposables: 500 slides = $50
Electricity: 2 kwatts * 24 hr * 24 days * $0.13/kwatt-hr = $150
Labor for repair: 10% of instrument cost = $10
Labor for operation: Slide PCR, slide dips, scans, etc. = $20
R&D: Initially NIH grants (roughly 10%).
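Some of the entries in this budget can be checked directly. The sketch below sums the dollar amounts listed on the slide, recomputes the electricity line, and derives the base-pair yield of the stated slide area. It assumes the "20 m" polony spacing in the transcript is 20 µm (the unit prefix appears to have been lost in extraction); the dictionary keys are illustrative labels.

```python
# Dollar amounts copied from the slide; the keys are illustrative labels.
LINE_ITEMS_USD = {
    "amortization": 200,
    "overhead": 40,
    "reagents_fluor_dntp": 40,
    "reagents_pcr": 200,
    "disposables": 50,
    "electricity": 150,
    "labor_repair": 10,
    "labor_operation": 20,
}

# Total of the listed items, to compare against the $1K target.
total_usd = sum(LINE_ITEMS_USD.values())

# Electricity: 2 kW for 24 hours a day over 24 days at $0.13 per kWh.
electricity_usd = 2 * 24 * 24 * 0.13

# Yield of the stated area, assuming polonies sit on a 20 µm square grid.
PITCH_UM = 20            # assumption: "20 m" in the transcript means 20 µm
READ_LENGTH_BP = 40
AREA_CM2 = 10_000
UM_PER_CM = 10_000

polonies_per_cm2 = (UM_PER_CM / PITCH_UM) ** 2
total_bp = polonies_per_cm2 * AREA_CM2 * READ_LENGTH_BP

if __name__ == "__main__":
    print(f"sum of listed items: ${total_usd}")
    print(f"electricity: ${electricity_usd:.2f}")
    print(f"polonies per cm^2: {polonies_per_cm2:.0f}")
    print(f"total read length: {total_bp:.1e} bp")
```

Under that spacing assumption the area yields the 10^11 bp that the amortization line refers to, and the listed items add up to less than the $1K target.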

Page 45:

Inexpensive, off-the-shelf equipment

MJR in situ Cycler: $10K

Automated slide fluidics: $4K

Microarray Scanner: $26K+

Page 46:

Human Haplotype: CFTR gene

45 kbp

Rob Mitra, Vincent Butty, Jay Shendure, Ben Williams

Page 47:

Quantitative removal of Fluorophores

Rob Mitra

Page 48:

Template ST30: 3' TCACGAGT

Base added: (C) A G T (C) (A) G (T) C (A) (G) T C A

3' TCACGAGT
5' AGTGCTCA

Sequencing multiple polonies

Rob Mitra
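The "Base added" row on this slide can be reproduced by simulating cyclic single-base additions against the template. This is a minimal sketch with hypothetical function names; it assumes bases are offered in the repeating order C, A, G, T (the order the slide starts with) and that a base is incorporated only when it complements the next template position. Bases that are offered but not incorporated are printed in parentheses.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def simulate_flows(template_3_to_5, flow_order="CAGT"):
    """Return [(base_offered, n_incorporated), ...] until the template is copied."""
    if set(flow_order) != set("ACGT"):
        raise ValueError("flow_order must contain each of A, C, G, T")
    # The new strand grows 5'->3' while the template is read 3'->5'.
    needed = [COMPLEMENT[base] for base in template_3_to_5.upper()]
    flows, position, cycle = [], 0, 0
    while position < len(needed):
        base = flow_order[cycle % len(flow_order)]
        count = 0
        # Unterminated nucleotides extend through a run of identical bases.
        while position < len(needed) and needed[position] == base:
            position += 1
            count += 1
        flows.append((base, count))
        cycle += 1
    return flows

def format_flows(flows):
    """Render flows in the slide's notation: (X) means offered, not incorporated."""
    return " ".join(base * n if n else f"({base})" for base, n in flows)

def read_sequence(flows):
    """The synthesized strand, 5'->3', recovered from the incorporation events."""
    return "".join(base * n for base, n in flows)

if __name__ == "__main__":
    flows = simulate_flows("TCACGAGT")
    print(format_flows(flows))
    print(read_sequence(flows))
```

For the ST30 template this yields the same 14 additions as the slide and the read AGTGCTCA.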

Page 49:

Multiple Image Alignment

Metric based on optimal coincidence of high intensity noise pixels over a matrix of local offsets (0.4 pixel precision)

Shendure
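The coincidence metric described above can be sketched for the simplest case: one global, whole-pixel offset between two scans. This is a simplified illustration with hypothetical function names, not the method used in the talk, which works over a matrix of local offsets and reaches sub-pixel (0.4 pixel) precision.

```python
def make_image(height, width, bright_points, value=255):
    """Build a small test image as a list of rows with a few bright pixels."""
    image = [[0] * width for _ in range(height)]
    for row, col in bright_points:
        image[row][col] = value
    return image

def bright_pixels(image, threshold):
    """Coordinates of pixels at or above the intensity threshold."""
    return {
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value >= threshold
    }

def best_offset(reference, moving, threshold, max_shift):
    """Find the (d_row, d_col) shift of `moving` that maximizes coincidence.

    The score is the number of bright pixels in `moving` that land on a
    bright pixel of `reference` after shifting.
    """
    ref = bright_pixels(reference, threshold)
    mov = bright_pixels(moving, threshold)
    best, best_score = (0, 0), -1
    for d_row in range(-max_shift, max_shift + 1):
        for d_col in range(-max_shift, max_shift + 1):
            score = sum((r + d_row, c + d_col) in ref for r, c in mov)
            if score > best_score:
                best, best_score = (d_row, d_col), score
    return best, best_score

if __name__ == "__main__":
    reference = make_image(6, 6, [(1, 1), (2, 4), (4, 2)])
    moving = make_image(6, 6, [(0, 2), (1, 5), (3, 3)])
    print(best_offset(reference, moving, threshold=200, max_shift=2))
```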

Page 50:

Polony exclusion principle & Single pixel sequences

Mitra & Shendure

Page 51:

Polony Flavors

1. Replica Plating of DNA images [Mitra et al. NAR 1999]

2. Long Range Haplotyping [Mitra et al. PNAS 2003]

3. Allelic mRNA Quantitation (HEP) [Mitra et al. 2003]

4. Alternative Splicing Combinatorics [Zhu et al. 2003]

5. Precise SNP-mutant & mRNA ratios [Merrill et al. 2003]

6. Fluor in situ Sequencing (FISSEQ 1) [Mitra et al. 2003]

7. Multiplex Genotyping (ApoE, Hyman, Shendure & Williams)

8. In situ / single-cell extensions of the above (Zhu & Williams)

Page 52:

Synthetic Mini-genomes

• 90 kbp genome? All 3D structures known.
• Comprehensive functional data too.
• 100X faster replication (10 sec doubling) & selection to evolve widgets & systems?
• Utility of mirror-image & other unnatural polymers.
• Chassis & power supply

Page 53:

A 90 kbp mini-genomeSP (3D) StochimetryMge# Bp Min access# Gene L.end R.endorientationlen2 SequenceTotal 144 107 89,498 74,310 285316S 1 y 1418 1418 3968 rrsB 4164238 4165779 > 124 aaattgaagagtttgatcatggctcagattgaacgctggcggcaggcctaacacatgcaagtcgaacggtaacaggaagaagcttgcttctttgctgacgagtggcggacgggtgagtaatgtctgggaaactgcctgatggagggggataactactggaaacggtagctaataccgcataacgtcgcaagaccaaagagggggaccttcgggcctcttgccatcggatgtgcccagatgggattagctagtagg23S 1 y 2903 2903 3970 rrlB 4166220 4169123 > 1 ggttaagcgactaagcgtacacggtggatgccctggcagtcagaggcgatgaaggacgtgctaatctgcgataagcgtcggtaaggtgatatgaaccgttataaccggcgatttccgaatggggaaacccagtgtgtttcgacacactatcattaactgaatccataggttaatgaggcgaaccgggggaactgaaacatctaagtaccccgaggaaaagaaatcaaccgagattcccccagtagcggcgagcga5S 1 120 120 3971 rrfB 4169216 4169335 > 0 tgcctggcggcagtagcgcggtggtcccacctgaccccatgccgaactcagaagtgaaacgccgtagcgccgatggtagtgtggggtctccccatgcgagagtagggaactgccaggcat10sb (RNaseP) 375 375 3123 rnpB 3268233 3267857 < 2 gaagctgaccagacagtcgccgcttcgtcgtcgtcctcttcgggggagacgggcggaggggaggaaagtccgggctccatagggcagggtgccaggtaacgcctgggggggaaacccacgaccagtgcaacagagagcaaaccgccgatggcccgcgcaagcgggatcaggtaagggtgaaagggtgcggtaagagcgcaccgcgcggctggtaacagtccgtggcacggtaaactccacccggagcaaggccaatRNAs 20-46 y 3136 1364 3939 eg. gltT 4165951 4166026 > gtccccttcgtctagaggcccaggacaccgccctttcacggcggtaacaggggttcgaatcccctaggggacgccaCca (no) ? 1236 3056 cca 3199532 3200770 > 3 gtgaagatttatctggtcggtggtgctgttcgggatgcattgttagggctaccggtcaaagacagagattgggtggtggtcggcagtacgccacaggagatgctcgacgcgggctaccagcaggtaggccgcgattttcctgtgtttctgcatccgcaaacgcatgaagagtatgcgctggcacgtaccgaacggaaatccggttccggttacaccggttttacttgctatgccgcaccggatgtcacgctggaaTrmA (22?) ? 
1098 3965 trmA 4159749 4160849 < 3 atgacccccgaacaccttccaacagaacagtatgaagcgcagttagccgaaaaagtggtacgtttgcaaagtatgatggcaccgttttctgacctggttccggaagtgtttcgctcgccggtcagtcattaccggatgcgcgcggagttccgcatctggcacgatggcgatgacctgtatcacatcattttcgatcaacaaaccaaaagccgcatccgcgtggatagcttccccgccgccagtgaacttatcaacBstNBI (no) 1815 AF329098 1 1815 > 0 atggctaaaaaagttaattggtatgtttcttgttcacctagaagtccagaaaaaattcagcctgagttaaaagtactagcaaattttgagggaagttattggaaaggggtaaaagggtataaagcacaagaggcatttgctaaagaacttgctgctttaccacaattcttaggtactacttataaaaaagaagctgcattttctactcgagacagagtggcaccaatgaaaacttatggtttcgtatttgtagatTri1 ? AP001918 traI 92673 97943 > atgatgagtattgcgcaggtcagatcggccggaagtgccgggaactattataccgacaaggataattactatgtgctgggcagcatgggagaacgctgggccggcaggggggctgaacagctggggctgcagggcagtgtcgataaggatgtttttacccgtcttctggagggcaggctgccggacggagcggatctaagccgcatgcaggatggcagtaacaggcatcgtcccggctacgatctgaccttctccFlp no 1272 NC_001398 5573 523 > 0 atgccacaatttggtatattatgtaaaacaccacctaaggtgcttgttcgtcagtttgtggaaaggtttgaaagaccttcaggtgagaaaatagcattatgtgctgctgaactaacctatttatgttggatgattacacataacggaacagcaatcaagagagccacattcatgagctataatactatcataagcaattcgctgagtttcgatattgtcaataaatcactccagtttaaatacaagacgcaaaaaGFP no 717 AF302837 27 743 > 0 atgagtaaaggagaagaacttttcactggagttgtcccaattcttgttgaattagatggcgatgttaatgggcaaaaattctctgtcagtggagagggtgaaggtgatgcaacatacggaaaacttacccttaaatttatttgcactactgggaagctacctgttccatggccaacacttgtcactactttcgcgtatggtcttcaatgctttgcgagatacccagatcatatgaaacagcatgactttttcaagRnpa (36%) 357 357 3704 rnpA 3882122 3882481 > 3 gtggttaagctcgcatttcccagggagttacgcttgttaactcccagtcaattcacattcgtcttccagcagccacaacgggctggcacgccgcaaattaccattctcggccgcctgaattcgctggggcatccccgtatcggtcttacagtcgccaagaaaaacgttcgacgcgcccatgaacgcaatcggattaaacgtctgacgcgtgaaagcttccgtctgcgccaacatgaactcccggctatggatttcBstPol multiprot 2631 2631 U93028 95 2728 > 3 
atgagattgaagaaaaaactcgtcttaattgatggcaacagtgtggcataccgcgccttttttgccttgccacttttgcataacgacaaaggcattcatacgaatgcggtttacgggtttacgatgatgttgaacaaaattttggcggaagaacaaccgacccatttacttgtagcgtttgacgccggaaaaacgacgttccggcatgaaacgtttcaagagtataaaggcggacggcaacaaacgcccccggaaRpol_Bpt7 multiprot 2649 2649 NC_001604 3171 5822 > 2 atgaacacgattaacatcgctaagaacgacttctctgacatcgaactggctgctatcccgttcaacactctggctgaccattacggtgagcgtttagctcgcgaacagttggcccttgagcatgagtcttacgagatgggtgaagcacgcttccgcaagatgtttgagcgtcaacttaaagctggtgaggttgcggataacgctgccgccaagcctctcatcactaccctactccctaagatgattgcacgcatcEFTu 451 1179 1179 3339 tufA 3467782 3468966 < 6 gtgtctaaagaaaaatttgaacgtacaaaaccgcacgttaacgttggtactatcggccacgttgaccacggtaaaactactctgaccgctgcaatcaccaccgtactggctaaaacctacggcggtgctgctcgtgcattcgaccagatcgataacgcgccggaagaaaaagctcgtggtatcaccatcaacacttctcacgttgaatacgacaccccgacccgtcactacgcacacgtagactgcccggggcacEFG (59%) 89 2109 2109 3340 fusA 3469037 3471151 < 6 atggctcgtacaacacccatcgcacgctaccgtaacatcggtatcagtgcgcacatcgacgccggtaaaaccactactaccgaacgtattctgttctacaccggtgtaaaccataaaatcggtgaagttcatgacggcgctgcaaccatggactggatggagcaggagcaggaacgtggtattaccatcacttccgctgcgactactgcattctggtctggtatggctaagcagtatgagccgcatcgcatcaacEFTs 433 846 846 170 tsf 190857 191708 > 6 atggctgaaattaccgcatccctggtaaaagagctgcgtgagcgtactggcgcaggcatgatggattgcaaaaaagcactgactgaagctaacggcgacatcgagctggcaatcgaaaacatgcgtaagtccggtgctattaaagcagcgaaaaaagcaggcaacgttgctgctgacggcgtgatcaaaaccaaaatcgacggcaactacggcatcattctggaagttaactgccagactgacttcgttgcaaaaEFP (no) 26 561 561 4147 efp 4373277 4373843 > 6 atggcaacgtactatagcaacgattttcgtgctggtcttaaaatcatgttagacggcgaaccttacgcggttgaagcgagtgaattcgtaaaaccgggtaaaggccaggcatttgctcgcgttaaactgcgtcgtctgctgaccggtactcgcgtagaaaaaaccttcaaatctactgattccgctgaaggcgctgatgttgtcgatatgaacctgacttacctgtacaacgacggtgagttctggcacttcatgIF1 173 213 213 884 infA 925448 925666 < 6 
atggccaaagaagacaatattgaaatgcaaggtaccgttcttgaaacgttgcctaataccatgttccgcgtagagttagaaaacggtcacgtggttactgcacacatctccggtaaaatgcgcaaaaactacatccgcatcctgacgggcgacaaagtgactgttgaactgaccccgtacgacctgagcaaaggccgcattgtcttccgtagtcgctgaIF2 (25%) 142 2682 2682 3168 infB 3310983 3313655 < -9 atgacagatgtaacgattaaaacgctggccgcagagcgacagacctccgtggaacgcctggtacagcaatttgctgatgcaggtatccggaagtctgctgacgactctgtgtctgcacaagagaaacagactttgattgaccacctgaatcagaaaaattcaggcccggacaaattgacgctgcaacgtaaaacacgcagcacccttaacattcctggtaccggtggaaaaagcaaatcggtacaaatcgaagtcIF3 (~50%) 196 540 540 1718 infC 1798120 1798662 < 3 attaaaggcggaaaacgagttcaaacggcgcgccctaaccgtatcaatggcgaaattcgcgcccaggaagttcgcttaacaggtctggaaggcgagcagcttggtattgtgagtctgagagaagctctggagaaagcagaagaagccggagtagacttagtcgagatcagccctaacgccgagccgccggtttgtcgtataatggattacggcaaattcctctatgaaaagagcaagtcttctaaggaacagaagRF1 (no) 258 1080 1211 prfA 1264235 1265317 > 3 atgaagccttctatcgttgccaaactggaagccctgcatgaacgccatgaagaagttcaggcgttgctgggtgacgcgcaaactatcgccgaccaggaacgttttcgcgcattatcacgcgaatatgcgcagttaagtgatgtttcgcgctgttttaccgactggcaacaggttcaggaagatatcgaaaccgcacagatgatgctcgatgatcctgaaatgcgtgagatggcgcaggatgaactgcgcgaagctRRF 435 555 555 172 frr 192872 193429 > 3 gtgattagcgatatcagaaaagatgctgaagtacgcatggacaaatgcgtagaagcgttcaaaacccaaatcagcaaaatacgcacgggtcgtgcttctcccagcctgctggatggcattgtcgtggaatattacggcacgccgacgccgctgcgtcagctggcaagcgtaacggtagaagattcccgtacactgaaaatcaacgtgtttgatcgttcaatgtctccggccgttgaaaaagcgattatggcgtccRL1 (~50%) 1 82 699 699 3984 rplA 4176457 4177161 > 6 atggctaaactgaccaagcgcatgcgtgttatccgcgagaaagttgatgcaaccaaacagtacgacatcaacgaagctatcgcactgctgaaagagctggcgactgctaaattcgtagaaagcgtggacgtagctgttaacctcggcatcgacgctcgtaaatctgaccagaacgtacgtggtgcaactgtactgccgcacggtactggccgttccgttcgcgtagccgtatttacccaaggtgcaaacgctgaaRL2 1 154 816 816 3317 rplB 3448180 3449001 < 6 
atggcagttgttaaatgtaaaccgacatctccgggtcgtcgccacgtagttaaagtggttaaccctgagctgcacaagggcaaaccttttgctccgttgctggaaaaaaacagcaaatccggtggtcgtaacaacaatggccgtatcaccactcgtcatatcggtggtggccacaagcaggcttaccgtattgttgacttcaaacgcaacaaagacggtatcccggcagttgttgaacgtcttgagtacgatccg

Page 54:

The in vitro assembly (& 3D structure) of prokaryotic ribosomes is known. (e.g. Nomura et al.; Noller et al.)

Page 55:

All 30S-Ribosomal-protein DNAs & mRNAs synthesized in vitro

[Figure: gels of DNA Template and RNA Transcript; lanes M, 1-21]

Tian & Church

Page 56:

His-tagged ribosomal proteins synthesized in vitro

RS-2, 4, 5, 6, 9, 10, 12, 13, 15, 16, 17, and 21 as original constructs.

RS1 required deletion of a feedback motif in the mRNA.
RS-3, 7, 8, 11, 14, 18, 19, 20 are still weakly expressed.

Note that S1, S4, S7, S8, S20, L1, L4, L10 are known to repress their own translation (and are likely titrated by rRNA).

Tian & Church

Page 57:
Page 58:

Representations of the Chromosome

Set of N coordinates (x, y, z)

Matrix of distances

SVD (singular value decomposition)

Euclidean Metric

pdb file (viewed with RasMol)

Matlab visualization
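The first conversion listed on this slide, from a set of N coordinates to a matrix of distances, can be sketched as follows. The function name and the sample points are illustrative; the SVD step and the pdb export are not shown.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances between (x, y, z) points, as a list of rows."""
    return [[math.dist(p, q) for q in coords] for p in coords]

if __name__ == "__main__":
    # Three illustrative points; real input would be positions along the chromosome.
    points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
    for row in distance_matrix(points):
        print(["%.1f" % d for d in row])
```

The resulting matrix is symmetric with a zero diagonal, which is the form a decomposition step would take as input.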

Page 59:

Bidirectional replication, Paired fork

Page 60:

Origin

Blue: Left replicated segment (yelgr=high gene#)
Red: Right (i.e. middle) segment
Aqua: unduplicated segment of the circular genome

Avoidance of entanglement throughout cell cycle