Gene Expression as a Stochastic Process: From Gene Number Distributions to Protein Statistics and Back

Jan-Timm Kuhr

June 19, 2007

Outline

Motivation & Basics

A Stochastic Approach to Gene Expression

Application to Experimental Data

Summary & Outlook

Gene Copy Number and Transfection

A big hope of gene therapy is to treat diseases by use of artificial viruses that bring genes (coding for beneficial proteins) into the cell.

Bad Treatment: Heterogeneous distribution of plasmids: Many cells get no plasmids, a few cells get many plasmids.

Good Treatment: Homogeneous distribution of plasmids: Most cells get a small number of plasmids.

The Central Dogma of Biology

After import of genetic material, genes are expressed by the cellular machinery via transcription and translation. Each reaction is an inherently stochastic process, and thus a spread in protein numbers is found after gene expression.

Intrinsic and Extrinsic Noise

In biological systems noise arises from two sources:

1. Intrinsic Noise: Due to the probabilistic nature of chemical reactions. Can be treated by means of probability calculus: Master equation, Fokker-Planck equation, simulations.

2. Extrinsic Noise: Due to variations in rate constants (different cell volume, temperature, cell cycle state, number of enzymes, etc.). Usually of unknown nature and strength.

Deterministic Approach

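Assume the gene number D is fixed.

$$\frac{\partial R(t)}{\partial t} = \lambda_1 D - \delta_1 R(t)$$

$$\frac{\partial P(t)}{\partial t} = \lambda_2 R(t) - \delta_2 P(t)$$

These equations can be solved successively:

$$R(t) = \frac{D\lambda_1}{\delta_1}\left(1 - e^{-\delta_1 t}\right)$$

The expression for P(t) is more complicated, but one finds $P(t \to \infty) = D \cdot C$ with the expression factor $C := \frac{\lambda_1\lambda_2}{\delta_1\delta_2}$, which gives the number of proteins per gene.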
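As a quick numerical check of the rate equations above, the following minimal sketch integrates them with a simple explicit Euler scheme, starting from R(0) = P(0) = 0. The parameter values, the step size, the initial condition, and the function name `integrate_rate_equations` are illustrative assumptions, not taken from the talk.

```python
import math

def integrate_rate_equations(D, lam1, del1, lam2, del2, t_end, dt=1e-3):
    """Explicit Euler integration of dR/dt = lam1*D - del1*R and dP/dt = lam2*R - del2*P."""
    R, P = 0.0, 0.0  # assumed initial condition: no mRNA, no protein
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dR = lam1 * D - del1 * R
        dP = lam2 * R - del2 * P
        R += dt * dR
        P += dt * dP
    return R, P

# Hypothetical parameter values, chosen only for illustration.
D, lam1, del1, lam2, del2 = 3, 2.0, 1.0, 5.0, 0.5
C = lam1 * lam2 / (del1 * del2)  # expression factor

R_short, _ = integrate_rate_equations(D, lam1, del1, lam2, del2, t_end=1.0)
R_exact = D * lam1 / del1 * (1.0 - math.exp(-del1 * 1.0))
_, P_long = integrate_rate_equations(D, lam1, del1, lam2, del2, t_end=60.0)

print(R_short, R_exact)  # Euler value vs. closed-form R(t) at t = 1
print(P_long, D * C)     # long-time protein number vs. D * C
```

For long times the integrated protein number should settle at D · C, independent of the step size, because the Euler scheme shares the fixed point of the rate equations.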

Stochasticity - The Master Equation

However, transcription, translation and degradation are stochastic processes. Probabilistic approach: Master equation. We have a 2D state space; each state is characterized by R and P. Usually we would need to deal with the joint distribution $p_{R,P}$. Instead we split up the problem into two Master equations:

$$\frac{\partial p_R}{\partial t} = \lambda_1 D\, p_{R-1} + \delta_1 (R+1)\, p_{R+1} - (\lambda_1 D + \delta_1 R)\, p_R$$

$$\frac{\partial p_P}{\partial t} = \lambda_2 R(t)\, p_{P-1} + \delta_2 (P+1)\, p_{P+1} - (\lambda_2 R(t) + \delta_2 P)\, p_P$$

The first equation is decoupled from the second and can be solved exactly, while the second one is more tricky...

mRNA Distribution

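The solution to

$$\frac{\partial p_R}{\partial t} = \lambda_1 D\, p_{R-1} + \delta_1 (R+1)\, p_{R+1} - (\lambda_1 D + \delta_1 R)\, p_R$$

is given by a Poisson distribution

$$p_R(t) = \frac{\mu_1(t)^R}{R!}\, e^{-\mu_1(t)}$$

where

$$\mu_1(t) = \frac{D\lambda_1}{\delta_1}\left(1 - e^{-\delta_1 t}\right)$$

is the mean mRNA number, as also given by the deterministic rate equations.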
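That the Poisson form solves the mRNA Master equation can be checked numerically. The sketch below compares a finite-difference time derivative of p_R(t) with the right-hand side of the Master equation evaluated on the Poisson ansatz. The parameter values, the evaluation time, and all function names are illustrative assumptions.

```python
import math

def mu1(t, D, lam1, del1):
    """Mean mRNA number mu_1(t) = D*lam1/del1 * (1 - exp(-del1*t))."""
    return D * lam1 / del1 * (1.0 - math.exp(-del1 * t))

def poisson_pmf(k, mu):
    """Poisson probability of k for mean mu > 0 (zero for k < 0)."""
    if k < 0:
        return 0.0
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def master_rhs(R, t, D, lam1, del1):
    """Right-hand side of the mRNA Master equation on the Poisson ansatz."""
    mu = mu1(t, D, lam1, del1)
    return (lam1 * D * poisson_pmf(R - 1, mu)
            + del1 * (R + 1) * poisson_pmf(R + 1, mu)
            - (lam1 * D + del1 * R) * poisson_pmf(R, mu))

def time_derivative(R, t, D, lam1, del1, h=1e-5):
    """Central finite difference of p_R(t) with respect to t."""
    p_plus = poisson_pmf(R, mu1(t + h, D, lam1, del1))
    p_minus = poisson_pmf(R, mu1(t - h, D, lam1, del1))
    return (p_plus - p_minus) / (2.0 * h)

# Hypothetical parameter values and evaluation time.
D, lam1, del1, t = 3, 2.0, 1.0, 0.7
max_residual = max(
    abs(time_derivative(R, t, D, lam1, del1) - master_rhs(R, t, D, lam1, del1))
    for R in range(40)
)
print(max_residual)  # should be tiny, limited only by the finite-difference error
```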

Interlude: The Poisson Distribution

Some properties:

- One-parameter distribution, i.e. the mean ⟨X⟩ fully determines the distribution.

- The mean is equal to the variance: ⟨X⟩ = var(X)

- For large mean, by the central limit theorem, a Poissonian is well approximated by a Gaussian with the same mean and variance.
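The Gaussian limit can be illustrated with a few lines of code. The sketch below measures the largest deviation between a Poissonian and a Gaussian of equal mean and variance, relative to the peak height, for a small and a large mean. The two mean values and the function names are arbitrary illustrative choices.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of k for mean mu > 0."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def gaussian_pdf(x, mean, var):
    """Gaussian density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def max_relative_deviation(mu):
    """Largest |Poisson - Gaussian| over k, relative to the Gaussian peak height."""
    peak = gaussian_pdf(mu, mu, mu)
    k_max = int(mu + 10.0 * math.sqrt(mu)) + 1
    return max(
        abs(poisson_pmf(k, mu) - gaussian_pdf(k, mu, mu)) for k in range(k_max + 1)
    ) / peak

dev_small = max_relative_deviation(5.0)    # small mean: visibly skewed Poissonian
dev_large = max_relative_deviation(500.0)  # large mean: nearly Gaussian
print(dev_small, dev_large)
```

The deviation shrinks as the mean grows, which is what justifies replacing Poissonians by Gaussians when the mean is large.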

Protein Distribution

$$\frac{\partial p_P}{\partial t} = \lambda_2 R(t)\, p_{P-1} + \delta_2 (P+1)\, p_{P+1} - (\lambda_2 R(t) + \delta_2 P)\, p_P$$

is analogous to the Master equation for $p_R$, apart from the random variable R(t) taking the place of D. The solution is yet again a Poisson distribution:

$$p_P(t) = \frac{\mu_2(t)^P}{P!}\, e^{-\mu_2(t)}$$

Now the mean is a functional of R(t):

$$\mu_2[R(t)] = \left(\lambda_2 \int_0^t R(t')\, e^{\delta_2 t'}\, dt'\right) e^{-\delta_2 t} \;\overset{t \gg 1/\delta_2}{=}\; \frac{\lambda_2}{\delta_2}\, \frac{\int_0^t R(t')\, e^{\delta_2 t'}\, dt'}{\int_0^t e^{\delta_2 t'}\, dt'}$$

This is a weighted temporal average of R(t), where the weighting function is $\exp(\delta_2 t')$. The recent past has the most weight!

Problem: Every cell has a different realization of R(t) ⇒ for every cell $\mu_2$ is different!
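To make the "recent past has the most weight" statement concrete, the following sketch evaluates the functional for two hypothetical mRNA histories using the trapezoidal rule. It uses the first form of the mean, written as λ2 ∫ R(t') exp(−δ2 (t − t')) dt' to avoid large exponentials. The histories, parameter values, and the function name `mu2` are assumptions made for illustration.

```python
import math

def mu2(R_of_t, lam2, del2, t, n_steps=20000):
    """Trapezoidal estimate of lam2 * integral_0^t R(t') * exp(-del2 * (t - t')) dt'."""
    h = t / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        tp = i * h
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoidal end-point weights
        total += weight * R_of_t(tp) * math.exp(-del2 * (t - tp))
    return lam2 * h * total

# Hypothetical rates and observation time (t is much larger than 1/del2).
lam2, del2, t = 5.0, 0.5, 30.0

# History 1: constant mRNA number R = 4.
mu2_constant = mu2(lambda tp: 4.0, lam2, del2, t)

# History 2: R = 10 early on, then R = 2 from t' = 10 onwards.
mu2_switched = mu2(lambda tp: 10.0 if tp < 10.0 else 2.0, lam2, del2, t)

print(mu2_constant)  # close to lam2 / del2 * 4
print(mu2_switched)  # close to lam2 / del2 * 2: the early level is largely forgotten
```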

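Separation of Time Scales: 1) mRNA kinetics ≪ 1/δ2

R(t) changes rapidly compared to the lifetime of proteins, $1/\delta_2$, i.e. R(t) totally "explores" its distribution while the proteins in each cell only "see" the average $\langle R(t) \rangle = \mu_1$:

$$\mu_2(t) = \frac{\lambda_2}{\delta_2}\, \frac{\int_0^t R(t')\, e^{\delta_2 t'}\, dt'}{\int_0^t e^{\delta_2 t'}\, dt'} = \frac{\lambda_2}{\delta_2}\, \frac{\int_0^t \mu_1(t')\, e^{\delta_2 t'}\, dt'}{\int_0^t e^{\delta_2 t'}\, dt'} \;\overset{t \to \infty}{=}\; \underbrace{\frac{\lambda_1 \lambda_2}{\delta_1 \delta_2}}_{:=C}\, D$$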
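The long-time limit can be made explicit by inserting the mean mRNA number μ1(t') = (Dλ1/δ1)(1 − exp(−δ1 t')) into the weighted average. The worked step below is a sketch under the assumptions δ1 ≠ δ2 and δ1, δ2 > 0:

```latex
\begin{aligned}
\int_0^t \mu_1(t')\,e^{\delta_2 t'}\,dt'
  &= \frac{D\lambda_1}{\delta_1}\left[\frac{e^{\delta_2 t}-1}{\delta_2}
     - \frac{e^{(\delta_2-\delta_1)t}-1}{\delta_2-\delta_1}\right],
\qquad
\int_0^t e^{\delta_2 t'}\,dt' = \frac{e^{\delta_2 t}-1}{\delta_2},\\[4pt]
\mu_2(t)
  &= \frac{\lambda_2}{\delta_2}\,\frac{D\lambda_1}{\delta_1}
     \left[1 - \frac{\delta_2}{\delta_2-\delta_1}\,
     \frac{e^{(\delta_2-\delta_1)t}-1}{e^{\delta_2 t}-1}\right]
  \;\xrightarrow{\;t\to\infty\;}\;
  \frac{\lambda_1\lambda_2}{\delta_1\delta_2}\,D = C\,D .
\end{aligned}
```

The ratio of exponentials in the bracket decays like exp(−δ1 t), so the correction vanishes once t is large compared to the mRNA lifetime 1/δ1.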

Separation of Time Scales: 2) mRNA kinetics ≫ 1/δ2

R(t) changes sluggishly, while proteins follow that signal and equilibrate to a new steady state, forgetting the past very fast. The mean of P is determined only by the recent past of R(t), which can be assumed to be constant in that period. For cells which presently have R mRNAs, the proteins have a Poisson distribution with mean

$$\mu_2(t) = \frac{\lambda_2}{\delta_2}\, \frac{\int_0^t R\, e^{\delta_2 t'}\, dt'}{\int_0^t e^{\delta_2 t'}\, dt'} = \frac{\lambda_2}{\delta_2}\, R .$$

For the whole population we have to sum over all possible states of R, each weighted by its probability:

$$p_P = \sum_{R=0}^{\infty} p_R\, \frac{\left(\frac{\lambda_2}{\delta_2} R\right)^P}{P!}\, e^{-\frac{\lambda_2}{\delta_2} R}$$

A superposition of Poissonians!
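The superposition can be evaluated directly. The sketch below takes p_R to be a Poisson distribution (as found for the mRNA number) and sums the conditional protein Poissonians with means (λ2/δ2)·R. The mRNA mean `mu1`, the ratio `b` = λ2/δ2, and the truncation limits are hypothetical values chosen for illustration.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability; for mu == 0 the distribution collapses to a peak at k == 0."""
    if mu == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def protein_distribution(mu1, b, r_max, p_max):
    """p_P = sum_R p_R * Poisson(P; b*R), with p_R Poisson(mu1) and b = lam2/del2."""
    p_R = [poisson_pmf(R, mu1) for R in range(r_max + 1)]
    return [
        sum(p_R[R] * poisson_pmf(P, b * R) for R in range(r_max + 1))
        for P in range(p_max + 1)
    ]

# Hypothetical values: mean mRNA number 2, and 20 proteins per mRNA on average.
mu1, b = 2.0, 20.0
p_P = protein_distribution(mu1, b, r_max=30, p_max=800)

total = sum(p_P)
mean = sum(P * p for P, p in enumerate(p_P))
variance = sum(P * P * p for P, p in enumerate(p_P)) - mean ** 2
print(total, mean, variance, p_P[0])
```

In this sketch the mean comes out as b·mu1 and the variance as b·mu1 + b²·mu1, so the mixture is much broader than a single Poissonian, and p_P at P = 0 is dominated by the cells with R = 0.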

Separation of Time Scales: 2) mRNA kinetics ≫ 1/δ2

Examples

The distribution of mRNA is still visible in the distribution of proteins. Note: If R = 0 then the Poissonian for P collapses to a peak at P = 0 with height $p_{R=0}$.

Random Number of Genes

Upon viral infection, transfection, or generally in bacteria carrying plasmids or minichromosomes, the number of genes varies from individual to individual. Thus D is no longer constant, but itself a random variable, subject to a distribution $p_D$. In general, to find the protein distribution $p_P^{\mathrm{tot}}$ for the whole population we have to sum over the protein distributions $p_P(D)$ of "subpopulations" with gene copy numbers D according to their respective probabilities:

$$p_P^{\mathrm{tot}} = \sum_{D=0}^{\infty} p_D\, p_P(D)$$

Since this expression can't, in general, be determined explicitly, we stick to the biologically relevant case mRNA kinetics ≪ 1/δ2, as discussed above. Again we find a sum of Poissonians:

$$p_P = \sum_{D=0}^{\infty} p_D\, \frac{\mu_2^P}{P!}\, e^{-\mu_2} = \sum_{D=0}^{\infty} p_D\, \frac{(DC)^P}{P!}\, e^{-DC}$$

Note: In the opposite case (mRNA kinetics ≫ 1/δ2) we would have a superposition of superpositions of Poissonians...

Random Number of Genes

$$p_P = \sum_{D=0}^{\infty} p_D\, \frac{(DC)^P}{P!}\, e^{-DC}$$

Why is this interesting? Because of the properties of the Poisson distribution, and because C is often ≫ 1!

1. For C ≫ 1 the Poissonians have large mean ⇒ can be approximated by Gaussians!

2. Distance between means of two adjacent Poissonians is C while their respective widths go like $\sigma = \sqrt{DC}$.

⇒ significant overlap only for $D > \frac{(C-1)^2}{4C}$
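One way to arrive at the overlap condition (an assumption about the derivation; the slide states only the result) is to require that the widths of two adjacent peaks add up to more than their separation, √(DC) + √((D+1)C) > C, which rearranges to D > (C − 1)²/(4C). The sketch below checks this for a hypothetical expression factor; the function names are illustrative.

```python
import math

def overlap_threshold(C):
    """Copy number above which adjacent peaks overlap: D > (C - 1)**2 / (4*C)."""
    return (C - 1.0) ** 2 / (4.0 * C)

def peaks_overlap(D, C):
    """True if the widths of the peaks at D*C and (D+1)*C sum to more than their distance C."""
    return math.sqrt(D * C) + math.sqrt((D + 1) * C) > C

C = 100.0  # hypothetical expression factor
D_star = overlap_threshold(C)
print(D_star, peaks_overlap(24, C), peaks_overlap(25, C))
```

For large C the threshold is roughly C/4, so peaks belonging to small copy numbers stay well separated.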

From the Protein Distribution to Copy Number Statistics

Examples

While the separation of the Gauss peaks is still much greater than their widths, one can even approximate them by a sum of delta peaks:

$p_P = p_D\, \delta_{P,\, D \cdot C}$; $D \in \mathbb{N}_0$ ("Discretized approximation")

| Approximation | Mean ⟨P⟩ | Variance σ²(P) |
|---|---|---|
| Sum of Poissonians | 500 | 5.05 · 10^4 |
| Sum of Gaussians | 500 | 5.05 · 10^4 |
| Sum of Gaussians with η_ext = 0.1 | 500 | 5.35 · 10^4 |
| Sum of δ-peaks | 500 | 5.00 · 10^4 |
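The moments of the two simplest approximations are easy to compute. The sketch below assumes a Poisson-distributed copy number with mean ⟨D⟩ = 5 and an expression factor C = 100; these values are not stated on the slide, they are a guess that happens to reproduce the mean and the variances quoted for the sum of Poissonians and the sum of δ-peaks.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of k for mean mu > 0."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def moments(C, mean_D, d_max=80):
    """Mean of P and its variance under the Poissonian and delta-peak approximations."""
    p_D = [poisson_pmf(D, mean_D) for D in range(d_max + 1)]
    mean_P = sum(p * D * C for D, p in enumerate(p_D))
    # Sum of Poissonians: each component has mean D*C and variance D*C.
    second_poisson = sum(p * (D * C + (D * C) ** 2) for D, p in enumerate(p_D))
    # Sum of delta peaks: each component sits exactly at P = D*C.
    second_delta = sum(p * (D * C) ** 2 for D, p in enumerate(p_D))
    return mean_P, second_poisson - mean_P ** 2, second_delta - mean_P ** 2

# Assumed values (see lead-in): C = 100 and <D> = 5.
mean_P, var_poisson, var_delta = moments(C=100.0, mean_D=5.0)
print(mean_P, var_poisson, var_delta)  # approximately 500, 5.05e4, 5.00e4
```

The two variances differ only by the mean ⟨P⟩ itself, which is why the discretized approximation works well when C is large.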

Single Cell Protein Measurements

By single-cell studies it is possible to obtain protein numbers of single cells (e.g. by use of GFP and derivatives), but the gene number distribution cannot be measured directly, and sometimes the rate constants and the expression factor are unknown. In these cases the above theory can be applied, if C ≫ 1 and mRNA kinetics ≪ 1/δ2:

1. Compute mean ⟨P⟩ and variance var(P) of measured protein numbers.

2. Use the discretized approximation: Mean and variance are homogeneous functions of degree 1 and 2, respectively. ⇒ C = var(P)/⟨P⟩

3. Compute the mean gene copy number ⟨D⟩ = ⟨P⟩/C.

4. If the gene copy number distribution is Poisson (meaningful for transfection), then we know everything about it!

5. From the found $p_D$ we can compute the theoretical $p_P$ and compare to the measured protein distribution as a check for consistency.
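The first four steps can be sketched in a few lines. Note that C = var(P)/⟨P⟩ follows from ⟨P⟩ = C⟨D⟩ and var(P) = C² var(D) only if var(D) = ⟨D⟩, as is the case for a Poisson-distributed copy number. The function names and the toy data below are hypothetical; the toy data are constructed so that mean and variance of D coincide.

```python
import math
import statistics

def estimate_copy_number_stats(protein_counts):
    """Steps 1-3: mean and variance of P, then C = var(P)/<P> and <D> = <P>/C."""
    mean_P = statistics.fmean(protein_counts)
    var_P = statistics.pvariance(protein_counts)
    C = var_P / mean_P
    return C, mean_P / C

def poisson_copy_number_distribution(mean_D, d_max):
    """Step 4: Poisson p_D for D = 0..d_max, fully fixed by its mean."""
    return [
        math.exp(D * math.log(mean_D) - mean_D - math.lgamma(D + 1))
        for D in range(d_max + 1)
    ]

# Toy data (hypothetical): half of the cells carry no gene, half carry two, with C = 100.
protein_counts = [0.0, 200.0] * 50

C_est, mean_D_est = estimate_copy_number_stats(protein_counts)
p_D = poisson_copy_number_distribution(mean_D_est, d_max=10)
print(C_est, mean_D_est, p_D[0])
```

With real measurements, step 5 would then compare the protein distribution predicted from this p_D with the measured histogram.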

Results

Non-fluorescent cells allow for independent measurement. Strong noise and bias to the left call for improved experiments and data analysis.

| | p_{D=0} | ⟨P⟩′ | σ²(P)′ | ⟨D⟩′ | C from p_{D=0} and ⟨P⟩′ | C from p_{D=0} and σ²(P)′ | C from ⟨P⟩′ and σ²(P)′ |
|---|---|---|---|---|---|---|---|
| PEI synch. | 0.4 | 4.46 · 10^6 | 9.44 · 10^12 | 1.38 | 3.49 · 10^6 | 2.46 · 10^6 | 3.24 · 10^6 |
| PEI asynch. | 0.23 | 2.56 · 10^6 | 5.84 · 10^12 | 1.29 | 2.25 · 10^6 | 1.26 · 10^6 | 1.99 · 10^6 |
| Lipo synch. | 0.3 | 5.91 · 10^6 | 1.65 · 10^13 | 1.38 | 4.97 · 10^6 | 2.54 · 10^6 | 4.29 · 10^6 |
| Lipo asynch. | 0.3 | 3.75 · 10^6 | 1.20 · 10^13 | 1.29 | 3.15 · 10^6 | 2.16 · 10^6 | 2.90 · 10^6 |

Summary:

- Distributions give us information about the underlying processes.

- The expression factor $C := \frac{\lambda_1 \lambda_2}{\delta_1 \delta_2}$ can be obtained from the protein distribution, yielding a functional relationship between the rates.

- The mean number of genes ⟨D⟩ and even the distribution of genes can be computed.

- The transfection process can be tested for quality.

Outlook:

- Incorporate promoter activity, poly-A-mRNA degradation, etc. into the analysis.

- Check derived results by tuning rates: modification of the promoter sequence, destabilizing proteins, mutations in the gene's open reading frames...

- Improve experimental setup, better data analysis, reduce extrinsic noise.

Thanks for your attention!
