science.sciencemag.org/content/367/6482/1151/suppl/DC1

Supplementary Materials for

Sequencing metabolically labeled transcripts in single cells reveals mRNA turnover strategies

Nico Battich*, Joep Beumer, Buys de Barbanson, Lenno Krenning, Chloé S. Baron, Marvin E. Tanenbaum, Hans Clevers, Alexander van Oudenaarden*

*Corresponding author. E-mail: [email protected] (N.B.); [email protected] (A.v.O.)

Published 6 March 2020, Science 367, 1151 (2020)

DOI: 10.1126/science.aax3072

This PDF file includes:

Materials and Methods

Supplementary Text

Figs. S1 to S15

Captions for Additional Data Tables S1 to S4

References

Other Supplementary Material for this manuscript includes the following:

(available at science.sciencemag.org/content/367/6482/1151/suppl/DC1)

Additional Data Tables S1 to S4 (Excel files)

Materials and Methods

Tissue culture

RPE1-FUCCI cells were cultured in Dulbecco's modified Eagle's medium

(DMEM)/F12, supplemented with GlutaMAX (Gibco), FBS (Gibco), and

penicillin/streptomycin (Gibco), following standard procedures. Similarly, K562 cells

were cultured in RPMI 1640 medium (Gibco), supplemented with FBS and

penicillin/streptomycin. EU culture and dissociation are described below.

scEU-seq

The scEU-seq protocol was based on CEL-Seq2 (19, 26). Primers for cDNA synthesis

were used at a working concentration of 7.5 ng/µl. 50 nl of the primer working solution was

dispensed into wells of a 384-well plate (Greiner) containing 5 µl mineral oil (Sigma) using

a mosquito (TTP Labtech). Plates were then stored at -80ºC until use.

For the click reaction we used the following reagents: CuSO4 stock solution (200 mM

in water), Ligand stock (THPTA, Lumiprobe, 400 mM in water), 5-ethynyl uridine (EU)

stock solution (0.5 M in DMSO), Azide-PEG3-biotin conjugate (Sigma) stock solution (1

M in DMSO), 1% IGEPAL solution in 50 mM TRIS, and ascorbate (Sigma) solution (31.6 mg

in 1 ml of H2O). We first created a master mix (MM) per click reaction by mixing

2.4 µl Ligand stock solution, 2 µl of 30×PBS, and 1.2 µl CuSO4 stock solution. Then we

created Click Solution A with 10.3 µl of Azide Biotin 10 mM (1:100 dilution of 1M stock),

6 µl 1% Triton X100, 1 µl of 0.1 µg/µl DAPI, 30.5 µl H2O, and 5.6 µl MM.

Cells were incubated with the EU and/or DMSO for the desired time (e.g. 60 min).

The final concentration of EU was 200 µM for RPE1-FUCCI chase experiments, 400 µM

for RPE1-FUCCI or K562 pulse experiments, 166 µM for chase experiments in organoids

and 1.6 mM for pulse experiments in organoids. The uridine (U) chase phase of the experiment was done with

400 µM U. Cells were dissociated using TrypLE enzyme mix (Gibco) and mechanical

shearing if required. Cells were resuspended in 200 µl PBS and fixed by adding 200 µl of

8% PFA (final fixing medium is 4% PFA in PBS) for 5 min at room temperature (RT). Then

200 µl of 1% Triton-X100 was added and cells were further incubated for 5 min at RT. Cells were then

centrifuged at 300g for 5 min and resuspended in 500 µl of 1 M Tris-HCl pH 7-8 to quench

the fixation reaction. To perform the click reaction, cells were resuspended in 53.4 µl of

Click Solution A after the Tris-HCl was removed, and 6.6 µl of ascorbate solution was added and

cells were incubated for 30 min at RT. After the click reaction was completed, 540 µl of PBS was added

to the cells and FACS of single cells into wells of 384-well plates containing primers was

performed immediately. The last column of the plate was generally left empty, to serve as

the empty well control. Sorted plates were stored at -80ºC until further processing.
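As a quick consistency check on the volumes above, the components of Click Solution A add up to the 53.4 µl used per reaction, and the 6.6 µl of ascorbate brings the reaction to 60 µl. The sketch below redoes that arithmetic and derives approximate final concentrations. The variable names are illustrative, and the concentrations assume that volumes are simply additive.

```python
# Volumes in µl, copied from the protocol text.
master_mix = {"ligand_400mM": 2.4, "pbs_30x": 2.0, "cuso4_200mM": 1.2}
mm_volume = sum(master_mix.values())  # about 5.6 µl, all of it goes into Click Solution A

click_solution_a = {
    "azide_biotin_10mM": 10.3,
    "triton_x100_1pct": 6.0,
    "dapi": 1.0,
    "water": 30.5,
    "master_mix": mm_volume,
}
ascorbate_volume = 6.6

solution_a_volume = sum(click_solution_a.values())      # about 53.4 µl
reaction_volume = solution_a_volume + ascorbate_volume  # about 60 µl


def final_concentration(stock, volume_ul):
    """Dilute a stock (any concentration unit) into the full reaction volume."""
    return stock * volume_ul / reaction_volume


print(f"Click Solution A: {solution_a_volume:.1f} µl, reaction: {reaction_volume:.1f} µl")
print(f"CuSO4: {final_concentration(200, 1.2):.1f} mM")
print(f"THPTA ligand: {final_concentration(400, 2.4):.1f} mM")
print(f"Azide-PEG3-biotin: {final_concentration(10, 10.3):.2f} mM")
```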

Prior to first strand cDNA synthesis, we reversed fixation of the cells by adding 50 nl

per well of 10 nl (1:50,000 ERCC RNA spike-ins, Thermo), 10 mM dNTPs (Promega), 10

nl Proteinase K (Ambion), and 10 nl of 1% IGEPAL (Sigma) solution, and incubating cells

for 30 min at 55 ºC, 80ºC for 10 min, 65ºC for 10 min and then cooled to 4ºC. First strand

synthesis was performed by adding 175 nl of, 35 nl of 5×RT buffer, 17.5 nl of 0.1 M DTT,

8.75 nl Superscript II (Thermo), 5 nl H2O and 8.75 nl RNaseOut (Thermo), and incubating

plates at 42ºC for 60 min, then 70ºC for 15 min and then cooled to 4ºC, plates were then

kept on ice until pooling.

At this point plates were pooled in a single 500 µl Eppendorf tube and treated with 1

µl of Exonuclease I (Thermo) for 30 min at 37ºC before proceeding. We then added 67 µl

of activated streptavidin beads in 2 × wash-binding (WB, 10 mM Tris-HCl - pH 7.5, 1 mM

EDTA, 2 M NaCl) buffer (Dynabeads MyOne Streptavidin C1, Thermo, activation done

as specified by manufacturer) and incubated samples at RT for 30 min while shaking. Once

binding the EU labeled mRNA/cDNA hybrid was completed, we separated the beads from

the supernatant. The beads were then washed once with WB buffer at 50ºC for 5 min, three

times with WB buffer at RT and once with 200 µl of low salt buffer (0.1 M NaCl in RNase

free water). Streptavidin beads were then resuspended in 20 µl of nuclease free water. The

supernatant was precipitated using 1 × volume of AMPure XP beads (Beckman Coulter),

a 1:4 bead dilution from stock in bead binding buffer was used and bead cleanup performed

as recommended by the manufacturer. After cleanup, AMPure XP beads were resuspended in

20 µl of nuclease free water.

Second strand synthesis was performed for both the labeled and unlabeled

(supernatant) fractions using the NEBNext Ultra II Non-Directional RNA Second Strand

Module (NEB) as specified by manufacturer, then samples were cleaned once more using

the AMPure XP beads, and beads were resuspended in 4.8 µl of nuclease free water. In

vitro transcription was performed overnight at 37ºC using the MEGAscript T7

Transcription Kit (Ambion) as specified by manufacturer. Sequencing libraries were

prepared with the TruSeq small RNA primers (Illumina) and then sequenced paired-end at

75 bp read length in the Illumina NextSeq.

The pulse and chase experiments for the RPE1-FUCCI were done in separate

experimental weeks using different batches of cells. The time points for the pulse

experiment were 15 min, 30 min, 45 min, 60 min (1 h), 120 min (2 h), 180 min (3 h) and the

DMSO control. The U washout time points for the chase experiment were 0 min, 60 min

(1 h), 120 min (2 h), 240 min (4 h), 360 min (6 h) and the DMSO control. For the intestinal

organoids, all time points were performed on the same experimental day, and were 120 min (2 h)

and DMSO control for the pulse experiment, and 0 min, 45 min, 360 min (6h) and DMSO

control for the chase experiment. Time points were initiated so that all cells were isolated

and fixed at the same time.

CEL-seq2 and bulk EU-seq

CEL-seq2 libraries were prepared as described in ref. (26). A total of 1,536 cells were

sequenced, of which 1,065 passed the quality controls of transcript levels (>10^4 UMIs) and

fluorescence signal. For the bulk EU-seq control experiments we performed the protocol

as described above but scaled the volumes for 500 cells accordingly.

Murine intestinal organoid culture

Primary organoid cultures used in this study were derived from Lgr5DTR-eGFP (22)

and established and grown as described before (20). Briefly, organoids were expanded in

medium termed ENR, consisting of Advanced Dulbecco’s modified Eagle’s medium/F12

(Advanced DMEM) with HEPES (10mM, Sigma), penicillin/streptomycin (1x,

Thermofisher) and Glutamax. Advanced DMEM was supplemented with 1x B27

(Thermofisher), 1 mM N-acetylcysteine (Sigma), 50 ng/ml murine recombinant epidermal

growth factor (PeproTech), R-spondin 1 conditioned medium (5% of final volume) and

Noggin conditioned medium (5% of final volume) to generate ENR medium. To generate

conditioned media, HEK293T cells were stably transfected with Rspo1-Fc (gift from

Calvin Kuo, Stanford University) or transiently transfected with a mouse Noggin-Fc

expression vector and grown for 1 week in Advanced DMEM supplemented with

penicillin/streptomycin, and Glutamax. Organoids were plated in Reduced Growth Factor

Basement Membrane Matrix (BME) Type 2 (Trevigen).

Cell sorting

Cell sorting of fixed human cell lines and mouse intestinal organoid cells was

performed using an INFLUX instrument (BD). Cells were sorted in PBS after the click reaction

was completed as described above. Gating on the forward scatter, side scatter, and DAPI channels

was used to discard doublet cells and debris. For RPE1-FUCCI cells, indexed

measurements of RFP and GFP signals were recorded, but gating on these channels was only

used to discard prominent outliers. The GFP signal of intestinal organoid lines was used to

enrich for cells expressing the Lgr5 stem cell marker (Fig. 3B). For bulk EU-seq

experiments we sorted 500 cells for each chase EU treatment. The G1 gate was set to sort

cells up to ~1/3 of the cell cycle progression, the S gate was set from ~1/2 to ~3/4 of the cell

cycle progression, and the G2 gate was set to the last ~1/6 of the cell cycle-progression.

Treatment of K562 cells for the heat shock experiment

Cells were cultured as described above with the difference that cells were incubated

at 37ºC or 42ºC during the 45 min pulse of EU or DMSO, prior to cell fixation. The total

UMI threshold used was 3000 for the unlabeled fractions of cells incubated at 37ºC in EU

(286 cells), 3000 for the unlabeled fractions of cells incubated at 42ºC in EU (277 cells),

1000 for the labeled fractions of cells incubated at 37ºC in EU (208 cells), 1000 for the

labeled fractions of cells incubated at 42ºC in EU (108 cells), 4000 for the unlabeled

fractions of cells incubated at 37ºC in DMSO (230 cells), 4000 for the unlabeled fractions

of cells incubated at 42ºC in DMSO (190 cells). Prior to DESeq analysis, the number of

UMIs was downsampled to 1000 in all cases.
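The text does not say how the downsampling to 1000 UMIs was carried out. One common choice, shown here purely as an assumption, is to draw a uniform random subset of a cell's UMIs without replacement. The function name `downsample_umis` and the toy counts are hypothetical.

```python
import random


def downsample_umis(counts, target, seed=0):
    """Randomly keep `target` UMIs from a {gene: count} dict, without replacement.

    Assumption: every UMI in the cell is equally likely to be kept.
    """
    total = sum(counts.values())
    if total < target:
        raise ValueError("cell has fewer UMIs than the target depth")
    # Expand to one entry per UMI, then draw a uniform subset.
    pool = [gene for gene, n in counts.items() for _ in range(n)]
    kept = random.Random(seed).sample(pool, target)
    downsampled = {gene: 0 for gene in counts}
    for gene in kept:
        downsampled[gene] += 1
    return downsampled


# Toy cell with 2000 UMIs spread over three genes.
cell = {"geneA": 1500, "geneB": 400, "geneC": 100}
print(downsample_umis(cell, 1000))
```

Cells below the target depth raise an error in this sketch; in the analysis above they would already have been excluded by the total UMI thresholds.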

Bioinformatics and Statistical Analysis

In the libraries, read one contains the cell barcode as well as the UMI information and

read two contains sequences from transcripts. We mapped read two using STAR 2.5

with default parameters, to the human genome (Ensembl release 90 of the Homo sapiens

GRCh38 genome, extended with ERCC92 spike-ins) or the mouse genome (Ensembl

release 90 of the Mus musculus GRCm38 genome, extended with ERCC92 spike-ins). The

design of the primer used for cDNA synthesis was “GCCGG - minimal T7 promoter

(TAATACGACTCACTATAGGG) - A - Illumina adapter

(GTTCTACAGTCCGACGATC) - unique molecular identifier (NNNNNN) - cell barcode

(8 bases) - 24xT - V”. If the length of the poly-T tract in read one was less than 19 bases,

the read was discarded. The UMI count per gene was obtained by pooling all

reads mapping to the same gene and having the same cell barcode as previously described

(27) using only uniquely mapped reads. Briefly, for each cell barcode, we counted the

number of UMIs for every transcript and aggregated this number across all transcripts

derived from the same gene locus (27). However, we did not use binomial statistics to

convert the number of UMIs to transcript counts (27). The UMI for a given gene in a cell

was considered to be unspliced if at least one base of any read belonging to that UMI

mapped outside of the exons of the gene. The minimum number of UMIs detected in a cell

for the RPE1-FUCCI cell-cycle analysis was 2300, while the threshold used for murine

intestinal organoid cells was 1000.
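The read-one filter and the unspliced rule described above can be sketched as follows. This is an illustration, not the authors' pipeline. It assumes that read one starts directly with the 6-base UMI, followed by the 8-base cell barcode and then the poly-T tract, and that alignments and exons are given as half-open genomic intervals. The function names are hypothetical.

```python
UMI_LEN, BARCODE_LEN, MIN_POLY_T = 6, 8, 19


def parse_read_one(seq):
    """Return (umi, barcode), or None if the poly-T tract is shorter than 19 bases."""
    umi = seq[:UMI_LEN]
    barcode = seq[UMI_LEN:UMI_LEN + BARCODE_LEN]
    tail = seq[UMI_LEN + BARCODE_LEN:]
    poly_t = len(tail) - len(tail.lstrip("T"))  # length of the leading run of T
    if len(barcode) < BARCODE_LEN or poly_t < MIN_POLY_T:
        return None
    return umi, barcode


def is_unspliced(read_intervals, exons):
    """True if at least one base of any read of a UMI falls outside all exons."""
    def in_exon(pos):
        return any(start <= pos < end for start, end in exons)

    return any(
        not in_exon(pos)
        for start, end in read_intervals
        for pos in range(start, end)
    )


print(parse_read_one("ACGTAC" + "GGCCTTAA" + "T" * 24 + "A"))
print(is_unspliced([(100, 150), (155, 170)], exons=[(90, 160)]))
```

Checking every base is slow but keeps the rule explicit; an interval-based comparison would be the natural optimization.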

For differential gene expression analysis, we used the R package DESeq2 (28). All

reported enrichments were carried out using Fisher’s exact test, and obtained P values were

adjusted for multiple testing using the Benjamini–Hochberg method.
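For readers unfamiliar with the adjustment, a generic Benjamini–Hochberg procedure can be written in a few lines. This is a textbook version given for illustration; the text does not state which implementation was used.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```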

Self-organizing maps (SOMs) for analysis of organoid gene expression and transcript

regulatory strategies were created using the SOM library for Python, using the cosine

distance metric to match genes to SOM nodes. The single cell UMAP and SOM analyses

to identify cell type lineages for the intestinal organoid dataset were constructed using the

cosine metric on the normalized spliced UMI counts per cell (relative expression of each

gene per cell, see below), with the UMAP implementation for Python (29, 30). The SOM

analysis initially resulted in 12 nodes, but node 12 was merged with node 8 as these two were

clusters of cells mapping to the same region of the UMAP but represented cells coming

from the pulse and chase experiments, respectively. The identification of cell type identity

was done based on differential gene expression analysis using DESeq. Briefly, the

expression of cells in every cluster was compared to cells from all other clusters, and we

used the identity of the top upregulated genes to call cell types (Table S1). GO enrichment

analysis was done using DAVID (31) as described in (32).

Identification of highly variable genes during organoid development

To identify genes that were highly variable during intestinal organoid differentiation,

we first filtered genes using the coefficient of variation (CV) as a function of the mean

expression level. We define the relative expression of each gene by dividing the total

spliced UMI counts for a gene in a cell by the total spliced UMIs detected in that cell across

all genes and multiplying this by 10^5. We learned the general scaling of the CV and mean

relative expression by linear regression, and selected genes that were in the top 10% most

variable for a given expression level. To obtain genes above this threshold, we used a

sliding window with a width value of 0.2 in the log10(relative expression) (fig. S13B). This

resulted in 1033 putative regulated genes. We then applied DESeq analysis (28) comparing

the relative expression of the nodes 11 and 8, against all other nodes, to find genes that

were significantly different between Lgr5 positive stem cells, and the rest. The adjusted P

threshold was set to 10^-5 (fig. S13C). Additionally, to correct for any systematic bias in the

measurement of the log2(fold change), we performed Gaussian mixture modeling assuming

5 Gaussian distributions and discarded all genes that were within 2 standard deviations of

the mean value of the central Gaussian distribution. All ribosomal protein coding genes,

Malat1 (33), and Hck, Hbegf and Ptprr were removed from the final list, resulting in 295

genes that were used for further analysis. We added selected housekeeping genes as

controls (see main text).
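The relative expression and CV used in this section can be computed as in the sketch below. It covers only these two quantities, not the regression or sliding-window selection. The use of the population standard deviation is an assumption, since the text does not specify the estimator, and the function names are illustrative.

```python
from statistics import mean, pstdev


def relative_expression(cell_counts, scale=1e5):
    """Spliced UMIs of each gene divided by the cell's total spliced UMIs, times 10^5."""
    total = sum(cell_counts.values())
    return {gene: scale * n / total for gene, n in cell_counts.items()}


def coefficient_of_variation(values):
    """Standard deviation over mean (population standard deviation assumed)."""
    m = mean(values)
    return pstdev(values) / m if m > 0 else float("nan")


# Toy example: one gene measured in three cells.
cells = [{"g1": 10, "g2": 90}, {"g1": 30, "g2": 70}, {"g1": 20, "g2": 80}]
g1 = [relative_expression(c)["g1"] for c in cells]
print(g1, coefficient_of_variation(g1))
```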

Computation of the cell cycle progression

To compute the cell-cycle progression of RPE1-FUCCI cells using the signals from

Geminin-GFP and the Cdt1-RFP, we first depleted the dataset according to the local data

density and then matched the fluorescence measurements from the pulse and chase

experiments by performing a z-score normalization on the log10(fluorescence). Briefly, we

iteratively discarded two points chosen from the 10 points with largest local density until

500 data points were left, then the mean and standard deviation (std) values were computed

for the GFP and RFP signals of the pulse and chase experiments. Next, the values of the

chase experiment were matched to the pulse experiment using the following expression:

$$\mathrm{correctedChase} = \mathrm{meanPulse} + \mathrm{stdPulse} \cdot \frac{\mathrm{originalChase} - \mathrm{meanChase}}{\mathrm{stdChase}}$$

We constructed the cell-cycle progression trajectory by using an implementation of

the wanderlust algorithm (34) written in Python. We changed the algorithm so that cells

were allocated to one of 300 equally spaced points along the trajectory, to allow for later

calculation of the average time cells spent at each cell-cycle point using the ergodicity

principle. The cell-cycle progression was computed independently for the scEU-seq and

CEL-seq2 datasets. We also used the cell-cycle progression computations to guide the

sorting of the bulk G1, S and G2 control experiments, as described above.
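The matching of chase to pulse fluorescence described in this section amounts to a z-score transfer on the log10(fluorescence) values. A minimal sketch follows, assuming the statistics are taken from whatever reference subsets the caller supplies (in the text, the 500 density-depleted points). The population standard deviation is an assumption, and `match_chase_to_pulse` is a hypothetical name.

```python
from statistics import mean, pstdev


def match_chase_to_pulse(chase_values, chase_reference, pulse_reference):
    """Rescale log10 chase fluorescence to the mean and std of the pulse experiment."""
    mean_chase, std_chase = mean(chase_reference), pstdev(chase_reference)
    mean_pulse, std_pulse = mean(pulse_reference), pstdev(pulse_reference)
    return [
        mean_pulse + std_pulse * (value - mean_chase) / std_chase
        for value in chase_values
    ]


# Toy log10(GFP) values; real inputs would be one channel of one experiment.
chase = [1.0, 2.0, 3.0]
pulse = [10.0, 12.0, 14.0]
print(match_chase_to_pulse(chase, chase, pulse))
```

The same function would be applied separately to the GFP and RFP channels.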

Derivation of organoid differentiation trajectories

To compute the differentiation trajectory for intestinal organoids we used the R

package of Monocle2 (24) based on the total spliced UMI counts for the 295 genes that

significantly varied during differentiation plus the 6 selected housekeeping genes (301

genes in total) (fig. S14A). For computation of the average transcript levels, and synthesis

and degradation rates in the secretory lineage leading to Paneth cell differentiation, we took

cells belonging to clusters 4, 6, 8, 9, 10 and 11 (Fig. 3C), and that were part of branch 1 in

the Monocle2 analysis (Fig. 3E and fig. S14A). Similarly, for computation of the average

transcript levels, and synthesis and degradation rates in the enterocyte lineage, we took

cells belonging to clusters 1, 2, 6, 7, 8 and 11 (Fig. 3C), and that were part of branch 2 in

the Monocle2 analysis (Fig. 3E and fig. S14A). The values were derived by first rescaling

the difference between two adjacent cells in the Monocle2 trajectory by the Manhattan

distance between the relative expression of the 301 selected genes per cell. We then

averaged normalized rates and normalized UMI counts along 200 equally spaced positions

of a sliding window with a window size equivalent to 5 positions, in each branch

individually.

Supplementary Text

Generation of simulated data for testing the fitting procedure

To assess the best fitting procedure to estimate κ and γ using our dataset, we

considered two alternative interpretations of the chase experiment, a non-steady state

model, where the number of labeled molecules at the start of the chase time is considered

to be unknown, and a quasi-steady state model, which views the dynamics of the chase

experiment as an exponential decay process. We then built a probabilistic framework to

simulate realistic dynamics of the pulse and chase experiments under non-steady state

conditions.

For the pulse experiment we assume the simplest model of gene expression:

Note that this model assumes continuous synthesis of mRNA. The time dependent

probability mass function of such a model follows a Poisson distribution if the initial expression level is zero (35):

$$\mathrm{Poiss}(N, m(t)) \qquad [1]$$

where $m(t) = \frac{\kappa}{\gamma}\left(1 - e^{-\gamma t}\right)$ corresponds to the dynamics of the population mean, $\gamma$ is the degradation rate constant, $\kappa$ is the transcription rate, and $N$ is the observed number of molecules at time $t$. To model the dispersion observed in our single cell sequencing dataset (27), we convolve eqn. [1] with a negative binomial distribution:

$$P_N(t) = \sum_{M=0}^{\infty} \mathrm{Poiss}(M, t)\, \mathrm{NB}_{t=0}(N, M \times p, s) \qquad [2]$$

where $s$ is the dispersion of the negative binomial, and $p$ is a parameter that can be modified to approximate different experimental recovery rates. For simulations presented in the manuscript $p$ was set to 0.25 and $\log_2 s = 2 + \log_2 p \ast M$. In fig. S7, A and B, $\kappa = 10$ molecules $\cdot$ h$^{-1}$ and $\gamma = 0.346$ h$^{-1}$; fig. S7A displays an example of a pulse

experiment and fig. S7B displays an example of a chase experiment. We use eqn. [2] to

sample the dynamics of the pulse experiment for different combinations of the synthesis

and degradation rates seen in fig. S7.
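For reference, the mean dynamics quoted with eqn. [1] are what a rate equation with constant synthesis and first-order degradation gives. The derivation below is a standard result added for convenience; it is an assumption about the intended model rather than a quotation from the original text.

```latex
\begin{aligned}
\frac{dm}{dt} &= \kappa - \gamma\, m(t), \qquad m(0) = 0 \\
m(t) &= \frac{\kappa}{\gamma}\left(1 - e^{-\gamma t}\right), \qquad
\lim_{t \to \infty} m(t) = \frac{\kappa}{\gamma} \\
t_{1/2} &= \frac{\ln 2}{\gamma}
\end{aligned}
```

With the values used for fig. S7, A and B ($\kappa = 10$ molecules $\cdot$ h$^{-1}$, $\gamma = 0.346$ h$^{-1}$), this gives a steady-state mean of about 28.9 molecules and a half-life of about 2 h.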

For simulation of the chase experiment we divided the dynamics in two phases, the

induction phase, where the mRNA molecules are labeled with the dynamics of the pulse

experiment using eqn. [2]. For genes close to steady-state expression, this induction phase

is relatively long (induction window in fig. S7C). However, for genes that have not reached

steady state, which is the case of most genes in the time scale of a single cell cycle, this

induction window will be relatively short, which in turns leads to changes in the total

number of molecules at the start of each chase phase, as discussed above (fig. S7C). We

then simulate the chase phase of the experiment as a stochastic exponential decay, which

is equivalent to a Bernoulli trial, and follows the binomial distribution (36):

$$B(N, N_0, p(t)) \qquad [3]$$

where $p(t) = e^{-\gamma t}$, $N$ is the observed number of molecules at time $t$, and $N_0$ is the initial number of molecules. We model the dynamics of the observed chase experiment as the convolution of the equations [2] and [3]:

$$C_N(t) = \sum_{N_0=0}^{\infty} P_{N_0}(w - t)\, B(N, N_0, t) \qquad [4]$$

where $N_0$ is the number of molecules at the start of the chase phase, $t$ is the chase time, and $w$ is the time of the induction phase. We use $P_{N_0}(0)$ for $w < t$. For results shown in fig. S7, E to I, we used $w = 6$ h, a range of $\kappa$ values from 1 to 100 molecules/h, and a range of $\gamma$ values from 0.069 to 1.38 h$^{-1}$, equivalent to half-lives of ~10 to 0.5 h. For results shown in fig. S7, J and K, we used $w = 2$ h to 20 h as indicated, and $\kappa = 40$ molecules/h.
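The chase phase of eqn. [3] can be illustrated with a short sketch: each of the $N_0$ labeled molecules independently survives to time $t$ with probability $e^{-\gamma t}$. The helper also converts degradation rate constants to half-lives, which reproduces the ~10 h and ~0.5 h figures quoted above. Function names and the seed are illustrative choices.

```python
import math
import random


def half_life(gamma):
    """Half-life in hours for a degradation rate constant gamma in 1/h."""
    return math.log(2) / gamma


def simulate_chase(n0, gamma, t, rng):
    """Binomial decay: count the molecules out of n0 that survive to time t."""
    p_survive = math.exp(-gamma * t)
    return sum(1 for _ in range(n0) if rng.random() < p_survive)


rng = random.Random(1)
for gamma in (0.069, 0.346, 1.38):
    print(f"gamma={gamma} 1/h -> half-life {half_life(gamma):.2f} h")
print([simulate_chase(100, 0.346, t, rng) for t in (0, 1, 2, 4, 6)])
```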

Generation of simulated data for testing the validity of the regulatory strategies

To test the validity of the estimated regulatory strategies, we assumed a simple gene

expression model as shown above, and then allowed κ and γ to change dynamically for a

period of 1440 minutes (24 h). The change of the rates over time was based on the Gaussian function

$$g(t) \equiv \exp\left[-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^{2}\right],$$

where $\mu$ is the peak time, which was set to 720 minutes for all simulations, $\sigma$ is the standard deviation, which was set to 100, 200, 300 and 400 minutes, and $t$ is the time along the cell cycle. The synthesis rate (units molecules/h)

was defined by the function $\log_{10} \kappa(t) = a + b \cdot g(t)$, where $a$ was 1.5 or 1.2, and $b$ ranged between 0.5 and 1.5. Similarly, the degradation rate constant (units h$^{-1}$) was defined by the function $\log_{10} \gamma(t) = (a - 1) + c \cdot g(t)$, where $a$ was 1.5 or 1.2, and $c$ ranged between $-b$ and $b$. This choice of parameterization leads to a steady state level of $\kappa(t \to \infty)/\gamma(t \to \infty) = 10$ transcripts. Varying the parameters a and b resulted in synthesis rates ranging from 15.6 to 1,000 molecules/h and degradation rate constants ranging from 0.05 to 100 h$^{-1}$ (equivalent to a range of half-lives between 13.9 and 0.007 h). The parameters b and c define the fold change of the synthesis rate and degradation rate constant (fig. S12A):

$$K \equiv \frac{\kappa(t=\mu)}{\kappa(t\to\infty)} = 10^{b} \quad \text{and} \quad G \equiv \frac{\gamma(t=\mu)}{\gamma(t\to\infty)} = 10^{c}.$$

Next, we simulated the

stochastic evolution of the pulse and chase experiments using the Gillespie algorithm,

which we implemented in Python. During simulations the values of the rates were updated

every 10 minutes. For the chase experiment, we first simulated the evolution of a full

system progression (1440 minutes). During the 1440 minutes, we stopped the simulation

at the time corresponding to the start of the chase phase, and then simulated an exponential

decay process with the corresponding degradation rate constants. For simulations of the

pulse experiment the initial transcript count was set to zero, and the values from the

degradation and synthesis rates were set to values corresponding to the EU labeling time-

points. We simulated 100 traces for each rate regime (either cooperative or destabilizing),

and acquired measurements at time points corresponding to 0, 130, 261, 392, 523,

654, 784, 915, 1,046, 1,177, 1,308 and 1,439 minutes of the second run, resulting in a total

of 105,600 simulated traces. To estimate the probability density function of the

experimental measurements, we convolved the final values of the 100 simulated traces per

condition with a negative binomial distribution as described above; p was set to 0.25 and

$\log_2 s = 2 + \log_2 p \ast M$. The fitting of the simulated dataset was done as described below

for the non-steady state case.
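The simulation scheme of this section can be sketched as a birth-death Gillespie simulation in which the rates follow the Gaussian modulation and are held constant within each 10-minute window. The text states that the Gillespie algorithm was implemented in Python but gives no code, so the sketch below is an independent illustration. The default parameter values, the starting count and the function names are assumptions.

```python
import math
import random


def g(t, mu=720.0, sigma=200.0):
    """Gaussian modulation; t, mu and sigma in minutes."""
    return math.exp(-0.5 * ((t - mu) / sigma) ** 2)


def kappa(t, a=1.5, b=1.0, sigma=200.0):
    """Synthesis rate in molecules/h: log10 kappa = a + b*g(t)."""
    return 10 ** (a + b * g(t, sigma=sigma))


def gamma(t, a=1.5, c=0.5, sigma=200.0):
    """Degradation rate constant in 1/h: log10 gamma = (a - 1) + c*g(t)."""
    return 10 ** ((a - 1) + c * g(t, sigma=sigma))


def gillespie(n0, a=1.5, b=1.0, c=0.5, sigma=200.0,
              t_end=1440.0, step=10.0, seed=0):
    """Return the transcript count after t_end minutes, starting from n0."""
    rng = random.Random(seed)
    n = n0
    for i in range(int(round(t_end / step))):
        t, window_end = i * step, (i + 1) * step
        # Rates are refreshed at the start of each window and converted to 1/min.
        k = kappa(t, a, b, sigma) / 60.0
        d = gamma(t, a, c, sigma) / 60.0
        while True:
            total = k + d * n
            t += rng.expovariate(total)  # waiting time to the next reaction
            if t >= window_end:
                break  # memorylessness lets the clock restart in the next window
            if rng.random() * total < k:
                n += 1  # synthesis event
            else:
                n -= 1  # degradation event (never chosen when n == 0)
    return n


print("fold change K:", kappa(720.0) / kappa(1e9))
print("final counts:", [gillespie(10, seed=s) for s in range(5)])
```

Far from the peak, g(t) vanishes, so `kappa(1e9)` stands in for the $t \to \infty$ limit and the ratio printed first equals $10^{b}$.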

As can be seen from fig. S12, large errors in the calling of regulatory strategies

(cooperative versus destabilizing) are only observed for cooperative strategies when 𝜎 ≤

100 minutes and 𝑎 = 1.2, and the fold change of the synthesis rate K is relatively large

(higher than 4 times) (fig. S12, E, G and H). Since we do not observe incorrect calls of true

destabilizing strategies as “cooperative”, our findings regarding cooperative strategies

during the cell cycle are robust to biases introduced by the fitting procedure. Furthermore,

these errors are only observed during fast dynamics (expression changes in the order of

100 minutes).

General modeling of kinetic rates

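To fit the chase experiment assuming a non-steady state dynamic we used the following arguments and model. The real dynamics of the chase experiment, when the change in expression is unknown, can be illustrated by fig. S7C, where $m_0$, $m_1$, and $m_2$ are the measurements of labeled RNA at chase time points $t_0$, $t_1$ and $t_2$ (washouts in figure), respectively. Then,

$$m_0 = \frac{\kappa}{\gamma} - \frac{\kappa}{\gamma} e^{-\gamma(t_0 - t_1)} + h_1 e^{-\gamma(t_0 - t_1)},$$

where $h_1$ is the unseen initial transcript level of $t_1$,

$$m_0 = \frac{\kappa}{\gamma} - \frac{\kappa}{\gamma} e^{-\gamma(t_0 - t_2)} + h_2 e^{-\gamma(t_0 - t_2)}, \; \ldots$$

$$\boldsymbol{m_0 = \frac{\kappa}{\gamma} - \frac{\kappa}{\gamma} e^{-\gamma t} + h_t e^{-\gamma t}}$$

$$m_1 = h_1 e^{-\gamma(t_0 - t_1)},$$

$$m_2 = h_2 e^{-\gamma(t_0 - t_2)}, \; \ldots$$

$$m(t) = h_t e^{-\gamma t}$$

$$\boldsymbol{h_t = \frac{m(t)}{e^{-\gamma t}}}$$

and,

$$m(t) = -\frac{\kappa}{\gamma} + \frac{\kappa}{\gamma} e^{-\gamma t} + m_0 \qquad [5]$$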

where eqn. [5] is the non-steady state dynamics of the chase experiment. The quasi-steady state interpretation and dynamics of the chase experiment can be represented by fig. S7D and follows the exponential decay process,

$$l(t) = m_0 e^{-\gamma t} \qquad [6]$$

The dynamics for the pulse experiment is:

$$l(t) = \frac{\kappa}{\gamma} - \frac{\kappa}{\gamma} e^{-\gamma t} + l_0 e^{-\gamma t} \qquad [7]$$

and assuming $l_0 = 0$, it becomes

$$l(t) = \frac{\kappa}{\gamma}\left(1 - e^{-\gamma t}\right) \qquad [8]$$

where $l(t)$ is the average number of molecules detected at pulse or chase time t, and $l_0$ is the initial number of molecules in the experiment at time t = 0. $l_0$, $\kappa$ and $\gamma$ are fitted parameters in the chase experiment.
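Written as functions, the three model curves of this section (eqns. [5], [6] and [7]/[8]) look as follows. This is a direct transcription for illustration; the function names are not from the original.

```python
import math


def pulse(t, kappa, gamma, l0=0.0):
    """Eqn. [7]; with l0 = 0 this reduces to eqn. [8]."""
    decay = math.exp(-gamma * t)
    return kappa / gamma * (1 - decay) + l0 * decay


def chase_quasi_steady(t, m0, gamma):
    """Eqn. [6]: pure exponential decay from the initial labeled level m0."""
    return m0 * math.exp(-gamma * t)


def chase_non_steady(t, kappa, gamma, m0):
    """Eqn. [5]: non-steady state dynamics of the chase experiment."""
    return -kappa / gamma + kappa / gamma * math.exp(-gamma * t) + m0


for t in (0.0, 1.0, 2.0, 6.0):
    print(t,
          round(pulse(t, 10, 0.346), 2),
          round(chase_quasi_steady(t, 25, 0.346), 2),
          round(chase_non_steady(t, 10, 0.346, 25), 2))
```

Both chase models start at $m_0$ at $t = 0$; they differ in the long-time limit, where eqn. [6] decays to zero and eqn. [5] tends to $m_0 - \kappa/\gamma$.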

Fitting of the synthesis rate κ and degradation rate constant γ

To sample cells from the different pulse and chase time points of the cell cycle

experiment we constructed cell pools for cell-cycle position $CCP_i$ by pooling cells from $j$

neighboring positions ($CCP_{i-j} \ldots CCP_{i+j}$), assuming a circular cell-cycle structure. To

determine the value of $j$ for a given gene, we incrementally expanded it until at least 10%

of the total number of measured UMIs and at least 15 cells of each pulse and chase time

point were within the pool of cells. Then, for each bootstrap, we subsampled the pool of cells

with replacement to draw a total of 30 cells per time point. For the intestinal organoids we

computed one value of κ and γ per cell, for each cell in all time points in the pulse and

chase experiments. The pool of cells to be sampled per cell $C_i$ was constructed by taking

the 20 closest neighbors of $C_i$ per pulse and chase time point using the cosine distance

between the total spliced UMIs, taking only the selected genes identified to change

significantly during development plus the 6 housekeeping genes (301 genes in total). For

bootstrapping we sampled 20 cells per time point with replacement. Prior to fitting the pulse

and chase experiments we normalized the mean labeled UMI counts of a given gene in the

pool of cells for a given pulse or chase time point by the following equation:

$$\text{mean sum of total UMI in experiment} \cdot \frac{\text{mean labeled UMI for gene in pool of cells}}{\text{sum of total UMI for pool of cells of the time point}} \qquad [9]$$

where the total UMI is the sum of the unlabeled and labeled UMIs. Since we did not

simulate the entire transcriptome of a cell, for fitting the simulations we used the mean

labeled UMI number, and sampled 20 cells per bootstrap per time point. When fitting the chase experiment assuming the non-steady state model we used eqn. [5], where $m(t)$ are the normalized labeled UMI for each time point q of the chase experiment. The fitted parameters were $\kappa$, $\gamma$ and $m_0$, and we minimized the sum of squared errors as the cost function: $\mathrm{error} = \sum \left(\mathrm{pred\_}m(t) - \mathrm{measured\_}m(t)\right)^{2}$. Similarly, when fitting the chase experiment only assuming the quasi-steady state model we used eqn. [6]; the fitted parameters were $\gamma$ and $l_0$, and we minimized the cost function as shown above. For fitting the pulse experiment only, we used eqn. [8] and fitted $\kappa$ and $\gamma$, and minimized a cost function as shown above. Finally, for fitting the non-steady state model on the combination of pulse and chase we used equations [5] and [8]. In the case of the cell cycle we have enough time points to allow us to fit two independent $\kappa$ values for the chase and the pulse experiments. This accounts for the expected differences in library depth and transcript detection efficiencies between the pulse and chase experiments. For further analysis in this case we used the estimated $\kappa$ corresponding to the pulse experiment. In the case of the organoids, this difference was estimated by a correction factor for the chase experiment, $f = \text{mean total UMI in chase} / \text{mean total UMI in pulse}$, which was used to multiply $\kappa$ when applied to eqn. [5]. In the cost function the errors for the pulse and chase experiments were weighted for the number of time points in each experiment and for the expression level. The fitting for each pool of cells was bootstrapped 100 times for the simulations, 50 times for each cell-cycle progression point, and 20 times for each cell of the organoid dataset.

The result of the simulations showed that sometimes our fitting procedure generates global

outliers, which had values of rates outside of what we can expect in our biological systems

(fig. S7, E and F). Hence, the thresholds for discarding these global outliers were defined

by inspection of the distribution of all obtained rates, as shown in (fig. S7, E and F, S8D,

and S14D). The values for the thresholds used for the cell cycle dataset are 0.00316 and 10

molecules/h for the synthesis rate, and 0.00316 and 10 h^-1 for the degradation rate. The

values for the thresholds used for the intestinal organoid dataset are 0.000316 and 7.1

molecules/h for the synthesis rate and 0.00316 and 7.1 h^-1 for the degradation rate. Upon

manual inspection of the final rates in the organoids dataset, we adjusted the lower

threshold of the computation of the degradation rate of the Defa17 gene to 0.25 h^-1. The median relative error between the predicted rates and the true rate used for the simulation was defined as $\mathrm{median}\left(\left|\mathrm{rate}_{predicted} - \mathrm{rate}_{expected}\right| / \mathrm{rate}_{expected}\right)$. To approximate the expected synthesis rate during simulations in order to account for the error introduced by the negative binomial term in eqn. [2], the true synthesis rate was adjusted by the defined s parameter of eqn. [2], $\mathrm{synthesis}_{expected} = \mathrm{synthesis}_{true} \ast (1 - s)$. In the case of the

cell-cycle data set we also discarded clear local outliers by fitting a Gaussian distribution

to the log10 transformed rates around the 𝐶𝐶𝑃 of interest. The Gaussian distribution was

calculated with a window size of 30-60 CCPs around the point of interest to avoid

overestimation of local outliers. We discarded all points with a deviation higher than 2.5σ. The

median of all non-outlier bootstraps for a given progression point was taken as the measured

rates for that cell-cycle progression point. In the case of the organoids the final rates were

the median of cells in a given Monocle2 trajectory falling within a window size equivalent

to 7h and expanded to avoid non-defined values. For trajectory branch 1 only cells in

clusters 1, 3, 8, 10, 11 were considered, and for trajectory branch 2 only clusters 1, 2, 5, 6

and 7 were considered.
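The least-squares fitting and the median relative error used in this section can be illustrated for the pulse-only case (eqn. [8]). The text does not name the optimizer, so the sketch below substitutes a coarse log-spaced grid search as a stand-in. The grid bounds, function names and toy data are all assumptions made for illustration.

```python
import math
from statistics import median


def pulse_model(t, kappa, gamma):
    """Eqn. [8]."""
    return kappa / gamma * (1 - math.exp(-gamma * t))


def sse(params, times, measured):
    """Sum of squared errors between predicted and measured labeled levels."""
    kappa, gamma = params
    return sum((pulse_model(t, kappa, gamma) - m) ** 2
               for t, m in zip(times, measured))


def fit_pulse(times, measured, n_grid=61):
    """Grid search over kappa and gamma from 0.01 to 100 (stand-in optimizer)."""
    grid = [10 ** (-2 + 4 * i / (n_grid - 1)) for i in range(n_grid)]
    candidates = ((k, g) for k in grid for g in grid)
    return min(candidates, key=lambda p: sse(p, times, measured))


def median_relative_error(predicted, expected):
    return median(abs(p - e) / e for p, e in zip(predicted, expected))


# Noise-free toy data at the pulse time points (hours), true kappa=10, gamma=1.
times = [0.25, 0.5, 0.75, 1.0, 2.0, 3.0]
data = [pulse_model(t, 10.0, 1.0) for t in times]
kappa_hat, gamma_hat = fit_pulse(times, data)
print(kappa_hat, gamma_hat,
      median_relative_error([kappa_hat, gamma_hat], [10.0, 1.0]))
```

In a bootstrap setting, `fit_pulse` would be called once per resampled pool of cells and the median of the non-outlier fits taken as the estimate.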

Prediction of expression level through cell-cycle progression and organoid differentiation

We estimated the time cells spent on average at each position of the 𝐶𝐶𝑃 or the

differentiation trajectory by following the ergodicity principle. Briefly, we calculated the

fraction of cells that mapped to each position of the 𝐶𝐶𝑃 or differentiation trajectory, and

computed the average time in minutes that cells had to spend at each point, maintaining the

distribution shape and making the cumulative time equal to 24 h (fig. S9C) for the cell

cycle, and 72 h for differentiation of organoids (fig. S14C) (37). For the organoids we

considered the start of the differentiation of branch 1 to be at the Monocle2 trajectory value

3.5 and to progress as the Monocle2 trajectory value decreased; for branch 2, we considered the start

point to be 6.5 and to progress as the Monocle2 trajectory value increased (fig. S14C). For

prediction of the total levels, we recalibrated the computed rates. In the case of the cell-

cycle progression, we first found the scaling factor multiplying the degradation rate

constant γ, required to best predict the gene expression progression along the cell cycle of

the total UMI in the scEU-seq data set using simulations according to eqn. [7], by

initializing the simulations with the average level of the first 20 cell cycle positions, and

calculated the predicted change in expression occurring while cells are at 𝐶𝐶𝑃𝑖 by,

$$l_i = \frac{\kappa_i}{c\gamma_i} - \frac{\kappa_i}{c\gamma_i}\, e^{-c\gamma_i t_i} + l_{i-1}\, e^{-c\gamma_i t_i} \qquad [10]$$

where $i$ is a position on the 𝐶𝐶𝑃, $c$ is the tested gene correction factor, and $\kappa_i$, $\gamma_i$,

and $t_i$ are the synthesis rate, degradation rate constant and time spent at each cell-cycle position, respectively. The

average recalibrated γ values correlate well with known degradation rates in human cells (see fig.

S10C). Similarly, we obtained the correction factor for the recalibration of κ by optimizing

the prediction of the CEL-seq2 experiment to account for the difference in library depth

and transcript detection efficiency. We used the final corrected rate values for further

analysis. In the case of the organoids, to avoid overcorrection of rates, we only corrected

κ, and the observed expression was computed as the running average of the total UMI

detected in the DMSO controls and the 6 h (360 min) chase time point for branches 1 and

2. Note that the corrections applied to the rates do not change the magnitude of the relative

change between the two rates. For modeling the cell-cycle or differentiation systems with

constant synthesis rate or degradation rate constants, we averaged the recalibrated rates

weighting the values of each position (in the cell cycle or differentiation) by the expected

number of cells and simulated the resulting gene expression as described above.
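
The ergodic time assignment, the recursion of eqn. [10], and the recalibration by a correction factor can be sketched in Python as follows. This is a minimal illustration under stated assumptions: all function names are hypothetical, the correction factor is chosen from a user-supplied list of candidates by minimizing the sum of squared errors (the optimizer used in the original analysis is not specified here), and rates and times must share consistent units.

```python
import math


def dwell_times(cell_counts, total_minutes=24 * 60):
    """Ergodic estimate: time spent at a position is proportional to the
    fraction of cells mapped to it; the times sum to `total_minutes`."""
    n = sum(cell_counts)
    return [total_minutes * count / n for count in cell_counts]


def predict_levels(kappa, gamma, t, l0, c=1.0):
    """Iterate eqn. [10] along the trajectory, starting from level `l0`."""
    levels = []
    prev = l0
    for k, g, dt in zip(kappa, gamma, t):
        decay = math.exp(-c * g * dt)
        steady = k / (c * g)
        prev = steady - steady * decay + prev * decay
        levels.append(prev)
    return levels


def fit_correction_factor(kappa, gamma, t, observed, candidates, n_init=20):
    """Pick the candidate c that best predicts `observed` (assumed grid search).

    The simulation is initialized with the average of the first `n_init`
    observed positions, as described in the text.
    """
    first = observed[:n_init]
    l0 = sum(first) / len(first)

    def sse(c):
        pred = predict_levels(kappa, gamma, t, l0, c)
        return sum((p - o) ** 2 for p, o in zip(pred, observed))

    return min(candidates, key=sse)


def weighted_constant_rate(rates, cell_counts):
    """Cell-number-weighted average rate for the constant-rate models."""
    return sum(r * n for r, n in zip(rates, cell_counts)) / sum(cell_counts)
```

For the organoid trajectories, `total_minutes` would be set to 72 × 60, matching the 72 h assumed for differentiation.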

Analysis of regulatory strategies

Because we were interested in the relative changes of synthesis with respect to the

degradation rates either during the cell cycle or during intestinal organoid differentiation,

and given that at steady state the RNA level is $l = \kappa / \gamma$, we performed the initial analysis

of regulatory strategies using normalized rates rather than the absolute rate values. For the

cell-cycle experiment the rates at each position of the 𝐶𝐶𝑃 were normalized to the median rate

along the CCP and then log2 transformed. For the organoids the normalization was done

by the average rate values of the first 50 positions (1/4) of the differentiation trajectory

branch. The cosine similarity between the synthesis and degradation rates was calculated

on the normalized rates per cell-cycle position (RPE1-FUCCI cells) across the entire CCP.

For the cosine similarity of organoids, we used positions 100 to 200 for branch 1 and 50 to

200 for branch 2, to maximize the computation of the similarity during the expected change

in expression. To define clusters enriched in cooperative, neutral or destabilizing strategies

for each strategy cluster in the RPE1-FUCCI dataset, we computed the enrichment of genes

with a low cosine similarity (s < -0.5) as a marker of strong cooperative strategies,

moderate cosine similarities (-0.5 < s < 0.5) as a marker of neutral strategies, and high cosine

similarities (s > 0.5) as a marker of strong destabilizing strategies, in each strategy cluster

using the Fisher test (Fig. S11A). For the organoids, we selected the strategies group of

interest (Fig. 3, I and J) and discarded genes with a cosine similarity between -0.2 and 0.2

to make sure genes with cooperative or destabilizing strategies were taken for further

analysis. The dynamic range was estimated as the difference between the 2nd and 98th

percentiles of the mean observed or predicted expression along the cell-cycle progression,

and between the 1st and 95th percentiles for the differentiation trajectory. The change in dynamic range

was then defined as $\log_2(\mathrm{model}_{\mathrm{predicted}} / \mathrm{CS2}_{\mathrm{observed}})$ for the three different models:

the full dynamic model, the constant synthesis model, and the constant degradation model.

The expression timing was defined by the combination of the length of the detected peak

and the time delay of the peak. We calculated the length of the detected peak by applying

the Otsu threshold detection algorithm to the mean observed signal along cell-cycle

progression or differentiation trajectory. The time delay was obtained by maximizing the

time cross-correlation function of the predicted vs observed expression. The final timing

distance was defined as $\mathrm{sign}(\mathrm{delay}) \times \sqrt{\mathrm{delay}^2 + \mathrm{peak\ length}^2}$. The delta values

reported are calculated between the constant synthesis or degradation models and the full

dynamic model for both properties, the timing distance or the change in dynamic range,

i.e. $\mathrm{delta} = \mathrm{property}_{\mathrm{constantModel}} - \mathrm{property}_{\mathrm{dynamicModel}}$.
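
The quantities defined in this section can be illustrated with a short Python sketch. The function names are hypothetical, the class boundaries follow the thresholds of Fig. S11A (s <= -0.5 cooperative, s >= 0.5 destabilizing), and returning a similarity of 0 for a rate profile with no variation is an assumption of this sketch rather than a documented choice.

```python
import math
import statistics


def log2_normalize(rates):
    """Normalize rates to their median along the trajectory, then log2
    transform (the convention described for the cell-cycle experiment)."""
    med = statistics.median(rates)
    return [math.log2(r / med) for r in rates]


def cosine_similarity(u, v):
    """Cosine similarity between two normalized rate profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: a flat profile is treated as uncorrelated
    return dot / (norm_u * norm_v)


def strategy_class(s, cutoff=0.5):
    """Label a gene by the cosine similarity s of synthesis vs degradation."""
    if s <= -cutoff:
        return "cooperative"
    if s >= cutoff:
        return "destabilizing"
    return "neutral"


def dynamic_range_change(predicted_range, observed_range):
    """log2 ratio of the predicted to the observed dynamic range."""
    return math.log2(predicted_range / observed_range)


def timing_distance(delay, peak_length):
    """sign(delay) * sqrt(delay^2 + peak_length^2); sign(0) is taken as 0."""
    sign = (delay > 0) - (delay < 0)
    return sign * math.hypot(delay, peak_length)


def delta(property_constant_model, property_dynamic_model):
    """Difference of a property between a constant-rate and the full model."""
    return property_constant_model - property_dynamic_model
```

A gene whose synthesis and degradation rates rise and fall together yields s close to 1 (destabilizing), whereas a gene whose degradation rate falls while synthesis rises yields s close to -1 (cooperative).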

Fitting of rates to bulk chase experiments

To fit the bulk chase experiment for G1, S and G2 gated cells we used eqns. [5] and [7]. As

before, eqn. [5] describes the evolution of the levels of labeled transcripts as a

function of the chase time; in addition, subtracting eqn. [5] from the total levels of transcripts

describes the evolution of the unlabeled transcripts, and eqn. [7] describes the changes in total

transcripts from one cell-cycle phase to the next given the average time difference

between the two phases. In this case the cost function is

$\mathrm{error} = \mathrm{error}_{\mathrm{labeled}} + \mathrm{error}_{\mathrm{unlabeled}} + \mathrm{error}_{\mathrm{total}}$.

We fitted both biological replicates together. The time between the end of G2 and G1 was

estimated to be 24 min; in this case the initial expression level was (G2 expression)/2.

The time assumed for G1 to S was estimated to be 876 min, and the time for S to G2 to be

137 min. These times were derived from the above-mentioned ergodic

timing of the cell-cycle progression.

Fig. S1. Technical controls for scEU-seq. (A) Scatter plot showing the effect of EU

treatment on gene expression. The slope represents the mean slope of a linear fit of total

UMI counts, as a function of short incubation times (DMSO, 15, 30, 45 and 60 minutes) in

RPE1-FUCCI cells. The fit was bootstrapped $10^4$ times. P is the probability of the slope

being zero (corrected for multiple testing by the Benjamini-Hochberg method). Genomic

and mitochondrial genes are shown in black and grey, respectively. The strongest EU-

regulated genes are outlined in red. (B) Mean UMI counts of total, labeled or unlabeled

mRNAs derived from RPE1-FUCCI cells treated with EU for 120 min as a function of the

total UMI counts of DMSO-treated cells. Linear fits are shown. Values for the mean slope

± 99% confidence interval (CI) are: 0.995±0.004 for total, 0.905±0.005 for unlabeled, and

0.089±0.007 for labeled mRNAs. (C) Histogram showing the sum of all UMIs detected

per cell in the pulse experiments. The threshold indicates the minimal UMI for a cell to be

taken for further analysis. (D) Histogram showing the sum of all UMIs detected per cell in

the chase experiments. (E) Bar plot showing the percentage of cells above the UMI

threshold of the pulse experiments. (F) Bar plot showing the percentage of cells above the

UMI threshold of the chase experiments. (G) Top panels show the signal to noise ratios

(SNRs) of the labeled UMI counts for all conditions relative to the DMSO controls for cells

above the UMI threshold, for the pulse (left) and chase (right) experiments, respectively.

The SNR was calculated as the ratio of the median of the total number of labeled UMIs

detected for cells with a given EU treatment, and the median of the total number of labeled

UMIs detected for cells of the DMSO control. Bottom panels show the P values (corrected

Mann-Whitney U tests) for the total number of labeled UMIs detected for cells with a given

EU treatment versus the total number of labeled UMIs detected for cells in the DMSO

control. The red line indicates P=0.05. Cell numbers of the pulse experiments are n = 265

for DMSO, n = 442 for 15 min, n = 574 for 30 min, n = 564 for 45 min, n = 405 for 60 min,

n = 408 for 120 min, and n = 400 for 180 min. Cell numbers of the chase experiments are

n = 202 for DMSO, n = 460 for 0 min, n = 436 for 60 min, n = 541 for 120 min, n = 391

for 240 min, and n = 334 for 360 min.

Fig. S2. scEU-seq performed after heat shock detects enrichment of stress response genes

in the EU-labeled mRNA fraction. (A) Scatter plots showing the average UMI number for

DMSO treated cells (total mRNA, left), and EU-treated K562 cells (unlabeled mRNAs,

middle, labeled mRNAs, right) that had either been incubated for 45 min at 37ºC or at 42ºC.

During the 45 min of heat-shock treatment cells were treated with EU or DMSO. Genes

that were differentially upregulated were detected using DEseq for adjusted P<0.05 and

are indicated in colors. (B) DAVID GO-term annotation enrichment analysis for genes with

upregulated mRNAs detected in the EU-labeled fraction. (C) Bar graph showing the

percentage of all UMIs that represent genes and were found to be upregulated in the EU-

labeled fraction. Enrichment factors (Enr) and P values were calculated using the Fisher

exact test.

Fig. S3. Cell cycle progression. (A) Scatter plot of Geminin-GFP and the Cdt1-RFP

corrected signals of RPE1-FUCCI cells (n = 5,422) showing the estimated cell cycle

progression. (B) Scatter plots as in A but showing cells from individual time points from

the pulse and chase experiments, respectively. Time is indicated in minutes. (C) Scatter

plots showing the expression of example genes as a function of the levels of Geminin-GFP

and Cdt1-RFP. Total UMI counts are indicated in color.

Fig. S4. Transcript levels of individual genes along the cell cycle progression derived from

EU pulse experiments. (A) UMI counts (color bars) of total transcripts (leftmost panel) and

labeled transcripts for each time point of the pulse experiment shown for nine example

genes. (B) Mean labeled UMI counts along the cell cycle trajectory for all detected genes

(n = 11,848) in the pulse experiments.

Fig. S5. Transcript levels of individual genes along the cell cycle progression derived from

EU chase experiments. (A) UMI counts (color bars) of total transcripts (leftmost panel)

and labeled transcripts for each time point of the chase experiment shown for nine example

genes. (B) Mean labeled UMI counts along the cell cycle trajectory for all detected genes

(n = 11,848) in the chase experiments.

Fig. S6. Total transcripts in cells along the cell cycle progression. (A) Median total UMI

counts per cell from the pulse experiments (sum of all genes) as a function of the cell cycle

progression. Gray lines indicate the 25-75% quantiles. Values were calculated with a

sliding window of size = 0.07. (B) As in A but for cells of the chase experiments.

Fig. S7. Evaluation of the fitting procedure for κ and γ. (A) Example of simulated pulse

experiments. (B) Example of simulated chase experiments. (C) Schematics of the non-

steady state interpretation of the chase experiment. (D) Schematics of the steady state

interpretation of the chase experiment. (E) Distribution of all estimated synthesis rates and

threshold for global outlier discarding. (F) As in E but for the degradation rates. (G) Heat

maps of the relative median errors for the degradation rate for different combinations of

synthesis and degradation rates, shown as half-lives (ln(2)/γ), using the non-steady

state model, for the pulse experiment (left), the chase experiment (middle), and their

combination (right), respectively. Induction time is 6h. (H) As in G but using the quasi-

steady state model on the chase experiment. (I) As in G but showing the relative median

errors for the synthesis rate. (J) Relative median error for the estimation of the degradation

rate constant as a function of the true transcript half-life. Errors are shown for the non-

steady state model (left) fitting the pulse and chase experiments together, and the quasi-

steady state model (right). Induction window times are indicated in color. (K) Estimated

degradation rate constants as a function of the true degradation rate constant for the non-

steady state model (left) fitting the pulse and chase experiments together, and the quasi-

steady state model (right). Induction window times are indicated in color as in J.

Fig. S8. Example of the fitting procedure for the non-steady state model with pulse and

chase experiments combined for the cell cycle dataset. (A) Workflow of the procedure to

fit the synthesis and degradation rate constants along the cell cycle progression. (B)

Example of sampled cells from different phases along the cell cycle progression for PLK1.

(C) Fitting of average UMI levels derived from cells sampled from different cell cycle

points for PLK1 (as shown in B). Gray lines represent individual bootstraps, sampling 30

cells with replacement per time point. (D) Representative histograms of fitted rate values

for the synthesis rate (left) and degradation rate constant (right) derived from 50 bootstraps

for 528 selected genes. Gray dashed lines indicate the thresholds for discarding global

outliers. (E) Scatter plots of the fitted synthesis (left) and degradation rates (right) along

the cell cycle progression for PLK1. Local outliers are shown in green and were discarded.

The black line shows the median computed rate at each point along the cell cycle

progression.

Fig. S9. Selection of genes for the cell cycle progression analysis. (A) Heat maps showing

the expression levels along the cell cycle progression of all genes that have more than 500

labeled UMIs detected in all time points of the pulse and chase experiments (n = 6,086).

Clusters of genes are indicated on top and black dots indicate the clusters selected for

further analysis. (B) Scatter plot showing the expression peak length and the peak dynamic

range for selected genes (n = 591). Dashed lines indicate the thresholds used for the final

gene selection (black dots, n = 528). (C) The estimated time that cells spend at each point

along the cell cycle progression for the scEU-seq and the CEL-seq2 datasets.

Fig. S10. Predicted degradation rates for the pulse and chase experiment. (A) Clustered

heat map showing the relative predicted degradation rates along the cell cycle progression

obtained by fitting the pulse experiment (left panel), the chase experiment using the non-

steady state model (middle), and the chase experiment using the quasi-steady state model

(right), respectively. Gray bars indicate gene clusters with similar rates and individual

genes are highlighted. (B) Scatter plot of the correlations between the standard deviations

of the estimated synthesis and degradation rates derived from different models as indicated

on the right (n = 528 genes). (C) The correlation between the average computed

degradation rates and the mean rates reported in Schofield et al. (38). (D) Histograms of

the sum of square errors (left), the correction factors for the synthesis rate (middle), and

the Pearson correlation between the predicted and observed expression levels (right), for

the non-steady state (pink) and the quasi-steady state (gray) models, respectively. The P

value was calculated from a Wilcoxon test, n = 528 genes.

Fig. S11. Gene groups with different regulatory strategies along the cell cycle progression.

(A) Heat map showing the enrichment or depletion for the destabilizing (s >= 0.5), neutral

(-0.5 < s < 0.5), and cooperating (s <= -0.5) strategies for the indicated gene groups

(clusters). Black dots represent significant enrichment or depletion (P<0.05) according to

the Fisher test after correction for multiple testing. Here, s is as in Fig. 2B. (B) Network

visualization of functional GO term annotation enrichments calculated for the indicated

gene groups. Group A example genes: KIF5B, KIF20B, KIF18A. Group B example genes:

CDK1, UBE2C, KIF11, KIF22, TOP2A, CENPE and PLK1. Group C example genes: LIF,

TGFB2, KITLG, POLA1 and MCM6. Group D example genes: PCNA, MCM2, MCM4,

POLD3, LIG1. Group E example genes: BRCA1, MRE11, RMI1. Group F example genes:

VRK1, MELK, PAK1. (C) Scatter plots of the degradation rates derived from gating and

bulk sequencing EU-seq experiments, against average rates derived from scEU-seq

experiments. Plots are shown for three example gene groups and color indicates the cell

cycle phase.

Fig. S12. Accuracy of calling the type of regulatory strategy for different regimes of the

synthesis rate and degradation rate constants along the cell cycle. (A) Workflow of the

procedure to simulate the pulse and chase experiment through the cell cycle. (B) Simulation

example of the chase experiment. The left panel shows the dynamics of the chase

experiment through a 24 h period, where the measurement time-point is at 13 h. The middle

panel shows the obtained transcript counts at the measurement time-point, and the right

panel shows the probability of measured transcript counts after accounting for sampling

error. (C) As B but for the pulse experiment. (D) Schematics of the sampling space of rate

dynamics; e.g., 1 to 4 correspond to the positions of the examples given in E. (E) Examples of

different tested true rate regimes and estimated final rates. For a=1.5, the initial synthesis

rate is 31.6 m/h, and the initial degradation rate constant is 21.6 1/h. For a = 1.2, the initial

synthesis rate is 15.8 m/h, and the initial degradation rate constant is 5.8 1/h. (F) Measured

cosine similarities after estimation of rates from the simulated experiments, the expected

cosine similarity for each column is shown at the bottom. 𝜎 and a are indicated. (G)

Histograms of the cosine similarity as a function of 𝜎 and a.

Fig. S13. Batch variability of scEU-seq experiments in organoids and selection of genes.

(A) UMAPs of cells belonging to different EU treatments in the organoid dataset; n = 660

for chase 0 min, n = 821 for chase 45 min, n = 646 for chase 360 min, n = 1373 for pulse

120 min and n = 331 for the DMSO control. (B) Scatter plot of the coefficient of variation

(CV) as a function of the mean expression level of all detected genes (n=9,157). Red dots

indicate highly variable preselected genes (n = 1,033). (C) Scatter plot of fold changes in

expression against mean expression levels of the preselected 1,033 genes. Genes that show

differential expression are highlighted in red (n=295 genes), gray dots mark genes that

were discarded upon manual inspection. Three marker genes (Apoa1, Lyz1 and Lgr5) are

highlighted, in yellow, green and blue respectively.

Fig. S14. Controls for the analysis of the differentiation trajectories of intestinal organoids.

(A) Monocle 2 analysis and derivation of branches 1, 2, and 3. (B) UMAP showing branch

3 of the monocle analysis. (C) The frequency of cells along the monocle 2 trajectory.

Values and cells below the threshold at 3.5 (left dashed line) were used for further analysis

and to estimate the differentiation time of branch 1. Similarly, values and cells to the right

of the threshold at 6.5 (right dashed line) were used for further analysis and to estimate the

differentiation time of branch 2. (D) Histograms showing estimated synthesis rate and

degradation rate constants. Gray dashed lines indicate the threshold used to discard global

outliers. (E) Calculated degradation rates (left panels), synthesis rates (middle), and

expression levels (right) along the estimated pseudo time (h) of differentiation for five

example genes of the secretory lineage (branch 1). Red lines represent the median rate or

level used for further analysis. (F) As in E but for five example genes of the enterocyte

lineage (branch 2).

Fig. S15. Regulatory strategies of genes during intestinal organoid differentiation. (A) Heat

maps showing the observed (left panels) and predicted (right panels) normalized expression

levels for the differentiation branches 1 and 2, respectively. Genes are clustered according

to their different regulatory strategies and expression levels. Rightmost panels indicate the

$r^2$ for the predicted vs the observed expression in branches 1 and 2. (B) Network

representation of genes of group A, B and D (see Fig. 3) highlighting functional GO term

annotation enrichments. Blue edges link a gene to a strategy group, and gray edges link

genes that share enriched GO term annotations. (C) UMAPs showing the sum of labeled

UMI counts for genes in strategy group A (top panels) or group B (bottom panels) for the

different experimental time points as indicated.

Additional Data Table S1

Data related to the regulatory strategy analysis during cell-cycle progression. Related to

Fig. 2.

Additional Data Table S2

Differential gene expression analysis of organoid individual clusters for identification of

cell types in intestinal organoids. Related to Fig. 3.

Additional Data Table S3

Differential gene expression analysis for identification of genes involved in organoid

differentiation. Related to Fig. 3.

Additional Data Table S4

Data related to the regulatory strategy analysis during organoid differentiation. Related

to Fig. 3.

References and Notes

1. B. Schwalb, M. Michel, B. Zacher, K. Frühauf, C. Demel, A. Tresch, J. Gagneur, P. Cramer,

TT-seq maps the human transient transcriptome. Science 352, 1225–1228 (2016).

doi:10.1126/science.aad9841 Medline

2. M. Rabani, R. Raychowdhury, M. Jovanovic, M. Rooney, D. J. Stumpo, A. Pauli, N. Hacohen,

A. F. Schier, P. J. Blackshear, N. Friedman, I. Amit, A. Regev, High-resolution

sequencing and modeling identifies distinct dynamic RNA regulatory strategies. Cell 159,

1698–1710 (2014). doi:10.1016/j.cell.2014.11.015 Medline

3. O. Shalem, O. Dahan, M. Levo, M. R. Martinez, I. Furman, E. Segal, Y. Pilpel, Transient

transcriptional responses to stress are generated by opposing effects of mRNA production

and degradation. Mol. Syst. Biol. 4, 223 (2008). doi:10.1038/msb.2008.59 Medline

4. S. C. Little, M. Tikhonov, T. Gregor, Precise developmental gene expression arises from

globally stochastic transcriptional activity. Cell 154, 789–800 (2013).

doi:10.1016/j.cell.2013.07.025 Medline

5. H. Tani, R. Mizutani, K. A. Salam, K. Tano, K. Ijiri, A. Wakamatsu, T. Isogai, Y. Suzuki, N.

Akimitsu, Genome-wide determination of RNA stability reveals hundreds of short-lived

noncoding transcripts in mammals. Genome Res. 22, 947–956 (2012).

doi:10.1101/gr.130559.111 Medline

6. M. Rabani, J. Z. Levin, L. Fan, X. Adiconis, R. Raychowdhury, M. Garber, A. Gnirke, C.

Nusbaum, N. Hacohen, N. Friedman, I. Amit, A. Regev, Metabolic labeling of RNA

uncovers principles of RNA production and degradation dynamics in mammalian cells.

Nat. Biotechnol. 29, 436–442 (2011). doi:10.1038/nbt.1861 Medline

7. A. Raghavan, R. L. Ogilvie, C. Reilly, M. L. Abelson, S. Raghavan, J. Vasdewani, M.

Krathwohl, P. R. Bohjanen, Genome-wide analysis of mRNA decay in resting and

activated primary human T lymphocytes. Nucleic Acids Res. 30, 5529–5538 (2002).

doi:10.1093/nar/gkf682 Medline

8. T. Hashimshony, F. Wagner, N. Sher, I. Yanai, CEL-Seq: Single-cell RNA-Seq by multiplexed

linear amplification. Cell Rep. 2, 666–673 (2012). doi:10.1016/j.celrep.2012.08.003

Medline

9. D. A. Jaitin, E. Kenigsberg, H. Keren-Shaul, N. Elefant, F. Paul, I. Zaretsky, A. Mildner, N.

Cohen, S. Jung, A. Tanay, I. Amit, Massively parallel single-cell RNA-seq for marker-

free decomposition of tissues into cell types. Science 343, 776–779 (2014).

doi:10.1126/science.1247651 Medline

10. A. B. Rosenberg, C. M. Roco, R. A. Muscat, A. Kuchina, P. Sample, Z. Yao, L. T. Graybuck,

D. J. Peeler, S. Mukherjee, W. Chen, S. H. Pun, D. L. Sellers, B. Tasic, G. Seelig, Single-

cell profiling of the developing mouse brain and spinal cord with split-pool barcoding.

Science 360, 176–182 (2018). doi:10.1126/science.aam8999 Medline

11. D. Grün, A. Lyubimova, L. Kester, K. Wiebrands, O. Basak, N. Sasaki, H. Clevers, A. van

Oudenaarden, Single-cell messenger RNA sequencing reveals rare intestinal cell types.

Nature 525, 251–255 (2015). doi:10.1038/nature14966 Medline

12. E. Z. Macosko, A. Basu, R. Satija, J. Nemesh, K. Shekhar, M. Goldman, I. Tirosh, A. R.

Bialas, N. Kamitaki, E. M. Martersteck, J. J. Trombetta, D. A. Weitz, J. R. Sanes, A. K.

Shalek, A. Regev, S. A. McCarroll, Highly Parallel Genome-wide Expression Profiling of

Individual Cells Using Nanoliter Droplets. Cell 161, 1202–1214 (2015).


13. A. M. Klein, L. Mazutis, I. Akartuna, N. Tallapragada, A. Veres, V. Li, L. Peshkin, D. A.

Weitz, M. W. Kirschner, Droplet barcoding for single-cell transcriptomics applied to

embryonic stem cells. Cell 161, 1187–1201 (2015). doi:10.1016/j.cell.2015.04.044

Medline

14. B. Pijuan-Sala, J. A. Griffiths, C. Guibentif, T. W. Hiscock, W. Jawaid, F. J. Calero-Nieto, C.

Mulas, X. Ibarra-Soria, R. C. V. Tyser, D. L. L. Ho, W. Reik, S. Srinivas, B. D. Simons,

J. Nichols, J. C. Marioni, B. Göttgens, A single-cell molecular map of mouse gastrulation

and early organogenesis. Nature 566, 490–495 (2019). doi:10.1038/s41586-019-0933-9

Medline

15. See supplementary materials.

16. T. Zerjatke, I. A. Gak, D. Kirova, M. Fuhrmann, K. Daniel, M. Gonciarz, D. Müller, I.

Glauche, J. Mansfeld, Quantitative cell cycle analysis based on an endogenous all-in-one

reporter for cell tracking and classification. Cell Rep. 19, 1953–1966 (2017).

doi:10.1016/j.celrep.2017.05.022 Medline

17. D. W. Huang, B. T. Sherman, R. A. Lempicki, Systematic and integrative

analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57

(2009). doi:10.1038/nprot.2008.211 Medline

18. L. Krenning, F. M. Feringa, I. A. Shaltiel, J. van den Berg, R. H. Medema, Transient

activation of p53 in G2 phase is sufficient to induce senescence. Mol. Cell 55, 59–72

(2014). doi:10.1016/j.molcel.2014.05.007 Medline

19. T. Hashimshony, N. Senderovich, G. Avital, A. Klochendler, Y. de Leeuw, L. Anavy, D.

Gennert, S. Li, K. J. Livak, O. Rozenblatt-Rosen, Y. Dor, A. Regev, I. Yanai, CEL-Seq2:

Sensitive highly-multiplexed single-cell RNA-Seq. Genome Biol. 17, 77 (2016).

doi:10.1186/s13059-016-0938-8 Medline

20. T. Sato, R. G. Vries, H. J. Snippert, M. van de Wetering, N. Barker, D. E. Stange, J. H. van

Es, A. Abo, P. Kujala, P. J. Peters, H. Clevers, Single Lgr5 stem cells build crypt-villus

structures in vitro without a mesenchymal niche. Nature 459, 262–265 (2009).

doi:10.1038/nature07935 Medline

21. H. Tian, B. Biehs, S. Warming, K. G. Leong, L. Rangell, O. D. Klein, F. J. de Sauvage, A

reserve stem cell population in small intestine renders Lgr5-positive cells dispensable.

Nature 478, 255–259 (2011). doi:10.1038/nature10408 Medline

22. X. Qiu, Q. Mao, Y. Tang, L. Wang, R. Chawla, H. A. Pliner, C. Trapnell, Reversed graph

embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017).

doi:10.1038/nmeth.4402 Medline

23. F. Xie, X. Ding, Q. Y. Zhang, An update on the role of intestinal cytochrome P450 enzymes

in drug disposition. Acta Pharm. Sin. B 6, 374–383 (2016).

doi:10.1016/j.apsb.2016.07.012 Medline





24. S. Geula, S. Moshitch-Moshkovitz, D. Dominissini, A. A. F. Mansour, N. Kol, M. Salmon-

Divon, V. Hershkovitz, E. Peer, N. Mor, Y. S. Manor, M. S. Ben-Haim, E. Eyal, S.

Yunger, Y. Pinto, D. A. Jaitin, S. Viukov, Y. Rais, V. Krupalnik, E. Chomsky, M. Zerbib,

I. Maza, Y. Rechavi, R. Massarwa, S. Hanna, I. Amit, E. Y. Levanon, N. Amariglio, N.

Stern-Ginossar, N. Novershtern, G. Rechavi, J. H. Hanna, m6A mRNA methylation

facilitates resolution of naïve pluripotency toward differentiation. Science 347, 1002–

1006 (2015). doi:10.1126/science.1261417 Medline

25. P. J. Batista, B. Molinie, J. Wang, K. Qu, J. Zhang, L. Li, D. M. Bouley, E. Lujan, B. Haddad,

K. Daneshvar, A. C. Carter, R. A. Flynn, C. Zhou, K.-S. Lim, P. Dedon, M. Wernig, A. C.

Mullen, Y. Xing, C. C. Giallourakis, H. Y. Chang, m(6)A RNA modification controls cell

fate transition in mammalian embryonic stem cells. Cell Stem Cell 15, 707–719 (2014).

doi:10.1016/j.stem.2014.09.019 Medline

26. M. J. Muraro, G. Dharmadhikari, D. Grün, N. Groen, T. Dielen, E. Jansen, L. van Gurp, M.

A. Engelse, F. Carlotti, E. J. P. de Koning, A. van Oudenaarden, A single-cell

transcriptome atlas of the human pancreas. Cell Syst. 3, 385–394.e3 (2016).

doi:10.1016/j.cels.2016.09.002 Medline

27. D. Grün, L. Kester, A. van Oudenaarden, Validation of noise models for single-cell

transcriptomics. Nat. Methods 11, 637–640 (2014). doi:10.1038/nmeth.2930 Medline

28. M. I. Love, W. Huber, S. Anders, Moderated estimation of fold change and dispersion for

RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014). doi:10.1186/s13059-014-

0550-8 Medline

29. L. McInnes, J. Healy, J. Melville, UMAP: Uniform Manifold Approximation and Projection

for Dimension Reduction. arXiv:1802.03426 [stat.ML] (9 February 2018).

30. E. Becht, L. McInnes, J. Healy, C. A. Dutertre, I. W. H. Kwok, L. G. Ng, F. Ginhoux, E. W.

Newell, Dimensionality reduction for visualizing single-cell data using UMAP. Nat.

Biotechnol. (2018). Medline

31. B. T. Sherman, W. Huang, Q. Tan, Y. Guo, S. Bour, D. Liu, R. Stephens, M. W. Baseler, H.

C. Lane, R. A. Lempicki, DAVID Knowledgebase: A gene-centered database integrating

heterogeneous gene annotation resources to facilitate high-throughput gene functional

analysis. BMC Bioinformatics 8, 426 (2007). doi:10.1186/1471-2105-8-426 Medline

32. D. Berchtold, N. Battich, L. Pelkmans, A systems-level study reveals regulators of

membrane-less organelles in human cells. Mol. Cell 72, 1035–1049.e5 (2018).

doi:10.1016/j.molcel.2018.10.036 Medline

33. S. C. van den Brink, F. Sage, Á. Vértesy, B. Spanjaard, J. Peterson-Maduro, C. S. Baron, C.

Robin, A. van Oudenaarden, Single-cell sequencing reveals dissociation-induced gene

expression in tissue subpopulations. Nat. Methods 14, 935–936 (2017).

doi:10.1038/nmeth.4437 Medline

34. S. C. Bendall, K. L. Davis, A. D. Amir, M. D. Tadmor, E. F. Simonds, T. J. Chen, D. K.

Shenfeld, G. P. Nolan, D. Pe’er, Single-cell trajectory detection uncovers progression and

regulatory coordination in human B cell development. Cell 157, 714–725 (2014).


35. V. Shahrezaei, P. S. Swain, Analytical distributions for stochastic gene expression. Proc.

Natl. Acad. Sci. U.S.A. 105, 17256–17261 (2008). doi:10.1073/pnas.0803850105 Medline

36. W. Sun, Q. Gao, B. Schaefke, Y. Hu, W. Chen, Pervasive allele-specific regulation on RNA

decay in hybrid mice. Life Sci. Alliance 1, e201800052 (2018).

doi:10.26508/lsa.201800052 Medline

37. H. Gehart, J. H. van Es, K. Hamer, J. Beumer, K. Kretzschmar, J. F. Dekkers, A. Rios, H.

Clevers, Identification of enteroendocrine regulators by real-time single-cell

differentiation mapping. Cell 176, 1158–1173.e16 (2019). doi:10.1016/j.cell.2018.12.029

Medline

38. J. A. Schofield, E. E. Duffy, L. Kiefer, M. C. Sullivan, M. D. Simon, TimeLapse-seq: Adding

a temporal dimension to RNA sequencing through nucleoside recoding. Nat. Methods 15,

221–225 (2018). doi:10.1038/nmeth.4582 Medline
