supplementary material for a. champhekar et...
TRANSCRIPT
SUPPLEMENTARY MATERIAL FOR A. CHAMPHEKAR ET AL.
A. Supplementary Figures S1-S8 and Figure Legends
B. Supplementary Methods and References
C. Supplementary Tables S1-S6
Table S1: DESeq analysis
Table S2: PU.1 binding sites and analysis
Table S3: PCA
Table S4: GSEA
Table S5: GO/KEGG analysis
Table S6: List of qPCR primers
CreBcl-xL
_
+++
+_
__
Chan
ge in
% o
f eac
h po
pula
tion
rela
tive
to d
ay 0
val
ue
Day 0Day 0 Day 4 Day 6
Day 4 Day 6B6 :
Spi1fl/fl :
0.01
0.1
1
1015 * * * *
* * * * * * * *
A.
B.
0 102 10 3 10 4 105
0
102
103
104
105 14.1 32.4
39.713.7
0 102 10 3 10 4 105
0
102
103
104
105 21.2 38
32.58.38
0 102 10 3 10 4 105
0
102
103
104
105 38.6 35.2
20.85.36
0 102 10 3 10 4 105
0
102
103
104
105 22.2 35.8
33.28.82
0 102 10 3 10 4 105
0
102
103
104
105 16.9 12.3
18.352.4
0 102 10 3 10 4 105
0
102
103
104
105 18.2 30
32.719.1
0 102 10 3 10 4 105
0
102
103
104
105 28.8 34.8
28.87.72
0 102 10 3 10 4 105
0
102
103
104
105 19.1 10.6
30.439.9
0 102 10 3 10 4 105
0
102
103
104
105 16.2 52.6
29.21.94
0 102 10 3 10 4 105
0
102
103
104
105 12 48.7
34.94.4
0 102 10 3 10 4 105
0
102
103
104
105 15.3 41.2
39.24.32
0 102 10 3 10 4 105
0
102
103
104
105 12.3 52.4
30.25.12
0 102 10 3 10 4 105
0
102
103
104
105 14.4 38.7
41.85.18
0 102 10 3 10 4 105
0
102
103
104
105 13.5 35
42.39.23
0 102 10 3 10 4 105
0
102
103
104
105 27.2 17.3
30.225.2
0 102 10 3 10 4 105
0
102
103
104
105 44.9 16.3
12.826.1C
D44
Spi1fl/flB6
Day 4
Cre+ Bcl-xL+
Cre+ Bcl-xL-
Cre- Bcl-xL+
Cre- Bcl-xL-
CD25
Day 6 Day 4 Day 6
Champhekar_Fig.S1
0 50K 100K 150K 200K 250K
0
50K
100K
150K
200K
250K
0 50K 100K 150K 200K 250K
0
102
103
104
105
91.10 50K 100K 150K 200K 250K
0
102
103
104
105
98.5
0 102 103 104 105
0
102
103
104
105 22.3
B6
0 50K 100K 150K 200K 250K
0
50K
100K
150K
200K
250K
0 50K 100K 150K 200K 250K
0
102
103
104
105
86.20 50K 100K 150K 200K 250K
0
102
103
104
105
97.6
0 102 103 104 105
0
102
103
104
105 21.2
0 102 103 104 105
0
102
103
104
105 10.7 35.1
48.55.690 102 103 104 105
0
102
103
104
105 17 36.9
39.16.98
Spi1fl/fl
68.5
71.3
x= FSC, y= SSC x= FSC, y= Lin+7-AAD x= FSC, y= CD45 x= Cre, y= Bcl-xL
B6 Spi1fl/fl
CD25
c-Ki
t
DN1 DN2a
DN2b
A
B C
Rela
tive
expr
essio
n of
WT
Spi1
mRN
A
DN1 DN2a DN2b0
20
40
60
80
100
120 Spi1fl/flB6
Spi1fl/flB6
0
20
40
60
80
100
120
DN1 DN2a DN2b
Geo
met
ric m
ean
of %
yie
ld
rela
tive
to B
6 C
re+
cells
Champhekar_Fig.S2
0
1
2 3 4 5
c-Kit
Zfpm1
Zbtb16
Il2rb
Sox13Id2
Gata2
Fold
exp
ress
ion
over
B6
DN
2a
A
B6 DN2a B6 DN2b Spi1-/- DN2a Spi1-/- DN2b
Id3
6
Fold
exp
ress
ion
over
B6
DN
2a
0
1
1.5
2
CD44CD11
b
Bcl11a
Mef2c
Hhex
0.5
* # *
0 1 2 3
Notch1
Nrarp
Mint
Deltex
1
Fold
exp
ress
ion
over
B6
DN
2a
4 5 6 7
Hes1
Hes5
Nkap
Notch3
* * * * *# # # *
0
1
2
3
4
Fold
exp
ress
ion
over
B6
DN
2a
5
Ets2Gata
3
HEBalt
HEBcan
Ikaros Myb Tc
f7CD25
Runx1 Lc
kBcl1
1b
* * #
−2
−1
0
1
2Itg
amBc
l11a
Il2rb
Mef
2c Id3
Lef1
Il7r
Cd4
4Id
2Po
u6f1
Cd3
eZe
b1N
otch
1R
unx3
Il2ra
Hhe
xPt
cra
Lmo2
Spen
Tcfe
2aN
otch
3R
unx1 Lck
Bcl1
1bN
kap
Ets2
Tal1
HEB
Can
Rag
1G
ata3
Ets1
c-Ki
tH
es1
Dtx
1zf
pm1
Hes
5Ze
b2G
fi1N
rarp
Myb
Ikzf
1Tc
f7So
x13
zbtb
16H
EBAl
tG
ata2
Log2
(Spi1-
/- DN
1/B
6 D
N1
rela
tive
expr
essi
on v
alue
s)
Cre only
Cre and Bcl-xL
Data from DN1cells expressing
B C
D E
Champhekar_Fig.S3
0 102 103 104 105
0
102
103
104
105 1.03 10.9
22.865.20 102 103 104 105
0
102
103
104
105 3.1 31.1
16.849
0 102 103 104 105
0
102
103
104
105 9.84 63.6
14.212.40 102 103 104 105
0
102
103
104
105 2.86 51.2
25.420.5
CD
122
NK1.1
B6 Spi1fl/fl
Cre+
Cre+Bcl-xL+
0
10
20
30
40
50
60
70
80
90
Expt #1 Expt #2
% C
D12
2+ N
k1.1
+ c
ells
of t
heto
tal C
D45
+ C
D25
- pop
ulat
ion B6 Cre+ Bcl-xL-
B6 Cre+ Bcl-xL+ Spi1fl/fl Cre+ Bcl-xL- Spi1fl/fl Cre+ Bcl-xL+
A
B
Champhekar_Fig.S4
0.1
1
3
10
30
0.03
0.3
Contro
l
2a 2bPU.1
WT
PU.1 ∆D
EQ
PU.1-ETS
PU1-Eng
Mef2cLmo2
Pou6f1Cd11bBcl11a
HhexCd44
Id2Zeb2Zeb1
Sclc-Kit
IL-7RaRunx3Sox13
Zbtb16Gata2
Il2rbId3
Zfpm1Cd25Hes5
Tcfe2aNotch3Runx1
LckNkapMyb
SpenEts1Ets2
NrarpHEBcan
Hes1Gfi1
IkarosHEBaltGata3Bcl11bNotch1
Lef1Rag1
Deltex1PtcraCd3e
2a 2b 2a 2b 2a 2b 2a 2b Contro
l
PU.1-Eng
PU.1-Eng
W21
5G
DN2 cells
Mef2cLmo2
Pou6f1Cd11bBcl11a
HhexCd44
Id2Zeb2
Sclc-Kit
IL-7RaRunx3Sox13Zbtb16Gata2
Il2rbId3
Zfpm1Cd25Hes5
Tcfe2aNotch3Runx1
LckNkapMyb
SpenEts1Ets2Tcf7
NrarpHEBcan
Hes1Gfi1
IkarosHEBaltGata3Bcl11bNotch1
Lef1Rag1
Deltex1pTa
Cd3e
e14.5 Bcl2tg FL cells
Infect with PU.1-Eng
constructs
Sort infected DN subsets for qPCR
OP9-DL1 culture
OP9-DL1 culture
4d 1d
A
B C
Champhekar_Fig.S5
A BLZRS-EV PU.1 WT PU.1-Eng LZRS-EV DN2a DN2b
DN2a DN2b
DN2a DN2b
PU.1-WT PU.1-Eng
Rel
ativ
e ex
pres
sion
0.001
0.01
0.1
1
Rel
ativ
e ex
pres
sion
Cd3e
Ets1Gata
3Gfi1
Hes1
Lef1
Rag1
Tcf7 Cd3e
Ets1Gata
3Gfi1
Hes1
Lef1
Rag1
Tcf70.0001
0.001
0.01
0.1
1
0.0001
0.01
0.1
1
Rel
ativ
e ex
pres
sion
0.001
0.0001
0.01
0.1
1
Rel
ativ
e ex
pres
sion
0.001
Zfpm1
Gata2 Id3 Il2
rbZfpm
1Gata
2 Id3 Il2rb
1 x 10-5
1 x 10-1
1 x 10-2
1 x 10-3
1 x 10-4
1 x 10-7
1 x 10-6
1 x 100
Rel
ativ
e ex
pres
sion
Bcl11a
Cd11b
Mef2c
Pou6f1
0.0001
0.01
0.1
1
0.001
Rel
ativ
e ex
pres
sion
Bcl11a
Cd11b
Mef2c
Pou6f1
C D
E F
G
EV PU.1-ENG PU.1-ETS PU.1-ENG/EV PU.1-ETS/EV Expt 1 Expt 2 Expt 1 Expt 2Ccnd1 1.2 1.1 0.7 0.90 0.53 1 1 1 1Ccnd2 46.5 49.6 37.7 1.07 0.81 1 1 1 1Ccnd3 111.9 132.7 170.7 1.19 1.53 1 1 1 1Ccne1 15.8 13.3 17.0 0.85 1.08 1 1 1 1Ccne2 21.0 11.2 22.3 0.54 1.06 1 1 1 1Cdk1 86.6 107.7 107.1 1.24 1.24 1 1 1 1Cdk2 41.0 40.1 69.7 0.98 1.70 0.49 1 1 1Cdk4 351.4 467.1 346.8 1.33 0.99 1 1 1 1Cdk5 10.1 7.5 8.7 0.74 0.86 1 1 1 1Cdk6 37.8 35.3 26.0 0.93 0.69 1 1 1 1Cdkn1a 119.3 88.1 159.7 0.74 1.34 1 1 1 1Cdkn1b 46.6 28.3 52.6 0.61 1.13 1 1 1 1Cdkn1c 1.8 2.4 12.2 1.39 6.97 0.04 0.54 1 1Cdkn2c 4.3 3.7 4.5 0.87 1.05 0.91 1 1 1Cdkn2d 0.3 1.3 1.3 3.90 3.79 1 1 1 1Cdkn3 7.7 6.1 9.1 0.79 1.18 1 1 1 1Rb1 19.1 12.8 19.7 0.67 1.03 1 1 1 1Trp53 116.2 131.3 130.6 1.13 1.12 1 1 1 1
Fold effect EV vs PU.1-ENG (padj) EV vs PU.1-ETS (padj) Gene name Average fpkm (2 expts)
Champhekar_Fig.S6
LZRS-EV
PU.1-ETS
0 50K 100K 150K 200K 250K
0
50K
100K
150K
200K
250K
78.30 50K 100K 150K 200K 250K
0
102
103
104
105
94.6
0 102 103 104 105
0
102
103
104
105 32.5
0 102 103 104 105
0
102
103
104
105 8.08 49.8
37.64.48
0 50K 100K 150K 200K 250K
0
50K
100K
150K
200K
250K
75.60 50K 100K 150K 200K 250K
0
102
103
104
105
93.9
0 102 103 104 105
0
102
103
104
105 43.4
0 102 103 104 105
0
102
103
104
105 9.39 54.4
324.21
0 50K 100K 150K 200K 250K
0
50K
100K
150K
200K
250K
68.10 50K 100K 150K 200K 250K
0
102
103
104
105
93.3
0 102 103 104 105
0
102
103
104
105 23.5
0 102 103 104 105
0
102
103
104
105 11 51.8
32.44.72
PU.1-Eng
x= FSC, y=SSC x= FSC, y=Lin+7-AAD x= GFP, y=CD45 x= CD25, y=c-Kit
DN2
Champhekar_Fig.S7
LZRS-EV PU.1-WT PU.1-Eng
DN2a DN2b
DN2a DN2b
DN2aPU.1-Eng DN2bLZRS-EV PU.1-WT
Mef2c
Lmo2
Bcl11a Scl Erg
c-Kit
Mycn
Mef2c
Lmo2
Bcl11a Scl Erg
c-Kit
Mycn
1
0.1
0.01
0.001
1
0.1
0.01
0.001
0.0001
0.00001
Rel
ativ
e ex
pres
sion
Rel
ativ
e ex
pres
sion
Champhekar_Fig.S8
2
LEGENDS FOR SUPPLEMENTARY FIGURES
Figure S1 – Deletion of PU.1 in c-Kit+ CD27+ FLPs results in impaired DN progression and
poor survival and recovery of early DN-stage T cells. E14.5 B6 and Spi1fl/fl FLPs were co-
infected with Cre- and Bcl-xL-carrying retroviruses or empty vector controls, and cultured on
OP9-DL1 cells. A: DN progression was assayed on days 4 and 6 of culture by analysis of cells
still expressing both retroviruses. FACS plots are representative of results from 2 independent
experiments. Red and brown arrows indicate the comparable populations between B6 and Spi1fl/fl
cultures on day 4 and day 6 respectively. B: Plot showing change in the proportion of each
double-infected population (GFP+ NGFR+) relative to its day 0 value. For example, a decline
from 50% of the day 0 population to 5% of the day 6 population would be reported as “0.1”.
Note the log scale. Each bar represents mean values from 2 independent experiments while error
bars indicate range. * Indicates significant increase or decrease of a cell population compared to
its day 0 value at p < 0.05.
Figure S2 – Impacts of PU.1 deletion in cells that have already entered the T-cell development
pathway. A: Sorting B6 and Spi1-/- DN subsets. E14.5 B6 and Spi1fl/fl FLPs were co-cultured with
OP9-DL1 as described in the methods section. Cells were harvested on day 3 of culture,
transduced only with Cre or with Cre- and Bcl-xL-carrying retroviruses, and put back in fresh
OP9-DL1 co-cultures. FLDN cells were harvested after 2 days and Lin- 7-AAD- CD45+ Cre+
Bcl-xL+ DN subsets were sorted using the gates indicated in the lower panels. FACS plots show
the gating strategy and phenotype of retrovirally-transduced FLDN cells on the day of the sort.
Numbers in the boxes show percentage in each quadrant or gated population. B: Spi1f/fl is rapidly
deleted by Cre transduction. Cre mediated deletion of PU.1 in DN subsets sorted in A was
determined immediately after the sort by measuring the undeleted WT message by qPCR. Bars
represent mean expression values while error bars indicate 1 standard deviation. The graph
summarizes data from 2 independent experiments. C: PU.1 supports DN proliferation in early T-
cells. Cre+ B6 and Spi1fl/fl DN1, DN2a and DN2b cells were sorted as in Fig 2F. Equal numbers
of sorted Cre+ B6 and Spi1fl/fl DN subsets were cultured on fresh OP9-DL1 cells, harvested after
3
5 days and FLDN cells were counted to determine the yield from each seeded population. Cell
numbers were normalized to B6 samples for each subset and the geometric mean was plotted as
shown in the graph. Error bars represent 1 standard deviation around the geometric mean. Plots
are calculated from 3 independent experiments for DN1 and 4 independent experiments for
DN2a and DN2b each.
Figure S3 – Effect of deletion of PU.1 on gene expression in DN1, DN2a and DN2b cells. B6
and PU.1 KO DN subsets cells were sorted as described in fig S2 and expression of the indicated
genes relative to Actin was determined using qPCR. A: Gene expression changes in DN1 cells
following PU.1 deletion. Each point on the plot represents log2-transformed ratio of Spi1-/- and
B6 expression values for a particular gene in DN1 cells from 4-5 independent experiments. The
red dots are expression ratios from experiments in which DN1 cells were infected with Cre alone
(to delete Spi1), while the blue dots represent experiments in which Bcl-xL was co-expressed
with Cre to improve the recovery of Spi1-/- DN1 cells. Open triangles indicate the mean ratio
while the black bars represent one standard deviation. B-E: Gene expression changes in B6 and
Spi1-/- DN2a and 2b subsets sorted as in Fig 3. Actin-normalized expression values averaged
from 2 independent experiments are expressed as fold-change relative to B6 DN2a values. The
resulting data were plotted for B: phase 1 and alternate lineage genes; C: T-lineage genes; D:
components of the Notch-signaling pathway and E: alternate lineage genes. Error bars represent
one standard deviation. Significant differences at p < 0.05 in the DN2a stages are indicated with
an asterisk (*) while those in the DN2b cells are indicated by the pound (#) sign. Similar results
were obtained in two additional experiments in which Bcl-xL was not used. However, without
Bcl-xL the DN1 subsets did not show as complete deletion of PU.1, suggesting selection for
some retention of PU.1 activity.
Figure S4 – PU.1 restricts access to the NK lineage fate in developing thymocytes. E14.5 FLPs
were transduced with Cre and Bcl-xL carrying retroviruses and co-cultured with OP9-DL1 in
media supplemented with 5 ng/ml each of IL-7 and Flt3L to initiate T-cell development. Cultures
were not supplemented with any other cytokines (such as IL-15) that are known to be important
for NK cell survival, proliferation and development. Cells were harvested on day 6 and the
fraction of NK cells in each of the indicated populations was determined by flow cytometry. NK
4
cells were defined by the CD45+ CD25- CD122+ NK1.1+ phenotype. A: FACS plots showing
NK cell percentage in the total CD45+ CD25- population. B: Graph showing the proportion of
NK cells in the indicated samples from two independent experiments.
Figure S5 – PU.1 indirectly represses T-lineage gene expression. A: Bcl2tg FLDN cells were
processed as shown in the flowchart and sorted to purify DN2a and DN2b cells in B, or DN2
cells in C, and used for gene expression analysis by qPCR. B-C: Retroviral supernatants used to
express various constructs are indicated on top of each lane. Sorted DN stage cells from 2-3
independent experiments were pooled to obtain each set of qPCR data. Average, Actin-
normalized log10 expression values from 2 such pooled datasets were used to generate the
heatmaps shown in the figure. All values are row normalized to the control DN2a sample in B
and to the control DN2 sample in C. The common scale on the right denotes 30-fold up- (dark
red) and 30-fold down-regulation (dark blue) of gene expression. Bold face and arrows in panel
B designate Notch-activated target genes, as characterized previously (Del Real and Rothenberg
2013).
Figure S6 – Gene expression changes in response to PU.1-Eng expression. Bcl2tg FLPs were
cultured on OP9-DL1 and transduced to obtain EV, PU.1-Eng and PU.1 WT expressing DN1,
DN2a and DN2b cells as shown in fig 5B. Sorted cells were used to analyze the expression of
several genes as shown in the heatmap in fig 5C and D. Here the relative gene expression levels
over two independent experiments are shown for DN1 (panels, A, C and E) and DN2a and DN2b
cells (panels B, D and F) for A-B: T-cell genes C-D: alternate lineage genes and E-F: phase 1
and alternate lineage genes. G: RNAseq experiments and data analysis were performed as
described in materials and methods section and the table shows the effect of PU.1-Eng and PU.1-
ETS on the expression of selected cell cycle genes from two independent experiments. The last 4
columns show the padj values obtained from comparing the expression of cell cycle genes in
indicated samples using the DESeq package. Note that none of the cell cycle genes were
significantly differentially expressed (padj ≤ 0.05) in both RNAseq datasets.
5
Figure S7 – Sorting DN2 cells for RNA-seq experiments. OP9-DL1 co-cultures were seeded with
e14.5 Bcl2tg FLPs and retrovirally-transduced on day 4 with the constructs indicated on the left.
Cells were harvested the next day and Lin- 7-AAD- CD45+ GFP+ c-Kit+ CD25+ DN2 cells
were sorted using the gating strategy shown in the figure. GFP is a marker of retroviral infection.
Numbers in the boxes show percentage in each quadrant or gated population. PU.1 regulates
CD45 expression in myeloid cells (Anderson et al 2001) and here PU.1-Eng, an obligate
repressor construct, is seen to downregulate its surface expression on infected FLDN cells.
Figure S8 – Changes in expression of phase 1 genes in response to PU.1 WT and PU.1-Eng
expression. OP9-DL1 co-cultures were seeded with e14.5 Bcl2tg FLPs and retrovirally-
transduced with the indicated constructs after 4 days as shown in fig 5B. DN1, DN2a and DN2b
subsets were sorted the next day and expression of the indicated phase 1 genes was determined
by qPCR. Bars represent average, Actin-normalized expression of each gene from 2 independent
experiments and error bars indicate 1 standard deviation.
SUPPLEMENTARY METHODS
Purification of fetal liver-derived precursors, cell culture and flow cytometry
Fetal liver cells from various crosses indicated in the methods section “Cell cultures” were used
to initiate OP9-DL1 cultures as indicated for each experiment. Briefly, male and female mice
were co-housed on the day of mating and the time until the next morning was counted as 0.5
days. After 14 days, pregnant females were sacrificed and fetal livers were isolated. Biotin
conjugated antibodies against Ter-119, Gr-1, NK1.1, CD19 and F4/80 (eBioscience) were used
to stain lineage positive cells, which were then depleted using streptavidin bound magnetic beads
(Miltenyi) per manufacturer’s protocol. The enriched fetal liver precursors (FLPs) were either
used directly to start OP9-DL1 co-cultures or frozen at -80°C for future use. Frozen FLPs were
first thawed in OP9 medium (α-MEM, 20% FBS, 50 µM β-ME, Pen-Step-Glutamine)
supplemented with stem cell factor (SCF, 1 ng/ml) and IL-7 and Flt3L (5 ng/ml each) for a day
before starting OP9-DL1 co-cultures, which were without SCF. For some experiments, depleted
6
fetal liver cells were used to sort c-Kit+ CD27+ T-cell precursors on the BD FACSAria sorter
using 6-color flow cytometry. OP9-DL1 cultures were harvested at the indicated times, stained
for DN progression and cytokine receptor expression, and analyzed with the MACSQuant flow
cytometer (Miltenyi).
Retroviral infection and sorting
Fetal liver precursors were isolated from gestational day 14.5 embryos as above and co-cultured
with OP9-DL1 cells in the presence of IL-7 and Flt3L (5 ng/ml each) for 3-4 days to obtain
FLDN1, DN2a and DN2b cells. These cultures were harvested and pre-plated to obtain FLDN
cells free of OP9-DL1 stromal cells. Briefly, culture medium was removed and the OP9-DL1 co-
cultures were trypsinized for 5-7 min at 37°C. Fresh media with cytokines was added to these
cultures and the cells were transferred to 30 cm tissue culture dishes for 35-40 mins at 37°C to
allow the OP9-DL1 cells to re-adhere to the plate. The non-adherent FLDN cells were then
harvested and used for retroviral infection.
Retroviral supernatants were generated by transfecting Phoenix-Eco cells with the appropriate
retroviral constructs using Fugene 6 (Roche). Culture supernatants were collected 48 and 72
hours after transfection, filtered through a 0.45 µm filter and were either frozen at -80°C or used
directly to infect FLDN cells. In short, 24-well non tissue culture treated plates were prepared by
adding 500 µl of 30-40 µg/ml Retronectin solution/well and incubated at 4°C. The next day,
Retronectin was replaced by 500 µl retroviral supernatant and the plates were spun at 2000 x g
for 2 hours at 32°C. Retroviral supernatant was then replaced with FLDN cells from OP9-DL1
cultures, incubated at 37°C for 4 hours and transferred back on fresh OP9-DL1 plates. Cells were
harvested at indicated timepoints, and the retrovirally transduced (GFP+ or NGFR+) DN1 (Lin-
c-Kithi CD44hi CD25-), DN2a (Lin- c-Kithi CD44hi CD25+) and DN2b (Lin- c-Kit++ CD44+
CD25+) hematopoietic cells (all CD45+) were sorted using 6-color flow cytometry. Note that at
later stages of infection with PU.1-Eng, the CD45 gate had to be expanded due to the
downregulation of direct PU.1 target gene Ptprc (CD45)(Anderson et al. 2001).
Determination of Spi1fl/fl deletion efficiency
7
Efficiency of deletion of Spi1 exon 5 was assessed at the RNA level (cf. Fig. S2B), by
quantitating the level of transcripts in the cells detected with primers spanning exons 4 and 5
(Table S6) and comparing the level in sorted Cre+ cells from the floxed genotype with Cre+
sorted wildtype cells at the same developmental stage. This assay measured deletion rather than
altered transcriptional regulation, because cells with >80% deletion of Spi1 exon 5 still expressed
equal or more RNA than the controls from the 5’ exons of the Spi1 gene (data not shown).
RNA isolation and qPCR
Sorted FLDN populations were used to extract RNA with the RNeasy mini kit (Qiagen).
Typically, total RNA was eluted in 30 µl DEPC treated water and cDNA was synthesized using
SuperScript III Reverse Transcriptase (Invitrogen). An appropriate cDNA dilution was then used
for qPCR with SYBR® GreenerTM reagent (Life Technologies) and expression data was obtained
on an ABI 7900 HT fast qPCR machine. Sequences of PCR primers used were from published
work (David-Fung et al. 2009; Li et al. 2010; Yui et al. 2010; Del Real and Rothenberg 2013;
Scripture-Adams et al. 2014). Each reaction was run in triplicate, and the value for each
biological sample was taken as the average of these technical replicates. The resulting data were
normalized to Actin expression and mean values for each sample type from 2-3 experiments were
expressed as a heat map (Matlab) or plotted as log10-scale graphs. For statistical analysis, actin-
normalized gene expression values were log10-transformed and two-tailed Student’s t tests
(paired, equal variance) were performed to determine the significance of differences between
expression values. Significant differences in expression with a p-value < 0.05 are indicated with
an asterisk. Error bars in all plots represent one standard deviation, or range when experiments
had only two biological replicates.
Complete methods for transcriptome profiling and analysis of sequencing data
1. Read mapping and differential expression analysis
Total RNA was processed essentially as described (Zhang et al. 2012). cDNA libraries were
sequenced with the Illumina HiSeq 2000 sequencer following manufacturer’s protocols
8
(http://www.illumina.com) and produced between 11-14 million mapped 50bp single-end reads
per experiment. Sequenced reads were mapped onto mouse genome build NCBI37/mm9 using
TopHat (TopHat v2.0.6, http://tophat.cbcb.umd.edu) with settings ‘--library-type fr-unstranded --
no-novel-juncs’ and bowtie settings ‘-v 2 -k 40 -m 40’. HTseq (v0.6.0 , http://www-
huber.embl.de/users/anders/HTSeq/doc/overview.html)(Anders et al. 2015) was used to process
read alignments produced by TopHat. Read counts were obtained for each gene across all
samples using htseq-count with the '-S no -a 10' options and with the Ensembl gene model file
Mus_musculus.NCBIM37.66.gtf. The count data was analyzed for differential expression using
the DESeq package (v1.16.0, http://bioconductor.org/packages/release/bioc/html/DESeq.html)
(Anders and Huber 2010). Each replicate experiment was analyzed separately and a set of 168
genes with an adjusted p-value of =< 0.1 in both experiments were considered to be differentially
expressed (DE); this set was augmented by 36 genes with padj-value <0.1 in one set and
0.1<padj<0.2 in the other. Note that the position of Spi1 (Sfpi1) in these gene lists as an
“upregulated gene” is an artifact of the overexpressed sequences from the exogenous PU.1-Eng
or PU.1-ETS.
2. Gene ontology and pathway enrichment analysis
We performed GO term and KEGG pathway over representation analysis on a set of genes that
were called DE between the vector control and PU.1-Eng expressing samples at padj <= 0.1 in at
least one experiment. This set of 290 genes was further divided into 148 repressed and 142 PU.1-
Eng upregulated genes. These lists were processed in R (R Core Team 2014)(http://www.R-
project.org/) using the package GOseq (v1.16.2), which calculates enrichment after correcting
for transcript length bias (Young et al. 2010). Only the GO terms and KEGG pathways with an
adjusted (Benjamini-Hochberg method) p-value of <= 0.05 were considered to be enriched in the
DE gene set.
3. Analysis of PU.1 binding sites for repressed and activated genes
9
A set of genes which were differentially expressed at padj <=0.1 in at least one set and at padj
<0.2 in both sets were selected if there was a >= 2-fold expression difference in either direction
between the PU.1-Eng and the vector control samples. The genes that were up- or down-
regulated by PU1-Eng were separated, and compared with each other and with a control set of
non-regulated but well-expressed genes in early T cells. The control set consisted of 841 genes
from expression clusters 16 and 18 in (Zhang et al. 2012), two clusters of genes that are
expressed stably at moderately high levels throughout T cell development, after the removal of 4
genes that were affected by PU.1-Eng. All DN1 and DN2a stage PU.1 peaks in the vicinity of
each gene (Supp. Table 5, from (Zhang et al. 2012)) were first ordered by strength of occupancy
signals (rpm) and all sites in order of strength were matched to the relevant genes on the three
lists. The cumulative occupancy index for each gene was then calculated by summing the peak
signals from DN1 and DN2a stages for peaks assigned to that gene. Although the number of sites
per gene ranged from zero to 21 (Haao), only values for the top 4 sites from the two stages were
summed to minimize background error and reduce outlier effects. The distribution of the sum of
binding strength for each group of genes was tested for similarity using a pairwise Z-test for
means using the R package BSDA (Kitchens 2002). This score does not take into account gene
length, but there was no significant difference in gene length distribution among the three groups
of genes (Z scores -1.47 to +1.46, p values 0.14-0.43, Table S2).
4. Indexing developmental state by principal component analysis
We used a set of 173 developmentally regulated transcription factors to assess the developmental
state of our RNA-seq samples (Scripture-Adams et al. 2014). The expression values for these
genes in 11 wild-type samples spanning DN1/ETP to DP stages and 6 samples from the present
study were first log-transformed and normalized so that the mean of the expression values for
each gene across all samples was zero. Then, the signals from the wildtype samples were used to
calculate the principal components using the prcomp function from the R stats package (R Core
Team 2014). These values provide a framework within which a perturbed sample can be related
to normal development. We then used the principal component loadings for the 173 genes from
the normal samples to transform the values from the six experimental samples, so that they could
10
be plotted on the scales defined by the normal reference ones. The first two principal components
for the experimental and reference samples were plotted using the R package ggplot2 (Wickham
2009).
5. Gene Set Enrichment Analysis (GSEA)
We used the “GSEAPreranked” tool in order to fully characterize the effects of PU.1-ENG on
developmentally regulated and lineage-restricted genes sets. All genes that were expressed at
>=1 FPKM in at least 1 sample were included in the analysis. Genes were ranked in descending
order of mean log2 ratios between all pairwise comparisons from both RNA-seq experiments.
These ranked lists were used to test a number of gene sets obtained from two previous reports.
First, we used 25 gene sets from our previous study (Zhang et al 2012) that were obtained by K-
means clustering of all genes expressed during T-cell development. The most important ones
included clusters of genes whose expression was a) highest in pre-commitment stages (clusters 3,
7, 9 and 23) b) upregulated as PU.1 expression is silenced (clusters 1, 6, 15 and 25). Some
clusters, which showed similar expression patterns, were combined into larger sets that were
tested in addition to running each cluster by itself. Additionally, we used data from a report by
Kamath et al looking at changes in gene expression in myeloid precursor cells expressing PU.1 at
2% (Blac allele) of WT levels. This dataset was used to identify genes that were repressed by
PU.1 (2-, 5-, or 10-fold upregulated in Blac samples compared to WT) and positively regulated
(5-, 10-, or 15-fold downregulated in Blac samples compared to WT). This dataset was chosen
for two reasons - 1) a decrease in PU.1 dosage was shown to upregulate T-lineage-specific genes
in myeloid cells similar to our results showing that PU.1 represses the T-lineage developmental
program in early T-cells and 2) several myeloid genes are expressed in early T-cells and we
wanted to identify PU.1 functions that are common to both these lineages. GSEAPreranked was
run using phenotype permutations and by using the conservative “classic” setting to calculate
enrichment score as suggested on the GSEA website
(http://www.broadinstitute.org/gsea/index.jsp). All gene sets with FDR <=0.01 were considered
to be significantly enriched.
11
SUPPLEMENTARY REFERENCES
Anders S, Huber W. 2010. Differential expression analysis for sequence count data. Genome
Biol 11: R106.
Anders S, Pyl PT, Huber W. 2015. HTSeq-a Python framework to work with high-throughput
sequencing data. Bioinformatics 31: 166-169.
Anderson KL, Nelson SL, Perkin HB, Smith KA, Klemsz MJ, Torbett BE. 2001. PU.1 is a
lineage-specific regulator of tyrosine phosphatase CD45. JBiolChem 276: 7637-7642.
David-Fung ES, Butler R, Buzi G, Yui MA, Diamond RA, Anderson MK, Rowen L, Rothenberg
EV. 2009. Transcription factor expression dynamics of early T-lymphocyte specification
and commitment. Dev Biol 325: 444-467.
Del Real MM, Rothenberg EV. 2013. Architecture of a lymphomyeloid developmental switch
controlled by PU.1, Notch and Gata3. Development 140: 1207-1219.
Kitchens LJ. 2002. Basic Statistics and Data Analysis. Thomson/Brooks/Cole.
Li L, Leid M, Rothenberg EV. 2010. An early T cell lineage commitment checkpoint dependent
on the transcription factor Bcl11b. Science 329: 89-93.
R Core Team. 2014. A language and environment for statistical computing. R Foundation for
Statistical Computing, Vienna, Austria.
Scripture-Adams DD, Damle SS, Li L, Elihu KJ, Qin S, Arias AM, Butler RR, 3rd, Champhekar
A, Zhang JA, Rothenberg EV. 2014. GATA-3 dose-dependent checkpoints in early T cell
commitment. J Immunol 193: 3470-3491.
Wickham H. 2009. ggplot2: Elegant graphics for data analysis (Use R!). Springer, New York.
Young MD, Wakefield MJ, Smyth GK, Oshlack A. 2010. Gene ontology analysis for RNA-seq:
accounting for selection bias. Genome Biol 11: R14.
Yui MA, Feng N, Rothenberg EV. 2010. Fine-scale staging of T cell lineage commitment in
adult mouse thymus. J Immunol 185: 284-293.
Zhang JA, Mortazavi A, Williams BA, Wold BJ, Rothenberg EV. 2012. Dynamic
transformations of genome-wide epigenetic marking and transcriptional control establish
T cell identity. Cell 149: 467-482.