the phenotypes of proliferating glioblastoma cells reside on a …€¦ · 1 the phenotypes of...
TRANSCRIPT
The phenotypes of proliferating glioblastoma cells reside on a single axis of variation. 1
Running Title: A draft single-cell atlas of human glioma 2
Lin Wang1,3,*, Husam Babikir1,3,*, Sören Müller1,3,*, Garima Yagnik1,3, Karin Shamardani1,3, Francisca 3
Catalan1,3, Gary Kohanbash4, Beatriz Alvarado1,2,3, Elizabeth Di Lullo2,3, Arnold Kriegstein2,3, Sumedh 4
Shah1,3, Harsh Wadhwa1,3, Susan M. Chang1,3, Joanna J. Phillips1,3, Manish K. Aghi1,3, Aaron A. 5
Diaz1,3,† 6
1 Department of Neurological Surgery. 7
2 Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research. 8
3University of California, San Francisco, San Francisco, CA 94158. 9
4Department of Neurological Surgery, University of Pittsburgh, Pittsburgh, PA 15261. 10
*Equal contribution 11
†Correspondence: Aaron Diaz, 1450 3rd Street, San Francisco, CA, 94158, [email protected], (415) 12
514-0408 13
The authors certify that they have no affiliations with or involvement in any organization or entity with 14
any financial interest or non-financial interest in the subject matter or materials discussed in this 15
manuscript. 16
Abstract 17
Although tumor-propagating cells can be derived from glioblastomas (GBMs) of the proneural and 18
mesenchymal subtypes, a glioma stem-like cell (GSC) of the classical subtype has not been identified. 19
It is unclear if mesenchymal GSCs (mGSCs) and/or proneural GSCs (pGSCs) alone are sufficient to 20
generate the heterogeneity observed in GBM. We performed single-cell/nuclei RNA-sequencing of 28 21
gliomas, and single-cell ATAC-sequencing for 8 cases. We find that GBM GSCs reside on a single axis 22
of variation, ranging from proneural to mesenchymal. In silico lineage tracing using both transcriptomics 23
and genetics supports mGSCs as the progenitors of pGSCs. Dual inhibition of pGSC-enriched and 24
mGSC-enriched growth and survival pathways provides a more complete treatment than combinations 25
targeting one GSC phenotype alone. This study sheds light on a long-standing debate regarding 26
lineage relationships among GSCs and presents a paradigm by which personalized combination 27
therapies can be derived from single-cell RNA signatures, to overcome intra-tumor heterogeneity. 28
Significance 29
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
Tumor-propagating cells can be derived from mesenchymal and proneural glioblastomas. 30
However, a stem cell of the classical subtype has yet to be demonstrated. We show that classical-31
subtype gliomas are comprised of proneural and mesenchymal cells. This study sheds light on a long-32
standing debate regarding lineage relationships between glioma cell-types. 33
Introduction 34
Malignant gliomas are the most common primary tumors of the adult brain and are essentially 35
incurable. Glioma genetics have been studied extensively and yet targeted therapeutics have produced 36
limited results. Glioblastomas (GBMs) have been classified into subtypes based on gene expression 37
(e.g. 1). However, we and others have shown that GBMs contain heterogeneous mixtures of cells from 38
distinct transcriptomic subtypes (2, 3). This intratumor heterogeneity is at least partially to blame for the 39
failures of targeted therapies. 40
Tumor-propagating cells expressing markers of the mesenchymal and proneural transcriptomic 41
subtypes can be derived from GBMs (e.g. 4). However, no GSC of the classical subtype has been 42
convincingly identified. The lineage relationship between proneural GSCs (pGSC) and mesenchymal 43
GSCs (mGSCs) is unknown. Surprisingly little is known about the cellular progeny of GSCs in vivo. It is 44
unclear if proneural and/or mesenchymal GSCs are sufficient to generate the heterogeneity observed in 45
GBM. 46
We performed single-cell RNA sequencing (scRNA-seq), single-nuclei RNA sequencing 47
(snRNA-seq), single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq), 48
and whole-exome DNA sequencing (exome-seq) of specimens from untreated human gliomas. Via in-49
silico transcriptomic and genetic lineage tracing of these data we defined the lineage relationships 50
between glioma cell types. Integrating this with meta-analysis of sequencing data from The Cancer 51
Genome Atlas (TCGA) (https://cancergenome.nih.gov/) and spatial data from The Ivy Foundation 52
Glioblastoma Atlas Project (Ivy GAP) (http://glioblastoma.alleninstitute.org/), we mapped glioma cells to 53
analogous cell types in the developing brain and to specific tumor-anatomical structures. From the 54
scATAC-seq we elucidated cell-type specific cis-regulatory grammars and associated transcription 55
factors. Using immunohistochemistry (IHC) and automated image analysis of human GBM microarrays 56
we validated learned phenotypes at the protein level. Lastly, we performed an in vitro screen of drug 57
combinations that target genes identified from our single-cell analysis of GSCs. 58
We show that proliferating IDH-wildtype GBM cells can be described by a single axis of gene 59
signature, which ranges from proneural to mesenchymal. At the extremes of this axis reside stem-like 60
cells which express canonical markers of mGSCs and pGSCs. We identify a mGSC signature that 61
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
correlates with significantly inferior survival in IDH-wildtype GBM. Our analysis shows that mGSCs, 62
pGSCs, their progeny and stromal/immune cells are sufficient to explain the heterogeneity observed in 63
GBM. We show that combinatorial therapies that address intra-tumor heterogeneity can be designed 64
based on single-cell RNA signatures. 65
Results 66
Single-cell mRNA and bulk DNA profiling of untreated human gliomas 67
We applied scRNA-seq or snRNA-seq to biopsies from 22 primary untreated human IDH-68
wildtype GBMs and 6 primary untreated IDH-mutant gliomas (Table S1). Our goal was to profile both 69
for cellular coverage (to survey cellular phenotypes) and for transcript coverage (to compare genetics). 70
Therefore, we performed sc/snRNA-seq for 19 samples via the 10X Genomics Chromium platform 71
(10X) and obtained 3’-sequencing data for 31,281 cells after quality control. ScRNA-seq for 3 cases 72
was done using the Fluidigm C1 platform (C1), which yielded full-transcript coverage for 291 cells. We 73
incorporated 4 more published cases from our C1 pipeline, adding 384 cells (3). For 6 of the 10X cases 74
and 5 of the C1 cases, the biopsies were minced, split and both scRNA-seq and exome sequencing 75
(exome-seq) were performed (Table S1). 76
We applied our pipeline for sc/snRNA-seq pre-processing (5), quantification of expressed 77
mutations (3, 6), and cell-type identification (7, 8). We identified 10,816 tumor-infiltrating stromal and 78
immune cells based on expressed mutations, clustering and canonical marker genes (Fig. 1 A and Fig. 79
S1A). We term the remaining 20,465 cells neoplastic, as they express clonal malignant mutations. Only 80
neoplastic cells were used for all subsequent analyses. 81
The transcriptional phenotypes of proliferating IDH-wildtype GBM neoplastic cells can be explained 82
by a single axis that varies from proneural to mesenchymal 83
Unbiased principal component analyses (PCAs) of the IDH-wild-type GBM cases revealed two 84
patient-independent clusters of neoplastic cells, consistent across all 3 platforms (Fig. 1B-I; Table S2). 85
A differential-expression test between clusters identified markers of the proneural (e.g. ASCL1, OLIG2) 86
and mesenchymal (e.g. CD44, CHI3L1) subtypes as significant (Table S3). Mesenchymal cells 87
significantly over-express markers of response to hypoxia (e.g. HIF1A) and cytokines that promote 88
myeloid-cell chemotaxis (e.g. CSF1, CCL2, CXCL2). However, mesenchymal cells do not express high 89
levels of MKI67 or other markers of cell-cycle progression. Conversely, proneural cells express high 90
levels of MKI67 as well as cyclin-dependent kinase. We estimated the fraction of actively cycling cells 91
using the Seurat package (9). By this metric, 21%-30% of proneural cells are cycling compared to 92
0.3%-10% of mesenchymal cells (Fig. 1D,G). Importantly, all our clinical specimens contain cells of 93
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
both phenotypes: proliferating proneural cells and mesenchymal cells with a quiescent, cytokine-94
secretory phenotype (Fig. S1B). 95
In all PCAs, cells at the left and right extremes of principal component 1 (PC1) express high 96
levels pGSC or mGSC markers (4) respectively, as indicated by PC1 gene loadings (Fig. 1B,E). We 97
therefore interpreted the top-loading genes from each direction as representing pGSC and mGSC gene 98
signatures. We used those signatures to score all cells for stemness, controlling for technical variation 99
as previously described (9, 10). In all PCAs, principal component 2 (PC2) is loaded by markers of cell-100
cycle progression (e.g. MKI67). Thus, while stemness correlates with cell cycle for both proneural and 101
mesenchymal cells, more pGSCs than mGSCs express markers of cell-cycle progression, and at 102
higher levels. 103
IDH-mutant gliomas are composed of cycling stem-like cells, noncycling astrocyte- and 104
oligodendrocyte-like cells 105
A PCA analysis of the IDH-mutant gliomas identified 3 patient-independent populations (Fig. 106
S1C-F). Two of the clusters fall on opposite ends of PC1. They differentially express markers (e.g. 107
OLIG2, GFAP) of astrocyte-like (IDH-A) and oligodendrocyte-like (IDH-O) cell types, as has been 108
previously described in scRNA-seq of IDH-mutant glioma (11). PC2 is positively loaded by cell cycle 109
genes, such as MKI67. A third cluster of cells has high PC2 sample scores and specifically expresses 110
markers of IDH-mutant GSCs (10, 11), as well as genes associated with cancer stem-cells not 111
previously described in IDH-mutant glioma (Table S2). We term this cluster IDH-S following the 112
convention of Venteicher et al. (11). Most cells in IDH-S have low PC1 sample scores, and do not 113
express IDH-A or IDH-O genes. Almost all cycling cells are found in IDH-S (Fig. S1G). 114
IDH-wild-type GBM cells are stratified by differentiation gradients observed in gliogenesis 115
In our differential expression test we observed that proneural cells specifically express markers 116
of the oligodendrocyte lineage (e.g. OLIG2, SOX10), while mesenchymal cells instead express markers 117
of astrocytes (e.g. GFAP, AQP4). We evaluated the mGSC and pGSC gene signatures in scRNA-seq 118
of glia from fetal and adult human brain (12, 13). We found that pGSC-signature genes are enriched in 119
oligodendrocyte progenitor cells. The mGSC signature, however, is comprised of genes expressed by 120
neural stem cells as well as markers of astrocytes (Fig. 2A). 121
In addition to mGSCs and pGSCs, we find neoplastic cells (possessing clonal malignant 122
mutations) that do not express stemness or cell-cycle signatures above background. Instead they 123
express either markers of differentiated astrocytes (e.g. ALDOC) or differentiated oligodendrocytes 124
(e.g. MAG, MOG). While the GSCs express high levels of positive WNT-pathway regulators, these 125
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
more differentiated cells express high levels of WNT-pathway agonists (Fig. S2A). Thus, GBMs contain 126
mesenchymal and proneural populations that align with astrocyte and oligodendrocyte differentiation 127
gradients respectively. 128
PGSCs, mGSCs, their differentiated progeny and stromal/immune cells explain the phenotypic 129
heterogeneity observed in GBM 130
We found that our mGSC and pGSC gene signatures are co-expressed across TCGA datasets 131
(Fig. 2B, S2B). While mGSC and pGSC signature genes are correlated among themselves, the mGSC 132
and pGSC signatures are anti-correlated with each other. Moreover, the cell-cycle signature more 133
strongly correlates in TCGA data with the pGSC signature than the mGSC signature, consistent with 134
our scRNA-seq data (Fig. S2C). 135
Using our scRNA-seq data and published scRNA-seq from human brain tissue (12, 13) as a 136
basis, we pooled reads across cells of the same type. This yielded data-driven profiles for mGSCs, 137
pGSCs, astrocytes, oligodendrocytes, neurons, endothelial cells, myeloid cells and T-cells. We then 138
used these cell-type signatures as predictors in a linear regression model. We fit our model to each 139
TCGA GBM RNA-sequencing dataset individually. A heatmap of cell-type specific genes (Fig. 2C) 140
agreed with our regression analysis (Fig. 2D). We found that samples of both the mesenchymal and 141
classical Verhaak subtypes are enriched for mGSCs and depleted of pGSCs. While classical samples 142
are distinguished by higher infiltration of astrocytes, mesenchymal samples contain high levels of 143
infiltrating immune cells. Proneural samples are characterized by the highest levels of pGSCs, 144
oligodendrocytes and neurons. In summary, the full spectrum of heterogeneity observed in TCGA GBM 145
data can be explained by varying proportions of pGSCs and mGSCs, their differentiated progeny and 146
infiltrating stromal/immune content. Cox-regression analysis identifies the mGSC signature as 147
correlating with significantly inferior survival in TCGA data. However, pGSC content is not prognostic 148
(Fig. 2E). 149
GSC niche localization and microenvironment interactions elucidated using reference atlases and 150
quantitative IHC 151
We compared our mGSC, pGSC and cell-cycle signatures to RNA sequencing from the Ivy GAP 152
(Fig. 2F-G). The Ivy GAP has annotated, micro-dissected and RNA sequenced GBM anatomical 153
structures from human specimens. We found that mGSCs are enriched in hypoxic regions. PGSCs are 154
enriched in the tumor’s leading edge and in regions of diffuse infiltration of tumor-adjacent white matter. 155
To visualize and quantify the associations between GSCs, cell cycle and hypoxia, we performed 156
IHC for CD44 (mGSCs), DLL3 (pGSCs), CA9 (hypoxia) and Ki67 (cell cycle) on GBM microarrays (Fig. 157
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
2H; Table S4), and non-malignant controls (Fig. S2D). We did not find any cycling CD44+ cells in our 158
samples, while approximately 5% of DLL3+ cells expressed Ki67. On the other hand, CD44+ cells 159
colocalized with CA9 at 2-fold greater frequency than DLL3+ cells. This dovetails with our findings in 160
scRNA-seq, TCGA and GAP data, which show that pGSCs are more proliferative while mGSCs are 161
enriched in hypoxic regions. 162
Genetic and transcriptomic in-silico lineage tracing of GBM GSCs supports a mesenchymal to 163
proneural hierarchy 164
We identified all cycling cells from our IDH-wildtype GBM datasets (Methods). These cells were 165
then used to perform RNA velocity analysis via velocyto (14). This approach compares the fractions of 166
spliced to un-spliced transcripts per gene in order to estimate the time derivative of gene abundance in 167
single cells. When projected back onto a PCA axis (Fig. 3A), we found that RNA velocity supports 168
stable mGSC and pGSC populations as well as an intermediate population that gradually transitions 169
from a mesenchymal to proneural phenotype. For example, in this intermediate population mGSC 170
markers (e.g. CD44, CHI3L1) are highly expressed but have a negative time derivative, conversely 171
pGSC markers (e.g. DLL3, PDGFRA) are lowly expressed but with a high, positive velocity. 172
Mitochondrial expressed mutations have recently been identified as a reliable way to perform 173
genetic lineage tracing due to the robust coverage of mitochondrial genes, even in scRNA-seq (15). 174
Following this approach, for each sample we pooled cells and identified mitochondrial single-nucleotide 175
variants (SNVs) (Methods). For each cell we then annotated the percentage of its sample’s 176
mitochondrial mutations it harbored. We found that the percentage of expressed mitochondrial 177
mutations correlated with progression from the mesenchymal phenotype (Fig. 3B-D). Thus, both 178
genetic and transcriptomic lineage tracing supports a mesenchymal to proneural hierarchy. 179
We also compared chromosomal SNVs and CNVs in our scRNA-seq that were validated by 180
exome-seq data via our previously described approach (3, 6). We restricted ourselves to mutations that 181
occurred at a minimum of 10% variant allele frequency in exome-seq and identified cells in our scRNA-182
seq which expressed these mutations. For all patients, we found that all expressed, validated mutations 183
are present in the GSCs (Fig. S3A). This shows that the GSCs are sufficient to explain the genetic 184
heterogeneity observed in our tumor specimens. 185
IDH-mutant and -wildtype GSCs share a core signature which is prognostic 186
RNA-velocity analysis on IDH-mutant glioma datasets revealed a hierarchy with a single root 187
focused on the IDH-S population and two terminal points corresponding to differentiated IDH-A and 188
IDH-O populations (Fig. S3B-C). We compared gene loadings of PC2 between the original PCAs of our 189
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
IDH-wildtype and -mutant gliomas. This identified a core signature of stemness enriched in the cycling 190
cells of both diseases (Fig. S3D). This signature is prognostic in a combined cohort of IDH-mutant 191
grade II/III oligodendrogliomas and astrocytomas (Fig. S3E). An analysis of expressed mitochondrial 192
mutations showed that mitochondrial mutational load correlated with progression along the IDH-O and 193
IDH-A lineages (Fig. S3F-G), further supporting a cellular hierarchy with IDH-S at the apex. 194
scATAC-seq of human gliomas identifies cell-type specific regulatory grammars and supports a 195
single-axis hypothesis 196
We performed scATAC-seq on 4 IDH-wildtype GBMs, 2 IDH-mutant grade-II astrocytomas, and 197
2 IDH-mutant oligodendrogliomas. We considered the IDH-wildtype and IDH-mutant samples 198
separately. IDH-wildtype GBM scATAC-seq cells were triaged according to the presence of observed 199
genomic alterations (Fig. 4A; Methods). For each cell and each gene we computed a gene-body activity 200
score (i.e. the gene-wise sum scATAC-seq read-counts) using Seurat v3. We then clustered gene-body 201
activity scores separately for neoplastic and non-neoplastic cells using Seurat. The gene-body activities 202
of non-neoplastic cells readily separated by canonical markers of myeloid, glial and endothelial cells 203
(Fig. 4B). 204
When we clustered the gene activity scores of neoplastic cells we found 3 clusters of cells. We 205
then performed differential motif enrichment tests between these groups of cells to identify enriched 206
regulatory motifs and associated transcription factors (Fig. 4C-G; Table S5; Methods). 207
In the first cluster, the proneural bHLH genes OLIG2, ASCL1, NEUROG2 were enriched, both 208
their gene-activity scores and representative-motif frequencies were significant. The second cluster 209
showed enrichment for mesenchymal makers (e.g. CD44, STAT3). When we compared the third cluster 210
to the mesenchymal cluster, proneural markers were differentially enriched (Fig. 4E). Likewise, when 211
we compared the third cluster to the proneural cluster we saw differential enrichment of mesenchymal 212
markers (Fig. 4F). Moreover, the most cluster-specific gene activities identified for this third cluster 213
showed partial overlap with both the mesenchymal and proneural clusters, e.g. CD44, CDH9 (Fig. 4C, 214
Table S4). We plotted the expression of the cluster-specific genes from the third cluster on top of a 215
PCA of cycling cells. We found those genes expressed a population which was intermediate to the 216
mesenchymal and proneural populations in our RNA-velocity and mitochondrial-mutation analysis (e.g. 217
MET, Fig. S4A). Thus, we interpret this third cluster as an intermediate population between 218
mesenchymal and proneural clusters. 219
IDH-mutant glioma cells were likewise separated by the presence of clonal mutations into 220
neoplastic cells and non-malignant glial/immune cells (Fig. S4B-C). Clustering by gene-activity scores 221
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
identified 3 clusters corresponding to the IDH-S, -O and -A cell types (Fig. S4D-G). Motif enrichment in 222
each cluster was assessed to determine cell-type specific regulatory grammars (Fig. S4H). 223
A drug-combination screen based on single-cell signatures 224
We identified FDA-approved drugs which target genes found to be specific to pGSCs and 225
mGSCs in our single-cell analysis (Fig. 5A, S5A). Drugs were screened in combination to assess 226
synergy in vitro (Methods). Only combinations that targeted both the pGSC and mGSC phenotypes 227
were found to by synergistic (Fig. 5B, Table S6). In particular, inhibition of EGFR and FGFR3 growth-228
factor receptors did not synergize with WNT inhibition (all mGSC-specific targets), but targeting ether 229
EGFR, FGFR3 or WNT synergized with inhibition of pGSC-specific Survivan. 230
Discussion 231
It is known that an individual tumor may contain multiple GSC clones (e.g. 16). We applied 232
exome-seq, sc/snRNA-seq and scATAC-seq to human tumor-specimens and found that all GBMs 233
contain hierarchies of mesenchymal and proneural GSCs and their more differentiated progeny. We 234
also observe these same cellular hierarchies in scRNA-seq profiling of recurrent GBM following 235
treatment (Fig. S5B). However, our study was limited to the analysis of primary tumors. Additional 236
studies will be required to determine if standard therapy exerts a selection pressure on this hierarchy. 237
While our manuscript was in revisions Neftel et al. published a model of GBM heterogeneity 238
based on scRNA-seq analysis of adult and pediatric human specimens (17). Neftel et al. describe a 239
model with 4 cellular states: astrocyte and mesenchymal cell types enriched in Verhaak classical and 240
mesenchymal TCGA samples and NPC- and OPC-like cell-types enriched in proneural TCGA samples. 241
Their data corroborate a mesenchymal to proneural axis, and anti-correlation of mesenchymal (e.g. 242
CD44) and proneural (e.g. DLL3) genes in neoplastic cells (c.f. Neftel et al. Fig. 2C). Both studies 243
contribute to recent advances which use transcriptomics to identify clinically relevant phenotypes (e.g. 244
18, 19). 245
Murine GBMs can be separated into two cell populations that have different capacities for 246
tumorigenicity and self-renewal (20). The Id1high/Olig2- and Id1low/Olig2+ populations found in the model 247
of Barrett et al. (20) match the expression signatures of mGSCs and pGSCs respectively. Moreover, 248
the finding of Barrett et al. (20) that Id1high cells initiate tumors containing both Id1high and Olig2+ cells, 249
while tumors from Id1low/Olig2- cells do not generate Id1high progeny, is consistent with our 250
transcriptomic and genetic lineage tracing in human specimens. More recently, Narayanan et al. 251
showed that ASCL1 is a master-regulator of the proneural phenotype of GBM, and that ASCL1 directly 252
represses mesenchymal-phenotype genes such as CD44 and GFAP (21). This is consistent with our 253
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
data, in particular our scATAC-seq analysis shows anti-correlation of CD44 and other mesenchymal 254
genes with ASCL1 gene-body activity and motif enrichment (Fig. 4C, G). Knowledge of GBM cell types, 255
their lineage relationships and functional differences is needed to develop combination therapies that 256
address intratumor heterogeneity. 257
Material and Methods 258
Tumor tissue acquisition 259
We acquired fresh tumor tissue and peripheral blood from patients undergoing surgical 260
resection for glioma at UCSF. De-identified samples were provided by the UCSF Neurosurgery Tissue 261
Bank. Sample use was approved by the Institutional Review Board at UCSF. The experiments 262
performed here conform to the principles set out in the WMA Declaration of Helsinki and the 263
Department of Health and Human Services Belmont Report. All patients provided informed written 264
consent. 265
Tissue processing for scRNA-seq 266
Fresh tissues were minced in collection media (Leibovitz’s L-15 medium, 4 mg/mL glucose, 100 267
u/mL Penicillin, 100 ug/mL Streptomycin) with a scalpel. Samples dissociation was carried out in a 268
mixture of papain (Worthington Biochem. Corp) and 2000 units/mL of DNase I freshly diluted in EBSS 269
and incubated at 37 °C for 30 min. After centrifugation (5 min at 300 g), the suspension was 270
resuspended in PBS. Subsequently, suspensions were triturated by pipetting up and down ten times 271
and then passed through a 70-μm strainer cap (BD Falcon). Last, centrifugation was performed for 5 272
min at 300 g. After resuspension in PBS, pellets were passed through a 40-μm strainer cap (BD 273
Falcon), followed by centrifugation for 5 min at 300 g. The dissociated, single cells were then 274
resuspended in GNS (Neurocult NS-A (Stem Cell Tech.), 2 mM L-Glutamine, 100 U/mL Penicillin, 100 275
ug/mL Streptomycin, N2/B27 supplement (Invitrogen), sodium pyruvate). Nuclei were extracted from 276
frozen tissues following the “Frakenstein” protocol developed by Luciano Martelotto, Ph.D., Melbourne, 277
Centre for Cancer Research, Victorian Comprehensive Cancer Centre, and available from 10X 278
Genomics (https://community.10xgenomics.com/t5/Customer-Developed-Protocols/ct-p/customer-279
protocols). 280
Fluidigm C1-based scRNA-seq 281
Fluidigm C1 Single-Cell Integrated Fluidic Circuit (IFC) and SMARTer Ultra Low RNA Kit were 282
used for single-cell capture and complementary DNA (cDNA) generation. cDNA quantification was 283
performed using Agilent High Sensitivity DNA Kits and diluted to 0.15–0.30 ng/μL. The Nextera XT 284
DNA Library Prep Kit (Illumina) was used for dual indexing and amplification with the Fluidigm C1 285
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
protocol. CDNA was purification and size selection were carried out twice using 0.9X volume of 286
Agencourt AMPure XP beads (Beckman Coulter). 287
10X genomics-based scRNA-seq/snRNA-Seq 288
For fresh tissues, 10.2 μL of live cells, at a concentration of 1700 live-cells/μL, were loaded into 289
the 10X Chromium Single Cell capture chip. For frozen tissues, ~15,000 nuclei were loaded per 290
capture. Single-cell/nucleus capture, reverse transcription, cell lysis, and library preparation were 291
performed per manufacturer’s protocol. Sequencing was performed on an Illumina NovaSeq using a 292
paired-end 100bp protocol. 293
Public data acquisition 294
Normalized counts from TCGA RNA-seq data were obtained from the Genomics Data 295
Commons portal (https://gdc.cancer.gov/). Patients diagnosed as GBM and wild-type IDH1 expression 296
(n = 144) were normalized to log2(CPM + 1) and used for analysis. TCGA glioma microarray and 297
associated survival data were obtained from the Gliovis portal (22). Z-score normalized counts from 298
regional RNA-seq of 122 samples from ten patients were obtained via the web interface of the Ivy GAP 299
(http://glioblastoma.alleninstitute.org/) database. RNA-seq of non-malignant human brain tissues were 300
obtained from the GTEx portal (https://www.gtexportal.org/home/datasets). 301
Exome-sequencing and genomic mutation identification 302
The NimbleGen SeqCap EZ Human Exome Kit v3.0 (Roche) was used for exome capture on a 303
tumor sample and a blood control sample from each patient. Samples were sequenced with an 304
Illumina-HiSeq 2500 machine (100-bp paired-end reads). Reads were mapped to the human grch37 305
genome with BWA (23) and only uniquely matched paired reads were used for analysis. PicardTools 306
(http://broadinstitute.github.io/picard/) and the GATK toolkit (24) carried out quality score re-calibration, 307
duplicate-removal, and re-alignment around indels. Large-scale (>100 Exons) somatic copy number 308
variants (CNVs) were inferred with ADTex (25). To increase CNV size, proximal (< 1 Mbp) CNVs were 309
merged. Somatic SNVs were inferred with MuTect (https://www.broadinstitute.org/cancer/cga/mutect) 310
for each tumor/control pair and annotated with the Annovar software package (26). 311
Sc/snRNA-seq data pre-processing 312
Data processing of the C1 data was performed as described previously (3). Briefly, reads were 313
quality trimmed and TrimGalore! (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) was 314
used to clip Nextera adapters. HISAT2 (27) was used to perform alignments to the grCh38 human 315
genome. Gene expression was quantified using the ENSEMBL reference with featureCounts (28). Only 316
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
correctly paired, uniquely mapped reads were kept. In each cell, expression values were scaled to 317
counts per million (CPM)/100+1 and log transformed. Low-quality cells were filtered by thresholding 318
number of genes detected at 1000 and at least 100,000 uniquely aligned reads. For 10x scRNA-Seq 319
and snRNA-Seq data we utilized CellRanger (version 3.0.2) for data pre-processing and gene 320
expression quantification, following the guidelines from the CellRanger website 321
(https://support.10xgenomics.com/single-cell-gene-322
expression/software/pipelines/latest/advanced/references#premrna). 323
Classification of somatic mutations 324
The presence/absence of somatic CNVs in 10x scRNA-Seq and snRNA-Seq data was 325
assessed with CONICSmat (6). We retained CNVs with a CONICSmat likelihood-ratio test <0.001 and 326
a difference in Bayesian Criterion >300. For each CNV we used a cutoff of posterior probability >0.5 in 327
the CONICSmat mixture model to infer the presence/absence of that CNV in a given cell. For the 328
Fluidigm C1 data we utilized our previous CNV and SNV classifications (3, 6). We retained only SNVs 329
detected in exome-seq at >10% variant frequency in the tumor and <10% variant frequency in patient‐330
matched normal blood. Cells expressing those tumor-restricted variant alleles were considered positive 331
for the respective SNVs. For the presence/absence of somatic CNVs in 10x snATAC-Seq data was 332
estimated with CONICSmat (version 1.0) (6). Here, the gene body activity generated by snapATAC 333
(29) was used to perform the CNVs analysis with with a CONICSmat likelihood-ratio test <0.001 and a 334
difference in Bayesian Criterion >300. SNV calls in 10X sc/snRNA-seq data were obtained by pooling 335
reads by patient and running the GATK RNA-seq best-practices pipeline 336
(https://software.broadinstitute.org/gatk/best-practices/workflow?id=11164). Variant assignments in 337
single cells were then assessed via VarTrix (version 1.0) tool (https://github.com/10xgenomics/vartrix). 338
Dimensionality reduction and calculation of stemness scores 339
First, we filtered cells that have >5% mitochondrial read counts. A gene with non-zero read 340
counts in more than 10 cells was considered as expressed, and each cell was required to have at least 341
200 expressed genes. Seurat v3 was used for tSNE plots based on the first 10 PCs (9). PCA was done 342
using R 3.4.2. The top 1000 genes with the highest regularized variances were identified via Seurat v3 343
for each case. PCA was then performed using those genes that were among the 1000 most variable 344
genes for at least three patients. We defined the mGSC and pGSC gene sets as the top 15% of genes 345
most strongly loading PC1, positively for mGSCs and negatively pGSCs. Stemness scores were 346
calculated using these gene sets as input to the AddModuleScore function from the Seurat package. 347
Cluster-specific genes were identified via the FindMarkers/FindAllMarkers function from Seurat 348
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
package, using a likelihood-ratio test. Only genes enriched and expressed at least in 25% of the cells of 349
at least one population and with a log fold-change bigger that 0.25 were considered. 350
Deconvolution of TCGA RNA-sequencing data via linear models 351
To deconvolve GBM RNA-sequencing data from TCGA according to cell types learned from 352
scRNA-seq, we first pooled scRNA-seq read-counts by cell type across mGSCs, pGSCs, non-353
malignant oligodendrocytes, astrocytes, neurons, TAMs, T-cells, and endothelial cells. The data used 354
for this were our GBM scRNA-seq, as well as scRNA-seq of human fetal and adult non-malignant brain 355
tissues (12, 13). The resulting 8 count vectors were independently normalized to log2(CPM/10+1). We 356
then fit a linear model to each TCGA RNA-sequencing dataset (also scaled to log2(CPM/10+1) using 357
these vectors as predictors. We assessed the relative contribution of each predictor to the overall 358
variance explained using the Lindeman, Merenda and Gold (lmg) method as implemented in the 359
relaimpo R package (30). 360
Lineage reconstruction via RNA velocity 361
RNA velocities were computed via velocyto (14). For GBM cases cells with PC2 sample-score > 362
0 were deemed cycling cells and used for velocyto analysis under default parameters. Root and 363
terminal points in lineage reconstruction were identified with the functions prepare_markov and 364
run_markov included in velocyto. These use backward and forward Markov processes on the transition 365
probability matrix of the cells to determine high density regions for the start and end points of 366
trajectories, respectively. 367
SnATAC-Seq data processing 368
The CellRanger ATAC software (version 1.1.0) was used for read alignment, deduplication, and 369
identifying transposase cut sites (https://support.10xgenomics.com/single-cell-370
atac/software/pipelines/latest/algorithms/overview). The output matrix of CellRanger was further 371
processed by using the snapATAC package (https://github.com/r3fang/SnapATAC) (29). We selected 372
the highest quality barcodes for each case based on two criteria: 1) number of filtered fragments at 373
least 1000; 2) fragments in promoter ratio (FRiP) at least 0.2 for the case. Clustering was performed 374
using Seurat v3 SNN graph clustering “FindClusters”, with the gene body-accessibility scores 375
generated by the snapATAC package as input. Differentially accessible regions, peaks, and motif 376
enrichments were computed using snapATAC “findDAR”, “runMACSForAll” and “runHomer”. 377
Immunohistochemistry 378
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
Immunohistochemistry (IHC) Optimization was performed on a Leica Bond automated 379
immunostainer using conditions optimized for each antibody (Table S4). Heat-induced antigen retrieval 380
was performed using Leica Bond Epitope Retrieval Buffer 1 (Citrate Buffer, pH 6.0) and Leica Bond 381
Epitope Retrieval Buffer 2 (EDTA solution, pH 9.0) for 20 minutes (ER2(20)). Non-specific antibody 382
binding was blocked using 5% milk in PBST or Novocastra Protein Block (Novolink #RE7158). Positivity 383
was detected using Novocastra Bond Refine Polymer Detection and visualized with 3’3 384
diaminobenzidine (DAB; brown) and alkaline phosphatase (AP; red). A Hematoxylin nuclear 385
counterstain (blue) was applied. Duplex controls were performed to provide a reference of specificity of 386
the selected primary antibody and secondary detection system. For the DLL3 and CA9 duplex, 2 sets of 387
single-stained control tissues were used since the antigens of interest are not commonly co-expressed 388
in normal tissue types. Image analysis of whole slide images (8 slides from GBM patients, 5 control 389
slides) was performed using the Aperio software (Leica) and the ImageDx slide-management pipeline 390
(Reveal Bio). All tissue and staining artifacts were digitally excluded from the reported quantification. 391
Drug screen 392
The U87MG cell line was obtained from American Type Culture Collection and cultured in 393
DMEM, supplemented with 10%FBS and 1% penicillin-streptomycin. Cells were seeded in 96-well 394
plates at a density of 5,000 cells/well with DMEM and held overnight at 37 °C, 5% CO2. The media was 395
then aspirated, and test compounds administered through serial dilution in DMEM. After 48h, the cells 396
were washed with PBS. DMEM containing AlamarBlue (Invitrogen) and supplemented with FBS, 397
penicillin and streptomycin was added, and the plate was then incubated for 4 hours. Cell viability was 398
assessed via absorbance using a microplate reader and compared to untreated control wells. 399
Declarations 400
Ethics approval and consent to participate 401
Study protocols were approved by the UCSF Institutional Review Board. All clinical samples 402
were analyzed in a de-identified fashion. All experiments were carried out in conformity to the principles 403
set out in the WMA Declaration of Helsinki as well as the Department of Health and Human 404
Services Belmont Report. Informed written consent was provided by all patients. 405
Availability of data and material 406
The study data are available from the European Genome-phenome Archive repository, under 407
EGAS00001002185, EGAS00001001900 and EGAS00001003845. Third-party data that were used in 408
the study are available from the GlioVis portal (http://gliovis.bioinfo.cnio.es/), the gbmseq portal 409
(http://gbmseq.org/), and The Glioblastoma Atlas Project (http://glioblastoma.alleninstitute.org/). 410
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
Competing interests 411
None declared. 412
Funding 413
This work has been supported by research awards from the UCSF Glioma Precision Medicine 414
Program to AD, JP, and SC; the UC Cancer Research Coordinating Committee (CRN-19-586041) to 415
AD. 416
Author contributions 417
A.D. conceived of and designed the study. G.K and G.Y performed the 10X Genomics single-418
cell sequencing under the supervision of A.D. S.C, J.P. and A.D. identified cases for inclusion in the 419
study. J.P. screened archival tissue specimens for suitability of use. B.A. and E.D. performed the 420
Fluidigm C1 single-cell sequencing under the supervision of A.D. and A.K. H.B. and K. S. performed 421
the single-nuclei RNA and ATAC sequencing under the supervision of AD. H.B., S.S., and H.W. 422
performed the drug screen under the supervision of AD. M.A. provided biopsies via the UCSF 423
Neurosurgery Tissue Core and contributed to the analysis of the data. S.M. and A.D. first formulated 424
the single-axis hypothesis based on the bioinformatics analysis of S.M. L.W. and F.C. performed the 425
bioinformatics analysis of the snRNA-seq, scATAC-seq, genetic and transcriptomic lineage tracing 426
under the supervision of A.D. A.D., H.B., S.M., and L.W. wrote the manuscript. All authors read and 427
approved the final manuscript. 428
Bibliography 429
1. R. G. W. Verhaak et al., Integrated genomic analysis identifies clinically relevant subtypes of 430
glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell. 431
17, 98–110 (2010). 432
2. A. P. Patel et al., Single-cell RNA-seq highlights intratumoral heterogeneity in primary 433
glioblastoma. Science. 344, 1396–401 (2014). 434
3. S. Müller et al., Single-cell sequencing maps gene expression to mutational phylogenies in 435
PDGF and EGF driven gliomas. Mol. Syst. Biol. 12, 889 (2016). 436
4. X. Jin et al., Targeting glioma stem cells through combined BMI1 and EZH2 inhibition. Nat. Med. 437
23, 1352–1361 (2017). 438
5. A. Diaz et al., SCell: Integrated analysis of single-cell RNA-seq data. Bioinformatics. 32 (2016), 439
doi:10.1093/bioinformatics/btw201. 440
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
6. S. Müller, A. Cho, S. J. Liu, D. A. Lim, A. Diaz, CONICS integrates scRNA-seq with DNA 441
sequencing to map gene expression to tumor sub-clones. Bioinformatics, 10–12 (2018). 442
7. T. E. Bartlett, S. Müller, A. Diaz, Single-cell Co-expression Subnetwork Analysis. Sci. Rep. 7 443
(2017), doi:10.1038/s41598-017-15525-z. 444
8. S. Müller et al., Single-cell profiling of human gliomas reveals macrophage ontogeny as a basis 445
for regional differences in macrophage activation in the tumor microenvironment. Genome Biol. 446
18 (2017), doi:10.1186/s13059-017-1362-4. 447
9. A. Butler, P. Hoffman, P. Smibert, E. Papalexi, R. Satija, Integrating single-cell transcriptomic 448
data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018). 449
10. I. Tirosh et al., Single-cell RNA-seq supports a developmental hierarchy in human 450
oligodendroglioma. Nature. 539, 309–313 (2016). 451
11. A. S. Venteicher et al., Decoupling genetics, lineages, and microenvironment in IDH-mutant 452
gliomas by single-cell RNA-seq. Science. 355, 1391–1402 (2017). 453
12. S. Darmanis et al., A survey of human brain transcriptome diversity at the single cell level. Proc. 454
Natl. Acad. Sci. 112, 7285–7290 (2015). 455
13. T. J. Nowakowski et al., Spatiotemporal gene expression trajectories reveal developmental 456
hierarchies of the human cortex. Science. 358, 1318–1323 (2017). 457
14. G. La Manno et al., RNA velocity of single cells. Nature. 560, 494–498 (2018). 458
15. L. S. Ludwig et al., Lineage Tracing in Humans Enabled by Mitochondrial Mutations and Single-459
Cell Genomics. Cell. 176, 1325-1339.e22 (2019). 460
16. M. Meyer et al., Single cell-derived clonal analysis of human glioblastoma links functional and 461
genomic heterogeneity. Proc. Natl. Acad. Sci. U. S. A. 112, 851–6 (2015). 462
17. C. Neftel et al., An Integrative Model of Cellular States, Plasticity, and Genetics for Glioblastoma. 463
Cell, 1–15 (2019). 464
18. A. C. Decarvalho et al., Discordant inheritance of chromosomal and extrachromosomal DNA 465
elements contributes to dynamic disease evolution in glioblastoma. Nat. Genet. 50, 708–717 466
(2018). 467
19. X. Hu et al., TumorFusions: An integrative resource for cancer-associated transcript fusions. 468
Nucleic Acids Res. 46, D1144–D1149 (2018). 469
20. L. E. Barrett et al., Self-renewal does not predict tumor growth potential in mouse models of 470
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
high-grade glioma. Cancer Cell. 21, 11–24 (2012). 471
21. A. Narayanan et al., The proneural gene ASCL1 governs the transcriptional subgroup affiliation 472
in glioblastoma stem cells by directly repressing the mesenchymal gene NDRG1. Cell Death 473
Differ., 1813–1831 (2018). 474
22. R. L. Bowman, Q. Wang, A. Carro, R. G. W. Verhaak, M. Squatrito, GlioVis data portal for 475
visualization and analysis of brain tumor expression datasets. Neuro. Oncol. 19, 139–141 476
(2017). 477
23. H. Li, R. Durbin, Fast and accurate short read alignment with Burrows-Wheeler transform. 478
Bioinformatics. 25, 1754–60 (2009). 479
24. A. McKenna et al., The Genome Analysis Toolkit: a MapReduce framework for analyzing next-480
generation DNA sequencing data. Genome Res. 20, 1297–303 (2010). 481
25. K. C. Amarasinghe et al., Inferring copy number and genotype in tumour exome data. BMC 482
Genomics. 15, 732 (2014). 483
26. K. Wang, M. Li, H. Hakonarson, ANNOVAR: Functional annotation of genetic variants from high-484
throughput sequencing data. Nucleic Acids Res. 38, 1–7 (2010). 485
27. M. Pertea, D. Kim, G. M. Pertea, J. T. Leek, S. L. Salzberg, Transcript-level expression analysis 486
of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat Protoc. 11, 1650–1667 487
(2016). 488
28. Y. Liao, G. K. Smyth, W. Shi, FeatureCounts: An efficient general purpose program for assigning 489
sequence reads to genomic features. Bioinformatics. 30, 923–930 (2014). 490
29. R. Fang et al., Fast and Accurate Clustering of Single Cell Epigenomes Reveals Cis-Regulatory 491
Elements in Rare Cell Types. bioRxiv (2019), doi:10.1101/615179. 492
30. R. Importance, L. Regression, T. Package, U. Groemping, Relative Importance for Linear 493
Regression in R : The Package relaimpo. October. 17, 139–147 (2006). 494
495
Figure Legends 496
Figure 1. Single-cell sequencing reveals a single axis of variation in proliferating GBM cells. (A) 497
Left: t-SNE plot of 8,992 10X scRNA-seq cells, from 6 patients; center: t-SNE plot of 15,975 nuclei from 498
10 patients; right: t-SNE plot of 568 cells from 7 patients. Cells/nuclei are colored by the presence (red) 499
or absence (black) of clonal CNVs. (B) (Top) PCA of neoplastic cells from 10X scRNA-seq of IDH-500
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
wildtype GBMs. Curves represent the density of a Gaussian mixture model fit to PC1 sample scores. 501
Heatmaps display the expression of top-loading PC1 genes across cells, sorted by PC1 sample score. 502
(C) Differentially expressed genes between PCA clusters (abs. log2 fold-change>1 and adj. p <0.001 in 503
red). (D) Fractions of cycling cells, ** : Fisher p<0.01, *** : p<0.001. (E-G) PCA and analysis as in (B-504
D), for 10X snRNA-seq of IDH-wildtype GBMs. (H) PCA of neoplastic cells from C1 scRNA-seq of IDH-505
wildtype GBMs. (I) Correlations between the C1-scRNA-seq and 10X-scRNA-seq loadings. 506
Figure 2. MGSCs and pGSCs explain the genetic and phenotypic heterogeneity of GBM. (A) 507
Expression of mGSC and pGSC markers in single cells from non-malignant human brain. (B) 508
Hierarchical clustering of Pearson correlations between mGSC, pGSC and cell-cycle genes in IDH-509
wildtype GBM RNA-seq samples from TCGA (n=144). (C-D) Heatmap (C) and boxplots (D) of the 510
relative contributions of predictor cell-types to the overall variance explained by a linear model fit to 511
each TCGA sample. (E) Kaplan-Meier analysis comparing survival of IDH-wildtype GBMs from TCGA 512
to average expression of the mGSC and pGSC gene signatures in patient-matched RNA sequencing. 513
(F-G) MGSC, pGSC and cell-cycle signatures in Ivy GAP RNA-sequencing of GBM-anatomical 514
structures. (H) Percentages of CD44/DLL3 cells also positive for CA9/Ki67; ** : Wilcoxon p<0.001. 515
Figure 3. In-silico genetic and transcriptomic lineage tracing supports a mGSC to pGSC 516
hierarchy. (A) (left) RNA velocities of cycling neoplastic 10X scRNA-seq IDH-wildtype GBM cells are 517
projected onto a PCA axis. (center) Velocyto-based lineage reconstruction identifies a stable mGSC 518
root population and pGSC terminal population. (right) Gene expression and velocity for mGSC and 519
pGSC marker genes. (B) The percent of expressed mitochondrial mutations found in a given cell, out of 520
all mitochondrial mutations in a patient’s sample. (C) Percent expressed mitochondrial mutations are 521
compared between pGSCs and mGSCs. (D) PCA analysis as in (A), but for 10X snRNA-seq IDH-522
wildtype GBM data. 523
Figure 4. (A) TSNE plot of IDH-wildtype GBM scATAC-seq annotated by the presence of clonal 524
mutations. (B) Clustering of scATAC-seq gene-body activity scores of non-neoplastic cells. (C) 525
Clustering of scATAC-seq gene-body activity scores of neoplastic cells, with boxplots of activity scores 526
for Verhaak-subtype gene sets annotate above. (D-F) Differential motif enrichment tests between 527
neoplastic cell clusters. (G) Neoplastic cluster-specific motifs and associated transcription factors. 528
Figure 5. (A) A schematic overview of the drug screen. (B) HSA synergy scores and dose responses 529
for drug combinations screened in U87 cells in vitro. 530
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
50-5-10PC1
0
10
20
PC2
Cycling
Non-cycling
pGSC
mG
SC
10X scRNA-seq
-25 0 25 50-50tSNE 1
-50
-25
025
50tS
NE
210X scRNA-seq
Cells without clonal mutationsCells with clonal mutations
−50
−25
0
25
50
-25 0 25 50tSNE1
-50
10X snRNA-seq
−15 −10 −5 0 5 10 15
−15
−10
−50
510
15
tSNE1
C1 scRNA-seq
OligodendrocytesAstrocytes Endothelial cellsMyeloid
A
B
***
% cycling
Proneural
Mesenchymal
0 5 10 15 20
0
10
20
PC2
-10
Cycling
Non-cycling
pGSC
mG
SC
50-5-10 1510
10X snRNA-seq
MKI
67TO
P2A
CEN
PFPR
C1
UBE
2TTP
X2C
HL1
APO
DTN
RN
TRK2
ALD
OC
NEA
T1N
TM
PC2
scor
e
PDGFRAOLIG2SOX10ASCL1DLL3BCANPTGDSCD99FN1EGFRVIMCHI3L1CD44
PC1 score
Z sc
ore
-22
−150 −100 −50 0 50 100
−50
050
100
PC1
PC2
C1 PCA
−0.1 0.0 0.1Loadings PC1 10x
−0.4
0.0
0.4
Load
ings
PC
1 C
1
MKI67OLIG2
PDGFRA
CCND1
DLL3
CHI3L1CD44
PCC=0.68P value=1.2e-09
MKI
67M
KI67
TOP2
AC
ENPF
PRC
1U
BE2T
TPX2
CH
L1AP
OD
TNR
NTR
K2AL
DO
CN
EAT1AT
1A NTM
Z sc
ore
-22
ND
RG
1N
RN
1N
RXN
1M
YT1
PLPP
R1
VEG
FAC
NP
CEN
PEC
ENPF
MKI
67C
ENPK
TOP2
A
PC2
scor
e
C
D
E F
G
H
I
***
% cycling
Proneural
Mesenchymal
0 10 20 30
Log2 FC-2 -1 0 1
-Log
10(p
val)
100
200
0
300
-3 2
CD44GFAP
CHI3L1OLIG2
DLL3
OLIG1SOX10
TNCBCAN
-Log
10(p
val)
100
200
0
PDGFRA
CD44
OLIG2
CDK1MKI67
GFAP
CHI3L1
SOD2
Log2 FC-2 -1 0 1 2 3
300
Figure 1Research.
on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
A B−0.8 0.8Cell-cycleMesenchymalProneural
ScRNA-seq derived signatures:
OLIG2PDGFRA
MKI67
TOP2A
DLL3
CCNB1
CHI3L1CD44
CD99
TNR
IGFBP7
PCC
DC
z-score
1-1
BCAN
CHI3L1CLUGFAPSPARCL1AQP4NAMPTCD44TNCC8orf4CD99SMOC1PDGFRATNROLIG1OLIG2
CCND1
NSCs AstrocytesOPCs Oligoden-drocytes
mG
SC m
arke
rspG
SCm
arke
rs
−20
12
−11
23
45
−10
12
3
−20
12
3
−20
2
−10
23
−10
12
34
−20
12
34
pGSC Myeloid EndithelialmGSC
1
Astrocyte Oligodendrocyte Neuron T-cell
z-sc
ore
of im
porta
nce
(lmg)
z-sc
ore
of im
porta
nce
(lmg)
-1
0
-1-1
−1 1ClassicalMesenchymalProneural Verhaak subtypes: z-score
mGSCAstrocyte
pGSCOligodendrocyte
Endothelial Myeloid
High myeloidinfiltration
High pGSC
High mGSC
High Astro-diffintermed. mGSC
High mGSCand Astro-diff.
mGSC signature genespGSC signature genes−0
.50.
51.
0
−0.5
0.5
1.5
2.5
NSCsOPCs Astro Oligo NSCsOPCs Astro Oligo
0
0 20 40 60 80
0.0
0.2
0.4
0.6
0.8
1.0
Months
Ove
rall s
urvi
val
p=0.4951
pGSC signaturehistology GBM, IDH-WT
High expr.Low expr.
0.0
0.2
0.4
0.6
0.8
1.0
Months0 20 40 60 80
Ove
rall s
urvi
val
High expr.Low expr.
p=0.001906
mGSC signaturehistology GBM, IDH-WT
-1.0 1.0z-score
PDGFRA TNROLIG2 BCANAPOD STMN2SCG3 MKI67BIRC5 CENPFUBE2C TOP2ANUSAP1 GTSE1PRC1 TPX2NEAT1 SOD2NMB CHI3L1CD99 CD44VIM
Cell cyclemGSCpGSC
Cellular Tumor
Infiltrating Tumor
Leading Edge
Microvascular proliferation
Pseudopali-sading cells
1
0
-1
pGSC signature mGSC signatue Cell Cycle-signatue
z-sc
ore
of m
ean
expr
essi
on
*** NS, all other comparisons ***
NS, all other comparisons ***
Cellular Tumor
Infiltrating Tumor
Leading Edge
Microvascular proliferation
Pseudopali-sading cells
DLL3
CD
44
% CA9 double positive0 10 20 30 40
100µmCD
44 (D
AB)/C
A9
(Red
)
20µmDLL
3 (D
AB)/C
A9
(Red
)
20µmCD
44 (D
AB)/K
i67
(Red
)
20µmDLL
3 (D
AB)/K
i67
(Red
)
DLL3
CD
44
% KI67 double positive0 10 20 30 40 50
p<0.001
E
F
G H
Expr
essio
n z-
scor
e
Figure 2Research.
on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
PC1
PC2
Root
Terminal
Low High
Density
PDGFRADLL3 CD44 CHI3L1
0 Max
ExpressionPositiveNegative
RNA velocity
0 Max
ExpressionPositiveNegative
RNA velocity
CD44 CHI3L1DLL3 PDGFRA
PC1
PC2
PC1
PC2
Low High
Density
Terminal
High
DensityDensity
Root
PC1 High
0%
20%
40%
60%
80%
-10 0 10-20PCA 1
PCA
2
-10
0
10
-5
5
Percent of mitochondiral mutaions
0.00
0.50
0.25
0.75
proneural mesenchymal
Perc
ent o
f mito
chon
dria
l mut
atio
ns in
sam
ple
***
A
B C
D
Figure 3Research.
on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
−10
−5
0
5
10
−10 −5 0 5
Cells without clonal CNVsCells with clonal CNVs
UMAP Dim1
UM
AP D
im2
A B
C
B
−2 2
OligodendrocytesAstrocytes
Endothelial cellsMyeloid
Gene body accessibility Z score
PLP1
MAG
HLA-DRAC1QA
AQP4ALDOC
APOLD1
CLDN5
D
E F
0
100
200
300
−5 0 5
-Log
(p v
alue
)
ASCL2
ASCL1
SOX10
OLIG2
TCF21TCF4
AP4
Rfx2Rfx
Stat3
Rfx1TCF12
TWIST2
Negative: MES Positive: PN
CEBP
EBF2
BCL6
Motif enrichment fold change
NF1
G Motif enrichment of known TF motifs
0 200-log10(P value)
ASCL1ASCL2
OLIG2NeuroG2
JunBJun
Stat3 ATF3
MES
PN
Intermediate
SOX4
NF1 361
119
TEAD3
AP-1
44
351
NeuroD1
Oct-6
579
95
Top enriched de novo motifs Best match -log10(P value)
ASCL1
OLIG2
DLL3
PDGFRA
CD44
GFAP
SOD2
MET
DDR2
−2 2z-score:
0
500
1000
Gen
eac
cess
ibili
ty Verhaaksignature
MESClassicalPN
* *NS
Gen
e ac
cess
ibili
ty
−5 0 5Motif enrichment fold change
Negative: Int Positive: PN
0
100
200
300
-Log
(p v
alue
)
−10
ASCL1
OLIG2
TCF21
TCF4NeuroG2 NeuroD1
SOX10
Oct4ASCL2
SCL
MYF5MYOD
Fra1 Fra2ATF3
MAFK
NFE2L2
ATOH1
Stat30
100
200
300
-Log
(p v
alue
)
−5 0 5
Negative: Int Positive: MES
−10Motif enrichment fold change
Rfx2
Rfx1
GRE
Rfx2
ARE
NF1
Stat3SOX4
Fra1Fra2ATF3 JunB
Bach2
NFE2
NeuroG2TCF21
NeuroG2
Fra1Fra2
Fosl2
Fosl2
TGIF2
BATF
TGIF2BATF
CD44CDH9
CDH9
SOD2
Figure 4Research.
on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
PDGFRAOLIG2SOX10ASCL1DLL3BCANPTGDSCD99FN1EGFRVIMCHI3L1CD44
pGSC mGSC
Single cell-derived signatures
SP600125 JUN PNYM155 BIRC5 PNTrifluridine TYMS PNiMDK MDK MESAZD4547 FGFR3 MESICG001 WNT MESAG1478 EGFR MES
DrugGenetarget
Cell-typetarget
Candidate therapeutics
Combination drug screen in vitro
A
B
20
10
0
-10
-20
syne
rgis
ticad
ditiv
ean
tago
nist
ic
HSA synergy score0
25.33
26.55
18.15
39.01
39.97
27.78
40.38
54.34
0
5
20
0 10 40ICG001 (uM)
YM15
5 (u
M)
inhibition (%)
0
25.33
26.55
13.93
39.4
40.43
24.41
39.72
55.34
0
5
20
0 5 20AZD4547 (uM)
YM15
5 (u
M)
inhibition (%)
0
25.33
26.55
8.96
38.9
39.01
11.44
33.97
51.58
0
5
20
0 5 10iMDK (uM)
YM15
5 (u
M)
inhibition (%)
0
13.93
24.41
18.15
2.73
21.24
27.78
22.36
36.58
0
5
20
0 10 40ICG001 (uM)
AZD
4547
(uM
)
inhibition (%)
0
6.1
14.06
18.15
4.22
13.04
27.78
17.62
20.85
0
10
20
0 10 40ICG001 (uM)
AG14
78 (u
M)
inhibition (%)
Figure 5
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329
Published OnlineFirst September 25, 2019.Cancer Discov Lin Wang, Husam Babikir, Soren Muller, et al. single axis of variationThe phenotypes of proliferating glioblastoma cells reside on a
Updated version
10.1158/2159-8290.CD-19-0329doi:
Access the most recent version of this article at:
Material
Supplementary
http://cancerdiscovery.aacrjournals.org/content/suppl/2019/09/25/2159-8290.CD-19-0329.DC1
Access the most recent supplemental material at:
Manuscript
Authoredited. Author manuscripts have been peer reviewed and accepted for publication but have not yet been
E-mail alerts related to this article or journal.Sign up to receive free email-alerts
Subscriptions
Reprints and
To order reprints of this article or to subscribe to the journal, contact the AACR Publications
Permissions
Rightslink site. Click on "Request Permissions" which will take you to the Copyright Clearance Center's (CCC)
.http://cancerdiscovery.aacrjournals.org/content/early/2019/09/25/2159-8290.CD-19-0329To request permission to re-use all or part of this article, use this link
Research. on May 29, 2020. © 2019 American Association for Cancercancerdiscovery.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on September 25, 2019; DOI: 10.1158/2159-8290.CD-19-0329