
Fast and Robust Deconvolution of Tumor Infiltrating Lymphocyte from Expression Profiles using Least Trimmed Squares

Yuning Hao 1, Ming Yan 2,3, Yu L. Lei 4,5∗ and Yuying Xie 1,2∗

1 Department of Statistics and Probability, Michigan State University, East Lansing, 48824, U.S.A.
2 Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, 48824, U.S.A.
3 Department of Mathematics, Michigan State University, East Lansing, 48824, U.S.A.
4 Department of Periodontics and Oral Medicine, University of Michigan School of Dentistry, Ann Arbor, 48109, U.S.A.
5 University of Michigan Comprehensive Cancer Center, Ann Arbor, 48109, U.S.A.

ARTICLE HISTORY
Compiled June 28, 2018

ABSTRACT
Gene-expression deconvolution is used to quantify different types of cells in a mixed population. It provides a highly promising solution to rapidly characterize the tumor-infiltrating immune landscape and identify cold cancers. However, a major challenge is that gene-expression data are frequently contaminated by many outliers that decrease the estimation accuracy. Thus, it is imperative to develop a robust deconvolution method that automatically decontaminates data by reliably detecting and removing outliers. We developed a new machine learning tool, Fast And Robust DEconvolution of Expression Profiles (FARDEEP), to enumerate immune cell subsets from whole tumor tissue samples. To reduce noise in the tumor gene expression datasets, FARDEEP utilizes an adaptive least trimmed square to automatically detect and remove outliers before estimating the cell compositions. We show that FARDEEP is less susceptible to outliers and returns a better estimation of coefficients than the existing methods with both numerical simulations and real datasets. FARDEEP provides the absolute quantitation of each immune cell subset in addition to relative percentages. Hence, FARDEEP represents a novel robust algorithm to complement the existing toolkit for the characterization of the tissue-infiltrating immune cell landscape. The source code for FARDEEP as implemented in R is available for download at https://goo.gl/SqGKuo.

1. Introduction

The immune system constitutes a primary host defense mechanism against malignancy. Somatic mutations can be detected by surveying antigen-presenting cells (APC), which facilitate the expansion of tumor-specific effector T cells. But these effectors rapidly become functionally exhausted upon entry into the immunosuppressive tumor microenvironment (TME), with high expression levels of the inhibitory immune checkpoint receptors (ICR). Blockade of ICR has generated broad enthusiasm in the clinics thanks to its potential in reinvigorating the pre-existing T-cell pool, but the majority of cancers are hypoimmunogenic with coordinated mechanisms to exclude APC and effectors, which drives resistance to ICR blockade immunotherapy. Thus, combinatorial strategies are essential to prime the immune system and sensitize cold cancers to ICR blockade. The success of these emerging strategies depends on the precise identification of cold cancers and selection of personalized treatment protocols.

Immunogenomics presents an unprecedented opportunity to stratify patients based on cancer immune infiltrates and complement the current TNM staging system. Multiple clinical studies highlight the prognostic significance of certain immune cell subsets in the TME (Balermpas et al., 2014, 2016; Nguyen et al., 2016; Pages et al., 2009; Wolf et al., 2015; Lei et al., 2016). Immunohistochemistry (IHC)-based immunoscores, which quantify the number of CD8+ cytotoxic T lymphocytes and CD45RO+ memory T cells, show better prognostic potential than conventional pathological methods in colon cancer patients (Galon et al., 2006; Mlecnik et al., 2016). Hence, harnessing the composition of intra-tumoral immune cell infiltration is a highly promising approach to identify cold tumors. The current IHC immunoscoring approach has two limitations. First, the interpretation of immune cell subsets varies among pathologists and institutions, so the scoring practice lacks a consistent standard. Second, only a limited number of biomarkers can be assessed simultaneously, which prevents a comprehensive annotation of the immune contexture in the TME. Hence, robust methods for genome data-informed cell type quantitation are urgently needed. A significant technical obstacle is that the efficacy and accuracy of gene expression deconvolution are limited by the large number of outliers, which are frequently observed in tumor gene expression datasets (Jiang et al., 2004). The first step towards enhancing gene deconvolution algorithms overall is to improve methods for outlier identification and processing. In this study, we report a novel FAst and Robust DEconvolution of Expression Profiles (FARDEEP) method that significantly improves the estimation of coefficients.

∗To whom correspondence should be addressed. Yuying Xie; Email: [email protected]. Yu L Lei; Email: [email protected]

bioRxiv preprint doi: https://doi.org/10.1101/358366; this version posted June 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Let $y_i$ be the observed expression value for the $i$th gene, and let $x_i$, a $p$-dimensional vector, be the expected expression of the $i$th gene for the $p$ different cell types. The gene-expression deconvolution problem can be formulated as equation (1.1),

$$y_i = x_i'\beta + \varepsilon_i, \tag{1.1}$$

where $\beta \in \mathbb{R}^p$ is an unknown parameter corresponding to the compositions of the $p$ cell types, and $\varepsilon_i$ is a noise term with a mean of 0. Several methods have been proposed to solve this deconvolution problem. To enforce the non-negativity of $\beta$, several groups have proposed using the Non-Negative Least Square (NNLS) and Non-negative Maximum Likelihood (NNML) frameworks to solve (1.1) (Lawson and Hanson, 1995; Gong et al., 2011; Gong and Szustakowski, 2013; Li et al., 2016; Mackey et al., 1996; Qiao et al., 2012). Additionally, the gene expression of each cell may vary depending on its microenvironment and other factors, which will lead to a biased estimation. To address this issue, Microarray Microdissection with Analysis of Differences (MMAD) incorporates the concept of the effective RNA fraction, and estimates coefficients using a maximum likelihood approach (Liebner et al., 2014). To further adapt deconvolution to high-dimensional settings, Altboum et al. (2014) proposed a penalized regression framework, Digital Cell Quantifier (DCQ), to encourage sparsity for the estimated $\beta$ using the elastic net (Zou and Hastie, 2005). CIBERSORT introduces ν-support vector regression (ν-SVR) to enhance the robustness of gene expression deconvolution. The method performs a regression by finding a hyperplane that fits as many data points as possible within a tube whose vertical length is a constant ε (Newman et al., 2015). The ε-tube provides a region in which estimation errors are ignored. However, CIBERSORT does not include an intercept in the model to capture contributions of other cell types. Additionally, to increase the computational efficiency, CIBERSORT applies z-normalization to the data before fitting the regression, which introduces estimation bias. Discussion of the effect of Z-score normalization for CIBERSORT is included in the Supplementary material. The number of effector immune cells is a critical factor underpinning the efficacy of immune killing. Hence, a robust method that determines both the distribution and absolute volume of tumor-infiltrating lymphocytes (TILs) will substantially improve the current gene deconvolution pipeline.

To handle contaminated gene expression data and provide absolute cell abundance estimation, we developed a robust method based on the Least Trimmed Square (LTS) framework (Rousseeuw, 1984; Rousseeuw and Leroy, 1987). LTS finds the $h$ observations with the smallest residuals, and the estimator $\hat{\beta}$ is the least squares fit over these $h$ observations. LTS is an NP-hard problem, and Rousseeuw and Driessen (2006) proposed a stochastic FAST-LTS algorithm. Nevertheless, FAST-LTS may give a suboptimal fit and becomes much slower as the sample size and the number of variables grow, since its accuracy relies on the initial random $h$-subsets and the number of initial subsets. When $n$ is the sample size and $p$ is the number of coefficients, $h$ is suggested to be the smallest integer that is not less than $(n + p + 1)/2$, to remove as many outliers as possible while keeping an unbiased result. Using the information of only half of the data reduces the power of the estimator, because the number of outliers in real data cannot be presumed and can be small. Xu et al. (2017) proposed an adaptive least trimmed square that does not depend on the randomness of the initial subset but has only been applied to binary datasets. In this study, we extend the adaptive least trimmed square to introduce a model-free and efficient method, which can find the number of outliers automatically based on LTS. As evidence of high fidelity and robustness, the performance of FARDEEP exceeds that of the existing tools in simulated and real world datasets.
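For reference, the standard LTS criterion described in this paragraph can be written out as follows. This is a sketch in our own notation rather than a formula taken from the text: $r_i(\beta)$ and the ordered squared residuals $r^2_{(i)}$ are symbols introduced here for illustration, and the numbers in the worked example simply borrow the dimensions of the LM22 signature matrix (547 genes, 22 cell types) mentioned later in the paper.

```latex
% Residual of observation i for a candidate coefficient vector beta
r_i(\beta) = y_i - x_i'\beta, \qquad i = 1, \dots, n.

% LTS keeps only the h smallest squared residuals
\hat{\beta}_{\mathrm{LTS}}
  = \arg\min_{\beta} \sum_{i=1}^{h} r^2_{(i)}(\beta),
\qquad r^2_{(1)}(\beta) \le r^2_{(2)}(\beta) \le \dots \le r^2_{(n)}(\beta).

% Suggested subset size and an illustrative evaluation
h = \left\lceil \frac{n + p + 1}{2} \right\rceil,
\qquad n = 547,\ p = 22 \;\Rightarrow\; h = \left\lceil \frac{570}{2} \right\rceil = 285.
```

With these illustrative dimensions, 262 of the 547 observations would be trimmed regardless of how many are truly contaminated, which is the loss of power that motivates an adaptive choice of the trimming level.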

2. MATERIALS AND METHODS

2.1. Model formulation

The usual linear deconvolution model can be expressed as below,

$$y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I),$$

where $y \in \mathbb{R}^q$ is the observed expression data for $q$ immune subset signature genes, $X \in \mathbb{R}^{q \times p}$ denotes a mean gene expression signature matrix for $p$ different cell types, $\beta \in \mathbb{R}^p$ contains each unknown cell type abundance, and $\varepsilon \in \mathbb{R}^q$ is a vector of random errors. To incorporate outliers, we propose the following model

$$y = X\beta + \tau + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I), \tag{2.1}$$

where the parameter $\tau = (\tau_1, \dots, \tau_q)'$ is a sparse vector in $\mathbb{R}^q$ with $\tau_i \neq 0$ indicating that the $i$th gene is an outlier.

Under the formulation of (2.1), let $\hat{\beta}_{ols} = (X^\top X)^{-1} X^\top y$ be the Ordinary Least Square (OLS) estimate and $H = X (X^\top X)^{-1} X^\top$ be the projection matrix. The residuals $r = (r_1, \dots, r_q)$ using OLS can be formulated as

$$r = y - X\hat{\beta}_{ols} \sim N\big((I - H)\tau,\ \sigma^2 (I - H)\big). \tag{2.2}$$
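The distribution in (2.2) follows in a few lines from the definitions above. The derivation below is a sketch that assumes $X$ has full column rank, so that $(X^\top X)^{-1}$ exists; it uses only the symbols already defined in this section.

```latex
% Residuals are the part of y outside the column space of X
r = y - X\hat{\beta}_{ols} = y - Hy = (I - H)\,y
  = (I - H)(X\beta + \tau + \varepsilon).

% (I - H) annihilates X
(I - H)X = X - X (X^\top X)^{-1} X^\top X = 0
\;\Rightarrow\;
r = (I - H)\tau + (I - H)\varepsilon.

% Mean and covariance, using that I - H is symmetric and idempotent
\mathbb{E}[r] = (I - H)\tau,
\qquad
\operatorname{Cov}(r) = (I - H)\,\sigma^2 I\,(I - H)^\top = \sigma^2 (I - H).
```

Because $r$ is a linear transformation of the Gaussian vector $\varepsilon$ plus a constant, it is itself Gaussian with this mean and covariance, which is the statement in (2.2).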


2.2. Adaptive least trimmed square

From (2.2), the residuals $r_i$ with the corresponding $\tau_i \neq 0$ would deviate from zero, which suggests that the set of outliers can be identified through thresholding as follows

$$E = \{ i : |r_i| > k \times r_{med} \}, \tag{2.3}$$

where $E$ is the set of detected outliers, $k$ is a tuning parameter controlling the sensitivity of the model, and $r_{med}$ is the median of $\{|r_i|\}_{i=1}^{q}$. We denote the number of elements in set $E$ as $|E|$ and let $N$ be the number of true outliers in the data. First, we can use least squares and formula (2.3) to obtain a rough estimate of $E$, denoted as $\hat{E}$. Let the cardinality of $\hat{E}$ be $\hat{N}$. Since the model at this moment is inaccurate with contamination of outliers, $\hat{N}$ is an overestimation of $N$, which can be used to get an underestimate via $\underline{N} = \alpha_1 \hat{N}^{(1)}$ with $\alpha_1 \in (0, 1)$. With $\underline{N}$, we can then update the least squares fit after removing the $\underline{N}$ samples with the largest absolute residuals and obtain an improved estimate of $E$ and the corresponding $\hat{N}$. We can improve the model by repeating the procedure, but we need to increase the underestimate of outliers, $\underline{N}$, by a factor of $\alpha_2$ with $\alpha_2 > 1$ at each iteration to force the convergence between $\underline{N}$ and $\hat{N}$. In sum, for the $j$th iteration, we update $\hat{N}^{(j)}$ by

$$\hat{N}^{(j)} = \begin{cases} \big|\{ i : |r_i| > r_{med} \}\big|, & j = 1, \\ \min\big( \big|\{ i : |r_i| > k \cdot r_{med} \}\big|,\ \hat{N}^{(j-1)} \big), & j \geq 2, \end{cases} \tag{2.4}$$

where the $\min(\cdot, \cdot)$ operator guarantees that $\hat{N}^{(j)}$, an overestimation of $N$, is non-increasing after each iteration. Similarly, we update $\underline{N}^{(j)}$ through

$$\underline{N}^{(j)} = \begin{cases} \lceil \alpha_1 \hat{N}^{(1)} \rceil, & \text{for } j = 1, \\ \min\{ \lceil \alpha_2 \underline{N}^{(j-1)} \rceil,\ \hat{N}^{(j)} \}, & \text{otherwise}, \end{cases} \tag{2.5}$$

where $\lceil x \rceil$ means the ceiling of $x \in \mathbb{R}$, $\alpha_1 \in (0, 1)$ is used to obtain a lower bound for $N$ in the first step, $\alpha_2 > 1$ guarantees the monotonicity of $\underline{N}^{(j)}$, and the $\min(\cdot, \cdot)$ operator makes sure that $\underline{N}^{(j)}$ does not exceed $\hat{N}^{(j)}$. Then we update $\hat{\beta}_{ols}$ after removing $\underline{N}^{(j)}$ outliers, and the algorithm stops when $\underline{N}$ and $\hat{N}$ converge.
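To make the thresholding rule and the two count updates concrete, here is a minimal Python sketch on a toy residual vector. The function names `flag_outliers` and `update_counts`, the toy residuals, and the default values `alpha1=0.1` and `alpha2=1.5` are all illustrative assumptions, not taken from the paper; the sketch also assumes the threshold $k \cdot r_{med}$ of (2.3) is applied at every iteration.

```python
import math
from statistics import median


def flag_outliers(residuals, k):
    """Indices i with |r_i| > k * r_med, the thresholding rule (2.3)."""
    abs_r = [abs(r) for r in residuals]
    r_med = median(abs_r)
    return [i for i, a in enumerate(abs_r) if a > k * r_med]


def update_counts(n_flagged, j, prev_over=None, prev_under=None,
                  alpha1=0.1, alpha2=1.5):
    """One step of the count updates (2.4)-(2.5).

    Returns (overestimate, underestimate) of the number of outliers.
    alpha1 and alpha2 defaults are hypothetical, chosen for illustration.
    """
    if j == 1:
        over = n_flagged
        under = math.ceil(alpha1 * over)
    else:
        over = min(n_flagged, prev_over)                    # non-increasing
        under = min(math.ceil(alpha2 * prev_under), over)   # capped by `over`
    return over, under


# Toy residuals: two values are far from the bulk.
residuals = [0.1, -0.2, 0.15, 5.0, -0.05, 0.3, -4.0, 0.25]
flagged = flag_outliers(residuals, k=3.0)
over1, under1 = update_counts(len(flagged), j=1)
over2, under2 = update_counts(len(flagged), j=2,
                              prev_over=over1, prev_under=under1)
print(flagged, (over1, under1), (over2, under2))
```

In this toy case the overestimate starts at 2 and the underestimate at 1; if the same two points are flagged again at the second iteration, the underestimate grows to 2 and the two counts meet, which is the stopping condition described above.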

Hence, we hereby report a novel approach, coined adaptive Least Trimmed Square (aLTS), to automatically detect and remove contaminating outliers. Our aLTS is an extension of the iterative LTS algorithm proposed by Xu et al. (2017), which is designed for binary output such as comparison between two images or videos.

2.3. FARDEEP

Because the amount of each cell type is always non-negative, we replaced the OLS regression in the aLTS procedure with non-negative least squares regression. By applying aLTS to the deconvolution model (2.1) and solving the following problem,

$$\hat{\beta} = \arg\min_{\beta} \| y - X\beta \|_2^2, \quad \text{subject to } \beta \geq 0,$$

using the Lawson-Hanson algorithm (Lawson and Hanson, 1995), we developed a robust tool, FARDEEP, for cellular deconvolution (Algorithm 1).


Algorithm 1 FAst and Robust DEconvolution of Expression Profiles

Input: $k > 0$, $0 < \alpha_1 < 1$, $\alpha_2 > 1$, $y$, $X$

Initialization: solve the following non-negative least squares problem

$$\hat{\beta}^{(0)} = \arg\min_{\beta} \| y - X\beta \|_2^2, \quad \text{subject to } \beta \geq 0,$$

and set $r^{(0)} = y - X\hat{\beta}^{(0)}$.

1: compute $\hat{N}^{(1)}$ and $\underline{N}^{(1)}$ using (2.4) and (2.5);
2: solve the non-negative least squares problem after removing the $\underline{N}^{(1)}$ observations with the largest residuals, and update $\hat{\beta}^{(1)}$, $r^{(1)} = y - X\hat{\beta}^{(1)}$;
3: repeat
4:     compute $\hat{N}^{(j)}$ and $\underline{N}^{(j)}$ using (2.4) and (2.5) ($j \neq 1$);
5:     solve the non-negative least squares problem after removing the $\underline{N}^{(j)}$ observations with the largest residuals, and update $\hat{\beta}^{(j)}$, $r^{(j)} = y - X\hat{\beta}^{(j)}$;
6: until $\underline{N} = \hat{N}$.

Output: Coefficients $\hat{\beta}$, Number of outliers $N$, Index of outliers
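The loop in Algorithm 1 can be sketched in pure Python for the simplest possible case of a single cell type ($p = 1$), where the non-negative least squares solution has the closed form $\max(0, x^\top y / x^\top x)$. This is an illustration under stated assumptions, not the authors' R implementation: the names `nnls_1d` and `fardeep_1d`, the default values of `k`, `alpha1` and `alpha2`, and the toy data are hypothetical, and the threshold $k \cdot r_{med}$ is applied at every iteration. A general-$p$ version would replace `nnls_1d` with a Lawson-Hanson solver.

```python
import math
from statistics import median


def nnls_1d(x, y):
    """Non-negative least squares for one column: max(0, <x,y>/<x,x>)."""
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    return max(0.0, sxy / sxx)


def fardeep_1d(x, y, k=3.0, alpha1=0.1, alpha2=1.5):
    """Single-cell-type sketch of Algorithm 1; returns (beta, outlier indices)."""
    beta = nnls_1d(x, y)          # initialization on all observations
    over = under = None
    j = 0
    while True:
        j += 1
        # Residuals of ALL observations under the current fit.
        abs_r = [abs(b - beta * a) for a, b in zip(x, y)]
        r_med = median(abs_r)
        n_flagged = sum(1 for r in abs_r if r > k * r_med)
        if j == 1:
            over = n_flagged
            under = math.ceil(alpha1 * over)
        else:
            over = min(n_flagged, over)
            under = min(math.ceil(alpha2 * under), over)
        # Remove the `under` observations with the largest absolute residuals.
        order = sorted(range(len(y)), key=lambda i: abs_r[i], reverse=True)
        removed = set(order[:under])
        keep = [i for i in range(len(y)) if i not in removed]
        beta = nnls_1d([x[i] for i in keep], [y[i] for i in keep])
        if under == over:         # lower and upper counts have met
            return beta, sorted(removed)


# Toy data: y is roughly 2 * x, with two grossly contaminated observations.
x = [float(v) for v in range(1, 11)]
noise = [0.01, -0.02, 0.015, -0.01, 0.02, -0.015, 0.01, -0.02, 0.015, -0.01]
y = [2.0 * a + e for a, e in zip(x, noise)]
y[2] += 15.0
y[7] += 20.0

beta_hat, outliers = fardeep_1d(x, y)
print(round(beta_hat, 3), outliers)
```

On this toy input the initial fit is pulled upward by the two contaminated points, the first pass removes only the single worst observation, and the second pass removes both, after which the slope is close to the true value of 2.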

One unique advantage of FARDEEP is that it is fast and guaranteed to converge within a finite number of steps, as summarized in the following theorem.

Theorem 1. Algorithm 1 (FARDEEP) stops in no more than $j^*$ steps, where

$$j^* = \left\lfloor \frac{-\log \alpha_1}{\log \alpha_2} \right\rfloor + 2.$$

Here $\lfloor x \rfloor$ is the largest integer that is less than or equal to $x$.

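Proof. It follows from the fact that the sequence $\{\hat{N}^{(j)}\}$ is non-increasing, and $\{\underline{N}^{(j)}\}$ is a geometrically increasing sequence that is bounded by the smallest component of $\{\hat{N}^{(j)}\}$. Specifically, assume that $j^*$ steps have been taken in FARDEEP; then $j$ has approached $j^* - 1$, and $\underline{N}^{(j)} \geq \alpha_2 \underline{N}^{(j-1)}$ for $0 \leq j \leq j^* - 1$, so

$$\hat{N}^{(0)} \geq \hat{N}^{(j^*-2)} \geq \underline{N}^{(j^*-2)} \geq \alpha_2^{\,j^*-2}\, \underline{N}^{(0)} \geq \alpha_2^{\,j^*-2}\, \alpha_1 \hat{N}^{(0)},$$

which leads to the result.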
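The bound in Theorem 1 depends only on $\alpha_1$ and $\alpha_2$, so it is easy to evaluate for candidate settings. The short sketch below does this; the function name `max_iterations` and the example parameter values are illustrative assumptions and are not defaults stated in the paper.

```python
import math


def max_iterations(alpha1, alpha2):
    """Upper bound j* = floor(-log(alpha1) / log(alpha2)) + 2 from Theorem 1."""
    if not 0.0 < alpha1 < 1.0:
        raise ValueError("alpha1 must lie in (0, 1)")
    if alpha2 <= 1.0:
        raise ValueError("alpha2 must be greater than 1")
    return math.floor(-math.log(alpha1) / math.log(alpha2)) + 2


# Hypothetical settings, chosen only to show how the bound behaves.
bound_a = max_iterations(0.1, 1.5)   # -> 7
bound_b = max_iterations(0.3, 1.2)   # -> 8
print(bound_a, bound_b)
```

The ratio of logarithms makes the trade-off visible: a smaller $\alpha_1$ (a more cautious first trim) or an $\alpha_2$ closer to 1 (slower growth of the underestimate) both increase the worst-case number of iterations, but only logarithmically.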

2.4. Parameter tuning

There are three tuning parameters in FARDEEP: $k$, $\alpha_1$, and $\alpha_2$. Since $\alpha_1$ is only used in the first iteration, a relatively small $\alpha_1$ is preferred to ensure that FARDEEP does not remove too many outliers at the first step. In practice, FARDEEP is not sensitive to different values of $\alpha_1$ and $\alpha_2$. However, $k$ controls the number of outliers in each iteration and is critical for the performance of FARDEEP. $k$ is fine-tuned on a case-by-case basis to preserve meaningful fluctuations of gene expression levels. Effects of different tuning parameters are shown in Supplementary Table S1. Cross-validation is not advised, since the test group may contain outliers that influence the accuracy of the tuning result. Instead, we applied the Bayesian Information Criterion (BIC) and assumed that the errors follow a log-normal distribution instead of a normal distribution among gene expression datasets, as suggested by Beal (2017). We define the modified BIC following the setting of She and Owen (2011):

$$\mathrm{BIC}^*(k) = m \log\left( \frac{\sum_{i=1}^{q} 1_{\{i \notin E\}} \log_2 (y_i - \hat{y}_i)^2}{m} \right) + b\,(\log(m) + 1), \tag{2.6}$$

where $E$ is the set of detected outliers, $b$ is the number of parameters and equals $N + p + 1$ with $N = |E|$ being the number of outliers, and $m$ equals $q - N$. Then, we choose the value of $k$ associated with the smallest $\mathrm{BIC}^*$.

3. RESULTS

To test the performance of FARDEEP, we compared our approach with the existing methods using numerical simulations and real datasets. We first used LM22, the signature matrix generated by Newman et al. (2015). This signature matrix contains 547 genes that can be used to accurately deconvolve 22 human hematopoietic cell types. We use the sum of squared error (SSE) and the coefficient of determination, denoted as R-squared ($R^2$), as the metrics:

$$\mathrm{SSE} = \sum_{j=1}^{p} (\beta^*_j - \hat{\beta}_j)^2,$$

$$R^2 = 1 - \sum_{j=1}^{p} (\beta^*_j - \hat{\beta}_j)^2 \Big/ \sum_{j=1}^{p} (\beta^*_j - \bar{\beta}^*)^2, \qquad \bar{\beta}^* = \frac{1}{p} \sum_{j=1}^{p} \beta^*_j,$$

where $\beta^*$ is the true amount of cells, and $\hat{\beta}$ is the estimate. Here, we use the coefficient of determination instead of the Pearson correlation because we are focusing on the estimation accuracy for the absolute, not the relative, amount, while the Pearson correlation can only measure the strength of a linear relationship. For instance, if the true amount is $\beta^* = (0.2, 0.4, 0.6)'$ and the estimate is $\hat{\beta} = (0.1, 0.2, 0.3)'$, which is not a good estimate, the Pearson correlation is exactly 1, treating it as a perfect estimate. On the other hand, the corresponding coefficient of determination is $-0.75$, indicating poor performance.
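The numerical example above can be checked directly. The sketch below computes SSE, $R^2$ as defined in this section, and the Pearson correlation for the same two vectors; the helper names `sse`, `r_squared` and `pearson` are our own and are introduced only for illustration.

```python
import math


def sse(truth, est):
    """Sum of squared errors between true and estimated coefficients."""
    return sum((t - e) ** 2 for t, e in zip(truth, est))


def r_squared(truth, est):
    """Coefficient of determination: 1 - SSE / total sum of squares of truth."""
    mean_t = sum(truth) / len(truth)
    sst = sum((t - mean_t) ** 2 for t in truth)
    return 1.0 - sse(truth, est) / sst


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


beta_true = [0.2, 0.4, 0.6]
beta_est = [0.1, 0.2, 0.3]   # every abundance underestimated by half

print(round(sse(beta_true, beta_est), 6))        # 0.14
print(round(r_squared(beta_true, beta_est), 6))  # -0.75
print(round(pearson(beta_true, beta_est), 6))    # 1.0
```

Here SSE is 0.14 against a total sum of squares of 0.08, so $R^2 = 1 - 0.14/0.08 = -0.75$, while the correlation is 1 because the estimate is an exact rescaling of the truth.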

3.1. In silico simulation based on leukocyte gene signature matrix file

We randomly generated the amount of different cells from the interval $[0, 1]$. Notably, the sum of cell abundance is not necessarily 1. The measurement errors were sampled from $2^{N(0,\,(0.1 \log_2(s))^2)}$. To incorporate outliers, we randomly selected $i/50$ of the data and replaced them with data drawn from $2^{N(10,\,(0.3 \log_2(s))^2)}$, where $i = 1, 2, \dots, 25$ and $s$ is the standard deviation of the original mixtures (Newman et al., 2015).

We repeated the procedure nine times and compared the coefficient estimates using the SSE and the coefficient of determination ($R^2$). By comparing the results of FARDEEP, CIBERSORT (without converting to percentage), NNLS, PERT and DCQ, we found that the SSE for FARDEEP ranges from $3.53 \times 10^{-8}$ to $2.92 \times 10^{-4}$ and that $R^2$ stays at 1 regardless of the number of outliers. Other methods show significantly larger SSE and smaller $R^2$ (Supplementary Table S2).


[Figure 1 plot: four panels (5%, 10%, 20% and 30% outliers); x-axis: proportion of outliers; y-axis: log2(SSE); methods: NNLS, DCQ, PERT, CIBERSORT, FARDEEP.]

Figure 1. SSE of coefficients for different approaches. We simulated different percentages of outliers ({5%, 10%, 20%, 30%}) and compared the SSE of the coefficients obtained with NNLS, DCQ, PERT, CIBERSORT and FARDEEP.

3.2. In silico simulation with heavy-tailed error

To test the robustness of FARDEEP under heavy-tailed random error, we simulated a new dataset following the setting in She and Owen (2011) and Alfons et al. (2013). The observations were generated based on the linear regression model (2.1). The predictor matrix is $X = (x_1, \dots, x_n)' = U \Sigma^{1/2}$, where $U_{ij} \sim U(0, 20)$ and $\Sigma_{ij} = \rho^{I\{i \neq j\}}$ with $\rho = 0.5$. Considering the proportion of outliers $f \in \{5\%, 10\%, 20\%, 30\%\}$, sample size $n = 500$, and number of predictors $p = 20$, we randomly added outliers to the simulated data as follows:

i) Vertical outliers: we generated a zero vector $\tau$, and randomly selected $nf$ elements in $\tau$ to be the outlier terms. Then, we replaced these elements by $nf$ samples generated from a non-central t-distribution with 1 degree of freedom and a non-centrality parameter of 30.
ii) Leverage points: we took 20% of the contaminated data in i) as leverage points, that is, we replaced the corresponding predictors by samples from $N(2\max(X), 1)$.

The coefficients $\beta_j$ and random errors $\varepsilon_i$ were sampled from $U(0, 1)$ and a t-distribution with 3 degrees of freedom, respectively, where $j = 1, \dots, p$ and $i = 1, \dots, n$. Based on the framework above, the dependent variable was obtained by

$$y = X\beta + \tau + \varepsilon.$$

Table 1. Tuned k for FARDEEP with the adjusted BIC. We simulated normally distributed error and heavy-tailed error, respectively, for different proportions of outliers and computed the true positive rate and false positive rate to evaluate the tuning result.

| Error type | Measure | 5% | 10% | 20% | 30% | 40% |
| Normal | True positive rate | 1 | 1 | 1 | 1 | 1 |
| Normal | False positive rate | 0.008 | 0.018 | 0.003 | 0.009 | 0.053 |
| Normal | Parameter (k) | 3.80 | 2.30 | 1.80 | 1.40 | 1.10 |
| Heavy-tailed | True positive rate | 1 | 1 | 1 | 1 | 1 |
| Heavy-tailed | False positive rate | 0.036 | 0.051 | 0.048 | 0.166 | 0.010 |
| Heavy-tailed | Parameter (k) | 2.80 | 2.20 | 1.40 | 1.10 | 1.10 |

(Columns give the percentage of outliers.)


[Figure 2 plot: four panels of true coefficients (y-axis) versus estimated coefficients (x-axis), both on a 0 to 1 scale. R2 values shown in the panel legends:
5% outliers: NNLS −8.47, DCQ −2.588, PERT −2.349, CIBERSORT 0.951, FARDEEP 0.984;
10% outliers: NNLS −13.684, DCQ −2.674, PERT −2.55, CIBERSORT 0.864, FARDEEP 0.983;
20% outliers: NNLS −231.998, DCQ −22.656, PERT −2.317, CIBERSORT 0.705, FARDEEP 0.98;
30% outliers: NNLS −32.343, DCQ −2.049, PERT −2.479, CIBERSORT 0.679, FARDEEP 0.974.]

Figure 2. Comparison of the accuracy of different deconvolution approaches (the values in parentheses are R2). Based on {5%, 10%, 20%, 30%} percentage of outliers, we computed R2 to evaluate how well the estimates fit the straight line $\hat{\beta} = \beta$. PERT and DCQ fail to estimate the coefficients accurately (R2 values are negative since the SSE values are too large); FARDEEP shows the best performance in each case.

We simulated each model 50 times. As shown in Figures 1 and 2, FARDEEP outperforms other methods, as evidenced by the SSE and R2 values.

To check FARDEEP's accuracy of outlier detection, we simulated {5%, 10%, 20%, 30%, 40%} outliers using the same method as above for a model with both normally distributed noise and heavy-tailed noise. Tuning the parameter k with the modified BIC in (2.6), we show that k decreases as the number of outliers becomes larger. The true positive rates always stay around 1, indicating that the tuning of k is highly effective (Table 1).

3.3. Synthetic dataset

We used the cell line dataset GSE11103 generated by Abbas et al. (2009), which contains gene expression profiles of four immune cell lines (Jurkat, IM-9, Raji, and THP-1) and four mixtures (MixA, MixB, MixC, and MixD) with various ratios of cells. Before analysis, we quantile normalized the mixture data for 54675 genes and downloaded the immune gene signature matrix with 584 genes from the CIBERSORT website. Then, we applied five deconvolution methods, including FARDEEP, CIBERSORT (without converting to percentage), DCQ, NNLS and PERT, to calculate the sum of squared errors of the estimated amount of the four immune cell lines. We also compared with CIBERSORT absolute mode, which is a beta version on the CIBERSORT website (Supplementary Figure S1). Since the CIBERSORT absolute mode is a beta version and leads to suboptimal results compared with CIBERSORT, we only focused on the comparisons with CIBERSORT. We show that FARDEEP gives the smallest SSE and the largest R2, which indicates the most accurate result (Figure 3).


[Figure 3 plot. Panel A: SSE (y-axis, 0 to 0.20) for each mixture replicate (mixA1 to mixA3, mixB1 to mixB3, mixC1 to mixC3, mixD1 to mixD3), by method: NNLS, DCQ, PERT, CIBERSORT, FARDEEP. Panel B: true cell amount (y-axis) versus estimated cell amount (x-axis) for IM-9, Jurkat, Raji and THP-1, one panel per method: NNLS (R2 = 0.297), DCQ (R2 = −1.039), PERT (R2 = 0.256), CIBERSORT (R2 = 0.347), FARDEEP (R2 = 0.776).]

Figure 3. Applying different deconvolution approaches to the gene expression data of IM-9, Jurkat, Raji, THP-1 and the mixtures of these four immune cell lines with known proportions (MixA, MixB, MixC, MixD). All of the mixtures were performed and measured in triplicate. (A) SSE of coefficients for FARDEEP, CIBERSORT, NNLS, PERT, DCQ. (B) Amount of cell lines estimated by the different deconvolution approaches vs. amount of cell lines truly mixed, with R-squared used to evaluate the performance of each method.

3.4. Synthetic dataset with added unknown content

Because both CIBERSORT and FARDEEP are robust deconvolution methods and show advantages in the above datasets, we next sought to compare their performance on mixtures with unknown content. We followed the simulation setting proposed by Newman et al. (2015) and downloaded the mixture dataset and the signature gene file from the CIBERSORT website. The mixture file was constructed from the data of the four immune cell lines mentioned in the previous section and a colon cancer cell line (average of GSM269529 and GSM269530 in GSE10650). Cancer cells were mingled into immune cells at different ratios {0%, 30%, 60%, 90%}. Noise was added by sampling from the distribution $2^{N(0,\,(f \log_2(s))^2)}$, in which $f \in \{0\%, 30\%, 60\%, 90\%\}$ and $s$ is the standard deviation of the original mixtures. By applying FARDEEP and CIBERSORT (without converting to percentage) to 64 mixtures, we found that FARDEEP retains accurate estimates, while the tumor content skews the results of CIBERSORT, with larger deviation from the ground truth (Figure 4).

[Figure 4 plot. Panel A: SSE (y-axis, 0 to 0.4) for mixA, mixB, mixC and mixD, by method: CIBERSORT, FARDEEP. Panel B: true cell amount (y-axis) versus estimated cell amount (x-axis) for IM-9, Jurkat, Raji and THP-1: CIBERSORT (R2 = −0.473), FARDEEP (R2 = 0.513).]

Figure 4. Comparing the performance of CIBERSORT and FARDEEP on mixtures with unknown content by SSE and R2. (A) SSE for various noise levels and amounts of tumor content on 4 different mixtures. (B) Amount of cell lines estimated by CIBERSORT and FARDEEP vs. amount of cell lines truly mixed.


[Figure 5 plot. Panel A: flow cytometry fraction (y-axis) versus estimated cell fraction (x-axis) for B cells, CD4 T cells and CD8 T cells: CIBERSORT (R2 = 0.913, SSE = 0.41), FARDEEP (R2 = 0.933, SSE = 0.32). Panel B: estimated proportions (Other, T cells, B cells) for samples B1, B3, B4, B5, B6, T1, T3, T4, T5, T6, shown separately for FARDEEP and CIBERSORT.]

Figure 5. Performance assessment on real data with SSE and R2. (A) Follicular lymphoma dataset with estimation of B cells, CD8 T cells, and CD4 T cells using FARDEEP and CIBERSORT. (B) Normal tonsil dataset with purified B cells and T cells.

3.5. Deconvolution performance on immune-cell-rich datasets

To evaluate the performance of FARDEEP in immune-cell-rich settings that are less affected by outliers, we downloaded and analyzed two gene expression datasets (GSE65136) from lymph node biopsy and peripheral blood used in Newman et al. (2015). In the two datasets, gene expression data were generated from Illumina BeadChip, which is the same platform used for the signature matrix LM22, and the TILs of interest have an average abundance greater than 5%. These studies contain both gene expression and flow cytometry data measuring multiple immune cell subsets and normal tissues:

(i) surgical lymph node biopsies of follicular lymphoma patients;
(ii) purified B and T cells from the tonsils of five healthy controls.

Flow cytometry data in these studies, which are treated as ground truth, are in relative scales; thus, we normalized the estimated parameters of each method to sum to 1 before comparison.

As shown in Figure 5A for case (i), FARDEEP outperformed CIBERSORT in terms of R2 and SSE, which is consistent with our findings with simulated datasets. For case (ii), we estimated the immune cell composition for purified B and T cells with purity levels exceeding 95% and 98%, respectively. For purified B cells, CIBERSORT tends to return non-zero estimates for T cells and a large proportion of other cell types, while FARDEEP gave almost all zero estimates for T cells and on average reduced the estimation error by 61%. Similarly, for the purified T cells, although CIBERSORT performed better than it did on purified B cells, FARDEEP still significantly improved the estimation accuracy, reducing the estimation error by 48% on average (Figure 5B).

[Figure 6 plot: Kaplan-Meier curves in four panels (Method: ESTIMATE, Method: CIBERSORT, Method: CIBERSORT.modified, Method: FARDEEP); x-axis: time (0 to 5000); y-axis: survival probability; groups: high.immune.score, low.immune.score. Log-rank p-values shown in the panels: 0.52, 0.37, 0.099 and 0.0065, the last being the FARDEEP panel.]

Figure 6. Kaplan-Meier survival curves plotted based on ESTIMATE-, FARDEEP- and CIBERSORT-assisted TIL profiling. Log-rank tests were applied to data for 514 patients with ovarian cancer classified into two groups according to whether the immunoscore (ESTIMATE) or the TH1/Tc1-skewed immune compartment (CIBERSORT and FARDEEP) was above or below the median.

Overall, FARDEEP and CIBERSORT are both robust methods, which generate generally comparable results in specimens that are rich in immune cells and contain minimal carcinoma cell contamination.

3.6. TCGA Ovarian Serous Cystadenocarcinoma dataset

However, the TME of solid carcinomas is far different from a lymph node biopsy or peripheral blood in that the highly variable gene expression in cancer cells challenges the accuracy of immune cell deconvolution. Hence, we next utilized survival and gene expression data (n = 514) from the ovarian cancer TCGA database to assess the prognostic relevance of different deconvolution methods. Using the gene expression data, we estimated the immunoscore using ESTIMATE, proposed by Yoshihara et al. (2013), as well as TIL abundance using CIBERSORT and FARDEEP. Cold tumors typically harbor lower numbers of CD8+ T cells, γδ T cells, M1-macrophages, and NK cells (Lei et al., 2016; Qiao et al., 2017; Binnewies et al., 2018; Corrales et al., 2016). Thus, we calculated the TH1/Tc1-skewed immune compartment as the sum of CD8+ T cells, γδ T cells, M1-macrophages, and NK cells. Then, we partitioned the patients into two groups of equal size using the median of either the immunoscore (ESTIMATE) or the TH1/Tc1-skewed immune compartment (CIBERSORT and FARDEEP). We compared the survival curves between the two groups. As shown in Figure 6, FARDEEP most effectively separates patients into high- and low-risk groups, with the smallest p-value (p-value = 0.0065). Recently, the CIBERSORT website has added a beta version of an absolute mode for cell deconvolution, which reports the absolute instead of the relative abundance of TILs. We also included the CIBERSORT absolute mode in this survival analysis and showed that it returned slightly better results (p-value = 0.037) compared to the relative mode but was still dominated by FARDEEP (Figure S2). These results demonstrate that the absolute abundances of TILs can provide additional clinically relevant information compared to the relative abundance. Additionally, these results suggest that FARDEEP can complement the current robust methods, such as ESTIMATE and CIBERSORT, for TIL deconvolution challenges in datasets.

4. DISCUSSION

Cancer immune microenvironment has emerged as a critical prognostic dimension that mod-ulates patient responses to neoadjuvant therapy. However, the current clinical TNM stagingsystem does not have a consistent method to stratify cancers based on their immunogenicity.Recent study shows that the RNA-Seq datasets of whole tumors contain valuable prognosticinformation to assess the cancer-immunity interactions (Newman et al., 2015; Robinson et al.,2017). But the current methods to extract immune signatures are susceptible to the frequentoutliers in the datasets, leading to less effective identification of cold tumors. In this study, wedeveloped a new machine learning tool, FARDEEP, to streamline the removal of outliers andincrease the robustness of gene-expression profile deconvolution. Robustness is an indispens-able feature to solve a problem of deconvolution because gene expression data are frequentlycontaminated by large amount of outliers. FARDEEP solves the deconvolution problem in arobust way because this tool evaluates all outliers across the datasets and then examines thetrue immune gene signature using non-negative regression. This feature is especially useful toanalyze tumors with significant non-hematopoietic tumor components. Interestingly, althoughFARDEEP and the current robust methods can both tackle immune-cell-rich specimens suchas lymph node and PBMCs, FARDEEP exhibits improved prognostic potential when dealingmore complex datasets with significant carcinoma cell content.

Overall, we show that FARDEEP is a powerful and rapid machine learning tool that outperforms existing robust methods for gene-expression deconvolution in datasets with significant heavy-tailed noise. FARDEEP provides a new technology to interrogate cancer immunogenomics and more accurately map the immune landscape of solid tumors.

5. ACKNOWLEDGEMENTS

This work is supported by NIH grants R03 DE027399 (YLL and YX), R01 DE026728 (YLL) and R00 DE024173 (YLL), NSF grant DMS-1621798 (MY), the Michigan State University STEM Gateway Fellowship (YX), State Key Laboratory of Oral Diseases (SKLOD) Open Fund of China (YX), University of Michigan Rogel Cancer Center Research Grant (YLL) and POM Clinical Research Supplement (YLL).

References

Abbas, A. R., Wolslegel, K., Seshasayee, D., Modrusan, Z., and Clark, H. F. (2009). Deconvolution of blood microarray data identifies cellular activation patterns in systemic lupus erythematosus. PLOS ONE, 4, e6098.

Alfons, A., Croux, C., and Gelper, S. (2013). Sparse least trimmed squares regression for analyzing high-dimensional large data sets. The Annals of Applied Statistics, 7, 226–248.

https://doi.org/10.1101/358366

Altboum, Z., Steuerman, Y., David, E., Barnett-Itzhaki, Z., Valadarsky, L., Keren-Shaul, H., Meningher, T., Mendelson, E., Mandelboim, M., Gat-Viks, I., and Amit, I. (2014). Digital cell quantification identifies global immune cell dynamics during influenza infection. Molecular Systems Biology, 10, 720.

Balermpas, P., Michel, Y., Wagenblast, J., Seitz, O., Weiss, C., Rodel, F., Rodel, C., and Fokas, E. (2014). Tumour-infiltrating lymphocytes predict response to definitive chemoradiotherapy in head and neck cancer. British Journal of Cancer, 110, 501–509.

Balermpas, P., Rodel, F., Rodel, C., Krause, M., Linge, A., Lohaus, F., Baumann, M., Tinhofer, I., Budach, V., Gkika, E., Stuschke, M., Avlar, M., Grosu, A.-L., Abdollahi, A., Debus, J., Bayer, C., Stangl, S., Belka, C., Pigorsch, S., Multhoff, G., Combs, S. E., Monnich, D., Zips, D., and Fokas, E. (2016). CD8+ tumour-infiltrating lymphocytes in relation to HPV status and clinical outcome in patients with head and neck cancer after postoperative chemoradiotherapy: A multicentre study of the German cancer consortium radiation oncology group (DKTK-ROG). International Journal of Cancer, 138, 171–181.

Beal, J. (2017). Biochemical complexity drives log-normal variation in genetic expression. Engineering Biology, 1(1), 55–60.

Binnewies, M., Roberts, E. W., Kersten, K., Chan, V., Fearon, D. F., Merad, M., Coussens, L. M., Gabrilovich, D. I., Ostrand-Rosenberg, S., Hedrick, C. C., Vonderheide, R. H., Pittet, M. J., Jain, R. K., Zou, W., Howcroft, T. K., Woodhouse, E. C., Weinberg, R. A., and Krummel, M. F. (2018). Understanding the tumor immune microenvironment (TIME) for effective therapy. Nature Medicine, 24, 541–550.

Corrales, L., McWhirter, S. M., Dubensky, T. W., Jr., and Gajewski, T. F. (2016). The host STING pathway at the interface of cancer and immunity. The Journal of Clinical Investigation, 126, 2404–2411.

Galon, J., Costes, A., Sanchez-Cabo, F., Kirilovsky, A., Mlecnik, B., Lagorce-Pages, C., Tosolini, M., Camus, M., Berger, A., Wind, P., Zinzindohoue, F., Bruneval, P., Cugnenc, P.-H., Trajanoski, Z., Fridman, W.-H., and Pages, F. (2006). Type, density, and location of immune cells within human colorectal tumors predict clinical outcome. Science, 313, 1960–1964.

Gong, T. and Szustakowski, J. (2013). DeconRNASeq: a statistical framework for deconvolution of heterogeneous tissue samples based on mRNA-Seq data. Bioinformatics, 29, 1083–1085.

Gong, T., Hartmann, N., Kohane, I. S., Brinkmann, V., Staedtler, F., Letzkus, M., Bongiovanni, S., and Szustakowski, J. D. (2011). Optimal deconvolution of transcriptional profiling data using quadratic programming with application to complex clinical blood samples. PLOS ONE, 6, e27156.

Jiang, D., Tang, C., and Zhang, A. (2004). Cluster analysis for gene expression data: A survey. IEEE Transactions on Knowledge and Data Engineering, 16, 1370–1386.

Lawson, C. L. and Hanson, R. J. (1995). Solving Least Squares Problems. Classics in Applied Mathematics. Society for Industrial and Applied Mathematics.

Lei, Y., Xie, Y., Tan, Y. S., Prince, M. E., Moyer, J. S., Nor, J., and Wolf, G. T. (2016). Telltale tumor infiltrating lymphocytes (TIL) in oral, head & neck cancer. Oral Oncology, 61, 159–165.

Li, B., Severson, E., Pignon, J.-C., Zhao, H., Li, T., Novak, J., Jiang, P., Shen, H., Aster, J. C., Rodig, S., Signoretti, S., Liu, J. S., and Liu, X. S. (2016). Comprehensive analyses of tumor immunity: implications for cancer immunotherapy. Genome Biology, 17, 174.

Liebner, D. A., Huang, K., and Parvin, J. D. (2014). Microarray microdissection with analysis of differences is a computational tool for deconvoluting cell type-specific contributions from tissue samples. Bioinformatics, 30, 682–689.

Mackey, M. D., Mackey, D. J., Higgins, H. W., and Wright, S. W. (1996). CHEMTAX - a program for estimating class abundances from chemical markers: application to HPLC measurements of phytoplankton. Marine Ecology Progress Series, 144, 265–283.

Mlecnik, B., Bindea, G., Angell, H. K., Maby, P., Angelova, M., Tougeron, D., Church, S. E., Lafontaine, L., Fischer, M., Fredriksen, T., Sasso, M., Bilocq, A. M., Kirilovsky, A., Obenauf, A. C., Hamieh, M., Berger, A., Bruneval, P., Tuech, J.-J., Sabourin, J.-C., Pessot, F. L., Mauillon, J., Rafii, A., Laurent-Puig, P., Speicher, M. R., Trajanoski, Z., Michel, P., Sesboue, R., Frebourg, T., Pages, F., Valge-Archer, V., Latouche, J.-B., and Galon, J. (2016). Integrative analyses of colorectal cancer show Immunoscore is a stronger predictor of patient survival than microsatellite instability. Immunity, 44, 698–711.

Newman, A. M., Liu, C. L., Green, M. R., Gentles, A. J., Feng, W., Xu, Y., Hoang, C. D., Diehn, M., and Alizadeh, A. A. (2015). Robust enumeration of cell subsets from tissue expression profiles. Nature Methods, 12, 453–457.

Nguyen, N., Bellile, E., Thomas, D., McHugh, J., Rozek, L., Virani, S., Peterson, L., Carey, T. E., Walline, H., Moyer, J., Spector, M., Perim, D., Prince, M., McLean, S., Bradford, C. R., Taylor, J. M. G., Wolf, G. T., Head, and Investigators, N. S. P. (2016). Tumor infiltrating lymphocytes and survival in patients with head and neck squamous cell carcinoma. Head Neck, 38, 1074–1084.

Pages, F., Kirilovsky, A., Mlecnik, B., Asslaber, M., Tosolini, M., Bindea, G., Lagorce, C., Wind, P., Marliot, F., Bruneval, P., Zatloukal, K., Trajanoski, Z., Berger, A., Fridman, W.-H., and Galon, J. (2009). In situ cytotoxic and memory T cells predict outcome in patients with early-stage colorectal cancer. Journal of Clinical Oncology, 27, 5944–5951.

Qiao, J., Tang, H., and Fu, Y. X. (2017). DNA sensing and immune responses in cancer therapy. Current Opinion in Immunology, 45, 16–20.

Qiao, W., Quon, G., Csaszar, E., Yu, M., Morris, Q., and Zandstra, P. W. (2012). PERT: A method for expression deconvolution of human blood samples from varied microenvironmental and developmental conditions. PLOS Computational Biology, 8, e1002838.

Robinson, D., Wu, Y.-M., Lonigro, R., Vats, P., Cobain, E., Everett, J., Cao, X., Rabban, E., Kumar-Sinha, C., Raymond, V., Schuetze, S., Alva, A., Siddiqui, J., Chugh, R., Worden, F., Zalupski, M. M., Innis, J., Mody, R. J., Tomlins, S. A., Lucas, D., Baker, L., Ramnath, N., Schott, A., Hayes, D., Vijai, J., Offit, K., Stoffel, E., Roberts, S., Smith, D., Kunju, L., Talpaz, M., Cieslik, M., and Chinnaiyan, A. M. (2017). Integrative clinical genomics of metastatic cancer. Nature, 548, 297–303.

Rousseeuw, P. J. (1984). Least median of squares regression. Journal of the American Statistical Association, 79, 871–880.

Rousseeuw, P. J. and Driessen, K. V. (2006). Computing LTS regression for large data sets. Data Mining and Knowledge Discovery, 12, 29–45.

Rousseeuw, P. J. and Leroy, A. M. (1987). Robust Regression and Outlier Detection. John Wiley & Sons.

She, Y. and Owen, A. B. (2011). Outlier detection using nonconvex penalized regression. Journal of the American Statistical Association, 106, 626–639.

Wolf, G. T., Chepeha, D. B., Bellile, E., Nguyen, A., Thomas, D., McHugh, J., of Michigan Head, T. U., and Program, N. S. (2015). Tumor infiltrating lymphocytes (TIL) and prognosis in oral cavity squamous carcinoma: a preliminary study. Oral Oncology, 51, 90–95.

Xu, Q., Yan, M., Huang, C., Xiong, J., Huang, Q., and Yao, Y. (2017). Exploring outliers in crowdsourced ranking for QoE. Proceedings of the 25th ACM Multimedia, to appear.

Yoshihara, K., Shahmoradgoli, M., Martínez, E., Vegesna, R., Kim, H., Torres-Garcia, W., Treviño, V., Shen, H., Laird, P. W., Levine, D. A., Carter, S. L., Getz, G., Stemke-Hale, K., Mills, G. B., and Verhaak, R. G. W. (2013). Inferring tumour purity and stromal and immune cell admixture from expression data. Nature Communications, 4, 2612.

Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67, 301–320.
