online supplemental material · processing performed to test for lc contaminants in lc-replete...
TRANSCRIPT
s1
ONLINE SUPPLEMENTAL MATERIAL
Supplemental Methods include information regarding isolation of murine immune cells, flow
cytometry analysis and cell sorting, sample preparation of gene expression analysis, multi-photon
in vivo imaging, immunofluorescence staining of epidermal sheets, microarray and RNAseq
analysis, data set pre-processing and gene subtraction, sample relationship visualization,
differential gene expression, GSEA, ssGSEA, WGCNA, prediction of upstream master regulators,
overrepresentation analysis, and module-based differential expression analysis. Supplemental
Table 1, included in a separate Excel file, lists for the 19 highly preserved F→M WGCNA gene
modules, and the 2 skin-specific B6→129 WGCNA modules: the over-represented Gene
Ontology categories (Supplemental Table 1A), the putative driver genes identified on the basis of
high intra-modular connectivity (Supplemental Table 1B), and the transcription factors predicted
to act as upstream regulators (Supplemental Table 1C). Supplemental Table 2, included in a
separate Excel file, lists the clinical details for the 5 HSCT patients included in this study.
Supplemental Table 3, included in a separate Excel file, contains the results from the pre-
processing performed to test for LC contaminants in LC-replete versus LC-depleted epidermal
samples (Supplemental Table 3A), and lists the intestine- and skin-specific genes subtracted
from the analysis (Supplemental Table 3B). Supplemental Figure 1 shows the visualization of
sample relationship without gene subtraction. Supplemental Figure 2 shows the overlap between
TE gene expression profiles in the F→M and B6→129 experimental models, and the similarities
between F→M and B6→129 based WGCNA modules. Supplemental Figure 3 depicts the
analytical pipeline followed, based upon correlation network analysis and downstream validation.
Supplemental Figure 4 shows the network representation of the B6→129 skin-specific modules
MD and ME, and their overlap with the F→M skin-specific module M28. Supplemental Figure 5
shows the expression of JAG1 and DLL4 Notch ligands by LC compared to non-LC populations
in the epidermis of mice developing GVHD, and the effect of in vitro Notch blockade on the
capacity of male LC to stimulate IFN-g generation by activated MataHari T cells. Supplemental
Figure 6 shows the kinetics of LC host-to-donor turn over and the effect host LC depletion upon
TE accumulation in the epidermis in independent murine BMT models of GVHD.
s2
Supplemental Methods
Isolation of murine immune cells
a) Lymph nodes and spleens – To prepare cell suspensions from spleens and lymph nodes, the
freshly removed organs were mashed and passed through a 40 µm cell strainer; red blood cells
were removed by isotonic lysis with ammonium chloride (ACK Lysing Buffer; Lonza, UK). Cells
were re-suspended in FACS buffer (PBS, 2% FCS, 2 mM EDTA; Lonza UK) for counting and
immunolabelling.
b) Bone marrow – To isolate bone marrow cells, both epiphyses of the long bones of the hind
limbs were cut and the bone marrow was flushed out with FACS buffer. The cell suspension was
filtered through a 40 µm cell strainer and red blood cells were removed by isotonic lysis with
ammonium chloride. Cells were re-suspended in FACS buffer for counting and immunolabelling.
c) Blood – Erythrocytes were removed from whole blood samples by hypotonic lysis with distilled
water. Cells were re-suspended in FACS buffer for counting and immunolabelling.
d) Small intestine – The entire small intestine was flushed and rinsed with 40 ml of ice cold
harvest medium (PBS, 2% FCS, 1% penicillin-streptomycin; Lonza, UK), and sectioned into
~0.5 cm pieces. The intestinal pieces were incubated with detaching medium (HBSS, 5% FCS,
1% penicillin-streptomycin, 5 mM EDTA; Lonza, UK) at 37°C with shaking at 150 rpm for 60 min.
The supernatant, containing the IEL, was passed sequentially through a 100 µm and a 40 µm cell
strainer and enriched for lymphocytes by density gradient centrifugation with Ficoll PaqueTM Plus
(GE Healthcare, UK). The intestinal pieces were further incubated in digestion medium (RPMI,
2% FCS; Lonza, UK; 200 U/ml collagenase IV; LifeTechnologies, USA; 200 U/ml DNAse I;
Sigma, UK) at 37°C with shaking at 150 rpm for 60 min, dissociated and passed sequentially
through a 100 µm and a 40 µm cell strainer. The cell suspension, containing the LP cells was
enriched for lymphocytes by density gradient centrifugation with Ficoll PaqueTM Plus. Cells were
re-suspended in FACS buffer for counting and immunolabelling.
e) Skin – Epidermal and dermal immune cells were isolated from the skin in accordance with the
protocol described by Henri et al. (1), with minor modifications. Briefly, the body skin was cut into
~1x1 cm pieces, after having been shaved and the subcutaneous fat removed, and the ears were
s3
split in two parts (ventral and dorsal). The pieces of skin were incubated overnight in dispase
medium (HBSS, 2% FCS; Lonza, UK; 2.5 mg/dl dispase II; Sigma, UK) at 4°C, and the epidermal
and dermal sheets were separated and cut into ~0.5 cm fragments. The epidermal fragments
were vortexed, mashed and passed sequentially through a 100 µm and a 40 µm cell strainer to
disintegrate the remaining tissue and create a cell suspension. The dermal fragments further
incubated in digestion medium at 37°C with shaking at 150 rpm for 60 min, dissociated and
passed sequentially through a 100 µm and a 40 µm cell strainer. Cells were re-suspended in
FACS buffer for counting and immunolabelling.
Flow cytometry
Cells were plated out at up to 1x106 cells per well in a 96 well conical bottom plate and incubated
with 2.4G2 antibody at 4°C for at least 10 min to block Fc receptors. For cell surface
immunolabeling, cells were incubated with the fluorochrome-conjugated antibodies diluted in
100 μl of FACS buffer at 4°C for 20 min in the dark – CD4 (GK1.5, eBioscience, USA), CD8a (53-
6.7, BD Biosciences, Germany), CD11b (M1/70, eBioscience, USA), CD45 (30-F11, BioLegend,
USA), CD45.1 (A20, BD Biosciences, Germany), CD45.2 (104, eBioscience, USA), Thy-1.1
(HIS51, eBioscience, USA), Thy-1.2 (53-2.1, BD Biosciences, Germany), EpCAM (G8.8,
eBioscience, USA), MHC class II I-A/I-E (M5/114.15.2, eBioscience, USA), DLL4 (HMD4-1,
BioLegend, USA), JAG1 (HMJ1-29, BioLegend, USA), LPAM-1 (DATK32, BioLegend, USA),
Vβ8.3 TCR (1B3.3, BD Biosciences, Germany). When intracellular staining was required, after
having performed cell surface immunolabelling, samples were washed twice with FACS buffer,
fixed in 100 µl of fixation solution (BD Cytofix/Cytoperm solution; BD Biosciences, UK) for 15 min
at 4°C in the dark, washed twice with permeabilization solution (BD Perm/Wash™ buffer;
BD Biosciences, UK), and incubated with the fluorochrome-conjugated antibodies diluted in
100 μl of permeabilization solution at 4°C for 30 min in the dark – Active Caspase-3 (C92-605,
BD Biosciences, Germany), CD207 (eBioL31, eBioscience, USA), IFN-γ (XMG1.2, BD
Biosciences, Germany). For detection of cytokine production, cells were treated with brefeldin A
(Sigma, UK) for 2 h at 37°C, prior to immunolabelling. Samples were washed twice with FACS
buffer and re-suspended in 300 µl of FACS buffer for immediate analysis by flow cytometry. For
s4
dead cell exclusion, 2 µl of propidium iodide (LifeTechnologies, USA) was added to the unfixed
samples prior to analysis. For assessment of cell proliferation, cells were stained with
carboxyfluorescein succinimidyl ester (CellTrace™ CFSE Cell Proliferation Kit, Invitrogen, USA),
according to the manufacturer’s instructions. Multicolor flow cytometry data acquisition was done
with BD LSRFortessa and BD LSR II cell analyzers equipped with BD FACSDiva v6.2 software
(BD Biosciences, Germany). Fluorescence activated cell sorting was performed on a
BD FACSAria equipped with BD FACSDiva v5.0.3 software (BD Biosciences, Germany). All
samples were maintained at 4°C for the duration of the sort. Sort purity was accessed for all
samples and only those with purity ≥ 95% were used for RNA extraction. Cells were sorted
directly into Buffer RLT (QIAGEN, USA) with 1% 2-β-mercaptoethanol (Sigma, UK), disrupted
through vortexing at 3200 rpm for 1min, and immediately stored at -80°C until further processing.
Three biological replicates were obtained for every tissue from 3 independent experiments, each
containing a minimum of 4000 cells (pooling where necessary from multiple mice from individual
experiments). Flow cytometry data were analyzed with FlowJo X v10 (LLC, USA).
Sample preparation for gene expression analysis
RNA was extracted using the RNeasy Micro Kit (QIAGEN, USA) following the manufacturer’s
protocol. RNA yield, quality and integrity were evaluated using the RNA 6000 Pico kit on an
Agilent 2100 Bioanalyser (Agilent Technologies, USA). Only samples with a RNA Integrity
Number (RIN) above 8.0 were included in the study. For microarray studies, the Ovation Pico
WTA System V2 kit (NuGEN, USA) was used to prepare amplified cDNA from total RNA.
Spectrophotometric absorbance of the cDNA products at 260, 280 and 320 nm was determined
to assess purity; all samples had an adjusted (A260 – A320):(A280 – A320) ratio > 1.8.
Fragmentation and labeling of the cDNA samples was performed using the Encore Biotin Module
kit (NuGEN, USA), according to kit instructions, and then hybridized onto GeneChip Mouse Gene
2.0 ST arrays (Affymetrix, USA). Hybridisation was performed in a single batch per experimental
system. For RNA-seq studies, RNA samples were amplified using the SMART Seq® v4 Ultra®
Low Input RNA Kit. Paired-end sequencing libraries were prepared from the amplified cDNA
according to the Nextera® XT DNA library prep protocol, and sequenced (38 base-paired reads)
s5
using an Illumina NextSeq 500 (Illumina, USA). Microarray and RNA-seq studies were performed
in collaboration with UCL Genomics.
Multi-photon in vivo imaging
Mice were anesthetized with fentanyl-fluanisone (Hypnorm®; VetaPharma, UK) 0.4 ml/kg and
midazolam 5 mg/kg injected IP. Anesthesia was prolonged by injecting additional doses of
Hypnorm® (0.3 ml/kg) every 40-60 min. Ears were depilated using NairTM Hair Remover (Church
& Dwight, UK). Mice were then placed on a custom-made imaging platform and the ears fixed
with a double-sided tape. PBS was placed on top of the ear and a water reservoir, created with a
metal ring sealed by a coverslip, was glued on top. Mice were then placed in the thermostated
imaging chamber and movies of 22-60 min were acquired. Images were acquired with the Leica
DM6000-CFS multiphoton microscope (Leica Microsystems, UK) enclosed in a dark chamber
heated at 37°C with heated air, and the HC FLUOTAR 25x/0.95 N.A. water immersion objective
(Leica Microsystems, UK). EYFP and DsRed were excited at 950 nm. The emitted fluorescence
was acquired using a hybrid detector (HyD; Leica microsystems, UK). The second harmonic
generation, produced by the collagen present in the dermis, was visualized by setting the first
HyD channel to a 450-500 nm filter window; the EYFP fluorescence by setting the second HyD
channel to 510-565 nm and the DsRed fluorescence by setting the third HyD channel to 575-
660 nm. Raw image data were processed with Fiji (ImageJ): drift was corrected with the Fiji
plugin “Correct 3D drift” using the SHG channel as reference; a median filter (0.3 pixel) was
applied to reduce background noise and autofluorescence was removed using the command
Image Calculator to subtract crossover signal between channels:
corrected channel 1 = channel 1 – channel 2 – channel 3
corrected channel 2 = channel 2 – channel 1 – channel 3
corrected channel 3 = channel 3 – channel 1 – channel 2
Compilation images (z-stack projection and yz orthogonal view) were created with Imaris 8.3
software (Bitplane).
s6
Immunofluorescence staining of epidermal sheets
Epidermal sheets were fixed in ice-cold acetone, blocked in a 0.25% fish gelatin, 10% goat serum
solution and stained for LC with CD207 (eBioL31) (eBioscience, USA). Images were acquired
with the Leica DMI4000b fluorescence microscope (Leica Microsystems, UK), using the Leica
Application Suite Advanced Fluorescence v3.2 software (Leica Microsystems, UK). Raw image
data were processed with Fiji (ImageJ).
Microarray analysis
A total of 36 RNA samples from the B6®129 model and 61 RNA samples from the F®M model
were used for microarray analysis, representing 3 independent biological replicates for each of
the tissues/compartments, treatment conditions and time points. Hybridized arrays were scanned
with a GeneChip 3000 7G scanner (Affymetrix, USA) (1 batch for B6®129 samples; 3 batches
for F®M samples) and the image data processed to generate .cel files. Expression Console
Software, version 1.4.1 (Affymetrix, USA) was used to generate quality control statistics for each
sample; samples with a high mean absolute deviation of the residual for a chip versus all chips in
the data set (≥ data set average + 2 SD), a high mean absolute relative log expression (≥ data set
average + 2 SD) and a low area under the curve (AUC) for a receiver operator curve (ROC)
comparing the intron controls to the exon controls (≤ 0.8) were considered to have poor
hybridisation quality and excluded from the analysis (1 sample in the B6®129 data set; 3
samples in de F®M data set). Raw sample expression signals were background subtracted,
quantile normalized, and the probe level data were summarized using the Robust Multi-array
Average algorithm (2, 3) implemented in the oligo BioConductor R package (4). The ComBat
algorithm from the sva BioConductor R package (5) was employed to adjust for batch effects.
Transcripts identified through multiple probes were collapsed based on maximum expression
values using the CollapseDataset module of GenePattern software (Broad Institute) (6).
Data set pre-processing and gene subtraction
Pre-processing of the data sets was performed to test for LC contamination in the microarray
samples used in experiments shown in Fig. 8. We found no significant difference between LC-
s7
replete and LC-depleted epidermal samples in terms of the levels of expression of LC signature
genes (7) (Supplemental Table 3A). To exclude any effect of tissue-specific transcripts derived
from non-T cell contamination of individual samples, transcripts identified as highly specific for the
intestine and skin in the PaGenBase database [specificity measure (PMS) > 0.9] (8) were
subtracted from the entire dataset prior to analysis (Supplemental Table 3B).
RNAseq analysis
FASTQ Toolkit, version 1.0.0 (9), was used for adapter trimming of the reads. Alignment and
mapping of all libraries were performed using TopHat Alignment, version 1.0.0 and Cufflinks
Assembly & DE, version 2.0.0, selecting Homo sapiens hg38 RefSeq gene annotations. Gene
expression levels were calculated using the Cufflinks Assembly & DE, version 2.0.0, employing a
geometric library normalization method and a fragment bias correction algorithm.
Samples relationship visualization
Multivariate statistical analysis methods implemented in the stats R package, in particular
multidimensional scaling, were applied to perform dimensionality reduction of the datasets and
visualization of the samples relationships. Similarities between groups were evaluated by
hierarchical clustering, using average-linkage upon the Pearson’s correlation-based dissimilarity
matrix; the validity and stability of the clusters was assessed using a non-parametric bootstrap as
implemented in the pvclust R package (10). Additional hierarchical clustering and heat maps
were produced using the matrix visualization and analysis platform GENE-E (Broad Institute,
USA).
Differential gene expression
The limma BioConductor R package was used to perform analyses of gene differential
expression, using an empirical Bayes moderated t-statistic corrected for multiple hypothesis
testing using Benjamini-Hochberg procedure, with a cut-off of false discovery rate (FDR) ≤ 0.05,
and an absolute fold-change cut-off ≥ 2.0.
s8
Gene set enrichment analysis (GSEA)
Gene Set Enrichment Analysis was performed using the GSEA software with the gene sets
derived from the Biological Process Ontology database (C5) defined by the Gene Ontology
Consortium, collected in the Molecular Signatures Database (MSigDB v5.1), and the gene sets
derived from the modules identified by WGCNA in the F→M model dataset.
Single-sample gene set enrichment analysis (ssGSEA)
ssGSEA was performed using the GSVA R package employing the “ssgsea” method (11), as
described by Hänzelmann et al. (12). The Tc1 gene signature was derived from Best et al. data
set (13) as the top 100 over-expressed genes (fold change ≥ 1.5 and FDR ≤ 0.05) in effector OT-I
cells on day 6 post Lm-OVA or VSV-OVA infection in comparison to naïve OT-I cells; the Tc17
gene signature was derived from Gartlan et al. (14) data set as the differentially expressed genes
(fold change ≥ 1.5 and FDR ≤ 0.05) between CD8+YFP+ and CD8+YFPneg T-cells 7 days after
allogeneic transplant; the MDR1+ Th1/Th17 gene signature corresponds to the ex vivo
transcriptional signature of MDR1+ Th17.1 cells isolated from involved Crohn’s disease patient
tissue published by Ramesh et al. (15) (Supplemental Table 4).
Weighted gene co-expression network analysis (WGCNA)
Scale-free network topology analysis of microarray expression data was performed using the
WGCNA R package, as previously described (16, 17). A signed hybrid weighted correlation
network was constructed using a Pearson correlation matrix created from the pairwise
comparison between all pairs of genes, and a soft thresholding power β=8. The topological
overlap was calculated as a measure of network interconnectedness, and genes were grouped
by average linkage hierarchical clustering on the basis of the topological overlap dissimilarity (1-
topological overlap). Module eigengenes were calculated using a dynamic tree-cutting algorithm
and merging threshold function at 0.25. The modules identified were correlated to the sample
traits using a binary vector representation of the tissues of origin and study groups. To validate
the microarray analysis, preservation of the WGCNA modules identified in the F→M model
dataset was tested against the dataset of the B6→129 model, using the R function
s9
“modulePreservation” in the WGCNA R package, as previously described (18). Results were
interpreted according to the following thresholds for Zsummary: if Zsummary≥10, module “strongly
preserved”; if Zsummary ≥2 and <10, module “weak to moderately preserved”; if Zsummary<2, module
“not preserved”. Significance of the Zsummary scores was calculated by gene permutation testing.
Intramodular hub genes, which are genes that have the highest number of connections within a
module, were identified on the basis of having eigengene-based connectivity (kME) > 0.8 and
gene significance (GS) > 0.2. Visualisation of the modules network of gene connections was
accomplished with the Cytoscape v3.5 software (19).
Prediction of upstream master regulators
The Cytoscape plugin iRegulon was used to predict the transcriptional regulatory network
underlying each of the modules of co-expressed genes, as described by Janky et al. (20). Briefly,
for each gene set, the regulatory sequences in 20 Kb around the transcription start site were
analyzed, and motif prediction was performed using an enrichment score threshold of 2.0, a ROC
threshold for AUC calculation of 0.03%, and a rank threshold for visualization of 5000. Candidate
transcription factors were predicted with a maximal FDR for motif similarity of 0.05.
Overrepresentation analysis
The Web-based Gene Set Analysis Toolkit (WebGestalt), a suite of tools for functional
enrichment analysis, was used to identify overrepresented KEGG pathways categories and
translate gene lists into functional profiles. Statistical significance of KEGG pathways enrichment
was calculated based on hypergeometric distribution statistics, adjusting the false discovery rate
using the Benjamini-Hochberg procedure.
Module-based differential expression analysis (modDE)
In order to quantify potentially significant associations at the level of WGCNA-derived gene
modules between tissue types and/or between conditions, we developed a gene based
association test that takes advantage of the magnitude and sign of the differential expression
effect size for each gene. We start by translating each gene’s p value into a chi-squared statistic
s10
Xi with one degree of freedom. In its simplest form, the test is designed to assess whether genes
are consistently over-expressed in a set of cases compared to controls. Therefore, the test
statistic T is the sum over all genes of the Xi, if Xi is positive, or 0, if Xi is negative. The distribution
of this statistic under the null is obtained by assuming that each gene has a 50% chance of being
over-expressed and 50% of being under-expressed. Note that this can be refined by computing
the genome-wide probability of over-expressed genes, in case that the proportion differs from
50%. With either of these assumptions, the probability of observing a number K of genes over-
expressed among the n genes in the module can be computed using a binomial distribution. For
a given K, the statistic T is chi-squared with K degrees of freedom (because it is the sum of K one
degree of freedom chi-square distributions). The probability of observing a test statistic greater
than the observed value T can then be computed using a weighted sum of probabilities, summing
over all possible values of K. The derivation is analogous if all genes are expected to be under-
expressed. If the expectation is a mixture of over and under expressed genes, the T statistic is
then summed over all genes in a manner that reflects that combination.
s11
Supplemental References
1. Henri S, Poulin LF, Tamoutounour S, Ardouin L, Guilliams M, de Bovis B, et al. CD207+
CD103+ dermal dendritic cells cross-present keratinocyte-derived antigens irrespective of
the presence of Langerhans cells. J Exp Med. 2010;207(1):189-206.
2. Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, and Speed TP. Summaries of
Affymetrix GeneChip probe level data. Nucleic Acids Res. 2003;31(4):e15.
3. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, et al.
Exploration, normalization, and summaries of high density oligonucleotide array probe level
data. Biostatistics. 2003;4(2):249-64.
4. Carvalho BS, and Irizarry RA. A framework for oligonucleotide microarray preprocessing.
Bioinformatics. 2010;26(19):2363-7.
5. Leek JT, Johnson WE, Parker HS, Jaffe AE, and Storey JD. The sva package for removing
batch effects and other unwanted variation in high-throughput experiments. Bioinformatics.
2012;28(6):882-3.
6. Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, and Mesirov JP. GenePattern 2.0. Nat
Genet. 2006;38(5):500-1.
7. Artyomov MN, Munk A, Gorvel L, Korenfeld D, Cella M, Tung T, et al. Modular expression
analysis reveals functional conservation between human Langerhans cells and mouse cross-
priming dendritic cells. J Exp Med. 2015;212(5):743-57.
8. Pan JB, Hu SC, Shi D, Cai MC, Li YB, Zou Q, et al. PaGenBase: a pattern gene database for
the global and dynamic understanding of gene function. PLoS One. 2013;8(12):e80747.
9. Cock PJ, Fields CJ, Goto N, Heuer ML, and Rice PM. The Sanger FASTQ file format for
sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res.
2010;38(6):1767-71.
10. Suzuki R, and Shimodaira H. Pvclust: an R package for assessing the uncertainty in
hierarchical clustering. Bioinformatics. 2006;22(12):1540-2.
s12
11. Barbie DA, Tamayo P, Boehm JS, Kim SY, Moody SE, Dunn IF, et al. Systematic RNA
interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature.
2009;462(7269):108-12.
12. Hanzelmann S, Castelo R, and Guinney J. GSVA: gene set variation analysis for microarray
and RNA-seq data. BMC Bioinformatics. 2013;14:7.
13. Best JA, Blair DA, Knell J, Yang E, Mayya V, Doedens A, et al. Transcriptional insights into
the CD8(+) T cell response to infection and memory T cell formation. Nat Immunol.
2013;14(4):404-12.
14. Gartlan KH, Markey KA, Varelias A, Bunting MD, Koyama M, Kuns RD, et al. Tc17 cells are
a proinflammatory, plastic lineage of pathogenic CD8+ T cells that induce GVHD without
antileukemic effects. Blood. 2015;126(13):1609-20.
15. Ramesh R, Kozhaya L, McKevitt K, Djuretic IM, Carlson TJ, Quintero MA, et al. Pro-
inflammatory human Th17 cells selectively express P-glycoprotein and are refractory to
glucocorticoids. J Exp Med. 2014;211(1):89-104.
16. Langfelder P, and Horvath S. WGCNA: an R package for weighted correlation network
analysis. BMC Bioinformatics. 2008;9:559.
17. Langfelder P, and Horvath S. Fast R Functions for Robust Correlations and Hierarchical
Clustering. J Stat Softw. 2012;46(11).
18. Langfelder P, Luo R, Oldham MC, and Horvath S. Is my network module preserved and
reproducible? PLoS Comput Biol. 2011;7(1):e1001057.
19. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a
software environment for integrated models of biomolecular interaction networks. Genome
Res. 2003;13(11):2498-504.
20. Janky R, Verfaillie A, Imrichova H, Van de Sande B, Standaert L, Christiaens V, et al.
iRegulon: from a gene list to a gene regulatory network using large motif and track
collections. PLoS Comput Biol. 2014;10(7):e1003731.
s13
Supplemental Table 1, included in a separate Excel file. Characterization of the 19 highly
preserved F→M WGCNA gene modules, and the 2 skin-specific B6→129 WGCNA modules: (A)
over-represented GO categories, (B) putative driver genes identified on the basis of high intra-
modular connectivity, and (C) transcription factors predicted to act as upstream regulators. AUC,
area under the curve; FDR, false discovery rate; kME, module eigengene-based connectivity;
NES, normalized enrichment score.
Supplemental Table 2, included in a separate Excel file. Clinical details concerning the 5
patients included in the paired-tissue RNAseq analysis. ALL, acute lymphoblastic leukemia; AML,
acute myeloid leukemia; 2ry AML, secondary acute myeloid leukemia; CR1, first complete
remission; CR2, second complete remission; DLI, donor lymphocyte infusion; Haplo,
haploidentical donor; HL, Hodgkin s lymphoma; MMF, mycophenolate mofetil; MUD, matched
unrelated donor; SIB, sibling donor; TBI, total body irradiation.
Supplemental Table 3, included in a separate Excel file. (A) Results from the pre-processing
performed to test for LC contaminants in LC-replete versus LC-depleted epidermal samples. (B)
List of the intestine- and skin-specific genes subtracted from the analysis. DT, diphtheria toxin;
FDR, false discovery rate; SD, standard deviation.
Supplemental Table 4, included in a separate Excel file. Tc1, Tc17 and MDR1+ Th1/Th17
gene signatures.
Supplemental Figure 1. MDS of SLO and TO-derived TE samples in B6→129 BMT model: no prior subtraction of skin- and gut-derived
gene sets. MDS plot showing the proximity of the complete transcriptional profiles of donor-derived CD8+ T cells isolated from different organs.
donor (SLO)
syn-BMT (SLO)
allo-BMT (SLO)
allo-BMT (TO)
Legend:
Coordinate 1
10x103
0-10x103
0
10x103
5x103
-5x103
Coord
inate
2
-20x103
Supplemental Figure 2. Spatial diversity of TE profiles is conserved between two independent experimental BMT models. (A) Average linkage hierarchical clustering of all samples from B6→129 and F→M BMT models and controls, linked to a similarity matrix with color coded Pearson’s correlation coefficients to show clustering according to experimental system (circles: B6→129; squares: F→M) and sample cohort (light grey: donor; dark grey: syn-BMT; black: allo-BMT). (B) F→M BMT module preservation in B6→129 BMT model showing the summary statistics Zsummary as a function of the module size (red line Zsummary = 2; blue line Zsummary = 10). (C) Correlation matrix depicting the association between the 22 WGCNA gene modules defined in B6→129 dataset (MA-MV) and the experimental groups (donor, syn-BMT, allo-BMT), GVHD subgroups (SLO, TO), and GVHD individual organs. Cell color and cell number indicate Pearson’s correlation coefficient and corresponding -log10(p-value), respectively. (D) Bar graphs showing the mean eigengene expression in each of the tissues, for donor a syn-BMT (MT), pan-GVHD TO (MI), organ-selective (MH, MF, MD) and tissue-specific modules (MJ, ME). (E) B6→129 BMT module preservation in F→M BMT model showing the summary statistics Zsummary as a function of the module size (red line Zsummary = 2; blue line Zsummary = 10).Bl, Blood; Der, dermis; Epi, epidermis; IEL, intraepithelial lymphocytes; LN, lymph nodez; LP, lamina propria; SLO, secondary lymphoid organs; Sp, spleen; TO, target organs.
A B
D
donor+
syn-BMTallo-BMT
SLOallo-BMTGVHD TO
T
Eige
ng. e
xpr.
-0.2-0.10.00.10.20.30.4
Sp LN Sp LN Sp Bl LP IEL Der Epi-0.20.00.20.40.6
H
F
J
D
E
-0.4-0.20.00.20.4
-0.2
0.0
0.2
0.4
-0.20.00.20.40.6
-0.4-0.20.00.20.4
-0.6-0.4-0.20.00.2
I
Dono
r
syn-
BMT
allo
-BM
T
allo
-BM
T: S
LO
allo
-BM
T: G
VHD
TO
Lym
ph n
odes
Sple
en
Bloo
d
Gut
- LP
Gut
- IE
L
Skin
- de
rmis
Skin
- ep
ider
mis
-1 10
Pearson’s correlation
ABCDEFGHIJKLMNOPQRSTUV
0.38 1.19 1.501.92 0.26 1.460.75 0.35 0.891.98 1.92 4.400.23 0.35 0.510.60 1.70 2.311.09 1.65 2.913.53 1.38 4.915.99 0.23 3.120.01 0.74 0.600.80 0.33 0.900.15 1.32 0.820.01 1.07 0.870.22 1.32 0.740.16 1.74 1.090.69 4.62 1.910.18 4.58 2.7418.0 0.38 2.544.90 2.74 10.70.80 6.47 7.951.08 2.00 3.353.07 0.51 2.62
2.00 6.062.76 0.451.46 0.260.12 4.041.36 2.440.81 4.461.75 9.370.46 2.040.01 2.410.09 0.371.59 3.680.71 0.020.01 0.700.21 0.320.06 1.030.30 2.460.66 0.830.43 1.010.30 4.690.47 3.150.75 0.973.55 0.26
0.98 0.38 0.69 1.14 0.07 8.78 0.100.85 0.76 1.06 0.05 0.93 5.26 0.270.07 0.21 2.32 1.58 3.41 1.61 0.820.58 0.01 0.44 0.12 0.25 2.20 3.930.75 0.38 0.36 0.47 0.20 0.60 15.80.54 0.06 0.40 1.32 5.20 0.61 0.251.36 0.19 0.41 0.48 2.25 0.12 3.280.43 1.75 0.16 0.42 0.26 0.02 0.140.23 0.07 0.23 0.38 0.41 0.79 1.050.18 0.29 0.20 0.16 0.11 2.99 0.261.65 0.26 0.83 0.32 0.24 0.69 0.590.55 0.21 0.72 0.12 0.41 0.35 1.320.12 0.38 0.21 0.42 8.63 0.58 0.490.31 0.31 0.29 1.80 1.02 0.15 0.610.26 2.36 1.08 0.60 0.63 0.80 0.401.61 0.19 0.59 0.82 0.08 1.00 0.510.31 0.16 0.34 0.34 0.10 0.23 0.120.06 0.39 0.19 0.12 0.08 0.48 0.280.12 0.12 0.64 1.40 0.82 1.03 0.400.05 0.01 0.85 0.71 0.63 0.95 0.330.03 0.68 0.42 0.29 0.22 0.93 0.010.98 0.97 1.30 0.75 0.06 0.12 3.55
C
BMT model B6→129 F→M
donor syn-BMTGroup
Legend:
allo-BMT
SpleenSpleen
Lymph nodesLymph nodes
SpleenSpleen
Lymph nodesLymph nodes
SpleenBloodBlood
SpleenBone marrow
Gut - LPGut - IELGut - IELGut - LP
Skin - DermisSkin - Epidermis
Skin - DermisSkin - Epidermis
-1 10
Pearson’s correlation
0
20
40
60
80
10 20 50 100 200 500 1000 2000Module size
Pres
erva
tion Z s
umm
ary
2823
14
21
2
30
1611
1031
17
6
1
29
128
3 413
20
7
927
2518
5
242615
19
22
0
10
20
30
40
50
O
H
RF
D
A
U
KJ
M
G
E
I
P
Q
S
LB
CN
VT
10 20 50 100 200 500 1000 2000Module size
Pres
erva
tion Z s
umm
ary
E
Supplemental Figure 3. Analytical pipeline. A summary of analytical process following WGCNA is shown. See Results text for details.
Gene dendrogram & Module colors
0.6
0.8
1.0
Heig
ht
Modules
WGCNA
Identify modules ofco-expressed genes
0 5 10 15 20 25
0.2
0.4
0.6
0.8
bisque4 cor=0.72, p=5.7e−72
Connectivity
Gen
e Si
gnifi
canc
e
Connectivity0 20 40 60 80
0.0
0.1
0.2
0.3
0.4
0.5
0.6 green cor=0.067, p=0.056
Gen
e Si
gnifi
canc
e
0 5 10 15
0.0
0.1
0.2
0.3
0.4
darkolivegreen cor=−0.13, p=0.13
Connectivity
Gen
e Si
gnifi
canc
e
0 5 10 15 20 25
0.0
0.2
0.4
0.6
orange cor=0.39, p=2.9e−12
Connectivity
Gen
e Si
gnifi
canc
e
Relate modulesto array information
Find key driversin interesting modules
iRegulon
Detect master regulons
Predict directTF-target interactions
Whole-transcriptomemicroarray analysis
Murine aGVHD model
WebGestalt
250
5
Notch signaling
Osteoclast differentiation
Cell cycle
MAPK signaling
RIG-I-like receptor signaling
Glycerophospholipid metabolism
B cell receptor signaling
TGF-beta signaling
Glycerolipid metabolism
Adherens junction
Apoptosis
Adipocytokine signaling
NOD-like receptor signaling
PPAR signaling
Cytosolic DNA-sensing
Long-term potentiation
Melanogenesis
Toll-like receptor signaling
T cell receptor signaling
Neurotrophin signaling
0 50 100 150 200
0 1 2 3 4
Ratio of enrichment
FDR [-log10(q-value)]
Ratio of enrichment
FDR
Determineenrichment forbiological pathways
Functional perturbation
to test module function
in vivo
Evaluate module preservationin independent data sets
Supplemental Figure 4. Module M28 (defined in the F→M BMT model) overlaps with 2 epidermis-specific modules (MD and ME) independently identified in the B6→129 BMT model. (A) Cytoscape generated visualization of the network connections among the 100 most connected genes in B6→129 MD and ME. Nodes represent the genes (circle area proportional to the intra-modular connectivity, kME) and the color reflects the FDR q-value of its correlation with the module; edges represent the topological overlap between genes (line thickness proportional to adjacency). Driver genes common to F→M M28 are highlighted in bold. (B) Graphs showing the ratio of enrichment (bars) and FDR q-values (line) for KEGG pathways predicted by WebGestalt to regulate MD and ME. Pathways common to F→M M28 are highlighted in bold. (C) Graph showing module association with immunossupressive therapy resistance assessed by determining the over-representation of a gene signature specific for a human MDR1+ Th1/Th17 subset that is resistant to glucocorticoids. Hypergeometric test.FDR, false discovery rate; kME, intra-modular connectivity.
B
Osteoclast differentiation
Wnt siganling pathwayNotch signaling pathway
Cell cycle
MAPK signaling pathway
1500 50 100Ratio of enrichment
60 2 4FDR [-log10(q-value)]
Module D
1500 50 100Ratio of enrichment
Osteoclast differentiation
Toll-like receptor signaling
MAPK signaling
Jak-STAT signalingB cell receptor signaling
TGF-beta signaling
T cell receptor signaling
Cell cycle
Chemokine signalingAdipocytokine signaling
Adherens junction
ErbB signalingNeurotrophin signaling
Notch signaling
Cytosolic DNA-sensing
NOD-like receptor signaling
RIG-I-like receptor signaling
Long-term potentiation
150 5 10FDR [-log10(q-value)]
Ratio of enrichment FDR
Module E
-log1
0(p-
valu
e)
Modules AB
CD
EF
GH
IJ
KL
MN
OP
QR
ST
UV
0
1
2
3C MDR1+ Th17.1 signature
Module E
Rara
Egln3
Ski
Ncf4
Sorl1
Styk1
Dscam
Fhod1
Nckap1
Tiam1
Trpv2
Ifitm3
Nfatc2
Swap70
Efemp2
Pvr
Mapkapk3
Sit1
Soat1
Nek7
Gpr68
Isg20
Rfx5
Ubash3a
Abcc1
Cep170b
Ebpl
Ccr8
Ly6g5b
Adgrg3
Cd101
Il13
Cdh1
Rasgrp4
Tjp1
Qpct
Gdpd5
Sema6d
Pld3
Lmna
Fam20a
Csf1
Lta
Rhbdf2
Dkkl1
Athl1
Itsn1
Cmklr1
Cd44
Pag1
Trpm6
Ptgfrn
Ptafr
Fndc3a
Anxa1
Dnase1l1
Pxn
P2ry14
Fut7
Rbpj
Slc9a9
Itpripl2
Ifitm1
Smpd1
Med10
Plec
Capg
Ldlrad4
Fam129b
Ar
Mgat5
Hrh4
Rnf13
Rab3d
Kif3a
Ahr
Tnf
Cobll1
Myadm
Rxra
Synj2
Adgrg1
Cysltr1
A
Clic1
Cnih1
Iqgap1
Pqlc3
Tab2
Zmiz1
Dstn
Tspan2
Car5b
Cd99l2
Mif4gd
Bscl2
Atp10a
Thy1
Ccr2
Pacsin2
Hopx
Arl15
Prr13
Tjap1
Rell1
Actr1a
Golt1b
Krtcap2
Itgax
Slc2a1
Tpm4
Cd40lg
Ifng
Plp2
Ptprj
Galnt3
Vim
L1cam
Anxa2
Dennd5a
Adam8
Ptpn13
Reep5
Fam188a
Lgals3
Ahnak
Degs1
Crip1
Aim1
Lgals1
Mapkapk2
Atf6
Ctsd
Ulbp1
Klrk1
Il2ra
Id2
Crip2
Capn2
Cd82
Gna15
S100a4
Adipor2
Fam129a
Anxa5
Dgkh
Havcr2
Map2k3
Gata3
Fut8
Klrc1
Ppp1r11
Dok2
Anxa4
S100a10
Ifitm2
Prelid1
Rbms1
Serinc3
Plxdc1
Pcgf2
Raph1
Gcnt1
Diaph1
Emp1
Surf4
Ero1l
Prex1
Lmf2
Carhsp1
C3orf58
Module DCircle area = kME
Color intensity = FDR
0.60.81.0
>0.05<0.01<0.001
Supplemental Figure 5. LC express higher levels of JAG1 and DLL4 than other epidermal cell populations. (A) Notch ligand expression by the main cell populations in the epidermis post-BMT. Top - Representative flow cytometric plots showing JAG1 and DLL4 expression by LC versus CD4+/γδ T cells versus keratinocytes in the absence (BM only) or presence (BMT + T cells) of GVHD. CD8+ T cells infiltrating the epidermis in GVHD were excluded by gating. Bottom - Summary data of the difference in MFI staining between each sample and the respective isotype control (n = 2, graphs show mean ± SD; §, isotype control MFI > sample MFI). No detectable DLL1 and JAG2 expression was found on any epidermal population (data not shown). (B) Effect of in vitro LY411575 or PBS (untreated) exposure upon IFN-γ generation by concanalavin-preactivated MataHari T cells upon interaction with male or female LC. Data derived from 6 independent experiments, graphs show mean ± SD. ****p ≤ 0.0001, two-way ANOVA with Holm-Sidak correction for multiple comparisons.BM, bone marrow; F LC, female Langerhans cells; M LC, male Langerhans cells; MFI, median fluorescence intensity.
T cells +M LC
T cells +F LC
T cells
% IF
N-γ
+
0
10
20
30
40 UntreatedLY411575
****B
IFN
-γ
CD8
T cells
T cells+
Female LC
T cells+
Male LC
Untreated LY4115750.9
0.6
0.5
0.2
21.3 1.3
MFI
Keratinocytes
LC
CD4+
&γδ T cells
coun
t
JAG1
coun
t
DLL4
BM only BM + T cells BM only BM + T cellsASampleIsotype control
BM onlyBM + T cells
JAG1
Kerat.LC CD4+
& γδT cells
DLL4
Kerat.LC CD4+
& γδT cells
0
200
400
600
800
Δ M
FI (s
ampl
e - i
soty
pe c
ontro
l)
§ §§
Supplemental Figure 6. Host-derived LC remain the dominant LC population in the epidermis in the first two weeks post-BMT and their
presence is required for epidermal TE accumulation. (A) Kinetics of replacement of host-derived LC by donor-derived LC after BMT.
Graphs showing the relative proportion (top) and absolute number (bottom) of host- and donor-derived LC over the first 4 weeks post-transplant. Baseline (n = 3), week 1 (n = 7), week 2 (n = 7), week 3 (n = 8), week 4 (n = 8). Lines indicate mean (top graph) and geometric mean (bottom graph). (B) Effect of presence or absence of host LC, upon CD8+ TE accumulation in the epidermis in the B6→129 BMT model (left) and CD4+ TE Marilyn accumulation in the F→M BMT model (right). B6→129, n = 8-9/group; Marilyn F→M, n = 6/group. Bars indicate mean + SD. *p ≤ 0.05, two-tailed Mann-Whitney test. LC, Langerhans cells.
ARecipient LC
Donor LC
0
20
40
60
80
100
% L
ange
rin+ E
pCAM
+ cel
ls
B
0 1 2 3 4
102
103
104
105
106
Weeks
Lang
erin
+ EpC
AM+ c
ells
/ g
0
B6 → 129
PBS DT0
20
40
60
80
% D
onor
CD
4+ T c
ells *
MarilynF → M
0
1
2
3
% D
onor
CD
8+ T c
ells
*
PBS DT