Approach for limited cell ChIP-Seq on a semiconductor-based sequencing platform


Upload: thermo-fisher-scientific

Post on 13-Aug-2015



S. Ghosh¹, K. Giorda¹, R. Marcus², M. Taylor¹, E. Farias-Hesson¹,², D. Bluestein², G. Meredith¹, S. Leach², B. O'Conner²

¹Thermo Fisher Scientific, 180 Oyster Point Blvd., South San Francisco, CA 94080; ²National Jewish Health, Center for Genes, Environment & Health, 1400 Jackson St., Denver, CO 80206

ABSTRACT

• Dendritic cell (DC) lineages coordinate immune system activity through functional specialization.

• Irf4, a transcription factor (TF), is required for CD11b+ DC lineage development from bone marrow stem cells and has been implicated in multiple inflammatory diseases, e.g. asthma.

• The epigenetic consequences of immune specialization in CD11b+ DCs, and their relation to inflammatory diseases, remain largely unexplored, partly because of the difficulty of using highly purified, and typically limited, populations of cells in ChIP-seq (chromatin immunoprecipitation followed by sequencing) assays.

• A robust, multiplexed ChIP-seq protocol, using an input control, a TF (CTCF), and histone modification marks (H3K27me3 methylation, H3K27ac acetylation), was developed for the Ion Proton™ system using limited amounts of K562 cells.

• Peak calling was performed using MACS2.

• Significant data correlations were observed with ENCODE.

• The Ion Proton™ results are based on chromatin derived from 1 million (M) cells, making the approach viable for generating data from a limited number of primary cells. This is in contrast to the 10M cells recommended by ENCODE.

• The developed methodology was used to compare Irf4 genomic binding sites generated from flow-sorted populations of 1, 3, 5, and 20M CD11b+ lineage murine DCs.

• Comparable Irf4 ChIP-seq results were obtained from 5M versus 20M cells, indicating that as few as 5M flow-sorted cells can be used to acquire high-quality (FDR: 10^-19) data.

• We identified genomic Irf4 binding sites proximal to genes whose activity is consistent with CD11b+ DC lineage activity and/or known to contribute to inflammatory disease.

• We examined Irf4 functional regulation of the identified gene targets via RNA-seq analysis with CD11b+ DCs and a related lineage, CD103+ DCs. Integrating expression analysis with ChIP-seq indicates a unique CD11b+ DC gene expression program concordant with Irf4 loci association in comparison to CD103+ DCs (data not shown).


INTRODUCTION

ChIP-seq (Fig. 1) is used to enrich and map binding sites for transcription factors, histone modification marks, and other chromatin-modifying complexes. The limited-cell ChIP-seq design, multiplexed on a single PI chip, comprises an input control, a TF, a transcriptionally activating histone mark (H3K27ac), and a silencing histone mark (H3K27me3). The simplicity and comprehensiveness of the experimental design, coupled with limited-cell ChIP-seq from 1M cells that yields high-quality data, make it an effective tool for studying salient aspects of an epigenetic mechanism.

Life Technologies • 5791 Van Allen Way • Carlsbad, CA 92008 • www.lifetechnologies.com

Table 2. CTCF Motif analysis

CONCLUSIONS

• A proof-of-principle ChIP-seq design on the Ion Proton™ system that allows a functional analysis on a single chip has been demonstrated.

• Robust profiling of binding sites can be done from limited numbers of cells, i.e. 3-5M flow-sorted cells.

• The simple and comprehensive experimental design, highlighted by low input volume and quick turnaround time and potentially coupled with RNA-seq, provides a powerful tool for exploring regulatory mechanisms.

REFERENCES

1. MACS v2.0.10, https://github.com/taoliu/MACS/

2. ENCODE Consortium (2012) Nature 489:57-74.

3. ENCODE comparisons are presented against BroadChromHMM, K562 cell-line data for Broad Histone, Hudson Alpha (HAIB) CTCF, HAIB RR DNA methylation (genome.ucsc.edu).

4. M Herold et al. (2012) Development 139:1045-57.

5. GEM (motif analysis): Y Guo et al. (2012) PLoS Comput Biol 8(8).

6. D Martin et al. (2011) Nat Struct Mol Biol 18(6):708-714.

7. BV Lugt et al. (2014) Nat Immunol 15:161-167.

ACKNOWLEDGEMENTS

Colin Davidson, Ion Torrent [email protected]

CTCF – a transcription factor:

• CTCF, a transcriptional repressor encoded in humans by the CTCF gene, is also an insulator-binding protein.

• ~41% of the loci[3] show significant enrichment at insulator domains (Fig. 3).

• ~17% of the loci show enrichment for promoter domains and ~17% for enhancer domains[3].

• GEM[5] analysis (k_min=16, k_max=23) recapitulates the MEME[6]-based 17-bp CTCF motif (Table 2).

Figure 3. [top] A schematic of insulator locations [4]; [bottom] Example of CTCF enrichment at insulators [2-3]. Chr1:11091-11530: depth:98, log10.qValue:145, fold-change :47

DC lineage specialization of CD103+ and CD11b+ cells is controlled by the TFs Irf8 and Irf4, respectively.

• To monitor the epigenetic consequences of CD11b+ lineage differentiation, cells were collected 6 days after induction and formaldehyde-fixed for flow cytometry sorting and ChIP-Seq library construction.

• Libraries were sequenced using Ion PI™ v3 templating and sequencing.

• Loci detected by Irf4 ChIP-seq from CD11b+ DCs were rank-ordered by significance, confirming the Irf4 gene targets highlighted by Lugt et al.[7] (Fig. 7).

• Additionally, 61% of significant peaks (-log10 q-value ≥ 20) associated with Irf4 binding were in common across all cell-sorted populations of 1, 3, 5, and 20M CD11b+ DCs (Fig. 8).
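The shared-peak statistic described in the last bullet can be sketched as follows. This is a minimal Python illustration, not the authors' pipeline: the poster does not state its overlap criterion, so the sketch assumes a peak counts as "common" when it overlaps at least one peak (by one base or more) in every other population. The function names, the tuple layout, and the toy peaks are all hypothetical.

```python
from typing import Dict, List, Tuple

# A peak is (chromosome, start, end, -log10 q-value); coordinates are half-open.
Peak = Tuple[str, int, int, float]


def significant(peaks: List[Peak], min_neg_log10_q: float = 20.0) -> List[Peak]:
    """Keep peaks whose -log10 q-value meets the threshold."""
    return [p for p in peaks if p[3] >= min_neg_log10_q]


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_common(reference: List[Peak], others: Dict[str, List[Peak]]) -> float:
    """Fraction of reference peaks overlapped by a peak in every other set."""
    if not reference:
        return 0.0
    shared = sum(
        1
        for peak in reference
        if all(any(overlaps(peak, q) for q in peaks) for peaks in others.values())
    )
    return shared / len(reference)


# Toy, made-up peaks for illustration only (not data from the poster).
peaks_20m = [
    ("chr1", 100, 200, 35.0),
    ("chr1", 500, 600, 25.0),
    ("chr2", 100, 200, 10.0),  # below the threshold, dropped by significant()
]
other_populations = {
    "1M": [("chr1", 150, 250, 22.0)],
    "3M": [("chr1", 120, 180, 30.0), ("chr1", 550, 650, 21.0)],
    "5M": [("chr1", 90, 110, 40.0), ("chr1", 590, 700, 28.0)],
}
reference = significant(peaks_20m)
print(fraction_common(reference, other_populations))
```

The pairwise scan is quadratic in the number of peaks, which is fine for a sketch; for peak sets of the size reported here (n = 17,417), a sorted sweep or an interval tree would be the more practical choice.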

Figure 7. [left] Genes and biological pathways identified by Irf4 ChIP-seq loci. Known Irf4 binding targets are noted. [right]: Peaks from the immunoprecipitation of 1, 3, 5, 20M mouse CD11b+ DCs for Zbtb46 (A) and F13a1 (B) genes.

Figure 8. Significant overlap in ChIP-seq loci (-log10 q-value ≥ 20; total n = 17,417) from cell-sorted populations of 1, 3, 5, and 20M CD11b+ lineage mouse dendritic cells.

Figure 1. ChIP-Seq

METHODS

Figure 2. ChIP-Seq library protocol: input (≤10 ng), end repair, adapter ligation, nick translation and amplification, size selection (145-300 bp).

• Input control and immunoprecipitated chromatin were quantified using the Qubit® HS kit.

• A maximum of 10 ng of DNA was used for end repair reactions.

• Samples were purified using 1.8x AMPure® XP beads prior to adapter ligation.

• Libraries were purified with 1.5x AMPure® XP beads to remove excess adapter dimer.

• This was followed by nick translation, amplification for 18 cycles, and size selection for a range of 145-300 bp using either a double SPRI cleanup (0.7x/1.5x AMPure® XP) or the Pippin Prep™ with internal size standards.
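The bead ratios in the protocol above translate into pipetting volumes by simple multiplication. The Python sketch below works through that arithmetic under two assumptions not stated in the poster: a hypothetical 50 µL sample volume, and the common double-sided SPRI convention in which the second ratio (1.5x) is cumulative, so the second addition is (1.5 - 0.7) times the original sample volume. The function names are illustrative.

```python
from typing import Tuple


def bead_volume_ul(sample_volume_ul: float, ratio: float) -> float:
    """Bead volume for a single cleanup at the given bead-to-sample ratio."""
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return sample_volume_ul * ratio


def double_spri_volumes_ul(
    sample_volume_ul: float, first_ratio: float = 0.7, final_ratio: float = 1.5
) -> Tuple[float, float]:
    """Bead volumes for the two additions of a double-sided size selection.

    Assumes final_ratio is cumulative: the first addition binds large
    fragments (discarded with the beads), and the second addition brings
    the supernatant up to final_ratio relative to the original sample.
    """
    if not 0 < first_ratio < final_ratio:
        raise ValueError("expected 0 < first_ratio < final_ratio")
    first = bead_volume_ul(sample_volume_ul, first_ratio)
    second = bead_volume_ul(sample_volume_ul, final_ratio - first_ratio)
    return first, second


# Hypothetical 50 uL reaction; volumes are illustrative only.
sample_ul = 50.0
print(round(bead_volume_ul(sample_ul, 1.8), 1))  # cleanup before adapter ligation
print(round(bead_volume_ul(sample_ul, 1.5), 1))  # adapter-dimer removal
first_ul, second_ul = double_spri_volumes_ul(sample_ul)
print(round(first_ul, 1), round(second_ul, 1))   # double SPRI additions
```

If a kit defines the second ratio relative to the supernatant volume rather than cumulatively, the second volume changes accordingly, so the convention should be checked against the bead manufacturer's instructions.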

Table 1. 4-plex Ion PI™ v3 ChIP-seq with significant confirmation rates vs ENCODE

| 4-plexed Proton data | Barcode balance: % of reads per PI | Redundancy rate | # Putative loci @ FDR cutoff 10^-3 | Confirmation against ENCODE*: any overlap | At least 50% overlap | 100% overlap |
|---|---|---|---|---|---|---|
| CTCF (BC=38) | 24.0 | 0.44 | 17,538 | 68.5% (12,018) | 67.9% (11,888) | 60.3% (10,571) |
| H3K27ac (BC=39) | 22.1 | 0.11 | 52,861 | 73.2% (38,716) | 70.2% (37,105) | 66.4% (35,101) |
| H3K27me3 (BC=45) | 29.5 | 0.11 | 54,742 | 89.1% (48,772) | 87.9% (48,123) | 87.0% (47,632) |
| Input (BC=51) | 24.3 | 0.12 | | | | |

*ENCODE tracks: wgEncodeBroadHistoneK562H3k27acStdPk (H3K27ac), wgEncodeBroadHistoneK562H3k27me3StdPk (H3K27me3), wgEncodeHaibTfbsK562CtcfcPcr1xPkRep2 (CTCF).

Low-input samples, prepared from the human leukemia cell line K562 with antibodies from Abcam, were 4-plexed (Table 1) onto a single PI chip; the protocol ensured a barcode balance such that the background estimate from the input matched all immunoprecipitated samples. MACS2 2.0.10[1] was used for peak calling.
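The three overlap levels in Table 1 ("Any", "At least 50%", "100%") can be illustrated with a short Python sketch. It assumes, as the poster does not spell out, that overlap is measured as the fraction of each called peak's bases covered by the union of ENCODE peaks; the function names and the toy intervals are hypothetical.

```python
from typing import Dict, List, Tuple

# An interval is (chromosome, start, end) with half-open coordinates.
Interval = Tuple[str, int, int]


def covered_fraction(peak: Interval, reference: List[Interval]) -> float:
    """Fraction of the peak's bases covered by the union of reference intervals."""
    chrom, start, end = peak
    if end <= start:
        raise ValueError("peak must have positive length")
    # Clip overlapping reference intervals to the peak and sort by start.
    clipped = sorted(
        (max(start, s), min(end, e))
        for c, s, e in reference
        if c == chrom and s < end and start < e
    )
    covered = 0
    cursor = start  # right edge of the region already counted
    for s, e in clipped:
        s = max(s, cursor)
        if e > s:
            covered += e - s
            cursor = e
    return covered / (end - start)


def confirmation_rates(called: List[Interval], reference: List[Interval]) -> Dict[str, float]:
    """Share of called peaks confirmed at each overlap level."""
    fractions = [covered_fraction(p, reference) for p in called]
    n = len(fractions)
    if n == 0:
        raise ValueError("no called peaks")
    return {
        "any": sum(f > 0 for f in fractions) / n,
        "at_least_50pct": sum(f >= 0.5 for f in fractions) / n,
        "full": sum(f == 1.0 for f in fractions) / n,
    }


# Toy, made-up intervals for illustration only (not data from the poster).
called_peaks = [("chr1", 100, 200), ("chr1", 300, 400), ("chr1", 500, 600), ("chr2", 100, 200)]
encode_peaks = [("chr1", 50, 250), ("chr1", 350, 400), ("chr1", 590, 700)]
print(confirmation_rates(called_peaks, encode_peaks))
```

Measuring coverage relative to the called peak, rather than the ENCODE peak, is a design choice of this sketch; a reciprocal-overlap criterion would give different counts.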

RESULTS

A multiplexed template that would enable a functional study via the triangulation of transcription factor binding, histone activation, and histone silencing

H3K27ac – an activating acetylation mark:

• A confirmation rate of 73% is observed against ENCODE[2].

• ~55% of loci enrich for enhancers and 35% for promoters[2]. Fig. 4 shows a representative association with the active promoter (red) and strong enhancer (beige) of CHD8, a chromatin remodeling factor[2].

Figure 4. Confirmation of H3K27ac against ENCODE

H3K27me3 – a repressive methylation mark:

• A confirmation rate against ENCODE of ~87%[2] is observed.

• The moderate presence of DNA methylation[2] concomitant with H3K27me3, a signature of polycomb repression, is supported by the Broad ChromHMM data[2-3] (Fig. 5).

• ~74% enrichment is observed for the repressed segments of the genome. Since this mark is typically associated with closed/inactive chromatin, essentially no correlation (slightly negative, R = -0.1) is observed with the activating acetylation mark H3K27ac (Fig. 5).

• ~19% of the putative loci are associated with heterochromatin. The tri-methyl form has been noted in the literature as an important mark for facultative heterochromatin.
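A correlation such as the R = -0.1 reported between H3K27me3 and H3K27ac can be computed as a Pearson coefficient over paired signal values. The poster does not say how R was derived, so the Python sketch below only assumes two equal-length lists of per-bin signal (for example, read counts in fixed genomic windows); the function name and the toy values are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy, made-up per-bin signal values for illustration only.
h3k27me3_bins = [12.0, 3.0, 9.0, 1.0, 7.0, 2.0]
h3k27ac_bins = [1.0, 8.0, 2.0, 6.0, 3.0, 9.0]
print(round(pearson_r(h3k27me3_bins, h3k27ac_bins), 3))
```

A value near zero indicates that the two marks do not rise and fall together across bins, and a negative sign indicates a weak tendency toward mutual exclusion.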

Figure 5. Confirmation of H3K27me3 against ENCODE

Figure 6. [top] Lineage specialization of dendritic cells; [bottom] Flow cytometry sorting of differentiated DCs

Approach for limited cell ChIP-Seq

[Figure 6 labels: stem cell differentiating into CD103+ DC (Irf8) or CD11b+ DC (Irf4); workflow: fix, flow sort, library]

© 2014 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the property of Thermo Fisher Scientific and its subsidiaries unless otherwise specified. AMPure is a registered trademark of Beckman Coulter, Inc.

For Research Use Only. Not for use in diagnostic procedures.

[Figure 7, right: tracks for 1, 3, 5, and 20 million cells in panels A and B]

Figure 7 [left], annotations of the genes identified:

• Master regulator of the antigen presentation pathway required for DC function; known target of Irf4 [7]

• Transcription factor that defines classical DC pathway differentiation; known target of Irf4 [7]

• Regulator of the antigen presentation pathway required for DC function; known target of Irf4 [7]

• Transcription factor regulator of immunity and DC function; known target of Irf4 [7]

• Transcription factor regulator of immunity and DC function; known target of Irf4 [7]

• Part of the coagulation pathway and known mediator of innate immune defense, which includes DC

• An E3 ligase with known immune regulation properties and asthma activity

• Complement receptor that regulates adaptive immunity and DC function