Document S1. Figures S1–S7 and Supplemental Experimental



Cell Stem Cell

Supplemental Information

Quantitative Dynamics of Chromatin Remodeling

during Germ Cell Specification

from Mouse Embryonic Stem Cells

Kazuki Kurimoto, Yukihiro Yabuta, Katsuhiko Hayashi, Hiroshi Ohta, Hiroshi Kiyonari,

Tadahiro Mitani, Yoshinobu Moritoki, Kenjiro Kohri, Hiroshi Kimura, Takuya Yamamoto,

Yuki Katou, Katsuhiko Shirahige, and Mitinori Saitou

Kurimoto et al., Supplemental Figure S1

[Figure S1 image, panels A–F. Axis labels recoverable from the extraction: log2 expression level in ESCs, EpiLCs, d2 PGCLCs, and d6 PGCLCs (A); normalized EGFP-BLIMP1 (log2) for Rep1, Rep2, and the d2 and d6 averages (F).]

Figure S1. Expression of Key Genes During In Vitro PGC Specification, and Generation and Analysis of EGFP-Blimp1 Knock-in Mice, Related to Figure 1 and Figure 7

(A) The expression patterns of genes for germ cell specification/development, pluripotency, an early mesodermal program, and epiblast development during in vitro PGC specification are shown with standard deviations (SDs) of two biological replicates.
(B) Schematic representation of the wild-type Blimp1 locus, the targeting vector, and the EGFP-Blimp1 knock-in allele. White boxes: non-coding exons; black boxes: coding exons; black triangles: loxP sequences; LA: long arm; SA: short arm; folded line: MC1-DTA-pA cassette for negative selection. Key restriction enzyme sites and primers for genotyping are shown.
(C) Histological sections of the testes and ovaries of the wild-type and EGFP-Blimp1 homozygous knock-in mice stained by hematoxylin and eosin. EGFP-Blimp1 homozygous knock-in mice are healthy and fertile, with their testes and ovaries showing normal spermatogenesis and oogenesis, respectively, indicating that the EGFP-BLIMP1 fusion protein functions appropriately in vivo. Bar, 50 µm.
(D) Immunofluorescence analysis of MVH (red) and EGFP-BLIMP1 (green) expression counterstained by DAPI in the embryonic gonads (top, male; bottom, female) at E12.5 of EGFP-Blimp1 heterozygous knock-in mice. Note that EGFP-BLIMP1 localizes specifically in the nuclei of MVH-positive PGCs. Bar, 20 µm.
(E) Western blot analysis of EGFP-BLIMP1 and BLIMP1 expression in PGCs at E12.5 in EGFP-Blimp1 homozygous knock-in and wild-type mice, respectively. αTUBULIN was used as a control.
(F) Contour graphs for scatter plot comparisons of EGFP-BLIMP1 peaks in the two biological replicates (Rep. 1 and Rep. 2) of d2 and d6 PGCLCs (top left and right, respectively), and of averaged EGFP-BLIMP1 peaks in d2 and d6 PGCLCs (bottom).

Kurimoto et al., Supplemental Figure S2

[Figure S2 image, panels A–L. Axis labels recoverable from the extraction: ∆Ct (ChIP-Input) (A); H3 IP/Input (log2) against frequency (B); ChIP-seq IP level (log2) normalized by ChIP-Q-PCR (LINE1 ORF2) (C, E), plotted against ChIP-Q-PCR ∆Ct (ChIP-Input) (D, F); H3K9me2 and H3K27me3 IP level at TSS ±1 Kb (log2) (G, J); genome-wide H3K9me2 and H3K27me3 (log2) (I, L). Regression lines printed on the plots: y = -1.26x + 0.85, R2 = 0.67 (H3K9me2); y = -1.04x + 0.60, R2 = 0.94 (H3K27me3).]

Figure S2. Distribution of Histone H3, and Normalization and Comparison of ChIP-seq Data for H3K9me2 and H3K27me3, Related to Figure 3 and Figure 5

(A) ChIP-Q-PCR analysis of histone H3 on the LINE L1 ORF2, SINE B1, and IAP for ESCs, EpiLCs, d2 and d6 PGCLCs (color codes as indicated).
(B) The log2 IP/input-frequency plots of histone H3 for the genome (single-copy regions, 2 Kb sliding windows with 1 Kb overlaps, red) and around TSSs (within 1 Kb) of the HCP, ICP, and LCP genes (color codes as indicated) during in vitro PGC specification.
(C) ChIP-seq values of H3K9me2 on the repetitive elements normalized by the ChIP-Q-PCR values of H3K9me2 on the LINE L1 ORF2 for ESCs, EpiLCs, and d2 and d6 PGCLCs (color codes as indicated).
(D) ChIP-seq values of H3K9me2 on SINE B1 and IAP normalized by the ChIP-Q-PCR values of H3K9me2 on the LINE L1 ORF2 plotted against ChIP-Q-PCR values for SINE B1 and IAP for ESCs, EpiLCs, and d2 and d6 PGCLCs (SDs, two biological replicates).
(E) ChIP-seq values of H3K27me3 on the repetitive elements normalized by the ChIP-Q-PCR values of H3K27me3 on the LINE L1 ORF2 for ESCs, EpiLCs, d2 LIF Ag, and d2 and d6 PGCLCs (color codes as indicated).
(F) ChIP-seq values of H3K27me3 on SINE B1 and IAP normalized by the ChIP-Q-PCR values of H3K27me3 on the LINE L1 ORF2 plotted against ChIP-Q-PCR values for SINE B1 and IAP for ESCs, EpiLCs, and d2 and d6 PGCLCs (SDs, two biological replicates).
(G) Contour graphs for scatter plot comparisons of H3K9me2 around the TSSs (within 1 Kb) in the two biological replicates (Rep. 1 and Rep. 2) of ESCs, EpiLCs, and d2 and d6 PGCLCs (left), and between averaged values in ESCs and those in EpiLCs, or in d2 or d6 PGCLCs (right).
(H) Venn diagrams of genes with H3K9me2 detected within 1 Kb from TSSs in the two biological replicates of ESCs, EpiLCs, and d2 and d6 PGCLCs.
(legend continued on next page)
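The genome-wide plots in this legend are computed over 2 Kb sliding windows with 1 Kb overlaps. The following is a minimal sketch of such a tiling, not the authors' pipeline: the function name `sliding_windows`, the 0-based half-open coordinates, and the decision to drop a trailing partial window are assumptions made here for illustration, and the restriction to single-copy regions is not modeled.

```python
def sliding_windows(chrom_length, window=2000, step=1000):
    """Yield (start, end) half-open intervals that tile one chromosome.

    With window=2000 and step=1000, consecutive windows overlap by 1 Kb.
    Assumption: a trailing window that would run past the chromosome end
    is dropped rather than truncated.
    """
    for start in range(0, chrom_length - window + 1, step):
        yield (start, start + window)


# Illustrative 5 Kb "chromosome" (not real data).
for start, end in sliding_windows(5000):
    print(start, end)
```

Each window would then receive one log2 IP/input value, which is the quantity binned in the frequency and contour plots.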

Kurimoto et al., Supplemental Figure S3

[Figure S3 image, panels A–D. Axis labels recoverable from the extraction: H3K4me3 Rep1 against Rep2 (log2) and H3K4me3 ESC average against the other cell types (log2) (A); H3K27ac Rep1 against Rep2 (log2) and H3K27ac ESC average against the other cell types (log2) (C); heat maps of H3K27ac in ESCs, EpiLCs, and d2 and d6 PGCLCs within ±10 Kb of OCT4 (ESC only, top 50%), OCT4 (ESC and EpiLC, top 50%), and OTX2 (EpiLC only, top 50%) binding sites (D).]

Figure S3. Normalization and Comparison of ChIP-seq Data for H3K4me3 and H3K27ac, and Relationships between H3K27ac Peaks and OCT4-binding Sites During In Vitro PGC Specification, Related to Figure 2

(A) Contour graphs for scatter plot comparisons of H3K4me3 in the two biological replicates (Rep. 1 and Rep. 2) of ESCs, EpiLCs, d2 LIF Ag, and d2 and d6 PGCLCs (left), and between averaged values in ESCs and those in EpiLCs, d2 LIF Ag, d2 or d6 PGCLCs (right).
(B) Venn diagrams of genes with H3K4me3 peaks around TSSs (< 2 Kb) in the two biological replicates of ESCs, EpiLCs, d2 LIF Ag, and d2 and d6 PGCLCs.
(C) Contour graphs for scatter plot comparisons of H3K27ac in the two biological replicates (Rep. 1 and Rep. 2) of ESCs, EpiLCs, d2 LIF Ag, and d2 and d6 PGCLCs (left), and between averaged values in ESCs and those in EpiLCs, d2 LIF Ag, d2 or d6 PGCLCs (right).
(D) Heat map representation of the relationships of H3K27ac peaks during in vitro PGC specification with (left) OCT4-binding sites specific to ESCs (top 50%), (middle) OCT4-binding sites common to ESCs and EpiLCs (top 50%), and (right) OTX2-binding sites specific to EpiLCs (top 50%) (Buecker et al., 2014).

(Legend for Figure S2, continued)

(I) Contour graphs for scatter plot comparisons of genome-wide H3K9me2 (2 Kb sliding windows with 1 Kb overlaps) in the two biological replicates (Rep. 1 and Rep. 2) of ESCs, EpiLCs, and d2 and d6 PGCLCs (left), and between averaged values in ESCs and those in EpiLCs, d2 or d6 PGCLCs (right).
(J) Contour graphs for scatter plot comparisons of H3K27me3 around the TSSs (within 1 Kb) in the two biological replicates (Rep. 1 and Rep. 2) of ESCs, EpiLCs, d2 LIF Ag, and d2 and d6 PGCLCs (left), and between averaged values in ESCs and those in EpiLCs, d2 LIF Ag, d2 or d6 PGCLCs (right).
(K) Venn diagrams of genes with H3K27me3 detected within 1 Kb from TSSs in the two biological replicates of ESCs, EpiLCs, d2 LIF Ag, and d2 and d6 PGCLCs.
(L) Contour graphs for scatter plot comparisons of genome-wide H3K27me3 (2 Kb sliding windows with 1 Kb overlaps) in the two biological replicates (Rep. 1 and Rep. 2) of ESCs, EpiLCs, and d2 and d6 PGCLCs (left), and between averaged values in ESCs and those in EpiLCs, d2 LIF Ag, d2 or d6 PGCLCs (right).

Kurimoto et al., Supplemental Figure S4

[Figure S4 image, panels A–G. Axis labels recoverable from the extraction: expression level (log2) against H3K27me3, H3K9me2, and H3K4me3 log2 levels in ESCs, EpiLCs, and d2 and d6 PGCLCs (A); fraction of genomic region with fold change of IP level > 2 against distance from TSS (gene body, TSS, 1 Kb, 5 Kb, 10 Kb, 20 Kb, 50 Kb, 100 Kb, > 100 Kb) (B); GO terms with p values (C, E); H3K27me3 IP level (log2) (D); log2 expression level of Ezh2, Rbbp4, Rbbp7, Aebp2, Phf1, Eed, Suz12, Mtf2, Jarid2, and Phf19 (F) and of Mll1, Mll2, Mll3, and Setd1a (G).]

Figure S4. Analysis of the H3K27me3 Targets During In Vitro PGC Specification, Related to Figure 3

(A) Relationships between the log2 gene expression levels and the log2 H3K27me3 (top), H3K9me2 (middle), and H3K4me3 (bottom) levels during in vitro PGC specification.
(B) Fraction of the genomic region with fold H3K27me3 IP level changes > 2 plotted against the genomic loci in EpiLC to d2 PGCLC (top) and EpiLC to d6 PGCLC (bottom) comparisons.
(C) GO analysis for the ESC, EpiLC, d2 and d6 PGCLC PRC2 targets.
(D) Contour graphs for scatter plot comparisons of the log2 H3K27me3 levels between d2 PGCLCs and EpiLCs (top), d2 LIF Ag (middle), or d6 PGCLCs (bottom).
(legend continued on next page)

Kurimoto et al., Supplemental Figure S5

[Figure S5 image, panels A–F. Axis labels recoverable from the extraction: H3K4me3 level (log2) against H3K27me3 level (log2) in ESCs, EpiLCs, and d2 and d6 PGCLCs (A); percentage of HCP, ICP, and LCP genes classed as Biv, K4, K27, or Neither (B); ∆Ct (ChIP-Input) for the 1st IP (αH3K4me3), 2nd IP (αH3K27me3), and 2nd IP (normal IgG), and ∆Ct (αH3K27me3 - normal IgG) (C); GO terms with P values (D, E); germline genes (log2 expression < 8) classed as Biv, K4, K27, or Neither (F).]

Figure S5. Analysis of Bivalent Genes During In Vitro PGC Specification, Related to Figure 4

(A) Relationships between the log2 H3K4me3 levels and the log2 H3K27me3 IP levels during in vitro PGC specification.
(B) Transitions of the numbers of HCP, ICP, and LCP genes (all genes including those with log2 expression levels > 8) that bear H3K4me3 (blue), H3K27me3 (green), bivalent modifications (orange), and no modifications (pale blue) during in vitro PGC specification.
(C) Sequential ChIP-Q-PCR analysis of bivalent genes in EpiLCs. The IP efficiencies of the 1st ChIP using the anti-H3K4me3 antibody and of the 2nd ChIPs using the anti-H3K27me3 antibody or normal IgG are indicated as ∆Ct values from the input (upper three panels). The fold enrichment in the 2nd ChIP is indicated as the ∆Ct value between the two 2nd IPs (anti-H3K27me3 antibody - normal IgG) (bottom panel). ChIP-Q-PCR was performed for the H3K27me3-only region (Hoxa1 3’ UTR, blue), H3K4me3-only promoter (H3K4me3 peak around the TSS of Gapdh, green), and 12 bivalent promoters (H3K4me3 peaks around the TSSs of Hoxa1, Hoxa5, Pcsk9, Arg2, Mthfd2l, Cdk5r1, Gnaz, Nrp1, Adora2a, Rtn4rl1, Reln, Igfbp5, red). Note that the bivalent promoters were enriched in both the 1st and 2nd IPs (upper two panels). Bars indicate SDs of the two biological replicates. The primers used are given in Supplemental Table S1.
(D) Venn diagram representation of the transitions of bivalent genes (log2 expression levels < 8, numbers indicated) and their GO enrichment during in vitro PGC specification.
(E) Venn diagram representation of the bivalent genes (log2 expression levels < 8, numbers indicated) and their GO enrichment in comparison between d6 PGCLCs and PGCs at E11.5 (Sachs et al., 2013).
(F) Transitions of the histone modifications of the germline genes (log2 expression levels < 8).
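Panel (C) of Figure S5 reports IP efficiencies as ∆Ct values relative to the input, and the enrichment in the second ChIP as the difference between the anti-H3K27me3 and normal IgG ∆Ct values. The arithmetic can be sketched as below. The function names and the Ct values are hypothetical and serve only as an illustration, and the conversion to an approximate fold enrichment assumes perfect doubling per PCR cycle, which the legend does not state.

```python
def delta_ct(ct_ip, ct_input):
    """∆Ct of an IP relative to its input (ChIP - Input).

    Lower values mean more template was recovered in the IP.
    """
    return ct_ip - ct_input


def second_ip_difference(dct_k27me3, dct_igg):
    """∆Ct between the two 2nd IPs (anti-H3K27me3 - normal IgG).

    Negative values indicate enrichment over the IgG control.
    """
    return dct_k27me3 - dct_igg


def approx_fold(ddct):
    """Approximate fold enrichment, assuming 100% PCR efficiency."""
    return 2.0 ** (-ddct)


# Hypothetical Ct values for one promoter (illustration only, not data).
dct_k27 = delta_ct(ct_ip=28.0, ct_input=25.0)   # 3.0
dct_igg = delta_ct(ct_ip=32.0, ct_input=25.0)   # 7.0
ddct = second_ip_difference(dct_k27, dct_igg)   # -4.0
print(ddct, approx_fold(ddct))
```

Under this sign convention, a bivalent promoter is expected to give a clearly negative value in the bottom panel, whereas an H3K4me3-only promoter should stay near zero.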

(Legend for Figure S4, continued)

(E) GO analysis of genes bearing higher levels of H3K27me3 (> 2 fold) in d2 LIF Ag compared to d2 PGCLCs.
(F) Expression of PRC2 components during in vitro PGC specification (SDs, two biological replicates).
(G) Expression of H3K4 methyltransferases during in vitro PGC specification (SDs, two biological replicates).

Kurimoto et al., Supplemental Figure S6

[Figure S6 image, panels A–E. Labels recoverable from the extraction: GO terms with P values for H3K27ac peaks (IP/Input > 16) around non-housekeeping genes (within gene body or < 15 Kbp from TSS), EpiSC > EpiLC and EpiLC > EpiSC (A); H3K27ac, H3K4me3, and H3K27me3 tracks in EpiLCs and EpiSCs for Pou5f1, Fgf8, Id3, Rara, Smad3, Mettl4, Hoxa1, and the Hoxb cluster (B); number of H3K27ac peaks around non-house keeping genes (C); bivalent genes in EpiLCs and EpiSCs (D); H3K27me3 enrichment level (log2) against frequency for all genes and germline genes (E).]

Figure S6. Comparison of the Chromatin States between EpiLCs and EpiSCs, Related to Figure 2, Figure 3, Figure 4, and Figure 6

(A) Selected GO terms enriched in non-house keeping genes (log2 expression levels < 8 either in EpiLCs or EpiSCs, or > 2 fold difference between EpiLCs and EpiSCs) associated with strong H3K27ac peaks differential between EpiLCs and EpiSCs (IP/input > 16, and > 2 fold difference) (within gene bodies or < 15 Kb from TSSs).
(B) ChIP-seq tracks of H3K27ac, H3K4me3, and H3K27me3 for selected genes in EpiLCs and EpiSCs.
(C) Number of strong H3K27ac peaks differential between EpiLCs and EpiSCs (color codes as indicated) around the non-house keeping genes classified in the indicated GO terms (neuron differentiation, embryonic development, pattern specification) and the EpiLC genes. The classification of GO terms was defined using AmiGO 2 (Ashburner et al., 2000; Carbon et al., 2009).
(D) Venn diagram representation of the bivalent genes (log2 expression levels < 8, numbers indicated) and their GO enrichment in comparison between EpiLCs and EpiSCs.
(E) The log2 H3K27me3 IP level-frequency plots for the germline genes in EpiLCs (magenta) and EpiSCs (blue). Pale purple and cyan lines represent the plots for all genes as references.

Kurimoto et al., Supplemental Figure S7

[Figure S7 image, panels A–K. Labels recoverable from the extraction: H3K27me3 and EGFP-BLIMP1 tracks over 100 Kb regions in ESCs, EpiLCs, and d2 and d6 PGCLCs (A); enrichment (peak number/bp) against distance from TSS (bp) for BLIMP1 (d2), BLIMP1 (d6), and T (B); percentage of HCP, ICP, and LCP genes among all genes and among genes bound by BLIMP1 (d2, d6) and T (C); probability against position of best site in sequence, -250 b to 250 b (E, G); expression level (log2) of BLIMP1-bound and T-bound genes (H); TF-bound genes (< 15 Kb from TSS) (I); GO terms with P values (J, K).]

Figure S7. Analysis of the BLIMP1 and T Target Genes, Related to Figure 7

(A) ChIP-seq track transitions for H3K27me3- (top, blue) and BLIMP1-binding (bottom, red) in the 100 Kb regions around Fgf3 and Fgf4 (left) and Hoxb cluster (right).
(B) Distribution of BLIMP1 (in d2 and d6 PGCLCs) and T [in EpiLC aggregates stimulated by BMP4 for 36 hrs (Aramaki et al., 2013)] peaks (color codes as indicated) in the genome represented by peak numbers per base pair plotted against the distances from the TSSs (red dotted line).
(C) The promoter classes for genes bound by BLIMP1 and T.
(D) The sequence motifs for BLIMP1 binding in comparison to those identified previously (Doody et al., 2010; Kuo and Calame, 2004).
(E) Motif probability graph showing the position of the consensus motifs in the BLIMP1-binding sites.
(F) The sequence motifs for T binding. A motif sequence from the JASPAR CORE database (TF00006 as PazarID) (Mathelier et al., 2014) is shown as a reference.
(G) Motif probability graph showing the position of the consensus motifs in the T-binding sites.
(H) Expression of representative targets of BLIMP1 (left) and T (right) during in vitro PGC specification.
(I) Venn diagram showing the overlap between genes (< 15 Kb from the TSSs) bound by BLIMP1 in d2 PGCLCs and those bound by T in EpiLC aggregates stimulated by BMP4 for 36 hrs.
(J) GO analysis of BLIMP1 targets (core enrichment genes identified in Figure 7E).
(K) GO analysis of T targets (core enrichment genes identified in Figure 7F).


SUPPLEMENTAL TABLES (Table S1-Table S6), see separate Excel documents

Table S1. Primers Used in This Study, Related to All Figures

Table S2. Antibodies Used in This Study, Related to All Figures

Table S3. Outlines and Mapping Statistics of ChIP-seq experiments, Related to All Figures

Table S4. Gene Expression and Chromatin States Analyzed in This Study, Related to All Figures

Table S5. Cell-type specific H3K27ac peaks and TF binding motifs in Such Peaks, Related to Figure 2

Table S6. BLIMP1 and T Binding Sites During In Vitro PGC Specification, Related to Figure 7


SUPPLEMENTAL EXPERIMENTAL PROCEDURES

Analysis of the Gene Expression for In Vitro PGC Specification

The gene expression data obtained with Affymetrix GeneChip microarrays and reported previously (Hayashi et al., 2011; Nakaki et al., 2013) were used for the transcriptome analysis of in vitro PGC specification (GEO numbers: GSM744093 and GSM744094 for ESCs, GSM744095 and GSM744096 for EpiLCs, GSM1070847 and GSM1070848 for d2 PGCLCs, and GSM744101 and GSM744102 for d6 PGCLCs).

A gene list was created from Ensembl Genes, restricted to genes covered by the probes of the Affymetrix Mouse430_2 microarray. When multiple probes were assigned to a single gene, the probe that gave the highest average expression value was selected. The promoter classes (HCPs, ICPs, and LCPs) were defined as reported previously (Borgel et al., 2010).
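The probe-selection rule can be illustrated with a minimal sketch. This is not the authors' code: the function name `select_probes`, the probe and gene identifiers, and the expression values are hypothetical, and the average is assumed here to be taken over all samples supplied for a probe, which the text does not specify.

```python
from statistics import mean


def select_probes(probe_to_gene, expression):
    """Return {gene: probe_id}, keeping one probe per gene.

    probe_to_gene: {probe_id: gene}
    expression:    {probe_id: [log2 expression values across samples]}
    The probe with the highest mean expression is kept; on a tie,
    the probe encountered first is kept (an arbitrary choice).
    """
    best = {}
    for probe, gene in probe_to_gene.items():
        avg = mean(expression[probe])
        if gene not in best or avg > best[gene][1]:
            best[gene] = (probe, avg)
    return {gene: probe for gene, (probe, _) in best.items()}


# Hypothetical probes and values (illustration only, not data).
probe_to_gene = {"probe_1": "GeneA", "probe_2": "GeneA", "probe_3": "GeneB"}
expression = {
    "probe_1": [7.0, 7.2],
    "probe_2": [9.1, 8.9],
    "probe_3": [5.0, 5.5],
}
print(select_probes(probe_to_gene, expression))
```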

Significantly expressed genes were defined as those showing an averaged log2 expression level > 8 in two biological replicates. Differentially expressed genes (Figure 1C) were defined by significant expression levels and fold changes > 2 in at least one pair-wise comparison among ESCs, EpiLCs, and d2 and d6 PGCLCs, and such genes were further classified by the highest expression levels among the four key cell types. Gene ontology analysis was performed using the DAVID gene ontology functional annotation tool (http://david.abcc.ncifcrf.gov/) (Huang da et al., 2009a, b).
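The thresholds above can be expressed as a short sketch. It is one possible reading rather than the authors' implementation: "significant expression" is assumed to mean a replicate-averaged log2 level > 8 in at least one of the four cell types, fold changes are taken as differences of averaged log2 values (a fold change > 2 equals a log2 difference > 1), and the function name `classify_gene` and the example values are hypothetical.

```python
import math
from itertools import combinations

CELL_TYPES = ("ESC", "EpiLC", "d2 PGCLC", "d6 PGCLC")


def classify_gene(avg_log2, expr_cutoff=8.0, fold_cutoff=2.0):
    """Classify one gene from its replicate-averaged log2 expression.

    avg_log2: {cell type: averaged log2 expression level}
    Returns the cell type with the highest expression if the gene is
    differentially expressed under the assumptions above, else None.
    """
    # Assumed reading: expressed above the cutoff in at least one cell type.
    if max(avg_log2[c] for c in CELL_TYPES) <= expr_cutoff:
        return None
    # Fold change > 2 in at least one pair-wise comparison.
    log2_cutoff = math.log2(fold_cutoff)
    changed = any(
        abs(avg_log2[a] - avg_log2[b]) > log2_cutoff
        for a, b in combinations(CELL_TYPES, 2)
    )
    if not changed:
        return None
    return max(CELL_TYPES, key=lambda c: avg_log2[c])


# Hypothetical expression values (illustration only, not data).
example = {"ESC": 10.0, "EpiLC": 8.5, "d2 PGCLC": 9.8, "d6 PGCLC": 9.9}
print(classify_gene(example))
```

Genes that return a cell type would be grouped under that cell type, and genes that return `None` would be left out of the differential set.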

Generation of EGFP-Blimp1 Knock-in Mice and ESCs

All the animal experiments were performed under the ethical guidelines of Kyoto

University and RIKEN CDB. Noon of the day when the copulation plugs of mated

females were identified was designated as embryonic day (E) 0.5.

The targeting vector for the EGFP-Blimp1 knock-in allele, in which EGFP cDNA and a

linker sequence (the BspEI-HindIII sequence in the pEGFP-C1 plasmid; Addgene) were

inserted into the first ATG of the Prdm1 gene, was constructed using SalI-NotI and

SwaI-XhoI sites of the DT-A-pA/loxP/PGK-Neo-pA/loxP vector

(http://www.cdb.riken.jp/arg/cassette.html). The targeting vector was linearized using

the SalI site, and electroporated into the TT2 ESC line (Yagi et al., 1993).

Homologous recombination was screened by PCR using the primer set NeoGt-1/F and

Prdm1_N3659/R (Supplemental Table S1), and confirmed by Southern blot analysis

with the 5’- and 3’- probes (PCR amplicons by primers described in Supplemental Table

S1), which detected AvrII and HindIII sites of the Prdm1 locus, respectively. Random

integration was ruled out by Southern blot analysis using a probe targeting the Neo

locus. The homologous recombinant ESCs were injected into eight-cell-stage embryos

of ICR mice to generate chimeric mice. Chimeras with a high ESC contribution were

judged by coat color and mated with C57BL/6 females to generate Prdm1(+/EGFP-Blimp1-Neo)

mice, which were genotyped by PCR using Prdm1 Ex3/F and Ex3R primers

(Supplemental Table S1) and Southern blot analysis using tail tip DNAs. The

heterozygous mice were mated with the EIIa-Cre transgenic mice (Jackson Laboratory)

to excise the pGK-Neo-pA cassette and generate heterozygous Prdm1(+/EGFP-Blimp1) mice, which

were genotyped by PCR using the Prdm1 Int3/F and Int3/R primers (Supplemental

Table S1). The heterozygous mice were intercrossed to generate homozygous

Prdm1(EGFP-Blimp1/EGFP-Blimp1) (EGFP-Blimp1 knock-in) mice (Accession No.

CDB0923K: http://www.cdb.riken.jp/arg/mutant%20mice%20list.html), from which

EGFP-Blimp1 knock-in ESCs were generated under the conditions described previously

(Hayashi et al., 2011). The EGFP-Blimp1 homozygous knock-in mice are healthy and

fertile.

Induction from ESCs of EpiLCs and PGCLCs

BVSC (Hayashi et al., 2011) and EGFP-Blimp1 knock-in ESC lines were induced into

EpiLCs as described previously (Hayashi et al., 2011) for 48 hrs (BVSC) or 36 hrs

(EGFP-Blimp1 knock-in). The EpiLCs were then cultured under a floating condition

by plating 2-3×10^3 cells per well of a lipidure-coated U-bottom 96-well plate (Thermo

Scientific) in GK15 medium (Hayashi et al., 2011) containing LIF (1000 U/mL) for

induction of d2 LIF aggregate (d2 LIF Ag), LIF and BMP4 (500 ng/mL) for d2

PGCLCs, and LIF, BMP4, BMP8a (500 ng/mL), SCF (100 ng/mL), and EGF (50

ng/mL) for d6 PGCLCs, respectively. Since BMP8a proved to be dispensable for

PGCLC induction (data not shown), d6 PGCLC induction from the EGFP-Blimp1

knock-in ESCs was performed without BMP8a. d2 and d6 PGCLCs of the BVSC line

were purified with a fluorescence-activated cell sorter (FACS) (ARIA III; BD

Biosciences) by fluorescence of BV. For the EGFP-BLIMP1 knock-in line, since the

EGFP signal by the EGFP-BLIMP1 fusion protein was too low to be distinctively

detected by the FACS analysis, d2 PGCLCs were purified with FACS using an anti-Kit

antibody (Supplemental Table S2) and d6 PGCLCs using anti-SSEA1 and anti-Integrin

β3 antibodies (Supplemental Table S2).

ChIP-seq Data Analysis and Normalization

Read data were mapped on the mouse mm9 genome by Bowtie v0.12.9 (Langmead et

al., 2009) and manipulated by Picard-tools v1.85 (http://picard.sourceforge.net), IGV

tools v2.3.5 (Robinson et al., 2011) and Samtools v0.1.18 (Li et al., 2009) for fitting to

the subsequent analyses. Read patterns were visualized by IGV (Robinson et al.,

2011).

The distribution of H3 was constant throughout the genome, around the TSSs, and on

the major repetitive elements (LINE1 ORF2, SINE B1, and IAP) during in vitro PGC

specification (Figure S2A and S2B).

Peak calling for EGFP-BLIMP1, T, H3K4me3, and H3K27ac was performed using

MACS v1.4.2 (Zhang et al., 2008) with default settings (P-value cutoff 10^-5). To

enable the comparison of peaks in different samples, we scanned all peaks throughout

the genome, and considered peaks detected in proximity (within 1 Kb) as a single peak.
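
The merging of nearby peaks across samples can be sketched as below. This is an illustration under assumptions: proximity is measured between peak centers on a single chromosome, and centers are chained whenever consecutive ones lie within 1 Kb. The function name and data layout are hypothetical.

```python
def merge_peaks(centers_by_sample, max_gap=1000):
    """Merge peak centers from several samples on one chromosome.

    centers_by_sample: dict mapping sample name -> list of peak-center coordinates (bp)
    Returns a list of (first_center, last_center, samples) tuples, one per merged peak.
    """
    tagged = sorted(
        (pos, sample)
        for sample, positions in centers_by_sample.items()
        for pos in positions
    )
    merged = []
    for pos, sample in tagged:
        # Chain this center to the previous group if it lies within max_gap bp.
        if merged and pos - merged[-1][1] <= max_gap:
            start, _, samples = merged[-1]
            merged[-1] = (start, pos, samples | {sample})
        else:
            merged.append((pos, pos, {sample}))
    return merged
```

With this chaining rule, a merged peak can span more than 1 Kb when several centers follow one another closely; a stricter rule (for example, distance to the first center of the group) would be an equally plausible reading of the text.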

Read densities (FPKM) of H3K4me3 within 500 bp from peak centers were normalized

as follows: We calculated the invariant set (Li and Hung Wong, 2001) of read densities

against the biological replicate 1 of ESCs, then drew a linear regression line in log

scale with a slope predefined as 1.0, and used the Y-intercept value of this line for each

sample as a coefficient for normalization. The peak intensity was defined by the

IP/input ratio [normalized read density of the peak (within 500 bp from peak centers)

over Input (within 5 Kb)] averaged in the two biological replicates. H3K4me3 peaks

closest to TSSs (<2 Kb) were identified, and the level of TSS-associated H3K4me3 was

considered significant when the peak intensities were higher than the average of the 95th

percentile for significantly expressed genes in all four key cell types. The expression

and chromatin states are summarized in Supplemental Table S4.
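
The slope-1 regression used for H3K4me3 normalization can be illustrated as follows. With the slope fixed at 1.0 in log scale, the least-squares Y-intercept reduces to the mean log difference between the sample and the reference over the invariant set. The sketch assumes that the invariant set has already been selected (the rank-based selection of Li and Hung Wong, 2001 is not reproduced here) and that densities are log2-transformed; the function names are hypothetical.

```python
import math


def slope_one_intercept(reference, sample, invariant_idx):
    """Y-intercept of the line log2(sample) = log2(reference) + b fitted by
    least squares over the invariant set, with the slope fixed at 1.0."""
    diffs = [math.log2(sample[i]) - math.log2(reference[i]) for i in invariant_idx]
    return sum(diffs) / len(diffs)


def normalize_densities(sample, intercept):
    """Divide read densities by 2**intercept so that the invariant set of the
    sample falls on the reference (ESC replicate 1 in the text)."""
    scale = 2.0 ** intercept
    return [value / scale for value in sample]
```

In this form, a sample whose invariant peaks are uniformly twice as dense as the reference has an intercept of 1, and all of its densities are halved.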

H3K27ac exhibited few peaks that were consistently high in all samples, and thus could

not be normalized by the same method as used for H3K4me3. Therefore, read

densities of H3K27ac within 500 bp from the peak center were normalized using an MA

plot against replicate 1 of ESCs; the average log-transformed read densities were

divided into 30 fractions, and the modes of fold differences in the top 10% of fractions

were used as normalization coefficients. Peak intensities were defined as for

H3K4me3 (see above), and only high intensity peaks (>16) were used to identify

H3K27ac peaks associated with genes, which were defined by peaks in the gene body or

within 15 Kb from TSSs. The enrichments of transcription factor motifs in H3K27ac

peaks were analyzed using MEME-ChIP (MEME v4.9.1) (Machanick and Bailey, 2011).

Cell-type specific H3K27ac peaks are listed in Supplemental Table S5.
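
The MA-plot normalization of H3K27ac can be sketched as below. The text leaves several details open, so the following are assumptions for illustration only: the fractions contain equal numbers of peaks, the top 10% of fractions (3 of 30) are pooled, and the mode is estimated as the center of the most populated histogram bin of the log2 fold differences. The function name and the `bin_width` parameter are hypothetical.

```python
import math
from collections import Counter


def ma_mode_coefficient(reference, sample, n_fractions=30, top_share=0.1, bin_width=0.1):
    """Estimate a log2 normalization coefficient from the high-density end of an MA plot."""
    # A = average log2 density, M = log2 fold difference (sample over reference).
    pairs = sorted(
        ((math.log2(s) + math.log2(r)) / 2.0, math.log2(s) - math.log2(r))
        for r, s in zip(reference, sample)
        if r > 0 and s > 0
    )
    # Assumption: equal-count fractions along A; keep the top fractions only.
    size = math.ceil(len(pairs) / n_fractions)
    n_top = max(1, round(n_fractions * top_share))
    top = pairs[-size * n_top:]
    # Mode of M approximated by the most populated histogram bin.
    counts = Counter(math.floor(m / bin_width) for _, m in top)
    best_bin = max(counts, key=lambda b: (counts[b], -abs(b)))
    return (best_bin + 0.5) * bin_width
```

Using the mode rather than the mean makes the coefficient insensitive to a minority of peaks that genuinely differ between samples, which appears to be the motivation for this method given that few H3K27ac peaks are consistently high.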

Normalization and determination of peak intensities for EGFP-BLIMP1 and T were

performed as for H3K4me3, except that reads within 250 bp from peak centers were

used. Peaks satisfying the following criteria were used in the subsequent analyses:

intensities of >16 and significantly higher than the background levels within 5 Kb from

the peaks (average and 3× standard deviation of ChIP intensities). The peaks of

EGFP-BLIMP1 and T associated with genes were defined by those within 15 Kb from

the TSSs. The EGFP-BLIMP1 and T peaks are listed in Supplemental Table S6.
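
The two peak-selection criteria for EGFP-BLIMP1 and T can be sketched as a single filter. It assumes that "significantly higher than the background" means exceeding the mean plus 3× the standard deviation of the ChIP intensities within 5 Kb of the peak; the function name and argument layout are illustrative.

```python
from statistics import mean, pstdev


def passes_peak_filter(intensity, background, min_intensity=16.0, n_sd=3.0):
    """Return True if a peak satisfies both criteria described above.

    intensity:  IP/input ratio of the peak
    background: ChIP intensities sampled within 5 Kb of the peak (assumed layout)
    """
    # Assumed threshold: background mean + n_sd x standard deviation.
    threshold = mean(background) + n_sd * pstdev(background)
    return intensity > min_intensity and intensity > threshold
```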

The genome-wide absolute levels of H3K27me3 and H3K9me2 appear to change in a

dynamic fashion during both in vivo and in vitro PGC specification and development

(Hayashi et al., 2011; Nakaki et al., 2013; Seki et al., 2005; Seki et al., 2007).

Therefore, by using absolute IP efficiency measured by ChIP-Q-PCR, we normalized

the genome-wide levels of H3K27me3 and H3K9me2 in each cell type in order to

quantify their transitions during in vitro PGC specification.

ChIPs for H3K27me3 and H3K9me2 were performed with 2-3×10^4 cells (two biological

replicates for each cell type) using 2 μg of antibodies (Supplemental Table S2),

independently of the ChIP-seq analysis. The IP efficiency of repetitive elements in

the ChIP-ed DNAs was measured by Q-PCR as ∆Ct values from the input (LINE 1

ORF2, SINE B1, and IAP; Supplemental Table S1).

For ChIP-seq data, reads mapped at more than one location of the genome (mm9) were

incorporated for repetitive sequence analysis with the “-M 1” and “--best” options at

bowtie mapping (these reads were excluded from all other analyses using the “-m”

option), and the IP/input ratio of the genomic loci corresponding to the ChIP-Q-PCR

analysis was calculated. Then, the read counts of the ChIP-seq data were normalized

so that the ratios of LINE 1 ORF2 levels among the cell types in the ChIP-seq data were

the same as those in the ChIP-Q-PCR data. This normalization gave good correlations

between the ChIP-seq and ChIP-Q-PCR values of other repetitive elements (SINE B1

and IAP), and of single-copy genes (Figure S2C-S2F and data not shown).
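
The scaling of ChIP-seq read counts to the ChIP-Q-PCR measurements can be sketched as follows. The sketch assumes that the IP efficiency relative to input is 2^(-∆Ct) and that ESCs serve as the reference cell type; neither detail is stated in the text, and the function name is hypothetical.

```python
def line1_scale_factors(delta_ct, seq_ratio, reference="ESC"):
    """Per-cell-type factors by which ChIP-seq read counts are multiplied so that
    the LINE 1 ORF2 ratios among cell types match those measured by ChIP-Q-PCR.

    delta_ct:  dict mapping cell type -> ∆Ct (IP minus input) for LINE 1 ORF2
    seq_ratio: dict mapping cell type -> ChIP-seq IP/input ratio at LINE 1 ORF2
    """
    # Assumption: IP efficiency relative to input is 2^(-∆Ct).
    qpcr_efficiency = {cell: 2.0 ** (-d) for cell, d in delta_ct.items()}
    factors = {}
    for cell in delta_ct:
        target = qpcr_efficiency[cell] / qpcr_efficiency[reference]
        observed = seq_ratio[cell] / seq_ratio[reference]
        factors[cell] = target / observed
    return factors
```

As a purely illustrative case, if the Q-PCR efficiency in EpiLCs is half that in ESCs while the unnormalized ChIP-seq ratios are equal, the EpiLC read counts are multiplied by 0.5.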

Although the H3K27me3 signals were enriched around the TSSs, it was difficult to

define the peaks due to the lawn-type distribution of H3K27me3 (Marks et al., 2012)

and the genome-wide changes of the H3K27me3 levels. Similarly, we did not detect

H3K9me2 peaks around many TSSs (Figure 5). Therefore, for H3K27me3 and

H3K9me2, we simply counted the ChIP-seq reads within 1 Kb from the TSSs to define

the modification levels associated with the genes. The normalized IP/input ratio

[normalized read density (within 1 Kb from the TSSs, or any genomic sites) over input

(within 5 Kb)] averaged in the two biological replicates was defined as the IP level.

The H3K27me3 IP level corresponding to log2 expression level 8 was determined by

averaging for TSSs of genes with log2 expression levels of 8 ± 0.5. Enrichment levels

of H3K27me3 were defined by fold differences from the IP level corresponding to the

log2 expression level 8. TSSs with significant changes of H3K27me3 were defined by

enrichment levels > 1 and fold changes > 2 in at least one pair-wise comparison among

ESCs, EpiLCs, and d2 and d6 PGCLCs, and such TSSs were further classified by the

highest enrichment levels among the four key cell types (PRC2 targets). Bivalent

genes were defined as genes with significant H3K4me3 levels associated with the TSSs

and H3K27me3 enrichment level around the TSSs > 1.
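
The definitions of the H3K27me3 enrichment level and of bivalent genes can be illustrated with the sketch below. The function names and the record layout are hypothetical, and the values in any usage are illustrative rather than taken from the data.

```python
from statistics import mean


def baseline_ip_level(tss_records, center=8.0, half_width=0.5):
    """Average H3K27me3 IP level over TSSs of genes with log2 expression 8 ± 0.5.

    tss_records: list of (log2_expression, ip_level) pairs, one per TSS
    """
    window = [ip for expr, ip in tss_records if abs(expr - center) <= half_width]
    return mean(window)


def enrichment_level(ip_level, baseline):
    """Fold difference from the IP level corresponding to log2 expression level 8."""
    return ip_level / baseline


def is_bivalent(h3k4me3_significant, h3k27me3_enrichment):
    """Bivalent: significant TSS-associated H3K4me3 and H3K27me3 enrichment > 1."""
    return h3k4me3_significant and h3k27me3_enrichment > 1.0
```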

Because d2 PGCLCs and d2 LIF aggregates showed similar global distribution of

H3K27me3 (Figure S4D), IP levels were used to identify differential H3K27me3 levels

at the TSSs between the two cell types (Figure 3H).

The two biological replicates of the ChIP-seq data for each histone modification and

BLIMP1 binding exhibited good reproducibility (Figure S1F, S2G-S2L, and S3A-S3C).

Consistent with the general properties of each modification, in all the cell types, we

found a positive correlation between the levels of gene expression and the levels of

H3K4me3 peaks closest to the TSSs, and negative correlations between the levels of

gene expression and the levels of H3K27me3 and H3K9me2 around (±1 Kb) the TSSs

and between the levels of H3K4me3 and the levels of H3K27me3 (Figure S4A and

S5A).

Sequential ChIP-Q-PCR

A sequential ChIP-Q-PCR was performed using 1.8×10^7 EpiLCs, based on a protocol

reported previously (Truax and Greer, 2012). Prior to immunoprecipitation, 18 μg of a

rabbit polyclonal antibody for H3K4me3 (Abcam, ab8580) was incubated with 90 μL of

M280 Dynabeads Protein G for 1 hr at room temperature. As described above,

EpiLCs were fixed and lysed, and the chromatin was solubilized by sonication (with

1×10^6 cells/400 μL of SDS-lysis buffer per tube, and 10 cycles of sonication) and

diluted in ChIP dilution buffer. The sonication products (in total 25.2 mL) were

pooled and divided into 100 μL of the Input and 25.1 mL of the first ChIP sample. The

first ChIP samples were then subjected to immunoprecipitation as described above.

Purified ChIPed DNA was eluted in 100 μL of elution buffer, and diluted in 2 mL of the

ChIP dilution buffer. Prior to the second immunoprecipitation, 2 μg of a rabbit

polyclonal antibody for H3K27me3 (Abcam, ab108425) and a normal rabbit IgG (Santa

Cruz, sc-2027) were incubated with 5 μL of M280 Dynabeads Protein G. The diluted

and purified ChIPed DNA was then divided into 100 μL of the first IP product and 2 mL

of the second ChIP sample. A 900 μL aliquot of the second ChIP sample was subjected to

immunoprecipitation with Dynabeads-anti-H3K27me3 antibody complex or

Dynabeads-normal rabbit IgG complex. ChIPed DNA of the second

immunoprecipitation was purified and eluted in 30 μL of the elution buffer (the second

IP product). Reverse crosslinking and Proteinase K treatment followed by purification of

DNA were performed as described above. Q-PCR was performed with the primer pairs

described in Supplemental Table S1, using 1:10 diluted DNA of the Input, the first IP

product, and the second IP product.

Comparison of the ChIP-seq data for EpiLCs with those for EpiSCs

ChIP-seq and Input data for H3K4me3, H3K27ac, and H3K27me3, and transcriptome

data by microarray analysis for EpiSCs (Factor et al., 2014; Hayashi et al., 2011) were

obtained from the NCBI database (GEO accession numbers: GSE57409 for ChIP-seq,

GSM744097 and GSM744098 for microarray), and processed as described above.

Criteria regarding the levels of H3K27me3 and H3K4me3 were also defined as

described using the gene expression levels obtained from the microarray data. The

peak signals of H3K27ac (MACS P-value < 10^-5) around genes (gene bodies and < 15

Kb from the TSSs) were calculated and normalized to the median.

Gene Set Enrichment Analysis (GSEA)

To identify expression level changes caused by loss-of-function of Blimp1, single-cell

microarray data for PGCs and Blimp1-null cells with deficient Blimp1 transcripts

(Blimp1) at the E/MB stage were used (Kurimoto et al., 2008) (GEO numbers:

GSM280697 - GSM280704 for wild-type PGCs and GSM280749 - GSM280760 for

Blimp1-null, Blimp1 transcript-positive cells), and differences in the averaged log2

expression levels were calculated between wild-type PGCs and Blimp1-null, Blimp1

transcript-positive cells.

To identify expression level changes caused by gain-of-function of Blimp1, microarray

data on overexpression of the Blimp1 transgene in EpiLC aggregations (TF-PGCLCs)

were used (Nakaki et al., 2013) [GEO numbers: GSM1139220 and GSM1139230 for

day 1 Blimp1-induced TF-PGCLCs (d1B-PGCLCs), GSM1139215 and GSM1139225

for EpiLCs with a Blimp1 transgene (B-EpiLCs), GSM1139218 and GSM1139228 for a

parental clone treated with doxycycline for 1 day in aggregation (d1 Parental

aggregation), and GSM1139213 and GSM1139223 for EpiLCs of a parental clone

(Parental EpiLCs)], and the differences between the consequences of Blimp1

overexpression (d1B-PGCLCs - B-EpiLCs) and of aggregation formation (d1 Parental

aggregation - Parental EpiLCs) were calculated.
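
The gain-of-function contrast described above is a difference of differences, which can be written out per gene as in the following sketch. The function name and argument names are hypothetical; each argument is taken to be the log2 expression level of one gene averaged over the two replicates of that condition.

```python
def blimp1_gain_of_function_effect(d1b_pgclc, b_epilc, d1_parental_agg, parental_epilc):
    """Expression change attributable to Blimp1 overexpression for one gene.

    The change caused by aggregation formation in the parental clone is
    subtracted from the change seen upon Blimp1 induction.
    """
    overexpression_effect = d1b_pgclc - b_epilc
    aggregation_effect = d1_parental_agg - parental_epilc
    return overexpression_effect - aggregation_effect
```

Genes ranked by this value could then serve as the ranked list for GSEA against the gene sets with EGFP-BLIMP1 peaks.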

To identify expression level changes correlated with T, the differences in the log2

expression levels between d2 PGCLCs and EpiLCs were calculated. GSEA

(Subramanian et al., 2005) was performed using the above-described expression level

differences and lists of genes with EGFP-BLIMP1 or T peaks.

Immunofluorescence Analysis of Spread Cells

d6 PGCLCs and EpiLCs induced from BVSC ESCs were purified with FACS, mixed at

a 1:1 ratio, and spread on MAS-coated glass slides (Matsunami Glass) using Cytospin4

(Thermo Scientific) as reported previously (Nakaki et al., 2013). The spread cells

were then fixed in 4% paraformaldehyde, and incubated overnight in blocking buffer

(PBS containing 0.2% Tween 20, 0.1% BSA, and 1% normal goat serum) with rabbit

anti-Lamin B1 antibody (ab16048; Abcam). The cells were washed and incubated for

2 hrs in the blocking buffer with DAPI and anti-rabbit IgG conjugated with Alexa Fluor

633. Image data were obtained using a confocal laser scanning microscope (Olympus

FV1000).

Immunofluorescence Analysis of Cryosections

Gonads at E12.5 were fixed in 4% PFA for 2 hrs at 4ºC, then sequentially soaked in PBS

containing 10% and 30% sucrose, and embedded in OCT compound (Sakura Finetek).

The embedded samples were frozen at −80ºC and sectioned at a 10 μm thickness using a

cryostat (Leica). The cryosections were incubated overnight in the blocking buffer

containing rat anti-GFP monoclonal (Nacalai Tesque) and rabbit anti-MVH polyclonal

antibodies (Abcam), washed three times in PBS containing 0.2% Tween 20, and then

incubated in the blocking buffer containing DAPI, goat anti-rat antibody conjugated to

Alexa Fluor 488, and anti-rabbit antibody conjugated to Alexa Fluor 568. Image data

were obtained using a confocal laser scanning microscope (Olympus FV1000).

Histology

Mouse testes and ovaries were fixed in Bouin’s fixative overnight and embedded in

paraffin wax. Sectioning was performed at a 6 μm thickness using a microtome

(Leica). The sectioned samples were mounted on MAS-coated glass slides

(Matsunami Glass) and stained with hematoxylin and eosin.

Western Blot Analysis

PGCs at E12.5 of EGFP-BLIMP1 knock-in mice and Stella-EGFP transgenic mice

(Seki et al., 2007), which are wild-type at the Prdm1 locus, were purified with

Magnetic Activated Cell Sorting (MACS; Miltenyi Biotec) using anti-SSEA-1

microbeads. The proteins were separated by 6% SDS-Polyacrylamide gel

electrophoresis, blotted to an Immobilon-P Transfer Membrane (Millipore, Bedford,

MA), and incubated with rabbit anti-BLIMP1 polyclonal antibody that we generated or

mouse anti-α-Tubulin monoclonal antibody (Sigma-Aldrich). Immunodetection was

performed using goat anti-rabbit or sheep anti-mouse IgGs conjugated to HRP, and an

ECL-plus western blotting detection system (GE Healthcare). Luminescent signals

were detected using an LAS4000IR multicolor luminescent image analyzer (Fuji Film).

Accession Numbers

The ChIP-seq and normalized microarray data used in this study were deposited in the

NCBI database (GEO accession numbers: GSE60204 and GSE60018).

SUPPLEMENTAL REFERENCES

Borgel, J., Guibert, S., Li, Y., Chiba, H., Schubeler, D., Sasaki, H., Forne, T., and Weber,

M. (2010). Targets and dynamics of promoter DNA methylation during early mouse

development. Nat Genet 42, 1093-1100.

Factor, D.C., Corradin, O., Zentner, G.E., Saiakhova, A., Song, L., Chenoweth, J.G.,

McKay, R.D., Crawford, G.E., Scacheri, P.C., and Tesar, P.J. (2014). Epigenomic

comparison reveals activation of "seed" enhancers during transition from naive to

primed pluripotency. Cell Stem Cell 14, 854-863.

Hayashi, K., Ohta, H., Kurimoto, K., Aramaki, S., and Saitou, M. (2011). Reconstitution

of the mouse germ cell specification pathway in culture by pluripotent stem cells. Cell

146, 519-532.

Huang da, W., Sherman, B.T., and Lempicki, R.A. (2009a). Bioinformatics enrichment

tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic

Acids Res 37, 1-13.

Huang da, W., Sherman, B.T., and Lempicki, R.A. (2009b). Systematic and integrative

analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4,

44-57.

Kurimoto, K., Yabuta, Y., Ohinata, Y., Shigeta, M., Yamanaka, K., and Saitou, M.

(2008). Complex genome-wide transcription dynamics orchestrated by Blimp1 for the

specification of the germ cell lineage in mice. Genes Dev 22, 1617-1635.

Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. (2009). Ultrafast and

memory-efficient alignment of short DNA sequences to the human genome. Genome

Biol 10, R25.

Li, C., and Hung Wong, W. (2001). Model-based analysis of oligonucleotide arrays:

model validation, design issues and standard error application. Genome Biol 2,

RESEARCH0032.

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G.,

Abecasis, G., Durbin, R., and 1000 Genome Project Data Processing Subgroup (2009). The

Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079.

Machanick, P., and Bailey, T.L. (2011). MEME-ChIP: motif analysis of large DNA

datasets. Bioinformatics 27, 1696-1697.

Marks, H., Kalkan, T., Menafra, R., Denissov, S., Jones, K., Hofemeister, H., Nichols, J.,

Kranz, A., Francis Stewart, A., Smith, A., et al. (2012). The transcriptional and

epigenomic foundations of ground state pluripotency. Cell 149, 590-604.

Nakaki, F., Hayashi, K., Ohta, H., Kurimoto, K., Yabuta, Y., and Saitou, M. (2013).

Induction of mouse germ-cell fate by transcription factors in vitro. Nature 501, 222-226.

Robinson, J.T., Thorvaldsdottir, H., Winckler, W., Guttman, M., Lander, E.S., Getz, G.,

and Mesirov, J.P. (2011). Integrative genomics viewer. Nat Biotechnol 29, 24-26.

Seki, Y., Hayashi, K., Itoh, K., Mizugaki, M., Saitou, M., and Matsui, Y. (2005).

Extensive and orderly reprogramming of genome-wide chromatin modifications

associated with specification and early development of germ cells in mice. Dev Biol

278, 440-458.

Seki, Y., Yamaji, M., Yabuta, Y., Sano, M., Shigeta, M., Matsui, Y., Saga, Y., Tachibana,

M., Shinkai, Y., and Saitou, M. (2007). Cellular dynamics associated with the

genome-wide epigenetic reprogramming in migrating primordial germ cells in mice.

Development 134, 2627-2638.

Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A.,

Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S., et al. (2005). Gene set

enrichment analysis: a knowledge-based approach for interpreting genome-wide

expression profiles. Proc Natl Acad Sci U S A 102, 15545-15550.

Truax, A.D., and Greer, S.F. (2012). ChIP and Re-ChIP assays: investigating

interactions between regulatory proteins, histone modifications, and the DNA sequences

to which they bind. Methods Mol Biol 809, 175-188.

Yagi, T., Tokunaga, T., Furuta, Y., Nada, S., Yoshida, M., Tsukada, T., Saga, Y., Takeda,

N., Ikawa, Y., and Aizawa, S. (1993). A novel ES cell line, TT2, with high

germline-differentiating potency. Anal Biochem 214, 70-76.

Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nusbaum,

C., Myers, R.M., Brown, M., Li, W., et al. (2008). Model-based analysis of ChIP-Seq

(MACS). Genome Biol 9, R137.