NEURODEVELOPMENT
Different mutational rates and mechanisms in human cells at pregastrulation and neurogenesis

Taejeong Bae,1 Livia Tomasini,2 Jessica Mariani,2 Bo Zhou,3 Tanmoy Roychowdhury,1 Daniel Franjic,4 Mihovil Pletikos,4 Reenal Pattni,3 Bo-Juen Chen,5 Elisa Venturini,5 Bridget Riley-Gillis,5 Nenad Sestan,4,6 Alexander E. Urban,3 Alexej Abyzov,1* Flora M. Vaccarino2,4,6*
Somatic mosaicism in the human brain may alter function of individual neurons. We analyzed genomes of single cells from the forebrains of three human fetuses (15 to 21 weeks postconception) using clonal cell populations. We detected 200 to 400 single-nucleotide variations (SNVs) per cell. SNV patterns resembled those found in cancer cell genomes, indicating a role of background mutagenesis in cancer. SNVs with a frequency of >2% in brain were also present in the spleen, revealing a pregastrulation origin. We reconstructed cell lineages for the first five postzygotic cleavages and calculated a mutation rate of ~1.3 mutations per division per cell. Later in development, during neurogenesis, the mutation spectrum shifted toward oxidative damage, and the mutation rate increased. Both neurogenesis and early embryogenesis exhibit substantially more mutagenesis than adulthood.
Somatic mutagenesis is one of the emerging areas of vertebrate genome biology. Several studies revealed extensive genomic mosaicism marked by hundreds of single-nucleotide variants (SNVs) per cell in somatic tissues of the human body, such as skin fibroblasts, intestine, liver, and colon (1–3). Mosaic copy-number alterations are also common, and insertions of retrotransposable elements have been detected (4–10). Mosaicism is prominent in the central nervous system, with implications for brain evolution and the genomic underpinnings of human neuropsychiatric disorders (11, 12). Roughly 1500 SNVs might be present in mature neurons from the adult human cortex, which are only detectable in the analyzed cell and are thought to be related to transcriptional activity (13). However, the temporal origin of these SNVs during development is unknown. Furthermore, the use of in vitro whole-genome amplification (WGA) from DNA of single nuclei is prone to experimental artifacts mimicking SNVs (14, 15). Here we describe the discovery and analysis of mosaic SNVs in neuronal progenitor cells in three fetal human brains. Individual progenitor cells were allowed to proliferate into clonal cell populations, which yielded insights into the genomes of the founder cells (fig. S1) and provided an estimation of the frequency and mutation spectrum of mosaic mutations in human development while avoiding WGA-associated artifacts.
Discovery and validation of mosaic SNVs
Brains were collected from phenotypically normal postmortem human fetuses ranging in age from 15 to 21 weeks postconception. Based on a comparison of counts of germline SNVs (3,809,591 for subject 316; 4,316,547 for subject 320; and 3,746,847 for subject 275) to those derived by the 1000 Genomes Project across different human subpopulations, we concluded that subjects 316 and 275 were of non-African origin, whereas subject 320 (male, 17 weeks postconception) was of African descent.

From a bulk culture of dissociated cells of the ventricular and subventricular zones (VZ-SVZ) of the frontal region of the cerebral cortex, parietal cortex, or basal ganglia (BG), we grew 31 single-cell–derived clonal cell populations, each containing a few thousand cells, using the limiting dilution approach (fig. S1). A few possible divisions before dilution are not likely to notably contribute to the mutation landscape in each cell. DNA extracted from the individual clones, the source tissue of germinal zones, and the spleen was sequenced to a minimum of 30x genome coverage (fig. S1). For three clones, we could not derive enough cells; hence, DNA was amplified by multiple displacement amplification before sequencing.

Mosaic SNVs present in the founder cell of the clones were discovered by comparing genomes of clones both to each other and to genomes of the germinal zone tissue and spleen (table S1). We selected those calls with greater than 35% variant-allele frequency (VAF) in clones as candidate mosaic SNVs. This limit was chosen to exclude mutations arising during culture, which should have a VAF of 25% or less. The distribution for the SNV discovery data set is centered around a VAF of 50%, as expected for true mosaic variants (figs. S1 and S2).

When comparing clones to each other, we filtered the resulting calls on the basis of the conformity of their recurrence to clones that are expected and are not expected to carry the same mosaic SNVs (fig. S3). Calls from such clone-to-clone comparisons were 98.9% concordant with calls from comparing clones to VZ-SVZ brain tissue or spleen (Fig. 1A). However, among 68 calls made exclusively from clone-to-clone comparisons, 31 (46%) were missing from clone-to–brain tissue or clone-to-spleen comparisons because they corresponded to SNVs present in tissues at high frequency (Fig. 1C), demonstrating the advantage of the clone-to-clone comparison approach. Therefore, the clone-to-clone comparison represents an alternative design to the use of familial trios (1) for the study of mosaicism.

Eight randomly selected SNVs were all confirmed in the clones using polymerase chain reaction (PCR) and Sanger sequencing (table S2). As an additional validation strategy, we designed an oligonucleotide library complementary to the loci of all 6288 SNVs comprising the discovery data set and performed capture and deep resequencing (to ~1000x coverage) in the DNA from 10 clones. This confirmed the 50%-centered VAF distribution for a majority of SNVs, with a minority (5.1%) having a VAF lower than 35%, perhaps indicating that these variations could have arisen during cell culturing (figs. S1 and S4). Accordingly, we estimated our false-positive rate at around 5%. From an in silico comparison of our clones with the unrelated and well-characterized cell line NA12878, we estimated that the sensitivity for discovering mosaic SNVs in the clones was ~83% (fig. S5).
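The false-positive rate and sensitivity estimated above suggest how a raw per-clone call count might be corrected. The sketch below is one simple way to do it and is not necessarily the procedure the authors used: the function name `adjust_snv_count` and the raw count of 300 are hypothetical, and the correction assumes that false positives and missed calls are independent.

```python
def adjust_snv_count(raw_calls, false_positive_rate, sensitivity):
    """Estimate the true number of mosaic SNVs in a clone.

    Assumption (not stated in the text): a fraction `false_positive_rate`
    of the raw calls is spurious, and only a fraction `sensitivity` of the
    true SNVs is detected.
    """
    if not 0 <= false_positive_rate < 1:
        raise ValueError("false_positive_rate must be in [0, 1)")
    if not 0 < sensitivity <= 1:
        raise ValueError("sensitivity must be in (0, 1]")
    true_positives = raw_calls * (1 - false_positive_rate)
    return true_positives / sensitivity


# Values from the text: ~5% false positives, ~83% sensitivity.
# The raw count of 300 is a hypothetical example, not a reported value.
adjusted = adjust_snv_count(300, false_positive_rate=0.05, sensitivity=0.83)
print(round(adjusted))  # 343
```

Because sensitivity is below 1, the corrected count ends up higher than the raw count even after false positives are removed.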
Mosaic SNV counts, mutation spectra, and distributions across brain regions
SNVs were found at rates of 108 to 572 per clone, with clones from older brains containing more variants (Fig. 1D), which averages to 200 to 400 SNVs after adjustment for discovery sensitivity and false positives. No differences in SNV counts for clones from frontal and parietal cortex and from frontal cortex and basal ganglia of the same brain were noticeable (Fig. 1D). Similarly, the relative contributions of substitution types to the mutation spectrum were the same for clones from different brains and from different brain regions (Fig. 1E). Overall, the transition-to-transversion ratio (Ti/Tv) was 0.6, with the most frequent substitution type being a C:G→A:T transversion. This perhaps reflects DNA damage by oxidation, resulting in 8-oxoguanine that is later fixed as thymine through incorrect base pairing with adenine (16). The second most common substitution was a C:G→T:A transition, which is thought to be caused by deamination of cytosine and 5-methylcytosine (16). Linear approximation of the increase in SNV counts over time allowed estimation of a mutation rate of 5.1 (95% CI, 1.5 to 9) SNVs per day per progenitor during neurogenesis. This projects to a rate of roughly 8.6 (95% CI, 1.6 to 20) SNVs per division per progenitor, assuming that the length of the cell cycle of cortical progenitors is between 27 and 54 hours, which is based upon studies in primates (17). The large interval of this estimation is due to uncertainties in both the per-day increase in SNVs and the length of the cell cycle. Interindividual variability may further widen the confidence interval of this estimate.

Bae et al., Science 359, 550–555 (2018) 2 February 2018

1Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA. 2Child Study Center, Yale University, New Haven, CT 06520, USA. 3Departments of Psychiatry and Genetics, Stanford University, Palo Alto, CA 94305, USA. 4Department of Neuroscience, Yale University, New Haven, CT 06520, USA. 5New York Genome Center, New York, NY 10013, USA. 6Yale Kavli Institute for Neuroscience, New Haven, CT 06520, USA.
*Corresponding author. Email: [email protected] (A.A.); [email protected] (F.M.V.)

To genotype the presence of variants across
brains, we used the above-referenced capture library to conduct deep sequencing (~1000x coverage) in the source tissue (dorsal and basal germinal layers), in the corresponding outer layers containing mature neurons (frontal, parietal cortex and basal ganglia), in other brain regions (occipital cortex, cerebellum), and in a peripheral tissue, the spleen (table S3). A total of 144 SNVs were reliably genotyped, from 11 to 68 in each tissue, including spleen, with VAFs between 0.3 and 30% (Fig. 2A and fig. S6). High VAFs for seven such SNVs were further cross-confirmed with an orthogonal technique, droplet digital PCR (table S2 and figs. S7 to S17). However, for hundreds of SNVs at much lower VAFs (typically below 1%), the evidence for their presence in each tissue was not significant, likely because of their low VAFs in tissues.

[Figure 1, panels A to E. (A) Venn diagram of calls from clone-to-spleen, clone-to-tissue, and clone-to-clone comparisons. (B) Proportion of SNVs at highly confident versus other bases, with the genome average. (C) Count of genotyped SNVs versus variant allele frequency in cortex VZ-SVZ (0.01% to 100%). (D) Count of SNVs per clone for S316 (15w), S320 (17w4d), and S275 (21w). (E) Proportion of SNVs by substitution type (C>A, C>G, C>T, T>A, T>C, T>G; CpG and non-CpG).]

Fig. 1. SNV discovery in brains. (A) Three approaches of discovering mosaic SNVs were contrasted: comparing clones to the VZ-SVZ tissue of origin, comparing clones to the spleen, and comparing clones with each other (see fig. S3). The three approaches give largely concordant calls. The comparison is for calls from the brains of all three subjects. (B) Calls specific to clone-to-original tissue (in blue) and clone-to-spleen (in red) discovery approaches are notably enriched for bases with less confident calling (as defined by the mask of the 1000 Genomes Project). These residual calls were not included in the final call set. Colors as in (A). (C) VAF of genotyped SNVs from deep resequencing in all three brains. The clone-to-clone discovery approach allows for finding high-frequency mosaic SNVs in brain tissue (green line) that are missed from clone-to-tissue or clone-to-spleen comparisons. Colors as in (A). (D) Counts of mosaic SNVs (for subjects 316, 320, and 275) per clone increase linearly with fetal age (w, weeks; d, days). BG, basal ganglia; FR, frontal region; PA, parietal region. (E) Contribution of each substitution type to the mutation spectrum is not different between different fetuses and brain regions. Colors as in (D). Bars in (D) and (E) indicate mean ± SEM.

Almost 60% of the genotyped SNVs (1.4% of total SNVs) and 92% of the SNVs with VAF above 2% in at least one brain region had a nonzero VAF in the spleen (Fig. 2, B and C, and figs. S6 and S18). Because the brain is of neuroectodermal origin and the spleen is of mesodermal origin, these shared SNVs likely occurred before gastrulation, when the mesoderm, ectoderm, and endoderm differentiate from a single-layer epithelium. This suggestion is consistent with the range of VAFs of these shared SNVs, as there are about 12 cell divisions before gastrulation (18), which corresponds to expected VAFs from 0.03 to 25% in somatic tissues, depending on how early in embryogenesis variants have occurred. Some SNVs were shared between the spleen and only some of the brain regions (Fig. 2, B and C, and figs. S6 and S18). This could indicate regional sublineages, i.e., that nonmixing sets of progenitors generate neurons in different brain regions and that these distinct populations of progenitors may not share a common ancestor with cells in the spleen. However, conducting more sensitive assays is necessary to exclude the possibility of incomplete genotyping. Another observation pointing to an early origin of the mutations is that the overlap between SNVs in the basal ganglia VZ-SVZ and the cortical VZ-SVZ with their respective differentiated regions was less than the overlap between the four regions together (fig. S18), suggesting that most of the genotyped mutations found in the brain arise at least before the splitting between basal and dorsal regions of the telencephalon, i.e., at around the neural plate stage or earlier.

Genotyped mosaic SNVs clearly cluster by similar VAFs in each brain (Fig. 3 and fig. S19). On the basis of the average values of the frequencies for each cluster and sharing of such SNVs between clones, we concluded that these clusters likely represent variants created during sequential postzygotic cleavages. Assuming equal contribution of dividing cells to tissues, we reconstructed the cell-progeny tree and determined the precise origin of 84 mutations during the first five postzygotic cleavages (Fig. 3B and fig. S19). These mutations typically had VAFs above 1%, whereas the remaining ones, typically with VAFs below 1%, were assigned to later divisions. Only two SNVs had conflicting assignment between clusters and clones, which perhaps could be explained by misclustering or incorrect discovery and/or genotyping in clones. Alternatively, these conflicts, along with the high VAFs for the very first SNV in the tree (Fig. 3A), may indicate an unequal lineage contribution to tissues due to asymmetric division, unequal proliferation, or positive or negative selection (19).

Using the trees, we estimated the average mutation rates per division per daughter cell in the early human embryo as 1.66 ± 0.24, 1.18 ± 0.33, and 1.05 ± 0.22 for brains of subjects 316, 320, and 275, respectively. The weighted average and variance of the three measurements are 1.3 ± 0.15, consistent with the rate of 1.2 estimated from analysis of de novo SNVs in familial trios (20), and lower than the lower-bound estimate for mutability of neuronal progenitors, thereby suggesting that the mutation rate during neurogenesis is higher than that in the early embryo.

We then split the set of mosaic SNVs into early origin (those genotyped in tissues by capture experiments) and likely late origin (those not genotyped). The spectrum of early mosaic mutations (i.e., the frequency of nucleotide substitution in the context of trinucleotide motifs) bears little resemblance to the spectrum of SNVs occurring later in neuronal progenitors, revealing a shift in mutagenesis during development (Fig. 3C). Early mutations had the same 2.2 Ti/Tv ratio as germline variants and had a larger fraction of C:G→T:A transitions overall (P value of 2.2 × 10^−16, by Fisher's exact test), particularly in CpG motifs (P value of 4.3 × 10^−5, by Fisher's exact test), implicating the spontaneous deamination of 5-methylcytosine as a contributor to the mutagenic process (16). The signature of earlier mutations was also similar to the signature of de novo mutations in the human population (20). As some of the early mutations can be passed to the next generation through the germline lineage, the convergence of their spectrum with de novo and germline variants is expected. Later mutations, on the other hand, had a larger contribution of C:G→A:T transversions (P value of 8.0 × 10^−12, by Fisher's exact test), implicating oxidative damage as a significant contributor to the mutagenic process (16). Furthermore, the mutation spectrum for these transversions was most similar (Pearson's correlation coefficient r = 0.90) to the spectrum observed in colorectal cancer that results from a deficiency in the DNA glycosylase MUTYH, where repair of 8-oxoguanine, the product of oxidative damage, is compromised (21).
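Two numerical steps in the early-embryo analysis above can be re-derived with a few lines of arithmetic. The sketch below assumes equal contribution of all cells to tissues, so that a heterozygous mutation arising at the first postzygotic division is expected at a VAF of 25%, halving with each later division. It also assumes that the three per-brain rates were combined by inverse-variance weighting; the text does not state the weighting scheme, but this choice reproduces the reported 1.3 ± 0.15. The function names are illustrative.

```python
import math


def expected_vaf(division):
    """Expected VAF of a heterozygous SNV arising at a postzygotic division.

    Division 1 leaves the mutation in one of two cells (half of the cells,
    one of two alleles), i.e. 25%; each later division halves that value.
    """
    if division < 1:
        raise ValueError("division must be >= 1")
    return 0.5 ** (division + 1)


def inverse_variance_mean(estimates):
    """Combine (value, standard_error) pairs by inverse-variance weighting."""
    weights = [1.0 / (se * se) for _, se in estimates]
    total = sum(weights)
    mean = sum(w * value for w, (value, _) in zip(weights, estimates)) / total
    return mean, math.sqrt(1.0 / total)


# Expected VAFs for the first five postzygotic divisions.
for d in range(1, 6):
    print(f"D{d}: {100 * expected_vaf(d):.2f}%")

# Per-brain rates from the text (subjects 316, 320, 275).
rates = [(1.66, 0.24), (1.18, 0.33), (1.05, 0.22)]
mean, se = inverse_variance_mean(rates)
print(f"combined rate: {mean:.2f} +/- {se:.2f}")  # 1.30 +/- 0.15
```

The expected values for divisions 1 to 5 (25%, 12.5%, 6.25%, 3.125%, and about 1.56%) are the yardstick against which observed cluster averages can be compared when assigning clusters to divisions.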
Properties of mosaic SNVs
For the following analyses, we used all mosaic SNVs in the discovery data set from all clones, totaling 6288, as mutation spectra for the three brains were extremely similar (fig. S20). The distribution of distances between neighboring mosaic SNVs was consistent with uniform random placement across the genome (fig. S21). In line with this, no enrichment of exonic and intronic SNVs in any gene ontology category was observed either when assuming uniform background mutation rate or when using, as a background, mosaic SNVs from liver, intestine, colon, or fibroblasts (1, 2). Roughly 3% of SNVs may have a functional consequence by affecting either protein-coding sequence or gene regulation (fig. S21 and table S4). This projects to about 12 nonbenign SNVs per progenitor cell at 20 postconception weeks. A significant depletion of mosaic SNVs was observed in deoxyribonuclease (DNase)–hypersensitive sites relative to flanking regions (Fig. 4A). The depletion was more pronounced (10 versus 5%) when utilizing DNase-hypersensitive sites for fetal brain rather than for a lymphoblastoid cell line, suggesting a relation between a cell's epigenome and the genesis of mutations. Because no such depletion was observed in coding relative to intronic gene regions, the depletion is not the result of negative selection and, rather, likely reflects better repair efficiency in open DNA regions, as was observed for somatic mutations in cancers (22, 23).

[Figure 2, panels A to D. (A) Count of SNVs versus variant allele frequency (0.01% to 100%) for BG-VZ-SVZ, PA-CX, occipital cortex, BG, FR-CX, spleen, FR-VZ-SVZ, and cerebellum (n = 42 to 68 per tissue), distinguishing ungenotyped, mosaic, and germline SNVs. (B) Venn diagram across cerebellum, occipital cortex, FR-VZ-SVZ, BG-VZ-SVZ, and spleen. (C and D) Pairwise VAF comparisons among FR-CX, BG, spleen, occipital cortex, and cerebellum.]

Fig. 2. Genotyping of SNVs in original tissues. (A) Several dozens of mosaic SNVs with VAFs of 0.3 to 30% in tissues from various brain regions and the spleen are genotyped by the capture-resequencing approach (green line). For hundreds more SNVs, the evidence for presence in tissue is indistinguishable from background noise (blue line). BG, basal ganglia; FR and PA, frontal and parietal region of the cerebral cortex, respectively; VZ-SVZ, ventricular germinal layers; CX, outer cortical layer (see fig. S1). (B) Venn diagram of genotyped mosaic SNVs across brain regions and spleen for subject 316. Almost 60% of mosaic SNVs could be genotyped in one or more brain regions and spleen, and 44% could be genotyped in all brain regions and spleen. (C and D) Comparative VAFs for mosaic SNVs across different brain regions and spleen for the same subject. Many SNVs are shared by multiple brain regions and by brain and spleen with similar VAFs (SNVs shared across two tissues are indicated by green, red, and blue circles, whereas SNVs shared across three tissues are indicated by magenta circles). Black and gray circles indicate SNVs genotyped in only one region.

Similar to somatic SNVs in cancers (24) and mosaic SNVs in skin fibroblasts (1), we found that the density of mosaic SNVs in the brain correlates negatively with most histone marks in fetal brain and embryonic stem cells (Fig. 4B). Comparison of our mosaic SNVs with mutational signatures found in cancer (25, 26) revealed that signatures 18 and 8 (found in neuroblastoma and medulloblastoma, respectively), as well as their combination, are the best descriptors for the spectrum of mosaic mutations in the developing brain (Fig. 4, C and D). Mosaic SNVs were equally well described by the combination of signatures 5 with 18 and 1B with 18. Therefore, signature 18, with suspected etiology of oxidative damage (27), consistently contributed to the mutation spectrum of mosaic SNVs in fetal brain progenitors. This signature was mostly similar to late SNVs, whereas signature 1B was mostly similar to the early ones (fig. S22).
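The comparison of a mutation spectrum with reference signatures, alone or in pairs, can be illustrated as follows. This is a toy sketch and not the authors' pipeline: the six-channel vectors are made-up numbers standing in for full trinucleotide spectra, the names `sig_x` and `sig_y` are hypothetical, and the pairwise fit uses a simple grid search over the mixing weight rather than any particular published fitting method.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def best_pair_mixture(spectrum, sig_a, sig_b, steps=100):
    """Find the weight w in [0, 1] such that w*sig_a + (1-w)*sig_b
    correlates best with the observed spectrum (grid search)."""
    best_w, best_r = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        mix = [w * a + (1 - w) * b for a, b in zip(sig_a, sig_b)]
        r = pearson(spectrum, mix)
        if r > best_r:
            best_w, best_r = w, r
    return best_w, best_r


# Hypothetical proportions over C>A, C>G, C>T, T>A, T>C, T>G.
sig_x = [0.60, 0.05, 0.15, 0.05, 0.10, 0.05]   # C>A-dominated
sig_y = [0.10, 0.05, 0.55, 0.05, 0.20, 0.05]   # C>T-dominated
observed = [0.7 * a + 0.3 * b for a, b in zip(sig_x, sig_y)]

w, r = best_pair_mixture(observed, sig_x, sig_y)
print(f"weight of sig_x: {w:.2f}, correlation: {r:.3f}")
```

Since the toy spectrum is constructed as a 70/30 mixture, the search recovers a weight of 0.70 for `sig_x`; with real data the best correlation would fall below 1 and the recovered weights would only be approximate.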
Implications for development and disease
Our study uncovered extensive mosaicism in human fetal brain, with 200 to 400 SNVs present per brain progenitor cell at 15 to 21 weeks of gestation. This amount of mosaicism is likely inherited by cortical postmitotic neurons, as neurogenesis ends at around 20 weeks in humans (28). Indeed, our estimate is in good agreement with the estimate that postmitotic neurons have ~300 to 900 mosaic SNVs within one year of birth (29). There is an order-of-magnitude difference between numbers of mosaic SNVs and de novo single-nucleotide polymorphisms (SNPs) (20), implying a higher effect of mosaic SNVs on normal brain development and disease. Indeed, we estimate that up to 12 nonbenign mutations can be present in neuronal progenitors and consequently transmitted to a sizable fraction of daughter neurons. It is conceivable that, in rare cases, some of these mutations may have a strong deleterious effect, for example, initiating overgrowth (30, 31) or neoplastic transformation by knocking out key genes. Indeed, the resemblance of mosaic SNVs in fetal brain to somatic mutations in brain cancers and, particularly, to medulloblastoma supports the theory that cancer-driving mutations can happen by chance during background mutagenesis (32).

As dozens of discovered mutations happen before gastrulation, our study demonstrates that early postzygotic mutations can be reconstructed from the analysis of a handful of clones and tissues, opening an avenue for charting individualized mosaicism maps. As mosaic variants can contribute to interindividual phenotypic differences and have been implicated in an individual's disease risk, we suggest that knowing the individual “mosaicome” could be as important as knowing the individual germline genome, particularly given the much stronger selection acting on germline variants and the lower penetrance of mosaic variants that is likely to be translated into milder phenotypes.

[Figure 3, panels A to C. (A) VAF matrix (0% to 35%) for individual SNVs, listed by genomic coordinate, across spleen, cerebellum, occipital cortex, PA-CX, BG, BG-VZ-SVZ, FR-VZ-SVZ, and FR-CX, and across clones #11, #9b, #1, #3, #18b, #22, #20b, #19b, #25, #24, and #6; clusters D2 (11.34%), D3 (6.85%), D4 (3.44%), D5 (1.48%), and later divisions. (B) Cell-progeny tree with expected VAFs of 25%, 12.5%, 6.25%, 3.125%, and 1.56%; mutations are marked traceable or untraceable. (C) Proportion of SNVs by substitution type (C>A, C>G, C>T, T>A, T>C, T>G; CpG and non-CpG) for early and late SNVs, with P values of 4.3 × 10^−5, 8.0 × 10^−12, and 2.2 × 10^−16, and a correlation triangle for spectra of late (n = 6,144), early (n = 144), and de novo (n = 747, ref. 20) SNVs with r values of 0.36, 0.29, and 0.77.]

Fig. 3. Reconstruction of mosaic SNV mutations during early development of subject 316. (A) Hierarchical clustering of SNVs genotyped in the different brain regions and spleen by their VAFs revealed grouping consistent with SNV sharing between clones (white squares represent zero VAF). Black and gray squares denote, respectively, SNVs discovered in clones and SNVs missed during discovery but genotyped afterwards. For completeness, five SNVs (marked with *) were included in the analysis if present in multiple clones but the corresponding VAF estimation from capture resequencing was not available. Their VAFs were estimated from whole-genome tissue sequencing. On the basis of the corresponding average VAF (shown underneath each cluster), each cluster was assigned to consecutive postzygotic divisions: D1 (no SNVs observed), D2, D3, D4, and D5. The # notation indicates the ID number of each clone. (B) The reconstructed cell-progeny tree during those divisions had only two conflicts of SNV assignment, denoted by “?”, between clusters and clones. “Expected VAF” denotes VAF of mutations arising at each stage, assuming equal contribution of all progenies to tissues. (C) Mutational spectra of likely early mosaic SNVs (darker color shades) and presumably later arising SNVs (lighter color shades) are different. The difference in the spectra is due to the shift in frequency of C:G→T:A transitions, particularly in CpG motifs, and C:G→A:T transversions. The spectrum of early SNVs is much closer to the spectrum for de novo SNVs in the human population (triangle with Pearson's correlation coefficient r values). Random distribution represents correlation coefficients when randomly, but proportionally, subsampling early and late mutations.

We also discovered a shift in mutagenesis during development that is characterized by an increased mutation rate and a change in frequency of substitution types. We cannot rule out that the increased mutation rate can be partially explained by interindividual variation, although we have no evidence for such variability. The shift occurs sometime between early cleavages and neurogenesis and may be the consequence of physiological, biochemical, and gene-expression changes related to the generation of neurons from neural stem cells. Alternatively, the shift may reflect more general developmental processes common to all tissues during organogenesis, and, on the basis of increased counts of mutations related to oxidative damage, could be coupled to a higher availability of reactive oxygen species after development of the cardiovascular system of the embryo. If this is the case, we predict that mutation spectra and rates per division undergo a similar shift during development across all somatic lineages.

Our estimated average mutation rate of 5.1 SNVs per day per neuronal progenitor during neurogenesis implies that neurons generated at early and later stages of neurogenesis will carry different burdens of mosaic variants. This rate is three orders of magnitude higher than 0.4 to 2 mutations per year accumulated in the germline lineage of adults (20, 33, 34). It is also 50 times higher than the rate in postnatal stem cells of the small intestine, colon, and liver, estimated to be 36 mutations per year (2). Therefore, our results show that the prenatal period is intrinsically highly mutagenic, likely the consequence of oxidative damage coupled with more frequent cell divisions.
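The rate comparisons in this paragraph, and the per-division rate quoted earlier, follow from simple unit conversions, sketched below. The per-year figures and the 27-to-54-hour cell cycle come from the text; taking the midpoint of that cell-cycle range for the per-division figure is an assumption that happens to reproduce the reported value of roughly 8.6. Variable names are illustrative.

```python
RATE_PER_DAY = 5.1            # SNVs per day per progenitor (from the text)
DAYS_PER_YEAR = 365

rate_per_year = RATE_PER_DAY * DAYS_PER_YEAR

# Germline lineage of adults: 0.4 to 2 mutations per year (from the text).
fold_vs_germline = (rate_per_year / 2.0, rate_per_year / 0.4)

# Postnatal stem cells: about 36 mutations per year (from the text).
fold_vs_stem_cells = rate_per_year / 36

# Per-division rate, assuming a cell cycle at the midpoint of 27-54 hours.
cell_cycle_hours = (27 + 54) / 2
rate_per_division = RATE_PER_DAY * cell_cycle_hours / 24

print(f"{rate_per_year:.0f} SNVs per year")
print(f"{fold_vs_germline[0]:.0f}x to {fold_vs_germline[1]:.0f}x the germline rate")
print(f"{fold_vs_stem_cells:.0f}x the postnatal stem cell rate")
print(f"{rate_per_division:.1f} SNVs per division")
```

The germline comparison comes out at roughly 900-fold to 4700-fold, that is, on the order of a thousand, and the stem-cell comparison at about 52-fold, in line with the "three orders of magnitude" and "50 times" stated above.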
We found no difference in SNV counts between progenitors from cortex and from basal ganglia, implying that mosaicism accumulates at similar rates across the brain during neurogenesis. With the observed rate of 5.1 SNVs per day per neuronal progenitor, one can project that cells in the forebrain subventricular zone and hippocampal subgranular zone, where neurogenesis and gliogenesis continue for more extended time periods (35–37), would accumulate about 1000 mosaic mutations by the time of birth. This estimate is consistent with the estimates of about 1000 mosaic SNVs present in skin fibroblast cells and in stem cells of the colon and intestine in children (1, 2); indeed, mutation rates in all somatic proliferative cell lineages during prenatal development may be similar.
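One way to see how a figure of about 1000 mutations by birth could arise is a linear extrapolation of the observed rate. The sketch below is a back-of-envelope illustration under stated assumptions, not the authors' calculation: it assumes the rate of 5.1 SNVs per day stays constant, starts from the 200 to 400 SNVs observed per progenitor, takes 21 weeks postconception as the starting point, and places birth at 38 weeks postconception.

```python
RATE_PER_DAY = 5.1     # SNVs per day per progenitor (from the text)
START_WEEK = 21        # assumed starting point, weeks postconception
BIRTH_WEEK = 38        # assumed time of birth, weeks postconception


def projected_count(baseline, start_week=START_WEEK, end_week=BIRTH_WEEK,
                    rate_per_day=RATE_PER_DAY):
    """Linearly extrapolate an SNV count from start_week to end_week."""
    days = (end_week - start_week) * 7
    return baseline + rate_per_day * days


# Baselines of 200 and 400 SNVs bracket the range reported in the text.
low, high = projected_count(200), projected_count(400)
print(f"projected SNVs at birth: {low:.0f} to {high:.0f}")
```

Under these assumptions the projection lands at roughly 800 to 1000 SNVs, the same order as the estimate quoted above; different choices of starting week or baseline shift the result accordingly.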
[Figure 4, panels A to D. (A) SNV frequency versus genomic distance from DNase region (kb, −20 to 20) for fetal brain (P < 0.0001) and NA12878 (P = 0.0183). (B) Correlation with SNV density versus epigenetic mark density for ESC and fetal brain. (C) Correlation of mutational signatures to the spectrum. (D) Number of pairs versus correlation of pairs of signatures to the spectrum: 5 and 18 (59.9%, 40.1%); 8 and 18 (73.9%, 26.1%); 1B and 18 (56.1%, 43.9%).]

Fig. 4. Properties of mosaic SNVs in brain. (A) Depletion of mosaic SNVs in DNase-hypersensitive sites, possibly indicating a better efficiency of DNA-repair pathways in those regions (22, 23). Kb, kilobase. (B) Density of mosaic SNVs correlates negatively with histone marks in embryonic stem cells (ESCs) and fetal brain, revealing similarity to somatic SNVs in cancers. (C) Mutational signatures 8 and 18 (orange) found in brain cancers have the highest two correlations with the mutation spectrum of mosaic SNVs. (D) Exhaustive combinations of pairs of signatures consistently show that signatures 1B and 5 also contribute to the description of the mutation spectrum in combination with signature 18. Thus, signature 18 is the best descriptor of mosaic SNVs in developing brain.

REFERENCES AND NOTES

1. A. Abyzov et al., Genome Res. 27, 512–523 (2017).
2. F. Blokzijl et al., Nature 538, 260–264 (2016).
3. N. Saini et al., PLOS Genet. 12, e1006385 (2016).
4. A. Abyzov et al., Nature 492, 438–442 (2012).
5. M. O’Huallachain, K. J. Karczewski, S. M. Weissman, A. E. Urban, M. P. Snyder, Proc. Natl. Acad. Sci. U.S.A. 109, 18018–18023 (2012).
6. M. J. McConnell et al., Science 342, 632–637 (2013).
7. C. C. Laurie et al., Nat. Genet. 44, 642–650 (2012).
8. K. B. Jacobs et al., Nat. Genet. 44, 651–658 (2012).
9. G. D. Evrony et al., Cell 151, 483–496 (2012).
10. J. A. Erwin et al., Nat. Neurosci. 19, 1583–1591 (2016).
11. T. R. Insel, Mol. Psychiatry 19, 156–158 (2014).
12. M. J. McConnell et al., Science 356, eaal1641 (2017).
13. M. A. Lodato et al., Science 350, 94–98 (2015).
14. C. Chen et al., Science 356, 189–194 (2017).
15. X. Dong et al., Nat. Methods 14, 491–493 (2017).
16. A. Bacolla, D. N. Cooper, K. M. Vasquez, Genes (Basel) 5, 108–146 (2014).
17. D. R. Kornack, P. Rakic, Proc. Natl. Acad. Sci. U.S.A. 95, 1242–1246 (1998).
18. K. L. Moore, T. V. N. Persaud, M. G. Torchia, Before We Are Born (Elsevier Health Sciences, 2015).
19. Y. S. Ju et al., Nature 543, 714–718 (2017).
20. R. Rahbari et al., Nat. Genet. 48, 126–133 (2016).
21. A. Viel et al., EBioMedicine 20, 39–49 (2017).
22. D. Perera et al., Nature 532, 259–263 (2016).
23. R. Sabarinathan, L. Mularoni, J. Deu-Pons, A. Gonzalez-Perez, N. López-Bigas, Nature 532, 264–267 (2016).
24. P. Polak et al., Nature 518, 360–364 (2015).
25. L. B. Alexandrov et al., Nature 500, 415–421 (2013).
26. M. S. Lawrence et al., Nature 499, 214–218 (2013).
27. T. Helleday, S. Eshtad, S. Nik-Zainal, Nat. Rev. Genet. 15, 585–598 (2014).
28. A. A. Pollen et al., Cell 163, 55–67 (2015).
29. M. A. Lodato et al., Science 359, 555–559 (2018).
30. A. Poduri et al., Neuron 74, 41–48 (2012).
31. J. H. Lee et al., Nat. Genet. 44, 941–945 (2012).
32. C. Tomasetti, B. Vogelstein, Science 347, 78–81 (2015).
33. J. J. Michaelson et al., Cell 151, 1431–1442 (2012).
34. H. Jónsson et al., Nature 549, 519–522 (2017).
35. F. H. Gage, J. Neurosci. 22, 612–613 (2002).
36. A. Ernst et al., Cell 156, 1072–1083 (2014).
37. J. T. Gonçalves, S. T. Schafer, F. H. Gage, Cell 167, 897–914 (2016).
ACKNOWLEDGMENTS
This work was supported by the high-performance computing (HPC) facilities operated by the Yale Center for Research Computing and Yale’s W. M. Keck Biotechnology Laboratory, as well as their respective staff. This work is also supported by NIH grants RR19895 and RR029676-01, which helped fund the cluster. The sequencing data from this study have been deposited to the NIH National Institute of Mental Health (NIMH) Data Archive (https://data-archive.nimh.nih.gov) under collection ID #2330 and DOI: 10.15154/1410419. This work was funded by the Mayo Clinic Center for Individualized Medicine and by NIH grants R01 MH100914 (F.M.V.), U01 MH106876 (F.M.V., A.A., A.E.U.), U01 MH106874 (N.S.), P50 MH106934 (N.S.), and R03 CA191421 (A.A.). A.A. is also a Visiting Professor at Yale Child Study Center. The supplementary materials contain additional data.
SUPPLEMENTARY MATERIALS
www.sciencemag.org/content/359/6375/550/suppl/DC1
Materials and Methods
Figs. S1 to S24
Tables S1 to S4
References (38–41)

23 June 2017; accepted 28 November 2017
Published online 7 December 2017
10.1126/science.aan8690
Bae et al., Science 359, 550–555 (2018), 2 February 2018
Different mutational rates and mechanisms in human cells at pregastrulation and neurogenesis
Taejeong Bae, Livia Tomasini, Jessica Mariani, Bo Zhou, Tanmoy Roychowdhury, Daniel Franjic, Mihovil Pletikos, Reenal Pattni, Bo-Juen Chen, Elisa Venturini, Bridget Riley-Gillis, Nenad Sestan, Alexander E. Urban, Alexej Abyzov and Flora M. Vaccarino
Science 359 (6375), 550-555. DOI: 10.1126/science.aan8690, originally published online December 7, 2017

Brain mutations, young and old
Most neurons that make up the human brain are postmitotic, living and functioning for a very long time without renewal (see the Perspective by Lee). Bae et al. examined the genomes of single neurons from the prenatal developing human brain. Both the type of mutation and the rates of accumulation changed between gastrulation and neurogenesis. These early mutations could be generating useful neuronal diversity or could predispose individuals to later dysfunction. Lodato et al. also found that neurons take on somatic mutations as they age by sequencing single neurons from subjects aged 4 months to 82 years. Somatic mutations accumulated with increasing age and accumulated faster in individuals affected by inborn errors in DNA repair. Postmitotic mutations might only affect one neuron, but the accumulated divergence of genomes across the brain could affect function.
Science, this issue p. 550, p. 555; see also p. 521
ARTICLE TOOLS: http://science.sciencemag.org/content/359/6375/550
SUPPLEMENTARY MATERIALS: http://science.sciencemag.org/content/suppl/2017/12/06/science.aan8690.DC1
RELATED CONTENT: http://science.sciencemag.org/content/sci/359/6375/521.full
http://science.sciencemag.org/content/sci/359/6375/555.full
REFERENCES: This article cites 40 articles, 11 of which you can access for free. http://science.sciencemag.org/content/359/6375/550#BIBL
PERMISSIONS: http://www.sciencemag.org/help/reprints-and-permissions

Use of this article is subject to the Terms of Service.

Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.
Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.