

NEURODEVELOPMENT

Different mutational rates and mechanisms in human cells at pregastrulation and neurogenesis

Taejeong Bae,1 Livia Tomasini,2 Jessica Mariani,2 Bo Zhou,3 Tanmoy Roychowdhury,1 Daniel Franjic,4 Mihovil Pletikos,4 Reenal Pattni,3 Bo-Juen Chen,5 Elisa Venturini,5 Bridget Riley-Gillis,5 Nenad Sestan,4,6 Alexander E. Urban,3 Alexej Abyzov,1* Flora M. Vaccarino2,4,6*

Somatic mosaicism in the human brain may alter function of individual neurons. We analyzed genomes of single cells from the forebrains of three human fetuses (15 to 21 weeks postconception) using clonal cell populations. We detected 200 to 400 single-nucleotide variations (SNVs) per cell. SNV patterns resembled those found in cancer cell genomes, indicating a role of background mutagenesis in cancer. SNVs with a frequency of >2% in brain were also present in the spleen, revealing a pregastrulation origin. We reconstructed cell lineages for the first five postzygotic cleavages and calculated a mutation rate of ~1.3 mutations per division per cell. Later in development, during neurogenesis, the mutation spectrum shifted toward oxidative damage, and the mutation rate increased. Both neurogenesis and early embryogenesis exhibit substantially more mutagenesis than adulthood.

Somatic mutagenesis is one of the emerging areas of vertebrate genome biology. Several studies revealed extensive genomic mosaicism marked by hundreds of single-nucleotide variants (SNVs) per cell in somatic tissues of the human body, such as skin fibroblasts, intestine, liver, and colon (1–3). Mosaic copy-number alterations are also common, and insertions of retrotransposable elements have been detected (4–10). Mosaicism is prominent in the central nervous system, with implications for brain evolution and the genomic underpinnings of human neuropsychiatric disorders (11, 12). Roughly 1500 SNVs might be present in mature neurons from the adult human cortex, which are only detectable in the analyzed cell and are thought to be related to transcriptional activity (13). However, the temporal origin of these SNVs during development is unknown. Furthermore, the use of in vitro whole-genome amplification (WGA) from DNA of single nuclei is prone to experimental artifacts mimicking SNVs (14, 15). Here we describe the discovery and analysis of mosaic SNVs in neuronal progenitor cells in three fetal human brains. Individual progenitor cells were allowed to proliferate into clonal cell populations, which yielded insights into the genomes of the founder cells (fig. S1) and provided an estimation of the frequency and mutation spectrum of mosaic mutations in human development while avoiding WGA-associated artifacts.

Discovery and validation of mosaic SNVs

Brains were collected from phenotypically normal postmortem human fetuses ranging in age from 15 to 21 weeks postconception. Based on a comparison of counts of germline SNVs (3,809,591 for subject 316; 4,316,547 for subject 320; and 3,746,847 for subject 275) to those derived by the 1000 Genomes Project across different human subpopulations, we concluded that subjects 316 and 275 were of non-African origin, whereas subject 320 (male, 17 weeks postconception) was of African descent.

From a bulk culture of dissociated cells of the ventricular and subventricular zones (VZ-SVZ) of the frontal region of the cerebral cortex, parietal cortex, or basal ganglia (BG), we grew 31 single-cell–derived clonal cell populations, each containing a few thousand cells, using the limiting dilution approach (fig. S1). A few possible divisions before dilution are not likely to notably contribute to the mutation landscape in each cell. DNA extracted from the individual clones, the source tissue of germinal zones, and the spleen was sequenced to a minimum of 30x genome coverage (fig. S1). For three clones, we could not derive enough cells; hence, DNA was amplified by multiple displacement amplification before sequencing.

Mosaic SNVs present in the founder cell of the clones were discovered by comparing genomes of clones both to each other and to genomes of the germinal zone tissue and spleen (table S1). We selected those calls with greater than 35% variant-allele frequency (VAF) in clones as candidate mosaic SNVs. This limit was chosen to exclude mutations arising during culture, which should have a VAF of 25% or less. The distribution for the SNV discovery data set is centered around a VAF of 50%, as expected for true mosaic variants (figs. S1 and S2).

When comparing clones to each other, we filtered the resulting calls on the basis of the conformity of their recurrence to clones that are expected and are not expected to carry the same mosaic SNVs (fig. S3). Calls from such clone-to-clone comparisons were 98.9% concordant with calls from comparing clones to VZ-SVZ brain tissue or spleen (Fig. 1A). However, among 68 calls made exclusively from clone-to-clone comparisons, 31 (46%) were missing from clone-to-brain tissue or clone-to-spleen comparisons because they corresponded to SNVs present in tissues at high frequency (Fig. 1C), demonstrating the advantage of the clone-to-clone comparison approach. Therefore, the clone-to-clone comparison represents an alternative design to the use of familial trios (1) for the study of mosaicism.

Eight randomly selected SNVs were all confirmed in the clones using polymerase chain reaction (PCR) and Sanger sequencing (table S2). As an additional validation strategy, we designed an oligonucleotide library complementary to the loci of all 6288 SNVs comprising the discovery data set and performed capture and deep resequencing (to ~1000x coverage) in the DNA from 10 clones. This confirmed the 50%-centered VAF distribution for a majority of SNVs, with a minority (5.1%) having a VAF lower than 35%, perhaps indicating that these variations could have arisen during cell culturing (figs. S1 and S4). Accordingly, we estimated our false-positive rate at around 5%. From an in silico comparison of our clones with the unrelated and well-characterized cell line NA12878, we estimated that the sensitivity for discovering mosaic SNVs in the clones was ~83% (fig. S5).
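The 35% cutoff follows from dilution arithmetic: a heterozygous variant already present in the founder cell should sit near 50% VAF in the clone, whereas a variant arising in one daughter cell at the k-th division in culture is expected near 50%/2^k (25% for the first division). The Python sketch below illustrates that reasoning, together with one plausible way to correct a raw per-clone count for the ~5% false-positive rate and ~83% sensitivity. The function names and the correction formula (raw × (1 − FP) / sensitivity) are illustrative assumptions, not the authors' pipeline.

```python
def expected_culture_vaf(division: int) -> float:
    """Expected VAF of a heterozygous variant that arises in one daughter
    cell at the given division in culture (1 = first division of the founder)."""
    if division < 1:
        raise ValueError("division must be >= 1")
    return 0.5 / (2 ** division)


def is_candidate_mosaic(vaf: float, threshold: float = 0.35) -> bool:
    """Keep calls above the VAF threshold (35% in the text)."""
    return vaf > threshold


def adjust_count(raw_count: float,
                 false_positive_rate: float = 0.05,
                 sensitivity: float = 0.83) -> float:
    """Assumed correction: remove the expected false positives, then
    scale up for variants missed at the stated sensitivity."""
    if not 0 < sensitivity <= 1:
        raise ValueError("sensitivity must be in (0, 1]")
    return raw_count * (1.0 - false_positive_rate) / sensitivity


if __name__ == "__main__":
    # Variants arising in culture fall at or below 25%, under the 35% cutoff.
    for k in (1, 2, 3):
        vaf = expected_culture_vaf(k)
        print(f"culture division {k}: expected VAF {vaf:.3f}, "
              f"kept={is_candidate_mosaic(vaf)}")
    # A founder-cell variant near 50% VAF passes the filter.
    print("founder variant kept:", is_candidate_mosaic(0.5))
    # Hypothetical raw count of 300 calls in one clone.
    print("adjusted count:", round(adjust_count(300), 1))
```

The threshold sits between the two expected modes (50% and 25%), so sampling noise at 30x coverage is less likely to push a culture-derived variant over the cutoff than a lower threshold would allow.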

Mosaic SNV counts, mutation spectra, and distributions across brain regions

SNVs were found at rates of 108 to 572 per clone, with clones from older brains containing more variants (Fig. 1D), which averages to 200 to 400 SNVs after adjustment for discovery sensitivity and false positives. No differences in SNV counts for clones from frontal and parietal cortex and from frontal cortex and basal ganglia of the same brain were noticeable (Fig. 1D). Similarly, the relative contributions of substitution types to the mutation spectrum were the same for clones from different brains and from different brain regions (Fig. 1E). Overall, the transition-to-transversion ratio (Ti/Tv) was 0.6, with the most frequent substitution type being a C:G→A:T transversion. This perhaps reflects DNA damage by oxidation, resulting in 8-oxoguanine that is later fixated to thymine through incorrect base pairing with adenine (16). The second most common substitution was a C:G→T:A transition, which is thought to be caused by deamination of cytosine and 5-methylcytosine (16). Linear approximation of the increase in SNV counts over time allowed estimation of a mutation rate of 5.1 (95% CI, 1.5 to 9) SNVs per day per progenitor during neurogenesis. This projects to a rate of roughly 8.6 (95% CI, 1.6 to 20) SNVs per division per progenitor, assuming that the length of the cell cycle of cortical progenitors is between 27 and 54 hours, which is based upon studies in primates (17). The large interval of this estimation is due to uncertainties in both the per-day increase in SNVs and the length of the cell cycle. Interindividual variability may further widen the confidence interval of this estimate.

Bae et al., Science 359, 550–555 (2018). 2 February 2018.

1 Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA. 2 Child Study Center, Yale University, New Haven, CT 06520, USA. 3 Departments of Psychiatry and Genetics, Stanford University, Palo Alto, CA 94305, USA. 4 Department of Neuroscience, Yale University, New Haven, CT 06520, USA. 5 New York Genome Center, New York, NY 10013, USA. 6 Yale Kavli Institute for Neuroscience, New Haven, CT 06520, USA.
*Corresponding author. Email: [email protected] (A.A.); [email protected] (F.M.V.)

[Figure 1, panels A to E. Recoverable labels: (A) overlap of calls from clone-to-spleen, clone-to-tissue, and clone-to-clone comparisons; (B) proportion of SNVs at highly confident bases versus other bases, with the genome average; (C) count of genotyped SNVs by variant allele frequency in cortex VZ-SVZ (0.01% to 100%); (D) count of SNVs per clone for S316 (15w), S320 (17w4d), and S275 (21w); (E) proportion of SNVs by substitution type (C>A, C>G, C>T, T>A, T>C, T>G), split into CpG and non-CpG.]

Fig. 1. SNV discovery in brains. (A) Three approaches of discovering mosaic SNVs were contrasted: comparing clones to the VZ-SVZ tissue of origin, comparing clones to the spleen, and comparing clones with each other (see fig. S3). The three approaches give largely concordant calls. The comparison is for calls from the brains of all three subjects. (B) Calls specific to clone-to-original tissue (in blue) and clone-to-spleen (in red) discovery approaches are notably enriched for bases with less confident calling (as defined by the mask of the 1000 Genomes Project). These residual calls were not included in the final call set. Colors as in (A). (C) VAF of genotyped SNVs from deep resequencing in all three brains. The clone-to-clone discovery approach allows for finding high-frequency mosaic SNVs in brain tissue (green line) that are missed from clone-to-tissue or clone-to-spleen comparisons. Colors as in (A). (D) Counts of mosaic SNVs (for subjects 316, 320, and 275) per clone increase linearly with fetal age (w, weeks; d, days). BG, basal ganglia; FR, frontal region; PA, parietal region. (E) Contribution of each substitution type to the mutation spectrum is not different between different fetuses and brain regions. Colors as in (D). Bars in (D) and (E) indicate mean ± SEM.

To genotype the presence of variants across brains, we used the above-referenced capture library to conduct deep sequencing (~1000x coverage) in the source tissue (dorsal and basal germinal layers), in the corresponding outer layers containing mature neurons (frontal, parietal cortex and basal ganglia), in other brain regions (occipital cortex, cerebellum), and in a peripheral tissue, the spleen (table S3). A total of 144 SNVs were reliably genotyped, from 11 to 68 in each tissue, including spleen, with VAFs between 0.3 and 30% (Fig. 2A and fig. S6). High VAFs for seven such SNVs were further cross-confirmed with an orthogonal technique, droplet digital PCR (table S2 and figs. S7 to S17). However, for hundreds of SNVs at much lower VAFs (typically below 1%), the evidence for their presence in each tissue was not significant, likely because of their low VAFs in tissues.

Almost 60% of the genotyped SNVs (1.4% of total SNVs) and 92% of the SNVs with VAF above 2% in at least one brain region had a nonzero VAF in the spleen (Fig. 2, B and C, and figs. S6 and S18). Because the brain is of neuroectodermal origin and the spleen is of mesodermal origin, these shared SNVs likely occurred before gastrulation, when the mesoderm, ectoderm, and endoderm differentiate from a single-layer epithelium. This suggestion is consistent with the range of VAFs of these shared SNVs, as there are about 12 cell divisions before gastrulation (18), which corresponds to expected VAFs from 0.03 to 25% in somatic tissues, depending on how early in embryogenesis variants have occurred. Some SNVs were shared between the spleen and only some of the brain regions (Fig. 2, B and C, and figs. S6 and S18). This could indicate regional sublineages, i.e., that nonmixing sets of progenitors generate neurons in different brain regions and that these distinct populations of progenitors may not share a common ancestor with cells in the spleen. However, conducting more sensitive assays is necessary to exclude the possibility of incomplete genotyping. Another observation pointing to an early origin of the mutations is that the overlap between SNVs in the basal ganglia VZ-SVZ and the cortical VZ-SVZ with their respective differentiated regions was less than the overlap between the four regions together (fig. S18), suggesting that most of the genotyped mutations found in the brain arise at least before the splitting between basal and dorsal regions of the telencephalon, i.e., at around the neural plate stage or earlier.

Genotyped mosaic SNVs clearly cluster by similar VAFs in each brain (Fig. 3 and fig. S19). On the basis of the average values of the frequencies for each cluster and sharing of such SNVs between clones, we concluded that these clusters likely represent variants created during sequential postzygotic cleavages. Assuming equal contribution of dividing cells to tissues, we reconstructed the cell-progeny tree and determined the precise origin of 84 mutations during the first five postzygotic cleavages (Fig. 3B and fig. S19). These mutations typically had VAFs above 1%, whereas the remaining ones, typically with VAFs below 1%, were assigned to later divisions. Only two SNVs had conflicting assignment between clusters and clones, which perhaps could be explained by misclustering or incorrect discovery and/or genotyping in clones. Alternatively, these conflicts, along with the high VAFs for the very first SNV in the tree (Fig. 3A), may indicate an unequal lineage contribution to tissues due to asymmetric division, unequal proliferation, or positive or negative selection (19).

Using the trees, we estimated the average mutation rates per division per daughter cell in the early human embryo as 1.66 ± 0.24, 1.18 ± 0.33, and 1.05 ± 0.22 for brains of subjects 316, 320, and 275, respectively. The weighted average and variance of the three measurements is 1.3 ± 0.15, consistent with the rate of 1.2 estimated from analysis of de novo SNVs in familial trios (20), and lower than the lower-bound estimate for mutability of neuronal progenitors, thereby suggesting that the mutation rate during neurogenesis is higher than that in the early embryo.

We then split the set of mosaic SNVs into early origin (those genotyped in tissues by capture experiments) and likely late origin (those not genotyped). The spectrum of early mosaic mutations (i.e., the frequency of nucleotide substitution in the context of trinucleotide motifs) bears little resemblance to the spectrum of SNVs occurring later in neuronal progenitors, revealing a shift in mutagenesis during development (Fig. 3C). Early mutations had the same 2.2 Ti/Tv ratio as germline variants and had a larger fraction of C:G→T:A transitions overall (P value of 2.2 × 10⁻¹⁶, by Fisher's exact test), particularly in CpG motifs (P value of 4.3 × 10⁻⁵, by Fisher's exact test), implicating the spontaneous deamination of 5-methylcytosine as a contributor to the mutagenic process (16). The signature of earlier mutations was also similar to the signature of de novo mutations in the human population (20). As some of the early mutations can be passed to the next generation through the germline lineage, the convergence of their spectrum with de novo and germline variants is expected. Later mutations, on the other hand, had a larger contribution of C:G→A:T transversions (P value of 8.0 × 10⁻¹², by Fisher's exact test), implicating oxidative damage as a significant contributor to the mutagenic process (16). Furthermore, the mutation spectrum for these transversions was most similar (Pearson's correlation coefficient r = 0.90) to the spectrum observed in colorectal cancer that results from a deficiency in the DNA glycosylase MUTYH, where repair of 8-oxoguanine, the product of oxidative damage, is compromised (21).
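The two rate estimates in this section can be checked with a few lines of arithmetic. The sketch below converts the per-day rate during neurogenesis into a per-division rate for a given cell-cycle length, and combines the three early-embryo rates into a weighted mean. Two assumptions are mine rather than the text's: the ± values are treated as standard errors and combined by inverse-variance weighting, and the midpoint of the 27 to 54 hour range (40.5 hours) is used for the central projection. The function names are hypothetical.

```python
import math


def per_division_rate(rate_per_day: float, cell_cycle_hours: float) -> float:
    """Convert a per-day mutation rate into a per-division rate,
    given the length of one cell cycle in hours."""
    if cell_cycle_hours <= 0:
        raise ValueError("cell_cycle_hours must be positive")
    return rate_per_day * cell_cycle_hours / 24.0


def inverse_variance_mean(values, std_errors):
    """Return (weighted mean, standard error of the mean) using
    weights of 1 / SE^2 for each measurement."""
    if len(values) != len(std_errors) or not values:
        raise ValueError("need equally sized, non-empty inputs")
    weights = [1.0 / (se ** 2) for se in std_errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)


if __name__ == "__main__":
    # Neurogenesis: 5.1 SNVs per day, cell cycle of 27 to 54 hours.
    for hours in (27.0, 40.5, 54.0):
        rate = per_division_rate(5.1, hours)
        print(f"{hours:>5} h cycle: {rate:.2f} SNVs per division")

    # Early embryo: per-subject rates for subjects 316, 320, and 275.
    mean, se = inverse_variance_mean([1.66, 1.18, 1.05], [0.24, 0.33, 0.22])
    print(f"weighted mean: {mean:.2f} +/- {se:.2f}")
```

Under these assumptions the midpoint projection comes out at about 8.6 SNVs per division and the weighted mean at about 1.30 ± 0.15, in line with the reported values. Even the shortest assumed cell cycle gives a per-division rate several times the early-embryo estimate.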

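The assignment of VAF clusters to cleavages can also be illustrated briefly. Assuming every cell contributes equally to tissues, a heterozygous mutation arising in one daughter cell of the k-th postzygotic division is expected at a VAF of 50%/2^k (25%, 12.5%, 6.25%, and so on), which matches the expected values shown in Fig. 3. The nearest-on-a-log-scale rule and the function names below are illustrative assumptions, not the clustering procedure used in the study.

```python
import math


def expected_vaf(division: int) -> float:
    """Expected tissue VAF for a heterozygous mutation that arises in one
    daughter cell at the given postzygotic division (1 = first cleavage)."""
    if division < 1:
        raise ValueError("division must be >= 1")
    return 0.5 / (2 ** division)


def nearest_division(observed_vaf: float, max_division: int = 12) -> int:
    """Assign an observed cluster VAF to the division whose expected VAF
    is closest on a log2 scale (an assumed, simplified rule)."""
    if not 0 < observed_vaf <= 0.5:
        raise ValueError("observed_vaf must be in (0, 0.5]")
    return min(
        range(1, max_division + 1),
        key=lambda k: abs(math.log2(observed_vaf) - math.log2(expected_vaf(k))),
    )


if __name__ == "__main__":
    # Average cluster VAFs reported for subject 316 in Fig. 3.
    for vaf in (0.1134, 0.0685, 0.0344, 0.0148):
        k = nearest_division(vaf)
        print(f"cluster VAF {vaf:.2%} -> D{k} "
              f"(expected {expected_vaf(k):.2%})")
```

Comparing on a log scale is a design choice: expected VAFs halve with each division, so equal ratios rather than equal differences separate neighboring cleavages.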
Properties of mosaic SNVs

For the following analyses, we used all mosaic SNVs in the discovery data set from all clones, totaling 6288, as mutation spectra for the three brains were extremely similar (fig. S20). The distribution of distances between neighboring mosaic SNVs was consistent with uniform random placement across the genome (fig. S21). In line with this, no enrichment of exonic and intronic SNVs in any gene ontology category was observed either when assuming uniform background mutation rate or when using, as a background, mosaic SNVs from liver, intestine, colon, or fibroblasts (1, 2). Roughly 3% of SNVs may have a functional consequence by affecting either protein-coding sequence or gene regulation (fig. S21 and table S4). This projects to about 12 nonbenign SNVs per progenitor cell at 20 postconception weeks. A significant depletion of mosaic SNVs was observed in deoxyribonuclease (DNase)–hypersensitive sites relative to flanking regions (Fig. 4A). The depletion was more pronounced (10 versus 5%) when utilizing DNase-hypersensitive sites for fetal brain rather than for a lymphoblastoid cell line, suggesting a relation between a cell's epigenome and the genesis of mutations. Because no such depletion was observed in coding relative to intronic gene regions, the depletion is not the result of negative selection and, rather, likely reflects better repair efficiency in open DNA regions, as was observed for somatic mutations in cancers (22, 23).

[Figure 2, panels A to D. Recoverable labels: per-tissue panels for BG-VZ-SVZ, FR-VZ-SVZ, BG, FR-CX, PA-CX, occipital cortex, cerebellum, and spleen, showing count of SNVs by variant allele frequency (0.01% to 100%) for ungenotyped, mosaic, and germline SNVs; pairwise VAF comparisons between spleen, FR-CX, occipital cortex, cerebellum, and BG.]

Fig. 2. Genotyping of SNVs in original tissues. (A) Several dozens of mosaic SNVs with VAFs of 0.3 to 30% in tissues from various brain regions and the spleen are genotyped by the capture-resequencing approach (green line). For hundreds more SNVs, the evidence for presence in tissue is indistinguishable from background noise (blue line). BG, basal ganglia; FR and PA, frontal and parietal region of the cerebral cortex, respectively; VZ-SVZ, ventricular germinal layers; CX, outer cortical layer (see fig. S1). (B) Venn diagram of genotyped mosaic SNVs across brain regions and spleen for subject 316. Almost 60% of mosaic SNVs could be genotyped in one or more brain regions and spleen, and 44% could be genotyped in all brain regions and spleen. (C and D) Comparative VAFs for mosaic SNVs across different brain regions and spleen for the same subject. Many SNVs are shared by multiple brain regions and by brain and spleen with similar VAFs (SNVs shared across two tissues are indicated by green, red, and blue circles, whereas SNVs shared across three tissues are indicated by magenta circles). Black and gray circles indicate SNVs genotyped in only one region.

Similar to somatic SNVs in cancers (24) and mosaic SNVs in skin fibroblasts (1), we found that the density of mosaic SNVs in the brain correlates negatively with most histone marks in fetal brain and embryonic stem cells (Fig. 4B). Comparison of our mosaic SNVs with mutational signatures found in cancer (25, 26) revealed that signatures 18 and 8 (found in neuroblastoma and medulloblastoma, respectively), as well as their combination, are the best descriptors for the spectrum of mosaic mutations in the developing brain (Fig. 4, C and D). Mosaic SNVs were equally well described by the combination of signatures 5 with 18 and 1B with 18. Therefore, signature 18, with suspected etiology of oxidative damage (27), consistently contributed to the mutation spectrum of mosaic SNVs in fetal brain progenitors. This signature was mostly similar to late SNVs, whereas signature 1B was mostly similar to the early ones (fig. S22).
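Comparing a mutation spectrum with a reference signature comes down to comparing two vectors of substitution-class proportions. The sketch below computes a Ti/Tv ratio and a Pearson correlation for spectra collapsed to six substitution classes. The counts are made up for illustration (chosen only to give Ti/Tv ratios near the 0.6 and 2.2 quoted in the text), the study itself works with trinucleotide context (96 classes), and the function names are hypothetical.

```python
import math

# Six pyrimidine-centered substitution classes; C>T and T>C are transitions.
SUBSTITUTIONS = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")
TRANSITIONS = ("C>T", "T>C")


def to_proportions(counts):
    """Turn per-class counts into proportions, in SUBSTITUTIONS order."""
    total = sum(counts[s] for s in SUBSTITUTIONS)
    if total == 0:
        raise ValueError("spectrum has no mutations")
    return [counts[s] / total for s in SUBSTITUTIONS]


def ti_tv(counts):
    """Transition-to-transversion ratio from per-class counts."""
    ti = sum(counts[s] for s in TRANSITIONS)
    tv = sum(counts[s] for s in SUBSTITUTIONS if s not in TRANSITIONS)
    if tv == 0:
        raise ValueError("no transversions; ratio undefined")
    return ti / tv


def pearson(x, y):
    """Pearson correlation coefficient of two equally long vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("constant vector; correlation undefined")
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    # Hypothetical counts, NOT data from the study.
    late = {"C>A": 40, "C>G": 10, "C>T": 25, "T>A": 8, "T>C": 12, "T>G": 5}
    early = {"C>A": 8, "C>G": 6, "C>T": 50, "T>A": 5, "T>C": 19, "T>G": 12}
    print("late Ti/Tv:", round(ti_tv(late), 2))
    print("early Ti/Tv:", round(ti_tv(early), 2))
    r = pearson(to_proportions(late), to_proportions(early))
    print("Pearson r (late vs. early):", round(r, 2))
```

Fitting a spectrum as a combination of several signatures, as described above, needs a constrained regression (for example non-negative least squares) rather than a single correlation. That step is omitted here.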

Implications for development and disease

Our study uncovered extensive mosaicism in human fetal brain, with 200 to 400 SNVs present per brain progenitor cell at 15 to 21 weeks of gestation. This amount of mosaicism is likely inherited by cortical postmitotic neurons, as neurogenesis ends at around 20 weeks in humans (28). Indeed, our estimate is in good agreement with the estimate that postmitotic neurons have ~300 to 900 mosaic SNVs within one year of birth (29). There is an order-of-magnitude difference between numbers of mosaic SNVs and de novo single-nucleotide polymorphisms (SNPs) (20), implying a higher effect of mosaic SNVs on normal brain development and disease. Indeed, we estimate that up to 12 nonbenign mutations can be present in neuronal progenitors and consequently transmitted to a sizable fraction of daughter neurons. It is conceivable that, in rare cases, some of these mutations may have a strong deleterious effect, for example, initiating overgrowth (30, 31) or neoplastic transformation by knocking out key genes. Indeed, the resemblance of mosaic SNVs in fetal brain to somatic mutations in brain cancers and, particularly, to medulloblastoma supports the theory that cancer-driving mutations can happen by chance during background mutagenesis (32).

As dozens of discovered mutations happen before gastrulation, our study demonstrates that early postzygotic mutations can be reconstructed from the analysis of a handful of clones and tissues, opening an avenue for charting individualized mosaicism maps. As mosaic variants can contribute to interindividual phenotypic differences and have been implicated in an individual's disease risk, we suggest that knowing the individual "mosaicome" could be as important

[Figure 3, panels A to C. Recoverable labels: (A) VAF (0% to 35%) of genotyped SNVs across spleen, cerebellum, occipital cortex, PA-CX, BG, BG-VZ-SVZ, FR-VZ-SVZ, and FR-CX, with SNV sharing across clones #1, #3, #6, #9b, #11, #18b, #19b, #20b, #22, #24, and #25; clusters assigned to D2 (11.34%), D3 (6.85%), D4 (3.44%), D5 (1.48%), and later divisions, against expected VAFs of 25%, 12.5%, 6.25%, 3.125%, and 1.56%; (B) lineage tree with traceable and untraceable branches; (C) proportion of SNVs by substitution type (C>A, C>G, C>T, T>A, T>C, T>G) for early and late SNVs, CpG versus non-CpG, with spectra of late (n = 6,144), early (n = 144), and de novo (n = 747, ref. 20) mutations.]

Fig. 3. Reconstruction of mosaic SNV mutations during early development of subject 316. (A) Hierarchical clustering of SNVs genotyped in the different brain regions and spleen by their VAFs revealed grouping consistent with SNV sharing between clones (white squares represent zero VAF). Black and gray squares denote, respectively, SNVs discovered in clones and SNVs missed during discovery but genotyped afterwards. For completeness, five SNVs (marked with *) were included in the analysis if present in multiple clones but the corresponding VAF estimation from capture resequencing was not available. Their VAFs were estimated from whole-genome tissue sequencing. On the basis of the corresponding average VAF (shown underneath each cluster), each cluster was assigned to consecutive postzygotic divisions: D1 (no SNVs observed), D2, D3, D4, and D5. The # notation indicates the ID number of each clone. (B) The reconstructed cell-progeny tree during those divisions had only two conflicts of SNV assignment, denoted by “?”, between clusters and clones. “Expected VAF” denotes VAF of mutations arising at each stage, assuming equal contribution of all progenies to tissues. (C) Mutational spectra of likely early mosaic SNVs (darker color shades) and presumably later arising SNVs (lighter color shades) are different. The difference in the spectra is due to the shift in frequency of C:G→T:A transitions, particularly in CpG motifs, and C:G→A:T transversions. The spectrum of early SNVs is much closer to the spectrum for de novo SNVs in the human population (triangle with Pearson’s correlation coefficient r values). Random distribution represents correlation coefficients when randomly, but proportionally, subsampling early and late mutations.
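The "Expected VAF" values in panel (B) follow from repeated halving. As a minimal sketch, assuming a heterozygous mutation at a diploid locus that arises in exactly one cell at postzygotic division d, and equal contribution of all progeny to tissues, the mutated cell fraction is 1/2^d and the VAF is half of that. The function name `expected_vaf` and the pairing of the printed cluster averages with divisions D2 to D5 are illustrative readings of the figure, not the authors' code.

```python
def expected_vaf(division: int) -> float:
    """Expected VAF for a heterozygous mutation arising at a postzygotic division.

    Illustrative assumptions: a diploid locus, one mutated cell among the
    2**division cells present after that division, and equal contribution
    of all progeny to the sampled tissue.
    """
    if division < 1:
        raise ValueError("division must be >= 1")
    mutated_cell_fraction = 1 / 2 ** division
    return mutated_cell_fraction / 2  # only one of the two alleles is mutated


# Average cluster VAFs (percent) as printed in Fig. 3A; D1 had no SNVs.
observed_percent = {2: 11.34, 3: 6.85, 4: 3.44, 5: 1.48}

for d in range(1, 6):
    expected = 100 * expected_vaf(d)
    observed = observed_percent.get(d)
    shown = "no SNVs observed" if observed is None else f"{observed:.2f}%"
    print(f"D{d}: expected {expected:.3f}%  observed {shown}")
```

Under these assumptions the expected values reproduce the 25%, 12.5%, 6.25%, 3.125%, and 1.56% series shown in the figure, and the observed cluster averages for D2 to D5 lie close to them.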


as knowing the individual germline genome, particularly given the much stronger selection acting on germline variants and the lower penetrance of mosaic variants that is likely to be translated in milder phenotypes.

We also discovered a shift in mutagenesis during development that is characterized by an increased mutation rate and a change in frequency of substitution types. We cannot rule out that the increased mutation rate can be partially explained by interindividual variation, although we have no evidence for such variability. The shift occurs sometime between early cleavages and neurogenesis and may be the consequence of physiological, biochemical, and gene-expression changes related to the generation of neurons from neural stem cells. Alternatively, the shift may reflect more general developmental processes common to all tissues during organogenesis, and, on the basis of increased counts of mutations related to oxidative damage, could be coupled to a higher availability of radical oxygen species after development of the cardiovascular system of the embryo. If this is the case, we predict that mutation spectra and rates per division undergo a similar shift during development across all somatic lineages.

Our estimated average mutation rate of 5.1 SNVs per day per neuronal progenitor during neurogenesis implies that neurons generated at early and later stages of neurogenesis will carry different burdens of mosaic variants. This rate is three orders of magnitude higher than 0.4 to 2 mutations per year accumulated in the germline lineage of adults (20, 33, 34). It is also 50 times higher than the rate in postnatal stem cells of the small intestine, colon, and liver, estimated to be 36 mutations per year (2). Therefore, our results show that the prenatal period is intrinsically highly mutagenic, likely the consequence of oxidative damage coupled with more frequent cell divisions.
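The rate comparisons above can be checked with a few lines of arithmetic. This is a minimal sketch that annualizes the 5.1 SNVs per day figure and divides it by the adult germline and postnatal stem cell rates quoted in the text; the helper names and the 365.25-day year are assumptions made for illustration.

```python
DAYS_PER_YEAR = 365.25  # assumption: mean calendar year


def per_year(rate_per_day: float) -> float:
    """Convert a per-day mutation rate to a per-year rate."""
    return rate_per_day * DAYS_PER_YEAR


def fold_change(rate_a: float, rate_b: float) -> float:
    """How many times larger rate_a is than rate_b."""
    return rate_a / rate_b


# Values quoted in the text.
neurogenesis_per_year = per_year(5.1)        # SNVs per year per progenitor
germline_low, germline_high = 0.4, 2.0       # adult germline, per year
postnatal_stem_per_year = 36.0               # intestine, colon, liver stem cells

print(f"neurogenesis: {neurogenesis_per_year:.0f} SNVs per year")
print(f"vs germline: {fold_change(neurogenesis_per_year, germline_high):.0f}x "
      f"to {fold_change(neurogenesis_per_year, germline_low):.0f}x")
print(f"vs postnatal stem cells: "
      f"{fold_change(neurogenesis_per_year, postnatal_stem_per_year):.0f}x")
```

With these inputs the annualized neurogenesis rate is roughly 1900 SNVs per year, which is on the order of a thousandfold above the germline range and about 50 times the postnatal stem cell rate, in line with the statements in the text.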

We found no difference in SNV counts between progenitors from cortex and from basal ganglia, implying that mosaicism accumulates at similar rates across the brain during neurogenesis. With the observed rate of 5.1 SNVs per day per neuronal progenitor, one can project that cells in the forebrain subventricular zone and hippocampal subgranular zone, where neurogenesis and gliogenesis continue for more extended time periods (35–37), would accumulate about 1000 mosaic mutations by the time of birth. This estimate is consistent with the estimates of about 1000 mosaic SNVs present in skin fibroblast cells and in stem cells of the colon and intestine in children (1, 2); indeed, mutation rates in all somatic proliferative cell lineages during prenatal development may be similar.
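One way to see how a figure of about 1000 mutations at birth can arise is a linear projection. The sketch below is not the authors' calculation: it assumes a constant rate of 5.1 SNVs per day, a starting burden in the 200 to 400 SNVs per cell range reported for progenitors at 15 to 21 weeks postconception, and birth at 38 weeks postconception (an assumption). The function name `projected_burden` is hypothetical.

```python
DAYS_PER_WEEK = 7
RATE_PER_DAY = 5.1   # SNVs per day per progenitor, from the text
BIRTH_WEEK = 38      # assumption: weeks postconception at birth


def projected_burden(start_snvs: float, start_week: float,
                     birth_week: float = BIRTH_WEEK,
                     rate_per_day: float = RATE_PER_DAY) -> float:
    """Linearly project the SNV burden of a still-dividing lineage to birth."""
    if start_week > birth_week:
        raise ValueError("start_week must not exceed birth_week")
    remaining_days = (birth_week - start_week) * DAYS_PER_WEEK
    return start_snvs + rate_per_day * remaining_days


# Illustrative scenarios spanning the reported range of burdens and ages.
for start_snvs, start_week in [(200, 21), (300, 20), (400, 15)]:
    total = projected_burden(start_snvs, start_week)
    print(f"{start_snvs} SNVs at week {start_week} -> ~{total:.0f} SNVs at birth")
```

Under these assumptions the projected burden falls between roughly 800 and 1200 SNVs, which brackets the value of about 1000 quoted in the text.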

REFERENCES AND NOTES

1. A. Abyzov et al., Genome Res. 27, 512–523 (2017).
2. F. Blokzijl et al., Nature 538, 260–264 (2016).


[Figure 4 graphic. Panel (A): SNV frequency versus genomic distance from DNase region (Kb, -20 to 20) for fetal brain (P < 0.0001) and NA12878 (P = 0.0183). Panel (B): correlation with SNV density versus epigenetic mark density (histone marks and DNase) in ESC and fetal brain. Panel (C): correlation of mutational signatures to the spectrum. Panel (D): number of pairs versus correlation of pairs of signatures to the spectrum; labeled pairs: 5 and 18 (59.9%, 40.1%), 8 and 18 (73.9%, 26.1%), 1B and 18 (56.1%, 43.9%).]

Fig. 4. Properties of mosaic SNVs in brain. (A) Depletion of mosaic SNVs in DNase-hypersensitive sites, possibly indicating a better efficiency of DNA-repair pathways in those regions (22, 23). Kb, kilobase. (B) Density of mosaic SNVs correlates negatively with histone marks in embryonic stem cells (ESCs) and fetal brain, revealing similarity to somatic SNVs in cancers. (C) Mutational signatures 8 and 18 (orange) found in brain cancers have the highest two correlations with the mutation spectrum of mosaic SNVs. (D) Exhaustive combinations of pairs of signatures consistently show that signatures 1B and 5 also contribute to the description of the mutation spectrum in combination with signature 18. Thus, signature 18 is the best descriptor of mosaic SNVs in developing brain.
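The pairwise comparison in panel (D) can be illustrated with a small search over two-signature mixtures. This is a hedged sketch, not the authors' procedure: it scans mixing weights on a grid and scores each mixture by Pearson correlation with an observed spectrum. The six-class toy vectors, the names `sigA`, `sigB`, `sigC`, and the grid search are all hypothetical stand-ins; published signatures are defined over 96 trinucleotide contexts.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def best_pair(spectrum, signatures, steps=100):
    """Return (r, name1, name2, weight_of_name1) for the best two-signature mix."""
    best = None
    for (name1, sig1), (name2, sig2) in combinations(signatures.items(), 2):
        for i in range(steps + 1):
            w = i / steps
            mix = [w * a + (1 - w) * b for a, b in zip(sig1, sig2)]
            r = pearson(spectrum, mix)
            if best is None or r > best[0]:
                best = (r, name1, name2, w)
    return best


# Hypothetical toy signatures over C>A, C>G, C>T, T>A, T>C, T>G.
toy_signatures = {
    "sigA": [0.10, 0.05, 0.50, 0.05, 0.25, 0.05],
    "sigB": [0.20, 0.20, 0.20, 0.15, 0.15, 0.10],
    "sigC": [0.60, 0.05, 0.15, 0.10, 0.05, 0.05],
}
# Toy "observed" spectrum built as 60% sigA plus 40% sigC.
toy_spectrum = [0.6 * a + 0.4 * c
                for a, c in zip(toy_signatures["sigA"], toy_signatures["sigC"])]

r, first, second, weight = best_pair(toy_spectrum, toy_signatures)
print(f"best pair: {first} ({weight:.0%}) + {second} ({1 - weight:.0%}), r = {r:.3f}")
```

Because Pearson correlation ignores overall scale, a sketch like this only ranks mixtures by shape; the percentages reported in the figure come from the authors' own analysis.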


3. N. Saini et al., PLOS Genet. 12, e1006385 (2016).
4. A. Abyzov et al., Nature 492, 438–442 (2012).
5. M. O’Huallachain, K. J. Karczewski, S. M. Weissman, A. E. Urban, M. P. Snyder, Proc. Natl. Acad. Sci. U.S.A. 109, 18018–18023 (2012).
6. M. J. McConnell et al., Science 342, 632–637 (2013).
7. C. C. Laurie et al., Nat. Genet. 44, 642–650 (2012).
8. K. B. Jacobs et al., Nat. Genet. 44, 651–658 (2012).
9. G. D. Evrony et al., Cell 151, 483–496 (2012).
10. J. A. Erwin et al., Nat. Neurosci. 19, 1583–1591 (2016).
11. T. R. Insel, Mol. Psychiatry 19, 156–158 (2014).
12. M. J. McConnell et al., Science 356, eaal1641 (2017).
13. M. A. Lodato et al., Science 350, 94–98 (2015).
14. C. Chen et al., Science 356, 189–194 (2017).
15. X. Dong et al., Nat. Methods 14, 491–493 (2017).
16. A. Bacolla, D. N. Cooper, K. M. Vasquez, Genes (Basel) 5, 108–146 (2014).
17. D. R. Kornack, P. Rakic, Proc. Natl. Acad. Sci. U.S.A. 95, 1242–1246 (1998).
18. K. L. Moore, T. V. N. Persaud, M. G. Torchia, Before We Are Born (Elsevier Health Sciences, 2015).
19. Y. S. Ju et al., Nature 543, 714–718 (2017).
20. R. Rahbari et al., Nat. Genet. 48, 126–133 (2016).
21. A. Viel et al., EBioMedicine 20, 39–49 (2017).
22. D. Perera et al., Nature 532, 259–263 (2016).
23. R. Sabarinathan, L. Mularoni, J. Deu-Pons, A. Gonzalez-Perez, N. López-Bigas, Nature 532, 264–267 (2016).
24. P. Polak et al., Nature 518, 360–364 (2015).
25. L. B. Alexandrov et al., Nature 500, 415–421 (2013).
26. M. S. Lawrence et al., Nature 499, 214–218 (2013).
27. T. Helleday, S. Eshtad, S. Nik-Zainal, Nat. Rev. Genet. 15, 585–598 (2014).
28. A. A. Pollen et al., Cell 163, 55–67 (2015).
29. M. A. Lodato et al., Science 359, 555–559 (2018).
30. A. Poduri et al., Neuron 74, 41–48 (2012).
31. J. H. Lee et al., Nat. Genet. 44, 941–945 (2012).
32. C. Tomasetti, B. Vogelstein, Science 347, 78–81 (2015).
33. J. J. Michaelson et al., Cell 151, 1431–1442 (2012).
34. H. Jónsson et al., Nature 549, 519–522 (2017).
35. F. H. Gage, J. Neurosci. 22, 612–613 (2002).
36. A. Ernst et al., Cell 156, 1072–1083 (2014).
37. J. T. Gonçalves, S. T. Schafer, F. H. Gage, Cell 167, 897–914 (2016).

ACKNOWLEDGMENTS

This work was supported by the high-performance computing (HPC) facilities operated by the Yale Center for Research Computing and Yale’s W. M. Keck Biotechnology Laboratory, as well as their respective staff. This work is also supported by NIH grants RR19895 and RR029676-01, which helped fund the cluster. The sequencing data from this study have been deposited to the NIH National Institute of Mental Health (NIMH) Data Archive (https://data-archive.nimh.nih.gov) under collection ID #2330 and DOI: 10.15154/1410419. This work was funded by the Mayo Clinic Center For Individualized Medicine and by NIH grants R01 MH100914 (F.M.V.), U01 MH106876 (F.M.V., A.A., A.E.U.), U01 MH106874 (N.S.), P50 MH106934 (N.S.), and R03 CA191421 (A.A.). A.A. is also a Visiting Professor at Yale Child Study Center. The supplementary materials contain additional data.

SUPPLEMENTARY MATERIALS

www.sciencemag.org/content/359/6375/550/suppl/DC1
Materials and Methods
Figs. S1 to S24
Tables S1 to S4
References (38–41)

23 June 2017; accepted 28 November 2017
Published online 7 December 2017
10.1126/science.aan8690


Different mutational rates and mechanisms in human cells at pregastrulation and neurogenesis

Taejeong Bae, Livia Tomasini, Jessica Mariani, Bo Zhou, Tanmoy Roychowdhury, Daniel Franjic, Mihovil Pletikos, Reenal Pattni, Bo-Juen Chen, Elisa Venturini, Bridget Riley-Gillis, Nenad Sestan, Alexander E. Urban, Alexej Abyzov and Flora M. Vaccarino

Science 359 (6375), 550-555. DOI: 10.1126/science.aan8690, originally published online December 7, 2017

Brain mutations, young and old

Most neurons that make up the human brain are postmitotic, living and functioning for a very long time without renewal (see the Perspective by Lee). Bae et al. examined the genomes of single neurons from the prenatal developing human brain. Both the type of mutation and the rates of accumulation changed between gastrulation and neurogenesis. These early mutations could be generating useful neuronal diversity or could predispose individuals to later dysfunction. Lodato et al. also found that neurons take on somatic mutations as they age by sequencing single neurons from subjects aged 4 months to 82 years. Somatic mutations accumulated with increasing age and accumulated faster in individuals affected by inborn errors in DNA repair. Postmitotic mutations might only affect one neuron, but the accumulated divergence of genomes across the brain could affect function.

Science, this issue p. 550, p. 555; see also p. 521


Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.

Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works
