University of Groningen
Genetics of Hirschsprung disease
Schriemer, Duco
IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.
Document Version: Publisher's PDF, also known as Version of record
Publication date: 2016
Citation for published version (APA): Schriemer, D. (2016). Genetics of Hirschsprung disease: Rare variants, in vivo analysis and expression profiling. Rijksuniversiteit Groningen.
Copyright: Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).
Take-down policy: If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.
GENETICS OF HIRSCHSPRUNG DISEASE
Rare variants, in vivo analysis and expression profiling
Duco Schriemer
Genetics of Hirschsprung disease – Rare variants, in vivo analysis and expression profiling
Duco Schriemer
The work presented in this PhD thesis was mostly conducted at the Department of
Neuroscience, section Medical Physiology and the Department of Genetics of the University
of Groningen, University Medical Center Groningen. Part of this work was conducted on
behalf of The International Hirschsprung Disease Consortium.
The research in this thesis was financially supported by grants from ZonMW (TOP-subsidie
40-00812-98-10042) and the Maag Lever Darm Stichting (WO09-62).
Printing: Ipskamp Drukkers
Cover design: Duco Schriemer, image of zebrafish gut by William Cheng
Financial support: Printing of this thesis was financially supported by the
University of Groningen, University Medical Center Groningen
(UMCG) and the Groningen University Institute for Drug
Exploration (GUIDE).
ISBN (printed version): 978-90-367-8900-4
ISBN (electronic version): 978-90-367-8899-1
Copyright © 2016 by Duco Schriemer. All rights reserved. No part of this thesis may be
reproduced, stored in a retrieval system, or transmitted in any form or by any means
without prior permission of the author.
Genetics of Hirschsprung disease
Rare variants, in vivo analysis and expression profiling
PhD thesis
to obtain the degree of doctor at the University of Groningen
on the authority of the Rector Magnificus Prof. dr. E. Sterken
and in accordance with the decision of the College voor Promoties.
The public defence will take place on
Monday 20 June 2016 at 14.30 hours
by
Duco Schriemer
born on 27 March 1987 in Leeuwarden
Supervisor
Prof. dr. R.M.W. Hofstra
Co-supervisor
Dr. B.J.L. Eggen
Assessment committee
Prof. dr. J.D. Laman
Prof. dr. P.K.H. Tam
Prof. dr. ir. J.A. Veltman
Paranymphs
Koen van Zomeren
Hiske van Duinen
TABLE OF CONTENTS
Chapter 1 General introduction
Chapter 2 De novo mutations in Hirschsprung patients link Central Nervous System genes to the development of the Enteric Nervous System
Chapter 3 Regulators of gene expression in Enteric Neural Crest Cells are putative Hirschsprung disease genes
Chapter 4 Removing the maximum assigned value of the Genotype Quality score allows specific detection of de novo mutations in Next-Generation Sequencing data
Chapter 5 Combined strategies to identify diseases associated genes for rare complex diseases: Hirschsprung disease as a model
Chapter 6 Overexpression of the chromosome 21 gene ATP5O results in fewer enteric neurons: the missing link between Down syndrome and Hirschsprung disease?
Chapter 7 General discussion
Chapter 8 Summary
Nederlandse samenvatting (Dutch summary)
List of abbreviations
Dankwoord / Acknowledgements
CHAPTER 1
GENERAL INTRODUCTION
DEVELOPMENT OF THE ENTERIC NERVOUS SYSTEM
The enteric nervous system (ENS) is the complex network of neurons and glial
cells that occurs along the entire length of the gastrointestinal tract and is
responsible for the peristaltic movements of the bowel, as well as regulation of gut
metabolism. Although the ENS is partially controlled by the central nervous system
(CNS), it largely functions autonomously. The neurons and glial cells of the ENS are
organized in interconnected ganglia that are localized in two plexuses in the gut
wall. The myenteric (Auerbach’s) plexus is located between the longitudinal and
circular muscles of the gut wall and innervates these muscular layers. The
submucosal (Meissner’s) plexus is localized within the submucosa and regulates gut
homeostasis [1,2]. The ENS is entirely derived from the neural crest, a cell population
that originates at the border of the non-neural ectoderm and the neural plate.
Upon closure of the neural tube, neural crest cells delaminate and migrate
throughout the body and differentiate into various cell types including
melanocytes, craniofacial cartilage, adrenal medulla, dorsal root ganglia and the
neurons and glia of the ENS [3].
The majority of the ENS is derived from vagal neural crest cells that
originate from the neural tube at the level of somites 1 to 7 [4]. Vagal neural crest
cells invade the foregut at 4 weeks of gestation in human embryonic development [5].
These enteric neural crest cells (ENCCs) proliferate extensively. Vagal neural crest
cells migrate in a rostral to caudal direction to colonize the foregut, midgut and
hindgut, whereas sacral neural crest cells move in the opposite direction.
Colonization of the entire length of the human gastrointestinal tract is completed
by the 7th week of gestation [5].
The development of the ENS has been studied in detail in mice, where neural
crest cells colonize the gut between embryonic day 9.5 (E9.5) and E13.5 [6] (Figure
1). Time-lapse microscopy of fluorescently labeled ENCCs in mice has shown that
ENCCs migrate along the gut in interconnected strands [7]. Individual cells break
from these strands at the wavefront and assemble into new chains of migrating
cells. As the gastrointestinal tract develops, the midgut and hindgut become
juxtaposed in a hairpin loop at mouse embryonic day 11.5. A subpopulation of the
migrating ENCCs uses this spatial organization as a shortcut from the midgut to the
hindgut via the mesentery (Figure 1). These trans-mesenteric ENCCs contribute
extensively to the ENS in the murine hindgut [8]. Although the hairpin loop is also
observed in human embryonic development, it is as yet unclear whether ENCCs cross
the mesentery in humans [8]. Interestingly, advancing murine ENCCs pause their
migration for approximately 12 hours when reaching the caecum, before
continuing colonization of the hindgut and caecum [7]. Whether this pause in
migration occurs in humans is also as yet unknown.
The hindgut is partially colonized by sacral neural crest cells that originate
at the level of somite 24 in mice. Sacral neural crest cells enter the hindgut late
during ENS development and migrate in the opposite direction to the vagal neural
crest cells (Figure 1) [7,9–11]. Studies using quail-chick chimeras showed that the
contribution of the sacral neural crest to the ENS is limited, amounting to at most
17% of neurons in the chick hindgut [9]. Although vagal and sacral neural crest cells
have different migratory properties, they are morphologically indistinguishable
after the gut is completely colonized [11].
Figure 1. Colonization of the developing gastrointestinal tract by ENCCs in the mouse. Vagal
neural crest cells delaminate from the neural tube (E9.5) and migrate in a rostro-caudal direction to
colonize the gut (E10.5-E13.5). Sacral neural crest cells contribute to the colonization of the hindgut in
the opposite direction later during development. Adapted from Heanue & Pachnis (2007) and Sasselli et al.
(2012) [2,16].
In order to form a functional ENS, ENCCs have to differentiate into many
neuronal subtypes and glia. Markers of early neuronal and glial differentiation are
found in ENCCs that are actively migrating and proliferating [12]. Early differentiating
enteric neurons are found just behind the migratory wavefront, while glial
precursors are found further away from the wavefront, indicating that enteric
neuronal differentiation precedes glial differentiation [13]. ENCCs are a
heterogeneous cell population. Terminally differentiated neurons are observed
when ENCCs are still mitotic and migratory. However, these represent different
cell populations, as markers of terminally differentiated neurons are found
exclusively in post-mitotic, non-migratory cells [12,14]. Different neuronal subtypes
arise at different developmental stages. Serotoninergic (5-HT) and cholinergic
(ChAT) neurons arise early, whereas neurons expressing ENK, NPY, VIP and CGRP
appear later, and neurons and glial cells continue to appear postnatally [15].
HIRSCHSPRUNG DISEASE
Clinical presentation
Hirschsprung disease (HSCR) is a congenital disorder that is characterized by the
absence of enteric ganglia in the distal region of the intestine and is thought to
result from incomplete colonization of the gut by ENCCs. The absence of enteric
ganglia (aganglionosis) results in tonic contraction of the muscles in the affected
segment of the bowel, leading to abdominal distension (megacolon) proximal to
the affected segment (Figure 2A-C). HSCR patients commonly present with
vomiting, chronic constipation, failure to pass meconium, and less frequently with
enterocolitis. Aganglionosis generally affects the most distal region of the bowel
and the length of the affected segment varies between patients. Aganglionosis from
the anal sphincter up to the rectosigmoid or sigmoid colon is defined as
short-segment HSCR and comprises 80% of the HSCR cases (Figure 2A). In long-segment
HSCR (15% of the cases) the aganglionosis extends beyond the sigmoid colon and
in total colonic aganglionosis (5% of the cases) there is a lack of enteric ganglia
along the entire length of the colon [17]. In rare cases, termed total intestinal
aganglionosis, even the small intestine, stomach and esophagus can be affected [18].
Diagnosis
The majority of HSCR patients are diagnosed within the first 48 hours after birth.
HSCR is sometimes diagnosed during childhood and on rare occasions during adult
life. The absence of ganglia in a rectal biopsy confirms a HSCR diagnosis [19].
Aganglionosis can also be detected by increased immunoreactivity for
acetylcholinesterase in hypertrophied extrinsic nerve fibers [20]. Additional
diagnostic proof can be obtained using anorectal manometry [21], which measures
the relaxation of the rectum in response to inflation of a rectal balloon. Barium
enema radiography may be used to assess the length of obstructed colon and may
show a dilated proximal segment.
Figure 2. Intestinal aganglionosis and clinical presentation of HSCR. A) Schematic representation
of a normal colon, displaying the different colonic segments, and a HSCR-affected colon. B) X-ray of a
HSCR patient showing tonic contraction of the affected segment and dilatation of the proximal bowel. C)
Abdominal distension in a newborn suffering from HSCR. Adapted from http://www.ShowMe.com and
http://www.artikelkeperawatan.info.
Treatment
Left untreated, the mortality rate of HSCR is ~75% [22]. Treatment consists of
surgical resection of the aganglionic segment and rejoining of the ganglionic gut
with the anal sphincter. This so-called pull-through surgery was first performed by
Dr. Orvar Swenson in 1948 [23] and several modifications to the original procedure
have been described. These are known as the Soave, Duhamel and Boley
procedures. Since the 1990s, open surgical methods have increasingly been replaced by
laparoscopic and transanal endorectal resection and pull-through surgeries [24,25].
These procedures require shorter operating and hospitalization times, and show
improved clinical outcome [26–28]. Depending on the physical condition of the patient,
a colostomy might be created proximal to the aganglionic segment to bypass the
abdominal obstruction. In these cases a second surgical procedure is required to
perform the pull-through surgery. Even though surgical treatment reduces the
mortality rate of HSCR to as little as a few percent [29], long-term follow-up studies
showed that many patients suffer from impaired bowel function, constipation,
fecal soiling, perianal dermatitis, enterocolitis and associated social problems [30,31].
It has been hypothesized that replenishment of the aganglionic gut with
neuronal progenitor cells (NPCs) may provide a future therapy for HSCR. NPCs
have successfully been isolated from human embryonic and postnatal mucosal
gut [32–35]. Moreover, these cells can be expanded in vitro when cultured as
neurosphere-like bodies (NLBs) and can be differentiated into neurons and glia.
Metzger and co-workers have transplanted human NLBs into aganglionic gut
segments derived from HSCR patients. Transplanted NPCs migrated away from the
NLBs to colonize the aganglionic gut and differentiated into functional enteric
neurons [35]. Importantly, these authors and others [33] showed that NLBs can be
generated from HSCR patient biopsies with similar efficiency as from non-HSCR
gut. Although this is an important step towards the clinical use of autologous stem
cells in transplantations, cell numbers are currently too low to replenish
aganglionic gut segments of HSCR patients.
An alternative and unlimited source of neural crest cells for
transplantation to HSCR patients could be provided by human induced pluripotent
stem (iPS) cells. Human iPS cells can be differentiated in vitro into neural crest
cells, can be purified by flow cytometry, and can subsequently be differentiated
into peripheral neurons and glial cells [36]. When transplanted into chick embryos or
mice, human embryonic stem cell-derived neural crest cells are migratory and
differentiate in vivo [37]. In vitro neural crest differentiation has been further
optimized to obtain neural crest cells of vagal origin [38]. When transplanted into a
transgenic mouse model for HSCR, these cells extensively migrate, up to the distal
colon, and completely rescue disease-related mortality [38]. Moreover, pepstatin A
was found to rescue the impaired migratory capacity of neural crest cells in a
genetically engineered HSCR model system, suggesting that pepstatin A could
improve the outcome of therapeutic neural crest cell transplantations in HSCR
patients [38]. Despite their clinical promise, a current limitation of hiPS cell-derived
neural crest cells for transplantations is their safety, as remaining undifferentiated
hiPS cells in the graft can lead to tumor formation in the recipient [39].
Epidemiology
The prevalence of HSCR is around 1:5,000 live births, but varies between ethnic
groups. The disease is 2-fold and 3-fold more common among Asians than among
Caucasians and Hispanics, respectively [40]. In all ethnicities, the prevalence of HSCR
is higher in males than in females, but it is unclear what causes this bias. This
gender bias is most prominent in short-segment HSCR, with a 4:1 male:female
ratio, compared to a 2:1 ratio in long-segment HSCR [41]. Although HSCR mostly
presents sporadically (non-familial), it can also present in a familial form (5-20% of
cases). In general, the risk that a sibling of a HSCR patient also develops the disease
(the recurrence risk) is 4%, but it ranges from 1 to 33% depending on the extent of
aganglionosis and the gender of both the patient and sibling (Figure 3).
Long-segment HSCR gives a high recurrence risk and has a dominant mode of
inheritance with incomplete penetrance. Short-segment HSCR has a much lower
recurrence risk and follows a multifactorial or recessive mode of inheritance [41].
Most HSCR cases present as an isolated trait. However, 30% of the HSCR cases are
associated with congenital syndromes, including Down syndrome (Trisomy 21),
which accounts for 7% of all HSCR cases [42].
GENES ASSOCIATED WITH HIRSCHSPRUNG DISEASE
RET signaling pathway
Linkage analysis on familial HSCR patients found that the major gene for HSCR
maps to chromosome band 10q11.2 [44,45] and RET was identified as the gene
carrying mutations in this locus [46,47] (see Table 1 for a list of all HSCR-associated
genes). Mutations in the coding regions of RET have been found in ~50% of
familial HSCR cases and 15-35% of sporadic cases [48,49]. Even though coding RET
mutations have a dominant mode of inheritance, they show
incomplete penetrance in HSCR (72% in males and 51% in females) [48], suggesting
that other genetic aberrations are needed for disease development. In addition to
HSCR, RET mutations are involved in Multiple Endocrine Neoplasia type 2 (MEN2)
and Familial Medullary Thyroid Carcinoma (FMTC) [50]. RET mutations in MEN2 and
FMTC are restricted to specific domains of RET and are activating, gain of function
mutations. RET mutations in HSCR, on the other hand, are scattered throughout the
coding sequence of RET [48]. A wide variety of coding RET mutations has been
identified in HSCR patients, including large and small deletions, missense,
nonsense, and splice-site mutations. In HSCR, RET mutations generally result in a
loss of RET function and lead to haploinsufficiency [51]. However, activating
mutations that are associated with MEN2 have been reported in HSCR patients as
well [52–54].
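The notion of incomplete penetrance can be made concrete with a small calculation. The Python sketch below takes the penetrance estimates quoted above (72% in males, 51% in females) and computes how many carriers of a coding RET mutation would be expected to develop HSCR and how many would remain unaffected. The cohort size of 100 carriers per sex and the function name are hypothetical and serve only as an illustration.

```python
# Penetrance = probability that a carrier of a mutation develops the disease.
# The two values are the estimates quoted in the text; everything else
# (function name, cohort size) is a hypothetical illustration.
PENETRANCE = {"male": 0.72, "female": 0.51}


def expected_carrier_outcomes(n_carriers: int, penetrance: float) -> tuple[float, float]:
    """Return (expected affected, expected unaffected) among n_carriers."""
    if not 0.0 <= penetrance <= 1.0:
        raise ValueError("penetrance must lie between 0 and 1")
    affected = n_carriers * penetrance
    return affected, n_carriers - affected


if __name__ == "__main__":
    for sex, p in PENETRANCE.items():
        affected, unaffected = expected_carrier_outcomes(100, p)
        print(f"{sex}: ~{affected:.0f} affected, ~{unaffected:.0f} unaffected per 100 carriers")
```

Under these assumptions roughly a quarter of male carriers and half of female carriers would not develop the disease, which is why additional genetic factors are suspected.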
Figure 3. Recurrence risk and incidence of HSCR by phenotype and contributing genetic variant.
HSCR mostly affects a short segment of the bowel, is most common in males and predominantly occurs
sporadically. The recurrence risk is higher for long-segment HSCR, females and familial cases. The more
common forms of HSCR have a large contribution of common genetic variants with low penetrance,
whereas the rare and more heritable forms of HSCR are caused by more penetrant, rare variants.
Adapted from Emison et al. (2010) [43].
Besides rare coding mutations, common non-coding variants in RET
have also been associated with HSCR. Two independent haplotypes in the RET locus
have been identified [55]. The first haplotype spans 27 kb, from 4 kb upstream of
RET to exon 2, and comprises two SNPs in the RET promoter, two SNPs in an
evolutionarily conserved enhancer in intron 1 and a
synonymous SNP in exon 2. The two SNPs in the RET promoter sequence reduce
the binding affinity of NKX2-1 (TTF1) to the RET promoter, but the contribution of
these SNPs to RET promoter activity is limited [56]. Similarly, the synonymous SNP in
exon 2 does not influence RET expression levels [57]. The two SNPs in the conserved
enhancer in intron 1 are in strong linkage disequilibrium. Nevertheless, both SNPs
influence the activity of the enhancer independently [58,59]. Mice expressing the LacZ
reporter gene under control of the enhancer element containing both SNPs show
that the temporospatial expression of LacZ is similar to that of RET, suggesting that
the enhancer element is indeed regulating RET expression [60]. Interestingly, the
penetrance of the intron 1 enhancer SNPs is higher in males than in females.
Overall, these SNPs have a 10 to 20-fold higher contribution to disease risk than
coding mutations [58]. The predisposing haplotype is present in ~60% of Caucasian
patients, compared to 20% of controls. In Asians, the frequency of the predisposing
haplotype is higher: 85% in patients and 40% in controls [61]. The high frequency of
the predisposing haplotype might contribute to the higher incidence of HSCR
among Asians. The second HSCR-associated RET haplotype contains a synonymous
SNP in exon 14 and a SNP in the 3’UTR. The 3’UTR SNP is protective against HSCR
by increasing the stability of RET mRNA [62]. Again, the frequency of the haplotype in
different ethnicities is concordant with the prevalence of HSCR, as the protective
haplotype is present in 4-8% of Caucasians and is virtually absent in the Asian
population [63].
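The patient and control frequencies quoted for the predisposing haplotype can be turned into a rough measure of association. The sketch below, in Python, computes a crude odds ratio from those four percentages. It assumes that the percentages can be read as the fraction of individuals carrying the haplotype, ignores zygosity and sample size, and uses hypothetical function names, so the resulting numbers are illustrative and should not be mistaken for the effect sizes reported in the cited studies.

```python
# Crude odds ratio from the fraction of patients and controls that carry
# the predisposing RET haplotype. Frequencies are the values quoted in the
# text; treating them as simple carrier fractions is an assumption.
HAPLOTYPE_FREQUENCIES = {
    # population: (fraction of patients, fraction of controls)
    "Caucasian": (0.60, 0.20),
    "Asian": (0.85, 0.40),
}


def odds(frequency: float) -> float:
    """Convert a fraction of carriers into odds of carrying the haplotype."""
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must lie strictly between 0 and 1")
    return frequency / (1.0 - frequency)


def odds_ratio(freq_patients: float, freq_controls: float) -> float:
    """Odds of carrying the haplotype in patients relative to controls."""
    return odds(freq_patients) / odds(freq_controls)


if __name__ == "__main__":
    for population, (patients, controls) in HAPLOTYPE_FREQUENCIES.items():
        print(f"{population}: odds ratio ~ {odds_ratio(patients, controls):.1f}")
```

With these inputs the crude odds ratio is about 6 for Caucasians and about 8.5 for Asians, so the haplotype is strongly enriched in patients in both populations even though its background frequency differs.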
Both rare, coding mutations and common, non-coding variants in RET
contribute to the risk of developing HSCR. However, their relative contributions
depend on gender, length of the aganglionic segment and familiality [43]. The
enhancer SNPs are more common in males, short-segment HSCR and sporadic
cases. Rare, coding mutations, on the other hand, are more frequent in females,
long-segment HSCR and familial cases. The presence of rare and common genetic
variants in RET, and possibly also in other genes, therefore contributes to the
gender-, segment length- and familiality-dependent bias in recurrence risk and
incidence of HSCR.
RET signaling in ENS development
Table 1. HSCR-associated genes and loci.
Gene | Locus | Phenotype | Frequency of coding mutations | Inheritance
RET | 10q11.2 | Non-syndromic HSCR / MEN2A | 50% familial, 15–35% sporadic | Dominant, incomplete penetrance
GDNF | 5p13.1 | Non-syndromic HSCR | Rare | Non-Mendelian
GFRA1 | 10q25.3 | Non-syndromic HSCR | 1 case reported | Dominant
NRTN | 19p13.3 | Non-syndromic HSCR | Very rare | Non-Mendelian
PSPN | 19p13.3 | Non-syndromic HSCR | Very rare | Non-Mendelian
EDNRB | 13q22.3 | Non-syndromic HSCR / WS | 3-7% | Dominant (de novo in 80%) / Recessive
EDN3 | 20q13.32 | Non-syndromic HSCR / WS | <5% | Dominant, incomplete penetrance / Recessive
ECE1 | 1p36.12 | HSCR, craniofacial and cardiac defects | 1 case reported | Dominant
NRG1 | 8p12 | Non-syndromic HSCR | 6% | Dominant, incomplete penetrance
NRG3 | 10q23.1 | Non-syndromic HSCR | Rare | Dominant, incomplete penetrance
SEMA3C | 7q21.11 | Non-syndromic HSCR | Rare | Non-Mendelian
SEMA3D | 7q21.11 | Non-syndromic HSCR | Rare | Non-Mendelian
SOX10 | 22q13.1 | Non-syndromic HSCR, WS | >5% | Dominant (de novo in 75%)
PHOX2B | 4p13 | CCHS | <5% | Dominant (de novo in 90%)
ZFHX1B / ZEB2 | 2q22.3 | Mowat-Wilson syndrome | <5% | Dominant (de novo in 100%)
TCF4 | 18q21.2 | Pitt-Hopkins syndrome | 1 case reported | Dominant
NKX2-1 / TTF1 | 14q13.3 | Non-syndromic HSCR | 1 case reported | Dominant
L1CAM | Xq28 | X-linked hydrocephalus and HSCR | Rare | X-linked dominant
DSCAM | 21q22.2 | Non-syndromic HSCR, DS | Association of common variants | Non-Mendelian
KBP / KIAA1279 | 10q22.1 | Goldberg-Shprintzen syndrome | Rare | Recessive
DNMT3B | 20q11.21 | Non-syndromic HSCR | Rare | Dominant, incomplete penetrance
PTCH1 | 9q22.32 | Non-syndromic HSCR | Association of common variants | Non-Mendelian
DLL3 | 6q27 | Non-syndromic HSCR | Association of common variants | Non-Mendelian
IKBKAP | 9q31 | Non-syndromic HSCR, dysautonomia | Association of common variants | Dominant, incomplete penetrance
unknown | 3p21 | Non-syndromic HSCR | - | Non-Mendelian
unknown | 16q23 | WS | - | Non-Mendelian
unknown | 4q31-q32 | Non-syndromic HSCR | - | Dominant, incomplete penetrance
unknown | 19q12 | Non-syndromic HSCR | - | Non-Mendelian
unknown | 21q22 | WS | - | Recessive
HSCR: Hirschsprung disease, MEN2A: Multiple endocrine neoplasia type 2A, WS: Waardenburg-Shah
syndrome, CCHS: Congenital central hypoventilation syndrome, DS: Down syndrome
RET is a receptor tyrosine kinase that is expressed by ENCCs throughout ENS
development. RET can be activated by four different ligands: glial cell line-derived
neurotrophic factor (GDNF), neurturin (NRTN), artemin (ARTN) and
persephin (PSPN). These ligands bind to GDNF family receptor alpha (GFRα) 1-4,
respectively, which act as co-receptors in RET signaling. Ligand binding to RET and
its GFRα co-receptor results in RET dimerization and trans-phosphorylation of
tyrosine residues in the intracellular domain of RET. The phosphorylated tyrosine
residues function as docking sites for a wide range of downstream signaling
cascades, including the ERK, PI3K/AKT, p38-MAPK and JNK pathways, that are
critical for proliferation, migration, differentiation and survival of ENCCs and their
derivatives [64]. Evidence that RET is required for the development of the ENS has
been obtained in Ret knockout mice, in which ENCCs fail to colonize the bowel
beyond the stomach [65]. However, the required RET gene dosage differs between
humans and mice. Only homozygous Ret knockout mice are aganglionic, whereas
heterozygous animals develop normally. In humans, on the other hand, mutations
in RET are heterozygous and lead to a disease phenotype with reduced penetrance.
Like Ret-deficient mice, mice lacking the RET ligand GDNF, or its
co-receptor GFRα1, also display total intestinal aganglionosis [66–69]. GDNF acts as a
chemoattractant to vagal ENCCs as they migrate along the developing gut [70]. In
addition, GDNF is required for the proliferation and survival of ENCCs, induces
neuronal differentiation, and is a chemoattractant for outgrowing neurites [71–75]. In
humans, heterozygous mutations in GDNF have been found sporadically in isolated
HSCR cases [49,76,77].
Of the remaining RET ligands, mutations in NRTN and PSPN have been
identified in HSCR patients, but RET signaling by NRTN and PSPN is less critical for
ENS development than signaling by GDNF [78,79]. NRTN is involved in neuronal
differentiation of ENCCs, as mice lacking the NRTN co-receptor GFRα2 exhibit a
reduced density of cholinergic projections and decreased intestinal motility [80].
Despite the indispensable role for the GFRα co-receptors in RET signaling,
mutations in these genes have not yet been reported in HSCR patients, except for a
GFRα1 deletion in a single patient [81].
Endothelin signaling pathway
In addition to RET, signaling through the endothelin receptor type B (EDNRB) is
also critical for the development of the ENS. EDNRB is a G protein-coupled
receptor that is expressed by ENCCs and activated by its ligands endothelin (EDN)
1-3, which are secreted by the gut mesenchyme. Mutations in the endothelin pathway
account for ~5% of all HSCR cases [55]. Most of these mutations were found in
EDNRB, but mutations in EDN3 have been reported as well [82–87]. In addition, a
single mutation has been found in endothelin converting enzyme 1 (ECE1), an
enzyme that processes endothelins in the gut mesenchyme before secretion into
the lumen of the gut [88]. Mutations in EDNRB were first identified in a Mennonite
population with a high prevalence of HSCR [82]. Both heterozygous and homozygous
EDNRB mutations occurred in this population. Several homozygous patients
presented with additional symptoms, including deafness and hypopigmentation, a
condition known as Shah-Waardenburg syndrome. The presence of heterozygous
EDNRB mutations in isolated HSCR has later been confirmed [83]. Likewise,
homozygous mutations in the ligand EDN3 cause Shah-Waardenburg syndrome,
whereas heterozygous mutations in EDN3 lead to isolated HSCR [86,87].
Disruption of Ednrb, Edn3 or Ece1 in mice leads to aganglionosis and
abnormal pigmentation, resembling the Shah-Waardenburg phenotype [89–91]. In the
mouse gut, EDNRB signaling is required for the migration of ENCCs towards EDN3
between developmental stages E10 and E12.5 [92,93]. Additionally, activation of
EDNRB by EDN3 induces ENCCs to proliferate, maintains their precursor state and
prevents premature differentiation [74,94,95].
Neuregulin signaling
A genome-wide association study (GWAS) performed on HSCR patients revealed a
strong association of SNPs in intron 1 of neuregulin 1 (NRG1) in a Chinese
population [96]. Subsequent fine-mapping of the NRG1 locus showed that the four
most highly associated SNPs map to the NRG1 promoter region and these variants
reduce NRG1 expression levels in gut tissue [97]. The association between common
variants in NRG1 and HSCR has been confirmed in independent Thai and
Indonesian populations, but no significant association between NRG1 and HSCR
was found in patients of European ancestry [98–101]. Sequencing analysis of the NRG1
coding region revealed that, in addition to common variants, rare coding
variants are also found in HSCR patients. These were found in both Chinese and
Spanish patient cohorts, implicating NRG1 in the development of HSCR not only in
Asians, but also in Caucasians [102,103].
Following up on the initial GWAS performed on Chinese HSCR patients,
these data were re-purposed to assess the role of copy number variants (CNVs) in
HSCR. In this study, CNVs in NRG3, a paralog of NRG1, were found to be associated
with HSCR [104]. Additional evidence implicating NRG3 in HSCR was obtained in an
exome sequencing study on a Chinese HSCR family that identified NRG3 as the top
candidate gene in this family [105].
Neuregulins are a family of proteins that are involved in synaptic plasticity
in neurons and differentiation into glial cells [106]. The different neuregulins
(NRG1-4) have many splice variants that are either membrane bound or secreted.
Secreted neuregulins bind to various heterodimers of ErbB receptors 1-4.
The importance of Nrg-ErbB signaling in the murine ENS has been shown by
conditional knockout of ErbB2 in Nestin-expressing cells. These mice displayed a
loss of enteric neurons and glia and distension of the colon, reminiscent of HSCR [107].
Semaphorin class 3 signaling
A GWAS on HSCR patients of European ancestry detected an association with
the semaphorin class 3 locus, which contains seven SEMA3 genes (SEMA3A through
SEMA3G) [101]. These authors found that SEMA3A, SEMA3C and SEMA3D were
expressed in the mouse ENS, and therefore performed targeted sequencing of
these genes in HSCR cases. Damaging variants were found in SEMA3A, SEMA3C and
SEMA3D, and knockdown of sema3c and sema3d in zebrafish embryos resulted in a
HSCR-like phenotype. The strongest association within the semaphorin class 3
locus maps between SEMA3A and SEMA3D. This inspired a mutational screening
of SEMA3A and SEMA3D that identified several coding variants in these two
genes [108].
HSCR-associated transcription factors
Mutations in transcription factors have mostly been implicated in the genetic
etiology of syndromic forms of HSCR, such as Shah-Waardenburg syndrome,
congenital central hypoventilation syndrome (CCHS), Mowat-Wilson syndrome
and Pitt-Hopkins syndrome.
Heterozygous mutations in SOX10 cause Shah-Waardenburg
syndrome [109,110]. This contrasts with the endothelin pathway, in which
homozygous mutations cause Shah-Waardenburg syndrome. SOX10 mutations were
originally found only in syndromic HSCR; however, a truncating mutation has been
reported in a single non-syndromic HSCR patient [111]. SOX10 is expressed by ENCCs as
they invade the foregut and is critical for maintenance of their progenitor state [112].
SOX10 mutations are also found in another neurocristopathy, namely Kallmann
syndrome [113].
CCHS is a rare respiratory disorder that is associated with HSCR and is
primarily caused by mutations in PHOX2B [114]. CCHS results from loss of autonomic
neurons that fail to develop in the absence of PHOX2B [115].
Mowat-Wilson syndrome, which is characterized by mental retardation,
facial abnormalities, epilepsy and HSCR, is caused by mutations in ZFHX1B (official
name ZEB2, sometimes referred to as SIP1) [116,117]. ZFHX1B is required for the
formation of vagal neural crest cells [118].
Pitt-Hopkins syndrome is a rare genetic disorder that is characterized by
mental retardation, wide mouth and distinctive facial features, and intermittent
hyperventilation followed by apnea. Pitt-Hopkins syndrome is caused by
mutations in TCF4 [119,120]. A single patient has been reported who, in addition to the
characteristic features of Pitt-Hopkins syndrome, presented with HSCR [121].
As mentioned earlier, two SNPs in the promoter region of RET overlap
with a binding site of the transcription factor NKX2-1. Sequence analysis of the
coding regions of NKX2-1 identified one heterozygous mutation in the DNA binding
domain of the gene [56]. Functional analysis showed that the mutant NKX2-1 caused a
reduction in RET promoter activity, but only in combination with the predisposing
SNPs in the RET promoter. In contrast to the other transcription factors involved in
HSCR (SOX10, PHOX2B, ZFHX1B and TCF4), which are associated with syndromic
forms of the disease, the NKX2-1 mutation was found in a patient with
non-syndromic HSCR. However, NKX2-1 mutations are also associated with congenital
hypothyroidism, a condition that can present in syndromic HSCR [122,123].
Other genes in HSCR
Mutations in cell adhesion molecule L1CAM have been reported in patients
suffering from the rare comorbidity of X-linked hydrocephalus and HSCR124–127.
The phenotypic spectrum of L1CAM mutations is extensive, however, and one case
was reported of two siblings who presented with HSCR and a hypoplastic corpus
callosum, but without hydrocephalus128. Another cell adhesion molecule, DSCAM,
was found to be associated with HSCR through a chromosome 21 scan in Down
syndrome patients with HSCR129. Although the gene was identified in patients with
Down syndrome and HSCR, a replication study on non-syndromic HSCR patients
confirmed the association between DSCAM and HSCR129.
Homozygosity mapping in a family presenting with Goldberg-Shprintzen
syndrome identified homozygous nonsense mutations in KBP (KIAA1279, official
name KIF1BP) as a cause for this syndromic form of HSCR130. Recently, the role of
KBP in Goldberg-Shprintzen syndrome was confirmed by two reports that found
homozygous nonsense mutations and deletions in KBP, respectively131,132. KBP
interacts with microtubule-associated proteins and is required for neuronal
differentiation and neurite outgrowth133.
Due to the multipotent character of ENCCs, a study was performed to
determine the expression levels of stem cell-expressed genes in ENCCs from gut
biopsies. The authors found a low expression level of DNMT3B in ENCCs isolated
from HSCR patients compared to control individuals. Subsequent genetic analysis
of a large patient cohort found three potentially damaging variants in DNMT3B134.
DNMT3B regulates the expression of neural crest specifier genes and loss of
DNMT3B prolongs the duration of neural crest production and emigration from
the neural tube135,136.
Through pathway-based epistasis analysis of GWAS data on Chinese
patients, PTCH1 and DLL3 were identified as susceptibility genes for HSCR137. The
association between PTCH1 and HSCR was later confirmed in an independent
patient cohort138. PTCH1 acts as a receptor in hedgehog signaling and DLL3
encodes a ligand for Notch. Both hedgehog and Notch signaling are involved in
gliogenesis, underlining the critical balance between neuronal and glial
differentiation in ENS development.
HSCR susceptibility regions
Linkage analysis on HSCR families has been a widely adopted methodology to
identify HSCR-associated loci and led to the identification of 10q11, the locus that
contains RET44,45. A study by Bolk et al. showed that in fact the majority of HSCR
families have linkage to the RET locus, although only half of these families carry
coding RET variants139. Interestingly, this study identified 9q31 as a second
susceptibility region exclusively in patients without coding RET variants. Fine
mapping of the 9q31 locus in Dutch families detected association to SVEP1 in the
absence of RET variants, but could not be replicated in Chinese samples. The
Chinese population did show association to IKBKAP. However, in contrast to the
linkage analysis by Bolk et al., this association was most significant in patients
carrying coding RET variants140. Ikbkap is important for ENS development in
zebrafish, as knockdown of the gene causes aganglionosis141. Several other loci
have been found by linkage analysis and for some of these loci, no causal gene has
yet been identified. In the Mennonite population in which EDNRB was discovered,
linkage was found not only to the EDNRB locus; the same study also showed that the
21q22 locus is a risk modifier of HSCR142. Other studies reported linkage to 3p21
and 19q12 in addition to the RET locus143 and linkage to 16q23 in addition to the
RET and EDNRB loci144. Finally, the 4q31-q32 locus was identified in a HSCR family
that showed no linkage to any of the known HSCR genes, yet the 4q31-q32 locus
was not sufficient to cause the disease145.
Interactions between HSCR pathways
The genetics of HSCR are complex and genetic variants in the above-mentioned
genes contribute to HSCR, but are not sufficient to cause the disease. It is a
combination of predisposing factors that leads to the development of HSCR. For
example, a HSCR patient was reported to have mutations in both RET and EDNRB,
each inherited from a healthy parent146. Furthermore, the joint transmission of
RET and EDNRB haplotypes was demonstrated in a Mennonite population82,144.
These genetic interactions were corroborated by the functional interaction
between Ret and Ednrb in mouse models of aganglionosis144,147,148. Association
studies of RET and NRG1 in HSCR have shown that the combination of SNPs in
these genes gives a higher disease risk than SNPs in RET or NRG1 alone98,99,149. The
balance between RET and NRG signaling is important for ENS development, as it
was shown that NRG1 inhibits GDNF-induced neuronal differentiation of ENCCs,
while GDNF downregulates the expression of the NRG1 receptor ErbB2149.
Epistasis between RET and SEMA3 was demonstrated and the risk of developing
HSCR correlates with the number of risk alleles in these genes100. Indeed,
knockdown of sema3c in zebrafish increases the extent of aganglionosis in
conjunction with knockdown of ret101. In several congenital syndromes that are
associated with HSCR, the predisposing RET allele in intron 1 is overrepresented in
patients that present with HSCR, suggesting that RET is a modifier in syndromic
HSCR150.
ENS DEVELOPMENT GENES NOT ASSOCIATED WITH HSCR
As described here, a large number of genes have been associated with HSCR and
these genes have functional roles in the development of the ENS. In addition to
these HSCR-associated genes, many other genes play a role during ENS
development and are therefore candidate HSCR genes. For example, loss of Indian
hedgehog (Ihh) causes segmental aganglionosis in mice and loss of Sonic hedgehog
(Shh) results in abnormal innervations in the gut151. Total colonic aganglionosis is
observed in homozygous Pax3 mutant mice, or in Pax3/Tcof1 double
heterozygotes152,153. Other mouse models of aganglionosis include Sall4 and β1-integrin
knockout mice, in which ENCCs fail to colonize the distal bowel, and Mash1 knockout
mice, which display aganglionosis of the esophagus154–156. Furthermore, mice lacking Dcc
fail to develop submucosal ganglia and loss of TrkC, its ligand Nt3 or Hlx1 results in
reduced numbers of enteric ganglia157,158.
A forward genetic screen for loci affecting ENS development found
partially penetrant aganglionosis in the ‘TashT’ mouse line159. Interestingly, the
penetrance of aganglionosis was higher in males than in females, strongly
resembling the gender-bias that is observed in HSCR. The inserted transgene was
mapped to a gene desert on chromosome 10 that physically interacts with the
nearby gene Fam162b. Many X-linked genes were found to be downregulated in
ENCCs from ‘TashT’ mouse embryos, and known HSCR genes were differentially
expressed.
It has been suggested that the sexual dimorphism in HSCR may result from
expression of SRY, which is encoded on the male Y-chromosome. The transcription
factor SOX10 positively regulates RET expression by binding to its promoter
region. In contrast, SRY represses RET transcription and can competitively replace
SOX10 on the RET promoter160. Since SRY is expressed exclusively in males, the
gene could contribute to HSCR in males by reducing RET expression.
The Notch, WNT, TGF-β and FGF signaling pathways are involved in the
specification of numerous lineages during embryonic development, including the
development of the ENS. Notch is required for maintaining the progenitor state of
ENCC by regulating SOX10 expression and conditional inactivation of Notch
signaling in neural crest cells results in premature neurogenesis161. Wnt signaling
is involved in multiple aspects of ENS development. Non-canonical signaling by
Wnt11 regulates the specification of the neural crest, proliferation of ENCC
depends on Wnt1 and Wnt3a, and Wnt5 is required for axon guidance162–164. BMP
proteins are part of the TGF-β superfamily and have been implicated in the
development of the ENS. BMP2 and 4 are required for the migration of ENCCs
along the gut and loss of BMP signaling causes hypoganglionosis165,166. Moreover,
ENCCs fail to organize into ganglia in the absence of BMPs. FGF signaling is
required for specification of the neural crest and FGF2 regulates proliferation of
neural crest cells167,168.
COMMON, RARE AND DE NOVO VARIANTS IN HSCR
As summarized in Table 1, a large number of genes have been associated with HSCR.
Most of these genes have been identified through linkage analysis in HSCR families,
the analysis of syndromic HSCR cases and by genetic studies in genetically isolated
populations. These cases show a Mendelian mode of inheritance, either recessive
or dominant with reduced penetrance. However, the majority of HSCR cases are
sporadic and non-syndromic, and show a non-Mendelian mode of inheritance41.
Consequently, only ~25% of the heritability in HSCR can currently be explained.
Part of the missing heritability is likely due to genetic variation in yet unidentified
HSCR genes. The introduction of GWAS has allowed the identification of
predisposing variants in sporadic HSCR cases on a genome-wide scale and has
implicated the neuregulin and semaphorin class 3 pathways in the etiology of
HSCR96,101. In addition to the common variants associated with HSCR, targeted
sequencing has revealed that rare variants contribute to sporadic HSCR as well.
Next-generation sequencing (NGS) has revolutionized the genome-wide
analysis of rare variants. However, it remains challenging to assess the
contribution of rare variants to sporadic, non-Mendelian disease. Although rare
variants with a relatively large effect size may be identified through exome
sequencing, their allele frequency is usually too low for statistically significant
associations. Moreover, the large number of genotyped base pairs in exome
sequencing requires a large multiple-testing correction, further reducing statistical
power. Thus far, exome sequencing studies in HSCR have focused on known and
candidate disease genes or have been performed on familial HSCR cases105,169,170.
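
The effect of multiple-testing correction on statistical power can be illustrated with a minimal sketch. It assumes a Bonferroni correction, which is only one of several possible correction methods, and the numbers of tests used here are hypothetical values chosen for illustration; they are not taken from the studies cited above.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical numbers of tests, for illustration only.
single_test = bonferroni_threshold(0.05, 1)
gene_level = bonferroni_threshold(0.05, 20_000)        # assumed: one test per gene
variant_level = bonferroni_threshold(0.05, 1_000_000)  # assumed: one test per variant

for label, threshold in [
    ("single test", single_test),
    ("gene level", gene_level),
    ("variant level", variant_level),
]:
    print(f"{label}: p < {threshold:.1e}")
```

The more positions or genes are tested, the smaller the p-value a single rare variant has to reach, which is difficult to achieve when only a handful of carriers exist.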
When it comes to discovering pathogenic rare variants in sporadic disease,
exome sequencing has been most successful for de novo mutations. De novo
mutations are newly occurring genetic variants that are caused by errors in DNA
replication or environmental factors. When de novo mutations occur in the
germline they can be passed on to the next generation. Consequently, germline de
novo mutations can be found in the somatic cells of a child, but not in somatic cells
of its parents. De novo mutations are a major cause of syndromic HSCR,
particularly in the HSCR-associated syndromes Waardenburg-Shah (de novo
mutations in SOX10), congenital central hypoventilation syndrome (PHOX2B) and
Mowat-Wilson syndrome (ZFHX1B)111,114,117. In sporadic HSCR, mutations in RET
and EDNRB have been reported to occur de novo, mainly in patients with long-
segment HSCR48,49,171–173. Given its high heritability, it is conceivable that long-
segment HSCR can be caused by de novo mutations in other, not previously
associated genes as well. Genome-wide screens for de novo mutations in
neurodevelopmental disorders found that de novo mutations contribute to the
genetic etiology of these disorders and have led to the identification of new disease
genes174. A genome-wide screen for de novo mutations in sporadic HSCR cases may
therefore identify new genes for HSCR and help our understanding of HSCR
genetics, which could have important implications for genetic counseling.
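
The definition of a germline de novo mutation given above (present in the child, absent from both parents) can be expressed as a simple check on the genotypes of a parent-offspring trio. The sketch below is illustrative only: the function name and the genotype encoding ('0/0', '0/1', '1/1', with '1' denoting the variant allele) are assumptions, and the check ignores sequencing artefacts, which in real data produce many false positives.

```python
def is_candidate_de_novo(child: str, father: str, mother: str) -> bool:
    """Return True if the child carries a variant allele that neither parent carries.

    Genotypes use an assumed encoding: '0/0' (no variant allele),
    '0/1' (heterozygous) or '1/1' (homozygous for the variant allele).
    """
    child_has_variant = "1" in child.split("/")
    parents_lack_variant = all("1" not in gt.split("/") for gt in (father, mother))
    return child_has_variant and parents_lack_variant


# Hypothetical trio genotypes at three positions: (child, father, mother).
trio_genotypes = [
    ("0/1", "0/0", "0/0"),  # variant only in the child: candidate de novo
    ("0/1", "0/1", "0/0"),  # variant inherited from the father
    ("0/0", "0/0", "0/0"),  # no variant at this position
]
candidates = [is_candidate_de_novo(*genotypes) for genotypes in trio_genotypes]
print(candidates)
```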
DE NOVO MUTATIONS
Patterns and rates of de novo mutations in the human genome
The rate at which de novo mutations occur in the human genome is 1.20 × 10⁻⁸ per
nucleotide per generation for single nucleotide variants (SNVs)175,176. This comes
down to an average of 63.2 de novo SNVs in the human genome175. Measurements
of de novo SNV rates in the exome (the coding regions of the genome) are slightly
higher, with observed rates of 1.31–2.17 × 10⁻⁸ (176). This difference may be partly
explained by the high GC content of the exome and by transcription-induced
mutagenesis177. The de novo mutation rates of small insertions and deletions and
large copy number variations (CNVs) are orders of magnitude lower178,179.
However, due to their large size, the total number of base pairs affected by de novo
CNVs exceeds the number of de novo SNVs. De novo mutations are not randomly
distributed over paternally and maternally inherited chromosomes. In fact, 75% of
de novo SNVs are located on a chromosome that is inherited from the
father175,176,180. Moreover, the number of germline de novo mutations increases
with the father’s age. For every added year to the father’s age at time of conception,
the child will have on average 2 additional de novo mutations175,180. The age of the mother at
time of conception does not affect the number of de novo mutations in the
offspring. The high number of cell divisions during spermatogenesis, which
continues to increase with age, can explain both the paternal bias and the paternal age effect.
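
The figures in this paragraph can be related to each other with some simple arithmetic. The sketch below uses only the numbers quoted above; the number of base pairs is a back-calculation from the reported rate and count, not a value taken from the cited studies, and the linear paternal age effect is a simplification. All variable and function names are illustrative.

```python
RATE_PER_BP = 1.20e-8          # de novo SNV rate per nucleotide per generation
SNVS_PER_GENOME = 63.2         # average number of de novo SNVs per genome
PATERNAL_FRACTION = 0.75       # fraction of de novo SNVs on the paternal chromosome
EXTRA_PER_PATERNAL_YEAR = 2.0  # additional de novo mutations per year of paternal age

# Number of base pairs implied by the reported rate and count (back-calculated).
implied_diploid_bp = SNVS_PER_GENOME / RATE_PER_BP
implied_haploid_bp = implied_diploid_bp / 2

# Expected number of de novo SNVs of paternal origin.
paternal_snvs = PATERNAL_FRACTION * SNVS_PER_GENOME


def extra_mutations(age_difference_years: float) -> float:
    """Additional de novo mutations expected for an older father (linear assumption)."""
    return EXTRA_PER_PATERNAL_YEAR * age_difference_years


print(f"implied haploid genome size: {implied_haploid_bp / 1e9:.2f} Gb")
print(f"paternal de novo SNVs: {paternal_snvs:.1f}")
print(f"extra mutations for a father 10 years older: {extra_mutations(10):.0f}")
```

Under these assumptions, roughly 47 of the 63.2 de novo SNVs are of paternal origin, and a ten-year difference in paternal age corresponds to about 20 additional de novo mutations.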
The Genome of the Netherlands (GoNL) consortium has analyzed the
patterns of de novo mutations in 250 Dutch parent-offspring trios in great detail181.
The GoNL study found that the genomic position of de novo mutations is not
random and depends on the father’s age. De novo mutations in offspring from
young fathers reside in late-replicating genomic regions. With increasing paternal
age, de novo mutations are more likely to be found in early-replicating loci181. In
addition, de novo mutations in offspring from older fathers are more often located
in the proximity of genes, suggesting that older fathers not only transmit a higher
number of de novo mutations, but their de novo mutations are also more likely to
have a functional impact181. In line with other reports, the GoNL study found
clusters of 2-3 de novo mutations within 20 kb182,183. Interestingly, the clustered de
novo mutations contained few C>T substitutions in CpG islands and many C>G
substitutions compared to non-clustered de novo mutations181. This finding
suggests that the clustered de novo mutations arise by a different mechanism than
non-clustered de novo mutations.
De novo mutations in human disease
The introduction of new genetic variation by de novo mutations is a requirement
for evolution. However, only a fraction of all de novo mutations will be
advantageous for an individual. In fact, de novo mutations are an important cause
of human diseases174. Somatic de novo mutations are involved in cancer and many
genetic diseases are caused by germline de novo mutations. Rare developmental
disorders are generally caused by de novo mutations in a single gene. These include
CHARGE syndrome (mutations in CHD7), Schinzel-Giedion syndrome (SETBP1),
Kabuki syndrome (MLL2), Bohring-Opitz syndrome (ASXL1) and KBG syndrome
(ANKRD11)184–188. Besides rare developmental disorders, de novo mutations can
also contribute to more common genetic diseases. De novo CNVs have been
identified in a wide range of neuropsychiatric disorders and the frequency of CNVs
in these patients is far higher than in the general population189. Targeted
sequencing of known disease genes for autism spectrum disorders (ASD),
schizophrenia and intellectual disability (ID) showed that de novo SNVs in multiple,
different genes play a role in these diseases190,191.
The introduction of NGS has allowed the genome-wide study of de novo
SNVs. In practice, researchers often limit their study to the exome, because exome
sequencing is less expensive than whole genome sequencing and non-coding
variants remain difficult to annotate and interpret. Four large exome sequencing
studies on ASD found de novo SNVs in many interconnected genes that act together
in neuronal development192–195. The fraction of de novo mutations that was
deleterious was higher in ASD patients than in their unaffected siblings192,195. An
excess of nonsynonymous de novo SNVs was also reported in schizophrenia196.
Another exome sequencing study on schizophrenia reported an increased de novo
mutation rate in patients197. However, it must be noted here that the de novo
mutation rate in schizophrenic patients was not compared to unaffected controls,
but to publicly available whole-genome sequencing data. Since mutation rates
from exome sequencing data are higher than those from whole-genome
sequencing data176, it is unclear whether the increased de novo mutation rate is
attributable to the phenotype or the sequencing technology. A whole genome
sequencing study on patients with ID found de novo SNVs and CNVs in known ID
genes in 20 out of 50 patients198. Combined with other technologies, these authors
reported de novo events in ID genes in 60% of their patient cohort. The largest de
novo mutation screen by exome sequencing was performed on 1,213 patients with
congenital heart disease (CHD)199. The de novo SNV rate in these patients was not
different from controls, but patients carried significantly more damaging de novo
mutations. The enrichment for damaging mutations was even more pronounced in
patients that were diagnosed with neurodevelopmental disability or other
congenital anomalies in addition to CHD. In line with the phenotypes, heart- and
brain-expressed genes carried an excess of damaging de novo mutations.
In general, patients with ASD, schizophrenia or CHD with neuro-
developmental involvement are enriched for deleterious de novo mutations, but do
not show an increase in the total number of de novo mutations192,195,196,199.
Nevertheless, offspring from older fathers are at increased risk of developing
psychiatric disorders, including autism, attention-deficit/hyperactivity disorder
(ADHD) and bipolar disorder200. This finding suggests that psychiatric patients
carry increased numbers of de novo mutations, a hypothesis that thus far has not
been confirmed by exome sequencing studies.
Owing to its time- and cost-efficiency, NGS has made its way into clinical
genetic diagnostics. A report has been published on 250 patients with suspected
genetic disorders with Mendelian inheritance, who had been subjected to exome
sequencing. A molecular diagnosis could be made for 25% of the cases. This study
found that 83% of autosomal dominant mutations and 40% of X-linked mutations
were de novo, emphasizing the contribution of de novo mutations to genetic
disorders201.
AIMS AND OUTLINE OF THIS THESIS
The aim of the work described in this thesis is to assess the contribution of rare
variants to the development of HSCR. The advent of NGS has allowed the genome-
wide identification of rare variants, including de novo mutations. The role of de
novo mutations in long-segment HSCR is described in chapter 2. Since genes
carrying de novo mutations were not linked to ENS development based on
bioinformatics prediction, we tested the functional contribution of these genes to
ENS development in a zebrafish model.
The interpretation and selection of candidate genes in HSCR relies on our
understanding of the biological processes underlying ENS development. To better
understand these processes, we analyzed the gene expression profiles of enteric
neural crest cells and the long-term effect of RET signaling in these cells. As
described in chapter 3, predicted regulatory genes in ENS development included
known HSCR genes and provided novel candidate genes for HSCR.
Although NGS offers new possibilities in the genetic dissection of HSCR
and other genetic disorders and traits, the application of the technology remains
challenging. The identification of de novo mutations in exome sequencing data is
best described as looking for a needle in a haystack, since the number of technical
artefacts in NGS data is orders of magnitude larger than the true number of de novo
mutations. Chapter 4 describes how sequencing quality parameters can be applied
to discriminate de novo mutations from false positives and allow their efficient
detection.
In addition to de novo mutations, the exome contains many rare, inherited
variants. Association studies for rare variants generally suffer from a lack of
statistical power. In chapter 5 we describe how a combination of strategies can be
applied to approach this problem and to extract useful data from such studies.
The incidence of HSCR is over a hundred times higher in Down syndrome
patients than in the general population, suggesting that the trisomy of one or more
genes on chromosome 21 predisposes to HSCR. We overexpressed genes from
chromosome 21 in a transgenic zebrafish model to identify chromosome 21 genes
linking HSCR to Down syndrome, as described in chapter 6.
Finally, chapter 7 summarizes and discusses the work presented in this
thesis. Moreover, future perspectives for the genetic dissection of HSCR are given.
REFERENCES
1. Gershon, M. D. The enteric nervous system: a second brain. Hosp. Pract. (1995) 34, 31–32, 35–38, 41–42 passim (1999).
2. Heanue, T. & Pachnis, V. Enteric nervous system development and Hirschsprung’s disease: advances in genetic and stem cell studies. Nat. Rev. Neurosci. 8, 466–79 (2007).
3. Le Douarin, N. M. & Kalcheim, C. The Neural Crest 2nd edition. (1999).
4. Yntema, C. L. & Hammond, W. S. The origin of intrinsic ganglia of trunk viscera from vagal neural crest in the chick embryo. J. Comp. Neurol. 101, 515–41 (1954).
5. Wallace, A. S. & Burns, A. J. Development of the enteric nervous system, smooth muscle and interstitial cells of Cajal in the human gastrointestinal tract. Cell Tissue Res. 319, 367–82 (2005).
6. Kapur, R. P., Yost, C. & Palmiter, R. D. A transgenic model for studying development of the enteric nervous system in normal and aganglionic mice. Development 116, 167–75 (1992).
7. Druckenbrod, N. R. & Epstein, M. L. The pattern of neural crest advance in the cecum and colon. Dev. Biol. 287, 125–33 (2005).
8. Nishiyama, C. et al. Trans-mesenteric neural crest cells are the principal source of the colonic enteric nervous system. Nat. Neurosci. 15, 1211–8 (2012).
9. Burns, A. J. & Douarin, N. M. The sacral neural crest contributes neurons and glia to the post-umbilical gut: spatiotemporal analysis of the development of the enteric nervous system. Development 125, 4335–47 (1998).
10. Kapur, R. P. Colonization of the murine hindgut by sacral crest-derived neural precursors: experimental support for an evolutionarily conserved model. Dev. Biol. 227, 146–55 (2000).
11. Wang, X., Chan, A. K. K., Sham, M. H., Burns, A. J. & Chan, W. Y. Analysis of the sacral neural crest cell contribution to the hindgut enteric nervous system in the mouse embryo. Gastroenterology 141, 992–1002.e6 (2011).
12. Young, H. M., Turner, K. N. & Bergner, A. J. The location and phenotype of proliferating neural-crest-derived cells in the developing mouse gut. Cell Tissue Res. 320, 1–9 (2005).
13. Young, H. M., Bergner, A. J. & Muller, T. Acquisition of neuronal and glial markers by neural crest-derived cells in the mouse intestine. J. Comp. Neurol. 456, 1–11 (2003).
14. Young, H. M. et al. Dynamics of neural crest-derived cell migration in the embryonic mouse gut. Dev. Biol. 270, 455–473 (2004).
15. Pham, T. D., Gershon, M. D. & Rothman, T. P. Time of origin of neurons in the murine enteric nervous system: sequence in relation to phenotype. J. Comp. Neurol. 314, 789–798 (1991).
16. Sasselli, V., Pachnis, V. & Burns, A. J. The enteric nervous system. Dev. Biol. 366, 64–73 (2012).
17. Garver, K. L., Law, J. C. & Garver, B. Hirschsprung disease: a genetic study. Clin. Genet. 28, 503–508 (1985).
18. Ruttenstock, E. & Puri, P. A meta-analysis of clinical outcome in patients with total intestinal aganglionosis. Pediatr. Surg. Int. 25, 833–9 (2009).
19. Whitehouse, F. R. & Kernohan, J. W. Myenteric Plexus in Congenital Megacolon. Arch. Intern. Med. 82, 75–111 (1948).
20. Herzog, B., Moser, F., Schärli, A. & Morger, R. Acetylcholinesterase Activity in Suction Biopsies of the Rectum in the Diagnosis of Hirschsprung’s Disease. J. Pediatr. Surg. 7, 11–7 (1972).
21. Emir, H. et al. Anorectal manometry during the neonatal period: its specificity in the diagnosis of Hirschsprung’s disease. Eur. J. Pediatr. Surg. 9, 101–3 (1999).
22. Fullerton, A. Congenital Idiopathic Dilatation of the Colon Treated by Stretching of the Pelvi-rectal sphincter. Br. Med. J. 1, 753–4 (1927).
23. Swenson, O. My early experience with Hirschsprung’s disease. J. Pediatr. Surg. 24, 839–44; discussion 844–5 (1989).
24. Georgeson, K. E. et al. Primary laparoscopic-assisted endorectal colon pull-through for Hirschsprung’s disease: a new gold standard. Ann. Surg. 229, 678–82; discussion 682–3 (1999).
25. De la Torre-Mondragón, L. & Ortega-Salgado, J. A. Transanal endorectal pull-through for Hirschsprung’s disease. J. Pediatr. Surg. 33, 1283–6 (1998).
26. Fujiwara, N. et al. A comparative study of laparoscopy-assisted pull-through and open pull-through for Hirschsprung’s disease with special reference to postoperative fecal continence. J. Pediatr. Surg. 42, 2071–4 (2007).
27. De la Torre, L. & Ortega, A. Transanal versus open endorectal pull-through for Hirschsprung’s disease. J. Pediatr. Surg. 35, 1630–2 (2000).
28. Langer, J. C., Seifert, M. & Minkes, R. K. One-stage Soave pull-through for Hirschsprung’s disease: a comparison of the transanal and open approaches. J. Pediatr. Surg. 35, 820–2 (2000).
29. Pini Prato, A. et al. Hirschsprung’s disease: what about mortality? Pediatr. Surg. Int. 27, 473–8 (2011).
30. Rintala, R. J. & Pakarinen, M. P. Long-term outcomes of Hirschsprung’s disease. Semin. Pediatr. Surg. 21, 336–43 (2012).
31. Ieiri, S. et al. Long-term outcomes and the quality of life of Hirschsprung disease in adolescents who have reached 18 years or older--a 47-year single-institute experience. J. Pediatr. Surg. 45, 2398–402 (2010).
32. Rauch, U., Hänsgen, A., Hagl, C., Holland-Cunz, S. & Schäfer, K.-H. Isolation and cultivation of neuronal precursor cells from the developing human enteric nervous system as a tool for cell therapy in dysganglionosis. Int. J. Colorectal Dis. 21, 554–9 (2006).
33. Almond, S., Lindley, R. M., Kenny, S. E., Connell, M. G. & Edgar, D. H. Characterisation and transplantation of enteric nervous system progenitor cells. Gut 56, 489–96 (2007).
34. Lindley, R. M. et al. Human and Mouse Enteric Nervous System Neurosphere Transplants Regulate the Function of Aganglionic Embryonic Distal Colon. Gastroenterology 135, (2008).
35. Metzger, M., Caldwell, C., Barlow, A. J., Burns, A. J. & Thapar, N. Enteric Nervous System Stem Cells Derived From Human Gut Mucosa for the Treatment of Aganglionic Gut Disorders. Gastroenterology 136, (2009).
36. Lee, G., Chambers, S. M., Tomishima, M. J. & Studer, L. Derivation of neural crest cells from human pluripotent stem cells. Nat Protoc 5, 688–701 (2010).
37. Lee, G. et al. Isolation and directed differentiation of neural crest stem cells derived from human embryonic stem cells. Nat. Biotechnol. 25, 1468–1475 (2007).
38. Fattahi, F. et al. Deriving human ENS lineages for cell therapy and drug discovery in Hirschsprung disease. Nature (2016). doi:10.1038/nature16951
39. Ben-David, U. & Benvenisty, N. The tumorigenicity of human embryonic and induced pluripotent stem cells. Nat. Rev. Cancer 11, 268–77 (2011).
40. Torfs, C. P. An epidemiological study of Hirschsprung disease in a multiracial California population. Third Int. Meet. Hirschsprung Dis. Relat. neurocristopathies (1998).
41. Badner, J. A., Sieber, W. K., Garver, K. L. & Chakravarti, A. A genetic study of Hirschsprung disease. Am. J. Hum. Genet. 46, 568–80 (1990).
42. Friedmacher, F. & Puri, P. Hirschsprung’s disease associated with Down syndrome: A meta-analysis of incidence, functional outcomes and mortality. Pediatr. Surg. Int. 29, 937–946 (2013).
43. Emison, E. S. et al. Differential contributions of rare and common, coding and noncoding Ret mutations to multifactorial Hirschsprung disease liability. Am. J. Hum. Genet. 87, 60–74 (2010).
44. Angrist, M. et al. A gene for Hirschsprung disease (megacolon) in the pericentromeric region of human chromosome 10. Nat. Genet. 4, 351–6 (1993).
45. Lyonnet, S. et al. A gene for Hirschsprung disease maps to the proximal long arm of chromosome 10. Nat. Genet. 4, 346–50 (1993).
46. Edery, P. et al. Mutations of the RET proto-oncogene in Hirschsprung’s disease. Nature 367, 378–380 (1994).
47. Romeo, G. et al. Point mutations affecting the tyrosine kinase domain of the RET proto-oncogene in Hirschsprung’s disease. Nature 367, 377–378 (1994).
48. Attié, T. et al. Diversity of RET proto-oncogene mutations in familial and sporadic Hirschsprung disease. Hum. Mol. Genet. 4, 1381–6 (1995).
49. Hofstra, R. M. et al. RET and GDNF gene scanning in Hirschsprung patients using two dual denaturing gel systems. Hum. Mutat. 15, 418–29 (2000).
50. Pasquali, D. et al. Multiple endocrine neoplasia, the old and the new: a mini review. G. Chir. 33, 370–3 (2012).
51. Pasini, B. et al. Loss of function effect of RET mutations causing Hirschsprung disease. Nat. Genet. 10, 35–40 (1995).
52. Mulligan, L. M. et al. Diverse phenotypes associated with exon 10 mutations of the RET proto-oncogene. Hum. Mol. Genet. 3, 2163–2168 (1994).
53. Angrist, M. et al. Mutation analysis of the RET receptor tyrosine kinase in Hirschsprung disease. Hum. Mol. Genet. 4, 821–30 (1995).
54. Sijmons, R. H. et al. Oncological implications of RET gene mutations in Hirschsprung’s disease. Gut 43, 542–547 (1998).
55. Amiel, J. et al. Hirschsprung disease, associated syndromes and genetics: a review. J. Med. Genet. 45, 1–14 (2008).
56. Garcia-Barcelo, M. et al. TTF-1 and RET promoter SNPs: regulation of RET transcription in Hirschsprung’s disease. Hum. Mol. Genet. 14, 191–204 (2005).
57. Griseri, P. et al. A common haplotype at the 5’ end of the RET proto-oncogene, overrepresented in Hirschsprung patients, is associated with reduced gene expression. Hum. Mutat. 25, 189–95 (2005).
58. Emison, E. S. et al. A common sex-dependent mutation in a RET enhancer underlies Hirschsprung disease risk. Nature 434, 857–63 (2005).
59. Sribudiani, Y. et al. Variants in RET associated with hirschsprung’s disease affect binding of transcription factors and gene expression. Gastroenterology 140, 572–582 (2011).
60. Grice, E. A., Rochelle, E. S., Green, E. D., Chakravarti, A. & McCallion, A. S. Evaluation of the RET regulatory landscape reveals the biological relevance of a HSCR-implicated enhancer. Hum. Mol. Genet. 14, 3837–3845 (2005).
61. Chattopadhyay, P. et al. Global survey of haplotype frequencies and linkage disequilibrium at the RET locus. Eur. J. Hum. Genet. 11, 760–9 (2003).
62. Griseri, P. et al. A common variant located in the 3’UTR of the RET gene is associated with protection from Hirschsprung disease. Hum. Mutat. 28, 168–76 (2007).
63. Griseri, P. et al. A single-nucleotide polymorphic variant of the RET proto-oncogene is underrepresented in sporadic Hirschsprung disease. Eur. J. Hum. Genet. 8, 721–4 (2000).
64. Plaza-Menacho, I., Burzynski, G. M., de Groot, J. W., Eggen, B. J. L. & Hofstra, R. M. W. Current concepts in RET-related genetics, signaling and therapeutics. Trends Genet. 22, 627–36 (2006).
65. Schuchardt, A., D’Agati, V., Larsson-Blomberg, L., Costantini, F. & Pachnis, V. Defects in the kidney and enteric nervous system of mice lacking the tyrosine kinase receptor Ret. Nature 367, 380–383 (1994).
66. Pichel, J. G. et al. Defects in enteric innervation and kidney development in mice lacking GDNF. Nature 382, 73–76 (1996).
67. Sánchez, M. P. et al. Renal agenesis and the absence of enteric neurons in mice lacking GDNF. Nature 382, 70–73 (1996).
68. Moore, M. W. et al. Renal and neuronal abnormalities in mice lacking GDNF. Nature 382, 76–79 (1996).
69. Enomoto, H. et al. GFR alpha1-deficient mice have deficits in the enteric nervous system and kidneys. Neuron 21, 317–324 (1998).
70. Young, H. M. et al. GDNF is a chemoattractant for enteric neural cells. Dev. Biol. 229, 503–16 (2001).
71. Chalazonitis, A., Rothman, T. P., Chen, J. & Gershon, M. D. Age-dependent differences in the effects of GDNF and NT-3 on the development of neurons and glia from neural crest-derived precursors immunoselected from the fetal rat gut: expression of GFRalpha-1 in vitro and in vivo. Dev. Biol. 204, 385–406 (1998).
72. Taraviras, S. et al. Signalling by the RET receptor tyrosine kinase and its role in the development of the mammalian enteric nervous system. 2797, 2785–2797 (1999).
73. Gianino, S., Grider, J. R., Cresswell, J., Enomoto, H. & Heuckeroth, R. O. GDNF availability determines enteric neuron number by controlling precursor proliferation. Development 130, 2187–2198 (2003).
74. Hearn, C. J., Murphy, M. & Newgreen, D. GDNF and ET-3 differentially modulate the numbers of avian enteric neural crest cells and enteric neurons in vitro. Dev. Biol. 197, 93–105 (1998).
75. Natarajan, D., Marcos-Gutierrez, C., Pachnis, V. & de Graaff, E. Requirement of signalling by receptor tyrosine kinase RET for the directed migration of enteric nervous system progenitor cells during mammalian embryogenesis. Development 129, 5151–60 (2002).
76. Angrist, M., Bolk, S., Halushka, M., Lapchak, P. A. & Chakravarti, A. Germline mutations in glial cell line-derived neurotrophic factor (GDNF) and RET in a Hirschsprung disease patient. Nat. Genet. 14, 341–344 (1996).
77. Ivanchuk, S. M., Myers, S. M., Eng, C. & Mulligan, L. M. De novo mutation of GDNF, ligand for the RET/GDNFR-alpha receptor complex, in Hirschsprung disease. Hum. Mol. Genet. 5, 2023–2026 (1996).
CHAPTER 1
34
78. Doray, B. et al. Mutation of the RET ligand, neurturin, supports multigenic inheritance in Hirschsprung disease. Hum. Mol. Genet. 7, 1449–1452 (1998).
79. Ruiz-Ferrer, M. et al. Novel mutations at RET ligand genes preventing receptor activation are associated to Hirschsprung’s disease. J. Mol. Med. (Berl). 89, 471–80 (2011).
80. Rossi, J. et al. Alimentary tract innervation deficits and dysfunction in mice lacking GDNF family receptor α2. J. Clin. Invest. 112, 707–716 (2003).
81. Borrego, S. et al. Investigation of germline GFRA4 mutations and evaluation of the involvement of GFRA1, GFRA2, GFRA3, and GFRA4 sequence variants in Hirschsprung disease. J. Med. Genet. 40, e18 (2003).
82. Puffenberger, E. G. et al. A missense mutation of the endothelin-B receptor gene in multigenic hirschsprung’s disease. Cell 79, 1257–1266 (1994).
83. Amiel, J. Heterozygous endothelin receptor B (EDNRB) mutations in isolated Hirschsprung disease. Hum. Mol. Genet. 5, 355–357 (1996).
84. Auricchio, A. Endothelin-B receptor mutations in patients with isolated Hirschsprung disease from a non-inbred population. Hum. Mol. Genet. 5, 351–354 (1996).
85. Kusafuka, T. Novel mutations of the endothelin-B receptor gene in isolated patients with Hirschsprung’s disease. Hum. Mol. Genet. 5, 347–349 (1996).
86. Edery, P. et al. Mutation of the endothelin-3 gene in the Waardenburg-Hirschsprung disease (Shah-Waardenburg syndrome). Nat. Genet. 12, 442–4 (1996).
87. Hofstra, R. M. et al. A homozygous mutation in the endothelin-3 gene associated with a combined Waardenburg type 2 and Hirschsprung phenotype (Shah-Waardenburg syndrome). Nat. Genet. 12, 445–7 (1996).
88. Hofstra, R. M. et al. A loss-of-function mutation in the endothelin-converting enzyme 1 (ECE-1) associated with Hirschsprung disease, cardiac defects, and autonomic dysfunction. Am. J. Hum. Genet. 64, 304–8 (1999).
89. Hosoda, K. et al. Targeted and natural (piebald-lethal) mutations of endothelin-B receptor gene produce megacolon associated with spotted coat color in mice. Cell 79, 1267–1276 (1994).
90. Baynash, A. G. et al. Interaction of endothelin-3 with endothelin-B receptor is essential for development of epidermal melanocytes and enteric neurons. Cell 79, 1277–1285 (1994).
91. Yanagisawa, H. et al. Dual genetic pathways of endothelin-mediated intercellular signaling revealed by targeted disruption of endothelin converting enzyme-1 gene. Development 125, 825–836 (1998).
92. Lee, H.-O., Levorse, J. M. & Shin, M. K. The endothelin receptor-B is required for the migration of neural crest-derived melanocyte and enteric neuron precursors. Dev. Biol. 259, 162–75 (2003).
93. Shin, M. K., Levorse, J. M., Ingram, R. S. & Tilghman, S. M. The temporal requirement for endothelin receptor-B signalling during neural crest development. Nature 402, 496–501 (1999).
94. Wu, J. J., Chen, J. X., Rothman, T. P. & Gershon, M. D. Inhibition of in vitro enteric neuronal development by endothelin-3: mediation by endothelin B receptors. Development 126, 1161–73 (1999).
95. Nagy, N. & Goldstein, A. M. Endothelin-3 regulates neural crest cell proliferation and differentiation in the hindgut enteric nervous system. Dev. Biol. 293, 203–17 (2006).
96. Garcia-Barcelo, M.-M. et al. Genome-wide association study identifies NRG1 as a susceptibility locus for Hirschsprung’s disease. Proc. Natl. Acad. Sci. U. S. A. 106, 2694–9 (2009).
97. Tang, C. S.-M. et al. Fine mapping of the NRG1 Hirschsprung’s disease locus. PLoS One 6, e16181 (2011).
98. Phusantisampan, T. et al. Association of genetic polymorphisms in the RET-protooncogene and NRG1 with Hirschsprung disease in Thai patients. J. Hum. Genet. 57, 286–93 (2012).
99. Gunadi et al. Effects of RET and NRG1 polymorphisms in Indonesian patients with Hirschsprung disease. J. Pediatr. Surg. 49, 1614–8 (2014).
100. Kapoor, A. et al. Population variation in total genetic risk of Hirschsprung disease from common RET, SEMA3 and NRG1 susceptibility polymorphisms. Hum. Mol. Genet. 24, 2997–3003 (2015).
101. Jiang, Q. et al. Functional loss of semaphorin 3C and/or semaphorin 3D and their epistatic interaction with ret are critical to Hirschsprung disease liability. Am. J. Hum. Genet. 96, 581–96 (2015).
GENERAL INTRODUCTION
35
1 102. Tang, C. S.-M. et al. Mutations in the NRG1 gene are associated with Hirschsprung disease. Hum. Genet. 131, 67–76 (2011).
103. Luzón-Toro, B. et al. Comprehensive Analysis of NRG1 Common and Rare Variants in Hirschsprung Patients. PLoS One 7, e36524 (2012).
104. Tang, C. S.-M. et al. Genome-wide copy number analysis uncovers a new HSCR gene: NRG3. PLoS Genet. 8, e1002687 (2012).
105. Yang, J. et al. Exome sequencing identified NRG3 as a novel susceptible gene of Hirschsprung’s disease in a Chinese population. Mol. Neurobiol. 47, 957–66 (2013).
106. Buonanno, A. & Fischbach, G. D. Neuregulin and ErbB receptor signaling pathways in the nervous system. Curr. Opin. Neurobiol. 11, 287–96 (2001).
107. Crone, S. A., Negro, A., Trumpp, A., Giovannini, M. & Lee, K.-F. Colonic Epithelial Expression of ErbB2 Is Required for Postnatal Maintenance of the Enteric Nervous System. Neuron 37, 29–40 (2003).
108. Luzón-Toro, B. et al. Mutational spectrum of semaphorin 3A and semaphorin 3D genes in Spanish Hirschsprung patients. PLoS One 8, e54800 (2013).
109. Pingault, V. et al. SOX10 mutations in patients with Waardenburg-Hirschsprung disease. Nat. Genet. 18, 171–3 (1998).
110. Touraine, R. L. et al. Neurological phenotype in Waardenburg syndrome type 4 correlates with novel SOX10 truncating mutations and expression in developing brain. Am. J. Hum. Genet. 66, 1496–503 (2000).
111. Sánchez-Mejías, A. et al. Involvement of SOX10 in the pathogenesis of Hirschsprung disease: Report of a truncating mutation in an isolated patient. J. Mol. Med. 88, 507–514 (2010).
112. Paratore, C., Eichenberger, C., Suter, U. & Sommer, L. Sox10 haploinsufficiency affects maintenance of progenitor cells in a mouse model of Hirschsprung disease. Hum. Mol. Genet. 11, 3075–3085 (2002).
113. Pingault, V. et al. Loss-of-function mutations in SOX10 cause Kallmann syndrome with deafness. Am. J. Hum. Genet. 92, 707–724 (2013).
114. Amiel, J. et al. Polyalanine expansion and frameshift mutations of the paired-like homeobox gene PHOX2B in congenital central hypoventilation syndrome. Nat. Genet. 33, 459–61 (2003).
115. Pattyn, A., Morin, X., Cremer, H., Goridis, C. & Brunet, J. F. The homeobox gene Phox2b is essential for the development of autonomic neural crest derivatives. Nature 399, 366–70 (1999).
116. Mowat, D. R. et al. Hirschsprung disease, microcephaly, mental retardation, and characteristic facial features: delineation of a new syndrome and identification of a locus at chromosome 2q22-q23. J. Med. Genet. 35, 617–623 (1998).
117. Wakamatsu, N. et al. Mutations in SIP1, encoding Smad interacting protein-1, cause a form of Hirschsprung disease. Nat. Genet. 27, 369–70 (2001).
118. Van de Putte, T. et al. Mice lacking ZFHX1B, the gene that codes for Smad-interacting protein-1, reveal a role for multiple neural crest cell defects in the etiology of Hirschsprung disease-mental retardation syndrome. Am. J. Hum. Genet. 72, 465–70 (2003).
119. Amiel, J. et al. Mutations in TCF4, encoding a class I basic helix-loop-helix transcription factor, are responsible for Pitt-Hopkins syndrome, a severe epileptic encephalopathy associated with autonomic dysfunction. Am. J. Hum. Genet. 80, 988–993 (2007).
120. Zweier, C. et al. Haploinsufficiency of TCF4 causes syndromal mental retardation with intermittent hyperventilation (Pitt-Hopkins syndrome). Am. J. Hum. Genet. 80, 994–1001 (2007).
121. Peippo, M. M. et al. Pitt-Hopkins syndrome in two patients and further definition of the phenotype. Clin. Dysmorphol. 15, 47–54 (2006).
122. Salerno, T. et al. Respiratory insufficiency in a newborn with congenital hypothyroidism due to a new mutation of TTF-1/NKX2.1 gene. Pediatr. Pulmonol. 49, E42–4 (2014).
123. Mustafin, A. A. & Sultanova, L. M. [Differential diagnosis of Hirschsprung’s disease and congenital hypothyroidism in children]. Vestn. Khir. Im. I. I. Grek. 134, 89–91 (1985).
124. Okamoto, N., Wada, Y. & Goto, M. Hydrocephalus and Hirschsprung’s disease in a patient with a mutation of L1CAM. J. Med. Genet. 34, 670–671 (1997).
125. Parisi, M. A. et al. Hydrocephalus and intestinal aganglionosis: is L1CAM a modifier gene in Hirschsprung disease? Am. J. Med. Genet. 108, 51–6 (2002).
126. Okamoto, N. et al. Hydrocephalus and Hirschsprung’s disease with a mutation of L1CAM. J. Hum. Genet. 49, 334–7 (2004).
CHAPTER 1
36
127. Jackson, S.-R. et al. L1CAM mutation in association with X-linked hydrocephalus and Hirschsprung’s disease. Pediatr. Surg. Int. 25, 823–5 (2009).
128. Basel-Vanagaite, L. et al. Expanding the phenotypic spectrum of L1CAM-associated disease. Clin. Genet. 69, 414–9 (2006).
129. Jannot, A.-S. et al. Chromosome 21 scan in Down syndrome reveals DSCAM as a predisposing locus in Hirschsprung disease. PLoS One 8, e62519 (2013).
130. Brooks, A. S. et al. Homozygous nonsense mutations in KIAA1279 are associated with malformations of the central and enteric nervous systems. Am. J. Hum. Genet. 77, 120–6 (2005).
131. Drévillon, L. et al. KBP-cytoskeleton interactions underlie developmental anomalies in Goldberg-Shprintzen syndrome. Hum. Mol. Genet. 22, 2387–99 (2013).
132. Dafsari, H. S. et al. Goldberg-Shprintzen megacolon syndrome with associated sensory motor axonal neuropathy. Am. J. Med. Genet. A 167, 1300–4 (2015).
133. Alves, M. M. et al. KBP interacts with SCG10, linking Goldberg-Shprintzen syndrome to microtubule dynamics and neuronal differentiation. Hum. Mol. Genet. 19, 3642–3651 (2010).
134. Torroglosa, A. et al. Involvement of DNMT3B in the pathogenesis of Hirschsprung disease and its possible role as a regulator of neurogenesis in the human enteric nervous system. Genet. Med. 16, 703–10 (2014).
135. Martins-Taylor, K., Schroeder, D. I., LaSalle, J. M., Lalande, M. & Xu, R.-H. Role of DNMT3B in the regulation of early neural and neural crest specifiers. Epigenetics 7, 71–82 (2014).
136. Hu, N., Strobl-Mazzulla, P. H., Simoes-Costa, M., Sánchez-Vásquez, E. & Bronner, M. E. DNA methyltransferase 3B regulates duration of neural crest production via repression of Sox10. Proc. Natl. Acad. Sci. U. S. A. 111, 17911–6 (2014).
137. Ngan, E. S.-W. et al. Hedgehog/Notch-induced premature gliogenesis represents a new disease mechanism for Hirschsprung disease in mice and humans. J. Clin. Invest. 121, 3467–78 (2011).
138. Wang, Y. et al. Common Genetic Variations in Patched1 (PTCH1) Gene and Risk of Hirschsprung Disease in the Han Chinese Population. PLoS One 8, 1–8 (2013).
139. Bolk, S. et al. A human model for multigenic inheritance: phenotypic expression in Hirschsprung disease requires both the RET gene and a new 9q31 locus. Proc. Natl. Acad. Sci. U. S. A. 97, 268–73 (2000).
140. Tang, C. S. et al. Fine mapping of the 9q31 Hirschsprung’s disease locus. Hum. Genet. 127, 675–683 (2010).
141. Cheng, W. W.-C. et al. Depletion of the IKBKAP ortholog in zebrafish leads to hirschsprung disease-like phenotype. World J. Gastroenterol. 21, 2040–6 (2015).
142. Puffenberger, E. et al. Identity-by-descent and association mapping of a recessive gene for Hirschsprung disease on human chromosome 13q22. Hum. Mol. Genet. 3, 1217–1225 (1994).
143. Gabriel, S. B. et al. Segregation at three loci explains familial and population risk in Hirschsprung disease. Nat. Genet. 31, 89–93 (2002).
144. Carrasquillo, M. M. et al. Genome-wide association study and mouse model identify interaction between RET and EDNRB pathways in Hirschsprung disease. Nat. Genet. 32, 237–44 (2002).
145. Brooks, A. S. et al. A novel susceptibility locus for Hirschsprung’s disease maps to 4q31.3-q32.3. J. Med. Genet. 43, e35 (2006).
146. Auricchio, A. et al. Double heterozygosity for a RET substitution interfering with splicing and an EDNRB missense mutation in Hirschsprung disease. Am. J. Hum. Genet. 64, 1216–21 (1999).
147. McCallion, A. S., Stames, E., Conlon, R. A. & Chakravarti, A. Phenotype variation in two-locus mouse models of Hirschsprung disease: tissue-specific interaction between Ret and Ednrb. Proc. Natl. Acad. Sci. U. S. A. 100, 1826–31 (2003).
148. Barlow, A., de Graaff, E. & Pachnis, V. Enteric Nervous System Progenitors Are Coordinately Controlled by the G Protein-Coupled Receptor EDNRB and the Receptor Tyrosine Kinase RET. Neuron 40, 905–916 (2003).
149. Gui, H. et al. RET and NRG1 interplay in Hirschsprung disease. Hum. Genet. 132, 591–600 (2013).
150. De Pontual, L. et al. Epistatic interactions with a common hypomorphic Ret allele in syndromic Hirschsprung disease. Hum. Mutat. 28, 790–796 (2007).
151. Ramalho-Santos, M., Melton, D. A. & McMahon, A. P. Hedgehog signals regulate multiple aspects of gastrointestinal development. Development 127, 2763–2772 (2000).
152. Lang, D. et al. Pax3 is required for enteric ganglia formation and functions with Sox10 to modulate expression of c-ret. J. Clin. Invest. 106, 963–71 (2000).
GENERAL INTRODUCTION
37
1 153. Barlow, A. J., Dixon, J., Dixon, M. & Trainor, P. a. Tcof1 acts as a modifier of Pax3 during enteric nervous system development and in the pathogenesis of colonic aganglionosis. Hum. Mol. Genet. 22, 1206–17 (2013).
154. Warren, M. et al. A Sall4 mutant mouse model useful for studying the role of Sall4 in early embryonic development and organogenesis. Genesis 45, 51–58 (2007).
155. Breau, M. A. et al. Lack of beta1 integrins in enteric neural crest cells leads to a Hirschsprung-like phenotype. Development 133, 1725–1734 (2006).
156. Blaugrund, E. et al. Distinct subpopulations of enteric neuronal progenitors defined by time of development, sympathoadrenal lineage markers and Mash-1-dependence. Development 122, 309–20 (1996).
157. Chalazonitis, A. et al. Neurotrophin-3 is required for the survival-differentiation of subsets of developing enteric neurons. J. Neurosci. 21, 5620–5636 (2001).
158. Bates, M. D., Dunagan, D. T., Welch, L. C., Kaul, A. & Harvey, R. P. The Hlx homeobox transcription factor is required early in enteric nervous system development. BMC Dev. Biol. 6, 33 (2006).
159. Bergeron, K.-F. et al. Male-biased aganglionic megacolon in the TashT mouse line due to perturbation of silencer elements in a large gene desert of chromosome 10. PLoS Genet. 11, e1005093 (2015).
160. Li, Y. et al. SRY interference of normal regulation of the RET gene suggests a potential role of the Y-chromosome gene in sexual dimorphism in Hirschsprung disease. Hum. Mol. Genet. 24, 685–697 (2014).
161. Okamura, Y. & Saga, Y. Notch signaling is required for the maintenance of enteric neural crest progenitors. Development 135, 3555–3565 (2008).
162. Ossipova, O. & Sokol, S. Y. Neural crest specification by noncanonical Wnt signaling and PAR-1. Development 138, 5441–50 (2011).
163. Ikeya, M., Lee, S. M., Johnson, J. E., McMahon, A. P. & Takada, S. Wnt signalling required for expansion of neural crest and CNS progenitors. Nature 389, 966–70 (1997).
164. Yoshikawa, S., McKinnon, R. D., Kokel, M. & Thomas, J. B. Wnt-mediated axon guidance via the Drosophila Derailed receptor. Nature 422, 583–8 (2003).
165. Chalazonitis, A. et al. Bone morphogenetic protein-2 and -4 limit the number of enteric neurons but promote development of a TrkC-expressing neurotrophin-3-dependent subset. J. Neurosci. 24, 4266–82 (2004).
166. Goldstein, A. M., Brewer, K. C., Doyle, A. M., Nagy, N. & Roberts, D. J. BMP signaling is necessary for neural crest cell migration and ganglion formation in the enteric nervous system. Mech. Dev. 122, 821–33 (2005).
167. Mayor, R., Guerrero, N. & Martínez, C. Role of FGF and noggin in neural crest induction. Dev. Biol. 189, 1–12 (1997).
168. Murphy, M., Reid, K., Ford, M., Furness, J. B. & Bartlett, P. F. FGF2 regulates proliferation of neural crest cells, with subsequent neuronal differentiation regulated by LIF or related factors. Development 120, 3519–28 (1994).
169. Luzón-Toro, B. et al. Next-generation-based targeted sequencing as an efficient tool for the study of the genetic background in Hirschsprung patients. BMC Med. Genet. 16, 89 (2015).
170. Luzón-Toro, B. et al. Exome sequencing reveals a high genetic heterogeneity on familial Hirschsprung disease. Sci. Rep. 5, 16473 (2015).
171. Pelet, A. et al. De-novo mutations of the RET proto-oncogene in Hirschsprung’s disease. Lancet 344, 1769–70 (1994).
172. Yin, L. et al. Prevalence and parental origin of de novo RET mutations in Hirschsprung’s disease. Eur. J. Hum. Genet. 4, 356–8 (1996).
173. Chen, W.-C., Chang, S.-S., Sy, E. D. & Tsai, M.-C. A De Novo novel mutation of the EDNRB gene in a Taiwanese boy with Hirschsprung disease. J. Formos. Med. Assoc. 105, 349–354 (2006).
174. Veltman, J. a. & Brunner, H. G. De novo mutations in human genetic disease. Nat. Rev. Genet. 13, 565–575 (2012).
175. Kong, A. et al. Rate of de novo mutations and the importance of father’s age to disease risk. Nature 488, 471–475 (2012).
176. Campbell, C. D. & Eichler, E. E. Properties and rates of germline mutations in humans. Trends Genet. 29, 575–584 (2013).
177. Park, C., Qian, W. & Zhang, J. Genomic evidence for elevated mutation rates in highly expressed genes. EMBO Rep. 13, 1123–9 (2012).
CHAPTER 1
38
178. Lynch, M. Rate, molecular spectrum, and consequences of human mutation. Proc. Natl. Acad. Sci. U. S. A. 107, 961–8 (2010).
179. Itsara, A. et al. De novo rates and selection of large copy number variation. Genome Res. 20, 1469–81 (2010).
180. Francioli, L. C. et al. Whole-genome sequence variation, population structure and demographic history of the Dutch population. Nat. Genet. 46, 1–95 (2014).
181. Francioli, L. C. et al. Genome-wide patterns and properties of de novo mutations in humans. Nat. Genet. 47, 822–826 (2015).
182. Michaelson, J. J. et al. Whole-genome sequencing in autism identifies hot spots for de novo germline mutation. Cell 151, 1431–42 (2012).
183. Campbell, C. D. et al. Estimating the human mutation rate using autozygosity in a founder population. Nat. Genet. 44, 1277–81 (2012).
184. Vissers, L. E. L. M. et al. Mutations in a new member of the chromodomain gene family cause CHARGE syndrome. Nat. Genet. 36, 955–7 (2004).
185. Hoischen, A. et al. De novo mutations of SETBP1 cause Schinzel-Giedion syndrome. Nat. Genet. 42, 483–485 (2010).
186. Ng, S. B. et al. Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat. Genet. 42, 790–3 (2010).
187. Hoischen, A. et al. De novo nonsense mutations in ASXL1 cause Bohring-Opitz syndrome. Nat. Genet. 43, 729–731 (2011).
188. Sirmaci, A. et al. Mutations in ANKRD11 cause KBG syndrome, characterized by intellectual disability, skeletal malformations, and macrodontia. Am. J. Hum. Genet. 89, 289–94 (2011).
189. Cook, E. H. & Scherer, S. W. Copy-number variations associated with neuropsychiatric conditions. Nature 455, 919–23 (2008).
190. Awadalla, P. et al. Direct Measure of the De Novo Mutation Rate in Autism and Schizophrenia Cohorts. Am. J. Hum. Genet. 87, 316–324 (2010).
191. Hamdan, F. F. et al. Excess of De Novo Deleterious Mutations in Genes Associated with Glutamatergic Systems in Nonsyndromic Intellectual Disability. Am. J. Hum. Genet. 88, 306–316 (2011).
192. Sanders, S. J. et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485, 237–241 (2012).
193. O’Roak, B. J. et al. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature 485, 246–250 (2012).
194. Neale, B. M. et al. Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485, 242–245 (2012).
195. Iossifov, I. et al. De Novo Gene Disruptions in Children on the Autistic Spectrum. Neuron 74, 285–299 (2012).
196. Xu, B. et al. Exome sequencing supports a de novo mutational paradigm for schizophrenia. Nat. Genet. 43, 864–8 (2011).
197. Girard, S. L. et al. Increased exonic de novo mutation rate in individuals with schizophrenia. Nat. Genet. 43, 860–863 (2011).
198. Gilissen, C. et al. Genome sequencing identifies major causes of severe intellectual disability. Nature 511, 344–347 (2014).
199. Homsy, J. et al. De novo mutations in congenital heart disease with neurodevelopmental and other congenital anomalies. Science (80-. ). 350, 1262–1266 (2015).
200. D’Onofrio, B. M. et al. Paternal age at childbearing and offspring psychiatric and academic morbidity. JAMA psychiatry 71, 432–8 (2014).
201. Yang, Y. et al. Clinical whole-exome sequencing for the diagnosis of mendelian disorders. N. Engl. J. Med. 369, 1502–11 (2013).
DE NOVO MUTATIONS IN HIRSCHSPRUNG PATIENTS
LINK CENTRAL NERVOUS SYSTEM GENES TO THE
DEVELOPMENT OF THE ENTERIC NERVOUS SYSTEM
Hongsheng Gui1,2*, Duco Schriemer3*, William W.C. Cheng1,4*, Rajendra K. Chauhan4,
Guillermo Antiñolo5,6, Courtney Berrios7, Marta Bleda6,8, Alice S. Brooks4,
Rutger W.W. Brouwer9, Alan J. Burns4,10, Stacey S. Cherny2, Joaquin Dopazo5,6,
Bart J.L. Eggen3, Paola Griseri11, Binta Jalloh12, Thuy-Linh Le13,14, Vincent C.H. Lui1,
Berta Luzón-Toro5,6, Ivana Matera11, Elly S.W. Ngan1, Anna Pelet13,14,
Macarena Ruiz-Ferrer5,6, Pak C. Sham2, Iain T. Shepherd12, Man-Ting So1,
Yunia Sribudiani4,15, Clara S.M. Tang1, Mirjam C.G.N. van den Hout9,
Wilfred F.J. van IJcken9, Joke B.G.M. Verheij16, Jeanne Amiel13,14, Salud Borrego5,6,
Isabella Ceccherini11, Aravinda Chakravarti7, Stanislas Lyonnet13,14, Paul K.H. Tam1,
Maria-Mercè Garcia-Barceló1# & Robert M.W. Hofstra4,10#
1 Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, SAR, China
2 Centre for Genomic Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, SAR, China
3 Department of Neuroscience, section Medical Physiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
4 Department of Clinical Genetics, Erasmus Medical Center, Rotterdam, The Netherlands
5 Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain
6 Centre for Biomedical Network Research on Rare Diseases (CIBERER), Seville, Spain
7 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, USA
8 Department of Medicine, School of Clinical Medicine, University of Cambridge, Addenbrooke's Hospital, Cambridge, United Kingdom
9 Erasmus Center for Biomics, Erasmus Medical Center, Rotterdam, The Netherlands
10 Stem Cells and Regenerative Medicine, Birth Defects Research Centre, UCL Institute of Child Health, London, UK
11 UOC Genetica Medica, Istituto Gaslini, Genova, Italy
12 Department of Biology, Emory University, Atlanta, USA
13 Département de Génétique, Faculté de Médecine, Université Paris Descartes, Paris, France
14 INSERM U-781, AP-HP Hôpital Necker-Enfants Malades, Paris, France
15 Department of Biochemistry and Molecular Biology, Faculty of Medicine, Universitas Padjadjaran, Bandung, Indonesia
16 Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
*Equally contributing authors; #Equally contributing corresponding authors.
Submitted
CHAPTER 2
ABSTRACT
Hirschsprung disease (HSCR), the most common form of congenital bowel
obstruction, results from a failure of enteric nervous system (ENS) progenitors to
migrate to the gastrointestinal tract or to proliferate, differentiate or survive
within it, resulting in aganglionosis of the distal colon. The HSCR genes identified
to date are known to be involved in ENS development. Therefore, the search for
genes that could account for the missing heritability in HSCR has focused on
ENS-related pathways. A de novo mutation (DNM) screen in 24 HSCR patients
revealed 20 DNMs in 20 genes, in addition to 8 DNMs in the known HSCR gene RET.
Knockdown of the genes carrying missense and loss-of-function DNMs identified
4 genes indispensable for ENS development in zebrafish. Moreover, these 4 genes,
which are expressed in the gut or in ENS progenitors, are also involved in central
nervous system (CNS) development. These newly identified HSCR genes indicate
that CNS-associated genes also play a major role in ENS development.
Keywords: De novo mutations, Hirschsprung disease, neural crest, ENS, CNS
DE NOVO MUTATIONS IN HSCR PATIENTS LINK CNS GENES TO THE DEVELOPMENT OF THE ENS
INTRODUCTION
Hirschsprung disease (HSCR) is the most common form of congenital obstruction
of the bowel, with an incidence of ~1 per 5,000 live births. However, the incidence
varies significantly between ethnic groups, with the highest incidence (2.8 per
10,000 live births) reported in the Asian population1,2. HSCR results from a failure
of the neural crest cells that give rise to the enteric nervous system (ENS) to
migrate, proliferate, differentiate or survive in the bowel wall, resulting in
aganglionosis of the distal part of the gastrointestinal tract. This leads to clinically
severe and sometimes life-threatening bowel obstruction. As HSCR is a highly
heritable disorder, genetic variation (mutations) in the genomes of these patients
must largely explain disease development. The mode of inheritance of HSCR
ranges from recessive (mostly in syndromic cases) or dominant with incomplete
penetrance (in non-syndromic HSCR families) to oligogenic/polygenic in sporadic
cases3. So far, >15 HSCR susceptibility genes have been found, as well as 6 linkage
regions1 and three associated loci2,4. The genes identified belong to a limited
number of pathways that have been shown to be relevant to the development of
the ENS, of which the RET pathway and the endothelin pathway are the most
important. However, the identified genes and the variants in these genes explain
no more than 25% of the overall genetic risk2,4. Thus, the vast majority of cases
cannot yet be explained by the identified HSCR-associated variants. These findings
indicate that most of the disease risk must be due to as yet unidentified rare or
common variants in the known HSCR genes or, more likely, to variants in as yet
unknown genes, acting alone or in combination.
Exome sequencing followed by selection of genes that can be functionally
linked to the pathways already known to be involved in the disease is the current
approach in the field of human genetics. Variants in genes totally unlinked to the
known genes or pathways are largely neglected. This study aimed to determine the
contribution of rare exonic, non-synonymous de novo mutations (DNMs) to HSCR
without any a priori selection. Therefore, not only did we perform ‘standard’
exome sequencing analyses, followed by burden tests and in silico prediction, but
we also carried out an unbiased in vivo analysis of the mutated genes in a zebrafish
model.
METHODS
Study samples
Trios
A total of 24 trios (affected child and unaffected parents) with no family history of
HSCR, recruited at 5 different centers, were included for Whole Exome Sequencing
(WES). The patients were all non-syndromic. Five trios were of Chinese origin
whereas 19 were of Caucasian ancestry. We prioritized the more severe and
rarer HSCR cases for this study, namely female patients with long segment or total
colonic aganglionosis. Sixteen of the 24 patients had previously tested negative
for damaging RET variants by traditional technologies. Characteristics of the
patients are presented in Supplementary Table 1. Informed consent was obtained
from all participants.
Case-control
WES data from 28 additional sporadic HSCR patients (singletons), included without
any sub-phenotype restriction, and from 212 controls were used to check gene
recurrence and to assess the gene burden of rare variants (Supplementary Table 1).
Data generation
Whole exome sequencing
DNA samples were sequenced in four centers. The exome-capture kits and sequencing
platforms used per center are detailed in Supplementary Table 2. Appropriate
mapping tools (Burrows-Wheeler Aligner, BWA, for Illumina data and BFAST for
SOLiD data) were used to align sequence reads to the human reference genome
(build 19)5. Sequence quality was re-evaluated using the FastQC toolbox, Picard’s
metric summary and the GATK Depth-of-Coverage module. After initial quality
control (QC), all eligible sequences were pre-processed by local indel realignment,
PCR duplicate removal and base quality recalibration6.
Genome-wide SNP array
To determine copy number variants (CNVs) and regions of homozygosity, DNA was
hybridized to the HumanCytoSNP-12 BeadChip (Illumina, San Diego, CA, USA)
according to standard protocols.
Variant calling and prioritization
Aligned reads from all sequenced samples were pre-processed according to
standard guidelines6. Variant calling was done separately for Illumina reads and
SOLiD reads using the Genome Analysis Toolkit (GATK) UnifiedGenotyper
(version 2.0)7. To avoid mismatched regions across different capture kits, calling
was performed genome-wide rather than being restricted to the target regions of
any capture array. A special setting (allowing potentially miscoded quality scores)
was used to make color-space SOLiD reads compatible with the program (Broad
Institute). Raw variants (including single nucleotide variants and short
insertions/deletions) with individual genotypes and their associated quality scores
were stored in standard VCF format after calling. Quality assessment (QA) and QC
were then applied to several sets of variants (raw variants, exonic variants, rare
variants) to generate a confident variant set for downstream prioritization
(Supplementary Note).
A clean set of variants in exonic regions was produced after variant-level and
genotype-level quality control. Rare coding sequence variants were then
prioritized by filtering out variants with a minor allele frequency >0.01 in any
of the following public databases: dbSNP137, the 1000 Genomes Project and the NHLBI
Exome Sequencing Project. An automatic pipeline integrating GATK, KGGSeq,
Annovar and Plink was used to generate the final set of qualified variants
(Supplementary Figure 1).
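The rare-variant filter described above can be illustrated with a minimal sketch. It is not the pipeline used in this study (which combined GATK, KGGSeq, Annovar and Plink); the record layout, the variant identifiers and the function name `is_rare` are hypothetical, and a missing frequency is assumed to mean that the variant is absent from that database.

```python
from typing import Dict, Optional

MAF_THRESHOLD = 0.01  # cutoff stated in the text

def is_rare(mafs: Dict[str, Optional[float]], threshold: float = MAF_THRESHOLD) -> bool:
    """Keep a variant unless its MAF exceeds the threshold in ANY database.

    A value of None is assumed to mean "not reported in this database".
    """
    return all(maf is None or maf <= threshold for maf in mafs.values())

# Hypothetical variant records (identifiers and frequencies are illustrative only).
variants = [
    {"id": "var_a", "maf": {"dbSNP137": None, "1000G": None, "ESP": None}},
    {"id": "var_b", "maf": {"dbSNP137": 0.00042, "1000G": None, "ESP": None}},
    {"id": "var_c", "maf": {"dbSNP137": 0.002, "1000G": 0.03, "ESP": 0.004}},
]

rare_ids = [v["id"] for v in variants if is_rare(v["maf"])]
print(rare_ids)
```

In this sketch a single database reporting a frequency above 0.01 is enough to discard a variant, which matches the "in any of" wording of the filter.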
Identification of DNM
WES DNM detection
Rare, exonic variants present in the probands but absent in both parents were
considered DNM. To select putative DNM (or de novo variations) the following
criteria were used: 1) minimal coverage of 5 in patients and parents; 2) a minimal
genotype quality score of 10 for both patients and parents; 3) at least 10% of the
reads showed the alternative allele in patients; and 4) not more than 10% of the
reads showed the alternative allele in parents. Subsequently, all remaining DNM
candidates were manually inspected using the Integrative Genomics Viewer (IGV) and
classified into 5 confidence ranks according to their base-calling quality
and strand bias. The first two ranks of DNM candidates were selected for
validation by Sanger sequencing, while the other three ranks were
re-evaluated by a model trained on the variants submitted for Sanger sequencing
(Supplementary Note).
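The four selection criteria for putative DNMs can be written down as a small predicate. This is a sketch under stated assumptions rather than the code used in the study: the `Call` record and the function name `is_putative_dnm` are hypothetical, and the thresholds are the ones listed above (coverage of at least 5, genotype quality of at least 10, at least 10% alternative reads in the patient, not more than 10% in either parent).

```python
from dataclasses import dataclass

@dataclass
class Call:
    depth: int      # total reads covering the site
    alt_reads: int  # reads supporting the alternative allele
    gq: float       # genotype quality score

    @property
    def alt_fraction(self) -> float:
        return self.alt_reads / self.depth if self.depth else 0.0

def is_putative_dnm(child: Call, mother: Call, father: Call,
                    min_depth: int = 5, min_gq: float = 10,
                    child_min_alt: float = 0.10,
                    parent_max_alt: float = 0.10) -> bool:
    """Apply criteria 1-4 from the text to one variant site in one trio."""
    trio = (child, mother, father)
    if any(c.depth < min_depth for c in trio):   # criterion 1: coverage
        return False
    if any(c.gq < min_gq for c in trio):         # criterion 2: genotype quality
        return False
    if child.alt_fraction < child_min_alt:       # criterion 3: patient allele fraction
        return False
    # criterion 4: alternative allele (nearly) absent in both parents
    return all(p.alt_fraction <= parent_max_alt for p in (mother, father))

# Hypothetical read counts for illustration.
candidate = is_putative_dnm(Call(40, 18, 99), Call(35, 0, 99), Call(30, 1, 60))
print(candidate)
```

Sites passing this filter are only candidates; in the workflow described above they were still inspected manually in IGV and validated by Sanger sequencing.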
CHAPTER 2
44
RET gene inspection
To guarantee that no de novo mutations had been missed in the major HSCR gene,
the depth of coverage of each of the 21 exons of RET was manually inspected for
each patient. All exons with a coverage <10 were Sanger sequenced. Mutation
Detector software (Thermo Fisher Scientific) was used to identify rare coding
sequence mutations from raw Sanger sequences; any mutation found in a trio
proband was further checked in his/her parents. Besides rare mutations, bi-allelic
genotypes for the common risk single nucleotide polymorphism (IVS1+9494,
rs2435357T) were extracted from local databases or newly genotyped.
Copy number variation detection
The Nexus® software program (BioDiscovery, El Segundo, CA, USA) was used to
normalize and analyse the SNP array data mentioned above. A loss was defined as
the loss of a minimum of 5 probes in a 150 kb region, with a minimum Log R ratio of
-0.2. A gain was defined as the gain of a minimum of 7 probes in a 200 kb region, with
a minimum Log R ratio of 0.15. The minimum length of regions of homozygosity
analysed was 2 Mb. The identified CNVs were reviewed for pathogenicity using the
UCSC genome browser (http://genome.ucsc.edu), the DGV database
(http://dgv.tcag.ca/dgv/app/home), the Decipher database
(https://decipher.sanger.ac.uk/) and our in-house reference database, which
consists of 250 healthy controls and 250 individuals from the general population.
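The loss and gain definitions can be sketched as a simple classifier. This is an illustration only and does not reproduce the Nexus software: the `Segment` record and `classify_segment` are hypothetical names, and the region sizes (150 kb and 200 kb) are interpreted here as minimum segment lengths, which is an assumption about how the thresholds were applied.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Segment:
    n_probes: int      # number of consecutive probes in the segment
    length_bp: int     # genomic length of the segment
    mean_log_r: float  # mean Log R ratio across the probes

def classify_segment(seg: Segment) -> Optional[str]:
    """Return "loss", "gain" or None using the thresholds given in the text.

    Assumption: 150 kb / 200 kb are treated as minimum segment lengths.
    """
    if seg.n_probes >= 5 and seg.length_bp >= 150_000 and seg.mean_log_r <= -0.2:
        return "loss"
    if seg.n_probes >= 7 and seg.length_bp >= 200_000 and seg.mean_log_r >= 0.15:
        return "gain"
    return None

# Hypothetical segments for illustration.
calls = [classify_segment(s) for s in (
    Segment(6, 200_000, -0.35),
    Segment(8, 250_000, 0.20),
    Segment(6, 250_000, 0.20),   # too few probes for a gain
    Segment(10, 300_000, 0.05),  # Log R ratio within the neutral range
)]
print(calls)
```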
Statistical tests
De novo mutation rate
All proven DNMs were classified into loss-of-function (nonsense Single Nucleotide
Variants (SNVs), frame-shift indels and splicing sites), missense SNVs, in-frame
indels and synonymous SNVs. The counts of DNMs per trio were fitted to a Poisson
distribution with lambda equal to the observed mean. De novo mutation rates were
calculated for these DNM subtypes and compared to those of 677 published healthy trios and
neurodevelopmental disease trios using a binomial test8–13. Given the per-gene
mutation rates reported by Samocha et al.14, statistical over-representation of
mutations in all 24 genes was calculated using Fisher’s exact test.
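As a worked illustration of the Poisson fit, the sketch below sets lambda to the observed mean and compares observed with expected numbers of trios per DNM count. The per-trio counts are read off Table 1 (14 trios carrying between 1 and 7 validated DNMs, and 10 trios without any); the variable names are hypothetical and the goodness-of-fit test itself is not reproduced here.

```python
import math
from collections import Counter

# DNMs per trio: 14 trios with >= 1 DNM (Table 1) plus 10 trios without a DNM.
dnm_counts = [2, 2, 1, 7, 2, 2, 1, 1, 1, 1, 3, 2, 1, 2] + [0] * 10

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing exactly k events under Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = sum(dnm_counts) / len(dnm_counts)  # lambda = observed mean per trio
observed = Counter(dnm_counts)

for k in range(max(dnm_counts) + 1):
    expected = len(dnm_counts) * poisson_pmf(k, lam)
    print(f"{k} DNMs: observed {observed.get(k, 0)} trios, expected {expected:.2f}")
```

With 28 DNMs in 24 trios the fitted lambda is 28/24, which rounds to the 1.2 DNMs per exome per generation reported in the Results.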
Gene-wide burden analysis
Genes with DNM were further scrutinized for the presence of inherited rare
damaging variants in the trios as well as in HSCR singletons for whom WES data
were available. A detailed analytical protocol was shared before running
association in each centre. Briefly, genotypes of rare damaging variants (as
previously defined) in genes carrying ≥1 de novo mutation were extracted from
raw sequencing reads. The CMC test in the Rvtests package was used to collapse multiple
variants into the same gene (boundaries defined using hg19 refGene) and to compare
the overall burden between cases and locally matched controls15. P-values were
estimated from the asymptotic chi-square distribution. Gene-wise p-values, burden
direction and variant count per gene were exported. Finally, a sample-size-weighted
Z-score method was used to conduct a meta-analysis of the gene-wise
summary statistics from the three centres, which used the same protocol.
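A sample-size-weighted Z-score meta-analysis can be sketched as follows. This is a generic formulation with square-root-of-sample-size weights, assumed here because the exact weighting used by the centres is not given in the text; the function names and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def signed_z(p_value: float, direction: int) -> float:
    """Convert a two-sided p-value (0 < p <= 1) and a burden direction into a Z-score.

    direction is +1 for excess burden in cases and -1 for excess in controls.
    """
    return direction * -_STD_NORMAL.inv_cdf(p_value / 2)

def weighted_z_meta(results):
    """Combine (p_value, direction, sample_size) tuples from several centres.

    Assumption: each centre is weighted by the square root of its sample size.
    """
    numerator = sum(math.sqrt(n) * signed_z(p, d) for p, d, n in results)
    denominator = math.sqrt(sum(n for _, _, n in results))
    z = numerator / denominator
    p = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
    return z, p

# Hypothetical gene-wise results from three centres: (p-value, direction, N).
z_meta, p_meta = weighted_z_meta([(0.04, +1, 120), (0.20, +1, 80), (0.60, -1, 60)])
print(f"Z = {z_meta:.3f}, p = {p_meta:.4f}")
```

Because the Z-scores are signed, centres reporting burden in opposite directions partly cancel, which is why the burden direction was exported alongside the gene-wise p-value.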
Bioinformatics analysis
Variant-level implication
The impact of each DNM on the gene carrying it was predicted using several
bioinformatics tools or databases. The conservation of missense SNVs was
predicted using GERP and PhyloP across 29 different species. The deleteriousness
of missense or nonsense SNVs was determined by a logit model incorporating 5
prediction programs (Polyphen2, Sift, MutationTaster, PhyloP and Likelihood
ratio)16. Human Splicing Finder was used to predict whether DNMs causing
synonymous changes or located at splice sites (exon +/- 2 bp) created or
disrupted splice sites17. To further assess the possible effect of synonymous
DNMs on transcription, RNAmute was used to predict the change in RNA substructure
caused by the corresponding mutation18. Finally, ClinVar and PubMed were
searched for the same or similar mutations in the same gene that are present in
healthy controls or in patients with other diseases.
Gene-level implication
The evidence for gene-level implication was collected in two ways. First,
the 24 genes carrying DNMs were searched against databases (ATGU’s
Server) for occurrences in patients with other diseases or in healthy samples14. Second, ENS
candidate genes/gene-sets (Supplementary Table 8; Supplementary Note) were
linked to the newly identified genes using pathway or PPI network information.
Disease Association Protein-Protein Link Evaluator (DAPPLE) was used to test
whether the genes carrying DNMs in our study are functionally connected to each
other. The significance of the observed pathway enrichment and network connectivity
was evaluated empirically using randomly selected genes of the same
genomic size as the identified DNM genes. InWeb and Ingenuity Pathway Analysis
were used to detect direct and indirect protein interactions between ENS-related
genes and genes with DNMs.
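The empirical evaluation against size-matched random genes can be sketched as below. This is a hedged illustration, not the procedure implemented in DAPPLE or in our pipeline: the function names, the 10% size tolerance, the gene labels and the add-one correction in the p-value are all assumptions made for the example.

```python
import random
from typing import Dict, Optional, Sequence

def empirical_p(observed: float, null_scores: Sequence[float]) -> float:
    """Fraction of null scores at least as extreme as the observed score.

    The +1 in numerator and denominator (an assumption) avoids a p-value of 0.
    """
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)

def sample_size_matched(gene: str, gene_sizes: Dict[str, int],
                        rng: random.Random, tolerance: float = 0.10) -> Optional[str]:
    """Pick a random gene whose genomic size is within `tolerance` of `gene`."""
    target = gene_sizes[gene]
    pool = [g for g, size in gene_sizes.items()
            if g != gene and abs(size - target) <= tolerance * target]
    return rng.choice(pool) if pool else None

# Hypothetical gene sizes (bp) for illustration.
sizes = {"GENE_A": 50_000, "GENE_B": 52_000, "GENE_C": 48_500, "GENE_D": 500_000}
rng = random.Random(0)
matched = sample_size_matched("GENE_A", sizes, rng)
print(matched, empirical_p(5.0, [1.0, 2.0, 3.0, 4.0]))
```

In practice the null scores would come from recomputing the enrichment or connectivity statistic on many size-matched random gene sets.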
Gene expression in ENS
To test the involvement of the newly identified genes in enteric nervous
system development, in-house expression data were shared from parallel
projects at the Hong Kong and Rotterdam centres. The first expression dataset was from
RNA sequencing of iPSC-derived enteric neural crest cells (ENCCs) from an HSCR
patient; the second and third expression datasets were from microarrays on
embryonic mouse gut and ENCCs.
Zebrafish
Tg(-8.3bphox2b:Kaede) transgenic zebrafish (Danio rerio) embryos were obtained
from natural spawning. Maintenance of zebrafish and culture of embryos were
carried out as described previously. Embryos were staged by days post-
fertilization (dpf) at 28.5°C.
Gene knockdown by antisense morpholino
Antisense morpholinos (MO) (Gene Tools LLC) targeting the zebrafish orthologues
of the candidate genes, by blocking either translation or splicing, were
microinjected into 1- to 4-cell-stage Tg(-8.3bphox2b:Kaede) transgenic zebrafish
embryos as previously described19. For candidate genes that are duplicated in the
zebrafish genome, morpholinos targeting all paralogs were co-injected. A standard
control morpholino and 5-nucleotide mismatch control morpholinos for ckap2l,
dennd3a, dennd3b, ncl1, nup98 and tbata were used as negative controls. Embryos
were raised to 5 dpf, analysed and imaged under a stereo fluorescence microscope
(Leica MZ16FA and DFC300FX). An HSCR-like phenotype was defined as the
absence of enteric neurons in the distal intestine in 5 dpf embryos. Sequences and
dosages of all morpholinos used are listed in Supplementary Table 9.
Expression analysis
To confirm that the target genes were successfully knocked down, total RNA was
extracted from 1 dpf embryos (n=50) injected with the splice-blocking morpholino
using RNA Bee (Amsbio), and cDNA was reverse transcribed using the iScript cDNA
Synthesis Kit (Bio-Rad). qPCR was performed using the KAPA SYBR® Fast qPCR Kit
(KAPA Biosystems; see Supplementary Table 10 for primer details) and the
expression of the target gene was normalized to the mean expression of two
housekeeping genes (elfa and actb). The expression of the target gene in the
splice-blocking morpholino-injected embryos relative to the control morpholino-injected
embryos was determined by the Livak method20.
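The Livak (2^-ΔΔCt) calculation can be illustrated with a short sketch. The Ct values are hypothetical, the function name is ours, and averaging the Ct values of the two housekeeping genes is an assumption about how the normalization was carried out.

```python
from statistics import mean
from typing import Sequence

def relative_expression(target_ct_mo: float, ref_cts_mo: Sequence[float],
                        target_ct_ctrl: float, ref_cts_ctrl: Sequence[float]) -> float:
    """Relative expression (morpholino vs control) by the 2^-ddCt method.

    Reference Ct values (here two housekeeping genes) are averaged per sample.
    """
    delta_mo = target_ct_mo - mean(ref_cts_mo)        # dCt, morpholino-injected
    delta_ctrl = target_ct_ctrl - mean(ref_cts_ctrl)  # dCt, control-injected
    return 2 ** -(delta_mo - delta_ctrl)

# Hypothetical Ct values: target gene, then the two housekeeping genes.
fold = relative_expression(27.0, [18.0, 20.0], 25.0, [18.0, 20.0])
print(fold)
```

A value below 1 indicates reduced transcript levels in the morpholino-injected embryos; in this example a ΔΔCt of 2 cycles corresponds to a four-fold reduction.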
To determine the temporal expression of the zebrafish orthologues, RT-PCR
was performed at various time points with primers that amplify a
segment of the open reading frame of each gene. To determine the spatial
expression patterns of dennd3a, dennd3b, ncl1, nup98 and tbata, antisense
digoxigenin-labeled probes for these genes were generated and whole-mount in
situ hybridization was performed as described by Thisse et al.21.
RESULTS
Identification of de novo mutations
We performed whole-exome sequencing (WES) on 24 trios composed of a sporadic
non-syndromic HSCR patient and the unaffected parents (72 individuals;
Supplementary Table 1) and focused on de novo variants. Sporadic female cases
with long-segment (LS) HSCR were overrepresented, as the load of de novo rare
coding variants is presumed to be the highest in this group. The depth of coverage of
the targeted sequences ranged from 18X to 74X (average 46X), and the targeted
exome covered by at least 10 sequence reads ranged from 65% to 98% (average
88%). Sequencing metrics after standard analytical pipeline (Supplementary
Figure 1) were in normal ranges (Supplementary Note; see Supplementary Table 2
and Supplementary Figure 2 for detail).
All de novo variations were carefully selected, validated and/or
statistically predicted (Methods and Supplementary Note; see prediction result in
Supplementary Table 3). After Sanger sequencing validation, a total of 28 DNMs in
14 patients were identified (Table 1). The overall DNM rate per individual was 1.2
per exome per generation (Poisson distribution with λ=1.2; Kolmogorov-Smirnov
test, p=0.893; Supplementary Figure 3) which is in accordance with the expected
mutation rate in the general population. Several studies have shown that the DNM
rates are similar between patients and healthy controls, but found that patients
have a significantly higher fraction of loss of function (LOF) DNMs8,9. Indeed, in our
HSCR patient cohort, the rate of LOF DNMs (N=8, including
nonsense, frameshift and splice site changes) is significantly higher than that of
healthy trios (p=0.011) or unaffected siblings of neuropsychiatric patients
(p=0.001) from multiple published studies8,10–12,22 (Supplementary Table 4). The
28 DNMs were localised in 21 genes. Eight DNMs were found in RET, the major HSCR
gene23. Among the DNMs in RET was the Cys620Arg variant, known to cause both
HSCR and Multiple Endocrine Neoplasia type 2A24. In this study, the observed rate
for RET DNMs (0.33 per trio) was significantly higher (binomial test, p<2×10^-16)
than that modelled for RET DNMs in the general population (0.000133 per trio)
according to Samocha et al.14.

Table 1. De novo mutations in Hirschsprung disease probands

Trio  Phenotype  Gene     De novo mutation                    Type            Deleterious*  MAF (dbSNP137 / ESP6500)%
1     L, F       RET      3splicing9+1                        splicing        -             N / N
                 RBM25    c.474C>T:p.L158L                    synonymous      -             N / N
2     L, F       RET      c.2511_2519delCCCTGGACC:p.S837fs    frameshift      -             N / N
                 COL6A3   c.3327C>T:p.H1109H                  synonymous      -             0.00042 (rs114845780) / N
3     L, F       RET      c.1818_1819insGGCAC:p.Y606fs        frameshift      -             N / N
4     L, F       DAB2IP   c.2339C>T:p.T780M#                  missense        No            N / N
                 ISG20L2  c.961G>A:p.G321R                    missense        Yes           N / N
                 MED26    c.675C>T:p.A225A                    synonymous      -             N / N
                 NCLN     c.496C>T:p.Q166X#                   nonsense        -             N / N
                 NUP98    c.5207A>G:p.N1736S                  missense        Yes           N / N
                 VEZF1    c.584C>T:p.S195F                    missense        Yes           N / N
                 ZNF57    c.570C>T:p.D190D                    synonymous      -             N / N
5     L, F       RET      c.1761delG:p.G588fs                 frameshift      -             N / N
                 SCUBE3   c.1493A>T:p.N498I                   missense        No            N / N
6     L, M       AFF3     c.1975G>C:p.V659L                   missense        No            N / N
                 PLEKHG5  c.2628G>T:p.T876T                   synonymous      -             N / N
7     L, M       KDM4A    c.26A>G:p.N9S                       missense        No            N / N
8     L, M       MAP4     c.3351C>T:p.G1117G                  synonymous      -             N / N
9     L, F       RET      c.1858T>C:p.C620R                   missense        Yes           0 (rs77316810) / N
10    TCA, M     CKAP2L   c.555_556delAA:p.E186fs             frameshift      -             N / 0.00002
11    L, F       RET      c.409T>G:p.C137G                    missense        Yes           N / N
                 HMCN1    c.10366G>A:p.A3456T                 missense        No            N / N
                 TUBG1    c.699T>C:p.S233S                    synonymous      -             N / N
12    L, F       CCR2     c.848T>A:p.L283Q                    missense        Yes           N / N
                 DENND3   c.1921delT:p.K640fs                 frameshift      -             N / N
13    L, F       RET      c.1710C>A:p.C570X                   nonsense        -             N / N
14    L, F       RET      c.526_528delGCA:p.R175del           non-frameshift  -             N / N
                 TBATA    c.157C>T:p.R53C                     missense        No            N / N

F: Female; M: Male; L: Long-segment HSCR; TCA: Total Colonic Aganglionosis; *: Disease-causal
prediction by KGGSeq57, a software that uses a weighted logistic regression to combine multiple
prediction scores; #: mosaic mutation; Dark grey: de novo RET mutations; Light grey: genes giving an
HSCR-like phenotype in zebrafish; %: minor allele frequency in dbSNP137 or ESP database, with ‘N’
standing for no data available.
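The comparison of the observed RET DNM rate with the modelled rate can be reproduced approximately with a one-sided binomial tail probability. This is a sketch under the assumption that each of the 24 trios is an independent trial with success probability 0.000133; it is not necessarily the exact test configuration used in the study, and the function name is hypothetical.

```python
from math import comb

def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_trios = 24              # trios sequenced
ret_dnms = 8              # trios carrying a RET DNM (Table 1)
expected_rate = 0.000133  # modelled RET DNM rate per trio (Samocha et al.)

p_value = binomial_upper_tail(ret_dnms, n_trios, expected_rate)
print(f"observed rate: {ret_dnms / n_trios:.2f} per trio")
print(f"P(X >= {ret_dnms}) = {p_value:.3g}")
```

The resulting tail probability lies far below 2×10^-16, in line with the significance level reported for RET.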
One of the patients analysed carried a total of 7 DNMs, two of which (in
NCLN and DAB2IP) were mosaic mutations (Supplementary Figure 4). This finding
is in line with a recent report stating that 6.5% of all DNMs are in fact mosaic and
occur post-zygotically25. Within the 24 patients we looked for inherited rare damaging
variants in the 21 genes that carried DNMs (Supplementary Note, Methods).
Inherited damaging mutations were found in RET, HMCN1, PLEKHG5, MAP4,
SCUBE3, and KDM4A (Supplementary Table 5). Neither de novo nor inherited copy
number variants (CNVs) were detected in any of the trios.
Mutation profile of HSCR patients
In general, disease-associated common variants confer disease liability on
individuals in the general population. These common variants, in combination with
environmental factors and/or rare variants, finally result in manifestation of the disease.
Thus, since both rare and common variants jointly contribute to HSCR we carefully
examined the genetic profile of our patients to assess the genetic background on
which the DNMs reside. Each patient was investigated for the presence or absence
of the common HSCR-associated RET allele (IVS1+9494, rs2435357T)26–29 as well
as for the presence of rare variants (inherited from unaffected parents) in a set of
116 pre-selected genes known to be involved in ENS development (Supplementary
Notes; Supplementary Tables 3 and 6).
The mutation profile for all patients is shown in Supplementary Table 5.
We observe that 29% of the patients with ≥ 1 DNM and 60% of the patients
without any DNM carry the common RET risk genotype TT (rs2435357T).
Moreover, patients with DNM carry on average 1.4 inherited rare damaging
variants in ENS genes, compared to an average of 2.4 in patients without any
DNM. Notably, six out of the 14 patients carried DNMs without co-occurrence of
a RET coding sequence mutation. Although the differences are not statistically
significant, these observations suggest that the new genes identified may,
independently of the genetic background, play a role in the pathology of the
disorder, and prompted us to further investigate those genes using in silico and in
vivo approaches.
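To illustrate why a difference such as 29% versus 60% need not be significant in a cohort of this size, the sketch below runs a two-sided Fisher's exact test on counts back-calculated from the reported percentages (4 of 14 patients with a DNM and 6 of 10 patients without a DNM carrying the TT genotype). Both the back-calculated counts and the choice of test are assumptions made for this example; the text does not state which test was applied.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

# Rows: patients with >= 1 DNM, patients without DNM; columns: TT, not TT.
p_tt = fisher_exact_two_sided(4, 10, 6, 4)
print(f"p = {p_tt:.3f}")
```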
Determining pathogenicity of the DNMs in silico
The recurrence of a mutation or the identification of a recurrently mutated gene in
an independent group of patients or unrelated controls can provide corroborating
evidence of pathogenicity or neutrality30. Therefore, all the genes in which we
identified DNMs were checked against public databases (ATGU’s Gene-Mutation-
Constraint Server) for DNM recurrence. Only one missense DNM (different from
that identified in this study) in MAP4 was found in a patient with autism spectrum
disorder (ASD). A few genes (SCUBE3, RBM25 and TUBG1; Table 2) were identified as
evolutionarily constrained genes in which functional variants are more likely to be
deleterious14.
To establish whether genes with DNMs carry significantly more rare
variants in HSCR patients than in controls, we used the WES data from the 20
eligible HSCR trio-probands, 28 additional HSCR patients and 212 control
individuals to calculate the variation burden per gene (Methods). Nine of the
twenty-one genes (RET, KDM4A, HMCN1, MAP4, NUP98, AFF3, COL6A3, CCR2, and
CKAP2L) were found recurrently mutated in multiple HSCR patients with different
rare damaging mutation sites (Supplementary Table 7). Meta-analysis of our gene
burden tests showed that RET and CKAP2L were enriched for rare damaging
variants in the HSCR patients (nominal p<0.05; Table 2 and Supplementary Table
7). However, cross-checking of these 21 genes in a parallel HSCR exome
study (190 cases and 740 controls) revealed that only RET was significantly
enriched for deleterious variants (p < 0.001; manuscript in preparation,
A. Chakravarti).
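The nominal significance filter on the burden test meta-analysis can be made explicit with a few lines of code. The p-values are copied from Table 2; the variable names are hypothetical and no multiple-testing correction is applied, in keeping with the "nominal" threshold used in the text.

```python
# Burden test meta-analysis p-values per gene, as listed in Table 2.
burden_p = {
    "PLEKHG5": 0.3997, "KDM4A": 0.1190, "ISG20L2": 0.4949, "HMCN1": 0.9789,
    "AFF3": 0.4745, "CKAP2L": 0.0178, "COL6A3": 0.6398, "CCR2": 0.4745,
    "MAP4": 0.4851, "SCUBE3": 0.7133, "DENND3": 0.5977, "DAB2IP": 0.9819,
    "RET": 0.0078, "TBATA": 0.8028, "NUP98": 0.7243, "RBM25": 0.0846,
    "TUBG1": 1.0000, "VEZF1": 0.6717, "ZNF57": 0.3808, "NCLN": 1.0000,
    "MED26": 1.0000,
}

NOMINAL_ALPHA = 0.05

# Genes passing the nominal threshold, most significant first.
nominal_hits = sorted((g for g, p in burden_p.items() if p < NOMINAL_ALPHA),
                      key=burden_p.get)
print(nominal_hits)
```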
The possible impact of DNMs on gene function was explored using
bioinformatic prediction tools (Methods). Besides the 8 LOF mutations, 6 out of
12 missense mutations were consistently predicted to be deleterious (Table 1). As
for the seven synonymous DNMs, we found no in silico evidence indicating that
those changes interfered with splicing and/or significantly changed the RNA
structure (Supplementary Table 8).
We next checked whether the genes with DNMs are functionally related to
each other and/or to the signalling networks known to govern ENS development.
ISG20L2 and MAP4 showed more indirect interactions with other genes carrying
DNMs than expected by chance (p=0.0063 and p=0.0167 respectively) as predicted
by DAPPLE, though no direct in silico interactions were found among those 21
genes. A list of 116 known ENS related genes (Supplementary Table 6) was used to
study the functional link between genes with DNMs (other than RET) and the ENS.
Only a single interaction was identified in the InWeb protein interaction catalogue
(COL6A3 interacts with ITGB1). Using Ingenuity Pathway Analysis, we identified
additional direct and indirect relationships with ENS-related genes for MAP4,
COL6A3, RBM25 and TUBG1 (Supplementary Figure 5). All genes carrying DNMs
were either expressed in human iPSC-derived enteric neuron precursors or in
primary murine enteric neuron precursors (Table 2).
Table 2. Genes carrying de novo mutations

Gene      Number of     Co-occurrence   Burden test     Zebrafish ENS  Gut expression (human;
          amino acids   with RET DNM    meta-analyses   phenotype      mouse; zebrafish)#
                                        (p-value)
PLEKHG5   1062          No              0.3997          NT             Yes; Yes; -
KDM4A     1064          No              0.1190          No             Yes; Yes; -
ISG20L2   353           No              0.4949          No             Yes; Yes; -
HMCN1     5635          Yes             0.9789          No             Yes; Yes; -
AFF3      1226          No              0.4745          No             Yes; Yes; -
CKAP2L    745           No              0.0178          No             Yes; Yes; -
COL6A3    3177          Yes             0.6398          NT             Yes; Yes; -
CCR2      374           No              0.4745          No             Yes; Yes; -
MAP4      1152          No              0.4851          NT             Yes; No; -
SCUBE3*   993           Yes             0.7133          No             Yes; Yes; -
DENND3    1198          No              0.5977          Yes            Yes; Yes; Yes
DAB2IP    1189          No              0.9819          No             Yes; Yes; -
RET       1114          -               0.0078          Yes            Yes; Yes; -
TBATA     351           Yes             0.8028          Yes            No; Yes; Yes
NUP98     1817          No              0.7243          Yes            Yes; Yes; Yes
RBM25*    843           Yes             0.0846          NT             Yes; Yes; -
TUBG1*    451           Yes             1.0000          NT             Yes; Yes; -
VEZF1     521           No              0.6717          No             Yes; Yes; -
ZNF57     555           No              0.3808          NT             Yes; No; -
NCLN      563           No              1.0000          Yes            Yes; Yes; Yes
MED26     600           No              1.0000          NT             Yes; Yes; -

*genes evolutionarily constrained as per Samocha et al. 2014; NT: not tested (gene carries synonymous
mutation and/or has no orthologue in zebrafish); #data from in-house hIPSC-derived neural crest,
mouse expression data, and RT-PCR in zebrafish (tested only for the 4 novel genes).
Determining pathogenicity of the DNMs in vivo
As no proof of functional effects for any of the synonymous DNMs was found, we
further focused on the 13 genes (other than RET) that have a LOF or missense
mutation. Because none of these 13 genes were obvious candidates for HSCR we
used the zebrafish model system to further investigate the function of these genes
in ENS development. Previous studies have shown that morpholino-mediated
knockdown of orthologues of known HSCR genes results in an HSCR-like
phenotype in zebrafish4,31–35. With the exception of CCR2, all 13 genes with nonsynonymous
DNMs have zebrafish orthologues. Splice-blocking morpholinos (SBMOs) were
designed to knock down the orthologues of these 12 genes (Methods). The SBMOs
were injected into Tg(-8.3bphox2b:Kaede) transgenic zebrafish19 embryos that
express the fluorescent protein Kaede in enteric neuron precursors and
differentiated enteric neurons. Initially, knockdown of 5 orthologues (ckap2l,
dennd3a and dennd3b, ncl1, nup98 and tbata) resulted in an HSCR-like phenotype, as
enteric neurons were absent in the distal intestine of 5 dpf embryos, while embryos
injected with 5-nucleotide mismatch control morpholinos had normal ENS
development with enteric neurons present along the entire length of the intestine. We
then co-injected the SBMOs with p53 morpholinos to verify that the phenotype did not
result from non-specific, p53-induced apoptosis. Co-injection of p53 morpholino
with dennd3a and dennd3b, ncl1, nup98 or tbata SBMOs resulted in the same
phenotype (Figure 1), indicating that the phenotype was not caused by non-specific
apoptosis. In contrast, the phenotype could not be reproduced when the ckap2l SBMO
was co-injected with p53 morpholino (Figure 1). To further demonstrate that the absence of
enteric neurons was specific to the knockdown of the orthologues, we repeated the
experiment by injecting translation-blocking morpholinos (TBMOs) against
dennd3a, dennd3b, ncl1, nup98 and tbata, and the phenotype was reproduced (data
not shown). We therefore concluded that knockdown of the DENND3, NCLN,
NUP98 and TBATA orthologues disrupted ENS development and caused an HSCR-like
phenotype in vivo.
To confirm the knockdown effect of the SBMOs, qPCR was performed to compare
the expression of the target genes between SBMO-injected and control
morpholino-injected embryos. Expression of dennd3a, dennd3b, nup98 and tbata
was markedly reduced in the SBMO-injected embryos (Supplementary Figure 6).
Intriguingly, there was no significant reduction in ncl1 expression in the ncl1
SBMO-injected embryos.
We therefore investigated this further by performing RT-PCR on individual
embryos and found that there was a large variation in ncl1 expression between
embryos injected with the SBMO, with some of them showing a clear reduction in
ncl1 transcript level (Supplementary Figure 7). Of the zebrafish orthologues that
did not show a specific HSCR-like phenotype after SBMO injection, all
showed a significant reduction in expression except for aff3, scube3 and
vezf1a (Supplementary Figure 6).
In addition we performed RT-PCR and whole mount in situ hybridization
(WISH) experiments to determine if the gene expression patterns of the zebrafish
orthologues were consistent with a predicted role in ENS development. Temporal
analysis using RT-PCR revealed that zebrafish orthologues of DENND3, NCLN and
NUP98 were maternally and zygotically expressed from 0-120 hpf, while the TBATA
orthologue was only zygotically expressed from 24-120 hpf (Supplementary Figure 8).
Figure 1. Pathogenicity analysis in vivo by morpholino gene knockdown in zebrafish. Knockdown
of ncl1, dennd3, nup98 and tbata resulted in an HSCR-like phenotype in which Kaede-expressing enteric
neurons were absent in the distal intestine at 5 dpf, and the results were reproduced in the presence of
p53 morpholino. Aganglionosis observed in ckap2l knockdown was caused by non-specific apoptosis, as
the result was not reproducible with p53 morpholino co-injection. The number of embryos with the phenotype
out of the total number of embryos observed is shown. Dotted lines outline the intestines. Asterisks
indicate the position of the anus. Arrows indicate the position where the aganglionic region begins.
WISH analysis showed that the orthologues of all 4 genes were expressed in
distinct spatial locations, specifically in the intestine and the anterior CNS,
from 24-96 hpf (Figure 2).
DISCUSSION
Over the last years a large number of papers have been published on de novo
mutation screening in human diseases. This has resulted in the identification of
many new disease-associated genes. Genes are considered truly disease-causing
when at least two unrelated patients are found with a mutation in the same gene.
This works well for diseases that are relatively homogeneous or for which many
patients can be investigated. For the more heterogeneous rare diseases for which
only small cohorts are available this poses a problem. Often, possible
disease-causing genes are found in a single patient. How does one decide whether such a finding is of
importance? Expression of the gene in the relevant tissues can be considered as
additional evidence, as can network analysis. However, making strong statements
about private disease genes is, and will remain, extremely difficult. This also results in a bias
towards genes in the known disease-causing gene networks. Genes not fitting
current knowledge are often discarded as uninteresting. In the current study we
wanted to take this one step further.
Therefore, we decided that the best way to obtain sound evidence for
involvement of new candidate genes in HSCR should come from functional
analysis. We opted for an in vivo approach using the zebrafish model system. We
knocked down the expression of zebrafish orthologues of 12 of the 13 genes in
which loss of function or missense DNMs were identified in a transgenic reporter
zebrafish line (Tg(-8.3bphox2b:Kaede)). The orthologues of 9 of the 12 genes were
successfully knocked down by morpholinos, and we found that perturbing 4 of these
genes resulted in loss of neurons in the distal gut, as
in HSCR patients. It is noteworthy that the SBMOs targeting 3 of the
orthologues (aff3, scube3 and vezf1a) did not knock down the target transcripts as
expected, which highlights a limitation of morpholinos that might lead to
false-negative results36. To bypass this limitation, other loss-of-function approaches,
such as CRISPR/Cas9 knockout37, should be considered to further study these genes.
Finding 4 genes that, when knocked down in zebrafish, give a hindgut
phenotype resembling that of the human patients in which the DNMs were found clearly
demonstrates that genes that would never have been followed up based on the
usual gene selection criteria should not be ignored.
Based on bioinformatic prediction and statistics alone, we would have focused
on RET and CKAP2L only, as they were significantly enriched for rare variants in the
HSCR patients (nominal p<0.05; Table 2).
We wondered whether any or all of these 4 genes can be linked to the ENS
or whether they play relevant roles in neuronal development or neural crest
derived cell types in general. In fact by studying these genes in more depth we
noticed that all 4, despite lack of obvious connection to the known ENS pathways,
are involved in the development of the CNS or the neural crest, making these not as
random as they might first appear.
DENN/MADD Domain Containing 3 (DENND3) is a guanine nucleotide
exchange factor (GEF) that is involved in intracellular trafficking by activation of
the small GTPase RAB1238. In zebrafish, Rab12 and other Rab GTPases are highly
expressed by pre-migratory neural crest cells and their expression is dysregulated
in Ovo1 morphant zebrafish that display altered migration of neural crest cells39.
Independently of RAB12, DENND3 also regulates Akt activity, which is involved in
the proliferation and survival of enteric neural crest cells38,40.
Figure 2. Temporal and spatial expression patterns of zebrafish orthologues. Embryos after whole-mount in situ
hybridization with antisense riboprobes for dennd3a, dennd3b, ncl1, nup98 and tbata
at the indicated developmental stages. All columns show lateral views. Anterior CNS expression is
apparent at all stages for all probes, while intestinal expression for all probes is apparent from 48 hpf
onwards.
Nicalin (NCLN) is a key component of a protein complex that antagonizes
Nodal signalling41. In vertebrates, Nodal signalling is involved in induction of the
mesoderm and endoderm42. In contrast, inhibition of Nodal signalling is required
for the specification of human embryonic stem cells into neuroectoderm, including
the neural crest43,44. The antagonizing function of Nicalin on Nodal signalling is
therefore consistent with the neural crest specification that is required for ENS
development.
The NUP98 gene encodes a precursor protein that is autoproteolytically
cleaved to produce two proteins: NUP98 from the N-terminus and NUP96 from the
C-terminus45,46. A missense DNM was identified in the last exon of the NUP98 gene
and therefore affects the NUP96 protein. As in humans, zebrafish Nup96 is
produced by cleavage of the Nup98 precursor protein. Since morpholinos act at the
mRNA level, both nup98 and nup96 were targeted in our zebrafish experiments. It
is therefore unclear whether the observed aganglionosis is caused by loss of
Nup98 or Nup96. NUP96 is one of approximately 30 proteins in the nuclear pore
complex (NPC)47 and its expression level regulates the rate of proliferation48. Two
other members of the NPC (Nup133 and Nup210) are involved in neural
differentiation in mice49,50. Moreover, NUP96 interacts with NUP98 and NUP98 is
involved in the transcriptional regulation of the HSCR genes SEMA3A, DSCAM,
NRG1 and the NRG1 receptor ERBB4 in human neural progenitor cells51. Therefore,
it is likely that loss of either NUP protein (NUP96 or NUP98) could contribute to
HSCR development.
The mouse orthologue of Thymus, Brain And Testes Associated (TBATA) is
called Spatial and is highly expressed during differentiation of several tissues52.
These include the cerebellum, hippocampus and Purkinje cells in the brain, where
TBATA/Spatial is expressed in early differentiating neurons53. In mouse
hippocampal neurons, TBATA/Spatial is required for neurite outgrowth and
dendrite patterning54.
The 4 newly identified candidate genes for HSCR all seem to play a role in
neuronal development and could potentially be involved in HSCR (Figure 3). This
also suggests a clear link between CNS and ENS development. This is not
surprising as a number of studies have described the strong correlation between
Down syndrome and syndromic HSCR and several known HSCR genes (e.g. KBP,
SOX10, NRG1, IKBKAP, ZEB2, PHOX2B) have been reported to be involved in both
CNS and ENS pathologies2,55–57. In humans, SOX10 mutations cause myelin
deficiencies and sensory neuropathies as well as the neurological variant of
Waardenburg-Shah syndrome which includes HSCR in the phenotypic spectrum.
Likewise, NRG1 is associated with schizophrenia and Nrg1 mutations in mice cause
peripheral sensory neuropathies46. IKBKAP mutations are associated with the
Riley-Day syndrome or familial dysautonomia (FD)58,59. Notably, some patients
with FD also suffer from gastrointestinal dysfunction shortly after birth and
interestingly, the co-occurrence of both FD and HSCR has been reported60. In
addition, knockdown of ikbkap in zebrafish also generates a HSCR-like
phenotype35. Further, KBP mutations are associated with Goldberg-Shprintzen
syndrome61 (MIM 609460), a rare autosomal recessive syndrome in which
patients present with HSCR, microcephaly, polymicrogyria and moderate mental
retardation.
Figure 3. Newly identified genes in ENS development. All symbols represent proteins encoded by
known Hirschsprung genes or novel genes identified in this study. The effect of the NUP98 gene is shown by
the NUP96 protein. The interactions between different proteins are illustrated by four different lines
representing binding, secretion/expression, phosphorylation and activation. ENCC, enteric neural crest cell.
Besides the fact that several HSCR/neuromuscular genes are known to be
associated with CNS defects, the opposite is also described. Many neurological and
psychiatric disorders are associated with constipation, and sometimes defects in
the ENS are reported62. For instance, it has recently been described that mutations
in CDH8 result in a specific subtype of autism in combination with gastrointestinal
problems. A cdh8-/- zebrafish recapitulates the human phenotype, including
increased head size (expansion of the forebrain/midbrain), an impairment of
gastrointestinal motility and a reduction in post-mitotic enteric neurons63. Besides,
a search of CNS and autism in Phenolyzer64 returned two genes (APP and MECP2)
that have been implicated in ENS development65,66.
Thus, given all of the above, and the fact that HSCR occurs with
neurological disorders more often than would be expected by chance, it is not
surprising that dysfunction of these newly identified neurologically related genes
results in dysregulation of the neural crest-derived cells that form the ENS, and
hence in HSCR. These data are further corroborated by the expression patterns we
observed for the orthologues of these 4 genes in zebrafish embryos (Figure 2),
with all 4 having clear expression in both the brain and the gut.
Finding a niche for these genes in ENS development will help to open new
avenues of research which, eventually, will enhance our knowledge about ENS
development and HSCR disease mechanisms. Until now, we believed that the
number of cellular processes involved in the development of HSCR was limited.
Clearly this idea needs to be revisited as the novel genes we identified are not
directly linked to any of the currently known HSCR gene networks. In spite of the
plethora of databases and prediction tools available, very little is known about the
intricate ways in which genes interact in the development of the ENS, or the
function of many genes.
URLS
Genome analysis toolkit (GATK) (https://www.broadinstitute.org/gatk/);
ANNOVAR (http://annovar.openbioinformatics.org/en/latest/);
PLINK (http://pngu.mgh.harvard.edu/~purcell/plink/);
KGGSeq (http://statgenpro.psychiatry.hku.hk/limx/kggseq/);
ATGU’s Server (http://atgu.mgh.harvard.edu/webtools/gene-lookup/);
DAPPLE (http://www.broadinstitute.org/mpg/dapple/dappleTMP.php);
ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/)
ACKNOWLEDGMENTS
The authors would like to thank the patients and families involved in this study. DS,
AB, RC, BE, YS, RH are supported by research grants from ZonMW (TOP-subsidie
40-00812-98-10042 to BJLE/RMWH) and the Maag Lever Darm stichting (WO09-
62 to RMWH); AC is supported by NIH grant R37 HD28088; HG, WC, VL, EN, PS, MS,
CT, PT, MMG-B are supported by the Health and Medical Research Fund (HMRF
01121326 to VCHL; HMRF 02131866 to MMGB; and 01121476 to ESWN); General
Research Fund (HKU 777612M to PKT); HKU seed funding for basic research
(201110159001 to PKT) and small project funding (201309176158 to CSMT). The
work described was also partially supported by a grant from the Research Grant
Council of the Hong Kong Special Administrative Region, China, Project No. [T12C-
714/14R] to PKT; CH, PG are supported by grants from the Italian Ministry of
Health through “Cinque per mille” and Ricerca Corrente to the Gaslini Institute; GA,
MB, BL-T, MR-F, SB are supported by the Spanish Ministry of Economy and
Competitiveness (Institute of Health Carlos III (ISCIII), PI13/01560 and CDTI,
FEDER-Innterconecta EXP00052887/ITC-20111037), and the Regional Ministry of
Innovation, Science and Enterprise of the Autonomous Government
of Andalusia (CTS-7447); NINDS (5R21NS082546) awarded to ITS.
AUTHOR CONTRIBUTIONS
H.G. and D.S. performed the exome sequencing analyses and wrote the manuscript.
W.C. together with A.J.B., R.C., V.L., B.J. and I.T.S. performed the zebrafish
experiments and prepared the figures for the manuscript. Y.S. and C.S.T. conducted
the CNV analyses. Sanger sequencing validation was performed by P.G., I.M., A.P.,
M.T.S., M.R.F., B.L-T. and D.S. Statistical support was provided by H.G., M.B.,
R.W.W.B., T.L., S.C., P.S. and A.C. Expression data were obtained and analyzed by Y.S.
and E.S.W.N. Bioinformatics support was provided by S.C., P.S., M.v.d.H., W.v.IJ. and
J.B.G.M.V. A.S.B., C.B., P.T., J.A., S.L., R.H., B.E., M.M.G.B., G.A., S.B. and I.C. were
involved in patient recruitment and clinical aspects of the study. P.T., J.A., S.L., R.H.,
B.E., M.M.G.B., S.B., I.C. and A.C. conceived and designed the project. All authors
contributed to writing and editing.
COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.
REFERENCES
1. Amiel, J. et al. Hirschsprung disease, associated syndromes and genetics: a review. J. Med. Genet. 45, 1–14 (2008).
2. Garcia-Barcelo, M.-M. et al. Genome-wide association study identifies NRG1 as a susceptibility locus for Hirschsprung’s disease. Proc. Natl. Acad. Sci. U. S. A. 106, 2694–9 (2009).
3. Alves, M. M. et al. Contribution of rare and common variants determine complex diseases-Hirschsprung disease as a model. Dev. Biol. 382, 320–9 (2013).
4. Jiang, Q. et al. Functional loss of semaphorin 3C and/or semaphorin 3D and their epistatic interaction with ret are critical to Hirschsprung disease liability. Am. J. Hum. Genet. 96, 581–96 (2015).
5. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
6. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
7. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–303 (2010).
8. Sanders, S. J. et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485, 237–241 (2012).
9. Iossifov, I. et al. De Novo Gene Disruptions in Children on the Autistic Spectrum. Neuron 74, 285–299 (2012).
10. Xu, B. et al. Exome sequencing supports a de novo mutational paradigm for schizophrenia. Nat. Genet. 43, 864–8 (2011).
11. O’Roak, B. J. et al. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature 485, 246–250 (2012).
12. Rauch, A. et al. Range of genetic mutations associated with severe non-syndromic sporadic intellectual disability: An exome sequencing study. Lancet 380, 1674–1682 (2012).
13. Gulsuner, S. et al. Spatial and temporal mapping of de novo mutations in schizophrenia to a fetal prefrontal cortical network. Cell 154, 518–529 (2013).
14. Samocha, K. E. et al. A framework for the interpretation of de novo mutation in human disease. Nat. Genet. 46, 944–950 (2014).
15. Li, B. & Leal, S. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am. J. Hum. Genet. 83, 311–321 (2008).
16. Li, M. X. et al. Predicting Mendelian Disease-Causing Non-Synonymous Single Nucleotide Variants in Exome Sequencing Studies. PLoS Genet. 9, (2013).
17. Desmet, F. O. et al. Human Splicing Finder: An online bioinformatics tool to predict splicing signals. Nucleic Acids Res. 37, (2009).
18. Churkin, A. & Barash, D. RNAmute: RNA secondary structure mutation analysis tool. BMC Bioinformatics 7, 221 (2006).
19. Harrison, C., Wabbersen, T. & Shepherd, I. T. In vivo visualization of the development of the enteric nervous system using a Tg(-8.3bphox2b:Kaede) transgenic zebrafish. Genesis 52, 985–90 (2014).
20. Livak, K. J. & Schmittgen, T. D. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods 25, 402–8 (2001).
21. Thisse, C., Thisse, B., Schilling, T. F. & Postlethwait, J. H. Structure of the zebrafish snail1 gene and its expression in wild-type, spadetail and no tail mutant embryos. Development 119, 1203–15 (1993).
22. Iossifov, I. et al. The contribution of de novo coding mutations to autism spectrum disorder. Nature 515, 216–221 (2014).
23. Heanue, T. A. & Pachnis, V. Enteric nervous system development and Hirschsprung’s disease: advances in genetic and stem cell studies. Nat. Rev. Neurosci. 8, 466–79 (2007).
24. Romeo, G. et al. Association of multiple endocrine neoplasia type 2 and Hirschsprung disease. J. Intern. Med. 243, 515–520 (1998).
25. Acuna-Hidalgo, R. et al. Post-zygotic Point Mutations Are an Underrecognized Source of De Novo Genomic Variation. Am. J. Hum. Genet. 97, 67–74 (2015).
26. Emison, E. S. et al. A common sex-dependent mutation in a RET enhancer underlies Hirschsprung disease risk. Nature 434, 857–63 (2005).
27. Emison, E. S. et al. Differential contributions of rare and common, coding and noncoding Ret mutations to multifactorial Hirschsprung disease liability. Am. J. Hum. Genet. 87, 60–74 (2010).
28. Burzynski, G. M. et al. Identifying candidate Hirschsprung disease-associated RET variants. Am. J. Hum. Genet. 76, 850–858 (2005).
29. Sribudiani, Y. et al. Variants in RET associated with Hirschsprung’s disease affect binding of transcription factors and gene expression. Gastroenterology 140, 572–582 (2011).
30. Do, R., Kathiresan, S. & Abecasis, G. R. Exome sequencing and complex disease: Practical aspects of rare variant association studies. Hum. Mol. Genet. 21, (2012).
31. Shepherd, I. T., Pietsch, J., Elworthy, S., Kelsh, R. N. & Raible, D. W. Roles for GFR alpha 1 receptors in zebrafish enteric nervous system development. Development 131, 241–249 (2004).
32. Shepherd, I. T., Beattie, C. E. & Raible, D. W. Functional analysis of zebrafish GDNF. Dev. Biol. 231, 420–435 (2001).
33. Elworthy, S., Pinto, J. P., Pettifer, A., Cancela, M. L. & Kelsh, R. N. Phox2b function in the enteric nervous system is conserved in zebrafish and is sox10-dependent. Mech. Dev. 122, 659–669 (2005).
34. Dutton, K., Dutton, J. R., Pauliny, A. & Kelsh, R. N. A morpholino phenocopy of the colourless mutant. Genesis 30, 188–9 (2001).
35. Cheng, W. W.-C. et al. Depletion of the IKBKAP ortholog in zebrafish leads to Hirschsprung disease-like phenotype. World J. Gastroenterol. 21, 2040–6 (2015).
36. Bedell, V. M., Westcot, S. E. & Ekker, S. C. Lessons from morpholino-based screening in zebrafish. Brief. Funct. Genomics 10, 181–188 (2011).
37. Sander, J. D. & Joung, J. K. CRISPR-Cas systems for editing, regulating and targeting genomes. Nat. Biotechnol. 32, 347–55 (2014).
38. Matsui, T., Noguchi, K. & Fukuda, M. Dennd3 Functions as a Guanine Nucleotide Exchange Factor for Small GTPase Rab12 in Mouse Embryonic Fibroblasts. J. Biol. Chem. 289, 13986–95 (2014).
39. Piloto, S. & Schilling, T. F. Ovo1 links Wnt signaling with N-cadherin localization during neural crest migration. Development 137, 1981–1990 (2010).
40. Srinivasan, S., Anitha, M., Mwangi, S. & Heuckeroth, R. O. Enteric neuroblasts require the phosphatidylinositol 3-kinase/Akt/Forkhead pathway for GDNF-stimulated survival. Mol. Cell. Neurosci. 29, 107–19 (2005).
41. Haffner, C. et al. Nicalin and its binding partner Nomo are novel Nodal signaling antagonists. EMBO J. 23, 3041–3050 (2004).
42. Schier, A. F. Nodal signaling in vertebrate development. Annu. Rev. Cell Dev. Biol. 19, 589–621 (2003).
43. Smith, J. R. et al. Inhibition of Activin/Nodal signaling promotes specification of human embryonic stem cells into neuroectoderm. Dev. Biol. 313, 107–117 (2008).
44. Chambers, S. M. et al. Highly efficient neural conversion of human ES and iPS cells by dual inhibition of SMAD signaling. Nat. Biotechnol. 27, 275–280 (2009).
45. Fontoura, B. M. A., Blobel, G. & Matunis, M. J. A conserved biogenesis pathway for nucleoporins: Proteolytic processing of a 186-kilodalton precursor generates Nup98 and the novel nucleoporin, Nup96. J. Cell Biol. 144, 1097–1112 (1999).
46. Rosenblum, J. S. & Blobel, G. Autoproteolysis in nucleoporin biogenesis. Proc Natl Acad Sci U S A 96, 11370–11375 (1999).
47. Tran, E. J. & Wente, S. R. Dynamic Nuclear Pore Complexes: Life on the Edge. Cell 125, 1041–1053 (2006).
48. Chakraborty, P. et al. Nucleoporin Levels Regulate Cell Cycle Progression and Phase-Specific Gene Expression. Dev. Cell 15, 657–667 (2008).
49. Lupu, F., Alves, A., Anderson, K., Doye, V. & Lacy, E. Nuclear Pore Composition Regulates Neural Stem/Progenitor Cell Differentiation in the Mouse Embryo. Dev. Cell 14, 831–842 (2008).
50. D’Angelo, M. A., Gomez-Cavazos, J. S., Mei, A., Lackner, D. H. & Hetzer, M. W. A Change in Nuclear Pore Complex Composition Regulates Cell Differentiation. Dev. Cell 22, 446–458 (2012).
51. Liang, Y., Franks, T. M., Marchetto, M. C., Gage, F. H. & Hetzer, M. W. Dynamic Association of NUP98 with the Human Genome. PLoS Genet. 9, (2013).
52. Irla, M. et al. Genomic organization and the tissue distribution of alternatively spliced isoforms of the mouse Spatial gene. BMC Genomics 5, 41 (2004).
53. Irla, M. et al. Neuronal distribution of Spatial in the developing cerebellum and hippocampus and its somatodendritic association with the kinesin motor KIF17. Exp. Cell Res. 313, 4107–4119 (2007).
54. Yammine, M., Saade, M., Chauvet, S. & Nguyen, C. Spatial gene’s (Tbata) implication in neurite outgrowth and dendrite patterning in hippocampal neurons. Mol. Cell. Neurosci. 59, 1–9 (2014).
55. Tang, C. S. et al. Fine mapping of the 9q31 Hirschsprung’s disease locus. Hum. Genet. 127, 675–683 (2010).
56. Pingault, V. et al. Peripheral neuropathy with chronic intestinal pseudo-obstruction and deafness: A developmental ‘neural crest syndrome’ related to a SOX10 mutation. Ann. Neurol. 48, 671–676 (2000).
57. Harrison, P. J. & Law, A. J. Neuregulin 1 and Schizophrenia: Genetics, Gene Expression, and Neurobiology. Biological Psychiatry 60, 132–140 (2006).
58. Anderson, S. L. et al. Familial dysautonomia is caused by mutations of the IKAP gene. Am. J. Hum. Genet. 68, 753–8 (2001).
59. Slaugenhaupt, S. A. et al. Tissue-specific expression of a splicing mutation in the IKBKAP gene causes familial dysautonomia. Am. J. Hum. Genet. 68, 598–605 (2001).
60. Azizi, E., Berlowitz, I., Vinograd, I., Reif, R. & Mundel, G. Congenital megacolon associated with familial dysautonomia. Eur. J. Pediatr. 142, 68–9 (1984).
61. Brooks, A. S. et al. A consanguineous family with Hirschsprung disease, microcephaly, and mental retardation (Goldberg-Shprintzen syndrome). J. Med. Genet. 36, 485–9 (1999).
62. Winge, K., Rasmussen, D. & Werdelin, L. Constipation in neurological diseases. J. Neurol. Neurosurg. Psychiatry 74, 13–19 (2003).
63. Bernier, R. et al. Disruptive CHD8 mutations define a subtype of autism early in development. Cell 158, 263–276 (2014).
64. Yang, H., Robinson, P. N. & Wang, K. Phenolyzer: phenotype-based prioritization of candidate genes for human diseases. Nat. Methods 1–6 (2015). doi:10.1038/nmeth.3484
65. Van Ginneken, C., Schäfer, K. H., Van Dam, D., Huygelen, V. & De Deyn, P. P. Morphological changes in the enteric nervous system of aging and APP23 transgenic mice. Brain Res. 1378, 43–53 (2011).
66. Wahba, G. et al. MeCP2 in the enteric nervous system. Neurogastroenterol. Motil. 27, 1156–61 (2015).
SUPPLEMENTARY NOTES
Quality assessment and control for exome variants
Specific criteria in quality assessment (QA) included: total number of variants;
dbSNP137 coverage; Transition/Transversion (Ti/Tv) ratio; genotype
concordance rate and cross-sample identity-by-descent (IBD) relatedness1. Two
complementary steps were applied in quality control (QC): variant-level
filtering (hard filtering or variant quality score recalibration (VQSR)) and
genotype-level filtering. In detail, we annotated GATK-called variants as low quality SNPs
("QD <2.0" or "MQ <40.0" or "FS >60.0" or "HaplotypeScore >13.0" or
"MQRankSum <-12.5" or "ReadPosRankSum <-8.0" in their "info" field) and low
quality Indels ("QD <2.0" or "ReadPosRankSum <-20.0" or "InbreedingCoeff <-0.8"
or "FS >200.0" in their "info" field); in addition, VQSR differentiated a few relatively low
quality SNVs (labeled as "TruthSensitivityTranche99.90to100.00" after Gaussian
mixture modeling at truth sensitivity 99%) from other passed SNVs. On the other
hand, individual genotypes were evaluated by quality parameters in the genotype
field, mainly reflecting the likelihood of three possible genotypes (reference
homozygous, heterozygous and alternative homozygous). A heterozygous
genotype was kept only if it was supported by >4 total reads and the ratio for the
alternative allele was above 0.25. Likewise, a reference or alternative
homozygous genotype was accepted only if it was supported by >4 total reads and the
ratio for the reference or alternative allele was above 0.95.
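The filtering rules in this paragraph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the thresholds are those quoted in the text, while the function names, the dictionary layout of the annotations and the simplification that total reads equal reference plus alternative reads are hypothetical.

```python
# Hard-filter thresholds quoted in the text; names and data layout are hypothetical.
SNP_FAIL = {
    "QD": lambda v: v < 2.0,
    "MQ": lambda v: v < 40.0,
    "FS": lambda v: v > 60.0,
    "HaplotypeScore": lambda v: v > 13.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}
INDEL_FAIL = {
    "QD": lambda v: v < 2.0,
    "ReadPosRankSum": lambda v: v < -20.0,
    "InbreedingCoeff": lambda v: v < -0.8,
    "FS": lambda v: v > 200.0,
}


def is_low_quality(info, is_indel):
    """Return True if any annotation present in `info` fails its hard filter."""
    rules = INDEL_FAIL if is_indel else SNP_FAIL
    return any(key in info and fails(info[key]) for key, fails in rules.items())


def keep_genotype(genotype, ref_reads, alt_reads):
    """Genotype-level filter; `genotype` is 'hom_ref', 'het' or 'hom_alt'.

    Assumption: total depth is approximated as ref_reads + alt_reads.
    """
    total = ref_reads + alt_reads
    if total <= 4:  # the text requires more than 4 supporting reads
        return False
    if genotype == "het":
        return alt_reads / total > 0.25
    if genotype == "hom_alt":
        return alt_reads / total > 0.95
    if genotype == "hom_ref":
        return ref_reads / total > 0.95
    raise ValueError(f"unknown genotype: {genotype}")
```

For example, `is_low_quality({"FS": 100.0}, is_indel=False)` flags a SNP, whereas the same strand-bias value passes the more lenient indel threshold.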
Supplementary Table 2 shows the details of quality statistics for samples
from different sequencing centers at variant level. The total count of SNVs
(20–30K) or Indels (1–2K), Transition/Transversion (Ti/Tv) ratio (above 3.0),
dbSNP137 coverage (above 95%) and GWAS genotype concordance (>99%) were all
within the normal range. No trio violated relatedness checking; likewise, no batch effects
or close relatedness (pi-hat coefficient > 0.125, i.e. first cousins or closer) were found
among the HSCR patients from different centers (Supplementary Figure 2). Together,
these quality metrics showed that data quality at exonic regions was
comparably good for trios from different platforms or resources, and justified
our unbiased search for de novo mutations in the following stages.
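Two of the metrics named above are simple enough to illustrate directly. The sketch below shows how a Ti/Tv ratio is obtained from a list of single-nucleotide changes and how the pi-hat cut-off of 0.125 would be applied; the function names are hypothetical and the input is assumed to contain only true substitutions (reference base differs from alternative base).

```python
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snvs):
    """Compute the transition/transversion ratio from (ref, alt) base pairs."""
    ti = tv = 0
    for ref, alt in snvs:
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    return ti / tv if tv else float("inf")


def closely_related(pi_hat, threshold=0.125):
    """Flag a sample pair whose pi-hat exceeds the cut-off quoted in the text."""
    return pi_hat > threshold
```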
Mutation validation and prediction
Each DNM candidate was manually inspected using the Integrative Genomics
Viewer (IGV) and categorized into five different groups: probably true
positive, possibly true positive, unclear, possibly false positive and probably false
positive. Two lists of putative DNM candidates were generated for confirmation by
Sanger sequencing. The first list contains 74 variants with high confidence ranking
(probably true positive and possibly true positive). Raw data were then re-
evaluated to generate 48 candidates with relatively low-confidence (unclear),
especially for those trios without any confirmed DNM in the first round. Rare
(minor allele frequency < 0.01 in public databases) predicted damaging variants in
genes carrying confirmed de novo mutations were extracted from exome calls and
submitted for Sanger validation. The allele origin was determined by checking the
mutation site in both parents. Phasing of DNM and inherited variants in the same
gene was also performed by Sanger sequencing. Rare damaging inherited variants
located in 116 ENS candidate genes were extracted from exome reads using the
same pipeline (Supplementary Figure 1); and the transmission patterns of these
variants were determined by referring to paternal and maternal genotypes at the
same site.
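The rarity filter and the parent-of-origin assignment described above can be sketched as follows. This is an illustrative reading rather than the study's code: the function names are hypothetical, a variant absent from a database is assumed to count as rare, and a variant carried by both parents or with a missing parental genotype is assumed to be labelled "U" (unsure).

```python
from typing import Mapping, Optional


def is_rare(maf_by_database: Mapping[str, Optional[float]],
            threshold: float = 0.01) -> bool:
    """True if the minor allele frequency is below `threshold` in every database.

    Assumption: None means the variant is not reported and is treated as rare.
    """
    return all(maf is None or maf < threshold
               for maf in maf_by_database.values())


def transmission(father_carries: Optional[bool],
                 mother_carries: Optional[bool]) -> str:
    """Classify a proband variant from the parental genotypes at the same site."""
    if father_carries is None or mother_carries is None:
        return "U"          # missing parental genotype (assumed handling)
    if father_carries and mother_carries:
        return "U"          # origin cannot be resolved (assumed handling)
    if father_carries:
        return "P"          # paternal
    if mother_carries:
        return "M"          # maternal
    return "de novo"        # absent in both parents
```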
Stepwise logistic regression was used to select effective predictors of the
de novo status in a trio and for the presence or absence of a mutation in a given
individual. The performance of these prediction models was evaluated using 10-
fold cross validation by the software WEKA. For model fitting to DNM status in the
trios, genotype quality (represented by normalized phred likelihood score for the
second most likely genotype) in the child and alternative allelic ratio in the parents
were prioritized. The Area Under the Receiver Operating Characteristic Curve
(AUC) was 0.959 (Supplementary Table 3) which suggests that the model predicts
the DNM status accurately. This model was then adopted to test all other
unvalidated de novo candidates (falling under the “unclear”, “possibly false
positive” or “probably false positive” categories), which all turned out to be
negatives. For model fitting to the presence or absence of a variant in the patients,
genotype quality and alternative allelic ratio in each individual were retained. The
AUC was 0.824 (Supplementary Table 3). This second model was then used to help
predict the presence of rare variants in the DNM genes or ENS genes. Only those
variants predicted as positive candidates were shown (Supplementary Table 5).
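For readers unfamiliar with the AUC reported here, the following sketch shows one standard way to compute it: as the probability that a randomly chosen true variant receives a higher model score than a randomly chosen false one. The study obtained its values from WEKA; this pure-Python function and the toy scores in the usage note are illustrative assumptions only.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve from scores of positive and negative examples.

    Counts the fraction of (positive, negative) pairs ranked correctly,
    with ties contributing one half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

With hypothetical scores `auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])`, 11 of the 12 pairs are ranked correctly, giving an AUC of 11/12.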
Generation of ENS candidate genes
Candidate genes were selected by a literature review on Hirschsprung disease
research, which included both genetic and functional studies. Most of them were
also covered in Jiang et al.2 and Gui et al.3, which previously summarized possible
genes related to HSCR or involved in ENS development. The genes were
categorized into 4 major types, genes selected based on: genetic linkage, genetic
association, microarray expression, and animal models. In total 116 genes were
selected that fit at least 1 category (Supplementary Table 6). A few of these
genes fall into the same pathways previously implicated in neural crest cell
migration, proliferation and differentiation. Three pathways (RET signaling
pathway, EDNRB signaling pathway and KBP signaling pathway) are key players
in ENS development4.
SUPPLEMENTARY TABLES
Supplementary Table 1. Information on samples included in the study.
HSCR patients (N=52) comprise trios (N=24) and singletons (N=28).
Sex | Trios, Short (N=1) | Trios, Long/TCA (N=23) | Singletons, Short (N=15) | Singletons, Long/TCA (N=13) | Controls (N=212)
Males | 0 | 7 (4) | 13 | 4 | 117
Females | 1 (0) | 16 (10) | 2 | 9 | 95
Trios were used to detect de novo mutations in coding sequences. Case/control samples were exome
sequenced by the same protocol in each cohort, and used to calculate gene-level burden for all genes
carrying a de novo mutation. ( ): number of patients with validated DNM. TCA: total colonic
aganglionosis.
Supplementary Table 2. Quality metrics for sequencing reads and variants from different
cohorts.
Centre | # of trios | Capture array | Target region | Sequencer | Mean coverage | >10X | SNVs/indels per patient* | Ti/Tv# | Concordance rate% | dbSNP v137 coverage | RV per patient$
HK | 5 | Illumina Truseq | 62.3 M | Illumina GAII | 27.9 X | 74% | 10475 / 234 | 3.52 | NA | 99.17% | 228
NL | 10 | Agilent SS V4 | 51.4 M | Illumina HiSeq2000 | 53.8 X | 95% | 13603 / 342 | 3.34 | 99.10% | 99.29% | 340
FR | 5 | Agilent SS V4 | 51.4 M | Illumina HiSeq2000 | 51.8 X | 92% | 12432 / 287 | 3.42 | NA | 99.51% | 234
SP | 4 | NimbleGen V2 | 36.5 M | ABISolid4 | 47.4 X | 82% | 10502 / 530 | 3.59 | NA | 95.50% | 713
The table shows comparable read depth and % of targeted exonic bases on the intersected exonic regions
(~30 Mb) for different cohorts; in addition, variant-level metrics are also comparable at exonic regions
(Ti/Tv ratio, SNP/Indel counts, dbSNP137 coverage). *: SNVs passing variant quality recalibration
filtering were counted; #: only SNVs in exonic regions were used to estimate Ti/Tv ratio; %:
concordance between GWAS array and exome data; NA: data not available; $: RV, rare variants with
minor allele frequency < 0.01 in dbSNP137, 1000 Genomes 2012 and ESP 6500 databases; SS: Sure Select.
Supplementary Table 3. Statistical models for mutation prediction.
Model | Classifier* | Confusion matrix (row 1 / row 2) | Sensitivity | Specificity | Precision | F-Measure | AUC (10-fold CV)#
DNM status in trios | 2ndPL_patient + FA_parents | 93 3 / 8 18 | 0.692 | 0.969 | 0.857 | 0.766 | 0.959
Variant presence/absence in patients | 2ndPL_patient + FA_patient | 68 10 / 22 52 | 0.703 | 0.872 | 0.839 | 0.765 | 0.824
Two models were trained by stepwise logistic regression on sequencing quality metrics and then used
to predict the de novo mutation status in a trio or the variant presence/absence status in exome
individuals. Training data were from true or false variants validated by Sanger sequencing, as shown in
the confusion matrix. *: 2ndPL_patient means “second minimum phred-scaled likelihood (PL) score” in the
trio proband; FA_parents means maximum “fractions of reads (FA) supporting each reported
alternative allele” from two parents. 2ndPL_patient, FA_patient means PL or FA value for given patient.
#: Area under curve (AUC) calculated from 10-fold cross-validation. Confusion matrix, F-measure and
AUC were acquired from WEKA output.
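The sensitivity, specificity, precision and F-measure in Supplementary Table 3 follow from the confusion matrices by the usual definitions. The sketch below recomputes them; it assumes that the first row of each matrix holds the Sanger-negative variants (true negatives, false positives) and the second row the Sanger-positive variants (false negatives, true positives), a layout inferred from the reported values rather than stated in the table. The function name is hypothetical.

```python
def classifier_metrics(tn, fp, fn, tp):
    """Derive standard performance metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on true variants
    specificity = tn / (tn + fp)   # recall on false variants
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Counts taken from Supplementary Table 3 under the assumed layout.
dnm_model = classifier_metrics(tn=93, fp=3, fn=8, tp=18)
variant_model = classifier_metrics(tn=68, fp=10, fn=22, tp=52)
```

Under this layout both rows of the table are reproduced to three decimals, which supports the assumed orientation of the matrices.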
Supplementary Table 4. Comparison of de novo mutation rates
Mutation type | A: HSCR-trios (N=24), count (rate) | B: Healthy-trios* (N=54), count (rate) | p-value, A vs. B | C: Unaffected siblings (N=677)&, count (rate) | p-value, A vs. C
All DNMs | 28 (1.17) | 44 (0.81) | 0.159 | 547 (0.81) | 0.065
LOF DNMs | 8 (0.33) | 4 (0.07) | 0.011# | 54 (0.08) | 0.001##
Non-RET LOF DNMs | 3 (0.13) | 4 (0.07) | 0.447 | 54 (0.08) | 0.447
Synonymous DNMs | 7 (0.29) | 12 (0.22) | 0.62 | 143 (0.21) | 0.365
DNM rates by category (All, LOF only, non-RET LOF, synonymous) were compared
between HSCR trios included in this study and published healthy trios or unaffected siblings of
patients with neurodevelopmental diseases. *: Data from Rauch (2012) and Xu (2012); &: data from Iossifov (2012),
O’Roak (2012), Sanders (2012) and Gulsuner (2013); #: nominally significant at 0.05; ##: significant
after Bonferroni correction.
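The rates in Supplementary Table 4 are counts divided by the number of trios, and the two significance labels can be illustrated with a simple Bonferroni threshold. In the sketch below the counts are taken from the table, whereas the number of tests (eight, i.e. four mutation categories times two comparisons) is an assumption, since the table does not state how many tests were corrected for. The statistical test that produced the p-values is not reproduced here.

```python
N_HSCR, N_HEALTHY, N_SIBLINGS = 24, 54, 677

# (HSCR trios, healthy trios, unaffected siblings) counts from Supplementary Table 4.
COUNTS = {
    "All DNMs": (28, 44, 547),
    "LOF DNMs": (8, 4, 54),
    "Non-RET LOF DNMs": (3, 4, 54),
    "Synonymous DNMs": (7, 12, 143),
}


def dnm_rates(category):
    """Per-trio DNM rates for the three cohorts."""
    hscr, healthy, siblings = COUNTS[category]
    return hscr / N_HSCR, healthy / N_HEALTHY, siblings / N_SIBLINGS


def significance(p_value, n_tests=8, alpha=0.05):
    """Label a p-value; n_tests=8 is an assumed number of comparisons."""
    if p_value < alpha / n_tests:
        return "bonferroni"
    if p_value < alpha:
        return "nominal"
    return "ns"
```

Under this assumption the corrected threshold is 0.05/8 = 0.00625, so p = 0.011 is only nominally significant while p = 0.001 survives correction, matching the # and ## labels in the table.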
Supplementary Table 5. Joint distribution of common and rare variants for each trio proband.
Phenotype* | RET rs2435357: T/C# | De novo mutations% | Inherited mutations in genes in which de novo mutations were found$ | Inherited mutations in 116 ENS/HSCR candidate genes&
L, F | CC | RET: 3splicing9+1 (splicing site); RBM25: L158L (synonymous) | RET: L56M (missense) (P) | SMO (P), KIAA1279 (M)
L, F | CC | COL6A3: H1109H (synonymous); RET: S837fs (frameshift) | - | DCC (P)
L, F | TC | RET: Y606fs (frameshift) | - | SON (M)
L, F | CT | DAB2IP: H1132Y (missense); NUP98: N1662S (missense); VEZF1: S195F (missense); ZNF57: D190D (synonymous); ISG20L2: G321R (missense); MED26: A225A (synonymous); NCLN: Q166* (stopgain) | NUP98: I1609T (missense) (M) | IKBKAP (P), SOX10 (M)
L, F | CC | SCUBE3: N498I (missense); RET: G588fs (frameshift) | PLEKHG5: E800fs (frameshift) (U) | NOTCH3 (M)
L, M | TT | PLEKHG5: T876T (synonymous); AFF3: V659L (missense) | - | -
L, M | TT | KDM4A: N9S (missense) | MAP4: A882G (missense) (M) | ECE1 (P), JAG1 (P)
L, M | CT | MAP4: G1117G (synonymous) | - | PCDHA1 (P), DCC (M), NOTCH3 (P)
L, F | TT | RET: C620R (missense) | - | -
TCA, M | TT | CKAP2L: E186fs (frameshift) | - | CBR1 (M)
L, F | C/T | HMCN1: A3456T (missense); RET: C137G (missense); TUBG1: S233S (synonymous) | - | -
L, F | C/T | CCR2: L283Q (missense); DENND3: K640fs (frameshift) | - | IKBKAP (P), JAG1 (M)
L, F | C/T | RET: C570* (stopgain) | - | ECE1 (P)
L, F | C/T | RET: R175del (non-frameshift); TBATA: R53C (missense) | - | NOTCH1 (P), PFKL (P)
L, F | CC | - | HMCN1: P1269T (missense) (P) | IKBKAP (P), EDNRB (P), JAG1 (M)
L, F | CC | - | HMCN1: N2461S (missense) (M) | PHACTR4 (P), GLI3 (M), SHH (M), HMX3 (M), NAV2 (M), PRPH (P), PSPN (P)
L, M | TT | - | - | IHH (P), PFKL (P, M)
S, F | TT | - | - | JAG1 (P)
TCA, M | TT | - | - | ELAVL4 (P), SERPINI1 (U), PTCH1 (U), IKBKAP (M)
TCA, M | TT | - | - | JAG1 (M)
L, F | TT | - | - | PLXNB1 (P)
L, F | TT | - | SCUBE3: R907C (missense) (P) | NRG1 (M), IFNGR2 (P)
L, F | TC | - | - | TAGLN3 (P)
L, F | TC | - | DAB2IP: A338T (missense) (P); KDM4A: V988M (M) | SON (P)
Common risk SNP (RET rs2435357), DNMs, inherited damaging variants in genes carrying DNMs and rare damaging variants in ENS candidate genes were tabulated for each HSCR patient. DNMs and inherited variants in DNM genes were confirmed by Sanger sequencing. Rare damaging variants in ENS candidate genes were all predicted as true according to training model 2 (see Supplementary Table 3). *: L: Long segment aganglionosis; S: Short segment aganglionosis; TCA: Total colonic aganglionosis; F: Female; M: Male. #: rs2435357, T is risk allele and minor allele. %: genes functionally validated in bold; $: parent of origin for mutation in candidate genes, P for paternal, M for maternal, U for unsure; &: 116 ENS-related HSCR candidate genes (as listed in Supplementary Table 6).
Supplementary Table 6. Characteristics of 116 ENS-related HSCR candidate genes.
Gene | Gene name | Chromosome | Evidence | Ref
ALDH1A2 | aldehyde dehydrogenase 1 family, member A2 | 15q22.1 | Mouse (Absence EN) | 5
ARHGEF3 | Rho guanine nucleotide exchange factor (GEF) 3 | 3p14.3 | Expression | 6,7
ARTN | artemin | 1p34.1 | Mouse (Abnormal ENS morphology) | 8
ASCL1 | achaete-scute complex homolog 1 (Drosophila) | 12q23.2 | Mouse (Absence EN)/Expression | 7,9–11
CADM1 | cell adhesion molecule 1 | 11q23.2 | Expression | 7,12
CARTPT | CART prepropeptide | 5q13.2 | Expression | 7
CBR1 | carbonyl reductase 1 | 21q22.13 | Expression | 13
CDH2 | cadherin 2, type 1, N-cadherin (neuronal) | 18q11.2 | Expression | 7,14
CRMP1 | collapsin response mediator protein 1 | 4p16.1 | Expression | 7,15
CSTB | cystatin B (stefin B) | 21q22.3 | Expression | 16
CTNNAL1 | catenin (cadherin-associated protein), alpha-like 1 | 9q31.3 | Expression | 7
DCC | deleted in colorectal carcinoma | 18q21.2 | Mouse (Absence submucosal ganglia) | 17
DCX | doublecortin | Xq22.3-q23 | Expression | 7
DLL1 | delta-like 1 (Drosophila) | 6q27 | Not described | 10,11
DLL3 | delta-like 3 (Drosophila) | 19q13.2 | Not described | 10,11
DLX1 | distal-less homeobox 1 | 2q32 | Expression | 7,18,19
DPYSL3 | dihydropyrimidinase-like 3 | 5q32 | Expression | 7,20
EBF3 | early B-cell factor 3 | 10q26.3 | Expression | 7,21
ECE1 | endothelin converting enzyme 1 | 1p36 | Human (Linkage)/Mouse (Absence EN) | 17,22,23
EDN3 | endothelin 3 | 20q13 | Human (Linkage)/Mouse (Absence EN) | 17,22,24,25
EDNRB | endothelin receptor type B | 13q22 | Human (Linkage/CNV)/Mouse (Absence EN) | 17,22,26,27
ELAVL2 | ELAV (embryonic lethal, abnormal vision, Drosophila)-like 2 (Hu antigen B) | 9p21 | Expression | 7,28
ELAVL4 | ELAV (embryonic lethal, abnormal vision, Drosophila)-like 4 (Hu antigen D) | 1p34 | Expression | 7,29
ERBB2 | v-erb-b2 erythroblastic leukemia viral oncogene homolog 2, neuro/glioblastoma derived oncogene homolog (avian) | 17q12 | Mouse (Abnormal ENS morphology) | 30,31
ERBB3 | v-erb-b2 erythroblastic leukemia viral oncogene homolog 3 (avian) | 12q13.2 | Mouse (Abnormal ENS morphology) | 30,31
ERBB4 | v-erb-a erythroblastic leukemia viral oncogene homolog 4 (avian) | 2q33.3-q34 | Human (CNV) | 32
ETV1 | ets variant 1 | 7p21.3 | Expression | 7,33
FGF13 | fibroblast growth factor 13 | Xq26.3 | Expression | 7,34
GAP43 | growth associated protein 43 | 3q13.1-q13.2 | Expression | 7,35
GDNF | glial cell derived neurotrophic factor | 5p13 | Human (Linkage)/Mouse (Absence EN)/Expression | 17,22,36–39
GFRA1 | GDNF family receptor alpha 1 | 10q25 | Human (1 patient)/Mouse (Absence EN)/Expression | 7,8,38,40
GFRA2 | similar to GDNF family receptor alpha 2; GDNF family receptor alpha 2 | 8p21.3 | Mouse (Abnormal ENS morphology) | 8
GFRA3 | GDNF family receptor alpha 3 | 5q11.2 | Mouse (Abnormal sympathetic system) | 8
GFRA4 | GDNF family receptor alpha 4 | 20p13 | Not described | 8
GLI1 | GLI family zinc finger 1 | 12q13.3 | Mouse (Abnormal intestinal morphology) | 41,42
GLI2 | GLI family zinc finger 2 | 2q14 | Mouse (Abnormal intestinal morphology) | 41,42
GLI3 | GLI family zinc finger 3 | 7p14 | Mouse (Abnormal intestinal morphology) | 41,42
GNG2 | guanine nucleotide binding protein (G protein), gamma 2 | 14q21 | Expression | 7
GNG3 | guanine nucleotide binding protein (G protein), gamma 3 | 11p11 | Expression | 7,43
GRB10 | growth factor receptor-bound protein 10 | 7p12.2 | Human (Linkage) | 38
HES1 | hairy and enhancer of split 1, (Drosophila) | 3q29 | Mouse (Abnormal intestinal morphology) | 41,42
HLX | H2.0-like homeobox | 1q41 | Mouse (Hypoaganglionosis) | 17
HMP19 | HMP19 protein | 5q35.2 | Expression | 7
HMX3 | H6 family homeobox 3 | 10q26.13 | Expression | 7,44
HOXB5 | homeobox B5 | 17q21.3 | Expression | 7,45
HOXD4 | homeobox D4 | 2q31.1 | Expression | 7,46
IFNGR2 | interferon gamma receptor 2 (interferon gamma transducer 1) | 21q22.11 | Expression | 47
IHH | Indian hedgehog homolog (Drosophila) | 2q35 | Mouse (Absence EN) | 41,42
IKBKAP | inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase complex-associated protein | 9q31.3 | Human (Co-Expression) | 48
IL10RB | interleukin 10 receptor, beta | 21q22.11 | Expression | 49
ITGB1 | integrin, beta 1 (fibronectin receptor, beta polypeptide, antigen CD29 includes MDF2, MSK12) | 10p11.22 | Mouse (Absence EN) | 17
JAG1 | jagged 1 (Alagille syndrome) | 20p12.1 | Not described | 10,11
JAG2 | jagged 2 | 14q32.33 | Not described | 10,11
KIAA1279 | KIAA1279 | 10q21 | Human (Linkage/GSM syndrome) | 50
KLF4 | Kruppel-like factor 4 (gut) | 9q31 | Mouse (Abnormal intestinal morphology) | 41,42
L1CAM | L1 cell adhesion molecule | Xq28 | Human (Hydrocephalus)/Mouse (Delayed NCC differentiation)/Expression | 7,51–53
MAB21L1 | mab-21-like 1 (C. elegans) | 13q13 | Expression | 7,54
MAPK10 | mitogen-activated protein kinase 10 | 4q22.1-q23 | Expression | 7,55
MAPT | microtubule-associated protein tau | 17q21.1 | Expression | 7
MLLT11 | myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 11 | 1q21 | Expression | 7,56
NAV2 | neuron navigator 2 | 11p15.1 | Human (Exome) | 57
NKX2-1 | NK2 homeobox 1 | 14q13 | Human (Case report) | 17,22
NOTCH1 | Notch homolog 1, translocation-associated (Drosophila) | 9q34 | Not described | 10,11
NOTCH2 | Notch homolog 2 (Drosophila) | 6q27 | Not described | 10,11
NOTCH3 | Notch homolog 3 (Drosophila) | 19p13.12 | Not described | 10,11
NRG1 | neuregulin 1 | 8p12 | Human (GWAS)/Mouse (Abnormal NCC migration) | 58
NRG3 | neuregulin 3 | 10q22-q23 | Human (CNV/Exome) | 32,59
NRP1 | neuropilin 1 | 10p11.22 | Not described | 60,61
NRTN | neurturin | 19p13 | Human (Linkage)/Mouse (Abnormal ENS) | 62
NTF3 | 3'-nucleotidase | 1p36.11 | Mouse (Reduced enteric ganglia) | 17
NTRK3 | neurotrophic tyrosine kinase, receptor, type 3 | 15q25 | Mouse (Reduced enteric ganglia) | 17
PAX3 | paired box 3 | 2q36 | Human (Exome)/Mouse (Absence EN) | 8,57
PCDHA1 | protocadherin alpha 1; protocadherin alpha 4 | 5q31 | Expression | 7,63
PFKL | phosphofructokinase, liver | 21q22.3 | Expression | 64
PHACTR4 | phosphatase and actin regulator 4 | 1p35.3 | Mouse | 65,66
PHOX2A | paired-like homeobox 2a | 11q13.2 | Expression | 7,67
PHOX2B | paired-like homeobox 2b | 4p13 | Human (Haddad syndrome)/Mouse (Abnormal ENS)/Expression | 7,68,69
PLXNA1 | plexin A1 | 3q21.3 | Not described | 60,61
PLXNB1 | plexin B1 | 3p21 | Human (Linkage) | 70,71
POFUT1 protein O-fucosyltransferase 1 20q11.21 Mouse (Absence EN) 10,11
PROK1 prokineticin 1 1p13 Not described 72,73
PROK2 prokineticin 2 3p13 Not described 72,73
PROKR1 prokineticin receptor 1 2p14 Not described 72,73
PROKR2 prokineticin receptor 2 20p12 Not described 72,73
PRPH peripherin 12q12-q13 Expression 7,74
PSPN persephin 19p13.3 Not described 8
PTCH1 patched homolog 1 (Drosophila) 9q22.32 Mouse (Abnormal intestinal morphology) 41,42
RET ret proto-oncogene 10q11 Human (Linkage/CNV/Exome)/Mouse (Absence EN) 7,57,75,76
SALL4 sal-like 4 (Drosophila) 20q13.2 Mouse (Absence EN) 17
SCG3 secretogranin III 15q21 Expression 7
SEMA3A sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3A 7p12 Human (Association)/Mouse (Abnormal ENS morphology) 57,77
SEMA3C/D sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3D 7p12 Human (GWAS)/Others (Not described) 60,61
SERPINI1 serpin peptidase inhibitor, clade I (neuroserpin), member 1 3q26.1 Expression 7,78,79
SHH sonic hedgehog homolog (Drosophila) 7q36 Mouse (Ectopic enteric ganglia formation) 41,42
SMO smoothened homolog (Drosophila) 7q32 Mouse (Abnormal neural crest cell migration) 41,42
SOD1 superoxide dismutase 1, soluble 21q22.11 Expression 80,81
SON SON DNA binding protein 21q22.11 Expression 82
SOX10 SRY (sex determining region Y)-box 10 22q13 Human (Linkage/WS4)/Mouse (Absence EN) 7,83
SOX2 SRY (sex determining region Y)-box 2 3q26.3-q27 Expression 7,84
SPRY2 sprouty homolog 2 (Drosophila) 13q31.1 Mouse (Increased EN) 17
STMN2 stathmin-like 2 8q21.13 Expression 7,85
STMN3 stathmin-like 3 20q13.3 Expression 7
SUFU suppressor of fused homolog (Drosophila) 10q24.32 Mouse (Abnormal neural tube morphology) 41,42
SYT11 synaptotagmin XI 1q21.2 Expression 7,86
TAGLN3 transgelin 3 3q13.2 Expression 7
TBX3 T-box 3 12q24.1 Expression 87,88
TFF3 trefoil factor 3 (intestinal) 21q22.3 Expression 7,89,90
TGFB2 transforming growth factor, beta 2 1q41 Expression 7,91
TMEFF2 transmembrane protein with EGF-like and two follistatin-like domains 2 2q32.3 Expression 7,92
TREX1 three prime repair exonuclease 1 3p21.31 Human (Linkage) 70,71
TTC3 tetratricopeptide repeat domain 3; tetratricopeptide repeat domain 3-like 21q22.2 Expression 93
TUBB3 tubulin, beta 3; melanocortin 1 receptor (alpha melanocyte stimulating hormone receptor) 16q24.3 Expression 7,94
UCHL1 ubiquitin carboxyl-terminal esterase L1 (ubiquitin thiolesterase) 4p14 Expression 7,95
VIP vasoactive intestinal peptide 6q25 Expression 7,96
ZEB2 zinc finger E-box binding homeobox 2 2q22 Human (Linkage/MW syndrome/CNV)/Mouse (Abnormal NCC migration) 97–99
ZIC2 Zic family member 2 (odd-paired homolog, Drosophila) 13q32 Mouse 100,101
Notes: updated gene symbols used for ZFHX1B (replaced by ZEB2), TRKC (replaced by NTRK3) and RALDH2 (replaced by ALDH1A2); EN: enteric neurons; NCC: neural crest cells.
Supplementary Table 7. Gene recurrence and burden test.
Cohorts (cases/controls): HK (14/73), Spain (15/100)#, Rotterdam (19/39), Meta-analysis (48/212)
Gene symbol HK p-value HK direction* Spain p-value Spain direction* Rotterdam p-value Rotterdam direction* Meta-analysis p-value Meta-analysis direction*
AFF3 1.0000 -1 0.6973 -1 0.0392 1 0.4745 1
CCR2 1.0000 -1 0.6973 -1 0.0392 1 0.4745 1
CKAP2L 0.0216 1 0.1175 1 1.0000 -1 0.0178 1
COL6A3 0.5883 -1 0.5430 -1 0.5970 1 0.6398 -1
DAB2IP 0.4402 -1 0.6973 -1 0.1484 1 0.9819 -1
DENND3 0.3699 -1 0.6973 -1 0.5970 1 0.5977 -1
HMCN1 0.4308 -1 0.3662 1 0.8013 -1 0.9789 1
ISG20L2 1.0000 -1 1.0000 -1 0.1484 1 0.4949 1
KDM4A 0.0216 1 0.4967 -1 0.1484 1 0.1190 1
MAP4 0.6596 -1 0.2903 1 0.5970 1 0.4851 1
MED26 1.0000 -1 1.0000 -1 1.0000 -1 1.0000 -1
NCLN 1.0000 -1 1.0000 -1 0.1484 1 0.4949 -1
NUP98 0.5309 -1 0.6973 -1 0.0392 1 0.7243 1
PLEKHG5 1.0000 -1 0.1999 -1 0.9826 1 0.3997 -1
RBM25 1.0000 -1 0.0095 1 1.0000 -1 0.0846 1
RET 0.1867 1 0.6367 1 0.0008 1 0.0078 1
SCUBE3 1.0000 -1 0.5806 -1 1.0000 -1 0.7133 -1
TBATA 1.0000 -1 1.0000 -1 0.5970 1 0.8028 1
TUBG1 1.0000 -1 1.0000 -1 1.0000 -1 1.0000 -1
VEZF1 1.0000 -1 0.6973 -1 0.1484 1 0.6717 1
ZNF57 0.0216 1 0.4967 -1 1.0000 -1 0.3808 1
Genes with DNMs were checked for the presence of rare damaging mutations in additional HSCR
patients. The burden of rare, damaging mutations in HSCR patients was compared with that in local
population-matched controls; in addition, gene-wise burden test p-values from the three cohorts (HK, Spain
and Rotterdam) were combined by meta-analysis. Numbers of cases and controls are given in
parentheses. *: Direction 1 means rare damaging variants are enriched in cases; -1 means rare variants
are enriched in controls. #: 4 HSCR patients from the discovery trios were not included because their
platform did not match that of the control data. Nominal meta-analysis p-values < 0.05 are given in bold
(CKAP2L and RET).
Supplementary Table 8. Bioinformatics prediction of the functional impact of DNMs.
Gene and mutation | RNA structure change* | Human splicing finder# | Conservation (PhyloP)% | Gene-level relevance$
RET:splicing9+1 | 0.7866 | splice donor | 7.88 | Major HSCR gene
RBM25:L158L | 0.8224 | | | Constrained gene (ref. 102), interacts with PAX3
RET:S837fs | NA | | | Major HSCR gene
COL6A3:H1109H | 0.4898 | | | Interacts with ERBB2, ITGB1, shares pathway with NRTN, GDNF
RET:Y606fs | NA | | | Major HSCR gene
DAB2IP:H1132Y | 0.8539 | | 1.76 |
ISG20L2:G321R | 0.495 | | 7.59 |
MED26:A225A | 0.9717 | | | Pathway sharing with NOTCH genes
NCLN:Q166* | 0.5467 | | 7.38 | Nodal signaling, involved in CNS development
NUP98:N1662S | 0.5235 | | 5.95 | Regulation of known HSCR genes, involved in CNS development
VEZF1:S195F | 0.0565 | | 9.86 |
ZNF57:D190D | 0.5217 | | |
RET:G588fs | NA | splice acceptor | | Major HSCR gene
SCUBE3:N498I | 0.5473 | | 1.96 | Hedgehog signaling
KDM4A:N9S | 0.5813 | | 3.54 | Neural crest specification in chicken
PLEKHG5:T876T | 0.4096 | | |
AFF3:V659L | 0.0272 | | 1.84 |
MAP4:G1117G | 0.996 | | 1.40 | Interacts with CDH2, ERBB2, MAPT and DCX
RET:C620R | 0.4286 | | 4.80 | Major HSCR gene
CKAP2L:E186fs | NA | | | Involved in CNS development
RET:C137G | 0.6841 | | 3.73 | Major HSCR gene
HMCN1:A3456T | 0.6906 | | 0.60 |
TUBG1:S233S | 0.5802 | | | Interacts with SOX2
CCR2:L283Q | 0.0659 | | 5.87 | 3p21; interacts with GLI2; shares pathway with EDN3
DENND3:K640fs | NA | | | Involved in CNS development
RET:C570* | 0.1453 | | -0.02 | Major HSCR gene
RET:R175del | NA | | | Major HSCR gene
TBATA:R53C | 0.5526 | | 1.71 | Involved in CNS development
Bioinformatics prediction tools, databases and literature were used to predict the functional impact of
the DNMs and the relevance of the genes carrying them. *: significant changes (< 0.2) are in bold; #: only
potential splice sites (donor or acceptor) are shown; %: a PhyloP score > 2 indicates a conserved position;
$: evidence collected from PubMed literature and bioinformatics databases (STRING, MSigDB pathways).
Supplementary Table 9. Sequence and dosage of antisense morpholino.
a. Splice-blocking morpholino
Target gene Human ortholog Sequence Dosage (ng)
aff3 AFF3 AAATGTCTTTCCCCCCTCACCTTTC 6
ckap2l CKAP2L TGAAGTAAACTCACAGTCTTTCCTC 6
dab2ipa DAB2IP* AGGTCAGCAGACTCACCTCGAAGCA 6
dab2ipb DAB2IP* GCTTTCCACTAACACCTTACCCAGC 6
dennd3a DENND3* CATCTTTACCCTGTGCGAAAAGTTA 6
dennd3b DENND3* CCATTCAATTTTGTTTCACCTGGAA 6
hmcn1 HMCN1 GCACAAAGATTTCCCCTTACCCTGA 6
isg20l2 ISG20L2 CTACTGATGCTTATTTCATACCTCT 6
kdm4aa KDM4A* GACACAAGCAATGACAGTACCAGGA 6
kdm4ab KDM4A* AGTTGAACAGAACATACTTGTCGCT 6
ncl1 NCLN GAACCTGCCAATGGATGTGGTTTAT 6
nup98 NUP98 GTATGGAGCAGCTAAACTTACGGTT 1
scube3 SCUBE3 ACTAGATGAAGGGACTCACTCTTGC 6
tbata TBATA GATAGAGCCCAATACTGTACCTCCC 4
vezf1a VEZF1* AGCCAATCGCACTAGCCTTACCTTT 6
vezf1b VEZF1* ATCCAAAATGCTAAACCCACCTAGA 6
b. Translation-blocking morpholino
Target gene Human ortholog Sequence Dosage (ng)
ckap2l CKAP2L GTCTTCATCAGTCATCGTTTCCATC 6
dennd3a DENND3* GACCACACGGCACATTATCAGCCAT 8
dennd3b DENND3* GACCGTCTGCCATTGAAAATCAACA 8
ncl1 NCLN ACCTCACCAGCCTCCTCGAACATGC 0.8
nup98 NUP98 GTTGAACATCTTGCACTGCTATAGA 12
tbata TBATA AGCACCTGCACAAACAAATCAGACT# 6
c. Control morpholino
Target gene Human ortholog Sequence% Dosage (ng)
ckap2l CKAP2L TGtAcTAAAgTCACAcTgTTTCCTC 6
dennd3a DENND3* CAaCaTTACgCTGTGCcAAAAcTTA 6
dennd3b DENND3* CCAaTgAATTTTcTTTCACgTcGAA 6
ncl1 NCLN GAACaTcCCAATGaATcTGaTTTAT 6
nup98 NUP98 GTtTcGAGCAcCTAAAgTTACcGTT 1
tbata TBATA GAaAcAGCCgAATACTcTAgCTCCC 4
p53 P53 GCGCCATTGCTTTGCAAGAATTG 2
HBB$ CCTCTTACCTCAGTTACAATTTATA 12
*: DENND3, DAB2IP, KDM4A and VEZF1 are duplicated in the zebrafish genome. #: There was no suitable
target site in tbata for a translation-blocking morpholino; a second, non-overlapping splice-blocking
morpholino was used instead. %: Lowercase letters indicate the nucleotides that are mismatched relative to
the corresponding splice-blocking morpholino. $: Morpholino against human beta-globin, used as a universal negative control.
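The mismatch convention in the control morpholino table can be checked mechanically. The sketch below compares two control sequences from the table with their splice-blocking counterparts and confirms that every lowercase position is a mismatch. The helper names are hypothetical and the check is illustrative; it is not part of the authors' procedure.

```python
def mismatch_positions(target_mo, control_mo):
    """Return 0-based positions where the control differs from the targeting morpholino."""
    if len(target_mo) != len(control_mo):
        raise ValueError("morpholinos must have equal length")
    pairs = zip(target_mo.upper(), control_mo.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


def lowercase_positions(seq):
    """Return 0-based positions written in lowercase (the table's mismatch marker)."""
    return [i for i, base in enumerate(seq) if base.islower()]


# (splice-blocking sequence, control sequence), copied from the tables above.
PAIRS = {
    "ckap2l": ("TGAAGTAAACTCACAGTCTTTCCTC", "TGtAcTAAAgTCACAcTgTTTCCTC"),
    "nup98": ("GTATGGAGCAGCTAAACTTACGGTT", "GTtTcGAGCAcCTAAAgTTACcGTT"),
}

for gene, (sbmo, control) in PAIRS.items():
    mismatches = mismatch_positions(sbmo, control)
    marked = lowercase_positions(control)
    print(gene, len(mismatches), "mismatches; marked correctly:", mismatches == marked)
```

For both pairs the lowercase positions coincide with the mismatches, five per 25-mer.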
Supplementary Table 10. qPCR/RT-PCR primers.
Target transcript Forward / Reverse Sequence (5' to 3')
aff3 Forward AAAGCAGCAGTCAACGTTCC
Reverse CATCTGTCCAACTGCCAATG
ckap2l Forward TGAGATCCAACCACACCAAG
Reverse GTTCCACAGCGAAGACAATG
dab2ipa Forward TGGGACAGGATTTCTGCTTC
Reverse GCACAGCACGTCTCAAATTC
dab2ipb Forward GCACTAAAGCCATCGAGGAG
Reverse ACGGGTCCACTTCACAGTTC
dennd3a Forward TGCTTGGAGTGTCAAACGAG
Reverse ATAAACGGTGGAGCGTGAAC
dennd3b Forward GCAGCCTCTGATGATTGTCCT
Reverse GTTGGGACAGTATGGGCACA
hmcn1 Forward GAAGAAATTGCCTCGACCAG
Reverse AGCAGGTGAACCTTTGAGGA
isg20l2 Forward ACTCGCTGGAGTGGAATCAG
Reverse GGATAGCATGTCCCACAACC
kdm4aa Forward GGGATGTGGAAGAGCACATT
Reverse TGCTCTGGAGGCACAACATA
kdm4ab Forward TGAAAGAGTTCCGCAAAACC
Reverse CAGCTCCATAGATGGGAGGA
ncl1 Forward CTGTTTCTGTCGGTCGGAAT
Reverse ATCACACAGCGACGACTCAG
nup98 Forward GAACCTGGGGTTTGGATTCT
Reverse CCAGCATCACTTCCTCCAAT
scube3 Forward TCTCCTGTCCTGGAAACACC
Reverse ACTCCACATTGGCTGGGTAG
tbata Forward CTGAAAGCTGGCGTGAGGAA
Reverse GTGTGTGTGTTGTCGTACGC
vezf1a Forward GATGGAGGTGTCCACAAACC
Reverse GCAGGCCGTTACTTGACATT
vezf1b Forward GCACAAGCCCTACATCTGCT
Reverse TGGCATTTAAAGGGTCGTTC
elfa Forward CTTCTCAGGCTGACTGTGC
Reverse CCGCTAGCATTACCCTCC
actb Forward TACAATGAGCTCCGTGTTGC
Reverse GTTCCCATCTCCTGCTCAAA
SUPPLEMENTARY FIGURES
Supplementary Figure 1. Analytical pipeline for exome sequence filtering and prioritization. 1:
GATK; 2: KGGSeq; 3: PLINK; 4: ANNOVAR. KGGSeq integrates knowledge resources from (epi)genetic
databases, pathway databases and protein-protein interaction networks to annotate the genes that
harbor post-QC variants and to predict the potential pathogenicity of those variants. For
deleteriousness prediction, KGGSeq combines five prediction programs (PolyPhen-2, SIFT,
MutationTaster, PhyloP and Likelihood ratio), weighted by logistic regression (ref. 103). ANNOVAR is
mainly used to double-check the annotation of the final remaining variants and provides supplementary
features from the Database of Genomic Variants (DGV) and the clinical variation database ClinVar.
Supplementary Figure 2. Relatedness plot of HSCR exome sequences. Around 17K common
SNPs (minor allele frequency > 0.01 in 1000 Genomes European populations) were used to calculate
identity-by-descent (IBD) and identity-by-state (IBS) proportions. Each cell shows the pi_hat statistic
(ref. 1; IBD proportion, calculated as P(IBD=2) + 0.5*P(IBD=1);
http://pngu.mgh.harvard.edu/~purcell/plink/ibdibs.shtml) between two patients. No pairwise pi_hat
coefficient is above 0.125 (the value expected for first cousins); the light blue cells represent values of
0.07~0.11, mainly for samples from the HK population, which is expected to differ from the European patients.
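The pi_hat formula in this legend can be illustrated with a short worked example. The sketch below evaluates pi_hat = P(IBD=2) + 0.5*P(IBD=1) for textbook IBD-sharing probabilities and applies the 0.125 first-cousin threshold mentioned above. The function names and the example pi_hat values are hypothetical; only the formula and the threshold come from the legend.

```python
def pi_hat(p_ibd1, p_ibd2):
    """Proportion of the genome shared identical by descent."""
    return p_ibd2 + 0.5 * p_ibd1


# Expected (P(IBD=1), P(IBD=2)) for standard relationships in an outbred population.
EXPECTED_SHARING = {
    "parent-offspring": (1.0, 0.0),
    "full siblings": (0.5, 0.25),
    "first cousins": (0.25, 0.0),
    "unrelated": (0.0, 0.0),
}

FIRST_COUSIN_THRESHOLD = 0.125  # cut-off quoted in the figure legend


def exceeds_threshold(value, threshold=FIRST_COUSIN_THRESHOLD):
    """True if a pair's pi_hat is above the first-cousin level."""
    return value > threshold


for relationship, (p1, p2) in EXPECTED_SHARING.items():
    print(f"{relationship}: pi_hat = {pi_hat(p1, p2):.3f}")

# Hypothetical pi_hat values in the range reported for the light blue cells.
print([exceeds_threshold(v) for v in (0.07, 0.11)])
```

Values in the 0.07 to 0.11 range stay below the first-cousin level, which matches the legend's statement that no pair exceeds 0.125.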
Supplementary Figure 3. Distribution of de novo mutations per trio. A) Number of DNMs in each
trio, separated into three mutation types (loss of function, synonymous and others). B) Distribution
of the observed counts of DNMs per trio and the expected counts per trio calculated from a Poisson
distribution (lambda = 1.2).
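The expected counts in panel B follow from the Poisson probability mass function with lambda = 1.2. A minimal sketch is given below. The number of trios is not stated in this legend, so `N_TRIOS_EXAMPLE` is a purely hypothetical placeholder, as are the function names.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of observing exactly k events under a Poisson(lam) model."""
    return exp(-lam) * lam ** k / factorial(k)


def expected_trio_counts(n_trios, lam=1.2, max_k=5):
    """Expected number of trios carrying k DNMs, for k = 0..max_k."""
    return {k: n_trios * poisson_pmf(k, lam) for k in range(max_k + 1)}


N_TRIOS_EXAMPLE = 20  # hypothetical cohort size, for illustration only

for k, expected in expected_trio_counts(N_TRIOS_EXAMPLE).items():
    print(f"{k} DNMs: {expected:.2f} trios expected")
```

Comparing these expected values with the observed histogram is what panel B visualizes; a formal goodness-of-fit test would be a separate step not described in the legend.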
Supplementary Figure 4. Sanger confirmation of mosaic DNMs in DAB2IP and NCLN. Two of the 28
de novo mutations (in DAB2IP and NCLN) were confirmed as mosaic mutations by Sanger sequencing
in both the forward and reverse directions. A) Peak for the DAB2IP heterozygous mosaic mutation. B)
Peak for the NCLN heterozygous mosaic mutation.
Supplementary Figure 5. Connection of DNM genes and ENS genes at pathway/network level.
Ingenuity Pathway Analysis (IPA) was used to link 116 ENS candidate genes (left, Supplementary
Table 8) with the 20 newly found genes harboring de novo mutations (right). Solid and dotted lines
represent direct and indirect interactions, respectively.
Supplementary Figure 6. qPCR confirmation of gene knockdown by splice-blocking morpholino (SBMO).
Relative expression of the candidate genes in SBMO-injected embryos (grey bars) compared with
control morpholino-injected embryos (black bars), measured by qPCR.
Supplementary Figure 7. RT-PCR confirmation of ncl1 SBMO knockdown. ncl1 expression in six
1 dpf embryos injected with ncl1 SBMO was compared with that in control MO-injected embryos. The
arrow indicates the expected amplicon. L: ladder; C1: control MO-injected embryo; C2: RT negative control.
Supplementary Figure 8. RT-PCR for expression of 4 candidate genes in zebrafish. Temporal
expression pattern of the zebrafish orthologues. RT-PCR for dennd3a, dennd3b, ncl1, nup98 and tbata
was performed on RNA isolated from wild-type embryos at 0, 24, 48, 72, 96 and 120 hpf.
REFERENCES
1. Purcell, S. et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
2. Jiang, Q., Ho, Y.-Y., Hao, L., Nichols Berrios, C. & Chakravarti, A. Copy number variants in candidate genes are genetic modifiers of Hirschsprung disease. PLoS One 6, e21219 (2011).
3. Gui, H. et al. Targeted next-generation sequencing on Hirschsprung disease: A pilot study exploits DNA pooling. Ann. Hum. Genet. 78, 381–387 (2014).
4. Alves, M. M. et al. Contribution of rare and common variants determine complex diseases-Hirschsprung disease as a model. Dev. Biol. 382, 320–9 (2013).
5. Anderson, R. B., Newgreen, D. F. & Young, H. M. Neural crest and the development of the enteric nervous system. Advances in Experimental Medicine and Biology 589, 181–196 (2006).
6. Thiesen, S., Kübart, S., Ropers, H. H. & Nothwang, H. G. Isolation of two novel human RhoGEFs, ARHGEF3 and ARHGEF4, in 3p13-21 and 2q22. Biochem. Biophys. Res. Commun. 273, 364–9 (2000).
7. Heanue, T. A. & Pachnis, V. Expression profiling the developing mammalian enteric nervous system identifies marker and candidate Hirschsprung disease genes. Proc. Natl. Acad. Sci. U. S. A. 103, 6919–24 (2006).
8. Pachnis, V., Durbec, P., Taraviras, S., Grigoriou, M. & Natarajan, D. Role of the RET signal transduction pathway in development of the mammalian enteric nervous system. Am. J. Physiol. 275, 321–327 (1998).
9. Huang, H. S. et al. Direct transcriptional induction of Gadd45gamma by Ascl1 during neuronal differentiation. Mol. Cell. Neurosci. 44, 282–96 (2010).
10. Okamura, Y. & Saga, Y. Notch signaling is required for the maintenance of enteric neural crest progenitors. Development 135, 3555–3565 (2008).
11. Morrison, S. J. et al. Transient Notch activation initiates an irreversible switch from neurogenesis to gliogenesis by neural crest stem cells. Cell 101, 499–510 (2000).
12. Hagiyama, M., Ichiyanagi, N., Kimura, K. B., Murakami, Y. & Ito, A. Expression of a soluble isoform of cell adhesion molecule 1 in the brain and its involvement in directional neurite outgrowth. Am. J. Pathol. 174, 2278–89 (2009).
13. Sultan, M. et al. Gene expression variation in Down’s syndrome mice allows prioritization of candidate genes. Genome Biol. 8, R91 (2007).
14. Hermiston, M. L. & Gordon, J. I. Inflammatory bowel disease and adenomas in mice expressing a dominant negative N-cadherin. Science 270, 1203–1207 (1995).
15. Schmidt, E. F., Shim, S.-O. & Strittmatter, S. M. Release of MICAL autoinhibition by semaphorin-plexin signaling promotes interaction with collapsin response mediator protein. J. Neurosci. 28, 2287–2297 (2008).
16. Brännvall, K. et al. Cystatin-B is expressed by neural stem cells and by differentiated neurons and astrocytes. Biochem. Biophys. Res. Commun. 308, 369–374 (2003).
17. Heanue, T. A. & Pachnis, V. Enteric nervous system development and Hirschsprung’s disease: advances in genetic and stem cell studies. Nat. Rev. Neurosci. 8, 466–79 (2007).
18. Simeone, A. et al. Cloning and characterization of two members of the vertebrate Dlx gene family. Proc. Natl. Acad. Sci. U. S. A. 91, 2250–4 (1994).
19. Cobos, I., Borello, U. & Rubenstein, J. L. R. Dlx Transcription Factors Promote Migration through Repression of Axon and Dendrite Growth. Neuron 54, 873–888 (2007).
20. Hamajima, N. et al. A novel gene family defined by human dihydropyrimidinase and three related proteins with differential tissue distribution. Gene 180, 157–163 (1996).
21. Zardo, G. et al. Integrated genomic and epigenomic analyses pinpoint biallelic gene inactivation in tumors. Nat. Genet. 32, 453–458 (2002).
22. Amiel, J. & Lyonnet, S. Hirschsprung disease, associated syndromes, and genetics: a review. J. Med. Genet. 38, 729–739 (2001).
23. Hofstra, R. M. et al. A loss-of-function mutation in the endothelin-converting enzyme 1 (ECE-1) associated with Hirschsprung disease, cardiac defects, and autonomic dysfunction. Am. J. Hum. Genet. 64, 304–8 (1999).
24. Bidaud, C. et al. Mutations of the endothelin-3 gene in isolated and syndromic forms of Hirschsprung disease. Gastroenterol Clin Biol 21, 548–554 (1997).
25. Sánchez-Mejías, A., Fernández, R. M., López-Alonso, M., Antiñolo, G. & Borrego, S. Contribution of RET, NTRK3 and EDN3 to the expression of Hirschsprung disease in a multiplex family. J. Med. Genet. 46, 862–864 (2009).
26. Carrasquillo, M. M. et al. Genome-wide association study and mouse model identify interaction between RET and EDNRB pathways in Hirschsprung disease. Nat. Genet. 32, 237–44 (2002).
27. Kenny, S. E. et al. Reduced endothelin-3 expression in sporadic Hirschsprung disease. Br. J. Surg. 87, 580–585 (2000).
28. Han, J., Knops, J. F., Longshore, J. W. & King, P. H. Localization of human elav-like neuronal protein 1 (Hel-N1) on chromosome 9p21 by chromosome microdissection polymerase chain reaction and fluorescence in situ hybridization. Genomics 36, 189–191 (1996).
29. Akamatsu, W. et al. The RNA-binding protein HuD regulates neuronal cell identity and maturation. Proc. Natl. Acad. Sci. U. S. A. 102, 4625–4630 (2005).
30. Paratore, C., Goerich, D. E., Suter, U., Wegner, M. & Sommer, L. Survival and glial fate acquisition of neural crest cells are regulated by an interplay between the transcription factor Sox10 and extrinsic combinatorial signaling. Development 128, 3949–61 (2001).
31. Britsch, S. The neuregulin-I/ErbB signaling system in development and disease. Adv. Anat. Embryol. Cell Biol. 190, 1–65 (2007).
32. Garcia-Barcelo, M.-M. et al. Genome-wide association study identifies NRG1 as a susceptibility locus for Hirschsprung’s disease. Proc. Natl. Acad. Sci. U. S. A. 106, 2694–9 (2009).
33. Flames, N. & Hobert, O. Gene regulatory logic of dopamine neuron differentiation. Nature 458, 885–889 (2009).
34. Smallwood, P. M. et al. Fibroblast growth factor (FGF) homologous factors: new members of the FGF family implicated in nervous system development. Proc. Natl. Acad. Sci. U. S. A. 93, 9850–7 (1996).
35. Strittmatter, S. M., Fankhauser, C., Huang, P. L., Mashimo, H. & Fishman, M. C. Neuronal pathfinding is abnormal in mice lacking the neuronal growth cone protein GAP-43. Cell 80, 445–452 (1995).
36. Salomon, R. et al. Germline mutations of the RET ligand GDNF are not sufficient to cause Hirschsprung disease. Nat. Genet. 14, 345–7 (1996).
37. Ivanchuk, S. M., Myers, S. M., Eng, C. & Mulligan, L. M. De novo mutation of GDNF, ligand for the RET/GDNFR-alpha receptor complex, in Hirschsprung disease. Hum. Mol. Genet. 5, 2023–2026 (1996).
38. Angrist, M. et al. Genomic structure of the gene for the SH2 and pleckstrin homology domain-containing protein GRB10 and evaluation of its role in Hirschsprung disease. Oncogene 17, 3065–3070 (1998).
39. Fernandez, R. M., Ruiz-Ferrer, M., Lopez-Alonso, M., Antiñolo, G. & Borrego, S. Polymorphisms in the genes encoding the 4 RET ligands, GDNF, NTN, ARTN, PSPN, and susceptibility to Hirschsprung disease. J. Pediatr. Surg. 43, 2042–2047 (2008).
40. Myers, S. M. et al. Investigation of germline GFR alpha-1 mutations in Hirschsprung disease. J. Med. Genet. 36, 217–220 (1999).
41. Fu, M. Sonic hedgehog regulates the proliferation, differentiation, and migration of enteric neural crest cells in gut. J. Cell Biol. 166, 673–684 (2004).
42. Reichenbach, B. et al. Endoderm-derived Sonic hedgehog and mesoderm Hand2 expression are required for enteric nervous system development in zebrafish. Dev. Biol. 318, 52–64 (2008).
43. Hurowitz, E. H. et al. Genomic characterization of the human heterotrimeric G protein alpha, beta, and gamma subunit genes. DNA Res. 7, 111–120 (2000).
44. Bober, E., Baum, C., Braun, T. & Arnold, H. H. A novel NK-related mouse homeobox gene: expression in central and peripheral nervous structures during embryonic development. Dev. Biol. 162, 288–303 (1994).
45. Fu, M., Lui, V. C. H., Sham, M. H., Cheung, A. N. Y. & Tam, P. K. H. HOXB5 expression is spatially and temporarily regulated in human embryonic gut during neural crest cell colonization and differentiation of enteric neuroblasts. Dev. Dyn. 228, 1–10 (2003).
46. Mavilio, F. et al. Differential and stage-related expression in embryonic tissues of a new human homeobox gene. Nature 324, 664–668 (1986).
47. Murai, M. et al. Interleukin 10 acts on regulatory T cells to maintain expression of the transcription factor Foxp3 and suppressive function in mice with colitis. Nat. Immunol. 10, 1178–84 (2009).
48. Bolk, S. et al. A human model for multigenic inheritance: phenotypic expression in Hirschsprung disease requires both the RET gene and a new 9q31 locus. Proc. Natl. Acad. Sci. U. S. A. 97, 268–73 (2000).
49. Glocker, E. et al. Inflammatory bowel disease and mutations affecting the interleukin-10 receptor. N. Engl. J. Med. 361, 2033–45 (2009).
50. Brooks, A. S. et al. Homozygous nonsense mutations in KIAA1279 are associated with malformations of the central and enteric nervous systems. Am. J. Hum. Genet. 77, 120–6 (2005).
51. Griseri, P. et al. Complex pathogenesis of Hirschsprung’s disease in a patient with hydrocephalus, vesico-ureteral reflux and a balanced translocation t(3;17)(p12;q11). Eur. J. Hum. Genet. 17, 483–90 (2009).
52. Turner, K. N., Schachner, M. & Anderson, R. B. Cell adhesion molecule L1 affects the rate of differentiation of enteric neurons in the developing gut. Dev. Dyn. 238, 708–15 (2009).
53. Kenwrick, S. Neural cell recognition molecule L1: relating biological complexity to human disease mutations. Hum. Mol. Genet. 9, 879–886 (2000).
54. Mariani, M. et al. Two murine and human homologs of mab-21, a cell fate determination gene involved in Caenorhabditis elegans neural development. Hum. Mol. Genet. 8, 2397–2406 (1999).
55. Gupta, S. et al. Selective interaction of JNK protein kinase isoforms with transcription factors. EMBO J. 15, 2760–70 (1996).
56. Tse, W., Zhu, W., Chen, H. S. & Cohen, A. A novel gene, AF1q, fused to MLL in t(1;11) (q21;q23), is specifically expressed in leukemic and immature hematopoietic cells. Blood 85, 650–6 (1995).
57. K. D. Nguyen, T. T. The burden of coding, non-coding and chromosomal mutations in Hirschsprung disease. ASHG 2013 Abstr. (2013).
58. Garcia-Barcelo, M.-M. et al. Genome-wide association study identifies NRG1 as a susceptibility locus for Hirschsprung’s disease. Proc. Natl. Acad. Sci. U. S. A. 106, 2694–9 (2009).
59. Yang, J. et al. Exome sequencing identified NRG3 as a novel susceptible gene of Hirschsprung’s disease in a Chinese population. Mol. Neurobiol. 47, 957–966 (2013).
60. Jiang, Q. et al. Functional loss of semaphorin 3C and/or semaphorin 3D and their epistatic interaction with ret are critical to Hirschsprung disease liability. Am. J. Hum. Genet. 96, 581–96 (2015).
61. Kruger, R. P., Aurandt, J. & Guan, K.-L. Semaphorins command cells to move. Nat. Rev. Mol. Cell Biol. 6, 789–800 (2005).
62. Doray, B. et al. Mutation of the RET ligand, neurturin, supports multigenic inheritance in Hirschsprung disease. Hum. Mol. Genet. 7, 1449–1452 (1998).
63. Sugino, H. et al. Genomic organization of the family of CNR cadherin genes in mice and humans. Genomics 63, 75–87 (2000).
64. Elson, A., Levanon, D., Weiss, Y. & Groner, Y. Overexpression of liver-type phosphofructokinase (PFKL) in transgenic-PFKL mice: implication for gene dosage in trisomy 21. Biochem. J. 299 (Pt 2), 409–15 (1994).
65. Allen, P. B., Greenfield, A. T., Svenningsson, P., Haspeslagh, D. C. & Greengard, P. Phactrs 1-4: A family of protein phosphatase 1 and actin regulatory proteins. Proc. Natl. Acad. Sci. U. S. A. 101, 7187–92 (2004).
66. Kim, T. H., Goodman, J., Anderson, K. V. & Niswander, L. Phactr4 Regulates Neural Tube and Optic Fissure Closure by Controlling PP1-, Rb-, and E2F1-Regulated Cell-Cycle Progression. Dev. Cell 13, 87–102 (2007).
67. Sasaki, A. et al. Molecular analysis of congenital central hypoventilation syndrome. Hum. Genet. 114, 22–26 (2003).
68. Garcia-Barceló, M. et al. Association study of PHOX2B as a candidate gene for Hirschsprung’s disease. Gut 52, 563–7 (2003).
69. Ou-Yang, M. C. et al. Concomitant existence of total bowel aganglionosis and congenital central hypoventilation syndrome in a neonate with PHOX2B gene mutation. J. Pediatr. Surg. 42, (2007).
70. Gabriel, S. B. et al. Segregation at three loci explains familial and population risk in Hirschsprung disease. Nat. Genet. 31, 89–93 (2002).
71. Garcia-Barceló, M.-M. et al. Mapping of a Hirschsprung’s disease locus in 3p21. Eur. J. Hum. Genet. 16, 833–40 (2008).
72. Ngan, E. S. W. et al. Prokineticin-1 (Prok-1) works coordinately with glial cell line-derived neurotrophic factor (GDNF) to mediate proliferation and differentiation of enteric neural crest cells. Biochim. Biophys. Acta - Mol. Cell Res. 1783, 467–478 (2008).
73. Ngan, E. S. W. et al. Prokineticin-1 modulates proliferation and differentiation of enteric neural crest cells. Biochim. Biophys. Acta - Mol. Cell Res. 1773, 536–545 (2007).
74. Oblinger, M. M., Wong, J. & Parysek, L. M. Axotomy-induced changes in the expression of a type III neuronal intermediate filament gene. J Neurosci 9, 3766–3775 (1989).
75. Edery, P. et al. Mutations of the RET proto-oncogene in Hirschsprung’s disease. Nature 367, 378–380 (1994).
76. Emison, E. S. et al. A common sex-dependent mutation in a RET enhancer underlies Hirschsprung disease risk. Nature 434, 857–63 (2005).
77. Behar, O., Golden, J. A., Mashimo, H., Schoen, F. J. & Fishman, M. C. Semaphorin III is needed for normal patterning and growth of nerves, bones and heart. Nature 383, 525–8 (1996).
78. Stoeckli, E. T. et al. Identification of proteins secreted from axons of embryonic dorsal-root-ganglia neurons. Eur. J. Biochem. 180, 249–258 (1989).
79. Lee, M. S. et al. Selection of neural differentiation-specific genes by comparing profiles of random differentiation. Stem Cells 24, 1946–1955 (2006).
80. Moldrich, R. X. et al. Proliferation deficits and gene expression dysregulation in Down’s syndrome (Ts1Cje) neural progenitor cells cultured from neurospheres. J. Neurosci. Res. 87, 3143–3152 (2009).
81. Le Pecheur, M. et al. Oxidized SOD1 alters proteasome activities in vitro and in the cortex of SOD1 overexpressing mice. FEBS Lett. 579, 3613–8 (2005).
82. Wynn, S. L. et al. Organization and conservation of the GART/SON/DONSON locus in mouse and human genomes. Genomics 68, 57–62 (2000).
83. Southard-Smith, E. M., Kos, L. & Pavan, W. J. Sox10 mutation disrupts neural crest development in Dom Hirschsprung mouse model. Nat. Genet. 18, 60–64 (1998).
84. Bylund, M., Andersson, E., Novitch, B. G. & Muhr, J. Vertebrate neurogenesis is counteracted by Sox1-3 activity. Nat Neurosci 6, 1162–1168 (2003).
85. Bahn, S. et al. Neuronal target genes of the neuron-restrictive silencer factor in neurospheres derived from fetuses with Down’s syndrome: a gene expression study. Lancet (London, England) 359, 310–5 (2002).
86. Craxton, M. Genomic analysis of synaptotagmin genes. Genomics 77, 43–9 (2001).
87. Han, J. et al. Tbx3 improves the germ-line competency of induced pluripotent stem cells. Nature 463, 1096–100 (2010).
88. Paulsen, F. P. et al. Intestinal trefoil factor/TFF3 promotes re-epithelialization of corneal wounds. J. Biol. Chem. 283, 13418–13427 (2008).
89. Mashimo, H., Wu, D. C., Podolsky, D. K. & Fishman, M. C. Impaired defense of intestinal mucosa in mice lacking intestinal trefoil factor. Science 274, 262–265 (1996).
90. Böttner, M., Krieglstein, K. & Unsicker, K. The transforming growth factor-betas: structure, signaling, and roles in nervous system development and functions. J. Neurochem. 75, 2227–40 (2000).
91. Krieglstein, K., Strelau, J., Schober, A., Sullivan, A. & Unsicker, K. TGF-beta and the regulation of neuron survival and death. J. Physiol. Paris 96, 25–30 (2002).
92. Eib, D. W. et al. Expression of the follistatin/EGF-containing transmembrane protein M7365 (tomoregulin-1) during mouse development. Mech. Dev. 97, 167–71 (2000).
93. Tsukahara, F., Hattori, M., Muraki, T. & Sakaki, Y. Identification and cloning of a novel cDNA belonging to tetratricopeptide repeat gene family from Down syndrome-critical region 21q22.2. J Biochem 120, 820–827 (1996).
94. Mongroo, P. S. & Rustgi, A. K. The role of the miR-200 family in epithelial-mesenchymal transition. Cancer Biology and Therapy 10, 219–222 (2010).
95. Sakurai, M. et al. Ubiquitin C-terminal hydrolase L1 regulates the morphology of neural progenitor cells and modulates their differentiation. J. Cell Sci. 119, 162–171 (2006).
96. Larsson, L. T. Hirschsprung’s disease--immunohistochemical findings. Histol Histopathol 9, 615–629 (1994).
97. Wakamatsu, N. et al. Mutations in SIP1, encoding Smad interacting protein-1, cause a form of Hirschsprung disease. Nat. Genet. 27, 369–70 (2001).
98. Amiel, J. et al. Large-scale deletions and SMADIP1 truncating mutations in syndromic Hirschsprung disease with involvement of midline structures. Am. J. Hum. Genet. 69, 1370–7 (2001).
99. Van de Putte, T., Francis, A., Nelles, L., van Grunsven, L. A. & Huylebroeck, D. Neural crest-
CHAPTER 2 SUPPLEMENTARY INFORMATION
86
specific removal of Zfhx1b in mouse leads to a wide range of neurocristopathies reminiscent of Mowat-Wilson syndrome. Hum. Mol. Genet. 16, 1423–1436 (2007).
100. Nagai, T. et al. The expression of the mouse Zic1, Zic2, and Zic3 gene suggests an essential role for Zic genes in body pattern formation. Dev. Biol. 182, 299–313 (1997).
101. Sánchez-Camacho, C. & Bovolenta, P. Autonomous and non-autonomous Shh signalling mediate the in vivo growth and guidance of mouse retinal ganglion cell axons. Development 135, 3531–41 (2008).
102. Samocha, K. E. et al. A framework for the interpretation of de novo mutation in human disease. Nat. Genet. 46, 944–950 (2014).
103. Li, M. X. et al. Predicting Mendelian Disease-Causing Non-Synonymous Single Nucleotide Variants in Exome Sequencing Studies. PLoS Genet. 9, (2013).
REGULATORS OF GENE EXPRESSION IN ENTERIC
NEURAL CREST CELLS ARE PUTATIVE
HIRSCHSPRUNG DISEASE GENES
Duco Schriemer1, Yunia Sribudiani2,3, Arne IJpma4, Dipa Natarajan5, Katherine
MacKenzie2, Marco Metzger5,6, Ellen Binder5, Alan J. Burns2,5, Nikhil Thapar5,7,
Robert M.W. Hofstra2,5, Bart J.L. Eggen1
1 Department of Neuroscience, section Medical Physiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
2 Department of Clinical Genetics, Erasmus University Medical Center, Rotterdam, The Netherlands
3 Department Biochemistry and Molecular Biology, Faculty of Medicine, Universitas Padjadjaran, Bandung, Indonesia
4 Department of Bioinformatics, Erasmus University Medical Center, Rotterdam, The Netherlands
5 Stem Cells and Regenerative Medicine, Birth Defects Research Centre, UCL Institute of Child Health, London, United Kingdom
6 Fraunhofer IGB, Wuerzburg branch and Chair Tissue Engineering and Regenerative Medicine, University Hospital Wuerzburg, Wuerzburg, Germany
7 Department of Gastroenterology, Great Ormond Street Hospital for Children, London, United Kingdom
Provisionally accepted by Developmental Biology
CHAPTER 3
ABSTRACT
The enteric nervous system (ENS) is required for peristalsis of the gut and is
derived from enteric neural crest cells (ENCCs). During ENS development, the RET
receptor tyrosine kinase plays a critical role in the proliferation and survival of
ENCCs, their migration along the developing gut, and differentiation into enteric
neurons. Mutations in RET and its ligand GDNF cause Hirschsprung disease
(HSCR), a complex genetic disorder in which ENCCs fail to colonize variable lengths
of the distal bowel. To identify key regulators of ENCCs and the pathways
underlying RET signaling, gene expression profiles of untreated and GDNF-treated
ENCCs from E14.5 mouse embryos were generated. ENCCs express genes that are
involved in both early and late neuronal development, whereas GDNF treatment
induced neuronal maturation. Predicted regulators of gene expression in ENCCs
include the known HSCR genes Ret and Sox10, as well as Bdnf, App, Mapk10 and
Timp1. The regulatory overlap and functional interactions between these genes
were used to construct a regulatory network that underlies ENS development
and connects to known HSCR genes. In addition, the adenosine receptor A2a
(Adora2a) and neuropeptide Y receptor Y2 (Npy2r) were identified as possible
regulators of terminal neuronal differentiation in GDNF-treated ENCCs. The human
orthologue of Npy2r maps to the HSCR susceptibility locus 4q31.3-q32.3,
suggesting a role for NPY2R both in ENS development and in HSCR.
Keywords: Hirschsprung disease; Enteric Nervous System; Transcriptome;
Upstream Regulators
REGULATORS OF GENE EXPRESSION IN ENCCS ARE PUTATIVE HSCR GENES
INTRODUCTION
The enteric nervous system (ENS) of the gastrointestinal tract is composed of
neurons and glial cells that are organized in interconnected ganglia in the
myenteric and submucosal plexuses. The ENS controls the peristaltic movements
of the bowel, fluid exchange between the gut and its lumen, and local blood flow1,2.
In vertebrates, the ENS is entirely derived from neural crest cells3. Vagal
(hindbrain) neural crest cells invade the foregut at around E9 in mice4, at which
point they are referred to as enteric neural crest cells (ENCCs). The intestinal wall
is colonized by ENCCs in a rostral to caudal direction between E9 and E15 in mice
and sacral neural crest cells contribute to colonization of the hindgut late during
ENS development5,6.
Glial cell line-derived neurotrophic factor (GDNF) acts as a chemoattractant
for migrating vagal ENCCs and directs the migration of ENCCs along the
gut7. ENCCs express the REarranged during Transfection (RET) receptor tyrosine
kinase, and RET with its co-receptor GFRα1 is the receptor for GDNF8. Mice lacking
Ret, Gfrα1 or Gdnf fail to colonize the bowel beyond the stomach, leading to
intestinal aganglionosis9–13. Besides its role in migration, RET signaling is
important for proliferation of ENCCs at E12 and for their survival at E14-E16 in
mice14–17. In order to develop into a mature ENS, post-migratory ENCCs
differentiate into various neuronal subtypes and glial cells. GDNF promotes
neuronal, but not glial, differentiation of ENCCs in vitro, and acts as a chemoattractant
to outgrowing neurites7,15,18,19. The mature ENS contains sensory neurons,
interneurons and excitatory and inhibitory motor neurons that use a wide range of
neurotransmitters, similar to those found in the CNS20.
Mutations in RET and GDNF in humans contribute to Hirschsprung disease
(HSCR), a congenital disorder that is characterized by intestinal aganglionosis in a
variable segment of the distal gut. HSCR patients present with tonic contraction of
the muscle layers in the affected segment, which consequently leads to dilatation
of the proximal bowel. The prevalence of HSCR is 1:5,000 live births and it is
considered an inherited disease. Based on the families reported, the mode of
inheritance can differ. Most non-syndromic familial HSCR cases show a dominant
pattern of inheritance, mostly with reduced penetrance, whereas many syndromic
HSCR families show a recessive pattern of inheritance. The sporadic cases are
considered polygenic or oligogenic. Of the approximately 15 HSCR genes identified
so far, the RET gene is by far the most important gene. Coding mutations in RET are
found in ~50% of familial and 15-35% of sporadic HSCR patients and represent
the great majority of genetic mutations found in HSCR patients21,22. Mutations in all
other HSCR-associated genes, including GDNF, are rare23–25.
The colonization of the intestine by ENCCs, followed by their neuronal and
glial differentiation, is a complex process controlled by various signaling pathways.
These include Hedgehog, Notch, Wnt, retinoic acid and TGF-β/BMP signaling26–30.
Several other pathways that are important for ENS development have been
identified in gene expression profiling studies. Iwashita et al. (2003) reported that
the HSCR genes Ret, Gfrα1, Ednrb and Sox10 are highly expressed by ENCCs in mice,
thereby explaining how mutations in these genes contribute to aganglionosis31. A
subsequent expression study by Heanue and Pachnis (2006) showed that at E15.5
in mice, ENCCs express markers of early and late neuronal development and glial
differentiation, representing different stages of ENS development. These authors
identified Sox2, Cart, Sema, Fgf and Jnk as novel signaling pathways in ENCCs32. In
addition, Vohra et al. (2006) found that E14 mouse ENCCs highly express genes
with synaptic functions and showed that these genes are important for ENCC
migration and neurite outgrowth33. Some insight into the pathways downstream of
RET signaling has come from expression profiling of GDNF-treated ENCCs by Ngan
et al. (2008), who analyzed the changes in gene expression in ENCCs after 8 and 16
hours of GDNF treatment. These authors showed that the Prokineticin pathway is
activated by RET signaling, but found that few other genes were differentially
expressed by the short-term GDNF treatment of ENCCs34.
These gene expression studies have focused on individual genes that are
highly expressed by ENCCs and showed that some of these genes are critical for the
development of the ENS. However, for many highly expressed genes it is still
unknown if and how they contribute to ENS development. In this study, we used an
unbiased, systematic pathway analysis to identify the biological processes that are
important for ENS development. Moreover, we treated ENCC cultures with GDNF
for 14 days to identify the stable, long-term gene expression changes induced by
RET signaling. We show that the expression of differentially expressed genes is
regulated by known ENS-related genes, and by several genes not previously
implicated in ENS development. These regulatory genes represent novel candidate
genes for HSCR.
MATERIALS AND METHODS
Animals
Male C57BL/6 mice carrying Wnt1-Cre+/- were mated with C57BL/6 females that
were homozygous for the Rosa26-LoxP-Stop-LoxP-YFP reporter locus. The stage of
embryonic development was set to E0.5 on the day the vaginal plug was seen.
Embryos carrying both Wnt1-Cre and Rosa26-LoxP-Stop-LoxP-YFP were identified
based on YFP expression in the gastrointestinal tract and craniofacial tissues.
ENCC isolation, culture conditions and RET activation
Full length guts (foregut to hindgut) were dissected from E14.5 embryos. Single
cell suspensions were made using 0.05 mg/ml Collagenase XI/Dispase II (Sigma
Aldrich, St. Louis, MO, USA), followed by incubation for 3-5 min at 37°C and
trituration. YFP-positive ENCCs were FACS sorted and directly seeded into
Fibronectin-coated plates containing DMEM/F-12 medium (Invitrogen, Carlsbad,
CA, USA) supplemented with 1% N2 supplement (Invitrogen), 2% B27 supplement
(Invitrogen), 20 ng/ml FGF (Peprotech EC, London, UK), 20 ng/ml EGF (Peprotech
EC) and 1% Penicillin-Streptomycin (Invitrogen). YFP-positive ENCCs from one gut
were equally divided over two wells; one of which was supplemented with 50
ng/ml GDNF (Peprotech EC). ENCCs were incubated at 37°C for 14 days in a
humidified atmosphere with 5% CO2. Neurosphere-like bodies appeared after 3-5
days and were kept in culture for 14 days without passaging.
Immunostaining
Wells with neurosphere-like bodies were fixed with 4% PFA for 15 minutes at
room temperature (RT) and washed twice for 5 minutes with PBS + 1% Triton X-100
(PBT). They were incubated in blocking solution (PBT + 1% BSA + 0.15%
glycine) for 1 hour at RT, followed by primary antibodies diluted in blocking
solution overnight (O/N) at 4°C: rabbit anti-human p75 (1:500, Promega, UK) and
mouse anti-βIII-Tubulin (1:500, Covance, UK). After several washes in PBT,
secondary antibodies were applied O/N: goat anti-rabbit Alexa Fluor 568 and goat
anti-mouse Alexa Fluor 568 (1:500, Invitrogen). DAPI was added to the secondary
antibodies (1:1000). Antibodies were washed off (3x15 minutes with PBT). Slides
were mounted using Vectashield (Vector labs, UK) and visualized using confocal
microscopy imaging on a Zeiss LSM 710 (Zeiss, Cambridge, UK).
RNA isolation and gene expression quantification
RNA was isolated from untreated and GDNF-treated ENCC cultures, as well as from
uncultured E14.5 embryonic guts using the RNeasy Mini Kit (Qiagen, Crawley, UK).
RNA yield, purity and integrity were determined using the Agilent 2100
BioAnalyzer with 6000 Nano Chips (Agilent Technologies, Amsterdam, The
Netherlands). DNA microarray analyses and RNA quality controls were performed
by ServiceXS (Leiden, The Netherlands) using GeneChip Mouse Genome 430 2.0
arrays (Affymetrix, Santa Clara, CA, USA). For the ENCC vs gut experiment, 3 ENCC
samples and 8 mouse gut samples entered the analysis. For the ENCC+GDNF vs
ENCC experiment, 7 pairs of untreated and GDNF-treated ENCC cultures were
analyzed.
Data normalization, statistical analysis and functional analysis
Quantile normalization and background correction were applied to the raw
intensity values of all samples using Partek version 6.4 (Partek Incorporated, St.
Louis, MO, USA) via GC Robust Multichip Analysis. For the ENCC vs gut analysis,
two-sample t-test statistics were applied in Partek to calculate the fold changes with
associated p-values. Paired t-test statistics were applied to calculate the fold
changes with associated p-values for the ENCC+GDNF vs ENCC analysis. The fold
change and p-value data were uploaded into Ingenuity Pathway Analysis (IPA)
(Ingenuity/Qiagen, Redwood City, CA, USA). The probe level data was mapped to
the gene level and averaged based on the median fold change values. For the
differential gene expression analysis, significance thresholds of fold change >5 and
p-value <1E-5 were applied for the ENCC vs gut comparison. For the ENCC+GDNF
vs ENCC analysis, significance thresholds were set at fold change >1.8 and p-value
<0.05. Gene Ontology enrichment was performed using DAVID Bioinformatics
Resources 6.7 and Upstream Regulator Analysis was performed in IPA. For the
Upstream Regulator Analysis the significance thresholds were set to z-score >1.8
and p-value <0.05. Hierarchical clustering of Upstream Regulators and their
downstream target genes was performed using the heatmap functionality in the
HIV sequence database (http://www.hiv.lanl.gov).
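The gene-level filtering described above can be sketched as follows. This is a minimal illustration only, not the Partek or IPA implementation: the function names `collapse_probes` and `significant_genes`, the example records, the use of the median to summarize probe p-values, and the treatment of fold changes as signed values compared by absolute magnitude are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import median


def collapse_probes(probes):
    """Collapse probe-level records to one record per gene.

    probes: iterable of (gene, fold_change, p_value) tuples.
    Returns {gene: (median fold change, median p-value)}.
    Summarizing p-values by their median is an assumption for illustration.
    """
    by_gene = defaultdict(list)
    for gene, fold_change, p_value in probes:
        by_gene[gene].append((fold_change, p_value))
    return {
        gene: (median(fc for fc, _ in rows), median(p for _, p in rows))
        for gene, rows in by_gene.items()
    }


def significant_genes(gene_stats, min_abs_fold_change, max_p_value):
    """Return genes passing both the fold-change and the p-value threshold."""
    return sorted(
        gene
        for gene, (fold_change, p_value) in gene_stats.items()
        if abs(fold_change) > min_abs_fold_change and p_value < max_p_value
    )


# Hypothetical probe-level records (gene, fold change, p-value).
example_probes = [
    ("GeneA", 6.0, 1e-6),
    ("GeneA", 8.0, 2e-6),
    ("GeneB", 5.5, 1e-3),
    ("GeneC", -7.0, 1e-7),
]
gene_stats = collapse_probes(example_probes)

# Thresholds quoted in the text for the two comparisons.
encc_vs_gut = significant_genes(gene_stats, 5, 1e-5)
gdnf_vs_untreated = significant_genes(gene_stats, 1.8, 0.05)
print(encc_vs_gut, gdnf_vs_untreated)
```

With the stricter ENCC vs gut thresholds, the hypothetical GeneB is excluded on its p-value, whereas it passes the more lenient thresholds used for the ENCC+GDNF vs ENCC comparison.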
RESULTS
Isolation of Enteric Neural Crest Cells
To identify genes that are involved in ENS development, gene expression profiles
of pure populations of ENCCs were generated. Mice expressing Cre recombinase
under the neural crest-specific Wnt1 promoter were crossed with mice carrying
the Rosa26-LoxP-Stop-LoxP-YFP locus. ENCCs were isolated at embryonic day
E14.5, since at this stage ENCCs express markers of both early and late ENS
development32. Double transgenic mouse embryos expressed YFP in neural crest
cells in craniofacial and gastrointestinal tissues as expected (Figure 1A). YFP-
expressing cells were isolated from E14.5 gut using Fluorescence activated cell
sorting (FACS) (Figure 1B). In culture, these cells formed neurosphere-like bodies
(NLBs) within 3 to 5 days (Figure 1C). Notably, NLBs from GDNF-treated ENCCs
were larger than NLBs from untreated ENCCs. Compared to untreated cultures,
GDNF-treated ENCC cultures showed marked βIII-Tubulin-positive connections
between NLBs, indicating that GDNF treatment induced neuronal differentiation of
ENCCs (Figure 1C). Both untreated and GDNF-treated ENCC cultures expressed the
neural crest marker p75/NGFR at 24 hours and 14 days in vitro.
E14.5 ENCCs are enriched for markers of neuronal development and
neuronal signaling
The gene expression profile of cultured ENCCs (harvested from E14.5 embryos)
was compared to E14.5 embryonic gut to identify genes that are highly expressed
by ENCCs. Since whole gut samples contain a variety of cell types, such as
endothelial cells, smooth muscle, connective tissue, serosa, enterochromaffin cells
and approximately 5% ENCCs, the expression profiles of ENCCs and whole gut
tissue are expected to be very different. Therefore, only genes with a >5-fold
change in gene expression (975 genes) were included in downstream analyses.
Gene Ontology analysis was performed to identify the biological pathways these
genes are involved in. ENCC-expressed genes were highly enriched for the Gene
Ontology terms transmission of nerve impulse, synaptic signaling and cell-cell
signaling, which are associated with neuronal signaling (Figure 2). In addition, genes
that are highly expressed in ENCCs play a role in biological-/cell adhesion and
neuron differentiation. A total of 1187 genes were >5 times less abundantly expressed in
ENCCs than in the gut and these genes are involved in cell cycle control, in
particular during active cell division (M-phase). This finding is consistent with the
high proliferation rate of intestinal epithelial cells. Interestingly, genes highly
expressed by ENCCs (and low in gut tissue) and genes highly expressed in total gut
tissue (and low in ENCCs) are both significantly enriched for Gene Ontology terms
biological- and cell adhesion and regulation of cell proliferation (Figure 2).
However, these represent sets of genes with different properties and localizations.
Adhesion molecules in ENCCs included genes expressed in the ENS, such as
Immunoglobulin superfamily cell adhesion molecules (Ncam1, Ncam2, L1cam,
Nrcam), (proto-) cadherins (Cdh2, Cdh6, Pcdh15, Pcdha4) and integrins (Itga4)35.
Of note, Cdh13 was highly expressed by ENCCs and its human orthologue CDH13
maps to the HSCR susceptibility locus 16q23.336. Adhesion molecules that were
highly expressed in total gut tissue included Cdh1, which is expressed by Paneth
cells and goblet cells in the intestinal epithelium37, the Claudin family of tight
junction molecules (CLDN 2, 3, 4, 5, 6, 7, 8, 15 and 23), as well as genes involved in
the apical junction complex (Desmocollin 2, Desmoglein 2, Plakophilin 2 and 3).
These gut-expressed junction molecules are most likely expressed by the
enterocytes of the intestinal epithelium.

Figure 1. Isolation and characterization of ENCCs. A) Schematic overview of the
Wnt1-Cre:Rosa26-YFP mouse model. At embryonic day 14.5, transgenic mouse embryos
expressed YFP in neural crest-derived cells, including craniofacial tissue and ENCCs.
B) YFP-expressing ENCCs were FACS-sorted from dissected guts and were cultured in
the presence or absence of GDNF for 14 days. RNA was isolated from ENCCs as well as
whole gut tissue and relative RNA expression levels were quantified by microarray
analysis. C) ENCCs expressed the neural crest marker p75/NGFR at both 24 hours and
14 days after isolation. Pronounced βIII-Tubulin-positive connections between NLBs
were observed in ENCC cultures treated with GDNF.
GDNF treatment of ENCCs induces terminal neuronal differentiation
RET signaling is critical during all stages of ENS development and is important for
survival, neuronal differentiation and neurite outgrowth at E14-E167,14,15,18,19. To
identify the molecular pathways underlying these processes, the changes in gene
expression in ENCCs after 14 days of GDNF treatment were analyzed. Compared to
untreated ENCCs, the expression of 445 genes was at least 1.8 times higher in
GDNF-treated ENCCs. The neuronal signaling genes that were highly expressed by
ENCCs compared to the gut were even further up-regulated upon GDNF treatment
(Figure 2). This finding suggests that GDNF treatment induced terminal neuronal
differentiation in E14.5 ENCCs and is consistent with the pronounced βIII-Tubulin-
positive connections that were observed between NLBs grown from GDNF-treated
ENCCs (Figure 1C). Additionally, the expression of 352 genes was reduced at least
1.8-fold by treatment with GDNF, including genes involved in biological- and
cell adhesion and neuron differentiation. The adhesion molecules that are down-
regulated by GDNF include Cdh13, Col5a1, Fibronectin-1, Reelin, Thrombospondin 1
and 4, and Vcam1. These adhesion molecules are involved in cell migration and cell
motility, and thus corroborate the observation that ENCCs stop migrating when
they undergo terminal neuronal differentiation38,39. Consistent with its
proliferative role in ENCCs and its inhibitory effect on neuronal differentiation,
endothelin receptor type B (Ednrb) was down-regulated by GDNF. The
down-regulation of genes involved in neuron differentiation and up-regulation of
neuronal signaling genes shows that different genes are involved in early and late
neuronal development.

Figure 2. Gene Ontology analysis of differentially expressed genes. Differentially
expressed genes between ENCCs and the gut, and between untreated and GDNF-treated
ENCCs were annotated by Gene Ontology enrichment analysis. Values represent multiple
testing corrected (Benjamini) p-values. The six most significant Gene Ontology terms
per gene set are given and these are partially overlapping between the different
sets of genes.
The mature ENS displays a great diversity in neuronal subtypes, but the
molecular mechanisms controlling neuronal subtype specification in the ENS are
poorly understood. To test whether RET signaling in ENCCs contributes to the
specification of neuronal subtypes, the expression of neuronal subtype markers
was determined in untreated and GDNF-treated ENCCs. The expression levels of
cholinergic (Acat2, Ache, Chrna5), glutamatergic (Grik2, Gria1, Grid2) and
GABAergic (Gabra1, Gabrg2, Gabrg3) markers were increased by GDNF, as well as
the expression of CART prepropeptide (Cartpt), NPY receptor Y2 (Npy2r),
Substance P precursor Tac1, Dopamine beta-hydroxylase (Dbh) and Vasoactive
intestinal peptide (Vip). The elevated expression of these subtype markers
supports our observation that RET signaling is involved in terminal neuronal
differentiation of ENCCs. Although several other subtype markers, such as 5-HT,
Calbindin, Calretinin, CGRP, Enkephalin, and NOS were not found to be
differentially expressed, the wide range of neuronal subtype markers that are
up-regulated by GDNF suggests that RET signaling might be involved in the
specification of different neuronal subtypes.
Upstream Regulator Analysis identifies known and new genes important for
ENS development
Having identified gene expression patterns in untreated and GDNF-treated ENCCs,
we next aimed to identify regulatory genes in ENCCs and in GDNF-induced
neuronal maturation. To address this, we used the Upstream Regulator Analysis
tool in Ingenuity Pathway Analysis (IPA). Upstream Regulators are predicted to
regulate the expression of differentially expressed genes, taking into account the
enrichment for genes that are known to be regulated by a particular Upstream
Regulator (p-value of overlap) and the direction of differential gene expression
(activation z-score)40. Many genes that are known to have important regulatory
functions in ENS development are highly expressed by ENCCs. Therefore, genes
were selected from the Upstream Regulator Analysis that were either highly
expressed by ENCCs or up-regulated by GDNF. The Upstream Regulator Analysis of
differential gene expression between ENCCs and the gut yielded nine regulatory
genes that are highly expressed by ENCCs themselves (Table 1).
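For orientation, the two statistics named above can be sketched in a few lines. The sketch assumes the simplest, unweighted form of an activation z-score (the sum of +1/-1 agreement signs over the regulator's targets, divided by the square root of their number) and a right-tailed hypergeometric (Fisher's exact) test for the overlap. Whether IPA applies exactly these formulas, including any weighting or bias correction, is not stated in this chapter, so the code should be read as an illustration of the idea rather than a reimplementation; the function names are hypothetical.

```python
from math import comb, sqrt


def activation_z_score(signs):
    """Unweighted activation z-score (assumed form).

    signs: one entry per target gene, +1 if the observed change agrees with
    an activated regulator and -1 if it agrees with an inhibited regulator.
    """
    if not signs:
        raise ValueError("at least one target gene is required")
    return sum(signs) / sqrt(len(signs))


def overlap_p_value(n_universe, n_known_targets, n_dataset, n_overlap):
    """Right-tailed hypergeometric probability of seeing >= n_overlap known
    targets among n_dataset differentially expressed genes."""
    upper = min(n_known_targets, n_dataset)
    tail = sum(
        comb(n_known_targets, k) * comb(n_universe - n_known_targets, n_dataset - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_universe, n_dataset)


# Hypothetical regulator with nine targets, all changed in the direction
# expected for an activated regulator.
z = activation_z_score([1] * 9)

# Hypothetical overlap: 5 of 5 known targets found among 5 changed genes
# drawn from a universe of 10 genes.
p = overlap_p_value(10, 5, 5, 5)
print(z, p)
```

Under these assumptions a regulator passes the thresholds used in this study when its z-score exceeds 1.8 and its overlap p-value is below 0.05.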
The most significant Upstream Regulator, in terms of both activation z-
score and fold change in expression, is Cdkn2a (p16). CDKN2A induces cell cycle
arrest and thereby inhibits proliferation. Another Upstream Regulator, Elavl1,
mediates the anti-proliferative activity of CDKN2A. Cdkn2a and Elavl1 activity thus
explain the low expression of cell cycle-related genes in ENCCs compared to the
gut. Nupr1 is a predicted Upstream Regulator that is also involved in cell cycle
control, and is important for cell cycle progression during cellular stress41–43. In
our study, Nupr1 might be up-regulated in ENCCs to counteract the stress induced
by in vitro culturing of ENCCs.
In line with its function in ENS development, Ret is a predicted Upstream
Regulator of gene expression in ENCCs. Another gene that is known to regulate
ENS development, Sox10, is a predicted Upstream Regulator as well. SOX10 is a
transcription factor that is required for survival of ENCCs and maintenance of their
progenitor state44,45. The identification of Ret and Sox10 as regulatory genes in
ENCCs demonstrates the validity of the Upstream Regulator Analysis and suggests
that the remaining Upstream Regulators and their respective pathways are key
regulators of ENS development. These remaining Upstream Regulators are Bdnf,
App, Mapk10 and Timp1.

Table 1. Upstream Regulators of gene expression in naïve and GDNF-treated ENCCs.
Dataset        Upstream Regulator  Activation z-score  p-value of overlap  Target genes  Fold change  Link to HSCR
High in ENCCs  Cdkn2a              5.196               1.65E-07            55            467.6        -
High in ENCCs  Nupr1               4.740               2.51E-16            127           6.2          -
High in ENCCs  Bdnf                3.286               1.37E-05            46            9.2          Regulator of SNAP25 and VAMP2; migration of ENCCs (33)
High in ENCCs  App                 2.931               1.47E-12            157           7.1          Overexpression (Down syndrome) causes neuronal loss (57–59); Regulator of SNAP25 and VAMP2; migration of ENCCs (33); Inhibitor of MMP2 that is important for migration of ENCCs (60)
High in ENCCs  Sox10               2.596               1.56E-03            9             19.8         Known HSCR gene (61)
High in ENCCs  Ret                 2.359               9.10E-05            29            16.5         Known HSCR gene (62,63)
High in ENCCs  Mapk10              2.195               6.71E-03            5             22.7         Deletion in HSCR patient (64); Phosphorylates APP and STMN2 (65,66)
High in ENCCs  Elavl1              2.183               1.98E-02            16            5.4          -
High in ENCCs  Timp1               1.890               1.38E-02            7             7.9          Inhibitor of MMP2 that is important for migration of ENCCs (60)
Up by GDNF     Adora2a             1.941               8.39E-02*           10            1.9          -
Up by GDNF     Npy2r               1.727*              2.58E-03            4             1.9          Maps to HSCR susceptibility locus 4q31.3-q32.3 (88)
Regulatory genes in ENCCs were predicted by IPA Upstream Regulator Analysis. Significant Upstream
Regulators were defined as having an activation z-score >1.8 and p-value of overlap <0.05. From the
Upstream Regulator Analysis, genes were selected that were expressed >5 times higher in ENCCs than
in the gut, or that were >1.8 times up-regulated by GDNF. Many Upstream Regulators can be linked to
ENS development and HSCR (also see Figure 4). *: Near-significant values. -: no link listed.
Brain-derived neurotrophic factor (Bdnf) is expressed by ENCCs and the
mucosa and its receptors TRKB and p75/NGFR are abundantly expressed in the
human and mouse ENS46,47. BDNF signaling promotes neuronal survival, growth
and differentiation and functions in the mature ENS by controlling peristalsis and
colonic emptying48,49.
Amyloid precursor protein (App) is the precursor of Amyloid-β plaques in
Alzheimer disease. The endogenous function of APP is not completely understood,
but APP has been shown to be involved in synapse formation and plasticity and is
expressed in the human ENS50–53.
Mitogen-activated protein kinase 10 (Mapk10/Jnk3) is highly expressed by
ENCCs32. MAPK10 is involved in a broad range of processes, including
proliferation, differentiation, inflammation, cellular stress and neuronal apoptosis
and is regulated by RET signaling in ENCCs34.
Timp1 encodes a metallopeptidase inhibitor and is a predicted Upstream
Regulator of gene expression in ENCCs. To our knowledge, the expression of Timp1
in the ENS has not been described before. Besides Timp1, Timp2 and Timp3 are
also more abundantly expressed in ENCCs than in the gut in our data. TIMP1 might
contribute to ENS development in several ways. TIMP1 is involved in Epithelial-
Mesenchymal Transition and thereby in the delamination of neural crest cells
from the neuroepithelium54. In CNS development, TIMP1 regulates neural stem cell
chemotaxis and neurite outgrowth in cortical neurons and might serve similar
functions during ENS development55,56.
Adenosine and Neuropeptide Y signaling modulate GDNF-induced changes in
gene expression
Upstream Regulator Analysis of GDNF-induced changes in gene expression did not
result in any genes that have both a significant activation z-score and p-value.
There are, however, two Upstream Regulators with near-significant z-scores and
p-values: the adenosine receptor A2a (Adora2a) and the neuropeptide Y receptor Y2
(Npy2r). Adenosine regulates intestinal secretion, motility and immune responses
and acts through adenosine receptors A1, A2a, A2b and A3. The adenosine A2a
receptor is expressed by the myenteric- and submucosal plexuses in both the small
and large intestine67. ADORA2A modulates intestinal motility by inhibiting motor
activity in the duodenum and colon of mice and humans, respectively68,69, thereby
allowing relaxation of the bowel.
Npy2r encodes the Neuropeptide Y receptor Y2, a receptor for
neuropeptide Y (NPY) and peptide YY (PYY)70. NPY and PYY induce colonic smooth
muscle relaxation and inhibit intestinal contractility and motility in response to
feeding, which is exerted through the neuropeptide Y1 and Y2 receptors71,72.
Synergistic regulation of gene expression
The predicted Upstream Regulators each have between 4 and 157 downstream
targets, combining up to 377 unique genes. Although a single Upstream Regulator
regulates the expression of most of these genes, there are also 67 genes that are
regulated by multiple regulatory genes (Figure 3). Many of the genes that are
regulated by three or more Upstream Regulators are known to play a role in the
ENS, such as Il-6, c-Jun, (p75/)Ngfr, Prnp, App, Napb, Nos1 and Npy. Hierarchical
clustering of downstream target genes showed that there are clusters of co-
regulated genes with similar functions. For example, Gap43, Bdnf, Mapt, Mbp, Penk,
Snap25, Syn1, Syn2, Syp, Tubb3 and Vamp2 all are involved in synaptic signaling
and are synergistically regulated by Bdnf and App. Likewise, Cdc25c, Ccna2,
Igf2bp3, Mki67, Skp2 and Tmpo are all involved in cell division and are regulated by
both Cdkn2a and Nupr1. The synergistic regulation of downstream target genes by
multiple Upstream Regulators provides insight into the molecular networks
underlying ENS development.
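The counting behind this kind of overlap can be sketched as follows. The regulator-to-target mapping below is a small toy subset: the shared synaptic and cell-division genes are taken from the examples in the text, while `ExampleGeneX` and `ExampleGeneY` are hypothetical placeholders for singly regulated targets, and `shared_targets` is a hypothetical helper name. It is not the full set of 377 genes.

```python
from collections import defaultdict


def shared_targets(regulator_targets, min_regulators=2):
    """Return {gene: sorted regulators} for genes that are targets of at
    least min_regulators Upstream Regulators."""
    regulators_by_gene = defaultdict(set)
    for regulator, targets in regulator_targets.items():
        for gene in targets:
            regulators_by_gene[gene].add(regulator)
    return {
        gene: sorted(regulators)
        for gene, regulators in regulators_by_gene.items()
        if len(regulators) >= min_regulators
    }


# Toy mapping; ExampleGeneX and ExampleGeneY are hypothetical placeholders.
example = {
    "Bdnf": {"Gap43", "Snap25", "Syn1"},
    "App": {"Gap43", "Snap25", "Syn1", "ExampleGeneX"},
    "Cdkn2a": {"Mki67", "Skp2"},
    "Nupr1": {"Mki67", "Skp2", "ExampleGeneY"},
}

shared = shared_targets(example)
unique_genes = set().union(*example.values())
print(len(unique_genes), "unique targets,", len(shared), "with multiple regulators")
```

Applied to the complete target lists, the same two counts would correspond to the number of unique downstream genes and the number of genes with more than one regulator reported above.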
Figure 3. Upstream Regulators regulate gene expression synergistically. The regulation of
downstream target genes (right-hand side) by Upstream Regulators (bottom) is shown for all genes
that are regulated by two or more Upstream Regulators. Red and blue boxes indicate regulation by an
Upstream Regulator in naïve and GDNF-treated ENCCs, respectively. Yellow indicates that the gene is
not regulated by the specific Upstream Regulator.
DISCUSSION
In this study we have analyzed gene expression profiles of GDNF-treated and
untreated mouse ENCCs (isolation time E14.5) and E14.5 mouse embryonic gut.
Unbiased, systematic Gene Ontology analysis showed that, compared to the whole
gut, ENCCs highly express genes involved in cell adhesion, neuronal differentiation
and synaptic signaling, representing various aspects of ENS development. These
data corroborate findings by Heanue and Pachnis32. These authors compared gene
expression profiles of E15.5 ganglionic (Ret+/+) and aganglionic (Retk-/k-) gut and
found that ENCCs express markers of early differentiating glia and neurons,
terminally differentiated neurons, axonal outgrowth and synaptogenesis. In fact, of
the 47 genes that were confirmed to be expressed in the ENS by in situ
hybridization in that study, 35 genes (74%) were highly expressed by ENCCs in
our study. Likewise, the gene expression study by Vohra et al. confirmed
expression in the ENS for 38 genes by in situ hybridization, of which 23 genes
(61%) were differentially expressed between ENCCs and the gut in the present
study33. Of the 105 ENCC-expressed genes found by Iwashita et al., 19 genes (18%)
overlap with our data31. Importantly, this shows that the 14 days in vitro culturing
of ENCCs in our study did not lead to major changes in gene expression. Of the 975
genes that were more highly expressed in ENCCs than in total gut tissue, 848 genes
(87%) were not reported in any of the three expression studies mentioned above.
However, some of these 848 genes are already known to be involved in ENS
development73,74. This study therefore represents a more complete catalog of
known and newly identified genes that are expressed in the developing ENS.
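The overlap percentages quoted in this paragraph can be rechecked with a few lines of arithmetic. The sketch below is only an illustration: the counts are taken from the text, while the dictionary labels and the helper name `percent` are chosen here for readability.

```python
# Counts quoted in the text: (overlapping genes, size of the comparison set).
overlaps = {
    "Heanue and Pachnis, in situ confirmed": (35, 47),
    "Vohra et al., in situ confirmed": (23, 38),
    "Iwashita et al., ENCC-expressed": (19, 105),
    "ENCC-enriched genes not previously reported": (848, 975),
}


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


for label, (part, whole) in overlaps.items():
    print(f"{label}: {part}/{whole} = {percent(part, whole):.1f}%")
```

Rounded to whole percentages, these values agree with the figures given in the text (74%, 61%, 18% and 87%).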
Regulators of gene expression in ENCCs are putative HSCR genes
Cdkn2a, Nupr1, Ret, Sox10, Bdnf, App, Mapk10, Elavl1 and Timp1 were identified as
genes that are highly expressed in ENCCs and that regulate the expression of other
ENCC-expressed genes. Of these regulators, RET and SOX10 are known HSCR genes.
Mutations in RET and SOX10 contribute to HSCR in humans, and mice with null
mutations for Ret and Sox10 display total intestinal aganglionosis9,61–63,75,76. Given
the known neuronal functions of Bdnf, App, Mapk10 and Timp1, this raises the
hypothesis that these regulatory genes might also be implicated in HSCR and other
neuropathies, such as intestinal pseudo-obstruction, incontinence or constipation.
Mapk10 is a highly interesting regulator of gene expression in ENCCs with regard
to HSCR. MAPK10 modulates neuronal differentiation through phosphorylation of
APP and STMN265,66. STMN2 (SCG10) interacts with Kinesin Binding Protein (KBP,
KIAA1279) and mutations in KBP have been linked to the HSCR-associated
Goldberg-Shprintzen Syndrome77,78. Loss of MAPK10 could therefore contribute to
HSCR through impaired APP and/or STMN2 signaling in enteric neuron
differentiation. Since Mapk10 was reported to be highly expressed in ENCCs, it was
included in a copy number variant screen for candidate genes in HSCR64. The
authors reported one deletion in MAPK10 among 18 HSCR patients. Segregation
analysis showed that the MAPK10 deletion was neither necessary nor sufficient to
cause HSCR in this family and the authors therefore hypothesized that the deletion
in MAPK10 acts as a modifier for developing HSCR64.
ENS development is of interest in relation to Down syndrome (Trisomy
21). The incidence of HSCR in Down syndrome is 2.6%, which is over a hundred
times higher than in the general population79. This suggests that the duplication of
one or more genes on chromosome 21 contributes to the onset of HSCR. APP is
located on chromosome 21 in humans and App is an Upstream Regulator of gene
expression in the developing murine ENS. It has been shown that overexpression
of wild type human APP in mice causes loss of cholinergic neurons through
disruption of Nerve Growth Factor transport and impairs neurogenesis in the
CNS57,58. Moreover, mice overexpressing an Alzheimer’s disease mutant form of
APP show reduced neuronal density in the myenteric plexus and altered intestinal
motility80. In the latter study it is not clear, however, whether the phenotype was
caused by the increased expression level of APP or by the presence of the
Alzheimer mutation in the overexpressed gene. These reports, combined with the
regulatory function of App in ENCCs, raise the intriguing hypothesis that
overexpression of APP might contribute to the high incidence of HSCR and other
gastrointestinal abnormalities in Down syndrome patients.
In our Upstream Regulator Analysis, App and Bdnf shared several
downstream target molecules that are involved in synaptic signaling, including
Snap25 and Vamp2. Snap25 and Vamp2 mediate the fusion of synaptic vesicles and
the high expression of these genes in ENCCs was reported by Vohra et al., who
subsequently showed that Snap25 and Vamp2 are important for ENCC migration
and neurite outgrowth33. Since both genes are regulated by APP and BDNF,
aberrant APP and BDNF signaling might result in similar defects in ENS
development.
The Upstream Regulator Timp1 is a natural inhibitor of matrix
metalloproteinases (MMPs), a group of proteins that rearrange the extracellular
matrix. MMP2 activity is inhibited by TIMP1 and pharmacological inhibition of
MMP2 impairs the migration of ENCCs along the mouse gut60. Furthermore, the
chemoattractive effect of TIMP1 on neural stem cells is mediated by β1-Integrin,
and lack of β1-Integrin in ENCCs causes a HSCR-like phenotype in mice55,81.
Another member of the TIMP family, TIMP2, is required for the migration of
cardiac neural crest cells in avians82. These data highlight the importance of the
extracellular matrix in neural crest migration and suggest a novel role for TIMP
proteins in ENS development.
Of note, a metalloproteinase inhibitor domain that inhibits MMP2 has been
identified in APP83, indicating that APP inhibits MMP2 activity directly.
Additionally, APP may inhibit MMP2 indirectly, as it was shown that mice
overexpressing Alzheimer mutant APP and Presenilin-1 have higher levels of
TIMP1 in their ileum than wild-type mice84. Inhibition of MMP2 is another
mechanism by which overexpression of APP might contribute to the development
of HSCR in Down syndrome.
As discussed here, the Upstream Regulators with neuronal functions are
either known HSCR genes, or are linked to the molecular pathways underlying
HSCR or animal models of aganglionosis. There is a large body of literature
suggesting interactions between these genes, allowing us to construct a
hypothetical network of regulatory genes in ENCCs (Figure 4). The transcription
factor SOX10 binds to an enhancer of Ret, thereby regulating Ret expression85. In
turn, RET activates the MAPK pathway, of which MAPK10 is a part34,86. MAPK10
phosphorylates APP, which together with BDNF regulates the expression of the
synaptic genes Snap25 and Vamp2 and inhibits MMP2 activity, both directly and,
via TIMP1, indirectly65,83,84. MAPK10 also phosphorylates STMN2, leading to
cytoskeletal remodeling in conjunction with KBP66,77.
Modulators of RET-signaling in ENCCs
Several signaling cascades are activated by RET, including the PI3K/AKT, ERK,
p38-MAPK and JNK pathways86. Little is known, however, about how these pathways
contribute to RET-induced proliferation, migration, differentiation and survival of
ENCCs. Similar to our study, Ngan et al. analyzed the changes in gene expression
induced by GDNF treatment of ENCCs and found that the expression levels of 29
genes were increased by GDNF treatment. Only 4 of these 29 genes were
up-regulated by GDNF in our data. These different findings can be explained by two
fundamental differences in study design: the duration of GDNF stimulation and the
developmental stage of the ENCCs. Ngan et al. investigated the short-term changes (8
and 16 hours of GDNF treatment) in E11.5 ENCCs, when RET is required for migration
and proliferation. In the present study, on the other hand, we analyzed the
long-term changes (14 days of GDNF treatment) in ENCCs at E14.5, a stage at which
RET is involved in survival and differentiation. The four genes that are
up-regulated by GDNF treatment in both studies (Kcnd2, Vip, Olfm1 and Tagln3) are
therefore likely to be critical modulators of RET signaling and are possibly
involved in different aspects of ENS development.
Long-term GDNF treatment of ENCCs resulted in reduced expression
levels of genes involved in neuronal migration and early neuronal differentiation.
Conversely, GDNF increased the expression of genes that are associated with
terminal neuronal differentiation and synaptic signaling. Upstream Regulator
Analysis showed that the adenosine receptor A2a and the neuropeptide Y receptor
Y2 are near-significant modifiers of these cellular processes in GDNF-treated
ENCCs. Signaling via NPY2R modulates the neurotrophic effect of GDNF on ENCCs,
as a study by Anitha et al. has shown that GDNF treatment of ENCCs results in
increased proliferation and reduced apoptosis of ENCCs in an NPY-dependent
manner87. Of interest, the human orthologue of Npy2r maps to the HSCR
susceptibility locus 4q31.3-q32.3, and increased NPY levels have been reported in
aganglionic bowel segments from HSCR patients88,89.
Figure 4. Hypothetical molecular network implicating Upstream Regulators in HSCR. The
Upstream Regulators in ENCCs are either known HSCR genes (RET and SOX10), can be linked to a
known HSCR gene (KBP), or can be linked to genes that are critical for colonization of the gut by ENCCs
(SNAP25, VAMP2 and MMP2). The link of each Upstream Regulator to HSCR and the respective
references are summarized in Table 1. Arrows indicate activation and blunted lines indicate inhibition.
Solid lines represent direct interactions in ENCCs. Dotted lines represent indirect interactions and
evidence from other cell types.
Concluding remarks
In conclusion, systematic analysis of the gene expression profiles of E14.5 ENCCs
showed that they express markers of neural migration, axonal outgrowth and neuronal
signaling. Upstream Regulator Analysis predicted genes that serve critical
regulatory functions during ENS development and we propose a mechanistic
network of action for these regulatory genes. Additionally, we found that GDNF
treatment of ENCCs caused terminal neuronal differentiation into a variety of
neuronal subtypes. These regulatory genes in ENS development give insight into
the molecular pathways regulating ENS development and provide novel candidate
genes for HSCR.
FUNDING ACKNOWLEDGEMENT
This study was supported by a research grant from ZonMW (TOP-subsidie
40-00812-98-10042).
REFERENCES
1. Gershon, M. D. Nerves, reflexes, and the enteric nervous system: pathogenesis of the irritable bowel syndrome. J. Clin. Gastroenterol. 39, S184–93 (2005).
2. Furness, J. B. The Enteric Nervous System. (Blackwell Publishing, 2006).
3. Le Douarin, N. M. & Kalcheim, C. The Neural Crest 2nd edition. (1999).
4. Durbec, P. L., Larsson-Blomberg, L. B., Schuchardt, A., Costantini, F. & Pachnis, V. Common origin and developmental dependence on c-ret of subsets of enteric and sympathetic neuroblasts. Development 122, 349–58 (1996).
5. Druckenbrod, N. R. & Epstein, M. L. The pattern of neural crest advance in the cecum and colon. Dev. Biol. 287, 125–33 (2005).
6. Burns, A. J. & Le Douarin, N. M. The sacral neural crest contributes neurons and glia to the post-umbilical gut: spatiotemporal analysis of the development of the enteric nervous system. Development 125, 4335–47 (1998).
7. Young, H. M. et al. GDNF is a chemoattractant for enteric neural cells. Dev. Biol. 229, 503–16 (2001).
8. Durbec, P. et al. GDNF signalling through the Ret receptor tyrosine kinase. Nature 381, 789–793 (1996).
9. Schuchardt, A., D’Agati, V., Larsson-Blomberg, L., Costantini, F. & Pachnis, V. Defects in the kidney and enteric nervous system of mice lacking the tyrosine kinase receptor Ret. Nature 367, 380–383 (1994).
10. Enomoto, H. et al. GFR alpha1-deficient mice have deficits in the enteric nervous system and kidneys. Neuron 21, 317–324 (1998).
11. Pichel, J. G. et al. Defects in enteric innervation and kidney development in mice lacking GDNF. Nature 382, 73–76 (1996).
12. Sánchez, M. P. et al. Renal agenesis and the absence of enteric neurons in mice lacking GDNF. Nature 382, 70–73 (1996).
13. Moore, M. W. et al. Renal and neuronal abnormalities in mice lacking GDNF. Nature 382, 76–79 (1996).
14. Chalazonitis, A., Rothman, T. P., Chen, J. & Gershon, M. D. Age-dependent differences in the effects of GDNF and NT-3 on the development of neurons and glia from neural crest-derived precursors immunoselected from the fetal rat gut: expression of GFRalpha-1 in vitro and in vivo. Dev. Biol. 204, 385–406 (1998).
15. Taraviras, S. et al. Signalling by the RET receptor tyrosine kinase and its role in the development of the mammalian enteric nervous system. 2797, 2785–2797 (1999).
16. Gianino, S., Grider, J. R., Cresswell, J., Enomoto, H. & Heuckeroth, R. O. GDNF availability determines enteric neuron number by controlling precursor proliferation. Development 130, 2187–2198 (2003).
17. Uesaka, T., Nagashimada, M., Yonemura, S. & Enomoto, H. Diminished Ret expression compromises neuronal survival in the colon and causes intestinal aganglionosis in mice. J. Clin. Invest. 118, 1890–1898 (2008).
18. Hearn, C. J., Murphy, M. & Newgreen, D. GDNF and ET-3 differentially modulate the numbers of avian enteric neural crest cells and enteric neurons in vitro. Dev. Biol. 197, 93–105 (1998).
19. Natarajan, D., Marcos-Gutierrez, C., Pachnis, V. & de Graaff, E. Requirement of signalling by receptor tyrosine kinase RET for the directed migration of enteric nervous system progenitor cells during mammalian embryogenesis. Development 129, 5151–60 (2002).
20. Hao, M. M. & Young, H. M. Development of enteric neuron diversity. J. Cell. Mol. Med. 13, 1193–1210 (2009).
21. Attié, T. et al. Diversity of RET proto-oncogene mutations in familial and sporadic Hirschsprung disease. Hum. Mol. Genet. 4, 1381–6 (1995).
22. Hofstra, R. M. et al. RET and GDNF gene scanning in Hirschsprung patients using two dual denaturing gel systems. Hum. Mutat. 15, 418–29 (2000).
23. Brooks, A. S., Oostra, B. A. & Hofstra, R. M. W. Studying the genetics of Hirschsprung’s disease: Unraveling an oligogenic disorder. Clin. Genet. 67, 6–14 (2005).
24. Amiel, J. et al. Hirschsprung disease, associated syndromes and genetics: a review. J. Med. Genet. 45, 1–14 (2008).
25. Alves, M. M. et al. Contribution of rare and common variants determine complex diseases-Hirschsprung disease as a model. Dev. Biol. 382, 320–9 (2013).
26. Ramalho-Santos, M., Melton, D. A. & McMahon, A. P. Hedgehog signals regulate multiple aspects of gastrointestinal development. Development 127, 2763–2772 (2000).
27. Okamura, Y. & Saga, Y. Notch signaling is required for the maintenance of enteric neural crest progenitors. Development 135, 3555–3565 (2008).
28. Ikeya, M., Lee, S. M., Johnson, J. E., McMahon, A. P. & Takada, S. Wnt signalling required for expansion of neural crest and CNS progenitors. Nature 389, 966–70 (1997).
29. Sato, Y. & Heuckeroth, R. O. Retinoic acid regulates murine enteric nervous system precursor proliferation, enhances neuronal precursor differentiation, and reduces neurite growth in vitro. Dev. Biol. 320, 185–98 (2008).
30. Goldstein, A. M., Brewer, K. C., Doyle, A. M., Nagy, N. & Roberts, D. J. BMP signaling is necessary for neural crest cell migration and ganglion formation in the enteric nervous system. Mech. Dev. 122, 821–33 (2005).
31. Iwashita, T., Kruger, G. M., Pardal, R., Kiel, M. J. & Morrison, S. J. Hirschsprung disease is linked to defects in neural crest stem cell function. Science 301, 972–976 (2003).
32. Heanue, T. A. & Pachnis, V. Expression profiling the developing mammalian enteric nervous system identifies marker and candidate Hirschsprung disease genes. Proc. Natl. Acad. Sci. U. S. A. 103, 6919–24 (2006).
33. Vohra, B. P. S. et al. Differential gene expression and functional analysis implicate novel mechanisms in enteric nervous system precursor migration and neuritogenesis. Dev. Biol. 298, 259–271 (2006).
34. Ngan, E. S. W. et al. Prokineticin-1 (Prok-1) works coordinately with glial cell line-derived neurotrophic factor (GDNF) to mediate proliferation and differentiation of enteric neural crest cells. Biochim. Biophys. Acta - Mol. Cell Res. 1783, 467–478 (2008).
35. McKeown, S. J., Wallace, A. S. & Anderson, R. B. Expression and function of cell adhesion molecules during neural crest migration. Dev. Biol. 373, 244–257 (2013).
36. Carrasquillo, M. M. et al. Genome-wide association study and mouse model identify interaction between RET and EDNRB pathways in Hirschsprung disease. Nat. Genet. 32, 237–44 (2002).
37. Schneider, M. R. et al. A key role for E-cadherin in intestinal homeostasis and paneth cell maturation. PLoS One 5, e14325 (2010).
38. Young, H. M. et al. Dynamics of neural crest-derived cell migration in the embryonic mouse gut. Dev. Biol. 270, 455–473 (2004).
39. Young, H. M., Turner, K. N. & Bergner, A. J. The location and phenotype of proliferating neural-crest-derived cells in the developing mouse gut. Cell Tissue Res. 320, 1–9 (2005).
40. Krämer, A., Green, J., Pollard, J. & Tugendreich, S. Causal analysis approaches in Ingenuity Pathway Analysis. Bioinformatics 30, 523–30 (2014).
41. Gironella, M. et al. p8/nupr1 regulates DNA-repair activity after double-strand gamma irradiation-induced DNA damage. J. Cell. Physiol. 221, 594–602 (2009).
42. Guo, X. et al. Lentivirus-mediated RNAi knockdown of NUPR1 inhibits human nonsmall cell lung cancer growth in vitro and in vivo. Anat. Rec. (Hoboken). 295, 2114–21 (2012).
43. Hamidi, T. et al. Nupr1-aurora kinase A pathway provides protection against metabolic stress-mediated autophagic-associated cell death. Clin. Cancer Res. 18, 5234–46 (2012).
44. Kapur, R. P. Early death of neural crest cells is responsible for total enteric aganglionosis in Sox10(Dom)/Sox10(Dom) mouse embryos. Pediatr. Dev. Pathol. 2, 559–569 (1999).
45. Bondurand, N., Natarajan, D., Barlow, A., Thapar, N. & Pachnis, V. Maintenance of mammalian enteric nervous system progenitors by SOX10 and endothelin 3 signalling. Development 133, 2075–2086 (2006).
46. Hoehner, J. C., Wester, T., Pahlman, S. & Olsen, L. Localization of neurotrophins and their high-affinity receptors during human enteric nervous system development. Gastroenterology 110, 756–767 (1996).
47. Lommatzsch, M. et al. Abundant production of brain-derived neurotrophic factor by adult visceral epithelia. Implications for paracrine and target-derived Neurotrophic functions. Am. J. Pathol. 155, 1183–93 (1999).
48. Coulie, B. et al. Recombinant human neurotrophic factors accelerate colonic transit and relieve constipation in humans. Gastroenterology 119, (2000).
49. Grider, J. R., Piland, B. E., Gulick, M. A. & Qiao, L. Y. Brain-derived neurotrophic factor augments peristalsis by augmenting 5-HT and calcitonin gene-related peptide release. Gastroenterology 130, 771–780 (2006).
50. Turner, P. R., O’Connor, K., Tate, W. P. & Abraham, W. C. Roles of amyloid precursor protein and its fragments in regulating neural activity, plasticity and memory. Progress in Neurobiology 70, (2003).
51. Tyan, S.-H. et al. Amyloid precursor protein (APP) regulates synaptic structure and function. Mol. Cell. Neurosci. 51, 43–52 (2012).
52. Arai, H. et al. Expression patterns of beta-amyloid precursor protein (beta-APP) in neural and nonneural human tissues from Alzheimer’s disease and control subjects. Ann. Neurol. 30, 686–693 (1991).
53. Cabal, A. et al. beta-Amyloid precursor protein (beta APP) in human gut with special reference to the enteric nervous system. Brain Res. Bull. 38, 417–23 (1995).
54. Jung, Y. S. et al. TIMP-1 induces an EMT-like phenotypic conversion in MDCK cells independent of its MMP-inhibitory domain. PLoS One 7, (2012).
55. Lee, S. Y. et al. TIMP-1 modulates chemotaxis of human neural stem cells through CD63 and integrin signalling. Biochem. J. 459, 565–76 (2014).
56. Ould-Yahoui, A. et al. A new role for TIMP-1 in modulating neurite outgrowth and morphology of cortical neurons. PLoS One 4, (2009).
57. Salehi, A. et al. Increased App Expression in a Mouse Model of Down’s Syndrome Disrupts NGF Transport and Causes Cholinergic Neuron Degeneration. Neuron 51, 29–42 (2006).
58. Trazzi, S. et al. The amyloid precursor protein (APP) triplicated gene impairs neuronal precursor differentiation and neurite development through two different domains in the ts65dn mouse model for down syndrome. J. Biol. Chem. 288, 20817–20829 (2013).
59. Semar, S. et al. Changes of the enteric nervous system in amyloid-beta protein precursor transgenic mice correlate with disease progression. J. Alzheimer’s Dis. 36, 7–20 (2013).
60. Anderson, R. B. Matrix metalloproteinase-2 is involved in the migration and network formation of enteric neural crest-derived cells. Int. J. Dev. Biol. 54, 63–69 (2010).
61. Pingault, V. et al. SOX10 mutations in patients with Waardenburg-Hirschsprung disease. Nat. Genet. 18, 171–3 (1998).
62. Edery, P. et al. Mutations of the RET proto-oncogene in Hirschsprung’s disease. Nature 367, 378–380 (1994).
63. Romeo, G. et al. Point mutations affecting the tyrosine kinase domain of the RET proto-oncogene in Hirschsprung’s disease. Nature 367, 377–378 (1994).
64. Jiang, Q., Ho, Y.-Y., Hao, L., Nichols Berrios, C. & Chakravarti, A. Copy number variants in candidate genes are genetic modifiers of Hirschsprung disease. PLoS One 6, e21219 (2011).
65. Kimberly, W. T., Zheng, J. B., Town, T., Flavell, R. A. & Selkoe, D. J. Physiological regulation of the beta-amyloid precursor protein signaling domain by c-Jun N-terminal kinase JNK3 during neuronal differentiation. J. Neurosci. 25, 5533–43 (2005).
66. Neidhart, S. et al. c-Jun N-terminal kinase-3 (JNK3)/stress-activated protein kinase-beta (SAPKbeta) binds and phosphorylates the neuronal microtubule regulator SCG10. FEBS Lett. 508, 259–64 (2001).
67. Antonioli, L. et al. Regulation of enteric functions by adenosine: Pathophysiological and pharmacological implications. Pharmacology and Therapeutics 120, 233–253 (2008).
68. Zizzo, M. G., Mastropaolo, M., Lentini, L., Mulè, F. & Serio, R. Adenosine negatively regulates duodenal motility in mice: role of A(1) and A(2A) receptors. Bull. Int. Assoc. Eng. Geol. 164, 3–11 (2011).
69. Fornai, M. et al. A1 and A2a receptors mediate inhibitory effects of adenosine on the motor activity of human colon. Neurogastroenterol. Motil. 21, 451–466 (2009).
70. Chen, C. H., Stephens, R. L. & Rogers, R. C. PYY and NPY: control of gastric motility via action on Y1 and Y2 receptors in the DVC. Neurogastroenterol. Motil. 9, 109–16 (1997).
71. Hellström, P. M. Mechanisms involved in colonic vasoconstriction and inhibition of motility induced by neuropeptide Y. Acta Physiol. Scand. 129, 549–556 (1987).
72. Tough, I. R. et al. Endogenous peptide YY and neuropeptide Y inhibit colonic ion transport, contractility and transit differentially via Y₁ and Y₂ receptors. Br. J. Pharmacol. 164, 471–84 (2011).
73. Teng, L., Mundell, N. A., Frist, A. Y., Wang, Q. & Labosky, P. A. Requirement for Foxd3 in the maintenance of neural crest progenitors. Development 135, 1615–1624 (2008).
74. Kuo, Y.-M. et al. Extensive enteric nervous system abnormalities in mice transgenic for artificial chromosomes containing Parkinson disease-associated alpha-synuclein gene mutations precede central nervous system changes. Hum. Mol. Genet. 19, 1633–50 (2010).
75. Southard-Smith, E. M., Kos, L. & Pavan, W. J. Sox10 mutation disrupts neural crest development in Dom Hirschsprung mouse model. Nat. Genet. 18, 60–64 (1998).
76. Herbarth, B. et al. Mutation of the Sry-related Sox10 gene in Dominant megacolon, a mouse model for human Hirschsprung disease. Proc. Natl. Acad. Sci. U. S. A. 95, 5161–5165 (1998).
77. Alves, M. M. et al. KBP interacts with SCG10, linking Goldberg-Shprintzen syndrome to microtubule dynamics and neuronal differentiation. Hum. Mol. Genet. 19, 3642–3651 (2010).
78. Brooks, A. S. et al. Homozygous Nonsense Mutations in KIAA1279 Are Associated with Malformations of the Central and Enteric Nervous Systems. Am. J. Hum. Genet. 77, 120–126 (2005).
79. Friedmacher, F. & Puri, P. Hirschsprung’s disease associated with Down syndrome: A meta-analysis of incidence, functional outcomes and mortality. Pediatr. Surg. Int. 29, 937–946 (2013).
80. Semar, S. et al. Changes of the enteric nervous system in amyloid-beta protein precursor transgenic mice correlate with disease progression. J. Alzheimer’s Dis. 36, 7–20 (2013).
81. Breau, M. A. et al. Lack of beta1 integrins in enteric neural crest cells leads to a Hirschsprung-like phenotype. Development 133, 1725–1734 (2006).
82. Cantemir, V., Cai, D. H., Reedy, M. V & Brauer, P. R. Tissue inhibitor of metalloproteinase-2 (TIMP-2) expression during cardiac neural crest cell migration and its role in proMMP-2 activation. Dev. Dyn. 231, 709–19 (2004).
83. Miyazaki, K., Hasegawa, M., Funahashi, K. & Umeda, M. A metalloproteinase inhibitor domain in Alzheimer amyloid protein precursor. Nature 362, 839–41 (1993).
84. Puig, K. L., Lutz, B. M., Urquhart, S. A., Rebel, A. A. & Zhou, X. Overexpression of Mutant Amyloid-beta Protein Precursor and Presenilin 1 Modulates Enteric Nervous System. J. Alzheimer’s Dis. 44, 1263–1278 (2015).
85. Lang, D. & Epstein, J. A. Sox10 and Pax3 physically interact to mediate activation of a conserved c-RET enhancer. Hum. Mol. Genet. 12, 937–45 (2003).
86. Plaza-Menacho, I., Burzynski, G. M., de Groot, J. W., Eggen, B. J. L. & Hofstra, R. M. W. Current concepts in RET-related genetics, signaling and therapeutics. Trends Genet. 22, 627–36 (2006).
87. Anitha, M. et al. Glial-Derived Neurotrophic Factor Modulates Enteric Neuronal Survival and Proliferation Through Neuropeptide Y. Gastroenterology 131, 1164–1178 (2006).
88. Brooks, A. S. et al. A novel susceptibility locus for Hirschsprung’s disease maps to 4q31.3-q32.3. J. Med. Genet. 43, e35 (2006).
89. Hamada, Y. et al. Increased neuropeptide Y-immunoreactive innervation of aganglionic bowel in Hirschsprung’s disease. Virchows Arch. A. Pathol. Anat. Histopathol. 411, 369–77 (1987).
REMOVING THE MAXIMUM ASSIGNED VALUE OF THE
GENOTYPE QUALITY SCORE ALLOWS SPECIFIC
DETECTION OF DE NOVO MUTATIONS IN NEXT-
GENERATION SEQUENCING DATA
Duco Schriemer1, Hongsheng Gui2,3, Rutger W.W. Brouwer4, Erwin Brosens5, Clara
S.M. Tang2, Christian Gilissen6, Wilfred F.J. van IJcken4, Salud Borrego7,8, Isabella
Ceccherini9, Aravinda Chakravarti10, Stanislas Lyonnet11,12, Paul K. Tam2, Maria-
Mercè Garcia-Barceló2, Bart J.L. Eggen1, Robert M.W. Hofstra5,13
1 Department of Neuroscience, section Medical Physiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
2 Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, SAR, China
3 Centre for Genomic Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, SAR, China
4 Biomics Erasmus Center for Biomics, Erasmus Medical Center, Rotterdam, The Netherlands
5 Department of Clinical Genetics, Erasmus Medical Center, Rotterdam, the Netherlands
6 Department of Human Genetics, Donders Centre for Neuroscience, Radboud University Medical Center, The Netherlands
7 Unidad de Gestión Clínica de Genética, Reproducción y Medicina Fetal Hospitales, Universitarios Virgen del Rocío, Seville, Spain
8 CIBER de Enfermedades Raras (CIBERER), ISCIII, Seville, Spain
9 UOC Genetica Medica, Istituto Gaslini, Genova, Italy
10 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, USA
11 Département de Génétique, Faculté de Médecine, Université Paris Descartes, Paris, France
12 INSERM U-781, AP-HP Hôpital Necker-Enfants Malades, Paris, France
13 Stem Cells and Regenerative Medicine, Birth Defects Research Centre UCL Institute of Child Health, London, UK
Manuscript in preparation
CHAPTER 4
ABSTRACT
De novo mutations (DNMs) play a major role in the development of rare genetic
syndromes and more common neurodevelopmental disorders. Next-generation
sequencing (NGS) allows for the unbiased analysis of DNMs in the human genome
or exome. Detecting DNMs in NGS data is challenging, as the number of sequencing
artefacts is far greater than the number of real DNMs. Thresholds for reference
allele ratio, sequencing depth, and Genotype Quality have been shown to
effectively reduce the number of sequencing artefacts. In this study, we used a
training set of 22 confirmed DNMs and 59 false positives to test how threshold
settings for these sequencing parameters affect the sensitivity and specificity of
DNM detection. We found that removing the maximum assigned value of the
Genotype Quality score allows confirmed DNMs to be distinguished from false
positives. Sequencing depth and reference allele frequency also had good and moderate
discriminative power, respectively. By combining thresholds for all parameters, we
were able to detect DNMs with 100% sensitivity and 98.3% specificity in our
training dataset. When validated in an independent dataset, the filtering method
detected all 33 confirmed DNMs and four novel DNMs, and produced one false
positive. The findings of this study provide a guideline for the efficient detection of
DNMs in NGS data.
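As a worked check of the performance figures above, sensitivity and specificity can be computed from the training set counts. This is a minimal sketch: the split of the 59 false positives into 58 rejected and 1 retained is inferred here from the reported 98.3% specificity and is not stated explicitly in the abstract, and the function names are illustrative.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real DNMs that pass the filter."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of artefacts that are rejected by the filter."""
    return true_negatives / (true_negatives + false_positives)


# Training set from the abstract: 22 confirmed DNMs, 59 false positives.
# Assumption: all 22 DNMs are retained and 58 of the 59 artefacts are removed.
train_sensitivity = sensitivity(true_positives=22, false_negatives=0)
train_specificity = specificity(true_negatives=58, false_positives=1)

print(f"sensitivity = {100 * train_sensitivity:.1f}%")
print(f"specificity = {100 * train_specificity:.1f}%")
```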
REMOVING THE MAX OF THE GQ SCORE ALLOWS SPECIFIC DETECTION OF DNMS IN NGS DATA
INTRODUCTION
A de novo mutation (DNM) is an alteration in the genome that is present for the
first time in a person as a result of a mutation in a germ cell (egg or sperm) of one
of the parents or in the fertilized egg itself. Next-generation sequencing (NGS) has
allowed the genome- and exome-wide detection of DNMs. Exome sequencing has
revealed a role for DNMs in rare genetic syndromes and more common
neurodevelopmental disorders1–13. These studies found an average of 1 to 2 exonic
DNMs per proband. However, technical artefacts in NGS data pose a challenge for
the identification of DNMs. Due to the large number of genotyped base pairs in
exome sequencing, the number of incorrectly called genotypes is several orders of
magnitude larger than the actual number of DNMs. Currently, DNM detection
methods are not fully accurate and all methods still require confirmation by
validation techniques such as Sanger sequencing to discriminate between real
DNMs and false positives. As cohort sizes in sequencing studies are expanding and
exome sequencing is being replaced by whole-genome sequencing, the number of
candidate DNMs that require experimental validation can reach thousands14–16. It
is therefore important to detect DNMs effectively and efficiently.
It has been proposed that the accuracy of genotype calling can be
improved by jointly calling the genotypes of multiple individuals17,18. Multiple-
sample genotype calling uses the observed alleles in other sequenced individuals
to calculate the prior probability of finding a given genotype. For DNMs, which are
typically found in only a single individual, the prior probability of finding a variant
is low. This means that strong evidence is required to reliably call a variant. Joint
genotype calling of multiple individuals has the additional benefit of annotating
reference allele homozygous genotypes in the variant file at positions where a
variant has been found in another individual. Since DNMs are absent in parents, it
is important to be able to discriminate between reference allele homozygous
genotypes and missing genotypes to efficiently detect DNMs.
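The effect of a low prior probability on variant calling can be illustrated with a toy Bayesian genotype model. The sketch below is not the model used by GATK or by any of the packages cited here; it assumes a binomial read-count likelihood, a fixed per-read error rate of 1%, and a Hardy-Weinberg prior derived from a hypothetical cohort allele frequency. All names and parameter values are illustrative.

```python
from math import comb


def genotype_posteriors(ref_reads, alt_reads, alt_allele_freq, error_rate=0.01):
    """Posterior probability of each genotype given read counts and a cohort prior."""
    n = ref_reads + alt_reads
    # Probability that a single read shows the alternative allele, per genotype.
    p_alt_read = {"ref/ref": error_rate, "ref/alt": 0.5, "alt/alt": 1.0 - error_rate}
    # Hardy-Weinberg prior based on the allele frequency seen in other samples.
    q = alt_allele_freq
    prior = {"ref/ref": (1 - q) ** 2, "ref/alt": 2 * q * (1 - q), "alt/alt": q ** 2}
    unnormalised = {}
    for genotype, p in p_alt_read.items():
        likelihood = comb(n, alt_reads) * p ** alt_reads * (1 - p) ** ref_reads
        unnormalised[genotype] = prior[genotype] * likelihood
    total = sum(unnormalised.values())
    return {genotype: value / total for genotype, value in unnormalised.items()}


# Same weak evidence (3 of 10 reads carry the variant), different priors.
common = genotype_posteriors(ref_reads=7, alt_reads=3, alt_allele_freq=0.2)
rare = genotype_posteriors(ref_reads=7, alt_reads=3, alt_allele_freq=1e-6)
# Stronger evidence (10 of 20 reads) with the same rare prior.
rare_strong = genotype_posteriors(ref_reads=10, alt_reads=10, alt_allele_freq=1e-6)

for label, post in (("common", common), ("rare", rare), ("rare, deep", rare_strong)):
    print(label, {g: round(p, 4) for g, p in post.items()})
```

Under these assumptions, three variant reads out of ten are enough to call a heterozygous genotype for a common variant, but not for a variant that is essentially absent from the rest of the cohort, which is the situation for a DNM. With deeper and more balanced coverage the heterozygous call is supported even under the rare prior.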
When applied to parents-proband trios, joint genotype calling can improve
the detection of DNMs19. Several software packages were developed to detect
DNMs in NGS data that all make use of trio-based joint genotype calling20–24. Each
software package has its own additional methodology to specifically detect DNMs.
PolyMutt and DeNovoGear use genomic mutation rates to calculate the likelihood
of finding a DNM at a specific position20,21. However, mutation rates vary between
genomic regions and TrioDeNovo was developed to take region-specific mutation
rates into account22. DNMFilter uses machine learning to discriminate between
DNMs and false positives based on several sequencing parameters23 and mirTrios
uses sequencing parameters in a probabilistic model to detect DNMs24.
Three sequencing parameters were shown to efficiently eliminate
incorrect genotype calls, while retaining most of the true variants: the reference
allele ratio, sequencing depth and Genotype Quality score25,26. In the present study,
we aimed to assess the relative contribution of these parameters to DNM detection
and analyzed how filtering thresholds for these parameters, in probands and
parents, affect the sensitivity and specificity of detecting DNMs.
METHODS
Exome sequencing
DNA from 20 parents-proband trios was subjected to exome sequencing in three
different centers (Table 1). Each center performed exome capture and NGS
according to in-house protocols. In short, the exomes of 5 trios were captured
using the Illumina TruSeq Exome Enrichment kit and sequenced on an Illumina
GAIIx machine. The other two centers (5 and 10 trios, respectively) both used the
Agilent SureSelect Exome Capture kit V4 and an Illumina HiSeq 2000 sequencer.
Sequencing reads from all 20 trios were aligned to the human reference genome
hg19 (build 37) using standard BWA alignment27. Multi-sample genotype calling
was performed on all individuals simultaneously using the Genome Analysis
Toolkit (GATK) Unified Genotyper 2.028.
Table 1. NGS technologies and validation results per center.
Center  Trios  Capture array          Sequencer            Mean depth  10X coverage  Validated candidates  Confirmed DNMs  Validation rate  DNMs/trio
A       5      Illumina Truseq        Illumina GAII        26.8X       74%           27                    1               3.7%             0.2
B       5      Agilent SureSelect V4  Illumina HiSeq 2000  41.9X       89%           24                    8               33.3%            1.6
C       10     Agilent SureSelect V4  Illumina HiSeq 2000  55.7X       95%           32                    15              46.9%            1.5
Parents-proband trios were sequenced in three different centers. Sequencing depth differed per center
and affected the number of confirmed DNMs and validation success accordingly.
Threshold setting to identify candidate DNMs
DNMs were identified as heterozygous variants in probands that were absent in
both parents. The Mendelian Disease Arguments function of the KGGSeq software29
was used to extract DNMs from the multi-sample VCF file. Variants were selected
to have: 1) call quality ≥50; 2) mapping quality ≥30; 3) read depth ≥5 in probands
and parents; 4) Genotype Quality score ≥10 in probands and parents; 5) reference
allele ratio ≥0.10 in heterozygous probands; 6) reference allele ratio ≤0.10 in
reference allele homozygous parents; and 7) are in coding regions of the genome
(Figure 1).
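The low-threshold selection described above can be illustrated with a short sketch. This is not the KGGSeq implementation: the class and function names are hypothetical, and only criteria 1 to 4 plus the de novo genotype pattern are covered. The allele-ratio and coding-region criteria are left out because their encoding depends on the annotation software.

```python
from dataclasses import dataclass


@dataclass
class SampleCall:
    """Per-sample values at one site (hypothetical container)."""
    genotype: str  # "0/0" (homozygous reference), "0/1" (heterozygous) or "1/1"
    depth: int     # read depth at the site
    gq: int        # Genotype Quality score


@dataclass
class TrioSite:
    """One variant site in a parents-proband trio (hypothetical container)."""
    call_quality: float     # site-level call quality
    mapping_quality: float  # site-level mapping quality
    proband: SampleCall
    father: SampleCall
    mother: SampleCall


def is_candidate_dnm(site, min_qual=50, min_mq=30, min_depth=5, min_gq=10):
    """Return True if the site passes the low-threshold DNM selection."""
    members = (site.proband, site.father, site.mother)
    # Heterozygous in the proband, homozygous reference in both parents.
    de_novo_pattern = (
        site.proband.genotype == "0/1"
        and site.father.genotype == "0/0"
        and site.mother.genotype == "0/0"
    )
    return (
        de_novo_pattern
        and site.call_quality >= min_qual
        and site.mapping_quality >= min_mq
        and all(m.depth >= min_depth for m in members)
        and all(m.gq >= min_gq for m in members)
    )
```

The thresholds are keyword arguments so that the same function could be reused with more stringent settings.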
These low threshold settings preserved most real variants in the data, yet
removed many low quality variants25. To further reduce the number of DNM
candidates, variants with a minor allele frequency above 1% in dbSNP137,
1000Genomes 201204, ESP6500AA and ESP6500EA were excluded.
Visual inspection and further selection of DNMs
Candidate DNMs after low threshold filtering were visually inspected in the
Integrative Genomics Viewer (IGV)30 to select the most likely DNMs. Candidate
DNMs were discarded if 1) all alternative allele reads were from the same strand,
2) more than two alleles were observed, 3) the proband showed the alternative
allele in <20% of the sequencing reads and a parent in >5% of the sequencing
reads, 4) only a single sequencing read supported the alternative allele, 5) ≥50% of
the sequencing reads had a base PHRED quality score <10.
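The visual inspection was performed manually in IGV, but the five discard criteria can be written down as explicit rules. The sketch below is only an illustration of those rules: the function name and its arguments are hypothetical, and the read counts and fractions are assumed to have been extracted from the alignments beforehand.

```python
def discard_candidate(alt_fwd, alt_rev, n_alleles, proband_alt_frac,
                      max_parent_alt_frac, low_quality_frac):
    """Return True if a candidate DNM fails any of the five inspection rules.

    alt_fwd, alt_rev      alternative-allele reads on forward and reverse strand
    n_alleles             number of distinct alleles observed at the site
    proband_alt_frac      fraction of proband reads with the alternative allele
    max_parent_alt_frac   highest alternative-allele fraction among the parents
    low_quality_frac      fraction of reads with base PHRED quality below 10
    """
    alt_reads = alt_fwd + alt_rev
    # 1) all alternative-allele reads come from the same strand
    same_strand = alt_reads > 0 and (alt_fwd == 0 or alt_rev == 0)
    # 2) more than two alleles observed
    multi_allelic = n_alleles > 2
    # 3) alternative allele in <20% of proband reads and >5% of a parent's reads
    skewed = proband_alt_frac < 0.20 and max_parent_alt_frac > 0.05
    # 4) only a single read supports the alternative allele
    single_read = alt_reads == 1
    # 5) at least half of the reads have a base quality below 10
    poor_quality = low_quality_frac >= 0.50
    return same_strand or multi_allelic or skewed or single_read or poor_quality
```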
Validation of DNMs
Candidate DNMs were Sanger sequenced in the proband and the parents by two
individual sequencing runs in opposite directions.
Comparison between confirmed DNMs and false positives
The reference allele ratio, sequencing depth and (unmaximized) Genotype Quality
score of confirmed DNMs and false positives were entered and plotted in GraphPad
Prism 5. Parameter values were compared between confirmed and false positive
DNMs and between probands and parents by Mann-Whitney test. GraphPad Prism
5 was subsequently used to generate receiver operating characteristic (ROC)
curves and calculate the area under the curve (AUC).
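The analyses in this study were run in GraphPad Prism. As an illustration of what the AUC measures, the sketch below uses the equivalence between the AUC and the Mann-Whitney U statistic: the AUC is the fraction of (confirmed, false positive) pairs in which the confirmed DNM has the higher parameter value, with ties counted as one half. The function name and the example values are hypothetical.

```python
def auc_from_scores(true_scores, false_scores):
    """AUC as the normalized Mann-Whitney U statistic (ties count 0.5)."""
    wins = 0.0
    for t in true_scores:
        for f in false_scores:
            if t > f:
                wins += 1.0
            elif t == f:
                wins += 0.5
    return wins / (len(true_scores) * len(false_scores))


# Hypothetical parameter values, for illustration only.
confirmed = [250, 310, 420]
false_positives = [40, 150, 260, 90]
example_auc = auc_from_scores(confirmed, false_positives)
```

An AUC of 0.5 corresponds to a parameter without discriminative power and an AUC of 1.0 to perfect separation of the two groups.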
Figure 1. DNM filtering pipeline. DNMs were identified as heterozygous variants in probands that
were absent in both parents. Low thresholds for sequencing parameters removed low quality DNM
candidates. The most likely DNMs were selected for Sanger sequencing validation by visual inspection
in IGV. Based on the characteristics of confirmed DNMs, candidates after initial filtering were re-
assessed to test if DNMs had been missed.
DNM validation cohort
A second, independent exome sequencing dataset of 25 parents-proband trios was
generated by the Beijing Genomics Institute (BGI). Exomes were captured using
Agilent SureSelect Exome Capture kit V4, NGS was performed on an Illumina
HiSeq2000 and variants were called for each sample individually by the GATK
Unified Genotyper 2.0. DNMs were selected as described for the training cohort,
with slightly more stringent thresholds for three parameters: 1) read depth ≥15 in
probands, 2) reference allele ratio ≥0.35 in probands, 3) Genotype Quality score of
99 in probands.
RESULTS
Identification of DNMs
Exome sequencing data from 20 parents-proband trios, generated by three
sequencing centers using two sequencing technologies, were used to screen for
DNMs (Table 1). Candidate DNMs were identified by filtering at low thresholds for
quality parameters and the 91 most likely DNMs (4.55 ± 3.35 per trio) were
selected for Sanger sequencing validation (Figure 1). Eight DNM candidates could
not be amplified successfully by PCR. Of the remaining 83 candidate DNMs, 24
were confirmed as DNMs and 59 candidates proved to be false positives.
The set of 91 candidate DNMs was obtained by variant filtering and
selection of the best candidates in IGV. The selection of candidates in IGV was
performed manually and was partially subjective. Therefore, all DNM candidates
after initial filtering were filtered again, using more stringent threshold settings
that were set after evaluation of the confirmed and false positive DNMs (Figure 1).
This last, more stringent analysis identified three additional candidate DNMs that
had been missed in the first selection in IGV. Sanger sequencing validation showed
that all three additional candidate DNMs were false positives.
Mosaic DNMs
Within the group of confirmed DNMs there were two outliers, with observed
reference allele ratios of 0.80 and 0.76, respectively. In line with the
skewed distribution of reference and alternative alleles in the NGS data, Sanger
sequencing validation of these two mutations showed a higher peak for the
reference allele than for the alternative allele. This finding suggests that these
DNMs were present in only part of the DNA sample and are somatic, rather than
germline DNMs. Since our focus was on the identification of germline DNMs, the
two somatic DNMs were excluded from the comparison between confirmed DNMs
and false positive DNM candidates.
Characteristics of confirmed DNMs are different from false positives
To assess what the discriminating features of true DNMs are in NGS data, reference
allele ratio, sequencing depth and Genotype Quality score were compared between
the 22 confirmed germline DNMs and 59 false positives.
Reference allele ratio
In probands, the average reference allele ratio did not differ between confirmed
DNMs and false positives (p=0.0585), but confirmed DNMs fell within a smaller
range (0.44-0.70) than false positives (0.20-0.86) (Figure 2A). In the homozygous
reference parents, base calls for the alternative allele are not expected, but were
nevertheless identified in a total of 5 sequencing reads in three different parents.
All 5 alternative allele calls had base call quality scores below 10 and were
therefore likely to be technical artefacts. In contrast, most of the reads underlying
the DNM had a high quality (PHRED>20). If the low-quality base calls do not
represent artefacts, then there likely is a low-level mosaicism at that position in
the parent. Low-level mosaicisms of a few percent cannot be detected by Sanger
sequencing, but can be identified by deep-sequencing NGS. In the parents of false
positive DNM candidates, reads supporting the alternative allele were observed in 26.3% of
candidates, and with higher base call quality scores. The presence of (low-quality)
alternate reads in the parents therefore suggests that a DNM candidate is likely,
but not necessarily, a false positive.
Sequencing Depth
The average sequencing depth of confirmed DNMs was higher than that of false
positives, in both probands and their parents (p=1.87E-7 and p=1.30E-10,
respectively) (Figure 2B). In probands, the lowest observed sequencing depth of
confirmed DNMs was 17, while 53% of false positives had a lower sequencing
depth (Figure 2B). With a minimum of 9 reads, parents showed a lower minimal
sequencing depth for confirmed DNMs than probands, although the average
sequencing depth did not differ between probands and parents (p=0.300).
Figure 2. Characteristics of confirmed DNMs and false positives in probands and parents.
Reference allele ratio, sequencing depth and second smallest genotype likelihood were plotted
for all individual data points. Lines represent mean values. A) Reference allele ratio in
probands ranges from 0.44 to 0.70, whereas this range was wider for false positives. Although
many false positives showed reads with the alternative allele in parents, these were found at
low frequency in confirmed DNMs as well. B) Sequencing depth of confirmed DNMs was higher than
that of false positives, in both probands and parents. The minimal depth of confirmed DNMs was
17 in probands and 9 in parents. C) The second smallest PHRED-scaled genotype likelihood (ssPL)
represents the unmaximized Genotype Quality score. With all values over 200, ssPL values were
remarkably high for confirmed DNMs in probands.
Genotype Quality score
A genotype call is produced for each covered position in NGS data, which can be
either AA, AB or BB. For each of these three genotypes, a likelihood score is
calculated that indicates how likely the genotype is to be true, considering the
number of reads and their qualities supporting the A and B alleles. The most likely
one of the three genotypes is selected as the genotype at that position. The
likelihood scores for the three possible genotypes are expressed as PHRED scores
(-10*log10[p-value]) and are referred to as PHRED-scaled Genotype Likelihoods
(PL). The PL scores are normalized such that the most likely genotype receives a
score of 0. The Genotype Quality (GQ) score (not to be confused with the PL score)
is defined as the second smallest of the three PL scores (the smallest always
being 0). So if the two other genotypes are very unlikely, they will have high
PL scores and hence the GQ score will also be high. The maximum value that is
assigned to the GQ score is 99, meaning that all values above 99 are set to this
maximum.
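The relation between the PL scores, the capped Genotype Quality score and its unmaximized counterpart can be shown in a few lines. The sketch below follows the definitions given above. The function names and the PL triplets are hypothetical examples, not values from the study.

```python
def second_smallest_pl(pl):
    """Second smallest of the three PHRED-scaled genotype likelihoods.

    The smallest PL is 0 (the called genotype), so the second smallest
    value is the unmaximized Genotype Quality score.
    """
    return sorted(pl)[1]


def capped_gq(pl, cap=99):
    """Genotype Quality score as defined above: the same value, capped at 99."""
    return min(second_smallest_pl(pl), cap)


# Hypothetical PL triplets (AA, AB, BB) for three heterozygous calls.
strong_call = (255, 0, 1200)
moderate_call = (120, 0, 900)
weak_call = (27, 0, 350)
```

The first two calls both receive a capped score of 99, although their uncapped values (255 and 120) differ by more than a factor of two. This is the information that is lost when only the capped score is used.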
The Genotype Quality (GQ) score is a commonly used quality parameter in
NGS data and was therefore used to compare confirmed DNMs and false positives.
All confirmed DNMs and 35 of 59 (59.3%) false positives had a Genotype Quality
(GQ) score of 99 in probands, suggesting that confirmed and false positive DNMs
had very similar Genotype Quality (GQ) scores. However, the Genotype Likelihoods
(PL) of the other two possible genotypes are specified in a VCF file as well, and no
maximum value is assigned to these scores. The second smallest PHRED-scaled
Genotype Likelihood (ssPL) can therefore be regarded as the unmaximized
Genotype Quality (GQ) score. These ssPL scores were significantly higher in
confirmed DNMs than in false positive DNM candidates, in probands (p=3.85E-14)
and in parents (p=2.61E-12) (Figure 2C). In probands, all ssPL scores of confirmed DNMs were above
200, whereas 84.7% of all false positives had ssPL values below 200. The ssPL
score provides a good distinction between true and false positive DNMs. The high
minimum ssPL values for confirmed DNMs were not observed in parents, where
the lowest ssPL value was 27.
Discriminative power of different parameters
Having compared the characteristics of confirmed DNMs and false positives, we
analyzed how the thresholds for the reference allele ratio, the sequencing depth
and the ssPL influenced the sensitivity and specificity of detecting DNMs. DNMs
could not be accurately detected based on thresholds for reference allele ratio,
neither in probands nor in parents (Figure 3A,B). A sequencing depth threshold
around 30 allowed the detection of DNMs with both a sensitivity and specificity of
~75%, in both probands and parents (Figure 3C,D). An ssPL threshold of 200 in
probands detected DNMs with 100% sensitivity and 84.7% specificity (Figure 3E).
In comparison, selection of DNMs based on a maximum Genotype Quality score of
99 identified DNMs with a specificity of only 40.7%. In parents, an ssPL threshold
of 80 yielded 79.2% sensitivity and 88.0% specificity (Figure 3F).
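How a minimum-value threshold translates into sensitivity and specificity can be sketched as follows. Sensitivity is the fraction of confirmed DNMs at or above the threshold and specificity is the fraction of false positives below it. The function name and the scores are hypothetical. The study data themselves are not reproduced here.

```python
def sensitivity_specificity(true_scores, false_scores, threshold):
    """Sensitivity and specificity of a minimum-value threshold."""
    true_positives = sum(score >= threshold for score in true_scores)
    true_negatives = sum(score < threshold for score in false_scores)
    return (true_positives / len(true_scores),
            true_negatives / len(false_scores))


# Hypothetical ssPL values, for illustration only.
confirmed = [250, 300, 410]
false_positives = [40, 150, 220, 90]
sens, spec = sensitivity_specificity(confirmed, false_positives, threshold=200)
```

Repeating the calculation over a range of thresholds gives the kind of curves plotted in Figure 3.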
Receiver Operating Characteristic (ROC) curves were generated to
determine the discriminative power of the different sequencing quality
parameters. The ssPL in probands was the best predictor for DNMs, with an area
under the curve (AUC) of 0.9607 (Figure 4). ssPL in parents and sequencing depth
in probands and parents were to a lesser extent able to discriminate DNMs from
false positives (AUC = 0.8321, 0.8463 and 0.8077, respectively). The reference
allele ratio had low predictive power. The AUC in probands was 0.6140, hardly
better than a variable that has no predictive power (AUC of 0.5). The specificity of
detecting DNMs on the basis of reference allele ratio in parents never reached
above 26.3% and the corresponding AUC is 0.6052. By combining thresholds that
allow maximum sensitivity (100%), only 1 false positive was observed (98.3%
specificity). Six false positives were found (89.8% specificity) when respecting the
maximum Genotype Quality score of 99.
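The specificities quoted for the combined thresholds follow directly from the number of false positives that still pass the filter, out of the 59 false positives in the comparison. A small check of that arithmetic (the function name is hypothetical):

```python
def specificity_from_counts(n_false_total, n_false_passing):
    """Fraction of false positives that are correctly rejected by the filter."""
    return (n_false_total - n_false_passing) / n_false_total


# 1 of 59 false positives passes with the unmaximized score, 6 of 59 with the capped score.
unmaximized = specificity_from_counts(59, 1)
capped = specificity_from_counts(59, 6)
```

Rounded to one decimal, these evaluate to 98.3% and 89.8%, matching the values reported above.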
Validation on independent data set
The characteristics of confirmed DNMs and false positives were based on NGS data
from three different centers, using two different technologies. The parameters
described thus reflect the characteristics of DNMs across multiple datasets. The
applicability to independent NGS datasets was further evaluated using a dataset
with 33 confirmed DNMs. Using conservative filtering thresholds (see Methods), all
confirmed DNMs were successfully detected with only 1 false positive. Moreover,
four additional DNM candidates were identified that had been missed in the initial
DNM screening of the confirmation set. These DNM candidates had been ignored
because they showed the alternative allele in 1.2-3.0% of the sequencing reads in
the parents. However, all four candidates were confirmed to be DNMs by Sanger
sequencing validation. A total of 9 reads with the alternative allele were observed
in these four DNMs. Three had a low base call quality (≤9), whereas the base call
Figure 3. Effect of filtering thresholds on sensitivity and specificity of DNM detection. The
sensitivity and specificity with which DNMs were detected are plotted against filtering thresholds for
the different parameters. The threshold for reference allele ratio in probands is a maximum value
threshold, whereas other thresholds are minimum value thresholds.
quality for the other six was high (≥29). It is unclear whether these represent
technical artefacts or parental mosaicism that was too low to be detected by Sanger
sequencing.
DISCUSSION
The specific detection of DNMs is difficult due to the relatively high number of
incorrect genotype calls in NGS data. Joint genotype calling in trios reduces the
number of potential DNM candidates and several software packages have been
developed that use probabilistic modelling to further improve the specificity of
DNM detection20–24. Reference allele ratio, sequencing depth and Genotype Quality
have been shown to efficiently filter Mendelian errors in NGS data25,26. Here we
assessed if and how these parameters can discriminate between confirmed DNMs
and false positives. To this purpose, we used a set of 22 confirmed germline DNMs
and 59 false positives that were initially deemed plausible DNMs.
Using threshold settings for reference allele ratio, sequencing depth and
Genotype Quality, we were able to detect DNMs with 100% sensitivity and 98.3%
specificity. The key to reaching high specificity was to remove the maximum
assigned value to the Genotype Quality score. By using the second smallest PHRED-
scaled genotype likelihood instead of the (maximized) Genotype Quality score, the
specificity of detecting DNMs using this single parameter increased from 40.7% to
84.7% (at 100% sensitivity). This raises the question of why the Genotype Quality
score has a maximum value. First of all, it requires more space to save values of
more than 2 digits, especially considering that thousands of Genotype Quality
scores are saved in each VCF file. Secondly, a Genotype Quality score of 99
corresponds to a chance of 1.26E-10 that the genotype call is incorrect. Higher
Genotype Quality scores are therefore not deemed more informative. Nonetheless,
we found that all 22 confirmed DNMs had a second smallest genotype likelihood
score of at least 200 in probands, which translates to a ≤1E-20 chance of making an
incorrect genotype call. Since the majority of false positive DNMs had ssPL scores
below 200, the discriminative power of the ssPL in probands was high, with an
AUC of 0.9607. Also in parents, the ssPL proved a useful parameter to distinguish
DNMs from false positives, just like sequencing depth in both probands and
parents. The reference allele ratio in probands and parents had limited
discriminative power.
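The error probabilities mentioned above follow from the PHRED definition, Q = -10*log10(p), so p = 10^(-Q/10). A short sketch of the conversion (the function name is hypothetical):

```python
def phred_to_error_probability(q):
    """Convert a PHRED-scaled score Q into the corresponding probability p."""
    return 10 ** (-q / 10)


p_at_cap = phred_to_error_probability(99)    # capped Genotype Quality score
p_at_200 = phred_to_error_probability(200)   # lowest ssPL range of confirmed DNMs
```

A score of 99 corresponds to a probability of about 1.26E-10 and a score of 200 to 1E-20, ten orders of magnitude lower.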
In addition to the 22 germline DNMs, two DNMs were identified that had
likely occurred somatically. These mutations showed an underrepresentation of
the alternative allele in both the NGS data and the Sanger sequencing validation.
This finding is in line with a study reporting that 6.5% of all DNMs are mosaic and
have occurred somatically31. The accurate detection of somatic DNMs is important,
as they are involved in cancer and other disorders32,33. However, mosaic DNMs are
more difficult to detect than germline DNMs, especially if the frequency of the
mutation is low.
Figure 4. Receiver Operating Characteristic (ROC) curves of the sequencing parameters. The Area
Under the ROC Curve (AUC) was calculated as a measure of the discriminative power of a parameter.
The ssPL value in probands was the most powerful parameter to distinguish between confirmed DNMs
and false positives.
Existing software packages for the detection of DNMs in NGS data rely on
joint genotype calling and probabilistic models to predict DNMs. Although different
programs use different models, they reach similar sensitivity and specificity24. In
the present study we show that filtering thresholds for the unmaximized Genotype
Quality score, in combination with sequencing depth and reference allele
ratio, provide an easy and straightforward filtering approach to detect DNMs
with high sensitivity and specificity across exome sequencing data from different
sequencing centers. We propose that incorporation of the ssPL score in
probabilistic models may further enhance the specificity of detecting DNMs and
urge the use of this parameter in future analyses. Whether removing the
maximum value of the Genotype Quality score has benefit in the detection of other
types of genetic variation is unclear, but deserves further investigation.
REFERENCES
1. Vissers, L. E. L. M. et al. A de novo paradigm for mental retardation. Nat. Genet. 42, 1109–1112 (2010).
2. Hoischen, A. et al. De novo mutations of SETBP1 cause Schinzel-Giedion syndrome. Nat. Genet. 42, 483–485 (2010).
3. Ng, S. B. et al. Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat. Genet. 42, 790–3 (2010).
4. Hoischen, A. et al. De novo nonsense mutations in ASXL1 cause Bohring-Opitz syndrome. Nat. Genet. 43, 729–731 (2011).
5. Sirmaci, A. et al. Mutations in ANKRD11 cause KBG syndrome, characterized by intellectual disability, skeletal malformations, and macrodontia. Am. J. Hum. Genet. 89, 289–94 (2011).
6. Sanders, S. J. et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485, 237–241 (2012).
7. O’Roak, B. J. et al. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature 485, 246–250 (2012).
8. Neale, B. M. et al. Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485, 242–245 (2012).
9. Iossifov, I. et al. De Novo Gene Disruptions in Children on the Autistic Spectrum. Neuron 74, 285–299 (2012).
10. Xu, B. et al. Exome sequencing supports a de novo mutational paradigm for schizophrenia. Nat. Genet. 43, 864–8 (2011).
11. Girard, S. L. et al. Increased exonic de novo mutation rate in individuals with schizophrenia. Nat. Genet. 43, 860–863 (2011).
12. Allen, A. S. et al. De novo mutations in epileptic encephalopathies. Nature 501, 217–21 (2013).
13. Gilissen, C. et al. Genome sequencing identifies major causes of severe intellectual disability. Nature 511, 344–347 (2014).
14. Kong, A. et al. Rate of de novo mutations and the importance of father’s age to disease risk. Nature 488, 471–475 (2012).
15. Francioli, L. C. et al. Genome-wide patterns and properties of de novo mutations in humans. Nat. Genet. 47, 822–826 (2015).
16. Homsy, J. et al. De novo mutations in congenital heart disease with neurodevelopmental and other congenital anomalies. Science 350, 1262–1266 (2015).
17. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
18. Nielsen, R., Paul, J. S., Albrechtsen, A. & Song, Y. S. Genotype and SNP calling from next-generation sequencing data. Nat. Rev. Genet. 12, 443–451 (2011).
19. Cartwright, R. A., Hussin, J., Keebler, J. E. M., Stone, E. A. & Awadalla, P. A family-based probabilistic method for capturing de novo mutations from high-throughput short-read sequencing data. Stat. Appl. Genet. Mol. Biol. 11, (2012).
20. Li, B. et al. A Likelihood-Based Framework for Variant Calling and De Novo Mutation Detection in Families. PLoS Genet. 8, e1002944 (2012).
21. Ramu, A. et al. DeNovoGear: de novo indel and point mutation discovery and phasing. Nat. Methods 10, 985–7 (2013).
22. Wei, Q. et al. A Bayesian framework for de novo mutation calling in parents-offspring trios. Bioinformatics 31, 1–7 (2014).
23. Liu, Y., Li, B., Tan, R., Zhu, X. & Wang, Y. A gradient-boosting approach for filtering de novo mutations in parent-offspring trios. Bioinformatics 30, 1830–1836 (2014).
24. Li, J. et al. mirTrios: an integrated pipeline for detection of de novo and rare inherited mutations from trios-based next-generation sequencing. J. Med. Genet. 52, 275–81 (2015).
25. Patel, Z. H. et al. The struggle to find reliable results in exome sequencing data: Filtering out Mendelian errors. Front. Genet. 5, 1–13 (2014).
26. Carson, A. R. et al. Effective filtering strategies to improve data quality from population-based whole exome sequencing studies. BMC Bioinformatics 15, 125 (2014).
27. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
28. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–303 (2010).
29. Li, M.-X., Gui, H.-S., Kwan, J. S. H., Bao, S.-Y. & Sham, P. C. A comprehensive framework for prioritizing variants in exome sequencing studies of Mendelian diseases. Nucleic Acids Res. 40, e53 (2012).
30. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–6 (2011).
31. Acuna-Hidalgo, R. et al. Post-zygotic Point Mutations Are an Underrecognized Source of De Novo Genomic Variation. Am. J. Hum. Genet. 97, 67–74 (2015).
32. Youssoufian, H. & Pyeritz, R. E. Mechanisms and consequences of somatic mosaicism in humans. Nat. Rev. Genet. 3, 748–758 (2002).
33. Watson, I. R., Takahashi, K., Futreal, P. A. & Chin, L. Emerging patterns of somatic mutations in cancer. Nat. Rev. Genet. 14, 703–718 (2013).
COMBINED STRATEGIES TO IDENTIFY DISEASE
ASSOCIATED GENES FOR RARE COMPLEX DISEASES;
HIRSCHSPRUNG DISEASE AS A MODEL
Duco Schriemer1, Erwin Brosens2, Hongsheng Gui3,4, Clara S. Tang3, Rutger W.W.
Brouwer5, Marta Bleda6, Wilfred F.J. van IJcken5, Salud Borrego7,8, Isabella
Ceccherini9, Aravinda Chakravarti10, Stanislas Lyonnet11,12, Paul K. Tam3, Maria-
Mercè Garcia-Barceló3, Bart J.L. Eggen1, Robert M.W. Hofstra2,13
1 Department of Neuroscience, section Medical Physiology, University of Groningen, University Medical Center
Groningen, Groningen, The Netherlands
2 Department of Clinical Genetics, Erasmus Medical Center, Rotterdam, the Netherlands
3 Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, SAR, China
4 Centre for Genomic Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, SAR,
China
5 Biomics Erasmus Center for Biomics, Erasmus Medical Center, Rotterdam, The Netherlands
6 CIBER de Enfermedades Raras (CIBERER), ISCIII, Seville, Spain
7 Unidad de Gestión Clínica de Genética, Reproducción y Medicina Fetal Hospitales, Universitarios Virgen del Rocío,
Seville, Spain
8 CIBER de Enfermedades Raras (CIBERER), ISCIII, Seville, Spain
9 UOC Genetica Medica, Istituto Gaslini, Genova, Italy
10 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, USA
11 Département de Génétique, Faculté de Médecine, Université Paris Descartes, Paris, France
12 INSERM U-781, AP-HP Hôpital Necker-Enfants Malades, Paris, France
13 Stem Cells and Regenerative Medicine, Birth Defects Research Centre UCL Institute of Child Health, London, UK
Manuscript in preparation
CHAPTER 5
ABSTRACT
The genetic architecture of common, heritable diseases is complex, with
involvement of both common and rare genetic variants. Association studies for
rare variants are challenging, as the low frequency of rare variants and large
multiple-testing correction require large sample sizes. In this study we focused on
the prioritization of rare variants identified in exome data from 48 Hirschsprung
disease (HSCR) patients and 212 controls. HSCR is a complex genetic disorder that
is characterized by incomplete development of the enteric nervous system (ENS)
in the distal colon.
We sampled almost exclusively extreme phenotypes and selected rare,
pathogenic variants. All variants per gene were collapsed and a meta-analysis was
performed on data from three centers. The burden test we performed gave every
gene a nominal p-value and the 48 most promising genes, with a nominal p-value
<0.01, were subsequently ranked by seven gene prioritization tools.
CELSR1, CLOCK, FASN and CACNA1H were among the top 5-ranked
candidate genes based on average ranking and were among the top 13 genes with
the most significant nominal p-values in the burden test meta-analysis.
Subsequently, gene expression data from the developing mouse gut and ENS
progenitor cells were used to assess whether these candidate genes are
abundantly expressed in the cell types relevant for HSCR. Of these four highly
ranked candidate genes, Fasn and Cacna1h were expressed by ENS progenitor
cells, but were not differentially expressed between ENS progenitors, gut and
control tissues. Celsr1 and Clock were expressed at lower levels in ENS progenitor
cells than in the rest of the gut, but Celsr1 expression did increase upon activation
of RET, a receptor in ENS progenitors and the major risk gene in HSCR. Clock, Fasn
and Cacna1h expression was not affected by RET signaling.
In conclusion, we show that burden tests, gene prioritization tools and
gene expression data from a relevant cell type can be used to identify candidate
genes for HSCR in an underpowered genetic study. These genes should be studied
in more detail in further genetic or functional studies to delineate their role in
HSCR.
INTRODUCTION
Both common and rare variants contribute to the onset of complex genetic
diseases1–4. Common variants (minor allele frequency > 5%) have been found to be
associated with complex genetic diseases in genome-wide association studies
(GWAS)5,6. Although large GWAS generally uncover multiple disease-associated
loci, the common variants in an associated haplotype contribute to but do not cause
the disease. The common variants in GWAS-associated loci collectively explain only
20-60% of the observed heritability7. Part of the missing heritability likely is
explained by rare variants (minor allele frequency < 1%). Resequencing of genes in
GWAS-associated loci has indeed identified rare variants in these genes8–10. In
addition, highly heritable forms of complex diseases can present in families and in
isolated populations, suggesting a role for highly penetrant, rare variants11,12. This
suggests that both common and rare variants in the same gene(s) can contribute to
the development of genetically complex diseases13.
Genome-wide genetic analysis, by array-based genotyping or exome/
genome sequencing, provides an unbiased approach to identify disease-associated
genes. However, this advantage comes at a cost. A multiple-testing correction has
to be applied to the large number of tested loci, thereby reducing the statistical
power of the study. With up to 2.5 million genetic markers, the multiple testing
corrections applied to GWAS are considerable. The number of genotyped bases is
even larger in sequencing studies, as the exome alone contains ~30 million base
pairs. Statistical power in sequencing studies further suffers from the low
frequency of genetic variants. Whereas GWAS make use of common variants to
increase statistical power, sequencing studies capture all genetic variation, most of
which is in fact rare14,15. As a result of the large multiple testing correction and low
allele frequencies, power calculations in exome sequencing studies show that
10,000 to 100,000 individuals are required to find genetic associations of rare
variants, especially in complex disorders16–18.
Several solutions for the lack of statistical power in exome/genome
sequencing studies have been proposed. First of all, rare variants are
overrepresented in extreme disease phenotypes, so sampling of patients with
extreme phenotypes increases the power of finding rare variant associations19,20.
Secondly, case-control analysis can be restricted to genetic variants that (are
predicted to) disrupt protein function. Since (predicted) damaging variants are
likely to be disease-causal, they will be less frequently found in controls due to
negative evolutionary selection. Thirdly, all rare variants in a gene or pathway can
be collapsed into a single variable to increase the variant frequency, reduce the
number of association tests and thereby increase the power21. It goes without
saying that meta-analysis of multiple smaller studies can be an effective strategy to
increase the statistical power22,23.
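The collapsing strategy can be illustrated with a toy example. In the sketch below, all rare variants in a gene are collapsed into a single carrier status per individual, and carrier counts in cases and controls are compared with a one-sided Fisher's exact test. The choice of Fisher's exact test is an assumption made for illustration and is not necessarily the burden test used in this study. Gene names, sample identifiers and function names are hypothetical.

```python
from math import comb


def collapse_by_gene(observations):
    """Collapse (gene, sample) variant observations into carriers per gene."""
    carriers = {}
    for gene, sample in observations:
        carriers.setdefault(gene, set()).add(sample)
    return carriers


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """Probability of at least `case_carriers` carriers among cases by chance."""
    carriers = case_carriers + control_carriers
    total = n_cases + n_controls
    upper = min(carriers, n_cases)
    favourable = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return favourable / comb(total, carriers)


def burden_p_values(observations, cases, controls):
    """Nominal p-value per gene after collapsing all variants in that gene."""
    return {
        gene: fisher_one_sided(len(samples & cases), len(cases),
                               len(samples & controls), len(controls))
        for gene, samples in collapse_by_gene(observations).items()
    }


# Toy data: patient p2 carries two variants in GENE_A but is counted once.
cases = {"p1", "p2", "p3", "p4"}
controls = {"c1", "c2", "c3", "c4", "c5", "c6"}
observations = [("GENE_A", "p1"), ("GENE_A", "p2"), ("GENE_A", "p2"),
                ("GENE_A", "p3"), ("GENE_A", "c1"), ("GENE_B", "c2")]
p_values = burden_p_values(observations, cases, controls)
```

Collapsing reduces the number of tests from one per variant to one per gene, which is what makes the multiple-testing burden manageable.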
Even when the abovementioned strategies are applied to increase
statistical power, true associations may not reach the significance threshold that is
dictated by multiple testing for 20,000 genes. It has been postulated that such
associations may be uncovered in replication cohorts where only a limited number
of ‘top hits’ are analyzed, leading to a lower multiple testing-corrected significance
threshold24. Top hits can be specified as the associations with the lowest nominal
p-value, and biological plausibility can also be taken into account.
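The effect of restricting a replication analysis to a limited number of top hits can be made concrete with a Bonferroni correction. The sketch below assumes a family-wise significance level of 0.05, which is an assumption and not stated in the text. The count of 48 top hits is taken from the number of genes prioritized in this study and is used only as an illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


genome_wide = bonferroni_threshold(0.05, 20_000)  # one test per gene
replication = bonferroni_threshold(0.05, 48)      # one test per top hit
```

Under these assumptions the per-gene threshold relaxes from 2.5E-6 to roughly 1.0E-3, so an association that misses genome-wide significance can still be confirmed in a replication cohort.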
Gene prioritization tools have been developed to identify the disease-
causal gene in a set of candidate genes from genetic studies, using genes that are
known to be relevant to the phenotype as so-called seed genes25,26. Different
strategies can be employed to identify plausible seed or candidate genes. For
example, some gene prioritization tools require the user to specify seed genes27–30,
whereas other tools extract known disease genes from the Online Mendelian
Inheritance in Man (OMIM) database31–35. Similarity measurements between
candidate genes and seed genes can be based on a variety of data sources, such as
functional annotation, protein interactions, co-expression, sequence similarity and
text mining. In addition to gene-level information, variant level information such as
allele frequency and predicted pathogenicity can be used to rank candidate
genes34,35.
In this study, we focused on Hirschsprung disease (HSCR) as an example of
a disease with a complex genetic architecture. HSCR is characterized by the lack of
neuronal innervation in the distal colon, resulting from incomplete colonization of
the bowel by enteric neural crest cells (ENCCs), the progenitor cells of the enteric
nervous system (ENS). The incidence of HSCR is approximately 1 in 3,500 in Asians
and 1 in 6,500 in Caucasians36. Over 15 genes have been linked to HSCR, but these
genes explain only around 25% of the heritability13,36. As in many complex genetic
diseases, identification of new genes in HSCR initially focused on linkage
analysis in familial cases and isolated populations, which revealed a role for the RET
and EDNRB pathways in HSCR37–39. More recently, GWAS on sporadic HSCR
patients have been performed. The genes associated with HSCR in these GWAS are
RET, NRG1 and the SEMA3 gene locus40,41. In addition to the common variants, rare
variants in these genes and other HSCR-associated genes have been reported13,42,43.
To study the role of rare variants in HSCR on a genome-wide level, we sampled
extreme phenotypes, selected rare pathogenic variants, collapsed all variants per
gene, performed a meta-analysis on data from three different centers and used
gene prioritization tools to analyze exome sequencing data from HSCR patients.
METHODS
Patient collection
Forty-eight sporadic, non-syndromic HSCR patients were selected from five clinical
centers. Fourteen patients were of Chinese origin and 34 patients were of
Caucasian ancestry. We prioritized the most severe and rarest HSCR cases for
this study, namely patients with long-segment or total colonic aganglionosis (Table
1). Sixteen patients had previously tested negative for coding variants in RET.
Control individuals without neurological or psychological disorders were selected
in each center to match ethnicity and sequencing technology. Parental informed
consent was obtained from all participants.
Exome sequencing
DNA samples were subjected to exome sequencing at four sequencing centers
using local, in-house technologies. The exome-capture kits and sequencing
platforms used per center are summarized in Table 1.
Table 1. Patient collection and sequencing technologies.
Cohort | Patients, short segment | Patients, long segment | Patients, total | Controls | Ethnicity | Sequencing platform | Exome capture | 10X coverage | Rare variants
HK | 6 | 8 | 14 | 73 | Han-Chinese | Illumina GAII | Illumina Truseq | 79% | 194
SP | 10 | 5 | 15 | 100 | European | ABI Solid 4 | NimbleGen V2 | 85% | 205
NL | 0 | 19 | 19 | 39 | European | Illumina HiSeq2000 | Agilent SureSelect V4 | 95% | 296
Meta-analysis | 16 | 32 | 48 | 212 | - | - | - | - | -
Overview of the numbers of patients and controls that were sequenced by each center and the
sequencing technologies that were used.
Sequencing data from two centers with identical sequencing platforms were analyzed
together in downstream analyses. Illumina sequencing reads were mapped to the
genome using BWA and Solid sequencing reads were mapped using Bfast44. All
sequencing reads were mapped to the human reference genome version 19 (hg19).
Quality Control (QC) of sequencing data was carried out using the FastQC toolbox,
Picard’s metric summary and the GATK Depth-of-Coverage module. After QC,
sequencing data were preprocessed for local indel realignment, PCR duplicate
removal and base quality recalibration45. SNPs and indels were called using the
GATK UnifiedGenotyper 2.046 and stored in standard VCF files. Each sequencing
center performed variant calling simultaneously on its respective HSCR patients
and control subjects. KGGSeq47 was used to extract variants that 1) had a
sequencing quality score ≥50; 2) had a mapping quality ≥20; 3) had a Fisher strand
bias score ≤60; 4) had a genotype quality score ≥20; 5) had a sequencing depth ≥8;
6) had a reference allele ratio <0.75; 7) were exonic; 8) were non-synonymous SNPs
or indels; 9) had a minor allele frequency <1% in dbSNP137, 1000 Genomes and the
NHLBI Exome Sequencing Project; 10) were successfully genotyped in ≥80% of
patients and ≥80% of controls; and 11) were predicted to be deleterious by
KGGSeq’s logistic regression analysis of dbNSFP v3.0 functional impact scores48.
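The eleven filter criteria listed for KGGSeq can be summarized as a single predicate. The sketch below is not KGGSeq code: the `Variant` record and its field names are hypothetical, and only the thresholds are taken from the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Variant:
    """Hypothetical per-variant summary; field names are illustrative only."""
    qual: float                      # sequencing quality score
    mapping_quality: float
    fisher_strand: float             # Fisher strand bias score
    genotype_quality: float
    depth: int                       # sequencing depth
    ref_allele_ratio: float
    is_exonic: bool
    is_nonsynonymous_or_indel: bool
    max_population_maf: float        # highest MAF across the reference databases
    call_rate_cases: float
    call_rate_controls: float
    predicted_deleterious: bool


def passes_filters(v):
    """Apply the quality, frequency and pathogenicity thresholds from the text."""
    return (
        v.qual >= 50
        and v.mapping_quality >= 20
        and v.fisher_strand <= 60
        and v.genotype_quality >= 20
        and v.depth >= 8
        and v.ref_allele_ratio < 0.75
        and v.is_exonic
        and v.is_nonsynonymous_or_indel
        and v.max_population_maf < 0.01
        and v.call_rate_cases >= 0.80
        and v.call_rate_controls >= 0.80
        and v.predicted_deleterious
    )


# Made-up example record that satisfies every criterion.
example = Variant(qual=99, mapping_quality=60, fisher_strand=3.2,
                  genotype_quality=45, depth=30, ref_allele_ratio=0.5,
                  is_exonic=True, is_nonsynonymous_or_indel=True,
                  max_population_maf=0.001, call_rate_cases=1.0,
                  call_rate_controls=0.95, predicted_deleterious=True)
kept = passes_filters(example)
dropped = passes_filters(replace(example, depth=7))  # fails the depth criterion
```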
Burden test
The number of rare variants per gene in patients and controls was analyzed by
three different centers individually, using the same protocol. The Combined
Multivariate and Collapsing (CMC) test in the Rvtests package was used to collapse
all variants identified within the same gene21. P-values were calculated from the
asymptotic chi-square distribution. Meta-analysis of the summary statistics of the
three centers was performed using a sample-size weighted z-score.
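A sample-size weighted z-score meta-analysis can be sketched as follows. The sketch assumes that each center's weight is the square root of its total sample size (cases plus controls, taken from Table 1) and that the effects in all centers point in the same direction; neither assumption is stated in the text, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def weighted_z_meta(p_values, sample_sizes, directions):
    """Combine two-sided p-values using weights equal to sqrt(sample size)."""
    if not (len(p_values) == len(sample_sizes) == len(directions)):
        raise ValueError("inputs must have the same length")
    weighted_sum = 0.0
    weight_squares = 0.0
    for p, n, sign in zip(p_values, sample_sizes, directions):
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
        # Convert the two-sided p-value to a signed z-score (p = 1 gives z = 0).
        z = -sign * STD_NORMAL.inv_cdf(p / 2.0)
        weight = sqrt(n)
        weighted_sum += weight * z
        weight_squares += weight ** 2
    z_meta = weighted_sum / sqrt(weight_squares)
    p_meta = 2.0 * STD_NORMAL.cdf(-abs(z_meta))
    return z_meta, p_meta


# Cohort sizes (patients + controls) for HK, SP and NL from Table 1.
cohort_sizes = [14 + 73, 15 + 100, 19 + 39]
# Per-center p-values listed for KLHDC4 in Table 2; directions are assumed.
klhdc4_p = [1.09e-3, 2.30e-4, 1.0]
z_meta, p_meta = weighted_z_meta(klhdc4_p, cohort_sizes, [1, 1, 1])
```

Under these assumptions the combined p-value for KLHDC4 is of the same order as the meta-analysis value listed in Table 2.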
Candidate gene prioritization
Genes with a nominal association p-value <0.01 in the burden test were selected as
candidate genes for downstream gene prioritization. Gene prioritization was
performed in Endeavour Web Server (http://www.esat.kuleuven.be/endeavour)27,28,
ToppGene (http://toppgene.cchmc.org)29, ToppNet (part of the ToppGene suite29),
GPSy (http://gpsy.genouest.org)30, FunSimMat
(http://funsimmat.bioinf.mpi-inf.mpg.de)31–33, Exomiser
(http://www.sanger.ac.uk/science/tools/exomiser)34 and ExomeWalker
(http://compbio.charite.de/ExomeWalker)35. The gene prioritization tools differ in the data sources that are
used to compare candidate genes to seed genes. For gene prioritization in
Endeavour, all 19 available databases were used. ToppGene was run
using default parameters, with ‘Interaction’ as an additional training feature, and
ToppNet was run using default settings. Moreover, Endeavour, ToppGene and
ToppNet require the user to select a set of genes (specific seed genes) that are
known to be relevant to the disease. For this we assembled a list of 45 seed genes
that are either genetically linked to HSCR in humans or cause
aganglionosis in mouse models when lost (Supplementary Table 1).
In contrast to a self-made list of genes as required for Endeavour,
ToppGene and ToppNet, GPSy was run using ‘Nervous system’ and ‘Homo sapiens’
as selected topic and species, respectively, with otherwise default parameters. The
‘disease candidate prioritization’ function in FunSimMat was run using OMIM term
#142623 (Hirschsprung disease) and genes were ranked by Biological Pathway (BP).
Gene prioritization in Exomiser was performed using the hiPHIVE prioritiser and
Orphanet ID 388 (Hirschsprung disease). OMIM term #142623 (Hirschsprung
disease) was selected as phenotype in ExomeWalker. No variant quality filters
were applied in Exomiser and ExomeWalker, since these were applied in the
upstream filtering and annotation by KGGSeq. Overall gene ranking was based on
the average rank per gene in the different prioritization tools.
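The overall ranking by average rank can be sketched as below. The tool and gene names are placeholders, and ties in the mean rank are broken alphabetically, which is an assumption rather than something stated in the text.

```python
def average_rank(rankings):
    """rankings: {tool: [genes ordered best to worst]}; returns (gene, mean rank) pairs."""
    totals = {}
    counts = {}
    for ordered_genes in rankings.values():
        for position, gene in enumerate(ordered_genes, start=1):
            totals[gene] = totals.get(gene, 0) + position
            counts[gene] = counts.get(gene, 0) + 1
    means = {gene: totals[gene] / counts[gene] for gene in totals}
    # Lowest mean rank first; ties broken by gene name for a stable order.
    return sorted(means.items(), key=lambda item: (item[1], item[0]))


# Placeholder rankings from three hypothetical tools.
rankings = {
    "tool_1": ["GENE_A", "GENE_B", "GENE_C"],
    "tool_2": ["GENE_B", "GENE_A", "GENE_C"],
    "tool_3": ["GENE_A", "GENE_C", "GENE_B"],
}
overall = average_rank(rankings)
```

Averaging ranks rather than raw scores sidesteps the fact that the tools report scores on incomparable scales.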
Gene expression analysis
Publicly available expression data from E14.5 mouse embryos were combined
with in-house expression data from embryonic mouse gut and ENCCs isolated from
E14.5 mouse embryos that expressed YFP under control of the Wnt1 promoter.
Expression data from control tissues were extracted from the Gene Expression
Omnibus49: testis and ovary (GSE6881), kidney (GSE4230)50, gonad (GSE6916)51
and cardiac tissue (GSE1479)52. Moreover, additional intestinal tissue expression
sets were obtained: stomach, pylorus and duodenum (GSE15872)53. Probe set
summaries from the raw Affymetrix data (cel files) were calculated using BRB-
ArrayTools version 4.5.0 - Beta_2 (http://linus.nci.nih.gov/BRB-ArrayTools.html).
The probes were annotated by Bioconductor (www.bioconductor.org), R v3.2.2
Patched (2015-09-12 r69372) and the annotation package mouse4302.db (version
3.0.0). ‘Just GCRMA’ (GC content – Robust Multi-Array Average) was used from the
‘GCRMA’ package available in Bioconductor. The just GCRMA algorithm adjusts for
background intensities (optical noise and non-specific binding), normalizes each
array using quantile normalization and includes variance stabilisation and log2
transformation. Replicate spots within an array were averaged. Genes showing
minimal variation across the set of arrays were excluded from the analysis (if less
than 10% of the expression data had at least a 1.5-fold change in either direction from
the gene's median value, or if at least 50% of the arrays had missing data for that gene).
Probes were present on the Affymetrix mouse4302.db chip for 45 of the 48
candidate genes (DEFB132 and OR10K1 do not have a mouse orthologue and Or2d2
does not have probes on the chip). Genes whose expression differed by at least 1.5
fold from the median in at least 20% of the arrays were retained. The minimum
fold change for the class comparisons was set at 1.5. Statistical testing was performed
using a two-sample t-test with a random variance model. The permutation p-values
for significant genes were computed based on 10,000 random permutations and
the nominal significance level of each univariate test was set at 0.05.
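The variability filter on log2-transformed expression values can be sketched as follows. This is not BRB-ArrayTools code; the function name and example values are hypothetical, missing values are not handled, and the fold change and minimum fraction are left as parameters so that either cut-off mentioned above can be applied.

```python
from math import log2
from statistics import median


def is_variable(log2_values, fold=1.5, min_fraction=0.2):
    """True if enough arrays differ from the gene's median by at least `fold`."""
    centre = median(log2_values)
    cutoff = log2(fold)  # a fold change becomes a difference on the log2 scale
    n_changed = sum(1 for value in log2_values if abs(value - centre) >= cutoff)
    return n_changed / len(log2_values) >= min_fraction


# Made-up log2 intensities for two genes across five arrays.
flat_gene = [8.0, 8.1, 7.9, 8.0, 8.05]
variable_gene = [8.0, 8.1, 9.0, 6.9, 8.0]
flat_kept = is_variable(flat_gene)
variable_kept = is_variable(variable_gene)
```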
Power calculations
The Genetic Power Calculator54 was used to calculate the statistical power of the
present study and the number of patients required in a future replication study. The
prevalence of HSCR was set to 0.0002 (1:5000 live born individuals36). D-prime
was set to 0.8, assuming that the sensitivity of detecting variants in exome
sequencing data is 80% (Table 1). Calculations for cohort sizes assumed the same
case : control ratio as for the 48 HSCR cases and 212 controls (1 : 4.417).
Significance levels were set at 0.05 and were not adjusted for multiple testing.
Dominant inheritance was assumed.
RESULTS
Sampling of extreme phenotypes
Rare variants have a relatively large contribution to disease in patients with an
extreme phenotype19,20. Therefore we prioritized the most severe form of HSCR for
exome sequencing. A variable segment of the gut can be aganglionic in HSCR. In
80% of the cases, only a short-segment of the colon is affected, whereas a long-
segment of the colon is aganglionic in the remaining 20%36. Long-segment HSCR
has a high heritability and a dominant mode of inheritance with reduced
penetrance55. Moreover, in long-segment HSCR there is a relatively large
contribution of rare variants in RET, the major HSCR gene55,56. Therefore, the
highest contribution of rare variants is expected in patients with long-segment
HSCR. In three different cohorts, a total of 32 cases of long-segment HSCR were
sampled for exome sequencing and were supplemented with 16 cases of short-
segment HSCR, resulting in 66.7% long-segment cases in our cohort (Table 1).
Exome sequencing and variant filtering
Exome sequencing was performed on 48 sporadic HSCR patients in three different
centers using different sequencing technologies (Table 1). The average 10X
coverage per center ranged from 79% to 95% (Table 1). Rare variants were
selected from the exome sequencing data that lead to a loss of function (nonsense,
splice site and frameshift mutations) or are missense mutations that were
predicted to be deleterious. This yielded between 194 and 296 rare variants per
individual on average per center.
Gene burden test and meta-analysis
The low frequencies of individual rare variants hamper the statistical power to
find a significant association with a disease. Therefore, all rare variants per gene that
were predicted to be pathogenic were collapsed into a single variable: the number
of variants per gene. The statistical power of finding disease-associated genes
depends on the frequency at which mutations are found in comparison to a control set
of samples. This frequency varies among genes: 2.4% of all genes carried rare,
damaging variants in ≥5% of the controls, 20.8% of the genes had a variant
frequency of 1-5% in control samples and 27.5% a frequency <1% in our controls.
No rare, damaging variants were found in the controls in the remaining 49.3% of
all genes (Figure 1A).
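The frequency classes used above can be reproduced with a simple binning step. The sketch assumes that the per-gene frequency is the fraction of the 212 controls carrying at least one rare, damaging variant; the carrier counts in the example are made up.

```python
from collections import Counter

N_CONTROLS = 212  # total number of controls in the meta-analysis (Table 1)


def frequency_bin(n_carriers, n_controls=N_CONTROLS):
    """Assign a gene to one of the four control-frequency classes."""
    if n_carriers == 0:
        return "none"
    frequency = n_carriers / n_controls
    if frequency < 0.01:
        return "<1%"
    if frequency < 0.05:
        return "1-5%"
    return ">=5%"


# Made-up carrier counts per gene, for illustration only.
control_carriers = {"GENE_A": 0, "GENE_B": 1, "GENE_C": 3, "GENE_D": 11, "GENE_E": 0}
bins = {gene: frequency_bin(count) for gene, count in control_carriers.items()}
share = {label: n / len(bins) for label, n in Counter(bins.values()).items()}
```

With 212 controls, a gene falls below 1% when one or two controls carry a variant and reaches 5% only from eleven carriers onward.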
Given the mutation frequencies per gene, the power of detecting true
associations was calculated for different genotype relative risks. For genes with a
high variant frequency of 5% our study with 48 patients and 212 controls could
detect true associations (at nominal p-value), even at low relative risk (Figure 1B).
Also for genes with a 0.5-1% variant frequency there was sufficient power to
detect damaging variants with a moderate or high relative risk. For genes that
carried rare, damaging variants less frequently in controls, there was limited
statistical power, even at a high relative risk. Increasing the size of the study to,
for instance, 100 cases and 442 controls would increase the statistical power, but up
to 1000 cases and 4417 controls are required to obtain sufficient power to detect
associations for genes that carry variants in 0.01% of the controls (at nominal
p-value) (Figure 1C,D).
The variant frequency per gene was compared between patients and
controls in three independent centers. A meta-analysis was performed on data
from three different sequencing centers, representing different ethnicities and
sequencing technologies. A quantile-quantile plot (QQ-plot) showed that the
observed distribution of p-values followed the expected distribution, for individual
case-control studies as well as for the meta-analysis (Figure 2). This suggests that
there were no confounding factors producing artificial associations. The most
significant associations in the meta-analysis were found for KLHDC4 (p = 1.43 × 10⁻⁵)
and CR1 (p = 2.07 × 10⁻⁵), but these did not reach genome-wide significance
(2.5 × 10⁻⁶, after Bonferroni correction for 20,000 genes) (Table 2).
Figure 1. Frequency distribution of rare, damaging variants per gene and its effect on statistical
power. A) Histogram displaying the number of genes that carry rare, damaging variants at a specified
frequency in the control population. B) Statistical power of detecting a significant association (at a
nominal p-value of 0.05) in our cohort of 48 HSCR cases and 212 controls, given the genotype relative
risk and frequency of variants in the gene. C,D) Statistical power of detecting significant associations if
cohort sizes would have been increased to 100 cases and 442 controls (C) or 1000 cases and 4417
controls (D).
Candidate gene prioritization
Due to the low power of the rare variant association study, there may have been
real pathogenic variants in genes that did not reach genome-wide significance.
Small-scale replication studies have greater power to detect true associations, but
require a selection of candidate genes to follow up on24.
Gene prioritization tools have been developed to identify plausible genes
in a set of candidate genes and can therefore be used to select the best candidate
genes for follow-up studies25,26. Forty-eight genes had a nominal p-value <0.01 and
were selected to be ranked by seven gene prioritization tools to identify the best
candidate HSCR gene among them. CELSR1 achieved the highest average rank
across seven gene prioritization tools, followed by CLOCK, GRM4, FASN and
CACNA1H (Figure 3A). The overall ranking by the gene prioritization tools was
compared to the nominal p-value of the genes in the burden test meta-analysis and
four of the five highest-ranked genes (CELSR1, CLOCK, FASN and CACNA1H) were
among the 13 most significantly associated genes (Figure 3B). Although these
genes have no known functions in ENS development, biological functions and
expression in neural cells have been reported for CELSR1, CLOCK, FASN and
CACNA1H57–60.
The correlation between ranking results from different gene prioritization
tools varied substantially (Figure 3C). The highest correlation coefficient was
found between Exomiser and ExomeWalker (r = 0.54), but ExomeWalker showed
no correlation to any other tool. Endeavour and ToppGene showed a moderate
correlation with all other tools, except ExomeWalker.
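Agreement between two tools can be quantified by correlating their gene ranks. The text does not state which correlation coefficient was used; the sketch below assumes a Pearson correlation computed on tie-free ranks, which is then equivalent to Spearman's rank correlation. Tool and gene names are placeholders.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / sqrt(var_x * var_y)


def rank_correlation(ranks_a, ranks_b):
    """Correlate the ranks two tools assign to the genes they have in common."""
    genes = sorted(set(ranks_a) & set(ranks_b))
    return pearson([ranks_a[g] for g in genes], [ranks_b[g] for g in genes])


# Placeholder ranks from two hypothetical tools.
tool_x = {"GENE_A": 1, "GENE_B": 2, "GENE_C": 3, "GENE_D": 4}
tool_y = {"GENE_A": 2, "GENE_B": 1, "GENE_C": 4, "GENE_D": 3}
r = rank_correlation(tool_x, tool_y)
```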
Expression in the gut and in ENCCs
Gene prioritization tools use a variety of data sources to rank candidate genes,
including gene expression data. However, the expression data used in gene
prioritization tools is generally not derived from the cell type studied. As HSCR
results from incomplete colonization of the bowel by ENCCs (the progenitor cells
of the ENS), the expression levels of the mouse orthologues of the 48 candidate
HSCR genes obtained in the burden test were analyzed in the developing mouse
gut, and more specifically in the ENCCs. Two genes (DEFB132 and OR10K1) do not
have a mouse orthologue and for one mouse orthologue (Or2d2) there were no
probes on the chip. Of the 45 candidate genes for which expression data were
available, all but one (Cr1) were expressed by ENCCs. Clock, Grm4, Fasn and
Cacna1h, whose human orthologues were highly ranked by the gene prioritization
tools, were not abundantly expressed by ENCCs compared to control tissues
(testis, ovary, heart and kidney), or in the intestinal samples. Celsr1 was not highly
expressed in ENCCs or the gut, but its expression level was increased by activation
of the RET receptor with its ligand GDNF.
Figure 2. Quantile-quantile plot of the association p-values. The observed p-values follow the
expected distribution, for individual centers and for the meta-analysis. The highest associated genes in
the meta-analysis, KLHDC4 (p = 1.43 × 10⁻⁵) and CR1 (p = 2.07 × 10⁻⁵), are indicated.
Of the 48 candidate genes selected from the burden test, Clstn2, Cdk12,
Cpxm2, Ghdc, D8Ertd82e (mouse orthologue of SGK223) and Mrps34 were more highly
expressed in ENCCs than in control tissues. These ENCC-expressed genes, with the
addition of Pprc1 and Ddc, were also expressed at higher levels in intestinal
samples compared to control tissues (Table 3). The remaining candidate genes
from the burden test were expressed by ENCCs (with the exception of CR1), but
were not differentially expressed between ENCCs and the gut. Of the genes that are
highly expressed in the gut, the human orthologue of Ghdc was the seventh most
significantly associated gene with HSCR in the burden test meta-analysis (nominal
p-value: 1.87 × 10⁻⁴). GHDC (GH3 domain containing) only ranked at position 35 of the
48 candidate genes in the overall gene prioritization. The biological function of
GHDC is unknown and the gene could therefore not be linked to the known ENS
genes. However, Exomiser and ExomeWalker, both taking into account the
pathogenicity score of the identified variants, ranked GHDC at position 6 and 3,
respectively, suggesting that the identified variants in GHDC are highly pathogenic.
Combined with the expression of Ghdc in the developing mouse ENS and the high
rank in the burden test, this makes GHDC an excellent candidate gene for HSCR.
These data indicate that the use of tissue-specific gene expression data is a
complementary approach to identify candidate genes that were not picked up by
the gene prioritization tools due to lack of functional characterization of the gene.
Another potentially interesting candidate gene that was highly expressed
by ENCCs is Cdk12. Cdk12 was ranked as the 10th best candidate gene by the gene
prioritization tools and is involved in neuronal differentiation in the murine CNS61.
Given its high expression in ENCCs, Cdk12 may also be involved in ENS
development and contribute to HSCR. Clstn2, Cpxm2, Mrps34 and D8Ertd82e
(orthologue of SGK223) were highly expressed by ENCCs, but did not rank high in
the gene prioritization and were not among the most significantly associated genes
in the burden test.
Table 2. Genes with a nominal p-value of association <0.01.
Gene | Description | HK | SP | NL | Meta
KLHDC4 | kelch domain containing 4 | 1.09E-03 | 2.30E-04 | 1 | 1.43E-05
CR1 | complement component (3b/4b) receptor 1 | 0.022 | 9.51E-03 | 0.011 | 2.07E-05
ATP13A5 | ATPase type 13A5 | 0.015 | 0.067 | 0.011 | 1.31E-04
FASN | fatty acid synthase | 0.022 | 2.30E-04 | 1 | 1.58E-04
GPRIN1 | G protein regulated inducer of neurite outgrowth 1 | 0.059 | 9.51E-03 | 0.044 | 1.63E-04
CLOCK | clock homolog (mouse) | 0.019 | 9.51E-03 | 0.148 | 1.68E-04
GHDC | GH3 domain containing | 0.022 | 9.51E-03 | 0.148 | 1.87E-04
ZNF76 | zinc finger protein 76 (expressed in testis) | 0.022 | 9.51E-03 | 0.148 | 1.87E-04
XPO6 | exportin 6 | 1.09E-03 | 9.51E-03 | 1 | 3.01E-04
ZFAND4 | Zinc Finger, AN1-Type Domain 4 | 0.408 | 2.30E-04 | 0.148 | 3.05E-04
ERN2 | ER to nucleus signaling 2 | 0.022 | 9.51E-03 | 0.446 | 6.43E-04
CELSR1 | cadherin, EGF LAG seven-pass G-type receptor 1 | 0.408 | 5.20E-03 | 0.039 | 9.30E-04
CACNA1H | calcium channel, voltage-dependent, T type, alpha 1H | 1 | 5.20E-03 | 2.98E-03 | 1.11E-03
NINL | ninein-like | 0.721 | 1.43E-03 | 0.062 | 1.34E-03
POLR1A | polymerase (RNA) I polypeptide A, 194kDa | 1 | 1.80E-04 | 0.148 | 1.51E-03
ALX3 | ALX homeobox 3 | 0.022 | 9.51E-03 | 1 | 2.26E-03
CDK12 | Cdc2-related kinase, arginine/serine-rich | 0.022 | 9.51E-03 | 1 | 2.26E-03
DDC | dopa decarboxylase | 0.022 | 9.51E-03 | 1 | 2.26E-03
HOGA1 | 4-Hydroxy-2-Oxoglutarate Aldolase 1 | 0.022 | 9.51E-03 | 1 | 2.26E-03
HSD3B2 | hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 2 | 0.022 | 9.51E-03 | 1 | 2.26E-03
IMPG1 | interphotoreceptor matrix proteoglycan 1 | 0.022 | 9.51E-03 | 1 | 2.26E-03
OR10K1 | olfactory receptor, family 10, subfamily K, member 1 | 0.022 | 9.51E-03 | 1 | 2.26E-03
APOBEC1 | apolipoprotein B mRNA editing enzyme, catalytic polypeptide 1 | 1 | 5.85E-06 | 1 | 2.58E-03
GPR179 | G protein-coupled receptor 179 | 0.022 | 0.290 | 0.039 | 2.65E-03
RNF17 | ring finger protein 17 | 0.022 | 0.016 | 1 | 3.33E-03
MCM8 | minichromosome maintenance complex component 8 | 1 | 9.51E-03 | 0.011 | 3.41E-03
ACADL | acyl-Coenzyme A dehydrogenase, long chain | 0.408 | 9.51E-03 | 0.148 | 3.91E-03
CPXM2 | carboxypeptidase X (M14 family), member 2 | 0.408 | 9.51E-03 | 0.148 | 3.91E-03
CLSTN2 | calsyntenin 2 | 0.022 | 5.20E-03 | 0.481 | 4.31E-03
P4HA3 | prolyl 4-hydroxylase, alpha polypeptide III | 0.019 | 0.067 | 0.597 | 4.78E-03
ACSM5 | acyl-CoA synthetase medium-chain family member 5 | 0.022 | 0.025 | 1 | 4.89E-03
LMOD3 | leiomodin 3 (fetal) | 0.022 | 0.025 | 1 | 4.89E-03
OR2D2 | olfactory receptor, family 2, subfamily D, member 2 | 0.022 | 0.025 | 1 | 4.89E-03
DEFB132 | defensin, beta 132 | 0.187 | 0.697 | 1.82E-06 | 5.80E-03
DAAM1 | dishevelled associated activator of morphogenesis 1 | 0.187 | 9.51E-03 | 0.597 | 6.18E-03
VTI1B | vesicle transport through interaction with t-SNAREs homolog 1B | 0.022 | 9.51E-03 | 0.481 | 6.51E-03
TMEM67 | transmembrane protein 67 | 0.022 | 0.290 | 0.148 | 6.64E-03
GABPB1 | GA binding protein transcription factor, beta subunit 1 | 1 | 9.51E-03 | 0.039 | 6.96E-03
MRPS34 | mitochondrial ribosomal protein S34 | 1 | 9.51E-03 | 0.039 | 6.96E-03
NCKAP5L | NCK-Associated Protein 5-Like | 1 | 9.51E-03 | 0.039 | 6.96E-03
SGK223 | homolog of rat pragma of Rnd2 | 0.620 | 9.51E-03 | 0.148 | 7.05E-03
ZSWIM5 | zinc finger, SWIM-type containing 5 | 0.015 | 0.637 | 0.039 | 7.12E-03
TRPM2 | transient receptor potential cation channel, subfamily M, member 2 | 1 | 0.025 | 0.011 | 7.16E-03
KNTC1 | kinetochore associated 1 | 0.968 | 9.51E-03 | 0.062 | 8.57E-03
PPRC1 | peroxisome proliferator-activated receptor gamma, coactivator-related 1 | 0.187 | 5.20E-03 | 1 | 8.74E-03
DMRT3 | doublesex and mab-3 related transcription factor 3 | 1.09E-03 | 0.117 | 0.481 | 9.36E-03
PRDM7 | PR domain containing 7 | 0.408 | 2.30E-04 | 0.481 | 9.44E-03
GRM4 | glutamate receptor, metabotropic 4 | 0.022 | 9.51E-03 | 0.315 | 9.92E-03
Statistical power of a replication study for selected genes
Using gene prioritization tools and gene expression data from the developing ENS,
we identified CELSR1, CLOCK, FASN, CACNA1H and GHDC as the best candidate
genes for HSCR. However, follow-up experiments are required to establish
whether these genes are involved in HSCR, as our study was underpowered to find
genome-wide significant associations. The data presented in this study were used
to calculate the power of a replication study. Using the variant frequencies per
gene in patients and controls, relative risks were calculated per gene. These ranged
from 7.4 for CLOCK to 26.5 for CACNA1H. Consequently, 14 to 33 unrelated patients
are required to find a significant association of these genes with HSCR with 80%
power and 23 to 54 patients are required for 95% power54 (Table 4). Since no
mutations were found in FASN and GHDC in our control cohort, it was not possible
to calculate the relative risk and required number of patients in a replication study
for these genes.
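The relative risks and control numbers in Table 4 can be retraced with simple arithmetic. The sketch below assumes that carrier counts can be recovered by rounding the rounded frequencies in Table 4 against the 48 patients and 212 controls, and that the relative risk is the ratio of the two carrier frequencies; the power calculation of the Genetic Power Calculator itself is not reproduced.

```python
from math import floor, inf

N_CASES, N_CONTROLS = 48, 212  # cohort sizes of the present study


def carriers_from_frequency(frequency, n_samples):
    """Recover an integer carrier count from a rounded frequency (assumption)."""
    return floor(frequency * n_samples + 0.5)


def relative_risk(case_carriers, n_cases, control_carriers, n_controls):
    """Ratio of carrier frequencies; infinite when no control carries a variant."""
    if control_carriers == 0:
        return inf
    return (case_carriers / n_cases) / (control_carriers / n_controls)


def matched_controls(n_new_cases):
    """Controls needed to keep the 48 : 212 case-control ratio (rounded half up)."""
    return floor(n_new_cases * N_CONTROLS / N_CASES + 0.5)


# Frequencies in patients and controls as listed in Table 4.
table4 = {"CELSR1": (0.125, 0.014), "CLOCK": (0.104, 0.014),
          "FASN": (0.063, 0.000), "CACNA1H": (0.125, 0.005),
          "GHDC": (0.063, 0.000)}
risks = {}
for gene, (freq_cases, freq_controls) in table4.items():
    case_carriers = carriers_from_frequency(freq_cases, N_CASES)
    control_carriers = carriers_from_frequency(freq_controls, N_CONTROLS)
    risks[gene] = relative_risk(case_carriers, N_CASES, control_carriers, N_CONTROLS)
```

An infinite ratio for FASN and GHDC reflects the absence of carriers among the controls, which is why no cohort size could be derived for these two genes.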
DISCUSSION
Resequencing studies on GWAS-associated genes or candidate genes from
functional studies have revealed a role for rare, coding variants in complex genetic
diseases, including HSCR41,62. However, the contribution of rare variants to
complex genetic diseases is difficult to study, as statistical power is negatively
affected by locus heterogeneity, low allele frequency and large multiple-testing
correction17. Using HSCR as a model of a complex genetic disease, we combined
several strategies to perform a rare variant (gene) association study.
Burden test
To maximize the statistical power of our rare variant association test, we
prioritized long-segment HSCR cases, collapsed all rare, damaging variants per
gene and performed a meta-analysis on three case-control studies. This approach
gave our study sufficient power to detect associations at nominal p-value for genes
in which variants are relatively abundant or have a high relative risk. However, the
meta-analysis was underpowered to reach genome-wide significance.
Mutations in RET, the main HSCR gene, are normally found in 15-35% of
sporadic HSCR patients63,64. However, we did not find RET or any other known
HSCR genes among the highest associated genes in the burden test. The most
significantly associated known HSCR gene was NRG1, with a nominal p-value of
0.015. The known HSCR genes were probably not found because we mainly selected
patients without mutations in these genes, which increases the
likelihood of identifying new HSCR genes. Coding mutations in all HSCR genes
other than RET are rare and are mainly associated with syndromic rather than
isolated HSCR13,42,43. Lack of association of these genes with HSCR in our data was
therefore also not unexpected.
Figure 3. Candidate gene ranking results from the gene prioritization tools. A) Gene prioritization
results from the seven tools. Highly ranked genes are shown in dark blue and genes with a low rank in
light blue. Genes are shown by overall rank. B) Relationship between the p-value of association in the
burden test and the overall rank in the gene prioritization. C) Heatmap showing the correlations
between the ranking results of the different tools.
Gene prioritization tools
The 48 genes with a nominal p-value <0.01 in the burden test were
selected to be prioritized by seven gene prioritization tools. Different prioritization
tools use seed genes from different sources to look for similarity with candidate
genes. Endeavour, ToppGene and ToppNet require the user to specify genes that
are known to be involved in the disease or underlying biological process27–29. The
benefit of this approach is that the seed genes are very specific for the phenotype.
Other tools lack this user-specific input and do not use all known genes that are
critical for ENS development. For example, Hlx65 and Hoxb566,67 are not included as
general ‘nervous system development’ seed genes by GPSy30. GPSy therefore
misses connections with these seed genes. User-specific input also has its
drawbacks as the focus is on ‘known’ genes. GPSy may therefore uncover novel
pathways in ENS development as many genes that are involved in neuronal
development in the CNS might also be relevant to the ENS. FunSimMat, Exomiser
and ExomeWalker extract seed genes from OMIM31–35. Although this yields seed
genes that are well established in the disease, the number of disease genes in
OMIM may be an underrepresentation of the number of genes involved. In the case
of HSCR, only four genes can be retrieved from OMIM (RET, GDNF, EDNRB and
EDN3), whereas the manually assembled list of ENS development genes that was
used in Endeavour, ToppGene and ToppNet contained 45 seed genes.
Table 3. Differential expression of the 48 candidate genes in E14.5 mouse embryo.
Case | Control | High expression | Low expression
Mouse whole gut | Testis, ovary, heart and kidney | Cdk12, Mrps34, D8Ertd82e, Ghdc, Zfand4 | Gabpb1, Trpm2, Celsr1, Polr1a, Hoga1, Clock, Grm4, Daam1, Rnf17
Mouse ENCC | Mouse whole gut | Cpxm2, D8Ertd82e, Ghdc, Cdk12, Clstn2, Mrps34 | Celsr1, Polr1a, Gabpb1, Grm4, Kntc1, Clock, Mcm8, Hoga1
Mouse ENCC | Testis, ovary, heart and kidney | Cpxm2, D8Ertd82e, Ghdc, Cdk12, Clstn2, Mrps34 | Celsr1, Polr1a, Gabpb1, Grm4, Kntc1, Clock, Mcm8, Hoga1
Mouse ENCC + GDNF | Testis, ovary, heart and kidney | Cpxm2, Ghdc, D8Ertd82e, Vti1b, Celsr1, Clstn2, Cdk12, Mrps34 | Kntc1, Gabpb1, Polr1a, Clock, Hoga1, Grm4, P4ha3, Mcm8
Mouse ENCC + GDNF | Mouse ENCC | Celsr1 | Ddc
Of the top candidate genes from the gene prioritization tools, Celsr1 and Clock were less abundantly
expressed by ENCCs than in the gut or in control tissues, but Celsr1 expression was upregulated in
ENCCs after activation of the RET receptor by its ligand GDNF.
The gene prioritization tools also differ in the data sources that are used to
compare candidate genes to seed genes. Endeavour, ToppGene and GPSy use a
wide range of data sources, such as functional annotation, expression, interaction
and sequence similarity. ToppNet, FunSimMat, Exomiser and ExomeWalker rely on
a single data source to connect candidate genes to seed genes (protein interaction,
functional annotation, phenotypic similarity to mouse models and protein
interaction, respectively). Exomiser and ExomeWalker are specifically designed for
exome sequencing studies, and take the frequency and predicted pathogenicity of
the identified genetic variants into account. The different strategies implemented
by gene prioritization tools are reflected by the differences in gene ranking results.
Correlations between prioritization results from different tools varied
substantially. Endeavour and ToppGene showed moderate correlation with all
other tools, except ExomeWalker. ExomeWalker combines variant level
information from exome sequencing with protein interactions between candidate
genes and disease genes derived from OMIM. However, no interactions with the
known HSCR genes RET, GDNF, EDNRB and EDN3 were found for any of the
candidate genes. Gene ranking by ExomeWalker was therefore solely based on the
frequency and predicted pathogenicity of the identified variants. The variant level
information is also used by Exomiser, explaining why ExomeWalker results
correlated with those from Exomiser, but not with those from any other tool tested in this study.
Table 4. Calculation of cohort size for a replication study on selected candidate genes.
Candidate gene | Frequency in patients | Frequency in controls | Relative risk | 80% power (cases/controls) | 90% power (cases/controls) | 95% power (cases/controls)
CELSR1 | 0.125 | 0.014 | 8.83 | 25/110 | 33/146 | 41/181
CLOCK | 0.104 | 0.014 | 7.36 | 33/146 | 44/194 | 54/239
FASN | 0.063 | 0.000 | ∞ | - | - | -
CACNA1H | 0.125 | 0.005 | 26.5 | 14/62 | 19/84 | 23/102
GHDC | 0.063 | 0.000 | ∞ | - | - | -
Power calculations for a rare-variant association replication study on selected candidate genes. Given
the frequencies of rare, damaging variants in the candidate genes in the burden test, the relative risk
and number of cases and controls were calculated.
Despite the variable correlations between ranking results, several genes
were consistently ranked highly by the gene prioritization tools. CELSR1, CLOCK,
FASN and CACNA1H were among the top five candidate genes based on
average rank, and their nominal p-values were among the 13 most significant in the
burden test meta-analysis.
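The idea of comparing tools by rank correlation and combining them by average rank can be sketched as follows. This is a minimal illustration, not the analysis performed in this study: the tool names, gene names and ranks are hypothetical placeholders, and Spearman's coefficient is an assumed choice of correlation measure, since the measure used is not restated in this passage.

```python
from statistics import mean

# Hypothetical ranks (1 = best) from three unnamed tools; not data from this study.
ranks = {
    "tool_1": {"GENE_A": 1, "GENE_B": 2, "GENE_C": 3, "GENE_D": 4},
    "tool_2": {"GENE_A": 2, "GENE_B": 1, "GENE_C": 4, "GENE_D": 3},
    "tool_3": {"GENE_A": 1, "GENE_B": 3, "GENE_C": 2, "GENE_D": 4},
}

def average_rank(ranks_by_tool):
    """Mean rank of each gene across all tools."""
    genes = next(iter(ranks_by_tool.values())).keys()
    return {g: mean(tool[g] for tool in ranks_by_tool.values()) for g in genes}

def spearman(rank_a, rank_b):
    """Spearman's rho for two complete rankings of the same genes without ties."""
    n = len(rank_a)
    d_squared = sum((rank_a[g] - rank_b[g]) ** 2 for g in rank_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

avg = average_rank(ranks)
ordered = sorted(avg, key=avg.get)  # genes ordered from best to worst average rank
rho_12 = spearman(ranks["tool_1"], ranks["tool_2"])
```

Averaging ranks rewards genes that score well across tools with different data sources, even when the pairwise correlations between tools are only moderate.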
Function of the proposed top 4 candidate genes
CELSR1 (Cadherin, EGF LAG Seven-Pass G-Type Receptor 1) is an adhesion
molecule that is involved in planar cell polarity, the organization of cells within a plane
or sheet68. CELSR1 mutations have been associated with the neural tube defect
craniorachischisis in humans, a finding that is corroborated by improper closure of
the neural tube in Celsr1 mouse models68–70. Additionally, Celsr1 is involved in the
directional migration of facial branchiomotor neurons71. Planar cell polarity genes
are involved in ENS development, where Celsr3 and Fzd3 regulate guidance and
growth of neuronal projections72. Ablation of Celsr3 in ENS progenitor cells causes
constriction of colonic segments, distention of the proximal segment, and reduced
gut transit time, symptoms that are all hallmarks of HSCR72. These results
demonstrate a role for planar cell polarity genes in ENS development, making
CELSR1 an excellent candidate gene for HSCR.
CLOCK encodes a core component of the circadian clock. Gastrointestinal
motility follows a circadian rhythm and neurotransmitters that regulate gut
contractility, such as Vip and nNos, are rhythmically expressed in the distal murine
colon73,74. Clock genes are expressed in intestinal epithelial cells and enteric
neurons in mice and may well be responsible for the rhythmic innervation of the
gut73.
FASN (fatty acid synthase) is an enzyme that catalyzes fatty acid synthesis.
In the murine CNS, Fasn is highly expressed in neurogenic areas and is required for
maintenance of neural stem cell pools75. As the development of the ENS depends
on propagation of stem cells, FASN may be involved in ENS development.
The mouse homologue of CACNA1H (Calcium channel, voltage-dependent,
T type, alpha 1H subunit) is expressed by migrating ENCCs76. Although Cacna1h
and other Ca2+ channels are expressed by ENCCs at different developmental time
points, blockage of Ca2+ channels in gut explants does not impair ENCC migration
or neurite outgrowth76.
Additional candidate genes
Combining different strategies as proposed in this study reduces the number of
candidate genes dramatically. Ending up with a small number of candidate genes
makes genetic studies manageable and amenable to functional analysis. However,
it also raises the question of whether potentially valid candidates are being excluded.
For instance, GHDC and, to a lesser extent, CDK12 are excellent candidates for HSCR
because of their high expression in ENCCs, the pathogenicity of the identified variants and
their low nominal p-values in the burden test. Therefore, one should be cautious not to
exclude genes too readily.
Conclusions
Although our rare variant association study was underpowered to detect genome-wide
associations with HSCR, it serves as a pilot study to direct future
research. Power calculations for genetic studies rely on prior knowledge of variant
frequencies and effect sizes, and these parameters can be estimated from the data
presented here. It should be noted that the frequency of variants differs
between genes, meaning that statistical power is higher for some genes
than for others. Our approach will therefore be useful only for genes that
carry relatively many rare, damaging variants in the general population. In
addition to calculating the statistical power for such genes in a small-scale
replication study, we prioritized the top hits from our burden test to select the
most promising candidate genes to follow up on. Only a limited number of
unrelated patients are required in such a follow-up study.
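The dependence of power on variant frequency can be illustrated with a simple normal-approximation power formula for comparing two proportions. This is only a sketch under stated assumptions: it is not the method used for Table 4 and will not reproduce those figures. The function name `approx_power`, the two-sided significance level of 0.05, the cohort sizes and the frequencies are all hypothetical, and the normal approximation is crude at such low carrier counts.

```python
from statistics import NormalDist

def approx_power(p_cases, p_controls, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    # Pooled carrier frequency under the null hypothesis of no difference.
    p_pool = (p_cases * n_cases + p_controls * n_controls) / (n_cases + n_controls)
    se_null = (p_pool * (1 - p_pool) * (1 / n_cases + 1 / n_controls)) ** 0.5
    # Standard error under the alternative, with group-specific frequencies.
    se_alt = (p_cases * (1 - p_cases) / n_cases
              + p_controls * (1 - p_controls) / n_controls) ** 0.5
    return norm.cdf((abs(p_cases - p_controls) - z_alpha * se_null) / se_alt)

# Same relative risk (10), but carriers are ten times rarer in the second gene.
common = approx_power(0.10, 0.01, n_cases=50, n_controls=200)
rare = approx_power(0.01, 0.001, n_cases=50, n_controls=200)
print(f"power, more frequent variants: {common:.2f}; rarer variants: {rare:.2f}")
```

With the relative risk held constant, the gene with more frequent damaging variants yields the higher power at a fixed cohort size, which is the point made in the paragraph above.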
REFERENCES
1. Iyengar, S. K. & Elston, R. C. The genetic basis of complex traits: rare variants or ‘common gene, common disease’? Methods Mol. Biol. 376, 71–84 (2007).
2. Bodmer, W. & Bonilla, C. Common and rare variants in multifactorial susceptibility to common diseases. Nat. Genet. 40, 695–701 (2008).
3. Gibson, G. Rare and common variants: twenty arguments. Nat. Rev. Genet. 13, 135–145 (2012).
4. Saint Pierre, A. & Génin, E. How important are rare variants in common disease? Brief. Funct. Genomics elu025– (2014). doi:10.1093/bfgp/elu025
5. Stranger, B. E., Stahl, E. A. & Raj, T. Progress and Promise of Genome-Wide Association Studies for Human Complex Trait Genetics. Genetics 187, 367–383 (2011).
6. Visscher, P. M., Brown, M. A., McCarthy, M. I. & Yang, J. Five Years of GWAS Discovery. Am. J. Hum. Genet. 90, 7–24 (2012).
7. Lander, E. S. Initial impact of the sequencing of the human genome. Nature 470, 187–197 (2011).
8. Johansen, C. T. et al. Excess of rare variants in genes identified by genome-wide association study of hypertriglyceridemia. Nat. Genet. 42, 684–687 (2010).
9. Rivas, M. A. et al. Deep resequencing of GWAS loci identifies independent rare variants associated with inflammatory bowel disease. Nat. Genet. 43, 1066–73 (2011).
10. Beaudoin, M. et al. Deep Resequencing of GWAS Loci Identifies Rare Variants in CARD9, IL23R and RNF186 That Are Associated with Ulcerative Colitis. PLoS Genet. 9, e1003723 (2013).
11. Easton, D. F., Bishop, D. T., Ford, D. & Crockford, G. P. Genetic linkage analysis in familial breast and ovarian cancer: results from 214 families. The Breast Cancer Linkage Consortium. Am. J. Hum. Genet. 52, 678–701 (1993).
12. Heutink, P. & Oostra, B. A. Gene finding in genetically isolated populations. Hum. Mol. Genet. 11, 2507–15 (2002).
13. Alves, M. M. et al. Contribution of rare and common variants determine complex diseases-Hirschsprung disease as a model. Dev. Biol. 382, 320–9 (2013).
14. Tennessen, J. A. et al. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science 337, 64–9 (2012).
15. Nelson, M. R. et al. An Abundance of Rare Functional Variants in 202 Drug Target Genes Sequenced in 14,002 People. Science 337, 100–104 (2012).
16. Bansal, V., Libiger, O., Torkamani, A. & Schork, N. J. Statistical analysis strategies for association studies involving rare variants. Nat. Rev. Genet. 11, 773–785 (2010).
17. Zhi, D. & Chen, R. Statistical guidance for experimental design and data analysis of mutation detection in rare monogenic Mendelian diseases by exome sequencing. PLoS One 7, (2012).
18. Zuk, O. et al. Searching for missing heritability: designing rare variant association studies. Proc. Natl. Acad. Sci. U. S. A. 111, E455–64 (2014).
19. Kryukov, G. V., Shpunt, A., Stamatoyannopoulos, J. A. & Sunyaev, S. R. Power of deep, all-exon resequencing for discovery of human trait genes. Proc. Natl. Acad. Sci. U. S. A. 106, 3871–3876 (2009).
20. Li, D., Lewinger, J. P., Gauderman, W. J., Murcray, C. E. & Conti, D. Using extreme phenotype sampling to identify the rare causal variants of quantitative traits in association studies. Genet. Epidemiol. 35, 790–9 (2011).
21. Li, B. & Leal, S. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am. J. Hum. Genet. 83, 311–321 (2008).
22. Lee, S., Teslovich, T. M., Boehnke, M. & Lin, X. General framework for meta-analysis of rare variants in sequencing association studies. Am. J. Hum. Genet. 93, 42–53 (2013).
23. Hu, Y.-J. et al. Meta-analysis of gene-level associations for rare variants based on single-variant statistics. Am. J. Hum. Genet. 93, 236–48 (2013).
24. Lipman, P. J. et al. On the follow-up of genome-wide association studies: An overall test for the most promising SNPs. Genet. Epidemiol. 35, 303–309 (2011).
25. Tranchevent, L. C. et al. A guide to web tools to prioritize candidate genes. Brief. Bioinform. 12, 22–32 (2011).
26. Moreau, Y. & Tranchevent, L.-C. Computational tools for prioritizing candidate genes: boosting disease gene discovery. Nat. Rev. Genet. 13, 523–536 (2012).
27. Aerts, S. et al. Gene prioritization through genomic data fusion. Nat. Biotechnol. 24, 537–544 (2006).
28. Tranchevent, L. C. et al. ENDEAVOUR update: a web resource for gene prioritization in multiple species. Nucleic Acids Res. 36, 377–384 (2008).
29. Chen, J., Bardes, E. E., Aronow, B. J. & Jegga, A. G. ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic Acids Res. 37, 305–311 (2009).
30. Britto, R. et al. GPSy: a cross-species gene prioritization system for conserved biological processes--application in male gamete development. Nucleic Acids Res. 40, W458–65 (2012).
31. Schlicker, A. & Albrecht, M. FunSimMat: A comprehensive functional similarity database. Nucleic Acids Res. 36, 434–439 (2008).
32. Schlicker, A. & Albrecht, M. FunSimMat update: new features for exploring functional similarity. Nucleic Acids Res. 38, D244–8 (2010).
33. Schlicker, A., Lengauer, T. & Albrecht, M. Improving disease gene prioritization using the semantic similarity of Gene Ontology terms. Bioinformatics 26, i561–i567 (2010).
34. Robinson, P. N. et al. Improved exome prioritization of disease genes through cross-species phenotype comparison. Genome Res. 24, 340–348 (2014).
35. Smedley, D. et al. Walking the interactome for candidate prioritization in exome sequencing studies of Mendelian diseases. Bioinformatics 30, 3215–22 (2014).
36. Amiel, J. et al. Hirschsprung disease, associated syndromes and genetics: a review. J. Med. Genet. 45, 1–14 (2008).
37. Angrist, M. et al. A gene for Hirschsprung disease (megacolon) in the pericentromeric region of human chromosome 10. Nat. Genet. 4, 351–6 (1993).
38. Lyonnet, S. et al. A gene for Hirschsprung disease maps to the proximal long arm of chromosome 10. Nat. Genet. 4, 346–50 (1993).
39. Puffenberger, E. et al. Identity-by-descent and association mapping of a recessive gene for Hirschsprung disease on human chromosome 13q22. Hum. Mol. Genet. 3, 1217–1225 (1994).
40. Garcia-Barcelo, M.-M. et al. Genome-wide association study identifies NRG1 as a susceptibility locus for Hirschsprung’s disease. Proc. Natl. Acad. Sci. U. S. A. 106, 2694–9 (2009).
41. Jiang, Q. et al. Functional loss of semaphorin 3C and/or semaphorin 3D and their epistatic interaction with ret are critical to Hirschsprung disease liability. Am. J. Hum. Genet. 96, 581–96 (2015).
42. Brooks, A. S., Oostra, B. A. & Hofstra, R. M. W. Studying the genetics of Hirschsprung’s disease: Unraveling an oligogenic disorder. Clin. Genet. 67, 6–14 (2005).
43. Heanue, T. A. & Pachnis, V. Enteric nervous system development and Hirschsprung’s disease: advances in genetic and stem cell studies. Nat. Rev. Neurosci. 8, 466–79 (2007).
44. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
45. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–8 (2011).
46. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–303 (2010).
47. Li, M.-X., Gui, H.-S., Kwan, J. S. H., Bao, S.-Y. & Sham, P. C. A comprehensive framework for prioritizing variants in exome sequencing studies of Mendelian diseases. Nucleic Acids Res. 40, e53 (2012).
48. Liu, X., Jian, X. & Boerwinkle, E. dbNSFP v2.0: A database of human non-synonymous SNVs and their functional predictions and annotations. Hum. Mutat. 34, 2393–2402 (2013).
49. Barrett, T. et al. NCBI GEO: archive for functional genomics data sets--update. Nucleic Acids Res. 41, D991–D995 (2013).
50. Chen, Y.-T., Kobayashi, A., Kwan, K. M., Johnson, R. L. & Behringer, R. R. Gene expression profiles in developing nephrons using Lim1 metanephric mesenchyme-specific conditional mutant mice. BMC Nephrol. 7, 1 (2006).
51. Small, C. L. Profiling Gene Expression During the Differentiation and Development of the Murine Embryonic Gonad. Biol. Reprod. 72, 492–501 (2005).
52. Tanaka, M. et al. A mouse model of congenital heart disease: cardiac arrhythmias and atrial septal defect caused by haploinsufficiency of the cardiac transcription factor Csx/Nkx2.5. Cold Spring Harb. Symp. Quant. Biol. 67, 317–25 (2002).
53. Li, X. et al. Dynamic patterning at the pylorus: Formation of an epithelial intestine-stomach boundary in late fetal life. Dev. Dyn. 238, 3205–3217 (2009).
54. Purcell, S., Cherny, S. S. & Sham, P. C. Genetic Power Calculator: design of linkage and association genetic mapping studies of complex traits. Bioinformatics 19, 149–150 (2003).
55. Badner, J. A., Sieber, W. K., Garver, K. L. & Chakravarti, A. A genetic study of Hirschsprung disease. Am. J. Hum. Genet. 46, 568–80 (1990).
56. Emison, E. S. et al. Differential contributions of rare and common, coding and noncoding Ret mutations to multifactorial Hirschsprung disease liability. Am. J. Hum. Genet. 87, 60–74 (2010).
57. Feng, J., Han, Q. & Zhou, L. Planar cell polarity genes, Celsr1-3, in neural development. Neurosci. Bull. 28, 309–15 (2012).
58. Kimiwada, T. et al. Clock genes regulate neurogenic transcription factors, including NeuroD1, and the neuronal differentiation of adult neural stem/progenitor cells. Neurochem. Int. 54, 277–85
59. Loftus, T. M. et al. Reduced food intake and body weight in mice treated with fatty acid synthase inhibitors. Science 288, 2379–81 (2000).
60. Chemin, J. et al. Specific contribution of human T-type calcium channel isotypes (alpha(1G), alpha(1H) and alpha(1I)) to neuronal excitability. J. Physiol. 540, 3–14 (2002).
61. Chen, H. R., Lin, G. T., Huang, C. K. & Fann, M. J. Cdk12 and Cdk13 regulate axonal elongation through a common signaling pathway that modulates Cdk5 expression. Exp. Neurol. 261, 10–21 (2014).
62. Gui, H. et al. RET and NRG1 interplay in Hirschsprung disease. Hum. Genet. 132, 591–600 (2013).
63. Attié, T. et al. Diversity of RET proto-oncogene mutations in familial and sporadic Hirschsprung disease. Hum. Mol. Genet. 4, 1381–6 (1995).
64. Hofstra, R. M. et al. RET and GDNF gene scanning in Hirschsprung patients using two dual denaturing gel systems. Hum. Mutat. 15, 418–29 (2000).
65. Bates, M. D., Dunagan, D. T., Welch, L. C., Kaul, A. & Harvey, R. P. The Hlx homeobox transcription factor is required early in enteric nervous system development. BMC Dev. Biol. 6, 33 (2006).
66. Carter, T. C. et al. Hirschsprung’s disease and variants in genes that regulate enteric neural crest cell proliferation, migration and differentiation. J. Hum. Genet. 57, 485–93 (2012).
67. Lui, V. C. H. et al. Perturbation of hoxb5 signaling in vagal neural crests down-regulates ret leading to intestinal hypoganglionosis in mice. Gastroenterology 134, 1104–15 (2008).
68. Curtin, J. A. et al. Mutation of Celsr1 Disrupts Planar Polarity of Inner Ear Hair Cells and Causes Severe Neural Tube Defects in the Mouse. Curr. Biol. 13, 1129–1133 (2003).
69. Allache, R., De Marco, P., Merello, E., Capra, V. & Kibar, Z. Role of the planar cell polarity gene CELSR1 in neural tube defects and caudal agenesis. Birth Defects Res. A. Clin. Mol. Teratol. 94, 176–81 (2012).
70. Robinson, A. et al. Mutations in the planar cell polarity genes CELSR1 and SCRIB are associated with the severe neural tube defect craniorachischisis. Hum. Mutat. 33, 440–447 (2012).
71. Qu, Y. et al. Atypical cadherins Celsr1-3 differentially regulate migration of facial branchiomotor neurons in mice. J. Neurosci. 30, 9392–401 (2010).
72. Sasselli, V. & Boesmans, W. Planar cell polarity genes control the connectivity of enteric neurons. J. Clin. Invest. 123, (2013).
73. Hoogerwerf, W. A. et al. Clock Gene Expression in the Murine Gastrointestinal Tract: Endogenous Rhythmicity and Effects of a Feeding Regimen. Gastroenterology 133, 1250–1260 (2007).
74. Hoogerwerf, W. A. Role of clock genes in gastrointestinal motility. Am. J. Physiol. Gastrointest. Liver Physiol. 299, G549–G555 (2010).
75. Knobloch, M. et al. Metabolic control of adult neural stem cell activity by Fasn-dependent lipogenesis. Nature 493, 226–230 (2012).
76. Hirst, C. S. et al. Ion Channel Expression in the Developing Enteric Nervous System. PLoS One 10, e0123436 (2015).
SUPPLEMENTARY INFORMATION
Supplementary Table 1. List of seed genes used in Endeavour, ToppGene and ToppNet.
Gene | Gene name | Human phenotype | Refs
ALDH1A2 | aldehyde dehydrogenase 1 family, member A2 | | 1
ASCL1 | achaete-scute complex homolog 1 | CCHS | 2,3
DCC | DCC Netrin 1 Receptor | | 4
DSCAM | Down Syndrome Cell Adhesion Molecule | HSCR-associated | 5,6
ECE1 | endothelin converting enzyme 1 | Hirschsprung disease, cardiac defects and autonomic dysfunction | 7,8
EDN3 | endothelin 3 | Hirschsprung disease; Waardenburg syndrome type 4 | 9–11
EDNRB | endothelin receptor type B | Hirschsprung disease; Waardenburg syndrome type 4 | 12–15
ERBB2 | v-erb-b2 erythroblastic leukemia viral oncogene homolog 2 | | 16
ERBB3 | v-erb-b2 erythroblastic leukemia viral oncogene homolog 3 | | 17,18
FOXD3 | Forkhead Box D3 | | 19
GDNF | glial cell derived neurotrophic factor | Hirschsprung disease | 20–25
GFRA1 | GDNF family receptor alpha 1 | Hirschsprung disease | 26,27
GFRA2 | GDNF family receptor alpha 2 | | 28
GLI1 | GLI family zinc finger 1 | | 29,30
GLI2 | GLI family zinc finger 2 | | 29,30
GLI3 | GLI family zinc finger 3 | | 29,30
HAND2 | Heart And Neural Crest Derivatives Expressed 2 | | 31,32
HLX | H2.0-like homeobox | | 33
HOXB5 | Homeobox B5 | | 34,35
IHH | Indian hedgehog homolog | | 36
IKBKAP | inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase complex-associated protein | Familial dysautonomia | 37,38
ITGB1 | integrin beta 1 | | 39,40
KIF1BP | KIF1 Binding Protein | Goldberg-Shprintzen syndrome | 41–43
L1CAM | L1 cell adhesion molecule | Partial agenesis of corpus callosum | 44–47
NKX2-1 | NK2 Homeobox 1 | Single HSCR patient | 48
NRG1 | neuregulin 1 | Hirschsprung disease | 49,50
NRG3 | neuregulin 3 | Hirschsprung disease CNVs | 51,52
NRTN | neurturin | Hirschsprung disease | 53–55
NTF3 | Neurotrophin 3 | | 56
NTRK3 | neurotrophic tyrosine kinase, receptor, type 3 | | 56,57
PAX3 | paired box 3 | Waardenburg syndrome type 1 and type 3 | 58
PHOX2B | paired-like homeobox 2b | Neuroblastoma with Hirschsprung disease | 59–61
PSPN | Persephin | Single HSCR patient | 54
PTCH1 | patched homolog 1 | | 62,63
RET | ret proto-oncogene | Hirschsprung disease | 64–66
SALL4 | sal-like 4 | Duane-radial ray syndrome | 67
SEMA3A | sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3A | | 68,69
SEMA3C | sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3C | Hirschsprung disease | 68
SEMA3D | sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3D | Hirschsprung disease | 68,69
SHH | sonic hedgehog homolog | | 36,70
SOX10 | SRY (sex determining region Y)-box 10 | Waardenburg syndrome, type 4C | 71–74
SPRY2 | sprouty homolog 2 | | 75
TCF4 | Transcription Factor 4 | Pitt-Hopkins syndrome | 76–78
ZEB2 | zinc finger E-box binding homeobox 2 | Mowat-Wilson syndrome | 79,80
ZIC2 | Zic family member 2 | | 81
1. Niederreither, K. et al. The regional pattern of retinoic acid synthesis by RALDH2 is essential for the development of posterior pharyngeal arches and the enteric nervous system. Development 130, 2525–34 (2003).
2. de Pontual, L. et al. Noradrenergic neuronal development is impaired by mutation of the proneural HASH-1 gene in congenital central hypoventilation syndrome (Ondine’s curse). Hum. Mol. Genet. 12, 3173–80 (2003).
3. Guillemot, F. et al. Mammalian achaete-scute homolog 1 is required for the early development of olfactory and autonomic neurons. Cell 75, 463–76 (1993).
4. Jiang, Y., Liu, M. T. & Gershon, M. D. Netrins and DCC in the guidance of migrating neural crest-derived cells in the developing bowel and pancreas. Dev. Biol. 258, 364–384 (2003).
5. Jannot, A. S. et al. Chromosome 21 Scan in Down Syndrome Reveals DSCAM as a Predisposing Locus in Hirschsprung Disease. PLoS One 8, 1–8 (2013).
6. Yamakawa, K. et al. DSCAM: a novel member of the immunoglobulin superfamily maps in a Down syndrome region and is involved in the development of the nervous system. Hum. Mol. Genet. 7, 227–37 (1998).
7. Hofstra, R. M. et al. A loss-of-function mutation in the endothelin-converting enzyme 1 (ECE-1) associated with Hirschsprung disease, cardiac defects, and autonomic dysfunction. Am. J. Hum. Genet. 64, 304–8 (1999).
8. Yanagisawa, H. et al. Dual genetic pathways of endothelin-mediated intercellular signaling revealed by targeted disruption of endothelin converting enzyme-1 gene. Development 125, 825–836 (1998).
9. Edery, P. et al. Mutation of the endothelin-3 gene in the Waardenburg-Hirschsprung disease (Shah-Waardenburg syndrome). Nat. Genet. 12, 442–4 (1996).
10. Hofstra, R. M. et al. A homozygous mutation in the endothelin-3 gene associated with a combined Waardenburg type 2 and Hirschsprung phenotype (Shah-Waardenburg syndrome). Nat. Genet. 12, 445–7 (1996).
11. Baynash, A. G. et al. Interaction of endothelin-3 with endothelin-B receptor is essential for development of epidermal melanocytes and enteric neurons. Cell 79, 1277–1285 (1994).
12. Puffenberger, E. G. et al. A missense mutation of the endothelin-B receptor gene in multigenic hirschsprung’s disease. Cell 79, 1257–1266 (1994).
13. Amiel, J. Heterozygous endothelin receptor B (EDNRB) mutations in isolated Hirschsprung disease. Hum. Mol. Genet. 5, 355–357 (1996).
14. Auricchio, A. Endothelin-B receptor mutations in patients with isolated Hirschsprung disease from a non-inbred population. Hum. Mol. Genet. 5, 351–354 (1996).
15. Hosoda, K. et al. Targeted and natural (piebald-lethal) mutations of endothelin-B receptor gene produce megacolon associated with spotted coat color in mice. Cell 79, 1267–1276 (1994).
16. Crone, S. A., Negro, A., Trumpp, A., Giovannini, M. & Lee, K.-F. Colonic Epithelial Expression of ErbB2 Is Required for Postnatal Maintenance of the Enteric Nervous System. Neuron 37, 29–40 (2003).
17. Riethmacher, D. et al. Severe neuropathies in mice with targeted mutations in the ErbB3 receptor. Nature 389, 725–730 (1997).
18. Chalazonitis, A., D’Autréaux, F., Pham, T. D., Kessler, J. A. & Gershon, M. D. Bone morphogenetic proteins regulate enteric gliogenesis by modulating ErbB3 signaling. Dev. Biol. 350, 64–79 (2011).
19. Teng, L., Mundell, N. A., Frist, A. Y., Wang, Q. & Labosky, P. A. Requirement for Foxd3 in the maintenance of neural crest progenitors. Development 135, 1615–1624 (2008).
20. Angrist, M., Bolk, S., Halushka, M., Lapchak, P. A. & Chakravarti, A. Germline mutations in glial cell line-derived neurotrophic factor (GDNF) and RET in a Hirschsprung disease patient. Nat. Genet. 14, 341–344 (1996).
21. Ivanchuk, S. M., Myers, S. M., Eng, C. & Mulligan, L. M. De novo mutation of GDNF, ligand for the RET/GDNFR-alpha receptor complex, in Hirschsprung disease. Hum. Mol. Genet. 5, 2023–2026 (1996).
22. Hofstra, R. M. et al. RET and GDNF gene scanning in Hirschsprung patients using two dual denaturing gel systems. Hum. Mutat. 15, 418–29 (2000).
23. Pichel, J. G. et al. Defects in enteric innervation and kidney development in mice lacking GDNF. Nature 382, 73–76 (1996).
24. Sánchez, M. P. et al. Renal agenesis and the absence of enteric neurons in mice lacking GDNF. Nature 382, 70–73 (1996).
25. Moore, M. W. et al. Renal and neuronal abnormalities in mice lacking GDNF. Nature 382, 76–79 (1996).
26. Borrego, S. et al. Investigation of germline GFRA4 mutations and evaluation of the involvement of GFRA1, GFRA2, GFRA3, and GFRA4 sequence variants in Hirschsprung disease. J. Med. Genet. 40, e18 (2003).
27. Enomoto, H. et al. GFR alpha1-deficient mice have deficits in the enteric nervous system and kidneys. Neuron 21, 317–324 (1998).
28. Rossi, J. et al. Alimentary tract innervation deficits and dysfunction in mice lacking GDNF family receptor α2. J. Clin. Invest. 112, 707–716 (2003).
29. Liu, J. A.-J. et al. Identification of GLI Mutations in Patients With Hirschsprung Disease That Disrupt Enteric Nervous System Development in Mice. Gastroenterology (2015). doi:10.1053/j.gastro.2015.07.060
30. Yang, J. T. et al. Expression of human GLI in mice results in failure to thrive, early death, and patchy Hirschsprung-like gastrointestinal dilatation. Mol. Med. 3, 826–35 (1997).
31. Hendershot, T. J. et al. Expression of Hand2 is sufficient for neurogenesis and cell type-specific gene expression in the enteric nervous system. Dev. Dyn. 236, 93–105 (2007).
32. Lei, J. & Howard, M. J. Targeted deletion of Hand2 in enteric neural precursor cells affects its functions in neurogenesis, neurotransmitter specification and gangliogenesis, causing functional aganglionosis. Development 138, 4789–4800 (2011).
33. Bates, M. D., Dunagan, D. T., Welch, L. C., Kaul, A. & Harvey, R. P. The Hlx homeobox transcription factor is required early in enteric nervous system development. BMC Dev. Biol. 6, 33 (2006).
34. Carter, T. C. et al. Hirschsprung’s disease and variants in genes that regulate enteric neural crest cell proliferation, migration and differentiation. J. Hum. Genet. 57, 485–93 (2012).
35. Lui, V. C. H. et al. Perturbation of hoxb5 signaling in vagal neural crests down-regulates ret leading to intestinal hypoganglionosis in mice. Gastroenterology 134, 1104–15 (2008).
36. Ramalho-Santos, M., Melton, D. A. & McMahon, A. P. Hedgehog signals regulate multiple aspects of gastrointestinal development. Development 127, 2763–2772 (2000).
37. Tang, C. S. et al. Fine mapping of the 9q31 Hirschsprung’s disease locus. Hum. Genet. 127, 675–683 (2010).
38. Cheng, W. W.-C. et al. Depletion of the IKBKAP ortholog in zebrafish leads to hirschsprung disease-like phenotype. World J. Gastroenterol. 21, 2040–6 (2015).
39. Breau, M. A. et al. Lack of beta1 integrins in enteric neural crest cells leads to a Hirschsprung-like phenotype. Development 133, 1725–1734 (2006).
40. Breau, M. A., Dahmani, A., Broders-Bondon, F., Thiery, J.-P. & Dufour, S. Beta1 integrins are required for the invasion of the caecum and proximal hindgut by enteric neural crest cells. Development 136, 2791–801 (2009).
41. Brooks, A. S. et al. Homozygous nonsense mutations in KIAA1279 are associated with malformations of the central and enteric nervous systems. Am. J. Hum. Genet. 77, 120–6 (2005).
42. Drévillon, L. et al. KBP-cytoskeleton interactions underlie developmental anomalies in Goldberg-Shprintzen syndrome. Hum. Mol. Genet. 22, 2387–99 (2013).
43. Dafsari, H. S. et al. Goldberg-Shprintzen megacolon syndrome with associated sensory motor axonal neuropathy. Am. J. Med. Genet. A 167, 1300–4 (2015).
44. Okamoto, N., Wada, Y. & Goto, M. Hydrocephalus and Hirschsprung’s disease in a patient with a mutation of L1CAM. J. Med. Genet. 34, 670–671 (1997).
45. Parisi, M. A. et al. Hydrocephalus and intestinal aganglionosis: is L1CAM a modifier gene in Hirschsprung disease? Am. J. Med. Genet. 108, 51–6 (2002).
46. Okamoto, N. et al. Hydrocephalus and Hirschsprung’s disease with a mutation of L1CAM. J. Hum. Genet. 49, 334–7 (2004).
47. Jackson, S.-R. et al. L1CAM mutation in association with X-linked hydrocephalus and Hirschsprung’s disease. Pediatr. Surg. Int. 25, 823–5 (2009).
48. Garcia-Barcelo, M. et al. TTF-1 and RET promoter SNPs: regulation of RET transcription in Hirschsprung’s disease. Hum. Mol. Genet. 14, 191–204 (2005).
49. Garcia-Barcelo, M.-M. et al. Genome-wide association study identifies NRG1 as a susceptibility locus for Hirschsprung’s disease. Proc. Natl. Acad. Sci. U. S. A. 106, 2694–9 (2009).
50. Luzón-Toro, B. et al. Comprehensive Analysis of NRG1 Common and Rare Variants in Hirschsprung Patients. PLoS One 7, e36524 (2012).
51. Tang, C. S.-M. et al. Genome-wide copy number analysis uncovers a new HSCR gene: NRG3. PLoS Genet. 8, e1002687 (2012).
52. Yang, J. et al. Exome sequencing identified NRG3 as a novel susceptible gene of Hirschsprung’s disease in a Chinese population. Mol. Neurobiol. 47, 957–66 (2013).
53. Doray, B. et al. Mutation of the RET ligand, neurturin, supports multigenic inheritance in Hirschsprung disease. Hum. Mol. Genet. 7, 1449–1452 (1998).
54. Ruiz-Ferrer, M. et al. Novel mutations at RET ligand genes preventing receptor activation are associated to Hirschsprung’s disease. J. Mol. Med. (Berl). 89, 471–80 (2011).
55. Heuckeroth, R. O. et al. Gene targeting reveals a critical role for neurturin in the development and maintenance of enteric, sensory, and parasympathetic neurons. Neuron 22, 253–63 (1999).
56. Chalazonitis, A. et al. Neurotrophin-3 is required for the survival-differentiation of subsets of developing enteric neurons. J. Neurosci. 21, 5620–5636 (2001).
57. Fernández, R. M. et al. A novel point variant in NTRK3, R645C, suggests a role of this gene in the pathogenesis of Hirschsprung disease. Ann. Hum. Genet. 73, 19–25 (2009).
58. Lang, D. et al. Pax3 is required for enteric ganglia formation and functions with Sox10 to modulate expression of c-ret. J. Clin. Invest. 106, 963–71 (2000).
59. Amiel, J. et al. Polyalanine expansion and frameshift mutations of the paired-like homeobox gene PHOX2B in congenital central hypoventilation syndrome. Nat. Genet. 33, 459–61 (2003).
60. Garcia-Barceló, M. et al. Association study of PHOX2B as a candidate gene for Hirschsprung’s disease. Gut 52, 563–7 (2003).
61. Pattyn, A., Morin, X., Cremer, H., Goridis, C. & Brunet, J. F. The homeobox gene Phox2b is essential for the development of autonomic neural crest derivatives. Nature 399, 366–70 (1999).
62. Ngan, E. S.-W. et al. Hedgehog/Notch-induced premature gliogenesis represents a new disease mechanism for Hirschsprung disease in mice and humans. J. Clin. Invest. 121, 3467–78 (2011).
63. Wang, Y. et al. Common Genetic Variations in Patched1 (PTCH1) Gene and Risk of Hirschsprung Disease in the Han Chinese Population. PLoS One 8, 1–8 (2013).
64. Edery, P. et al. Mutations of the RET proto-oncogene in Hirschsprung’s disease. Nature 367, 378–380 (1994).
65. Romeo, G. et al. Point mutations affecting the tyrosine kinase domain of the RET proto-oncogene in Hirschsprung’s disease. Nature 367, 377–378 (1994).
66. Schuchardt, A., D’Agati, V., Larsson-Blomberg, L., Costantini, F. & Pachnis, V. Defects in the kidney and enteric nervous system of mice lacking the tyrosine kinase receptor Ret. Nature 367, 380–383 (1994).
67. Warren, M. et al. A Sall4 mutant mouse model useful for studying the role of Sall4 in early embryonic development and organogenesis. Genesis 45, 51–58 (2007).
68. Jiang, Q. et al. Functional loss of semaphorin 3C and/or semaphorin 3D and their epistatic interaction with ret are critical to Hirschsprung disease liability. Am. J. Hum. Genet. 96, 581–96 (2015).
69. Luzón-Toro, B. et al. Mutational spectrum of semaphorin 3A and semaphorin 3D genes in Spanish Hirschsprung patients. PLoS One 8, e54800 (2013).
70. Fu, M. Sonic hedgehog regulates the proliferation, differentiation, and migration of enteric neural crest cells in gut. J. Cell Biol. 166, 673–684 (2004).
71. Pingault, V. et al. SOX10 mutations in patients with Waardenburg-Hirschsprung disease. Nat. Genet. 18, 171–173 (1998).
72. Touraine, R. L. et al. Neurological phenotype in Waardenburg syndrome type 4 correlates with novel SOX10 truncating mutations and expression in developing brain. Am. J. Hum. Genet. 66, 1496–503 (2000).
73. Sánchez-Mejías, A. et al. Involvement of SOX10 in the pathogenesis of Hirschsprung disease: report of a truncating mutation in an isolated patient. J. Mol. Med. (Berl). 88, 507–14 (2010).
74. Southard-Smith, E. M., Kos, L. & Pavan, W. J. Sox10 mutation disrupts neural crest development in Dom Hirschsprung mouse model. Nat. Genet. 18, 60–64 (1998).
75. Taketomi, T. et al. Loss of mammalian Sprouty2 leads to enteric neuronal hyperplasia and esophageal achalasia. Nat. Neurosci. 8, 855–7 (2005).
76. Amiel, J. et al. Mutations in TCF4, encoding a class I basic helix-loop-helix transcription factor, are responsible for Pitt-Hopkins syndrome, a severe epileptic encephalopathy associated with autonomic dysfunction. Am. J. Hum. Genet. 80, 988–993 (2007).
77. Zweier, C. et al. Haploinsufficiency of TCF4 causes syndromal mental retardation with intermittent hyperventilation (Pitt-Hopkins syndrome). Am. J. Hum. Genet. 80, 994–1001 (2007).
78. Peippo, M. M. et al. Pitt-Hopkins syndrome in two patients and further definition of the phenotype. Clin. Dysmorphol. 15, 47–54 (2006).
79. Wakamatsu, N. et al. Mutations in SIP1, encoding Smad interacting protein-1, cause a form of Hirschsprung disease. Nat. Genet. 27, 369–70 (2001).
80. Van de Putte, T. et al. Mice lacking ZFHX1B, the gene that codes for Smad-interacting protein-1, reveal a role for multiple neural crest cell defects in the etiology of Hirschsprung disease-mental retardation syndrome. Am. J. Hum. Genet. 72, 465–70 (2003).
81. Zhang, Y. & Niswander, L. Zic2 is required for enteric nervous system development and neurite outgrowth: a mouse model of enteric hyperplasia and dysplasia. Neurogastroenterol. Motil. 25, 538–41 (2013).
OVEREXPRESSION OF THE CHROMOSOME 21 GENE
ATP5O RESULTS IN FEWER ENTERIC NEURONS;
THE MISSING LINK BETWEEN DOWN SYNDROME
AND HIRSCHSPRUNG DISEASE?
Rajendra K. Chauhan1, Rizky Lasabuda1, Duco Schriemer2, William W.C. Cheng1,
Zakia Azmani1, Bianca M. de Graaf1, Alice S. Brooks1, Sarah Edie3, Roger H. Reeves3,
Bart J.L. Eggen2, Alan J. Burns1,5, Iain T. Shepherd4, Robert M.W. Hofstra1,5
1 Department of Clinical Genetics, Erasmus Medical Center Rotterdam, Rotterdam, The Netherlands
2 Department of Neuroscience, section Medical Physiology, University of Groningen, University Medical Center
Groningen, Groningen, The Netherlands
3 Johns Hopkins University School of Medicine, Department of Physiology and McKusick-Nathans Institute for
Genetic Medicine, Baltimore, USA
4 Department of Biology, Emory University, Atlanta, USA
5 Birth Defects Research Centre, UCL Institute of Child Health, London, United Kingdom
Manuscript in preparation
CHAPTER 6
ABSTRACT
Hirschsprung disease (HSCR) is characterized by the absence of enteric ganglia in
the distal region of the gastrointestinal tract, leading to severe intestinal
obstruction. Around 12% of patients with HSCR have a chromosomal abnormality,
most of whom have Down syndrome (DS; trisomy 21). Moreover, individuals
with DS have a >100-fold higher risk of developing HSCR than the general
population. This suggests that overexpression of human chromosome 21 (Hsa21)
genes contributes to the etiology of HSCR. To identify the gene(s) contributing to
HSCR in DS, we overexpressed candidate genes in a reporter zebrafish line,
Tg(-8.3bphox2b:Kaede), in which neural crest-derived cells express the fluorescent
Kaede protein. We prioritized 21 genes and overexpressed them by microinjecting
capped mRNAs into single-cell stage zebrafish embryos, which were scored at 5 days
post fertilization (dpf). We show that overexpression of ATP5O (ATP synthase, H+
transporting, mitochondrial F1 complex, O subunit) leads to a disturbed enteric
nervous system (ENS) with a reduced number of enteric neurons, strongly
implicating ATP5O as a contributor to a HSCR phenotype. The ATP5O gene encodes
a component of the F-type ATPase found in the mitochondrial matrix, which
participates in ATP synthesis-coupled proton transport. ATP5O is not linked to the
known HSCR pathways, and although we show expression of the protein in the
enteric ganglia, its involvement in ENS development and disease development is
yet to be uncovered.
OVEREXPRESSION OF ATP5O AS THE LINK BETWEEN DS AND HSCR?
INTRODUCTION
Hirschsprung disease (HSCR, MIM #142623) is a complex congenital gut motility
disorder resulting from a failure in the development of the enteric nervous system
(ENS) of the gastrointestinal (GI) tract. It is characterized by the absence of enteric
ganglia in a variable length of the distal gut. HSCR is recognized by a failure to pass
meconium in the first 48 hr after birth, abdominal distention, vomiting, and
neonatal enterocolitis. It leads to severe intestinal obstruction and life-threatening
constipation. The prevalence of HSCR is 1 in 5000 live births and there is an
unexplained sex bias of four males to one female1. The lack of neurons in the distal
part of the GI tract results from a failure of enteric neural crest cells (NCC) to
migrate, differentiate, proliferate or survive, and thereby to colonize the gut and form
a functional network of neurons and glia (reviewed by Sasselli et al., 2012; ref. 2).
HSCR is considered an inherited disease, based on the existence of
familial cases (~5%) and the 200-fold increased risk of HSCR in siblings of
patients3. Highly penetrant coding mutations in approximately 15 genes have
been identified that cause or contribute to HSCR (for review see4). The major gene in
HSCR is RET, with a mutation prevalence of 50% in familial HSCR and 15% in
sporadic HSCR5,6. However, cumulatively, the mutations in HSCR-associated
genes explain only a small fraction of cases. In addition to the highly penetrant
coding mutations, common low-penetrance polymorphic variants at RET, in the
region containing SEMA3C/SEMA3D and in NRG1 are also associated with HSCR7-10.
Altogether, however, the heritability of the vast majority (~80%) of HSCR cases is
still to be uncovered. Genetic factors that may explain the missing
heritability could be found by analyzing known HSCR linkage regions, syndromic
HSCR cases, or the chromosomal abnormalities often identified in HSCR
patients.
HSCR is associated with chromosomal abnormalities in 12% of all cases. In
this study we focused on the most common chromosomal abnormality found in
HSCR, trisomy 21. Trisomy 21, leading to Down syndrome (DS), is the most
frequent cause of learning difficulties, with an incidence of 1 in 750 live births11.
The incidence of DS among HSCR patients ranges from 2% to 10%3. Moreover, DS
patients have a >100-fold higher risk of developing HSCR than the general
population3. This suggests that overexpression of one or more genes on
chromosome 21 may contribute substantially to HSCR development in
DS-associated HSCR cases. However, none of the established HSCR genes are located
on chromosome 21. Existing animal models for DS have not, as yet, been explored
in detail for ENS-related defects, and despite the vast knowledge available, this
association remains poorly understood.
Here we aimed to identify the gene(s) on chromosome 21 that could
contribute to the HSCR phenotype. We injected mRNA of selected Hsa21 genes into
a transgenic zebrafish reporter model, and found that elevated levels of one of the
chromosome 21 genes, ATP5O, resulted in altered ENS development and a HSCR-
like phenotype. Moreover, we show that ATP5O is expressed in the zebrafish gut
and in the myenteric and submucosal ganglia of human postnatal colon sections.
METHODS
Prioritizing Hsa21 candidate HSCR Genes
In this study we first prioritized candidate genes based on genetic data and
literature. The genetic data we used were: conservation of the genes between
human and mouse12,13; expression of the genes in mouse enteric NCC (in-house
RNA sequencing data); whether the genes encode transcription factors; and presence
of the genes in segmentally duplicated regions of chromosome 21 in DS/HSCR
patients14. In our literature search we took into consideration: previous studies on
associations between DS and HSCR; Hsa21 genes that are involved in ENS and gut
development; genes related to neuronal development; genes involved in neuronal
signaling; and known animal models of DS.
Hsa21 clone sets
To be able to microinject capped human mRNAs into 1-cell stage
Tg(-8.3bphox2b:Kaede) zebrafish, the prioritized Hsa21 genes were sub-cloned into
pCS2+. Cultures were grown overnight, followed by plasmid isolation and purification
using the NucleoBond® Xtra plasmid purification system (Macherey-Nagel, Nagel,
2012). All constructs were verified by DNA sequencing. A pSG5-huAPP-695
construct was used for the APP clone.
Zebrafish husbandry and strains
The Tg(-8.3bphox2b:Kaede) zebrafish line expresses the fluorescent Kaede protein
in phox2b-expressing cells, including those of the ENS15. The retsa2684/+ zebrafish line
was obtained directly from the Zebrafish International Resource Center (ZIRC)16. Both
the Tg(-8.3bphox2b:Kaede) and retsa2684/+ zebrafish lines were maintained by
pairwise mating. A cross between retsa2684/+ and Tg(-8.3bphox2b:Kaede) was
performed to generate Tg(-8.3bphox2b:Kaede); retsa2684/+ fish. Zebrafish were
maintained at 28°C according to the standard zebrafish laboratory protocols17.
Embryos were scored for ENS defects and abnormal phenotypes at 5 dpf as
described below. The institutional review board for experimental animals of
Erasmus MC, Rotterdam approved the use of zebrafish embryos for this study. All
procedures and fish experiments were performed in accordance with Dutch animal
welfare legislation and the regulations of the Erasmus Dierexperimenteel Centrum (EDC).
In vitro transcription of mRNA and microinjections into zebrafish embryos
In order to generate capped mRNA for microinjections, the plasmids were
linearized with an appropriate restriction enzyme. After digestion, the plasmid
DNA was cleaned using a phenol chloroform extraction method followed by
ethanol precipitation. The linearized plasmids were used for in vitro synthesis of
capped mRNA using the mMESSAGE mMACHINE SP6 kit (Ambion Inc., AM1340).
The RNA was purified using the RNeasy mini kit (Qiagen, Inc., 74104) and loaded
on a 2% agarose gel to assess RNA quality and integrity. Capped mRNA
was quantified using the NanoDrop 8000 (Thermo). RNA samples were
stored at -80°C. Capped mRNA was diluted in nuclease-free water and
microinjected into 1-cell stage zebrafish embryos to overexpress the
genes, as described previously18. Different dosages of each mRNA (5pg, 10pg, 50pg,
100pg, 150pg, 200pg and 250pg) were injected to determine their effect on ENS
development. The mRNA-injected animals were raised in E3 medium until 5dpf at
28°C. Non-injected control (NIC) embryos served as positive controls for survival.
Imaging and neuronal counting in zebrafish
Zebrafish embryos injected with capped mRNA and NIC were scored using a Leica
MZ16FA microscope for any visible phenotype under bright field and by using a
GFP filter to image phox2b-positive enteric NCC. For analysis, zebrafish embryos
were anesthetized using tricaine in E3 medium and then mounted on 0.3%
agarose gel for image capture. Digital images were acquired using a Leica
MZ16FA microscope. Fluorescence images were acquired with the same settings for
each image. Images were processed with Leica LAS and Adobe Photoshop CS
software. To count the number of enteric neurons, we used an in-house
algorithm run semi-automatically in the FIJI image analysis software (Figure
1C,D).
Whole mount in situ hybridization in zebrafish
A 686bp fragment of zebrafish atp5o cDNA was amplified by RT-PCR using the
primer pair 5’-TTTCATCCCAGACCAGTACG-3’ (forward) and
5’-GGTATCCCTGATCAGCTTGG-3’ (reverse). The amplified PCR product was ligated
directly into the pCR®II-TOPO vector using the Dual Promoter TA cloning kit
(Invitrogen). Positive clones were confirmed by DNA sequencing for
orientation and correct sequence and used to generate antisense and sense
probes to detect atp5o mRNA expression. Whole-mount in situ hybridization was
carried out as previously described19. We used the DIG RNA labelling kit (Roche) to
generate digoxigenin-labeled riboprobes against atp5o. Stained embryos were
mounted in 70% glycerol. The images were acquired using a Leica MZ16FA
microscope.
Figure 1. Schematic overview of experimental procedure. A) Scheme for overexpression of
prioritized candidate genes from Hsa21. B) Tg(-8.3bphox2b:Kaede) zebrafish line in bright field and
under GFP filter; phox2b-expressing neural crest cells are marked with fluorescent Kaede protein. C) The
enteric neurons of the intestinal region corresponding to 8 myotomes from the urogenital opening were
selected as shown. D) Using FIJI software, the enteric neurons were counted as represented in the
picture.
Zebrafish genotyping
Tg(-8.3bphox2b:Kaede); retsa2684/+ embryos were grown until 5dpf for phenotyping.
DNA was extracted from individual embryos and genotyping PCR was performed
to distinguish mutants from wildtype using the gene-specific primers ret-wt-F1
(5’GATCTCGTTCGCCTGGC3’), ret-mut-F1 (5’GATCTCGTTCGCCTGGT3’) and ret-wt-
R1 (5’GGGGGCGTGTGACTAATTT3’).
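The two forward primers listed above are identical except for their final base, which suggests an allele-specific PCR design in which the 3'-terminal nucleotide discriminates the wild-type from the mutant ret allele. The short Python sketch below checks this directly from the sequences given in the text. The function name is illustrative, and reading the 3' base as the discriminating position is an assumption, since the text does not describe the sa2684 lesion itself.

```python
# Genotyping primers as given in the text (written 5'->3').
RET_WT_F1 = "GATCTCGTTCGCCTGGC"
RET_MUT_F1 = "GATCTCGTTCGCCTGGT"
RET_WT_R1 = "GGGGGCGTGTGACTAATTT"  # shared reverse primer


def mismatch_positions(a: str, b: str) -> list[int]:
    """Return the 0-based positions at which two equal-length primers differ."""
    if len(a) != len(b):
        raise ValueError("primers must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = mismatch_positions(RET_WT_F1, RET_MUT_F1)
# Report where the primers differ and which base each carries at its 3' end.
print(diffs, RET_WT_F1[-1], RET_MUT_F1[-1])
```

With such a design, each embryo would typically be tested in two reactions (wild-type forward plus reverse, mutant forward plus reverse), and the pattern of amplification would indicate the genotype.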
Immunohistochemistry on human colon material
Control postnatal human colon tissues were obtained from the Pathology
Department repository of the Erasmus University Medical Center.
Immunohistochemical (IHC) staining was performed using the Ventana
Benchmark Ultra automated staining system (Ventana Medical Systems, Tucson, AZ,
USA). Briefly, after deparaffinization the sectioned specimens for IHC detection of
ATP5O were processed for 60 min of antigen retrieval using Cell Conditioning
Solution (CC1, Ventana 950-124). After 30 min of incubation with the primary
antibody at 36°C (ATP5O 1:200), detection with the UltraView Universal DAB detection
kit (Ventana 760-500) was performed after amplification with the UltraView
amplification kit (Ventana 760-080). The sections were counterstained with
hematoxylin II (Ventana 790-2208).
Epistasis between ATP5O and ret in zebrafish
1 ng of translation-blocking antisense morpholino against ret20 and 50 pg of ATP5O
capped mRNA were co-injected into 1-cell stage Tg(-8.3bphox2b:Kaede) embryos.
Embryos injected with either ret morpholino or ATP5O capped mRNA alone served as
controls. At 5 dpf, embryos were imaged, and enteric neurons in the distal-most
intestine, over a length of three myotomes, were counted and compared.
Cell culture and transfections
The SK-N-SH neuroblastoma cell line (ATCC # HTB-11) was cultured according to
the ATCC’s protocol (LGC Standards, Middlesex, UK) and incubated at 37°C
with 5% CO2. Approximately 10^6 cells were cultured in 1 well of a
6-well plate for 24 hr prior to transient transfection. Cells were transfected with
1µg of DNA construct containing ATP5O (pCS2+/ATP5O) or empty vector (pCS2+),
and untransfected (UT) cells served as a negative control. Transfections were
done using 4µl GeneJuice transfection reagent (Novagen, 70967, Millipore)
according to the manufacturer’s instructions. Cells were starved in serum-free
medium for 48 hr prior to harvesting and analysis.
Cell apoptosis and cell proliferation assay
Cell apoptosis was assessed by FACS analysis using PE Annexin V Apoptosis
Detection Kit I (BD Pharmingen™) as per the manufacturer’s instructions. Cells were
washed with PBS. Early apoptotic cells were identified as PE Annexin V-positive
and 7AAD-negative, while cells positive for both PE Annexin V and 7AAD were
marked as late apoptotic cells. For cell cycle staining assays, ethanol-fixed cells were
stained with propidium iodide (PI) for 30 min at room temperature. Stained cells
were analyzed on a FACS flow cytometer (BD Biosciences, San Jose, CA) and for
both assays data analysis was performed using FlowJo.
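The gating logic described above can be sketched as a simple quadrant rule on the two fluorescence channels. This is only an illustration in Python, not the FlowJo workflow used in the study: the cutoffs and event intensities are hypothetical, the "viable" label for double-negative events is an assumed convention, 7AAD-only events are left unclassified because the text does not define them, and double-positive events are labelled late apoptotic to match the early/late distinction made in the Results.

```python
def classify_cell(annexin_v: float, seven_aad: float,
                  annexin_cutoff: float, seven_aad_cutoff: float) -> str:
    """Assign one event to a staining quadrant from its two intensities."""
    annexin_pos = annexin_v > annexin_cutoff
    seven_aad_pos = seven_aad > seven_aad_cutoff
    if annexin_pos and seven_aad_pos:
        return "late apoptotic"      # positive for both stains
    if annexin_pos:
        return "early apoptotic"     # Annexin V-positive, 7AAD-negative
    if seven_aad_pos:
        return "unclassified"        # 7AAD-only: not defined in the text
    return "viable"                  # double negative (assumed label)


# Hypothetical events (Annexin V intensity, 7AAD intensity) and cutoffs.
events = [(50, 20), (900, 30), (800, 700), (40, 600)]
labels = [classify_cell(a, s, annexin_cutoff=200, seven_aad_cutoff=200)
          for a, s in events]
print(labels)
```

In practice the cutoffs would be set from unstained and single-stained control samples rather than fixed numbers.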
Statistical analysis
Results are presented as means ± standard deviation (SD). Statistical significance
of differences between two groups was assessed with an unpaired two-tailed
t-test.
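As a worked illustration of the test statistic, the sketch below computes the t value and degrees of freedom for an unpaired comparison using only the Python standard library. It assumes the equal-variance (Student) form of the test, which the text does not specify, and the neuron counts are hypothetical values chosen for easy arithmetic, not data from this study. Converting t to a two-tailed p-value requires the t distribution, for which a statistics package (for example `scipy.stats.ttest_ind`) would normally be used.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for an unpaired, equal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    dof = n_a + n_b - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / dof
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, dof


# Hypothetical enteric neuron counts, for illustration only.
control = [160, 150, 170, 155, 165]
mutant = [90, 100, 80, 95, 85]
t_stat, dof = student_t(control, mutant)
print(f"t = {t_stat:.2f}, df = {dof}")
```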
RESULTS
Prioritization of candidate genes and generation of cDNA clones
To test which gene(s) on chromosome 21 contribute(s) to HSCR in DS patients, a
selection of the most promising candidate genes was made. A total of 169 genes
were initially assembled for screening. They consisted of 149 Hsa21 genes that are
conserved between human and mouse and another 20 genes that are non-
conserved, but are potentially interesting, and human specific12,13. From these 169
candidates we selected genes encoding transcription factors, genes involved in
neuronal development and genes reported as involved in DS with or without gut
abnormalities (such as HSCR). Following these criteria, we generated a subset of
65 candidate genes, among which we further prioritized genes based on
their expression in E14.5 mouse enteric NCC (in-house RNA sequencing data) and
on functional evidence from studies in other model organisms. This pipeline
resulted in a shortlist of 28 genes (Table 1). We were able to synthesize capped
mRNAs for 21 of them; the remaining 7 genes were excluded because of technical
difficulties. A schematic of the experimental design is shown in Figure 1A.
Table 1. List of the 28 prioritized Hsa21 genes for overexpression.
No.  Gene name  Gene start (hg19, bp)  Accession number  Conservation in zebrafish  Microinjection status
1    APP        27252861               NM_201414         Yes                        Yes
2    ATP5O      35275757               NM_001697         Yes                        Yes
3    BACH1      30566392               BC063307          Yes                        Yes
4    BRWD1      40556102               NM_001007246      Yes                        Yes
5    BTG3       18965971               NM_001130914      Yes                        Yes
6    CBR1       37442239               NM_001757         Yes                        Yes
7    CHAF1B     37757676               NM_005441         Yes                        Yes
8    CHODL      19165801               NM_024944         Yes                        Yes
9    DSCAM      41382926               AB384859          Yes                        Yes
10   DYRK1A     38739236               BC156309          Yes                        Yes
11   HMGN1      40714241               NM_004965         No                         Yes
12   PCBP3      47063608               BC012061          Yes                        Yes
13   PDE9A      44073746               NM_001001567      Yes                        Yes
14   PIGP       38435146               NM_153681         Yes                        Yes
15   PKNOX1     44394620               NM_004571         Yes                        Yes
16   RCAN1      35885440               BC002864          Yes                        Yes
17   RUNX1      36160098               BC069929          Yes                        Yes
18   SH3BGR     40817781               NM_001001713      Yes                        Yes
19   SIM2       38071433               NM_005069         Yes                        Yes
20   SOD1       33031935               NM_000454         Yes                        Yes
21   SUMO3      46191374               NM_006936         Yes                        Yes
22   DSCR3      38591910               BC110655          Yes                        No
23   ETS2       40177231               NM_005239         Yes                        No
24   HLCS       38123493               NM_000411         Yes                        No
25   ITSN1      35014706               BC116186          Yes                        No
26   TIAM1      32361860               BC117196          Yes                        No
27   TTC3       38445571               BC137345          Yes                        No
28   WRB        40752170               NM_004627         Yes                        No
Overview of the 28 prioritized genes for overexpression. mRNA was injected into the
Tg(-8.3bphox2b:Kaede) zebrafish for the first 21 genes. The last 7 genes were omitted due to failed mRNA
generation.
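The selection funnel and the table can be cross-checked with a few lines of Python. The sketch below transcribes the gene names and the two yes/no columns of Table 1 and tallies them against the numbers quoted in the text (149 + 20 genes initially, 28 shortlisted, 21 injected). The variable names are illustrative only.

```python
# (gene, conserved in zebrafish, microinjected), transcribed from Table 1.
TABLE_1 = [
    ("APP", True, True), ("ATP5O", True, True), ("BACH1", True, True),
    ("BRWD1", True, True), ("BTG3", True, True), ("CBR1", True, True),
    ("CHAF1B", True, True), ("CHODL", True, True), ("DSCAM", True, True),
    ("DYRK1A", True, True), ("HMGN1", False, True), ("PCBP3", True, True),
    ("PDE9A", True, True), ("PIGP", True, True), ("PKNOX1", True, True),
    ("RCAN1", True, True), ("RUNX1", True, True), ("SH3BGR", True, True),
    ("SIM2", True, True), ("SOD1", True, True), ("SUMO3", True, True),
    ("DSCR3", True, False), ("ETS2", True, False), ("HLCS", True, False),
    ("ITSN1", True, False), ("TIAM1", True, False), ("TTC3", True, False),
    ("WRB", True, False),
]

# Funnel sizes reported in the Results text.
conserved_hsa21, non_conserved = 149, 20
initial_pool = conserved_hsa21 + non_conserved

injected = [gene for gene, _, was_injected in TABLE_1 if was_injected]
omitted = [gene for gene, _, was_injected in TABLE_1 if not was_injected]
not_in_zebrafish = [gene for gene, conserved, _ in TABLE_1 if not conserved]

print(initial_pool, len(TABLE_1), len(injected), omitted, not_in_zebrafish)
```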
Tg(-8.3bphox2b:Kaede); retsa2684/+ mutants display ENS defect
In humans, loss-of-function mutations in the RET gene result in HSCR. In this study
we used the retsa2684/+ zebrafish, which was identified in an ENU mutagenesis project, as
a positive control for a HSCR-like phenotype16. The retsa2684/+ line was crossed with
the Tg(-8.3bphox2b:Kaede) reporter zebrafish line and the number of enteric
neurons was scored at 5dpf, followed by genotyping. The retsa2684/+ mutant embryos
contained significantly fewer enteric neurons in the gut than control animals,
indicating an HSCR-like phenotype (Figure 2A-D). Quantification of
enteric neurons in the region corresponding to 8 myotomes from the urogenital opening
demonstrated a significant reduction in the number of enteric neurons in retsa2684/+ fish
(88 ± 41) compared to WT fish (158 ± 23) (p<0.0001, Figure 2G). These data show
that the Tg(-8.3bphox2b:Kaede) line is a suitable animal model for HSCR-like
aganglionosis.
Overexpression of selected candidate gene mRNA in a zebrafish model
Capped mRNAs of the 21 selected Hsa21 genes were injected into
Tg(-8.3bphox2b:Kaede) zebrafish (Figure 1A,B), which were subsequently examined at
5 dpf (as described in the Methods section). The mRNA dosage was titrated over a
range of 5pg to 250pg to find the optimal dosage for each mRNA, based on the
lethality and phenotype observed. Injection resulted in a normal ENS
phenotype for all mRNAs except one. Only overexpression of ATP5O (100pg)
reduced the number of enteric neurons along the entire intestine, with normal
gross morphology, when compared to the non-injected
controls (Figure 2E,F). The percentage of zebrafish displaying a reduction in enteric
neurons remained similar at a higher dosage (150pg). In the gut region
corresponding to 8 myotomes from the urogenital opening,
the average enteric neuron count for the controls was 178 ± 23 (Figure 2H). We classified
a gut as hypo-neuronal when the fish contained at least 2 SD fewer enteric neurons
than the average control zebrafish. Of the fish injected with ATP5O mRNA,
37.5% (15/40) displayed such a reduction in the
number of enteric neurons in the gut. The enteric neuron count in these affected
embryos was 113 ± 18 (Figure 2H), showing
that elevated levels of ATP5O interfere with normal development of the ENS.
Figure 2. Reduced numbers of enteric neurons in the retsa2684/+ mutant fish and ATP5O mRNA
injected fish. A,B) The control Tg(-8.3bphox2b:Kaede) fish at 5 dpf in bright field and under GFP
filter, showing fluorescently tagged phox2b-expressing cells; the gut is completely colonized with
enteric neurons up to the urogenital opening. The asterisk indicates the urogenital opening. C,D) Enteric
neurons along the gut of the Tg(-8.3bphox2b:Kaede); retsa2684/+ fish. The heterozygous ret mutant displayed fewer
neurons in the gut at 5dpf; discontinuity of colonization of the gut is indicated by an arrowhead. E,F)
Zebrafish injected with ATP5O mRNA at 100pg dosage show fewer enteric neurons in the gut. G)
Quantification of enteric neurons in the intestine corresponding to 8 myotomes in 5dpf ret mutants
compared to control fish, showing a significant reduction. H) Quantification of the enteric neuron count
of non-injected controls compared to ATP5O-overexpressing embryos. A total of 37.5% of embryos
injected with 100pg of ATP5O displayed a reduction in enteric neurons.
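The hypo-neuronal cutoff follows directly from the control statistics reported above. The minimal Python sketch below works through the arithmetic; it treats the boundary as inclusive (a count exactly 2 SD below the mean counts as hypo-neuronal), which is an assumption, and the function name is illustrative.

```python
CONTROL_MEAN = 178  # mean enteric neuron count in non-injected controls (from the text)
CONTROL_SD = 23     # standard deviation of the control counts (from the text)

# Cutoff: 2 SD below the control mean.
THRESHOLD = CONTROL_MEAN - 2 * CONTROL_SD


def is_hypo_neuronal(count: int) -> bool:
    """True if the count lies at least 2 SD below the control mean."""
    return count <= THRESHOLD


# Fraction of ATP5O-injected embryos reported as affected (15 of 40).
affected_fraction = 15 / 40
print(THRESHOLD, is_hypo_neuronal(113), f"{affected_fraction:.1%}")
```

Under these numbers the cutoff is 132 neurons, so the mean count of the affected embryos (113) falls well below it, while the control mean does not.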
Expression of atp5o in zebrafish
Whole mount in situ hybridization (ISH) was used to determine the spatio-
temporal expression pattern of atp5o between 1dpf and 5dpf of zebrafish
development. RNA in situ hybridization revealed expression of atp5o in different
organs at different developmental stages. At 1dpf, atp5o expression was seen
ubiquitously (Figure 3B,C). At 2dpf the expression was restricted to the
cerebellum, the otolith and the whole gut (Figure 3E,F). Between 3dpf and 5dpf
atp5o was predominantly expressed in the intestine and cerebellum (Figure
3E,F,H,I,K,L,N,O). At 5dpf, high atp5o expression was observed in the proximal and
mid intestine along with the caudal vein (Figure 3N). The sense probe did not show
any staining at 1dpf – 5dpf stages (Figure 3A,D,G,J,M), confirming the specificity of
the probe.
Expression of ATP5O in postnatal human colon
To assess whether ATP5O is also expressed in the human colon,
immunohistochemistry was performed on postnatal colon from healthy
individuals. ATP5O was specifically detected in the ganglia present in the
submucosal (Figure 4A,B) and myenteric plexuses (Figure 4C,D). In addition,
ATP5O was also detected in the colon epithelium. These results suggest that ATP5O
may be important for ENS development in humans as well.
In vitro assays for cell apoptosis and cell cycle analysis
To examine whether ATP5O overexpression affects early apoptosis or the cell cycle
and thereby leads to fewer neurons in the zebrafish gut, we used a human
neuroblastoma cell line (SK-N-SH) and assayed apoptosis and the cell cycle.
SK-N-SH cells expressing ATP5O were cultured in the absence of serum for 48 hr and
tested for alterations in both apoptosis and the cell cycle. Flow cytometry analysis
did not indicate any significant changes in early or late apoptosis in cells
expressing ATP5O when compared to the other conditions (Supplementary Figure
1A). Flow cytometric analysis of the cell cycle showed a slight increase in the
fraction of cells in the G1
phase as a result of ATP5O overexpression (Supplementary Figure 1B). Although
the observed difference is not statistically significant, it could indicate
some degree of cell cycle arrest in ATP5O-overexpressing cells.
Figure 3. Spatio-temporal expression of atp5o in zebrafish. atp5o expression at the indicated
developmental stages ranging from 1–5dpf in zebrafish embryos (lateral view), detected using ISH.
atp5o is expressed along the GI tract at all stages. It is expressed ubiquitously at 1dpf (B,C) and the
expression becomes restricted to the cerebellum, otolith and whole gut by 2dpf (E,F). Arrowheads indicate
expression in the brain and arrows indicate expression in the gastrointestinal tract
(E,F,H,I,K,L,N,O). The sense probe shows no staining at the 1–5dpf developmental stages, as shown
(A,D,G,J,M).
Epistasis between ATP5O and ret in zebrafish
To investigate whether ATP5O interacts with RET during the development of the
ENS and in the pathogenesis of HSCR in DS, we knocked down ret and
overexpressed ATP5O simultaneously by co-injecting ret translation-blocking
morpholino (1 ng) and ATP5O capped mRNA (50 pg) and compared enteric
neurons in distal intestine at 5 dpf to controls injected with either ret morpholino
or ATP5O mRNA alone (Figure 5). The doses were chosen so that neither was
sufficient to induce a severe ENS defect by itself, so that any synergistic effect between
ret knockdown and ATP5O overexpression would be readily observed. The ret
morpholino caused a mild decrease in enteric neuron number in the distal
intestine compared to the ATP5O mRNA control. However, co-injection of ret
morpholino and ATP5O mRNA did not result in a further significant reduction,
suggesting limited or no synergistic effect between ret knockdown and ATP5O
overexpression.
Other phenotypic effects of injection of DSCAM and SIM2 mRNA
The use of this zebrafish model and its optical transparency allowed us to detect
other gross developmental abnormalities upon overexpression of the prioritized
genes. Injection of two candidate genes (SIM2 and DSCAM) resulted in an abnormal
phenotype. Overexpression of SIM2 (100 pg) resulted in notochord defects in 66%
of the injected embryos at 5dpf (Figure 6A,C) and 33% among them also displayed
craniofacial abnormalities (Supplementary Figure 2A,B). Overexpression of DSCAM
(200 pg) resulted in a deformed notochord and deformed myotomes in 68% of the embryos at
5dpf (Figure 6B,D). The majority of these embryos also lacked a swim bladder.
Microinjection of higher dosages (>200 pg for DSCAM and >100 pg for SIM2) of
these mRNAs induced lethality.
DISCUSSION
This study reports a role for a chromosome 21 gene, ATP5O, in the development of
the ENS in zebrafish using an mRNA overexpression screen, as overexpression of
ATP5O results in reduced numbers of enteric neurons in the zebrafish gut. This
phenotype is comparable to that of the retsa2684/+ zebrafish line that carries a
mutation in ret, a known HSCR gene in humans. This leads us to hypothesize that
elevated levels of ATP5O, as are likely in DS, could contribute to the high
prevalence of HSCR among DS patients.
The zebrafish as a model organism for human enteric neuropathies
The intestinal architecture and anatomy of zebrafish closely resemble those of
mammals21. The zebrafish gut undergoes rapid development and by 5dpf the
whole GI tract is functional22. Compared with that of amniotes, the zebrafish gut is simpler:
it lacks a submucosal layer, and myenteric neurons are arranged as neuronal pairs or
single neurons21. The zebrafish ENS is also derived from NCC, as in other
vertebrates23. In zebrafish, NCC migrate as two parallel chains of cells to colonize
the whole gut and differentiate into enteric neurons and glia24. Despite these
differences, the organization of the ENS, which modulates functions such as
motility, homeostasis and secretion, is comparable to, but less complex than, that of
mammals, making zebrafish a good model for human GI diseases25. Previous studies have
shown that perturbation of zebrafish orthologues of known human HSCR genes
using morpholino-mediated knockdown, as well as some zebrafish mutants for genes
not connected to HSCR, leads to loss of enteric neurons in the zebrafish gut and
recapitulates the human HSCR phenotype10,20,24,26-29. In particular, RET is known to
be the major player in HSCR and in ENS development4,30. For these reasons we
included the retsa2684/+ mutant zebrafish line as a positive control. Indeed, when
we quantified the number of neurons in the hindgut, the most distal part of the zebrafish
intestine and the region in which aganglionosis is mostly observed in HSCR patients,
the number of neurons in this mutant fish was reduced.
Figure 4. Expression of ATP5O in postnatal human colon. Expression of ATP5O detected by
immunohistochemistry on paraffin-embedded postnatal colon sections. Arrowheads indicate
expression of ATP5O in the submucosal plexus (A,B) and myenteric plexus (C,D). ATP5O is also expressed in
the gut epithelia, as shown by arrows (A,B).
ATP5O overexpression results in reduced enteric neurons
Microinjection of ATP5O mRNA resulted in reduced numbers of enteric neurons in
the zebrafish gut comparable to what was found in case of the Tg(-
8.3bphox2b:Kaede); retsa2684/+ zebrafish (Figure 2A-H). ATP5O was the only gene for
which overexpression resulted in ENS defects. That overexpression results
in fewer enteric neurons is perhaps not surprising, as mouse Atp5o is highly
expressed in mouse enteric NCC (in-house RNA sequencing data). ATP5O is also
expressed in the ganglia of the submucosal and myenteric plexuses, as shown in our
studies using control postnatal colon sections. Similarly, atp5o is expressed in
the zebrafish gut during early embryonic development, the period in which the ENS
forms. Furthermore, a meta-analysis of DS phenotypes in segmental
trisomies and their association with congenital gut abnormalities such as HSCR,
duodenal stenosis and intestinal atresia suggested a critical GI region of <13 Mb.
This region also includes ATP5O14. Previous identity-by-descent (IBD) and
association mapping in a large (inbred) Mennonite population also showed that
ATP5O is within the IBD region associated with HSCR31. All these data suggest that
ATP5O might well be responsible for, or at least contribute to, the HSCR phenotype
often seen in DS patients.
The role of ATP5O in HSCR
ATP5O is a mitochondrial gene, which encodes the ATP synthase H+ transporting,
mitochondrial F1 complex, O subunit protein and is also known as Oligomycin
Sensitivity Conferral Protein (OSCP). It is a component of ATP synthase (F(1)F(0)
ATP synthase or Complex V) found in the mitochondrial matrix. ATP synthase is
composed of an extramembranous catalytic core (F1) and a peripheral membrane
proton channel (F0). The encoded protein appears to be part of the connector
linking these two subunits and may be involved in transmission of conformational
changes or proton conductance. It produces ATP from ADP via oxidative
phosphorylation in the presence of a proton gradient across the mitochondrial
membrane. Electron transport complexes of the respiratory chain generate this
gradient32,33. The gene ontology (GO) annotation of ATP5O associates it with drug
binding and transporter activity. It was hypothesized that overexpression of
ATP5O could interfere with the normal subunit composition of ATP synthase,
resulting in an impairment of oxidative phosphorylation34. An imbalance of
expression, as generated in our zebrafish, could potentially impair the subunit
composition of ATP synthase, leading to oxidative phosphorylation disruption and
eventually to perturbed cell proliferation. We found a slight but not significant
effect on cell cycle arrest (G1 phase) upon overexpression of ATP5O in SK-N-SH
cells. It has also been shown that disruption of oxidative phosphorylation can have
a neurotoxic effect on neuronal progenitor cells35, and overproduction of ATP
synthase in Escherichia coli has been reported to affect cell division and
growth36. During early ENS development, the enteric NCCs migrate, proliferate
extensively and differentiate into neurons and glia. ATP5O overexpression could
potentially affect enteric NCC proliferation and lead to fewer neurons in the
zebrafish gut, as observed in our experiments.
Figure 5. Epistasis between ATP5O and ret. Quantification of enteric neurons at 5dpf in the
distal-most intestine (a length corresponding to 3 myotomes) of zebrafish embryos, testing for an
epistatic interaction between ATP5O and ret. Embryos were injected with ATP5O (50pg), ret MO (1ng)
or a combination of both, and the enteric neuronal count is plotted in the graph. There are no
significant differences between ATP5O (50pg) vs 1 ng ret MO (p=0.6587), 1 ng ret MO vs ATP5O (50pg)
+ 1ng ret MO (p=0.5437) and ATP5O (50pg) vs ATP5O (50pg) + 1 ng ret MO (p=0.2146).
Figure 6. Notochord defects in SIM2 and DSCAM mRNA injected zebrafish. Overexpression of SIM2
and DSCAM leads to defects in the notochord, as indicated by arrows in the bright field images of 5dpf
embryos as compared to respective controls (A, B). The notochord is discontinuous and deformed in
embryos in which SIM2 and DSCAM are overexpressed (C, D).
ATP5O does not interact with ret
A previous study showed over-representation of the enhancer
polymorphism RET+9.7 (rs2435357:C>T) in DS-HSCR37. The frequency of the
disease-associated allele differed significantly between individuals with DS alone,
HSCR alone, and those with HSCR and DS, demonstrating an association and interaction
between RET and chromosome 21 gene dosage. However, our zebrafish data did
not demonstrate any interaction between ret and ATP5O in ENS development,
suggesting that they act independently in separate pathways.
Additional phenotypes due to overexpression of Hsa21 genes
In this overexpression screen of 21 candidate genes from Hsa21, we also identified
phenotypic defects other than that of the ENS for DSCAM and SIM2. Zebrafish
dscam is highly expressed in the developing brain. It is thought to be involved in
shaping the nervous system and early morphogenesis of the zebrafish embryo38.
The expression pattern of sim2 in zebrafish has been reported using whole mount
ISH; it is expressed mainly in the diencephalon, the midbrain and the pharyngeal
arches39. Overexpression of DSCAM and SIM2 in zebrafish displayed defects mainly
in notochord development and in the floor plate, exhibiting discontinuity with
some twists and folds upon overexpression of these genes. The notochord is
essential for proper vertebrate development by producing secreted factors that
signal to the surrounding tissues. It is also important for specification of the
ventral fates in the CNS and it plays an important role in patterning and in a proper
structural integrity. The defects in notochord development are possibly due to
uneven cell patterning or selective cell death or defects in signaling pathways
required for normal notochord development (reviewed by Stemple, 200540).
Moreover, in addition to notochord defects, we observed craniofacial abnormalities
in a subset of embryos at 5dpf upon overexpression of SIM2
(Supplementary Figure 2A,B), suggesting that SIM2 may contribute to the
phenotype observed in DS-affected individuals. To our surprise, some candidates
(such as DSCAM, SIM2, and APP) that had already been associated with ENS phenotypes
in previous genetic studies and murine models did not display any visible ENS
phenotype following their overexpression in the zebrafish model. DSCAM has been
highlighted as a predisposing locus to HSCR in patients with DS14,31,41. Our previous
studies, using in vitro methods, have shown that overexpression of SIM2 leads to a
down regulation of the RET gene8. Similarly, a transgenic APP mouse model
displayed reduction in myenteric neuronal density and delay in gut transit42. These
are characteristic features of HSCR in humans, but we were not able to recapitulate
similar phenotypes in the zebrafish model upon their overexpression. This could
be because the regulatory mechanisms required for efficient translation
of certain human RNAs are less efficient in zebrafish, because the human
protein does not have the same effect as the zebrafish protein, or because the
threshold dosage of RNA required to produce a phenotype was not
reached in our study. Furthermore, we cannot rule out unknown feedback
mechanisms that neutralize the effect of overexpression. Alternatively, a
phenotypic effect may simply require combined overexpression of more than
one gene.
Within the list of 21 genes we did not include COL6A4 although it was
recently shown that overexpression of Col6a4 in transgenic mice could lead to a
HSCR-like phenotype43. We did not include it because we had found no
direct or indirect evidence for the involvement of Col6a4 in ENS
development, nor did we see the gene being expressed in mouse enteric NCCs at
E14.5 (in-house RNA sequencing data).
Conclusions
Although the association of DS with HSCR is well recognized, the causative link
between them is not well understood. The majority of DS affected individuals
exhibit GI abnormalities44, which might be related to abnormal ENS development.
Here, we used a transgenic zebrafish line, whose ENS is marked with the
fluorescent Kaede protein, to assay the functional effects of overexpression of
Hsa21 candidate genes. We found that ATP5O affects ENS development in
zebrafish. The use of a vertebrate model to find the missing link between DS and
HSCR opens the door for larger screens and better understanding of this complex
association.
ACKNOWLEDGEMENTS
This study was supported by a research grant from ZonMW (TOP-subsidie
40-00812-98-10042) and the Maag Lever Darm stichting (WO09-62). We thank Prof.
Bart de Strooper (Leuven, Belgium) for providing us with the APP plasmid; Herma
van der Linde and Maria Alves for technical advice and helpful discussions.
REFERENCES
1. Badner, J.A., et al., A genetic study of Hirschsprung disease. Am J Hum Genet, 1990. 46(3): p. 568-80.
2. Sasselli, V., V. Pachnis, and A.J. Burns, The enteric nervous system. Dev Biol, 2012. 366(1): p. 64-73.
3. Amiel, J., et al., Hirschsprung disease, associated syndromes and genetics: a review. J Med Genet, 2008. 45(1): p. 1-14.
4. Alves, M.M., et al., Contribution of rare and common variants determine complex diseases-Hirschsprung disease as a model. Dev Biol, 2013. 382(1): p. 320-9.
5. Attie, T., et al., Diversity of RET proto-oncogene mutations in familial and sporadic Hirschsprung disease. Hum Mol Genet, 1995. 4(8): p. 1381-6.
6. Hofstra, R.M., et al., RET and GDNF gene scanning in Hirschsprung patients using two dual denaturing gel systems. Hum Mutat, 2000. 15(5): p. 418-29.
7. Emison, E.S., et al., A common sex-dependent mutation in a RET enhancer underlies Hirschsprung disease risk. Nature, 2005. 434(7035): p. 857-63.
8. Sribudiani, Y., et al., Variants in RET associated with Hirschsprung's disease affect binding of transcription factors and gene expression. Gastroenterology, 2011. 140(2): p. 572-582 e2.
9. Garcia-Barcelo, M.M., et al., Genome-wide association study identifies NRG1 as a susceptibility locus for Hirschsprung's disease. Proc Natl Acad Sci U S A, 2009. 106(8): p. 2694-9.
10. Jiang, Q., et al., Functional loss of semaphorin 3C and/or semaphorin 3D and their epistatic interaction with ret are critical to Hirschsprung disease liability. Am J Hum Genet, 2015. 96(4): p. 581-96.
11. Parker, S.E., et al., Updated National Birth Prevalence estimates for selected birth defects in the United States, 2004-2006. Birth Defects Res A Clin Mol Teratol, 2010. 88(12): p. 1008-16.
12. Gardiner, K., et al., Mouse models of Down syndrome: how useful can they be? Comparison of the gene content of human chromosome 21 with orthologous mouse genomic regions. Gene, 2003. 318: p. 137-47.
13. Sturgeon, X. and K.J. Gardiner, Transcript catalogs of human chromosome 21 and orthologous chimpanzee and mouse regions. Mamm Genome, 2011. 22(5-6): p. 261-71.
14. Korbel, J.O., et al., The genetic architecture of Down syndrome phenotypes revealed by high-resolution analysis of human segmental trisomies. Proc Natl Acad Sci U S A, 2009. 106(29): p. 12031-6.
15. Harrison, C., T. Wabbersen, and I.T. Shepherd, In vivo visualization of the development of the enteric nervous system using a Tg(-8.3bphox2b:Kaede) transgenic zebrafish. Genesis, 2014. 52(12): p. 985-90.
16. Busch-Nentwich, E., Kettleborough, R., Harvey, S., Collins, J., Ding, M., Dooley, C., Fenyes, F., Gibbons, R., Herd, C., Mehroke, S., Scahill, C., Sealy, I., Wali, N., White, R., and Stemple, D.L., Sanger Institute Zebrafish Mutation Project mutant, phenotype and image data submission. 2012.
17. Westerfield, M., The zebrafish book : a guide for the laboratory use of zebrafish (Danio rerio). 2007, [Eugene, OR]: M. Westerfield.
18. Hyatt, T.M. and S.C. Ekker, Vectors and techniques for ectopic gene expression in zebrafish. Methods Cell Biol, 1999. 59: p. 117-26.
19. Thisse, C. and B. Thisse, High-resolution in situ hybridization to whole-mount zebrafish embryos. Nat Protoc, 2008. 3(1): p. 59-69.
20. Heanue, T.A. and V. Pachnis, Ret isoform function and marker gene expression in the enteric nervous system is conserved across diverse vertebrate species. Mech Dev, 2008. 125(8): p. 687-99.
21. Wallace, K.N., et al., Intestinal growth and differentiation in zebrafish. Mech Dev, 2005. 122(2): p. 157-73.
22. Wallace, K.N. and M. Pack, Unique and conserved aspects of gut development in zebrafish. Dev Biol, 2003. 255(1): p. 12-29.
23. Kelsh, R.N. and J.S. Eisen, The zebrafish colourless gene regulates development of non-ectomesenchymal neural crest derivatives. Development, 2000. 127(3): p. 515-25.
24. Shepherd, I.T., et al., Roles for GFRalpha1 receptors in zebrafish enteric nervous system development. Development, 2004. 131(1): p. 241-9.
25. Shepherd, I. and J. Eisen, Development of the zebrafish enteric nervous system. Methods Cell Biol, 2011. 101: p. 143-60.
26. Cheng, W.W., et al., Depletion of the IKBKAP ortholog in zebrafish leads to hirschsprung disease-like phenotype. World J Gastroenterol, 2015. 21(7): p. 2040-6.
27. Elworthy, S., et al., Phox2b function in the enteric nervous system is conserved in zebrafish and is sox10-dependent. Mech Dev, 2005. 122(5): p. 659-69.
28. Uyttebroek, L., et al., The zebrafish mutant lessen: an experimental model for congenital enteric neuropathies. Neurogastroenterol Motil, 2016. 28(3): p. 345-57.
29. Simonson, L.W., et al., Characterization of enteric neurons in wild-type and mutant zebrafish using semi-automated cell counting and co-expression analysis. Zebrafish, 2013. 10(2): p. 147-53.
30. Emison, E.S., et al., Differential contributions of rare and common, coding and noncoding Ret mutations to multifactorial Hirschsprung disease liability. Am J Hum Genet, 2010. 87(1): p. 60-74.
31. Puffenberger, E.G., et al., Identity-by-descent and association mapping of a recessive gene for Hirschsprung disease on human chromosome 13q22. Hum Mol Genet, 1994. 3(8): p. 1217-25.
32. Devenish, R.J., et al., The oligomycin axis of mitochondrial ATP synthase: OSCP and the proton channel. J Bioenerg Biomembr, 2000. 32(5): p. 507-15.
33. Ronn, T., et al., Genetic variation in ATP5O is associated with skeletal muscle ATP50 mRNA expression and glucose uptake in young twins. PLoS One, 2009. 4(3): p. e4793.
34. Chen, H., et al., Cloning of the cDNA for the human ATP synthase OSCP subunit (ATP5O) by exon trapping and mapping to chromosome 21q22.1-q22.2. Genomics, 1995. 28(3): p. 470-6.
35. Lee, Y., et al., Selective impairment on the proliferation of neural progenitor cells by oxidative phosphorylation disruption. Neurosci Lett, 2013. 535: p. 134-9.
36. von Meyenburg, K., B.B. Jorgensen, and B. van Deurs, Physiological and morphological effects of overproduction of membrane-bound ATP synthase in Escherichia coli K-12. Embo J, 1984. 3(8): p. 1791-7.
37. Arnold, S., et al., Interaction between a chromosome 10 RET enhancer and chromosome 21 in the Down syndrome-Hirschsprung disease association. Hum Mutat, 2009. 30(5): p. 771-5.
38. Yimlamai, D., et al., The zebrafish down syndrome cell adhesion molecule is involved in cell movement during embryogenesis. Dev Biol, 2005. 279(1): p. 44-57.
39. Thisse, B. and C. Thisse, Fast Release Clones: A High Throughput Expression Analysis. ZFIN Direct Data Submission (http://zfin.org). 2004.
40. Stemple, D.L., Structure and function of the notochord: an essential organ for chordate development. Development, 2005. 132(11): p. 2503-12.
41. Jannot, A.S., et al., Chromosome 21 scan in Down syndrome reveals DSCAM as a predisposing locus in Hirschsprung disease. PLoS One, 2013. 8(5): p. e62519.
42. Semar, S., et al., Changes of the enteric nervous system in amyloid-beta protein precursor transgenic mice correlate with disease progression. J Alzheimers Dis, 2013. 36(1): p. 7-20.
43. Soret, R., et al., A collagen VI-dependent pathogenic mechanism for Hirschsprung's disease. J Clin Invest, 2015. 125(12): p. 4483-96.
44. Buchin, P.J., J.S. Levy, and J.N. Schullinger, Down's syndrome and the gastrointestinal tract. J Clin Gastroenterol, 1986. 8(2): p. 111-4.
SUPPLEMENTARY INFORMATION
Supplementary Figure 1. In vitro apoptosis and cell cycle assays. SK-N-SH neuroblastoma cells were transfected with a construct containing ATP5O, starved for 48 hours and analyzed for apoptosis and cell cycle distribution by FACS. Data are represented as mean ± SD for two independent experiments. A) Cells that were PE Annexin V-positive and 7AAD-negative were classified as early apoptotic, while cells positive for both PE Annexin V and 7AAD were classified as apoptotic. There were no differences between ATP5O-transfected cells and controls. B) Cell cycle analysis using propidium iodide (PI) DNA staining indicated no major differences between the cell cycle phases, but there was a slight increase in G1 phase for ATP5O-transfected cells as compared to the controls.
Supplementary Figure 2. Craniofacial abnormalities by SIM2 overexpression. A) Control embryo
at 5dpf. B) SIM2 (100pg) overexpressing embryo at 5dpf displaying a craniofacial abnormality (arrows),
which was seen in 33% of these embryos along with the notochord phenotype. The lower jaw appears
dislodged compared to the control.
CHAPTER 7
GENERAL DISCUSSION
Hirschsprung disease (HSCR) is a congenital disorder in which the muscles in a
variable segment of the distal colon are unable to relax. This results in
constipation, vomiting and distention of the bowel proximal to the affected
segment. The lack of colonic relaxation in HSCR stems from an incomplete
development of the enteric nervous system (ENS). During embryonic development
almost all enteric neural crest cells (ENCCs) that form the ENS enter the foregut
and proliferate, survive and migrate towards the distal hindgut, before they
differentiate into neurons and glial cells1. A distortion of any of these processes
leads to an incomplete colonization of the colon by ENCCs, lack of neuronal
innervation, and hence failure of the colonic smooth muscles to relax.
HSCR is an inherited disease for which mutations in over 20 genes and 5
associated genomic loci have been found. Collectively they explain ~25% of the
observed heritability2–4. The majority of these genes have been identified by linkage
analysis and sequencing of candidate genes. However, over the past decade genetic
research has been greatly facilitated by technological developments in genome-
wide association studies (GWAS) and next-generation sequencing (NGS). Three
GWAS on predominantly sporadic HSCR patients have been conducted5–7. In
addition, rare variants in the associated loci and in yet undiscovered HSCR genes
will likely contribute to the genetic risk of this complex genetic disease. These
hypotheses can be tested by exome- or genome sequencing of (large) cohorts of
HSCR patients.
NEXT-GENERATION SEQUENCING IN COMPLEX GENETIC DISEASES
Study designs to identify rare, pathogenic variants
Common variants have successfully been associated with disease in GWAS, but
have a low penetrance. Rare disease-associated variants on the other hand usually
have a higher penetrance and are therefore more informative for understanding
disease etiology and in genetic counseling8. Rare variants can be identified on a
genome/exome-wide scale by NGS, but large-scale rare variant association studies
are far less abundant than GWAS. The main reason for this is the high cost per
sample in NGS compared to GWAS. Genetic power calculations show that rare
variant association studies require similar numbers of cases and controls as
GWAS, meaning that NGS studies have less statistical power than equally expensive
GWAS9. However, the cost of NGS is declining rapidly, thereby opening new
opportunities for rare variant association studies. Even at falling costs, though,
sequencing thousands of cases and controls remains expensive, and
sufficient patient numbers may not be available for rare diseases. For these reasons
the question arises whether it is worthwhile to sequence small numbers of
patients (and controls) in complex genetic diseases.
Case-control analysis on a limited number of sporadic patients will not
have sufficient statistical power to find genome-wide significance, but other study
designs may be successful with small numbers of individuals. Multiplex families
remain a valuable source for genetic testing. For example, exome sequencing of 29
patients and 11 controls from 14 late-onset Alzheimer disease pedigrees
successfully identified rare variants in PLD3 in two families10.
If sporadic patients are being studied, several approaches have been
suggested to increase the power of a rare variant association study. In chapter 5
we applied these strategies to analyze exome sequencing data from 48 sporadic
HSCR cases and 212 controls. We maximized the power of the study by prioritizing
long-segment HSCR cases (which have the highest heritability), selecting rare,
pathogenic variants, collapsing all variants per gene and performing a meta-
analysis on data from three sequencing centers. Since we found no genome-wide
significant associations, we analyzed whether the most strongly associated genes (by
nominal p-value) are good candidate genes for HSCR. Gene prioritization tools consistently
predicted CELSR1, CLOCK and FASN as the most likely disease-causal genes. In
addition, GHDC is a poorly characterized gene for which no connection to the ENS
was found, but the gene was highly expressed by ENCCs and contained disrupting
variants.
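The collapsing strategy described above can be illustrated with a minimal sketch. It assumes a hypothetical input of (sample, gene) pairs for qualifying rare variants and applies a one-sided Fisher's exact test to carrier counts per gene. The function names, the gene label GENE_A, the sample identifiers and the carrier counts are invented for illustration, and the choice of test is an assumption rather than the procedure used in chapter 5; only the cohort sizes (48 cases, 212 controls) are taken from the text.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null for the 2x2 table
    [[case carriers, case non-carriers],
     [control carriers, control non-carriers]] = [[a, b], [c, d]]."""
    n_cases = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    upper = min(n_cases, n_carriers)
    numerator = sum(
        comb(n_carriers, k) * comb(n_total - n_carriers, n_cases - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(n_total, n_cases)


def collapse_by_gene(variant_calls):
    """Collapse qualifying variants into the set of carrier samples per gene.
    A sample with several variants in one gene is counted once."""
    carriers = {}
    for sample, gene in variant_calls:
        carriers.setdefault(gene, set()).add(sample)
    return carriers


def burden_p_value(gene, carriers, cases, controls):
    """One-sided test for an excess of carriers among cases for one gene."""
    gene_carriers = carriers.get(gene, set())
    a = len(gene_carriers & cases)
    c = len(gene_carriers & controls)
    return fisher_one_sided(a, len(cases) - a, c, len(controls) - c)


# Hypothetical data: cohort sizes from the text, carriers invented.
cases = {f"case{i}" for i in range(48)}
controls = {f"ctrl{i}" for i in range(212)}
calls = [
    ("case0", "GENE_A"), ("case0", "GENE_A"),  # two variants, one carrier
    ("case1", "GENE_A"), ("case2", "GENE_A"),
    ("ctrl0", "GENE_A"),
]
p_gene_a = burden_p_value("GENE_A", collapse_by_gene(calls), cases, controls)
print(f"GENE_A nominal one-sided p = {p_gene_a:.4f}")
```

A nominal p-value from such a test would still need to survive correction for the number of genes tested before it could be called genome-wide significant.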
Predicting the most likely disease-causal genes in an underpowered
genetic study on the basis of gene function or expression does not, by itself,
provide evidence for involvement in the disease. Even if there are no real
associations with HSCR, some genes will have more significant p-values than
others, and these genes could appear relevant to the disease just by chance.
The gene prioritization and gene expression data therefore do not identify new
disease-associated genes, but should be regarded as a means to identify the best
candidate genes for follow-up studies on a limited number of genes. Moreover,
small rare variant association studies can be used to assess the frequency of rare
variants in the candidate genes in cases and controls, which can be used to
calculate odds ratios. These are important for power calculations in follow-up
studies.
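As an illustration of how carrier frequencies from a small study translate into an odds ratio, the sketch below computes the odds ratio with an approximate 95% confidence interval (Woolf's log method). The 0.5 continuity correction for zero cells is a common convention adopted here as an assumption, and the carrier counts in the example are invented; only the cohort sizes come from the text.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_carriers, n_cases, control_carriers, n_controls, z=1.96):
    """Odds ratio for carrying a rare variant, with a Woolf (log-based)
    confidence interval. If any cell is zero, 0.5 is added to every cell
    (Haldane-Anscombe correction) so the ratio stays defined."""
    a = case_carriers
    b = n_cases - case_carriers
    c = control_carriers
    d = n_controls - control_carriers
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 4 carriers among 48 cases, 2 among 212 controls.
estimate, ci_low, ci_high = odds_ratio_ci(4, 48, 2, 212)
print(f"OR = {estimate:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

With counts this small the interval is wide, which is why such estimates serve mainly as input for power calculations rather than as evidence of association.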
Identification of de novo mutations in NGS data
As noted above, small studies are almost always underpowered. However, NGS can be
successfully applied to relatively small numbers of sporadic disease patients for
the identification of de novo mutations (chapter 2). Exome sequencing studies
have revealed a role for de novo mutations in the genetic etiology of rare congenital
syndromes and more common neurodevelopmental disorders11. However, the
identification of de novo mutations in exome sequencing data is not trivial, since
the number of true de novo mutations is much lower than the number of false
positives that arise from sequencing errors and incorrectly called genotypes. Joint
genotype calling of offspring-parent trios12 and statistical modelling efforts13–17
have improved de novo mutation detection significantly. In chapter 4 we analyzed
if and how de novo mutations can be detected by applying filtering thresholds for
reference allele ratio, sequencing depth and Genotype Quality score. We found that
by removing the maximum assigned value of the Genotype Quality score, de novo
mutations could be detected with high sensitivity and specificity in our training
data and in a second, independent dataset.
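A threshold-based filter of the kind described above can be sketched as follows. The sketch flags a candidate de novo mutation when the child is called heterozygous, both parents are called homozygous reference, and all three calls pass thresholds on sequencing depth, Genotype Quality and reference allele ratio. The `Call` record, the function name and every threshold value are hypothetical placeholders, not the values derived in chapter 4, and the Genotype Quality score is simply whatever value is supplied in the input.

```python
from dataclasses import dataclass


@dataclass
class Call:
    genotype: str    # "0/0", "0/1" or "1/1"
    ref_reads: int   # reads supporting the reference allele
    alt_reads: int   # reads supporting the alternative allele
    gq: float        # Genotype Quality score as supplied by the caller

    @property
    def depth(self):
        return self.ref_reads + self.alt_reads

    @property
    def ref_ratio(self):
        return self.ref_reads / self.depth if self.depth else float("nan")


def is_candidate_de_novo(child, mother, father,
                         min_depth=10, min_gq=30.0,
                         child_ref_ratio=(0.3, 0.7),
                         parent_min_ref_ratio=0.95):
    """Return True if a trio site passes all (placeholder) filters."""
    if child.genotype != "0/1":
        return False
    if mother.genotype != "0/0" or father.genotype != "0/0":
        return False
    trio = (child, mother, father)
    if any(call.depth < min_depth for call in trio):
        return False
    if any(call.gq < min_gq for call in trio):
        return False
    low, high = child_ref_ratio
    if not low <= child.ref_ratio <= high:
        return False
    # Parents should show (almost) no reads carrying the alternative allele.
    return all(p.ref_ratio >= parent_min_ref_ratio for p in (mother, father))


clean_parent = Call("0/0", 40, 0, 99.0)
print(is_candidate_de_novo(Call("0/1", 20, 18, 99.0), clean_parent, clean_parent))
```

In practice the thresholds would be tuned on validated de novo mutations, as was done with the training data mentioned above.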
We restricted our analysis to three sequencing parameters (reference
allele ratio, sequencing depth and Genotype Quality score) that were reported to
efficiently filter errors from NGS data18,19. However, other parameters such as
calling quality, mapping quality, quality/depth ratio and [depth in
proband]/[depth in parent] ratio have also been proposed as discriminating
parameters for de novo mutation detection17. A combined approach in which the
uncapped Genotype Quality score and other sequencing parameters are
incorporated in a statistical model may be most powerful in detecting de novo
mutations.
GENETICS OF HSCR
Identification of four new HSCR genes in a de novo mutation screen
In our exome sequencing on sporadic, non-syndromic HSCR patients, we
predominantly included patients with long-segment HSCR, since this subtype has
the highest heritability and follows a dominant mode of inheritance with
incomplete penetrance20. Rare variants with a large effect size are therefore
expected to make a large contribution to long-segment HSCR. In chapter 2 we
tested the hypothesis that rare variants in sporadic, long-segment HSCR occur de
novo. We identified 28 de novo mutations in 24 HSCR patients. Eight of these de
novo mutations were found in RET, the major HSCR gene, supporting a role for de
novo mutations in the etiology of long-segment HSCR. Moreover, this finding, and
the overrepresentation of loss of function de novo mutations, suggests that de novo
mutations in other genes may contribute to HSCR as well.
To overcome the limitations of sequencing small patient cohorts described
above, we carried out unbiased in silico and in vivo analyses for
the 20 genes (besides RET) that harbor de novo mutations to determine whether
they are involved in the development of the ENS. RET and CKAP2L were enriched
for rare variants in HSCR patients compared to controls, but only RET could be
replicated in an independent cohort. The 20 genes with de novo mutations could
not be linked to the signaling pathways in HSCR, but all were expressed by either
ENCCs from E14.5 mouse embryos or by human iPS cell-derived neural crest cells.
Since our in silico analyses did not yield convincing evidence for a role of the de
novo mutated genes in HSCR, we decided to functionally test our candidate genes
using the transgenic Tg(-8.3phox2b:Kaede) zebrafish model21. Morpholino-
mediated knockdown of the 12 genes that had non-synonymous de novo mutations
and an orthologue in the zebrafish resulted in HSCR-like aganglionosis for four
genes (DENND3, NCLN, NUP98 and TBATA).
To find a disease-mimicking phenotype for four out of 12 candidate genes
is a great success. However, questions have been raised about the specificity of
morpholinos to study gene function, as up to 70% of morpholino-induced
phenotypes cannot be reproduced by genome-editing techniques22. Does this mean
that the HSCR-like phenotypes we observed are false positives? We think not.
A second, non-overlapping morpholino was used to confirm the results for each
gene that showed a phenotype in the initial screen, and appropriate 5-nucleotide
mismatch controls were used. The morpholino experiments were repeated
independently in a second laboratory and phenotypes were reproducible.
Moreover, morpholinos were co-injected with a morpholino against p53 to test
whether the phenotype was caused by p53-induced apoptosis. Since the results
from all control experiments are in line with a specific morpholino-induced
phenotype, we confidently conclude that the four identified HSCR genes are
indispensable for ENS development. Nevertheless, confirmation of the phenotype
by a genetically engineered mutant would further strengthen this conclusion.
Mutations in the four novel HSCR genes were infrequent in our replication
cohort and not significantly associated with HSCR. A larger replication cohort is
required to assess to what extent mutations in these genes contribute to HSCR.
This question is currently being addressed by the International Hirschsprung
Disease Consortium. DNA from approximately 300 HSCR patients and controls will
be analyzed for mutations in the four genes and other candidate genes by targeted
NGS. This study will be informative for the frequency of mutations in the four novel
genes. It is also unclear how mutations in the four genes contribute to
aganglionosis in the HSCR patients that carry the mutations. Functional studies on
how these genes are involved in the migration, proliferation, survival and
differentiation of ENCCs are therefore warranted. Such studies could be performed
in vivo in animal models, for example in the zebrafish model that was used to
screen the candidate genes, or in transgenic mouse models. Alternatively, in vitro
model systems, such as primary ENCCs, iPS cell-derived neural crest cells, neural
crest-like cell lines or cultured gut explants can be applied to delineate the role of
DENND3, NCLN, NUP98 and TBATA in ENS development.
How to identify additional HSCR-associated variants?
Functionally testing candidate genes identified in a genetic study, as we applied in
chapter 2, resulted in the identification of four new disease genes that would not
have been identified by statistical evidence only and which were not connected to
the known signaling pathways in HSCR. This emphasizes the importance of
unbiased functional testing. Although there should be no doubt that the plethora of
bioinformatic predictions that is available has contributed massively to our
knowledge of biology, one should realize that our understanding of cellular
processes is limited. Using functional analysis, although costly and time
consuming, we were able to unveil new and unexpected genes for HSCR, that
would not have been found when solely relying on bioinformatic predictions. One
might conclude that we should perform more functional assays to include or
exclude identified variants. Clearly this is not realistic for all identified variants.
Burden tests might help but, as mentioned, in small cohorts these tests are not
informative. An alternative might be the screening of a selection of genes based on
gene function and expression.
Gene expression analysis in the developing ENS
Another approach to deal with low statistical power in genetic studies is to select a
number of candidate genes, rather than performing a genome-wide study. The low
statistical power of genome-wide studies stems from the large multiple testing
correction, which is considerably less stringent in studies on limited numbers
of genes. Candidate genes for HSCR can be prioritized based on genetic pilot
studies (chapter 5), or on their expression in the developing ENS. Previous gene
expression studies on the ENS found that known HSCR genes are expressed by
ENCCs and uncovered new genes and pathways in ENS development23–26. Using a
transgenic mouse model to specifically isolate ENCCs and a novel analytical
approach to identify regulatory genes, we analyzed the gene expression profile of
ENCCs in more detail in chapter 3. Gene Ontology analysis showed that ENCCs
expressed markers of early and late neuronal differentiation and activation of the
RET receptor by its ligand GDNF induced neuronal maturation of ENCCs.
Moreover, we applied an Upstream Regulator analysis to identify genes that are
predicted to regulate the expression of differentially expressed genes. The known
HSCR genes Sox10 and Ret were predicted regulators of gene expression in ENCCs,
suggesting the Upstream Regulator analysis is a powerful approach to identify
relevant signaling molecules. Other predicted regulators, Bdnf, App, Mapk10 and
Timp1, are therefore likely to be important for ENS development as well and may
play a role in HSCR. The predicted Upstream Regulators could be linked together in
a molecular network underlying ENS development. Likewise, Adora2a and Npy2r
(mapping to the HSCR susceptibility locus 4q31.3-q32.327) were near-significant
regulators of terminal neuronal differentiation in GDNF-treated ENCCs.
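The effect of restricting a study to candidate genes on the multiple testing burden, mentioned at the start of this section, can be made concrete with a small calculation. The sketch assumes a Bonferroni correction at a family-wise error rate of 0.05; the figure of roughly 20,000 genes for an exome-wide gene-based test and the candidate set of 50 genes are illustrative assumptions, not numbers from this thesis.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise
    error rate at alpha under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumed test counts, for illustration only.
exome_wide = bonferroni_threshold(20_000)   # gene-based, exome-wide
candidate_set = bonferroni_threshold(50)    # hypothetical candidate panel

print(f"exome-wide threshold:    {exome_wide:.2e}")
print(f"candidate-set threshold: {candidate_set:.2e}")
print(f"relaxation factor:       {candidate_set / exome_wide:.0f}x")
```

Under these assumptions a gene in the candidate panel needs a far less extreme p-value to be declared significant, which is the gain in power referred to above.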
Our study resulted in a list of good candidate genes; however, this list is far
from complete. In our gene expression study ENCCs were isolated from mouse
embryos at developmental stage E14.5 and were cultured in vitro. The single time
point of analysis and in vitro culturing are limitations of this study. ENS
development in mice takes place between E9 and E1528. ENCCs invade the gut,
proliferate and migrate along the intestinal wall, and differentiate into neurons and
glia. ENCCs make up a heterogeneous cell population at any moment during ENS
development, where cells in the proximal regions start differentiating when
migratory ENCCs have not yet reached the hindgut29,30. At E14.5, the time point of
our expression analysis, ENCCs have reached the hindgut and we found that
markers of early and late neuronal differentiation were expressed. At earlier stages
of ENS development there will be more migratory and fewer differentiating ENCCs
and the corresponding expression profile would likely be different from that at
E14.5. A temporal expression analysis between E9 and E15 would give more
insight into the changes that ENCCs undergo as they colonize the gut. The
GDNF-responsive genes will also likely be different at earlier time points, as GDNF is
important for the proliferation, migration, survival and neuronal differentiation of
ENCCs, depending on the developmental stage and position in the gut31–34.
Moreover, as HSCR originates from incomplete colonization of the gut by ENCCs, it
would be interesting to compare gene expression profiles of ENCCs at the
migratory wavefront and non-migrating cells behind the wavefront. Such an
experiment would provide information about the genes involved in the migration
of ENCCs and the gene expression changes that are associated with a halt in
migration.
Moreover, not all genes that are important for ENS development are highly
expressed by ENCCs. For example, the known HSCR genes GDNF, NRTN, EDN3,
ECE1 and SEMA3A are ligands that are secreted by the gut mesenchyme. Other
ligands, such as BMPs, NT-3 and SHH contribute to ENS development as well, but
are not highly expressed by ENCCs35. So genes important for ENS development
may have been missed by exclusively looking at expression in ENCCs. Furthermore,
these data suggest that when new receptors are identified, it is worthwhile to also
study the function of their ligand(s).
Ideally, isolated ENCCs should not be cultured prior to RNA isolation, as
cell culture conditions may affect gene expression. Nevertheless, in our gene
expression study we cultured isolated ENCCs to be able to study the effect of GDNF
on gene expression. In addition, we needed to expand our ENCC cultures in vitro to
obtain sufficient amounts of RNA for microarray analysis. Microarrays are currently
being replaced by RNA sequencing for quantifying gene expression, and RNA
sequencing requires far less input material. In vitro culturing of isolated ENCCs
can therefore be circumvented in future gene expression experiments. In
comparison to microarrays, RNA sequencing is also more sensitive in detecting
low-abundance transcripts, can discriminate between different transcripts of a
gene, allows for the identification of genetic variants, and can be used to compare
the relative expression of different genes within a sample36.
Non-coding variants and disease
In addition to selection of candidate genes, gene expression data can be used in
disease gene discovery by expression quantitative trait locus (eQTL) mapping. It
has been suggested that eQTL mapping has greater power to find risk SNPs than
investigating the association of SNPs with the presence or absence of disease37. In
HSCR it is assumed that reduced expression levels of RET and other HSCR genes
impair colonization of the bowel and common variants in intron 1 of RET have
been shown to affect RET expression38,39. A genome-wide survey of common or
rare variants affecting the expression of known ENS genes is therefore likely to
identify novel genes underlying ENS development and HSCR.
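The eQTL idea described above can be illustrated with a minimal sketch: for a single SNP, expression is regressed on allele dosage (0, 1 or 2 copies of the minor allele), and the slope estimates the effect per allele. The function name `eqtl_slope` and the data are hypothetical and purely synthetic; a real analysis would also include covariates and a correction for multiple testing.

```python
import random

def eqtl_slope(dosage, expression):
    """Ordinary least-squares fit of expression on allele dosage.

    Returns (slope, intercept); the slope is the estimated change in
    expression per additional copy of the minor allele.
    """
    n = len(dosage)
    mean_x = sum(dosage) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, expression))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic data (an assumption for illustration, not results from this thesis):
# each minor allele lowers expression by 0.5 units, plus Gaussian noise.
random.seed(0)
dosage = [random.choice([0, 1, 2]) for _ in range(200)]
expression = [10.0 - 0.5 * d + random.gauss(0.0, 0.1) for d in dosage]

slope, intercept = eqtl_slope(dosage, expression)
print(f"estimated effect per allele: {slope:.2f}")
```

A negative slope in such a fit would correspond to an allele that reduces expression, which is the direction of effect assumed for the RET intron 1 variants.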
Low penetrant mutations
There is a relatively large contribution of common variants to short-segment
HSCR. The two published GWAS by the International Hirschsprung Disease
Consortium analyzed 371 patients and 856 controls and 629 HSCR trios,
respectively5,7. A third GWAS on HSCR included 123 patients and 432 controls6.
Meta-analysis of these three GWAS showed that 34±11% of the HSCR risk is
attributable to common variation, but only 6% can be explained by the three
associated loci (RET, NRG1 and SEMA3) (Tang et al., unpublished data). This
suggests that there are undiscovered HSCR loci that may not have reached
genome-wide significance. Increasing patient cohorts in GWAS generally leads to
more associated loci40–42. Increasing cohort sizes in GWAS on HSCR to several
thousand cases and controls will therefore uncover additional associations and
lead to the identification of new HSCR genes. However, such an undertaking is
expensive and the collection of large numbers of HSCR patients may be
troublesome.
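The size of the gap between the common-variant risk and the part explained by known loci follows from simple arithmetic on the figures quoted above. The sketch below treats the reported ±11% as a plain lower and upper bound, which is an assumption made only for illustration; the function and variable names are hypothetical.

```python
def unexplained_common_risk(common_total, explained_by_known_loci):
    """Share of disease risk due to common variants not yet assigned to a locus."""
    return common_total - explained_by_known_loci

common_total = 0.34   # risk attributable to common variation (meta-analysis estimate)
uncertainty = 0.11    # the reported +/- 11%, used here as a simple bound (assumption)
known_loci = 0.06     # explained by RET, NRG1 and SEMA3

central = unexplained_common_risk(common_total, known_loci)
low = unexplained_common_risk(common_total - uncertainty, known_loci)
high = unexplained_common_risk(common_total + uncertainty, known_loci)
share_explained = known_loci / common_total

print(f"unexplained common-variant risk: {central:.0%} (range {low:.0%} to {high:.0%})")
print(f"known loci account for {share_explained:.0%} of the common-variant risk")
```

On the central estimate, roughly 28 percentage points of risk attributable to common variation remain unassigned, which is what motivates larger GWAS cohorts.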
It must be noted that although the GWAS on HSCR uncovered new
signaling pathways in ENS development5–7, common variants explain only part of
the heritability in HSCR. The missing heritability may come from rare variants in
both previously associated and as yet undiscovered HSCR genes.
A molecular link between the CNS and the ENS
The four newly identified HSCR genes were expressed not only in the gut during
zebrafish development, but also in the anterior central nervous system (CNS). This
suggests a role for these genes in the development of both the ENS and the CNS.
Although the developmental origins of the CNS (neural tube) and ENS (neural
crest) are different, there are many similarities between the two nervous systems.
Like the CNS, the ENS is composed of efferent, afferent and interneurons that use
largely the same repertoire of neurotransmitters43,44. Moreover, the genes involved
in late ENS development show a large overlap with those expressed in the
developing CNS (chapter 3). Various CNS pathologies, such as autism, stroke,
Parkinson's disease, multiple sclerosis and spinal cord injury can be associated
with constipation, suggesting that these conditions affect the proper function of the
ENS45,46. Vice versa, the HSCR genes KBP, SOX10, NRG1, SEMA3A, IKBKAP and ZEB2
all contribute to CNS pathologies47–53. Further evidence for the link between the
ENS and CNS comes from the HSCR-associated syndromes, most of which show
neurological involvement2. The most prevalent HSCR-associated syndrome is
Down syndrome (DS, or Trisomy 21) with an incidence of 7.3% among HSCR
patients54. Conversely, DS patients have a 130-fold elevated risk of HSCR compared
to the general population54. This suggests that the triplication of one or more
genes on chromosome 21 poses a risk for developing HSCR. In chapter 6 we
explored the genetic link between DS and HSCR by overexpressing selected human
chromosome 21 genes in the Tg(-8.3bphox2b:Kaede) reporter zebrafish model.
We found that injection of ATP5O mRNA resulted in hypoganglionosis,
predominantly in the distal intestine. atp5o was expressed in the zebrafish gut and
cerebellum between day 1 and 5 post-fertilization. In post-natal human colon
ATP5O was expressed in the myenteric and submucosal plexuses, as well as in the
epithelium. These data suggest that ATP5O is relevant for ENS development and
that triplication of ATP5O likely contributes to HSCR in DS patients.
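To put the 130-fold elevated risk in absolute terms, it can be combined with the population incidence of one in 5,000 live births quoted in the summary of this thesis. The sketch below is a back-of-the-envelope estimate under the assumption that the relative risk can simply be multiplied by the baseline incidence; the function name is hypothetical.

```python
def absolute_risk(baseline_incidence, relative_risk):
    """Approximate absolute risk as baseline incidence times relative risk."""
    return baseline_incidence * relative_risk

baseline = 1 / 5000      # HSCR incidence in the general population
relative_risk_ds = 130   # fold increase reported for DS patients

ds_risk = absolute_risk(baseline, relative_risk_ds)
print(f"estimated HSCR risk in DS patients: {ds_risk:.1%}")
```

Under these assumptions the absolute risk of HSCR in DS patients is a few percent, so the large majority of DS patients do not develop HSCR, consistent with triplication of chromosome 21 genes acting as a risk factor rather than a sufficient cause.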
Implications for genetic counseling
The revolution in genetic screening, due to the introduction of NGS, has improved
genetic counseling enormously. For example, in patients with intellectual disability,
20 percent more causative mutations are being found, which more or less doubles
the number of diagnoses that can be made.
For rare diseases such as HSCR, however, this is slightly more
complicated. The study presented in chapter 2 illustrates the problems and
opportunities one faces when screening for a rare disease. On the one hand, NGS
improved genetic counseling, as some RET mutations that were missed in the
original screening were detected (not mentioned in the paper). On the other hand,
what should be done with the identified de novo mutations? Do we have sufficient
evidence that the mutations identified are really disease-causing? We did show in a
zebrafish model that knocking down the expression of the gene resulted in an
HSCR-like phenotype, but is this sufficient in combination with in silico prediction
and the fact that the mutations were de novo? In our case one could argue that the
combination of all data should suffice. But what should be done when no functional
assays are or can be performed? Is in silico prediction in combination with a de novo
appearance enough? Criteria should be established for how to deal with these de
novo findings.
For de novo mutations these criteria might be relatively easy to define, but what
should be done with a gene in which an inherited mutation (even a truncating one)
is found only once, in particular when one of the healthy parents also carries the
mutation? Burden tests will likely not give statistical evidence for involvement.
Should these all be functionally tested? This is not realistic. One could decide,
based on function and mutation type, to use the mutation in counseling, but what is
known about the penetrance of the mutation? Again, criteria need to be set on
what to do with this type of mutation in rare diseases.
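Why a burden test gives little statistical support for a mutation seen only once can be shown with a one-sided Fisher's exact test on carrier counts. This is a minimal sketch: the cohort sizes are hypothetical, the function name `fisher_one_sided` is introduced only for illustration, and Fisher's exact test stands in for whichever burden test would be used in practice.

```python
from math import comb

def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher's exact test for enrichment of carriers among cases.

    Returns the probability, under the null hypothesis of no association,
    of observing at least `case_carriers` carriers among the cases given
    the total number of carriers (hypergeometric upper tail).
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    denominator = comb(total, carriers)
    p_value = 0.0
    for k in range(case_carriers, min(carriers, n_cases) + 1):
        p_value += comb(n_cases, k) * comb(n_controls, carriers - k) / denominator
    return p_value

# Hypothetical cohort: 100 patients and 100 controls.
single_carrier = fisher_one_sided(1, 100, 0, 100)   # mutation found once
five_carriers = fisher_one_sided(5, 100, 0, 100)    # mutation found five times
print(f"one carrier:   p = {single_carrier:.3f}")
print(f"five carriers: p = {five_carriers:.3f}")
```

With a single carrier in equally sized groups the p-value is 0.5, so no cohort of realistic size can provide evidence from one observation; several independent carriers are needed before the test becomes informative.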
Clearly, we need rules on how to work with these mutations, as findings of this
type will become more and more common.
CONCLUSIONS
The work in this thesis describes for the first time the application of NGS to study
the contribution of rare, coding variants to sporadic HSCR. We show that a rare
variant association study has limited power to find genome-wide associations, but
does provide information on mutation frequencies per gene and odds ratios. NGS is
more powerful when applied to identify de novo mutations. Although the
identification of de novo mutations in NGS data is technically challenging,
combining de novo mutation screening and functional genomics shows that the
identified de novo mutations make a large contribution to sporadic, long-segment
HSCR. This finding is of importance for genetic counseling in HSCR and underlines
the value of unbiased functional analysis of candidate genes identified in a genetic
study.
REFERENCES
1. Heanue, T. & Pachnis, V. Enteric nervous system development and Hirschsprung’s disease: advances in genetic and stem cell studies. Nat. Rev. Neurosci. 8, 466–79 (2007).
2. Amiel, J. et al. Hirschsprung disease, associated syndromes and genetics: a review. J. Med. Genet. 45, 1–14 (2008).
3. Brooks, A. S., Oostra, B. A. & Hofstra, R. M. W. Studying the genetics of Hirschsprung’s disease: Unraveling an oligogenic disorder. Clin. Genet. 67, 6–14 (2005).
4. Alves, M. M. et al. Contribution of rare and common variants determine complex diseases-Hirschsprung disease as a model. Dev. Biol. 382, 320–9 (2013).
5. Garcia-Barcelo, M.-M. et al. Genome-wide association study identifies NRG1 as a susceptibility locus for Hirschsprung’s disease. Proc. Natl. Acad. Sci. U. S. A. 106, 2694–9 (2009).
6. Kim, J.-H. et al. A genome-wide association study identifies potential susceptibility loci for Hirschsprung disease. PLoS One 9, e110292 (2014).
7. Jiang, Q. et al. Functional loss of semaphorin 3C and/or semaphorin 3D and their epistatic interaction with RET are critical to Hirschsprung disease liability. Am. J. Hum. Genet. 96, 581–96 (2015).
8. Bodmer, W. & Bonilla, C. Common and rare variants in multifactorial susceptibility to common diseases. Nat. Genet. 40, 695–701 (2008).
9. Zuk, O. et al. Searching for missing heritability: designing rare variant association studies. Proc. Natl. Acad. Sci. U. S. A. 111, E455–64 (2014).
10. Cruchaga, C. et al. Rare coding variants in the phospholipase D3 gene confer risk for Alzheimer’s disease. Nature 505, 550–554 (2014).
11. Veltman, J. A. & Brunner, H. G. De novo mutations in human genetic disease. Nat. Rev. Genet. 13, 565–575 (2012).
12. Cartwright, R. A., Hussin, J., Keebler, J. E. M., Stone, E. A. & Awadalla, P. A family-based probabilistic method for capturing de novo mutations from high-throughput short-read sequencing data. Stat. Appl. Genet. Mol. Biol. 11, (2012).
13. Li, B. et al. A Likelihood-Based Framework for Variant Calling and De Novo Mutation Detection in Families. PLoS Genet. 8, e1002944 (2012).
14. Ramu, A. et al. DeNovoGear: de novo indel and point mutation discovery and phasing. Nat. Methods 10, 985–7 (2013).
15. Wei, Q. et al. A Bayesian framework for de novo mutation calling in parents-offspring trios. Bioinformatics 31, 1–7 (2014).
16. Liu, Y., Li, B., Tan, R., Zhu, X. & Wang, Y. A gradient-boosting approach for filtering de novo mutations in parent-offspring trios. Bioinformatics 30, 1830–1836 (2014).
17. Li, J. et al. mirTrios: an integrated pipeline for detection of de novo and rare inherited mutations from trios-based next-generation sequencing. J. Med. Genet. 52, 275–81 (2015).
18. Patel, Z. H. et al. The struggle to find reliable results in exome sequencing data: Filtering out Mendelian errors. Front. Genet. 5, 1–13 (2014).
19. Carson, A. R. et al. Effective filtering strategies to improve data quality from population-based whole exome sequencing studies. BMC Bioinformatics 15, 125 (2014).
20. Badner, J. A., Sieber, W. K., Garver, K. L. & Chakravarti, A. A genetic study of Hirschsprung disease. Am. J. Hum. Genet. 46, 568–80 (1990).
21. Harrison, C., Wabbersen, T. & Shepherd, I. T. In vivo visualization of the development of the enteric nervous system using a Tg(-8.3bphox2b:Kaede) transgenic zebrafish. Genesis 52, 985–90 (2014).
22. Kok, F. O. et al. Reverse Genetic Screening Reveals Poor Correlation between Morpholino-Induced and Mutant Phenotypes in Zebrafish. Dev. Cell 32, 97–108 (2014).
23. Iwashita, T., Kruger, G. M., Pardal, R., Kiel, M. J. & Morrison, S. J. Hirschsprung disease is linked to defects in neural crest stem cell function. Science 301, 972–976 (2003).
24. Heanue, T. A. & Pachnis, V. Expression profiling the developing mammalian enteric nervous system identifies marker and candidate Hirschsprung disease genes. Proc. Natl. Acad. Sci. U. S. A. 103, 6919–24 (2006).
25. Vohra, B. P. S. et al. Differential gene expression and functional analysis implicate novel mechanisms in enteric nervous system precursor migration and neuritogenesis. Dev. Biol. 298, 259–271 (2006).
26. Ngan, E. S. W. et al. Prokineticin-1 (Prok-1) works coordinately with glial cell line-derived neurotrophic factor (GDNF) to mediate proliferation and differentiation of enteric neural crest cells. Biochim. Biophys. Acta - Mol. Cell Res. 1783, 467–478 (2008).
27. Brooks, A. S. et al. A novel susceptibility locus for Hirschsprung’s disease maps to 4q31.3-q32.3. J. Med. Genet. 43, e35 (2006).
28. Druckenbrod, N. R. & Epstein, M. L. The pattern of neural crest advance in the cecum and colon. Dev. Biol. 287, 125–33 (2005).
29. Young, H. M. et al. Dynamics of neural crest-derived cell migration in the embryonic mouse gut. Dev. Biol. 270, 455–473 (2004).
30. Young, H. M., Turner, K. N. & Bergner, A. J. The location and phenotype of proliferating neural-crest-derived cells in the developing mouse gut. Cell Tissue Res. 320, 1–9 (2005).
31. Hearn, C. J., Murphy, M. & Newgreen, D. GDNF and ET-3 differentially modulate the numbers of avian enteric neural crest cells and enteric neurons in vitro. Dev. Biol. 197, 93–105 (1998).
32. Taraviras, S. et al. Signalling by the RET receptor tyrosine kinase and its role in the development of the mammalian enteric nervous system. 2797, 2785–2797 (1999).
33. Young, H. M. et al. GDNF is a chemoattractant for enteric neural cells. Dev. Biol. 229, 503–16 (2001).
34. Natarajan, D., Marcos-Gutierrez, C., Pachnis, V. & de Graaff, E. Requirement of signalling by receptor tyrosine kinase RET for the directed migration of enteric nervous system progenitor cells during mammalian embryogenesis. Development 129, 5151–60 (2002).
35. Obermayr, F., Hotta, R., Enomoto, H. & Young, H. M. Development and developmental disorders of the enteric nervous system. Nat. Rev. Gastroenterol. Hepatol. 10, 43–57 (2012).
36. Zhao, S., Fung-Leung, W.-P., Bittner, A., Ngo, K. & Liu, X. Comparison of RNA-Seq and microarray in transcriptome profiling of activated T cells. PLoS One 9, e78644 (2014).
37. Webster, J. A. et al. Genetic Control of Human Brain Transcript Expression in Alzheimer Disease. Am. J. Hum. Genet. 84, 445–458 (2009).
38. Emison, E. S. et al. A common sex-dependent mutation in a RET enhancer underlies Hirschsprung disease risk. Nature 434, 857–63 (2005).
39. Sribudiani, Y. et al. Variants in RET associated with Hirschsprung’s disease affect binding of transcription factors and gene expression. Gastroenterology 140, 572–582 (2011).
40. Visscher, P. M., Brown, M. A., McCarthy, M. I. & Yang, J. Five Years of GWAS Discovery. Am. J. Hum. Genet. 90, 7–24 (2012).
41. Pulit, S. L., Leusink, M., Menelaou, A. & de Bakker, P. I. W. Association claims in the sequencing era. Genes (Basel). 5, 196–213 (2014).
42. Gratten, J., Wray, N. R., Keller, M. C. & Visscher, P. M. Large-scale genomics unveils the genetic architecture of psychiatric disorders. Nat. Neurosci. 17, 782–90 (2014).
43. Gershon, M. D. The enteric nervous system: a second brain. Hosp. Pract. (1995) 34, 31–32, 35–38, 41–42 passim (1999).
44. Hao, M. M. & Young, H. M. Development of enteric neuron diversity. J. Cell. Mol. Med. 13, 1193–1210 (2009).
45. Bernier, R. et al. Disruptive CHD8 mutations define a subtype of autism early in development. Cell 158, 263–76 (2014).
46. Winge, K., Rasmussen, D. & Werdelin, L. Constipation in neurological diseases. J. Neurol. Neurosurg. Psychiatry 74, 13–19 (2003).
47. Brooks, A. S. et al. Homozygous Nonsense Mutations in KIAA1279 Are Associated with Malformations of the Central and Enteric Nervous Systems. Am. J. Hum. Genet. 77, 120–126 (2005).
48. Touraine, R. L. et al. Neurological phenotype in Waardenburg syndrome type 4 correlates with novel SOX10 truncating mutations and expression in developing brain. Am. J. Hum. Genet. 66, 1496–503 (2000).
49. Stefansson, H. et al. Neuregulin 1 and Susceptibility to Schizophrenia. Am. J. Hum. Genet. 71, 877–892 (2002).
50. Hanchate, N. K. et al. SEMA3A, a gene involved in axonal pathfinding, is mutated in patients with Kallmann syndrome. PLoS Genet. 8, e1002896 (2012).
51. Anderson, S. L. et al. Familial dysautonomia is caused by mutations of the IKAP gene. Am. J. Hum. Genet. 68, 753–8 (2001).
52. Slaugenhaupt, S. A. et al. Tissue-specific expression of a splicing mutation in the IKBKAP gene causes familial dysautonomia. Am. J. Hum. Genet. 68, 598–605 (2001).
53. Wakamatsu, N. et al. Mutations in SIP1, encoding Smad interacting protein-1, cause a form of Hirschsprung disease. Nat. Genet. 27, 369–70 (2001).
54. Friedmacher, F. & Puri, P. Hirschsprung’s disease associated with Down syndrome: A meta-analysis of incidence, functional outcomes and mortality. Pediatr. Surg. Int. 29, 937–946 (2013).
SUMMARY
NEDERLANDSE SAMENVATTING
LIST OF ABBREVIATIONS
DANKWOORD / ACKNOWLEDGEMENTS
CHAPTER 8
SUMMARY
The enteric nervous system (ENS) is the nervous system of the gastrointestinal
tract. The ENS is responsible for the peristaltic movements of the bowel, regulation
of blood flow and secretions in the gut. The ENS is composed of neurons and glia
that are organized in clusters named ganglia along the entire length of the gut. In
patients with Hirschsprung disease (HSCR) these ganglia are missing in the most
distal region of the large intestine. This lack of enteric ganglia is referred to as
aganglionosis. HSCR is a hereditary disease that is usually diagnosed within a few
days after birth. Due to the absence of neurons in the bowel the muscles in the gut
of HSCR patients are unable to relax. This results in tonic contraction of these
muscles, life-threatening constipation, vomiting, and dilatation of the region of the
gut where feces is accumulating. The clinical and genetic aspects of HSCR are
discussed in chapter 1 of this thesis.
One in every 5,000 live born children is affected by HSCR and the
incidence is four times higher in males than in females. Although HSCR is a genetic
disorder, in most families the disease does not follow the classical Mendelian rules
of inheritance. When HSCR occurs in a familial form it can be dominant or
recessive. This is seen in approximately 10% of the HSCR cases. However, in 90%
of cases the patient is the only family member that is affected, called a sporadic
patient. In these sporadic cases the mode of inheritance is considered complex
with many genes and environmental factors involved. There is also variation in the
length of the gut segment that is aganglionic. A short segment is affected in 80% of
the cases, whereas ganglia are absent in a longer segment of the gut in 20% of the
patients. The latter is referred to as long-segment HSCR and is the most heritable
form of the disease.
Mutations in over 15 genes have been identified in HSCR patients, of which
RET is the most important gene. Mutations in the protein coding regions of RET
have been found in ~50% of familial patients and in 15-35% of sporadic patients.
Mutations in other HSCR genes are much rarer and are mainly found in
familial or syndromic cases. It is therefore likely that mutations in other, not yet
associated genes also contribute to the development of HSCR.
Chapter 2 describes the identification of novel HSCR genes by screening
for de novo mutations in sporadic patients. De novo mutations are new genetic
changes that are found for the first time in the offspring, but are absent in both
parents. In recent years de novo mutations have been implicated in the etiology of
developmental disorders such as autism, schizophrenia and intellectual disability.
In our study we identified 28 de novo mutations in 24 patients, eight of which were
found in RET, the major HSCR gene. The remaining 20 genes that harbored de novo
mutations could not be linked to the known gene networks in ENS development.
Therefore, we decided to functionally test the role of the genes in which we
identified possible pathogenic mutations (loss of function or missense mutations)
in ENS development in an animal model. We opted for a transgenic zebrafish
model that expresses a fluorescent protein in ENS progenitor cells and in mature
enteric neurons. Knockdown of four of the 12 genes with possible pathogenic de
novo mutations (DENND3, NCLN, NUP98 and TBATA) resulted in reduced numbers
of enteric neurons in the distal gut of the zebrafish, reminiscent of the
aganglionosis observed in human HSCR patients. The four genes have known
functions in the development of the central nervous system (CNS). Our data
suggest that these genes are involved in the development of both the ENS and the
CNS.
Many known HSCR genes are expressed by the progenitor cells of the ENS.
In chapter 3 we analyzed the gene expression profile of ENS progenitors to better
understand the molecular mechanisms behind ENS development and to identify
new candidate genes for HSCR. Genes that were highly expressed by ENS
progenitors are involved in various stages of neural development and in adhesion
of migrating cells. The identification of the known HSCR genes Ret and Sox10 as
critical genes in ENS development suggests that our approach works well for
identifying HSCR candidates. Our data suggest that Bdnf, App, Mapk10 and Timp1
are perfect candidates for HSCR. In addition, we showed that
activation of RET, the major HSCR gene, causes maturation of enteric neurons.
Next-generation sequencing (NGS) is a new technology for DNA
sequencing. NGS can be applied to sequence the entire genome and generates
enormous amounts of DNA sequence data. Because of the large amount of data in
NGS experiments it is inevitable that errors occur in the obtained DNA sequence.
This poses a challenge for the detection of de novo mutations, as the number of de
novo mutations is much lower than the number of errors in NGS data. Chapter 4
presents an analysis of how de novo mutations can be distinguished from sequencing
errors based on three commonly used quality parameters. We found that the most
informative quality parameter is the Genotype Quality score. Generally a maximum
value is assigned to the Genotype Quality score, but we found that de novo
mutations can be identified more specifically by removing the maximum value of
this parameter. By selecting the optimal threshold values for the three quality
parameters, we were able to identify de novo mutations with an accuracy of 98.3%,
without missing any de novo mutations.
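The filtering principle described above can be sketched as follows. A candidate de novo call is a site where the child is heterozygous and both parents are homozygous for the reference allele, and the call is retained only if the Genotype Quality (GQ) of every trio member reaches a threshold. This is a simplified illustration and not the pipeline used in chapter 4: only GQ is modelled, the `TrioCall` class and `is_candidate_de_novo` function are hypothetical, and the threshold value is an arbitrary assumption. Genotypes are written in the usual VCF style ("0/0", "0/1").

```python
from dataclasses import dataclass

@dataclass
class TrioCall:
    """Genotypes and genotype qualities of one variant site in a trio."""
    child_gt: str
    father_gt: str
    mother_gt: str
    child_gq: float
    father_gq: float
    mother_gq: float

def is_candidate_de_novo(call, min_gq):
    """True if the site looks de novo and all three genotypes pass the GQ threshold."""
    looks_de_novo = (
        call.child_gt == "0/1"
        and call.father_gt == "0/0"
        and call.mother_gt == "0/0"
    )
    lowest_gq = min(call.child_gq, call.father_gq, call.mother_gq)
    return looks_de_novo and lowest_gq >= min_gq

# Hypothetical calls; the threshold of 60 is illustrative only.
calls = [
    TrioCall("0/1", "0/0", "0/0", 99, 99, 99),   # well supported in all three
    TrioCall("0/1", "0/0", "0/0", 99, 20, 99),   # father's genotype is uncertain
    TrioCall("0/1", "0/1", "0/0", 99, 99, 99),   # inherited from the father
]
kept = [c for c in calls if is_candidate_de_novo(c, min_gq=60)]
print(f"{len(kept)} of {len(calls)} calls retained")
```

Taking the minimum GQ across the trio reflects the fact that an apparent de novo call is wrong if any one of the three genotypes is wrong; an uncapped GQ would allow stricter thresholds than a capped score permits.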
Complex genetic diseases are caused by a combination of common and
rare genetic variants. Rare variants can be identified by NGS, but since they are
found in only few patients it is difficult to obtain statistical evidence for their
contribution to disease development. In chapter 5 we analyzed how a
combination of strategies can be applied to extract useful information from an NGS
study on HSCR patients, despite the lack of statistical evidence. We identified
several candidate genes for HSCR by assessing which genes have plausible roles in
ENS development. We present ways in which even a small-scale genetic study can
be used to test whether these genes are indeed involved in the development of the
ENS.
In 30% of all cases HSCR occurs as part of a syndrome, most commonly
Down syndrome. Seven percent of HSCR patients also have
Down syndrome. Down syndrome is caused by an extra copy of chromosome 21. It is
therefore likely that the extra copy of one or more genes on chromosome 21
contributes to the development of HSCR in Down syndrome patients. We tested
this hypothesis in chapter 6 by overexpressing several chromosome 21 genes in
the zebrafish model with the fluorescent ENS. Overexpression of one gene (ATP5O)
resulted in a reduced number of enteric neurons in the intestine of the zebrafish,
similar to what is observed in human HSCR patients. The extra copy of this gene
may contribute to the development of HSCR in patients with Down syndrome.
Chapter 7 describes the implications of the research described in this
thesis. We conclude that rare genetic variants contribute to sporadic HSCR and
emphasize that functional testing of candidate genes from genetic studies is of
great value in the identification of new disease genes. Suggestions are made for the
identification of new HSCR genes in future studies. Moreover, the implications of
this research for genetic counseling to HSCR patients and their families are
discussed.
NEDERLANDSE SAMENVATTING (DUTCH SUMMARY)
The wall of our intestines contains muscles that contract to push stool through
the bowel. These muscles are controlled by the nervous system of the gut, the
enteric nervous system. This nervous system consists of nerve cells that are
distributed in clusters along the entire length of the bowel. These clusters are
called ganglia. In patients with Hirschsprung disease (HSCR) these ganglia are
absent in the last part of the large intestine. This absence is also referred to as
aganglionosis. HSCR is a hereditary disease that is usually diagnosed in the first
days of life of a newborn child. Because of the absence of nerve cells in the large
intestine, the muscles in the bowel of HSCR patients cannot relax. This leads to a
strongly constricted segment of bowel, resulting in life-threatening constipation,
vomiting and swelling of the part of the bowel where food accumulates. The clinical
and genetic aspects of HSCR are discussed in chapter 1 of this thesis.
HSCR is diagnosed in one in every 5,000 live-born children and occurs four
times more often in boys than in girls. Although HSCR is a hereditary disease, it
does not follow the laws of classical genetics. Approximately 10% of HSCR cases
occur within families, but usually the patient is the only family member with the
disease (sporadic patient). There is also variation in the length of the bowel
segment in which the aganglionosis occurs. In 80% of cases a short part of the
bowel is affected, whereas in the remaining 20% of patients nerve cells are absent
in a large part of the large intestine. This is referred to as long-segment HSCR.
Long-segment HSCR is the most heritable form of HSCR.
To date, mutations in more than 15 genes have been found in HSCR patients,
with RET as the most important gene. Mutations that change the protein structure
of RET have been found in ~50% of familial HSCR patients and in 15-35% of
sporadic patients. Mutations in the other HSCR genes are much rarer. It is
therefore presumed that additional, as yet unknown genes contribute to the
development of the disease.
Chapter 2 describes the identification of four new genes for HSCR by
searching for de novo mutations. De novo mutations are newly arisen genetic
changes that are not found in the parents, but are found in the child. In recent
years this type of mutation has been linked to developmental disorders such as
autism, schizophrenia and intellectual disability.
In our study we found 28 de novo mutations in 24 patients, 8 of which were in
the most important HSCR gene, RET. The remaining 20 genes with de novo
mutations could not be linked to the known gene networks involved in the
development of the nervous system of the gut. We therefore decided to test the
role of these genes in the development of the enteric nervous system in an animal
model. For this we used a zebrafish in which the nerve cells in the gut are
fluorescent and therefore easily visible. Using this animal model we found that
inactivation of four of the genes with de novo mutations (DENND3, NCLN, NUP98
and TBATA) led to the absence of nerve cells in the end of the zebrafish gut,
comparable to the aganglionosis seen in HSCR patients. The four genes are known
to be involved in the development of the central nervous system in the brain. We
have now shown that they also play a role in the development and function of the
nervous system of the gut.
Many known HSCR genes are active in the progenitor cells that eventually
form the nervous system in the gut. In chapter 3 we mapped all genes that are
active in these progenitor cells, to better understand how the nervous system of
the gut develops and to find new candidate genes for HSCR. The genes that were
active are involved in the various stages of neuronal development, or are
important for making connections between migrating cells. Our analysis shows
that known HSCR genes such as Ret and Sox10 can be detected in this way. This
indicates that the method we use works well. In addition to the known genes we
also found a number of promising candidates, namely Bdnf, App, Mapk10 and
Timp1. These genes play an important role in the development of the nervous
system of the gut and are therefore potential HSCR genes. In addition, we showed
that activation of RET, the most important HSCR gene, is important for the late
development of nerve cells.
Next-generation sequencing (NGS) is a new way of determining DNA
sequences. NGS can be used to read the DNA sequence of the entire genome and
thus generates enormous amounts of DNA sequence data. Because of the large
amount of data in NGS experiments, it is inevitable that errors occur in reading the
DNA sequence. This complicates the identification of new, de novo mutations in
NGS data. The number of de novo mutations is much lower than the number of
errors in NGS data, so finding de novo mutations in NGS data is like looking for a
needle in a haystack. Chapter 4 analyzes how de novo mutations can be
distinguished from errors in NGS data on the basis of three commonly used quality
parameters. We found that the Genotype Quality score is the most informative
parameter. Normally a maximum value is assigned to the Genotype Quality score,
but we discovered that de novo mutations can be found more specifically by not
setting a maximum on the value of this parameter. By choosing optimal values for
the three tested parameters we were able to identify de novo mutations with
98.3% accuracy, without missing mutations.
Complex genetic disorders are caused by a combination of common and rare
genetic variants. Rare variants can be identified using NGS, but because rare
variants are found in only a few patients it is difficult to find statistical evidence
that the mutations found are really involved in the development of a disease. In
chapter 5 we investigated how a combination of strategies can be applied to
extract meaningful information from NGS data of HSCR patients, despite the lack of
strong statistical evidence. By looking at which genes have a plausible role in the
development of the nervous system of the gut, we identified a number of candidate
genes for HSCR. A small-scale genetic study can confirm a possible role of these
genes in HSCR.
In 30% of all cases HSCR occurs as part of a syndrome, of which Down
syndrome is the most common. Seven percent of HSCR patients also have Down
syndrome. Down syndrome is caused by an extra copy of chromosome 21. It is
therefore plausible that the extra copy of one or more of the genes on chromosome
21 contributes to the development of HSCR in patients with Down syndrome. We
tested this hypothesis in chapter 6 by overexpressing a number of genes located
on chromosome 21 in the aforementioned zebrafish model with fluorescent nerve
cells in the gut. Overexpression of one gene (ATP5O) led to a reduction in the
number of nerve cells in the zebrafish gut, as is seen in HSCR patients. In patients
with Down syndrome the extra copy of this gene could contribute to the
development of HSCR.
Chapter 7 describes the implications of the research in this thesis. We
conclude that rare genetic variants play a role in sporadic HSCR and emphasize
that functional testing of candidate genes in an animal model is of great value in
finding new disease genes. Suggestions are made for the identification of new
HSCR genes in future studies. The implications of this research for genetic
counseling of families of patients with Hirschsprung disease are also discussed.
LIST OF ABBREVIATIONS
ATP5O ATP Synthase, H+ Transporting, Mitochondrial F1 Complex, O Subunit
AUC area under the curve
CCHS congenital central hypoventilation syndrome
CNS central nervous system
CNV copy number variation
DENND3 DENN/MADD Domain Containing 3
DNM de novo mutation
dpf days post-fertilization
DS Down syndrome
(E)NCC (enteric) neural crest cell
E14.5 embryonic day 14.5
EDNRB endothelin receptor type B
ENS enteric nervous system
GATK genome analysis toolkit
GDNF glial cell line-derived neurotrophic factor
GQ genotype quality
GWAS genome-wide association study
(h)iPS cell (human) induced pluripotent stem cell
hpf hours post-fertilization
Hsa21 human chromosome 21
HSCR Hirschsprung disease
IGV integrative genomics viewer
indel insertion and/or deletion
IPA Ingenuity pathway analysis
LOF loss of function
MO morpholino
NCLN nicalin
NGS next-generation sequencing
NLB neurosphere-like body
NPC neural progenitor cell
NRG neuregulin
NUP98 nucleoporin 98kDa
OMIM online Mendelian inheritance in man
RET rearranged during transfection
ROC receiver operating characteristic
SBMO splice-blocking morpholino
SEMA semaphorin
SNV single nucleotide variation
(ss)PL second smallest PHRED-scaled genotype likelihood
TBATA thymus, brain and testes associated
TBMO translation-blocking morpholino
VCF variant call format
WS Waardenburg Shah syndrome
YFP yellow fluorescent protein
DANKWOORD / ACKNOWLEDGEMENTS
With the completion of this PhD thesis comes the time to say goodbye and thank
those who were involved in this period of my life.
To start at the beginning, I would like to thank prof. dr. Robert M.W. Hofstra
and dr. Bart J.L. Eggen for the opportunity to do my PhD in their research groups.
Dear Robert and Bart, I am very glad to have had you as my (co-)promotor. I hold
you both in high regard, professionally as well as personally. Robert, I was sorry
to hear that you and the group were moving to Rotterdam, although I understand your
choice and am happy for you. Despite the physical distance I could always turn to
you with questions, which were usually answered within a few hours. Dear Bart,
after Robert’s departure I ended up in your department. Although my work was at
times somewhat far removed from your own field, you were always a fantastic
supervisor to me. You have also meant a lot to me personally, and I have laughed a
great deal with you. Robert and Bart, you have made a lasting impression on me; I
will always think of you whenever I see a floral shirt.
I would like to thank the members of the assessment committee, prof. dr.
Jon D. Laman, prof. dr. Paul K.H. Tam and prof. dr. ir. Joris A. Veltman, for
taking the time to read my PhD thesis. Dear Jon, although you could not be present
at my defense, I appreciate the interest you have always shown in my work.
For standing by me during my defense and organizing the party afterwards, I
would like to thank my paranymphs, Koen van Zomeren and Hiske van Duinen.
Even though I was the only person working on the genetics of
Hirschsprung disease in a lab of CNS neuroscientists, I have always had a lot of
colleagues to collaborate with. Thanks to all the members of The International
Hirschsprung Disease Consortium, the groups of prof. dr. Salud Borrego, prof.
dr. Aravinda Chakravarti, prof. dr. Isabella Ceccherini, prof. dr. Stanislas
Lyonnet and dr. Jeanne Amiel and prof. dr. Paul Tam and dr. Mercè Garcia-
Barceló. It was a privilege to work on behalf of the consortium and a pleasure to
meet all of you at our consortium meetings. In particular I want to thank Mercè
and Hongsheng. My trip to Hong Kong was one I will never forget, and I appreciate
how you helped make my stay comfortable. Hongsheng, I lost count of how
many emails we have sent back and forth. Thanks for working together on the de
novo paper. Also thanks to our collaborators dr. Alan Burns, dr. Nikhil Thapar,
dr. Dipa Natarajan and dr. Iain Shepherd. Over the last few years the G.I. Genetics
group in Rotterdam kept expanding. Yunia, Rajendra, Erwin, Katherine,
William, Maria, Danny, Rhiana, Alice and Bianca, even though I only showed up
every once in a while, I appreciate how you always made me feel like I was part of
the group. Yunia, you were very important in my first months in the lab, still in the
Genetics department in Groningen. Thanks a lot, I hope life is treating you as kindly
as you treat everybody around you. Rajendra, Erwin, Katherine and William,
thanks for our collaborations on the work in my thesis. It was a pleasure to work
with each of you.
I spent the first year of my PhD in the UMCG department of Genetics. The
department is simply too big to thank each of you personally, and probably many
people have left and joined the lab after I left. I do want to thank the members of
our group, Jan, Joke, Christine, Peter, Nicole, and Marjon, and my former office
mates, Suzanne, Olga, Agata and Céline. It was often a lot of fun to have you as
office mates, and at times it was quite interesting to share an office with four
women.
After the first year of my PhD I moved to the Department of Neuroscience,
section Medical Physiology, of the UMCG. Although I did not really belong to any
one of the research groups, I always felt at home in the department. This may of
course have had something to do with my strict break regime. By having coffee at
10:00, lunch at 11:50 (sharp) and tea at 15:00, you run into most of your
colleagues over the course of a day.
Dear Erik, we never had much to do with each other. Thank you for welcoming
me in the Department of Neuroscience when Robert left for Rotterdam. Dear Jon, as
the new head of the department you fill this position in a way that is entirely
your own. It is good to see how involved you are, both in everyone’s research and
in the personal development of young researchers. Dear Sjef, thank you for letting
me join the stem cell group for a while when I was working with iPS cells. Enjoy
your well-deserved retirement. Your speeches were always well worth listening to;
should you ever decide to do more with your talent for writing, I will certainly
read it. Dear Rob, I always appreciated your outspoken opinions and your
independent streak. It surprised me that you still recognized me as a former
student when I arrived, which shows how committed you are to your profession. Dear
Inge (Zijdewind), like Rob you can be headstrong and outspoken, and yet the two of
you are also very different. It took me some time to figure you out, but after
that I saw above all a very warm and caring person. Dear Wieb, I have great respect
for your outlook on life and the way you deal with your illness. Despite your
physical discomforts you never complain and you are always positive; most of us
could learn something from that. Dear Armağan, as far as I’m concerned you
are a great addition to the department, both scientifically and personally. Thanks
for your advice about career opportunities outside academia, it definitely helped
me to decide what I want to do next.
‘Zonder Trix gebeurt er niks’ (‘nothing happens without Trix’), as the
saying on the decorative tile goes. With your active, enterprising attitude you
always get a lot done. I will certainly miss the fresh cup of coffee and the chat
when I arrive at work in the morning. Dear Nieske, you have jokingly been called
the ‘lab mother’, a title that, as far as I am concerned, you deserve. You are
always there for everyone and very involved. Great! Dear Ietje, it is always good
to have someone with your enthusiasm around. What delicious steaks we made in the
cooking studio during the lab outing. Dear Michel, the man of surfing,
snowboarding, running, mountain biking, photography and undoubtedly a few more
hobbies. It was fun to share in your enthusiasm for your interests. Dear Evelyn,
thank you for showing me the ropes in the molecular lab. Hopefully you are done
moving for the time being and have found a nice place for Hugo and Lieke to grow
up. Dear Tjalling, where others sometimes grow tired of your pessimistic
nitpicking, I always found it hugely entertaining. I greatly appreciated our
lengthy discussions about trivialities and minute details. Dear Annelies, you were
not around often, but with your outspoken opinions and elaborate stories you did
leave a (positive) impression. Dear Loes, after my internship at Developmental
Genetics in Haren we ran into each other again at the UMCG. Thank you for showing
me around the lab, even though I needed a kick up the backside every now and then.
Dear Henk, Harry and Klaas, in daily practice it was not always visible to me, but
you undoubtedly did a lot of work behind the scenes from which the department, and
therefore I too, benefited. Thank you for that.
Dear Susanne, sometimes you can go years without speaking to someone and
then get along as if time has stood still. That is how it was with you. Thank you
for all the good times. Dear Hilmar, because I already knew Suus, at first you were
to me a bit ‘that man who came along with Suus’. Fortunately you soon turned out to
be a fun colleague with whom I could chat plenty. Thanks for the help with the
cover of my thesis!
…Koen? Dear Koen, you were often my buddy in the lab, and over the past
years you have also been a good friend. I laughed a lot with you, but you were also
a great support when things were not going well. Even though I did not always know
what I could do, I hope I was of some help to you too in the times when you needed
it. Dear Hiske, when I found a stair-climbing buddy in you, my working days became
a bit shorter but also much more enjoyable. Thank you for everything our
friendship brings. Dear Marcin, for some reason we spent several years as good
coworkers before we started to become friends. Too bad our friendship didn’t start
earlier. Maybe one day I’ll still make that trip to visit you and Kasia in Poland. Dear
Thais, as you mentioned in your good-bye letter it’s remarkable how well we got
along and found support in each other, considering how different we are. Thanks
for everything and sorry for calling you old, I still owe you a beer to make up for
that. Thanks also to the other members of the ‘kippenhok’ (the ‘chicken coop’),
Corien and Rianne, for all the breaks I spent in your room, keeping you from your
work. It was often great fun, but I could also turn to you for serious matters.
Dear Ilia, also proclaimed our model PhD student, because you always had your
affairs in order. Take that as a compliment above all! Thanks for all the
discussions about writing a thesis, layout, job applications, etc. I hope
for you and Suping that you will be able to find your next jobs in the same place.
Dear Sabrina, it was a pity to see you leave the department, but I am very
glad that you have found your place at ERIBA. Dear Claudio, thanks for being a
great office mate and climbing buddy. Sorry you had to endure ‘Dutch pasta’
several times before climbing. Dear Inge (Holtman), of all the colleagues in the
department, your work was at times the most similar to mine. Thank you for the
discussions and the help with the bioinformatics part of my projects, and I wish
you lots of fun and success in San Diego. Dear Zhuoran, you were my ‘neighbor’ for most of
the time during my PhD. I enjoyed practicing Dutch with you and Claudio and I had
fun teaching you difficult Dutch and Frisian sentences. Thanks for always being so
kind, positive and seemingly happy. Dear Xiaoming, I have a lot of respect for how
much you have grown and developed in the last couple of years. It was fun to
witness your unexpectedly great sense of humor. Dear Clarissa, thanks for all the
nice, warm talks and good times. It was amazing!
Dear Alain, it is remarkable how you have been able to turn your experience
with people with Down syndrome into a PhD project. Your personal commitment to
this is commendable. Dear Chaitali, when we first met
I got to know you as my colleague’s wife. It’s funny how things can change and we
became coworkers as well. I hope things will work out for you and Rajendra now
that he is almost moving on and you are settled in Groningen. Dear Arun, thanks
for all the social chats and discussions about work and other stuff in our office, and
during our afternoon tea breaks in the first years of our PhDs. Dear Kumar, thank
you as well for the nice tea breaks. Don’t be too hesitant to practice your Dutch.
Practice makes perfect and it may come in handy when you are looking for your
next job within The Netherlands. Dear Sai, I was sorry to see that your time in the
department didn’t work out as hoped. I hope you will find a place where you can
find your happiness.
To the former lab members Reinhard, Wandert, Vishnu, Zhilin, Ellie,
Duygu, Ria, Ming-San, Thaiany and Divya, thank you for the coffee breaks,
lunches, after-work drinks, nice talks and good times. I hope you are all doing well.
Thanks to the two students, Eunice and Michelle, whom I supervised
during my PhD. Supervising you has not only confirmed my idea that I like to teach,
but also taught me how to be a better teacher. Thanks also to the many other
students we have had in the lab over the years. There were too many of you to
name all of you here in person (and frankly I forgot a few names), but you were
always a welcome addition to the social atmosphere in the department.
Dear Martijn, it was nice that you could be found in our department every now
and then; we really have not kept in touch enough since. I wish you and Linda all
the best with your little family. Hola Alejandra, I always felt sorry for you when Koen
and Martijn were mocking you. Thanks for considering me the nice guy among
them. Dear Khayum, it is hard to find someone who is as energetic and
enthusiastic as you. I’m sure that with your attitude you will find your way in
whatever life will bring you. Wout, Astrid and Wouter, thanks for the climbing
evenings, first at Gropo and later at Bjoeks!
Dear Lotte, for more than four years you were the one with whom I could share
both my enthusiasm and my frustrations about my PhD. Even though it did not work
out between us in the end, I am grateful for our years together and I hope you are
doing well.
More than anyone else, my parents have contributed to the fact that this
thesis exists. Heit and mam, you have always encouraged Freerk, Aldert and me to do
something that suits us well, without imposing expectations. You had a good eye for
what each of us needed in order to send us into society happy and self-reliant. I
am convinced that this, more than anything else, has helped me to become who I am
and to achieve what I have achieved. Freerk and Aldert, it is wonderful to see how
you are finding your way in life, whether with Mirjam and the new arrival on the
way in Ryptsjerk, or with Karla, Jhosue and Jurre in Ecuador. Unfortunately,
growing up also means that we see each other a little less often, but the fact that
this is sometimes hard only confirms how good we have it together. Thank you all
for everything.