evolutionary dynamics of the human nadph oxidase genes cybb, cyba, ncf2, and ncf4: functional...
TRANSCRIPT
Article
Evolutionary Dynamics of the Human NADPH Oxidase GenesCYBB CYBA NCF2 and NCF4 Functional ImplicationsEduardo Tarazona-Santosy12 Moara Machadoy2 Wagner CS Magalhaes2 Renee Chen1
Fernanda Lyon2 Laurie Burdett34 Andrew Crenshaw34 Cristina Fabbri5 Latife Pereira2 Laelia Pinto2
Rodrigo AF Redondo6 Ben Sestanovich1 Meredith Yeager34 and Stephen J Chanock1
1Laboratory of Translational Genomics of the Division of Cancer Epidemiology and Genetics National Cancer Institute NationalInstitutes of Health Gaithersburg MD2Departamento de Biologia Geral Instituto de Ciencias Biologicas Universidade Federal de Minas Gerais Belo Horizonte Brazil3Intramural Research Support Program SAIC Frederick NCI-FCRDC Frederick MD4Core Genotype Facility National Cancer Institute National Institute of Health Gaithersburg MD5Dipartimento di Biologia Evoluzionistica Sperimentale Universita di Bologna Via Selmi Bologna Italy6Institute of Science and Technology - Austria Am Campus 1 3400 Klosterneuburg AustriayThese authors contributed equally to this work
Corresponding author E-mail edutarsicbufmgbr chanocksmailnihgov
Associate Editor Sarah Tishkoff
Abstract
The phagocyte NADPH oxidase catalyzes the reduction of O2 to reactive oxygen species with microbicidal activity It iscomposed of two membrane-spanning subunits gp91-phox and p22-phox (encoded by CYBB and CYBA respectively)and three cytoplasmic subunits p40-phox p47-phox and p67-phox (encoded by NCF4 NCF1 and NCF2 respectively)Mutations in any of these genes can result in chronic granulomatous disease a primary immunodeficiency characterizedby recurrent infections Using evolutionary mapping we determined that episodes of adaptive natural selection haveshaped the extracellular portion of gp91-phox during the evolution of mammals which suggests that this region mayhave a function in host-pathogen interactions On the basis of a resequencing analysis of approximately 35 kb of CYBBCYBA NCF2 and NCF4 in 102 ethnically diverse individuals (24 of African ancestry 31 of European ancestry 24 of AsianOceanians and 23 US Hispanics) we show that the pattern of CYBA diversity is compatible with balancing naturalselection perhaps mediated by catalase-positive pathogens NCF2 in Asian populations shows a pattern of diversitycharacterized by a differentiated haplotype structure Our study provides insight into the role of pathogen-driven naturalselection in an innate immune pathway and sheds light on the role of CYBA in endothelial nonphagocytic NADPHoxidases which are relevant in the pathogenesis of cardiovascular and other complex diseases
Key words innate immunity immunogenetics chronic granulomatous disease
IntroductionThe phagocyte NADPH oxidase also known as the ldquorespira-tory burst oxidaserdquo is an enzymatic complex that plays acritical role in innate immunity Phagocyte NADPH oxidasecatalyzes the reduction of oxygen to O2 generating reactiveoxygen species (ROS) that are key components of phagocyticmicrobicidal activity (Heyworth et al 2003) PhagocyteNADPH oxidase includes two membrane-spanning polypep-tide subunits gp91-phox and p22-phox (encoded by CYBBand CYBA respectively) and a set of cytoplasmic polypeptidesubunits p40-phox p47-phox and p67-phox as well as aGTPase either Rac1 or Rac2 (encoded by NCF4 NCF1NCF2 and RAC1 or RAC2 respectively) Upon inductionthe cytoplasmic subunits bind the transmembrane compo-nents and activate the enzymatic complex producing ROS(fig 1 Sumimoto et al 2005) Mutations in CYBB CYBA NCF1NCF2 or NCF4 can result in chronic granulomatous disease(CGD) a primary immunodeficiency Most CGD patients
have no measurable respiratory burst and in less than 5of patients low levels of ROS production are noted(Heyworth et al 2003) Approximately 70 of CGD casesare X-linked owing to mutations in CYBB (Heyworth et al2003) and there is a high degree of allelic heterogeneity inX-linked as well as in autosomal forms of CGD except forcases due to NCF1 mutations (see the ImmunodeficiencyMutations Database httpbioinfutafibase_rootmutation_databases_listphp last accessed July 16 2013) NCF1 re-sides in a complex region of chromosome 7q11 and mostCGD mutations result from gene conversion of the wild-typegene to one of several neighboring highly paralogous pseu-dogenes (Chanock et al 2000)
Several studies in animal models and in vitro have con-firmed the long-standing clinical observation that theNADPH oxidase is critical for defense against catalase-positivebacteria and fungi (Buckley 2004) Association studies havesuggested a role for common genetic variants in CGD genes assusceptibility alleles for tuberculosis and malaria (Bustamante
Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2013 This work is written by US Government employees and is inthe public domain in the US
Mol Biol Evol 30(9)2157ndash2167 doi101093molbevmst119 Advance Access publication July 2 2013 2157
by guest on August 13 2016
httpmbeoxfordjournalsorg
Dow
nloaded from
et al 2011) as well as for immune related diseases such asCrohnrsquos disease and lupus as identified in genome-wide as-sociation studies (GWAS) in European populations (Riouxet al 2007 Roberts et al 2008 Jacob et al 2012) Besidesthe phagocyte NADPH oxidase other NADPH oxidaseswith different functions are expressed in a variety ofnonphagocytic cells including the endothelium and havebeen implicated in cardiovascular and renal diseaseAlthough p22-phox (encoded by CYBA) is a protein compo-nent shared by several of these NADPH oxidases (also calledNox) other more specific protein subunits are encoded bydifferent Nox genes homologous to the genes coding for thephagocytic subunits (Sumimoto et al 2005 San Jose et al2008) Although these nonphagocytic NADPH oxidases nor-mally produce less O2 even small imbalances in ROS levelsmay cause tissue damage due to oxidative stress which iscorrelated with the pathogenesis of gout chronic obstructivepulmonary disease rheumatoid arthritis and cardiovasculardiseases (Brandes and Kreuzer 2005) Therefore variants inNADPH oxidase genes may have pleiotropic effects across aspectrum of disorders (Santiago et al 2012)
Despite the involvement of the NADPH oxidase in a rangeof clinically relevant phenotypes our knowledge of the se-quence diversity of NADPH genes mostly derives from CGDpatients Although targeted SNP genotyping has been per-formed in the context of association studies for CYBA(Bedard et al 2009) and NCF4 (Olsson et al 2007) none ofthe large-scale resequencing efforts such as Seattle SNPs(httppgagswashingtonedu last accessed July 16 2013)Innate Immunity PGA (httpwwwpharmgatorgIIPGA2index_html last accessed July 16 2013) and the CornellndashCelera initiative (Bustamante et al 2005) have included theNADPH oxidase genes and the coverage of these genes for thecurrent release of the 1000 Genomes Project remains low formost of the studied individuals (1000 Genomes ProjectConsortium et al 2012 average coverage and their standard
deviations on May 2013 are CYBB 40 plusmn 22 CYBA 35 plusmn 20NCF2 49 plusmn 27 and NCF4 49 plusmn 27) Although GWAS haveidentified common variants that contribute to complex phe-notypes a component of missing heritability of common dis-eases due to rare variants that are detectable only byresequencing is emerging In this study we analyzed the pat-tern of sequence diversity of four of the NADPH genes (CYBBCYBA NCF2 and NCF4) between mammalian species and inhuman populations by resequencing these genes in 102 eth-nically diverse individuals We interpreted our results in termsof evolutionary histories by addressing the action of naturalselection and focusing on two temporal scales mammalianevolution and recent human evolution We excluded NCF1from our study because its high homology with its pseudo-genes prevents reliable sequencing in individual samples(Chanock et al 2000) Several studies have shown the impor-tance of natural selection on the evolution of immunity genesat both the interspecific (Kosiol et al 2008) and populationlevels (Ferrer-Admetlla et al 2008 Barreiro et al 2009 Barreiroand Quintana-Murci 2010) By definition variants under nat-ural selection are associated with different reproductive effi-ciencies (fitness) of their carriers and contribute to phenotypevariability therefore they may be biomedically relevant byinfluencing the susceptibility to rare or common diseasesThe goals of this study are as follows 1) to determine whetherthe pattern of diversity of human phagocyte NADPH genesreflects the action of different types of natural selection 2) toelucidate the evolutionary dynamics of NADPH genes at thetemporal scales of mammals and humans and 3) to under-stand the biomedical implications of this evolutionary processin human populations
Results
Molecular Evolution of NADPH Genes alongMammalian Phylogeny
We examined signatures of natural selection across thecoding regions of NADPH genes by analyzing sequencesfrom the complete genomes of 29 mammals listed in theEntrez and Ensembl databases (Lindblad-Toh et al 2011one sequence for each species see supplementary materialSupplementary Material online for details) and comparingthe amount of nonsynonymous and synonymous substitu-tions (Nielsen et al 2005) When comparing a set of homol-ogous sequences from different species most of the observeddifferences are fixed that is the differences are monomorphicwithin a species because enough time has passed for theobserved variant to appear increase its frequency and reacha frequency of 1 (Kimura 1974) We compared the number offixed synonymous substitutions (dS assumed to be neutral)and fixed nonsynonymous substitutions (dN for which wetest the hypothesis of natural selection) between speciesusing the parameter = dNdS which is informative of theaction of natural selection at the inter-specific level (Yang2007a) Under neutral evolution of nonsynonymous substitu-tions these substitutions fix at the same rate as synonymoussubstitutions and therefore dN amp dS and amp 1 If nonsyn-onymous substitutions tend to be deleterious purifying
FIG 1 Components of the phagocyte NADPH oxidase Representationof the inactivated (left) and activated (right) forms of the phagocyteNADPH oxidase components reproduced from Heyworth et al (2003)The activated form is responsible for the respiratory burst The proteins(and genes) are gp91 (CYBB Xp211) p22 (CYBA 16q24) p67 (NCF21q25) p40 (NCF4 22q131) and p47 (NCF1 7q1123)
2158
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
selection maintains the substitutions at low frequencies andprevents fixation at the same rate as synonymous substitu-tions resulting in dNlt dS and lt 1 On the other hand ifepisodes of positive natural selection (that raise the frequencyof beneficial variants) are frequent nonsynonymous substi-tutions increase in frequency and fix more rapidly than neu-tral synonymous substitutions thus dNgt dS and gt 1 Weused the maximum likelihood framework developed by Yang(2007a) to estimate for the NADPH oxidase genes under avariety of evolutionary models as implemented in the PAMLsoftware (Yang 2007b) This approach allows inferences aboutthe evolution of a coding region along an interspecific phy-logeny and maps the codons that have evolved under strongweak purifying selection neutrality or adaptive positive se-lection (see supplementary material Supplementary Materialonline for details)
In general PAML evolutionary models that allow a combi-nation of purifying selection and neutrality are reasonably re-alistic These models are nested with respect to models thatalso incorporate positive selection at the cost of adding newparameters We evaluated the improvements in the goodnessof fit of the data using the latter model with respect to theformer models by applying a likelihood ratio test (LRT) Afterfitting the data to the most appropriate evolutionary modelNaive Empirical Bayes (NEB) or Bayes Empirical Bayes (BEB)approaches were used to infer theparameter for each codon
For the 29 species of mammals considered we filteredbased on quality control (supplementary table S1 Supple-mentary Material online) and analyzed 570 codons of CYBBin 26 species 198 codons of CYBA in 16 species 526 codons ofNCF2 in 23 species and 339 codons of NCF4 in 20 species (seesupplementary material Supplementary Material online forfurther details including the species and sequences used forthe analyses the parameter estimations for the differentmodels and the LRT results) Here we only present the resultsof model M3 of Yang (2007a) with three (K = 3) classes of (fig 2) This model allows for different classes including thepossibility of positive selection and is a reasonable way ofpresenting the results for the four genes under the samemodel Moreover in all cases the data fit better withmodel M3 than the nested and simpler M0 or M2 modelspresented by Yang (2007a)
The results of this analysis for CYBB CYBA NCF2 and NCF4are presented in figure 2 which shows the type of naturalselection (ie based on the estimated ) for each codon thatmost likely predominated during mammalian evolution Forthis temporal scale CYBA NCF2 and NCF4 coding regionshave evolved driven by a combination of different levels ofpurifying natural selection Overall the average and standarddeviation values for these genes are NCF2 = 0256 plusmn 0227NCF4 = 0126 plusmn 0140 and CYBA = 0109 plusmn 0116
Our most striking result is for CYBB which presents a widespectrum of mutations that account for gt70 of CGD pa-tients Although we would predict that purifying selection ongenes involved in Mendelian diseases (Blekhman et al 2008)would yield similar results for CYBB and other NADPHcomponents we observed a different pattern In generalCYBB is a conserved gene but 6 of its codons show
evidence of positive natural selection (supplementary tableS2 Supplementary Material online fig 2) In a genome-widesurvey performed by Kosiol et al (2008) they have reportedCYBB as a gene showing a signal of positive natural selectionMore importantly by evolutionary mapping we show herefor the first time that most of these positive selection eventsmap to the small extracellular portion of this protein (fig 3)The proximity of these inferred episodes of positive naturalselection to glycosylation sites in gp91 is noteworthy consid-ering the importance of the glycome in immunity (Marth andGrewal 2008)
Population Genetics of NADPH Genes
We sequenced CYBB CYBA NCF2 and NCF4 in a publiclyavailable panel that includes 24 individuals of African ances-try 31 Europeans 24 AsianOceanians and 23 admixed LatinAmericans (ie Hispanics) This panel is a suboptimal repre-sentation of the worldwide population a limitation that iscommon to most human genomic diversity projects focusedon SNP genotyping or resequencing efforts However basedon how human genetic diversity is apportioned within(gt85) and between (lt15) populations (Lewontin 19721000 Genomes Project Consortium 2010) even studies usingsuboptimal sampling are informative about the genetic struc-ture of human populations and serve to critically identify therole of evolutionary factors in human genetic diversity(Kimura 1974 Nielsen et al 2005 1000 Genomes ProjectConsortium 2010) All the raw results are available as supple-mentary material Supplementary Material online and at theSNP500Cancer project homepage (httpvariantgpsncinihgovcgfseqpagessnp500do last accessed July 16 2013) orcan be downloaded from the DIVERGENOME platform(Magalhaes et al 2012 httpwwwpggeneticaicbufmgbrdivergenome last accessed July 16 2013)
To ascertain which combination of evolutionary factorshas shaped the diversity of NADPH genes we assessed thepattern of nonsynonymous and synonymous polymor-phisms as well as intra- and interpopulation diversity forNADPH genes and tested the null hypothesis of neutralitythat patterns of diversity may be explained by consideringonly the demographic history of human populations and themutation and recombination rates of each locus
Nonsynonymous polymorphisms are underrepresentedin the human genome and usually occur at low frequencieswhen present reflecting the action of purifying natural selec-tion (1000 Genomes Project Consortium 2010) By resequen-cing Tarazona-Santos et al (2008) did not observe commonnonsynonymous polymorphisms for CYBB This result is con-sistent with purifying natural selection acting on X-chromo-some genes due to the exposure of deleterious recessivemutations to natural selection in hemizygous males Thussubstitutions in the coding region of CYBB should be rarein human populations and are seldom captured by studieswith small sample sizes Interestingly the lack of CYBBnonsynonymous polymorphisms in our sample of humanpopulations contrasts with the recurrent episodes of positiveselection of the extracellular portion of gp91 during
2159
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
mammalian evolution as inferred in this study For the auto-somal NADPH oxidase components (table 1 and haplotypetables online Supplementary Material online) we observe inthis study two rare and conservative nonsynonymous substi-tutions (ie involving amino acids with similar chemical prop-erties T85N and A304E) in NCF4 But for NCF2 and CYBA weobserved patterns of nonsynonymous substitutions thatseldom occur in human genes NCF2 has nine nonsynony-mous substitutions three of them are common (with a fre-quency higher than 5 in at least one of the studiedpopulation samples) and six are rare On the other handtwo nonsynonymous substitutions in CYBA are commonand ubiquitous in human populations namely Y72H(rs4673) and V174A (rs1049254 in a position where variationamong mammalian species is also observed) Moreover thefollowing two of the five common amino acid changes ob-served in the autosomal NADPH genes are predicted to bepossibly damaging (ie radical) by the Polyphen resource(Ramensky et al 2002) the two changes are R395W inHispanic NCF2 (rs13306575) and Y72H (rs4673) in CYBA Ingeneral Polyphen accurately predicts the effect of nonsynon-ymous substitutions based on biochemical and evolutionarydata (Williamson et al 2005) Notably the ImmunodeficiencyMutations Database (httpbioinfutafiNCF2basecontent=pubIDbases last accessed July 16 2013) reports one395W395W autosomal recessive CGD patient but we andthe HapMap project (wwwhapmaporg last accessed July 16
2013) observed the W allele at frequencies between 5 and10 in Asians and admixed Latin Americans including onesupposedly healthy 395W395W Japanese HapMap individ-ual On the basis of these results we verified whether NativeAmericans who descend from an ancestral Pleistocene Asianpopulations that peopled the Americas by the Behring Straitsmore than 14000 years ago may have relatively higher fre-quencies of this variant We genotyped the variant 395Wusing a Taqman assay in 558 Native Americans (see supple-mentary table S3 Supplementary Material online for detailedresults) and observed an allele frequency of 12 being thisvariant always present in heterozygous individuals
For the four studied genes the diversity and levels of re-combination are higher in Africans than in non-Africans(table 2 and haplotype tables available as supplementaryfiles Supplementary Material online) This result is a conse-quence of the African origin of modern humans and the ldquooutof Africardquo migration that occurred 40000ndash80000 years agoafter a bottleneck leading to the peopling of other continents(Campbell and Tishkoff 2008 Laval et al 2010) Therefore thefirst divergence between continental human populations wasbetween Africans and ancestral non-Africans Consistentlywith this scenario we observed the highest between-popula-tion differentiation for CYBB CYBA and NCF4 between thesetwo groups Interestingly NCF2 does not match this pattern(table 3)
FIG 2 Inferred types of natural selection for codons of the NADPH genes at the evolutionary time scale of mammals Codons are represented along thehorizontal axis For each gene three classes of sites (black dark gray and light gray) are considered and each class evolved under different inferred values (presented for each gene in the figure at the top of each graphic) These classes correspond to the model M3 of Yang (2007a) with three classes ofsites Given our data this model is more likely than alternative models of evolution that assume simpler scenarios such as a unique for the entire gene(see supplementary material Supplementary Material online for details regarding methods and results using alternative models) The three classescorrespond to different types and levels of natural selection from strong purifying selection (in the lightest gray) to positive selection (gt 1) For eachcodon the probability of belonging to each of the three classes of corresponds to the height of the corresponding color in the vertical bar Forexample codon 173 of CYBB (indicated by a white arrow) has a 0000 probability of belonging to the= 00095 class (light gray a class corresponding tostrong purifying selection) a 0169 probability of belonging to the = 03085 class (dark gray) and a 0831 probability of belonging to the = 18987class (black a class that suggests positive selection) In this case reasonable evidence of positive selection on this codon exists
2160
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
FIG
3
Nat
ural
sele
ctio
nm
app
ing
acro
ssC
YBB
(en
codi
ng
gp91
)al
ong
mam
mal
ian
evol
utio
na
sid
enti
fied
usin
gth
ePA
ML
met
hod
byY
ang
(200
7a)
The
top
olog
ies
ofgp
91an
dp2
2ar
ere
pro
duce
dfr
omT
aylo
ret
al(
2004
Cop
yrig
ht20
04T
heA
mer
ican
Ass
ocia
tion
ofIm
mun
olog
ists
In
cU
sed
wit
hp
erm
issi
on)
Dar
kgr
ayam
ino
acid
sha
veev
olve
dun
der
pos
itiv
ese
lect
ion
wit
hgt
80
pro
babi
lity
Mos
tof
thes
eam
ino
acid
sar
ein
the
extr
acel
lula
rp
orti
onof
the
pro
tein
The
upp
erp
art
ofth
efig
ure
show
sth
ep
rote
inal
ign
men
tfo
rn
ine
mam
mal
sof
the
gp91
regi
onin
dica
ted
byth
ebl
ack
ellip
seI
nth
isre
gion
ahi
ghle
velo
fam
ino
acid
varia
tion
isfo
und
betw
een
spec
ies
and
seve
ralc
odon
ssh
owgt
1In
this
alig
nm
ent
gray
vert
ical
bars
corr
esp
ond
tova
riab
leam
ino
acid
site
sT
hep
rote
inal
ign
men
tof
mam
mal
ssh
ows
the
follo
win
gsp
ecie
sH
om(H
omo
sapi
ens
sapi
ens)
Pan
(Pan
trog
lody
tes)
Mac
(Mac
aca
mul
atta
)M
us(M
usm
uscu
lus)
Rat
(Rat
tus
norv
egic
us)
Cri
(Cri
cetu
lus
gris
eus)
Het
(Het
eroc
epha
lus
glab
er)
Cav
(Cav
iapo
rcel
lus)
an
dO
ry(O
ryct
olag
uscu
nicu
lus)
EC
ext
race
lula
ren
viro
nm
ent
TM
tra
nsm
embr
ane
laye
ran
dIC
in
trac
ellu
lar
envi
ron
men
t
2161
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
From the four studied genes NCF4 which encodes theregulatory protein p40-phox shows a pattern of diversitythat is typical for a gene that has evolved under neutralityIn addition to the features described in the previous para-graph the allelic spectra of NCF4 in the studied populationsare consistent with a neutral model of evolution (tables 2ndash4)
Although CYBB presented the most interesting evolution-ary history at the interspecific level with repeated episodes ofpositive natural selection recent human evolutionary historyhas resulted in interesting patterns of variation for CYBA andNCF2 CYBA encodes p22-phox which is a transmembraneprotein shared by different NADPH oxidases In addition toharboring two common nonsynonymous polymorphisms(V174A is also variable among different species of mammals)CYBA is the most variable and most affected by recombina-tion among the NADPH oxidase genes (tables 1 and 2) Inparticular CYBA diversity is very high in Europe comparedwith 329 genes resequenced in a European sample (httppgagswashingtonedusummary_statshtml last accessed July16 2013) CYBA ranks 11th (ie the 97th percentile)Moreover there are contrasting proportions of the totalnumber of polymorphismsnumber of singletons betweenAfricans (a low proportion) and Europeans (a high propor-tion) the latter showing an excess of common polymor-phisms with respect to the neutral expectation (see the DFL
test in table 4 and supplementary table S4 SupplementaryMaterial online) This excess of common variants inEuropeans is also significant when we conservatively testedit against a scenario of human evolution that incorporates theldquoOut of Africardquo bottleneck (Laval et al 2010) and the observedlevel of recombination in Europeans (CYBA = 807 for the se-quenced region) Because demographic forces and recombi-nation levels alone do not explain the high CYBA diversity andits excess of common polymorphisms we suggest that bal-ancing natural selection (that acts by maintaining differentalleles at high frequency in a population) has contributed to
shape the diversity of CYBA at least in the European popu-lation Indeed figure 4a shows that the haplotype network ofCYBA for the European population is consistent with theaction of balancing natural selection (Bamshad andWooding 2003) showing two well-differentiated commonclades that explain the observed high diversity and theexcess of common CYBA variants A comparative genomicanalysis confirms this inference the ratio of polymorphismsto differences fixed between human and chimpanzee is nothomogeneous along the gene in the different human popu-lations (Mc Donald 1998 supplementary table S5Supplementary Material online) as would be expectedunder neutral evolution Our inference of balancing naturalselection is consistent with the fact that 25ndash30 of the var-iation in levels of ROS production can be attributed to geneticfactors (Lacy et al 2000) and that ROS levels are associatedwith CYBA variants (Bedard et al 2009)
p67 encoded by NCF2 is a necessary cytosolic NADPHcomponent for phagocyte ROS production Asians show ahighly differentiated NCF2 haplotype structure (see frequen-cies of haplotypes NCF2-D11 and NCF2-E10 in the haplotypetables online and in the network shown in fig 4b) and thehighest FST values are observed in pairwise comparisons be-tween Asians and non-Asian populations (in particular withEuropeans table 3) and not between Africans and non-African populations as is usually observed in the humangenome We confirmed these results by analyzing data forNCF2 from the HapMap Project (supplementary materialSupplementary Material online) Moreover a trend towardan excess of rare polymorphisms exists in Asians that is notobserved elsewhere (tables 1 2 and 4 DFL =1904FFL =1893) Although we cannot exclude that this patternof diversity is compatible with the null hypotheses of neutral-ity and with the tested demographic history of human pop-ulations inferred by Laval et al (2010 tables 2ndash4) we canspeculate and envisage four additional evolutionary scenarios
Table 1 Allele Frequencies of Nonsynonymous Polymorphisms in NADPH Oxidase Genes
Genes rs Minor Allele(Amino Acid)
PolyphenPrediction
African European Asian Hispanic
CYBA
Y72H rs4673 T (Y) Possibly damaging 046 032 017 022
V174A rs1049254 T (V) Benign 017 048 048 018
NCF2
K181R rs2274064 G (R) Benign 035 037 041 048
T279M rs13306581 T (T) Probably damaging 000 000 005 000
V297A rs35937854 C (A) Benign 004 000 000 000
T361S Chr1181799289 NCBI36hg18 T (S) mdash 000 000 002 000
H389Q rs17849502 A (Q) Benign 000 005 000 007
R395W rs13306575 T (W) Possibly damaging 000 000 000 007
N419I rs35012521 T (I) Probably damaging 000 002 004 000
P454S rs55761650 T (S) mdash 000 000 000 002
L487S Chr1181795862 NCBI36hg18 C (S) mdash 002 000 000 000
NCF4
T85N rs112306225 A (N) mdash 000 000 002 000
A304E rs5995361 A (E) Benign 004 000 000 000
2162
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
that may have contributed to shape the pattern of NCF2diversity in Asians 1) the observed trend is suggestive of aselective sweep on NCF2 standing variation a neutral orweakly deleterious existing variant becomes beneficial andrapidly increases in frequency (together with its associatedhaplotypes ie incomplete sweep) reducing the nucleotidediversity in the surrounding region and rendering other stand-ing substitutions rare During this process new rare substitu-tions appear in the expanding positively selected haplotype2) The pattern of diversity of NCF2 in Asia may result from anincomplete selective sweep acting on ARPC5 which is locatedapproximately 35 kb downstream of NCF2 In a genome-widescan for recent positive selection Voight et al (2006) identi-fied a strong signature of an incomplete sweep for ARPC5 inAsia (P = 0009) characterized by a higher than expected long-range linkage disequilibrium summarized by very high iHS
statistics (see Haplotter results for the HapMap II data athttphaplotteruchicagoedu last accessed July 16 2013)SNPs in NCF2 also presents high iHS statistics (P = 002) al-though values are lower than for ARPC5 3) The differentiatedpattern of diversity of NCF2 in Asia may also have been gen-erated without the action of natural selection during the firstcolonization of Asia by modern humans In a process of geo-graphic population expansion specifically in the front wave ofthe expansion some rare alleleshaplotypes (ie surfing al-leles) may become common by chance mimicking the pat-tern of diversity generated by a selective sweep (Excoffier andRay 2008) 4) The excess of rare variants may be an artifact ofpooling individuals from different populations (Ptak andPrzeworski 2002) Consistent with evolutionary scenarios 1ndash4 that produce similar patterns of diversity the haplotypenetwork of NCF2 for Eurasians (fig 4b) shows the following1) a large differentiation between Asians and Europeans thatis compatible with the high observed FST values and 2) a star-like shape associated with the haplotype NCF2-E10 that iscommon in Asia and rare elsewhere which is compatiblewith the excess of rare alleles in the Asian populations
DiscussionBy analyzing 29 mammalian genomes and four human pop-ulations we show in this study that natural selection hasacted in different ways over time to shape the pattern ofdiversity of the phagocyte NADPH oxidase genes At the tem-poral scale of the evolution of mammals we have inferredrecurrent episodes of positive selection acting on the extra-cellular portion of gp-91 that have been important to shapethe pattern of interspecific diversity of this gene Our in-terspecific analyses did not show a similar pattern of naturalselection in any of the other phagocyte NADPH oxidasegenes Even if current knowledge on the biology of NADPHdoes not allow us to interpret our results in terms of functionwe propose that the extracellular region of gp-91 is function-ally relevant Our results also imply that this region is highlydifferentiated among mammals at the protein level and thisvariability should be considered when mammals models areused to study the structure and function of phagocyteNADPH components
In the time scale of human evolution our analyses of theNADPH oxidase genes suggest that CYBA has been a target ofbalancing natural selection Because we do not have evidenceof population-specific variants that faced selective pressurethe inferred natural selection may have acted on a standingvariation in ancestral populations This implies that the selec-tive pressure began after the appearance of the variant andpossibly acted in a specific geographic region (Barret andSchluter 2008) The signatures of natural selection acting ona new mutation and on standing variation differ In the case ofselective sweeps episodes of natural selection on standingvariation are associated to a larger variance in the allelic spec-trum with respect to natural selection on a new mutationAlso selection on standing variation may produce an excess ofalleles at intermediate frequencies that is not associated withhigh nucleotide diversity (Przeworski et al 2005 Peter et al2012) This pattern contrasts with the effect of balancing
Table 2 Intrapopulation Diversity Indexes in the Studied Populationsfor the NADPH Oxidase Genes Obtained from Resequencing Dataa
African European Asian Hispanic
Number of chromosomes
CYBBb 42 52 34 36
CYBA 48 62 48 46
NCF2 48 62 48 46
NCF4 48 62 48 46
Segregating sitessingletons
CYBB 218 70 103 135
CYBA 6122 333 335 347
NCF2 4613 3311 2816 3714
NCF4 4512 267 191 308
Haplotype structureNumber of inferred haplotypesc
CYBB 14 5 7 12
CYBA 39 39 32 26
NCF2 38 32 18 31
NCF4 36 30 17 22
Haplotype diversity plusmn SD
CYBB 088 plusmn 003 034 plusmn 008 053 plusmn 010 070 plusmn 008
CYBA 098 plusmn 001 098 plusmn 001 097 plusmn 000 096 plusmn 002
NCF2 099 plusmn 001 096 plusmn 001 087 plusmn 004 097 plusmn 001
NCF4 098 plusmn 001 093 plusmn 002 093 plusmn 002 093 plusmn 002
Recombination parameter (q 103 per site)c
CYBB 008 lt001 lt001 lt002
CYBA 291 148 096 088
NCF2 052 038 010 058
NCF4 137 103 039 022
h estimatorsd
p 103 per site
CYBB 036 012 015 028
CYBA 190 163 151 157
NCF2 081 047 043 057
NCF4 101 076 064 084
hW 103 per site
CYBB 042 013 021 027
CYBA 231 118 125 13
NCF2 102 069 062 083
NCF4 101 070 054 087
aMost analyses were performed using software DnaSP (Rozas 2009)bData for CYBB (Xp211) are from Tarazona-Santos et al (2008)cHaplotypes and inferred using the method by Stephens and Scheet (2005) andthe software PHASEd Tajima (1983) W Watterson (1975)
2163
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
natural selection which produces an excess of common allelesassociated with high genetic diversity Thus the observed pat-tern of CYBA diversity in Europeans is not consistent with aselective sweep on a standing variation but it is consistent witha scenario of balancing selection acting on standing variation
If we consider for CYBA that heterozygote advantage maybe the mechanisms of balancing selection we can speculatethat the biological basis for this mechanism may be the fol-lowing considering that p22-phox is not exclusive of thephagocyte NADPH oxidase but it is also part of Nox com-plexes expressed in other tissues the dependence of ROSproduction on CYBA variants has to be finely regulated IfCYBA variants induce high levels of ROS these variants mayfavor a phagocyte-dependent efficient response to pathogensbut may damage other endothelial tissues Alternativelytissue oxidative damage does not occur if ROS productionis low but this response may be associated with a weaker
phagocyte respiratory burst against pathogens In this con-text heterozygote individuals with a CYBA-dependent inter-mediate level of ROS production may have been favored bynatural selection
Our results contribute to the discussion regarding the rel-evance of balancing selection in shaping the diversity ofinnate immunity genes (Ferrer-Admetlla et al 2008 Barreiroet al 2009) Ferrer-Admetlla et al (2008) have associated therecurrent signatures of balancing selection on inflammatorygenes with the need for fine regulation CYBA is the onlyphagocytic NADPH oxidase gene that also encodesnonphagocyte Nox components thus CYBA has a role inROS cell signaling a potentially dangerous process due toits capability to produce oxidative damage to tissues Geneswith these characteristics likely need even tighter regulationThe interplay between pathogen-driven selective pressureon innate immunity genes and their concomitantnonimmunological functions is complex In addition toCYBA other interesting examples of this interplay can befound among the 10 human toll-like receptors (TLRs) thatshow a variety of signatures of natural selection TLR8 whichshows the strongest signature of purifying selection is alsoinvolved in neuronal development (Barreiro et al 2009) and itis difficult to discriminate the role of each function in deter-mining the observed signature of natural selection
With few exceptions the pathogens responsible for naturalselection on immune genes are difficult to specify In the caseof NADPH oxidase we can infer based on the spectrum ofinfections in CGD patients that catalase-positive bacteria andfungi such as Staphylococci Salmonella Candida Aspergillusand M tuberculosis may be the selective pathogensInteractions between the host and pathogens also includemechanisms of the latter to impair the respiratory burst ofthe former For example Leishmania donovani blocks the as-sembly of NADPH oxidase at the phagosome membrane(Lodge et al 2006) These mechanisms may constitute selec-tive pressures imposed by pathogens
The associations reported in GWAS between rs4821544 inNCF4 and Crohnrsquos disease (an idiopathic inflammatory boweldisease that predominantly involves the ileum and colonRioux et al 2007) and between rs10911363 in NCF2 and
Table 3 Pairwise FST Genetic Distances between Populations
CYBBa CYBA
Africa Europe Asia Hispanic Africa Europe Asia Hispanic
Africa mdash 0316 0264 0092 mdash 0074 0083 0065
Europe 0257 mdash 0000 0107 mdash 0002 0056
Asia 0211 0000 mdash 0073 mdash 0054
Hispanic 0070 0082 0056 mdash mdash
NCF2 NCF4
Africa mdash 0048 0058 0037 mdash 0128 0136 0158
Europe mdash 0069 0000 mdash 0026 0026
Asia mdash 0059 mdash 0005
Hispanic mdash mdash
aFor CYBB FST estimators are above the diagonal Below the diagonal are the FST values corrected as if the effective population sizes of X chromosome genes were equal toautosomal ones
Table 4 Results of Neutrality Tests for the NADPH Oxidase Genesand Their Significancea
African European Asian Hispanic
Tajimarsquos D
CYBB 0473 0274 0813 0084
CYBA 0580 1242 0684 0707
NCF2 0412 0939 0883 0987
NCF4 0750 0243 0556 0102
Fu and Lirsquos D
CYBB 1050 1110 0395 0977
CYBA 1485 1734 1188 0138
NCF2 0407 1105 1904 1398
NCF4 0308 0443 1893 0266
Fu and Lirsquos F
CYBB 0980 0811 0114 0814
CYBA 1382 1893 1205 0405
NCF2 0512 1252 1893 1567
NCF4 0557 0231 1225 0248
NOTEmdashUnderlined values represent significant results under the demographic modelinferred by Laval et al (2010) for human populations See details in supplementarytable S4 Supplementary Material onlineaThe McDonaldndashKreitman test is nonsignificant in any of the cases
Significant under the WrightndashFisher model of constant population size
2164
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
systemic lupus erythematosus (Cunninghame Graham et al2011) confirm the involvement of NADPH genes in the path-ogenesis of inflammatory-related common diseases Ourclaim that natural selection acted on CYBA (and maybe inNCF2) is relevant for biomedical studies because combiningevidence of natural selection with association analyses inimmune genes increases the statistical power to detect dis-ease-associated variants (Ayodo et al 2007) As a new gener-ation of association studies focusing on rare variation isemerging the combination of genes deemed interestingfrom GWAS and populations with an excess of rare variantsin these genes such as NCF2 in Asians are particularly inter-esting as a source of rare variants with clinical relevanceFinally by determining through molecular evolution mappingthat the extracellular portion of gp91 (encoded by CYBB inthe X-chromosome) has been subject to recurrent episodes ofpositive selection at the scale of mammals evolution we positthe hypothesis that this portion of the NADPH oxidase isrelevant for currently unknown biological processes thatonce revealed by structural and functional investigationswill contribute to understanding the role of NADPH oxidasein infectious autoimmune and cardiovascular diseases
Materials and Methods
Molecular Evolution Analysis of NADPH Genes
We used the maximum likelihood framework developed byYang (2007a) to estimate for the NADPH oxidase genes
under a variety of evolutionary models as implemented in thePAML software (Yang 2007b) This approach allows infer-ences about the evolution of a coding region along aninter-specific phylogeny mapping the codons that haveevolved under strongweak purifying selection neutrality oradaptive positive selection Further details about these anal-yses are available as supplementary material SupplementaryMaterial online
Human Population Genetics of NADPH Genes
For human population genetics analyses we conducted bidi-rectional Sanger sequencing of CYBB CYBA NCF2 and NCF4for a total of 35242 bp for each of 102 healthy individuals aspart of the SNP500 Cancer project (Packer et al 2006 seesupplementary fig S1 Supplementary Material online for de-tails) Human population resequencing data for CYBB werepublished in Tarazona-Santos et al (2008) The resequencingexperiments were performed as in Packer et al (2006) These102 unrelated individuals include the following 24 of Africanancestry (15 African Americans from the United States and 9Pygmies) 23 admixed Latin Americans (from Mexico PuertoRico and South America) 31 Europeans (from the CEPHUTAH pedigree and the NIEHS Environmental GenomeProject) and 24 AsiansndashOceanians (from Melanesia PakistanChina Cambodia Japan and Taiwan)
After controlling for multiple tests we confirmed that allSNPs fit the HardyndashWeinberg equilibrium by the Guo and
CYBA - A16
CYBA - A3
Y72H - rs4673
V174A - rs1049254 CYBA - E18
(a)
NCF2 - E10 H389Q - rs17849502
NCF2 - B3
NCF2 - D11
NCF2 - C1K181R - rs2274064(b)
FIG 4 Phylogenetic networks of (a) CYBA in Europeans and (b) NCF2 in Europeans (black) and Asians (yellow) The lengths of the branches areproportional to the number of mutations Only nonsynonymous mutations are shown The haplotype names correspond to the table of inferredhaplotypes in the supplementary material Supplementary Material online We only show the names of haplotypes with a frequency gt5 Ancestralhaplotypes (inferred as being the human haplotype or median vector most similar to the chimpanzee sequence) are indicated by an arrow Medianvectors are in red
2165
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Thompson (1992) test which was implemented in the soft-ware Arlequin 30 (Excoffier et al 2007) Insertion-deletions(INDELs) were excluded from population genetics analysesTo assess intrapopulation diversity we used two statistics the per-site mean number of pairwise differences betweensequences (Tajima 1983) and w based on the number ofsegregating sites (S) (Watterson 1975) We measured pairwisebetween-populations diversity by using the FST statistics cal-culated using the software DnaSP (Rozas 2009)
Haplotypes and the recombination parameter were in-ferred using the PHASE software (Stephens and Scheet 2005)and diversity indexes calculations (tables 2 and 3) as well asneutrality statistics (table 4) were estimated using DnaSP soft-ware We applied two kinds of neutrality tests 1) tests basedon the allelic spectrum which is the distribution of polymor-phisms across different classes of frequencies namelyTajimarsquos DT (Tajima 1989) and FundashLirsquos DFL and FFL (Fu andLi 1993) and 2) tests based on comparisons between thenumber of polymorphisms in human populations and fixeddifferences with the chimpanzee (ie outgroup) namely theMcDonald and Kreitman (1991) test and the adaptedKolmogorovndashSmirnoff test by McDonald (1998) For thefirst set of tests we used as null hypotheses both the classicWrightndashFisher model of neutrality with a constant popula-tion size as well as the more realistic evolutionary scenario forhuman populations inferred by Laval et al (2010) In the caseof the scenario of Laval et al (2010) we ignored interconti-nental gene flow within the Old World because these raregene flow events likely does not affect the level of significanceof the neutrality tests given its very low inferred values(13 105) Null distributions used to test the significanceof the neutrality tests under these evolutionary scenarios weregenerated using coalescent simulations and a significancelevel of 005 (Hudson 2002) The KolmogorovndashSmirnoff testof neutrality adapted by McDonald (1998) was performedusing Slider software available at httpudeledu~mcdonaldaboutdnasliderhtml (last accessed July 16 2013) Weperformed coalescent simulations using ms software(Hudson 2002) Further methodological details are availableas supplementary material Supplementary Material onlineWe constructed the CYBA and NCF2 networks using allSNP variants and applying the Median joining algorithmand the maximum parsimony option calculations as imple-mented in the software Network 46 (Bandelt et al 1999)
Supplementary MaterialSupplementary tables S1ndashS5 and figure S1 are available atMolecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)
Acknowledgments
SJC conceived the study ET-S MM FL RC AC CF LPand BS generated the dataanalyzed the sequences LB ACWCSM and MY provided tools for data generationand analyses ET-S MM FL WCSM and RR performedpopulation genetics analyses ET-S and MM prepared tablesand figures ET-S and SJC wrote the manuscript All the
authors contributed with discussions about the results Theauthors are grateful to Silvia Fuselli for discussions of ourresults to Fernando Levi Soares for technical assistance withthe figures and the Sequencing Group of the CoreGenotyping Facility (National Cancer Institute) for their tech-nical assistance This work was supported by the IntramuralResearch Program of the National Cancer Institute (NCI) andNational Institutes of Health (NIH) CF was supported by theUniversity of Bologna ET-S MM WCSM and FL weresupported by the Fogarty International Center and NationalCancer Institute (1R01TW007894) Conselho Nacional deDesenvolvimento Cientıfico e Tecnologico (Brazil Grant3075562012-3) Fundacao de Amparo a Pesquisa de MinasGerais (Brazil CBB PPM 0026512) and the Brazilian Ministryof Education (CAPES Agency grant PNPD 0256709-1)
ReferencesAyodo G Price AL Keinan A Ajwang A Otieno MF Orago AS
Patterson N Reich D 2007 Combining evidence of natural selectionwith association analysis increases power to detect malaria-resis-tance variants Am J Hum Genet 81(2)234ndash242
Bamshad M Wooding SP 2003 Signatures of natural selection in thehuman genome Nat Rev Genet 4(2)99ndash111
Bandelt H-J Forster P Rohl A 1999 Median-joining networks for infer-ring intraspecific phylogenies Mol Biol Evol 16(1)37ndash48
Barreiro LB Ben-Ali M Quach H et al (17 co-authors) 2009Evolutionary dynamics of human Toll-like receptors and their dif-ferent contributions to host defense PLoS Genet 5(7)e1000562
Barreiro LB Quintana-Murci L 2010 From evolutionary genetics tohuman immunology how selection shapes host defence genesNat Rev Genet 11(1)17ndash30
Barrett RD Schluter D 2008 Adaptation from standing genetic varia-tion Trends Ecol Evol 23(1)38ndash44
Bedard K Attar H Bonnefont J Jaquet V Borel C Plastre O Stasia MJAntonarakis SE Krause KH 2009 Three common polymorphisms inthe CYBA gene form a haplotype associated with decreased ROSgeneration Hum Mutat 30(7)1123ndash1133
Blekhman R Man O Herrmann L Boyko AR Indap A Kosiol CBustamante CD Teshima KM Przeworski M 2008 Natural selectionon genes that underlie human disease susceptibility Curr Biol18(12)883ndash889
Brandes RP Kreuzer J 2005 Vascular NADPH oxidases molecular mech-anisms of activation Cardiovasc Res 65(1)16ndash27
Buckley RH 2004 Pulmonary complications of primary immunodefi-ciencies Paediatr Respir Rev 5(Suppl A)S225ndashS233
Bustamante CD Fledel-Alon A Williamson S et al (13 co-authors)2005 Natural selection on protein-coding genes in the humangenome Nature 437(7062)1153ndash1157
Bustamante J Arias AA Vogt G et al (29 co-authors) 2011 GermlineCYBB mutations that selectively affect macrophages in kindredswith X-linked predisposition to tuberculous mycobacterial diseaseNat Immunol 12(3)213ndash221
Campbell MC Tishkoff SA 2008 African genetic diversity implicationsfor human demographic history modern human origins and com-plex disease mapping Annu Rev Genomics Hum Genet 9403ndash433
Chanock SJ Roesler J Zhan S Hopkins P Lee P Barrett DT ChristensenBL Curnutte JT Gorlach A 2000 Genomic structure of the humanp47-phox (NCF1) gene Blood Cells Mol Dis 26(1)37ndash46
Cunninghame Graham DS Morris DL Bhangale TR Criswell LASyvanen AC Ronnblom L Behrens TW Graham RR Vyse TJ2011 Association of NCF2 IKZF1 IRF8 IFIH1 and TYK2 with sys-temic lupus erythematosus PLoS Genet 7(10)e1002341
Excoffier L Laval G Schneider S 2007 Arlequin ver 30 an integratedsoftware package for population genetics data analysis EvolBioinform Online 147ndash50
2166
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Excoffier L Ray N 2008 Surfing during population expansions promotesgenetic revolutions and structuration Trends Ecol Evol23(7)347ndash351
Ferrer-Admetlla A Bosch E Sikora M Marques-Bonet T Ramırez-SorianoA Muntasell A Navarro A Lazarus R Calafell F Bertranpetit J Casals F2008 Balancing selection is the main force shaping the evolution ofinnate immunity genes J Immunol 181(2)1315ndash1322
Fu YX Li WH 1993 Statistical tests of neutrality of mutations Genetics133(3)693ndash709
The 1000 Genomes Project Consortium 2010 A map of humangenome variation from population-scale sequencing Nature467(7319)1061ndash1073
The 1000 Genomes Project Consortium Abecasis GR Auton A BrooksLD DePristo MA Durbin RM Handsaker RE Kang HM Marth GTMcVean GA 2012 An integrated map of genetic variation from1092 human genomes Nature 491(7422)56ndash65
Guo SW Thompson EA 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles Biometrics 48(2)361ndash372
Heyworth PG Cross AR Curnutte JT 2003 Chronic granulomatousdisease Curr Opin Immunol 15(5)578ndash584
Hudson RR 2002 Generating samples under a Wright-Fisher neutralmodel of genetic variation Bioinformatics 18(2)337ndash338
Jacob CO Eisenstein M Dinauer MC et al (21 co-authors) 2012 Lupus-associated causal mutation in neutrophil cytosolic factor 2 (NCF2)brings unique insights to the structure and function of NADPHoxidase Proc Natl Acad Sci U S A 109(2)E59ndashE67
Kimura M 1974 The neutral theory of molecular evolution CambridgeCambridge University Press
Kosiol C Vinar T da Fonseca RR Hubisz MJ Bustamante CD Nielsen RSiepel A 2008 Patterns of positive selection in six mammalian ge-nomes PLoS Genet 4(8)e1000144
Lacy F Kailasam MT OrsquoConnor DT Schmid-Schonbein GW Parmer RJ2000 Plasma hydrogen peroxide production in human essentialhypertension role of heredity gender and ethnicity Hypertension36(5)878ndash884
Laval G Patin E Barreiro LB Quintana-Murci L 2010 Formulating a his-torical and demographic model of recent human evolution based onresequencing data from noncoding regions PLoS One 5(4)e10284
Lewontin R 1972 The apportionment of human diversity Evol Biol 6391ndash398
Lindblad-Toh K Garber M Zuk O et al (88 co-authors) 2011 A high-resolution map of human evolutionary constraint using 29 mam-mals Nature 478(7370)476ndash482
Lodge R Diallo TO Descoteaux A 2006 Leishmania donovani lipopho-sphoglycan blocks NADPH oxidase assembly at the phagosomemembrane Cell Microbiol 8(12)1922ndash1931
Magalhaes WC Rodrigues MR Silva D Soares-Souza G Iannini MLCerqueira GC Faria-Campos AC Tarazona-Santos E 2012DIVERGENOME a bioinformatics platform to assist population ge-netics and genetic epidemiology studies Genet Epidemiol36(4)360ndash367
Marth JD Grewal PK 2008 Mammalian glycosylation in immunity NatRev Immunol 8(11)874ndash887
McDonald JH 1998 Improved tests for heterogeneity across a region ofDNA sequence in the ratio of polymorphism to divergence Mol BiolEvol 15377ndash384
McDonald JH Kreitman M 1991 Adaptive protein evolution at the Adhlocus in Drosophila Nature 351(6328)652ndash654
Nielsen R Bustamante C Clark AG et al (12 co-authors) 2005 A scanfor positively selected genes in the genomes of humans and chim-panzees PLoS Biol 3(6)e170
Olsson LM Lindqvist AK Kallberg H Padyukov L Burkhardt HAlfredsson L Klareskog L Holmdahl R 2007 A case-control studyof rheumatoid arthritis identifies an associated single nucleotidepolymorphism in the NCF4 gene supporting a role for the
NADPH-oxidase complex in autoimmunity Arthritis Res Ther9(5)R98
Packer BR Yeager M Burdett L et al (13 co-authors) 2006SNP500Cancer a public resource for sequence validation assay de-velopment and frequency analysis for genetic variation in candidategenes Nucleic Acids Res 34(Database issue)D617ndashD621
Peter BM Huerta-Sanchez E Nielsen R 2012 Distinguishing betweenselective sweeps from standing variation and from a de novo mu-tation PLoS Genet 8(10)e1003011
Przeworski M Coop G Wall JD 2005 The signature of positive selectionon standing genetic variation Evolution 59(11)2312ndash2323
Ptak SE Przeworski M 2002 Evidence for population growth in humansis confounded by fine-scale population structure Trends Genet18(11)559ndash563
Ramensky V Bork P Sunyaev S 2002 Human non-synonymous SNPsserver and survey Nucleic Acids Res 30(17)3894ndash3900
Rioux JD Xavier RJ Taylor KD et al (24 co-authors) 2007 Genome-wideassociation study identifies new susceptibility loci for Crohn diseaseand implicates autophagy in disease pathogenesis Nat Genet39(5)596ndash604
Roberts RL Hollis-Moffatt JE Gearry RB Kennedy MA Barclay MLMerriman TR 2008 Confirmation of association of IRGM andNCF4 with ileal Crohnrsquos disease in a population-based cohortGenes Immun 9(6)561ndash565
Rozas J 2009 DNA sequence polymorphism analysis using DnaSPMethods Mol Biol 537337ndash350
San Jose G Fortuno A Beloqui O Dıez J Zalba G 2008 NADPH oxidaseCYBA polymorphisms oxidative stress and cardiovascular diseasesClin Sci (Lond) 114(3)173ndash182
Santiago HC Gonzalez Lombana CZ Macedo JP et al (12 co-authors)2012 NADPH phagocyte oxidase knockout mice controlTrypanosoma cruzi proliferation but develop circulatory collapseand succumb to infection PLoS Negl Trop Dis 6(2)e1492
Stephens M Scheet P 2005 Accounting for decay of linkage disequilib-rium in haplotype inference and missing-data imputation Am JHum Genet 76(3)449ndash462
Sumimoto H Miyano K Takeya R 2005 Molecular composition andregulation of the Nox family NAD(P)H oxidases Biochem BiophysRes Commun 338(1)677ndash686
Tajima F 1983 Evolutionary relationship of DNA sequences in finitepopulations Genetics 105(2)437ndash460
Tajima F 1989 Statistical method for testing the neutral mutationhypothesis by DNA polymorphism Genetics 123(3)585ndash595
Tarazona-Santos E Bernig T Burdett L Magalhaes WC Fabbri C Liao JRedondo RA Welch R Yeager M Chanock SJ 2008 CYBB anNADPH-oxidase gene restricted diversity in humans and evidencefor differential long-term purifying selection on transmembrane andcytosolic domains Hum Mutat 29(5)623ndash632
Taylor RM Burritt JB Baniulis D Foubert TR Lord CI Dinauer MCParkos CA Jesaitis AJ 2004 Site-specific inhibitors of NADPH oxi-dase activity and structural probes of flavocytochrome b character-ization of six monoclonal antibodies to the p22phox subunitJ Immunol 173(12)7349ndash7357
Voight BF Kudaravalli S Wen X Pritchard JK 2006 A map of recentpositive selection in the human genome PLoS Biol 4(3)e72
Watterson GA 1975 On the number of segregating sites in geneticalmodels without recombination Theor Popul Biol 7(2) 256ndash276
Williamson SH Hernandez R Fledel-Alon A Zhu L Nielsen RBustamante CD 2005 Simultaneous inference of selection and pop-ulation growth from patterns of variation in the human genomeProc Natl Acad Sci U S A 102(22)7882ndash7887
Yang Z 2007a Adaptive molecular evolution In Balding DJ Bishop MCannings C editors Handbook of statistical genetics Vol 1 3rd edSusex (United Kingdom) John Wiley amp Sons p 377ndash406
Yang Z 2007b PAML 4 phylogenetic analysis by maximum likelihoodMol Biol Evol 241586ndash1591
2167
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
et al 2011) as well as for immune related diseases such asCrohnrsquos disease and lupus as identified in genome-wide as-sociation studies (GWAS) in European populations (Riouxet al 2007 Roberts et al 2008 Jacob et al 2012) Besidesthe phagocyte NADPH oxidase other NADPH oxidaseswith different functions are expressed in a variety ofnonphagocytic cells including the endothelium and havebeen implicated in cardiovascular and renal diseaseAlthough p22-phox (encoded by CYBA) is a protein compo-nent shared by several of these NADPH oxidases (also calledNox) other more specific protein subunits are encoded bydifferent Nox genes homologous to the genes coding for thephagocytic subunits (Sumimoto et al 2005 San Jose et al2008) Although these nonphagocytic NADPH oxidases nor-mally produce less O2 even small imbalances in ROS levelsmay cause tissue damage due to oxidative stress which iscorrelated with the pathogenesis of gout chronic obstructivepulmonary disease rheumatoid arthritis and cardiovasculardiseases (Brandes and Kreuzer 2005) Therefore variants inNADPH oxidase genes may have pleiotropic effects across aspectrum of disorders (Santiago et al 2012)
Despite the involvement of the NADPH oxidase in a rangeof clinically relevant phenotypes our knowledge of the se-quence diversity of NADPH genes mostly derives from CGDpatients Although targeted SNP genotyping has been per-formed in the context of association studies for CYBA(Bedard et al 2009) and NCF4 (Olsson et al 2007) none ofthe large-scale resequencing efforts such as Seattle SNPs(httppgagswashingtonedu last accessed July 16 2013)Innate Immunity PGA (httpwwwpharmgatorgIIPGA2index_html last accessed July 16 2013) and the CornellndashCelera initiative (Bustamante et al 2005) have included theNADPH oxidase genes and the coverage of these genes for thecurrent release of the 1000 Genomes Project remains low formost of the studied individuals (1000 Genomes ProjectConsortium et al 2012 average coverage and their standard
deviations on May 2013 are CYBB 40 plusmn 22 CYBA 35 plusmn 20NCF2 49 plusmn 27 and NCF4 49 plusmn 27) Although GWAS haveidentified common variants that contribute to complex phe-notypes a component of missing heritability of common dis-eases due to rare variants that are detectable only byresequencing is emerging In this study we analyzed the pat-tern of sequence diversity of four of the NADPH genes (CYBBCYBA NCF2 and NCF4) between mammalian species and inhuman populations by resequencing these genes in 102 eth-nically diverse individuals We interpreted our results in termsof evolutionary histories by addressing the action of naturalselection and focusing on two temporal scales mammalianevolution and recent human evolution We excluded NCF1from our study because its high homology with its pseudo-genes prevents reliable sequencing in individual samples(Chanock et al 2000) Several studies have shown the impor-tance of natural selection on the evolution of immunity genesat both the interspecific (Kosiol et al 2008) and populationlevels (Ferrer-Admetlla et al 2008 Barreiro et al 2009 Barreiroand Quintana-Murci 2010) By definition variants under nat-ural selection are associated with different reproductive effi-ciencies (fitness) of their carriers and contribute to phenotypevariability therefore they may be biomedically relevant byinfluencing the susceptibility to rare or common diseasesThe goals of this study are as follows 1) to determine whetherthe pattern of diversity of human phagocyte NADPH genesreflects the action of different types of natural selection 2) toelucidate the evolutionary dynamics of NADPH genes at thetemporal scales of mammals and humans and 3) to under-stand the biomedical implications of this evolutionary processin human populations
Results
Molecular Evolution of NADPH Genes alongMammalian Phylogeny
We examined signatures of natural selection across thecoding regions of NADPH genes by analyzing sequencesfrom the complete genomes of 29 mammals listed in theEntrez and Ensembl databases (Lindblad-Toh et al 2011one sequence for each species see supplementary materialSupplementary Material online for details) and comparingthe amount of nonsynonymous and synonymous substitu-tions (Nielsen et al 2005) When comparing a set of homol-ogous sequences from different species most of the observeddifferences are fixed that is the differences are monomorphicwithin a species because enough time has passed for theobserved variant to appear increase its frequency and reacha frequency of 1 (Kimura 1974) We compared the number offixed synonymous substitutions (dS assumed to be neutral)and fixed nonsynonymous substitutions (dN for which wetest the hypothesis of natural selection) between speciesusing the parameter = dNdS which is informative of theaction of natural selection at the inter-specific level (Yang2007a) Under neutral evolution of nonsynonymous substitu-tions these substitutions fix at the same rate as synonymoussubstitutions and therefore dN amp dS and amp 1 If nonsyn-onymous substitutions tend to be deleterious purifying
FIG 1 Components of the phagocyte NADPH oxidase Representationof the inactivated (left) and activated (right) forms of the phagocyteNADPH oxidase components reproduced from Heyworth et al (2003)The activated form is responsible for the respiratory burst The proteins(and genes) are gp91 (CYBB Xp211) p22 (CYBA 16q24) p67 (NCF21q25) p40 (NCF4 22q131) and p47 (NCF1 7q1123)
2158
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
selection maintains the substitutions at low frequencies andprevents fixation at the same rate as synonymous substitu-tions resulting in dNlt dS and lt 1 On the other hand ifepisodes of positive natural selection (that raise the frequencyof beneficial variants) are frequent nonsynonymous substi-tutions increase in frequency and fix more rapidly than neu-tral synonymous substitutions thus dNgt dS and gt 1 Weused the maximum likelihood framework developed by Yang(2007a) to estimate for the NADPH oxidase genes under avariety of evolutionary models as implemented in the PAMLsoftware (Yang 2007b) This approach allows inferences aboutthe evolution of a coding region along an interspecific phy-logeny and maps the codons that have evolved under strongweak purifying selection neutrality or adaptive positive se-lection (see supplementary material Supplementary Materialonline for details)
In general PAML evolutionary models that allow a combi-nation of purifying selection and neutrality are reasonably re-alistic These models are nested with respect to models thatalso incorporate positive selection at the cost of adding newparameters We evaluated the improvements in the goodnessof fit of the data using the latter model with respect to theformer models by applying a likelihood ratio test (LRT) Afterfitting the data to the most appropriate evolutionary modelNaive Empirical Bayes (NEB) or Bayes Empirical Bayes (BEB)approaches were used to infer theparameter for each codon
For the 29 species of mammals considered we filteredbased on quality control (supplementary table S1 Supple-mentary Material online) and analyzed 570 codons of CYBBin 26 species 198 codons of CYBA in 16 species 526 codons ofNCF2 in 23 species and 339 codons of NCF4 in 20 species (seesupplementary material Supplementary Material online forfurther details including the species and sequences used forthe analyses the parameter estimations for the differentmodels and the LRT results) Here we only present the resultsof model M3 of Yang (2007a) with three (K = 3) classes of (fig 2) This model allows for different classes including thepossibility of positive selection and is a reasonable way ofpresenting the results for the four genes under the samemodel Moreover in all cases the data fit better withmodel M3 than the nested and simpler M0 or M2 modelspresented by Yang (2007a)
The results of this analysis for CYBB CYBA NCF2 and NCF4are presented in figure 2 which shows the type of naturalselection (ie based on the estimated ) for each codon thatmost likely predominated during mammalian evolution Forthis temporal scale CYBA NCF2 and NCF4 coding regionshave evolved driven by a combination of different levels ofpurifying natural selection Overall the average and standarddeviation values for these genes are NCF2 = 0256 plusmn 0227NCF4 = 0126 plusmn 0140 and CYBA = 0109 plusmn 0116
Our most striking result is for CYBB which presents a widespectrum of mutations that account for gt70 of CGD pa-tients Although we would predict that purifying selection ongenes involved in Mendelian diseases (Blekhman et al 2008)would yield similar results for CYBB and other NADPHcomponents we observed a different pattern In generalCYBB is a conserved gene but 6 of its codons show
evidence of positive natural selection (supplementary tableS2 Supplementary Material online fig 2) In a genome-widesurvey performed by Kosiol et al (2008) they have reportedCYBB as a gene showing a signal of positive natural selectionMore importantly by evolutionary mapping we show herefor the first time that most of these positive selection eventsmap to the small extracellular portion of this protein (fig 3)The proximity of these inferred episodes of positive naturalselection to glycosylation sites in gp91 is noteworthy consid-ering the importance of the glycome in immunity (Marth andGrewal 2008)
Population Genetics of NADPH Genes
We sequenced CYBB CYBA NCF2 and NCF4 in a publiclyavailable panel that includes 24 individuals of African ances-try 31 Europeans 24 AsianOceanians and 23 admixed LatinAmericans (ie Hispanics) This panel is a suboptimal repre-sentation of the worldwide population a limitation that iscommon to most human genomic diversity projects focusedon SNP genotyping or resequencing efforts However basedon how human genetic diversity is apportioned within(gt85) and between (lt15) populations (Lewontin 19721000 Genomes Project Consortium 2010) even studies usingsuboptimal sampling are informative about the genetic struc-ture of human populations and serve to critically identify therole of evolutionary factors in human genetic diversity(Kimura 1974 Nielsen et al 2005 1000 Genomes ProjectConsortium 2010) All the raw results are available as supple-mentary material Supplementary Material online and at theSNP500Cancer project homepage (httpvariantgpsncinihgovcgfseqpagessnp500do last accessed July 16 2013) orcan be downloaded from the DIVERGENOME platform(Magalhaes et al 2012 httpwwwpggeneticaicbufmgbrdivergenome last accessed July 16 2013)
To ascertain which combination of evolutionary factorshas shaped the diversity of NADPH genes we assessed thepattern of nonsynonymous and synonymous polymor-phisms as well as intra- and interpopulation diversity forNADPH genes and tested the null hypothesis of neutralitythat patterns of diversity may be explained by consideringonly the demographic history of human populations and themutation and recombination rates of each locus
Nonsynonymous polymorphisms are underrepresentedin the human genome and usually occur at low frequencieswhen present reflecting the action of purifying natural selec-tion (1000 Genomes Project Consortium 2010) By resequen-cing Tarazona-Santos et al (2008) did not observe commonnonsynonymous polymorphisms for CYBB This result is con-sistent with purifying natural selection acting on X-chromo-some genes due to the exposure of deleterious recessivemutations to natural selection in hemizygous males Thussubstitutions in the coding region of CYBB should be rarein human populations and are seldom captured by studieswith small sample sizes Interestingly the lack of CYBBnonsynonymous polymorphisms in our sample of humanpopulations contrasts with the recurrent episodes of positiveselection of the extracellular portion of gp91 during
2159
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
mammalian evolution as inferred in this study For the auto-somal NADPH oxidase components (table 1 and haplotypetables online Supplementary Material online) we observe inthis study two rare and conservative nonsynonymous substi-tutions (ie involving amino acids with similar chemical prop-erties T85N and A304E) in NCF4 But for NCF2 and CYBA weobserved patterns of nonsynonymous substitutions thatseldom occur in human genes NCF2 has nine nonsynony-mous substitutions three of them are common (with a fre-quency higher than 5 in at least one of the studiedpopulation samples) and six are rare On the other handtwo nonsynonymous substitutions in CYBA are commonand ubiquitous in human populations namely Y72H(rs4673) and V174A (rs1049254 in a position where variationamong mammalian species is also observed) Moreover thefollowing two of the five common amino acid changes ob-served in the autosomal NADPH genes are predicted to bepossibly damaging (ie radical) by the Polyphen resource(Ramensky et al 2002) the two changes are R395W inHispanic NCF2 (rs13306575) and Y72H (rs4673) in CYBA Ingeneral Polyphen accurately predicts the effect of nonsynon-ymous substitutions based on biochemical and evolutionarydata (Williamson et al 2005) Notably the ImmunodeficiencyMutations Database (httpbioinfutafiNCF2basecontent=pubIDbases last accessed July 16 2013) reports one395W395W autosomal recessive CGD patient but we andthe HapMap project (wwwhapmaporg last accessed July 16
2013) observed the W allele at frequencies between 5 and10 in Asians and admixed Latin Americans including onesupposedly healthy 395W395W Japanese HapMap individ-ual On the basis of these results we verified whether NativeAmericans who descend from an ancestral Pleistocene Asianpopulations that peopled the Americas by the Behring Straitsmore than 14000 years ago may have relatively higher fre-quencies of this variant We genotyped the variant 395Wusing a Taqman assay in 558 Native Americans (see supple-mentary table S3 Supplementary Material online for detailedresults) and observed an allele frequency of 12 being thisvariant always present in heterozygous individuals
For the four studied genes the diversity and levels of re-combination are higher in Africans than in non-Africans(table 2 and haplotype tables available as supplementaryfiles Supplementary Material online) This result is a conse-quence of the African origin of modern humans and the ldquooutof Africardquo migration that occurred 40000ndash80000 years agoafter a bottleneck leading to the peopling of other continents(Campbell and Tishkoff 2008 Laval et al 2010) Therefore thefirst divergence between continental human populations wasbetween Africans and ancestral non-Africans Consistentlywith this scenario we observed the highest between-popula-tion differentiation for CYBB CYBA and NCF4 between thesetwo groups Interestingly NCF2 does not match this pattern(table 3)
FIG 2 Inferred types of natural selection for codons of the NADPH genes at the evolutionary time scale of mammals Codons are represented along thehorizontal axis For each gene three classes of sites (black dark gray and light gray) are considered and each class evolved under different inferred values (presented for each gene in the figure at the top of each graphic) These classes correspond to the model M3 of Yang (2007a) with three classes ofsites Given our data this model is more likely than alternative models of evolution that assume simpler scenarios such as a unique for the entire gene(see supplementary material Supplementary Material online for details regarding methods and results using alternative models) The three classescorrespond to different types and levels of natural selection from strong purifying selection (in the lightest gray) to positive selection (gt 1) For eachcodon the probability of belonging to each of the three classes of corresponds to the height of the corresponding color in the vertical bar Forexample codon 173 of CYBB (indicated by a white arrow) has a 0000 probability of belonging to the= 00095 class (light gray a class corresponding tostrong purifying selection) a 0169 probability of belonging to the = 03085 class (dark gray) and a 0831 probability of belonging to the = 18987class (black a class that suggests positive selection) In this case reasonable evidence of positive selection on this codon exists
2160
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
FIG
3
Nat
ural
sele
ctio
nm
app
ing
acro
ssC
YBB
(en
codi
ng
gp91
)al
ong
mam
mal
ian
evol
utio
na
sid
enti
fied
usin
gth
ePA
ML
met
hod
byY
ang
(200
7a)
The
top
olog
ies
ofgp
91an
dp2
2ar
ere
pro
duce
dfr
omT
aylo
ret
al(
2004
Cop
yrig
ht20
04T
heA
mer
ican
Ass
ocia
tion
ofIm
mun
olog
ists
In
cU
sed
wit
hp
erm
issi
on)
Dar
kgr
ayam
ino
acid
sha
veev
olve
dun
der
pos
itiv
ese
lect
ion
wit
hgt
80
pro
babi
lity
Mos
tof
thes
eam
ino
acid
sar
ein
the
extr
acel
lula
rp
orti
onof
the
pro
tein
The
upp
erp
art
ofth
efig
ure
show
sth
ep
rote
inal
ign
men
tfo
rn
ine
mam
mal
sof
the
gp91
regi
onin
dica
ted
byth
ebl
ack
ellip
seI
nth
isre
gion
ahi
ghle
velo
fam
ino
acid
varia
tion
isfo
und
betw
een
spec
ies
and
seve
ralc
odon
ssh
owgt
1In
this
alig
nm
ent
gray
vert
ical
bars
corr
esp
ond
tova
riab
leam
ino
acid
site
sT
hep
rote
inal
ign
men
tof
mam
mal
ssh
ows
the
follo
win
gsp
ecie
sH
om(H
omo
sapi
ens
sapi
ens)
Pan
(Pan
trog
lody
tes)
Mac
(Mac
aca
mul
atta
)M
us(M
usm
uscu
lus)
Rat
(Rat
tus
norv
egic
us)
Cri
(Cri
cetu
lus
gris
eus)
Het
(Het
eroc
epha
lus
glab
er)
Cav
(Cav
iapo
rcel
lus)
an
dO
ry(O
ryct
olag
uscu
nicu
lus)
EC
ext
race
lula
ren
viro
nm
ent
TM
tra
nsm
embr
ane
laye
ran
dIC
in
trac
ellu
lar
envi
ron
men
t
2161
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
From the four studied genes NCF4 which encodes theregulatory protein p40-phox shows a pattern of diversitythat is typical for a gene that has evolved under neutralityIn addition to the features described in the previous para-graph the allelic spectra of NCF4 in the studied populationsare consistent with a neutral model of evolution (tables 2ndash4)
Although CYBB presented the most interesting evolution-ary history at the interspecific level with repeated episodes ofpositive natural selection recent human evolutionary historyhas resulted in interesting patterns of variation for CYBA andNCF2 CYBA encodes p22-phox which is a transmembraneprotein shared by different NADPH oxidases In addition toharboring two common nonsynonymous polymorphisms(V174A is also variable among different species of mammals)CYBA is the most variable and most affected by recombina-tion among the NADPH oxidase genes (tables 1 and 2) Inparticular CYBA diversity is very high in Europe comparedwith 329 genes resequenced in a European sample (httppgagswashingtonedusummary_statshtml last accessed July16 2013) CYBA ranks 11th (ie the 97th percentile)Moreover there are contrasting proportions of the totalnumber of polymorphismsnumber of singletons betweenAfricans (a low proportion) and Europeans (a high propor-tion) the latter showing an excess of common polymor-phisms with respect to the neutral expectation (see the DFL
test in table 4 and supplementary table S4 SupplementaryMaterial online) This excess of common variants inEuropeans is also significant when we conservatively testedit against a scenario of human evolution that incorporates theldquoOut of Africardquo bottleneck (Laval et al 2010) and the observedlevel of recombination in Europeans (CYBA = 807 for the se-quenced region) Because demographic forces and recombi-nation levels alone do not explain the high CYBA diversity andits excess of common polymorphisms we suggest that bal-ancing natural selection (that acts by maintaining differentalleles at high frequency in a population) has contributed to
shape the diversity of CYBA at least in the European popu-lation Indeed figure 4a shows that the haplotype network ofCYBA for the European population is consistent with theaction of balancing natural selection (Bamshad andWooding 2003) showing two well-differentiated commonclades that explain the observed high diversity and theexcess of common CYBA variants A comparative genomicanalysis confirms this inference the ratio of polymorphismsto differences fixed between human and chimpanzee is nothomogeneous along the gene in the different human popu-lations (Mc Donald 1998 supplementary table S5Supplementary Material online) as would be expectedunder neutral evolution Our inference of balancing naturalselection is consistent with the fact that 25ndash30 of the var-iation in levels of ROS production can be attributed to geneticfactors (Lacy et al 2000) and that ROS levels are associatedwith CYBA variants (Bedard et al 2009)
p67 encoded by NCF2 is a necessary cytosolic NADPHcomponent for phagocyte ROS production Asians show ahighly differentiated NCF2 haplotype structure (see frequen-cies of haplotypes NCF2-D11 and NCF2-E10 in the haplotypetables online and in the network shown in fig 4b) and thehighest FST values are observed in pairwise comparisons be-tween Asians and non-Asian populations (in particular withEuropeans table 3) and not between Africans and non-African populations as is usually observed in the humangenome We confirmed these results by analyzing data forNCF2 from the HapMap Project (supplementary materialSupplementary Material online) Moreover a trend towardan excess of rare polymorphisms exists in Asians that is notobserved elsewhere (tables 1 2 and 4 DFL =1904FFL =1893) Although we cannot exclude that this patternof diversity is compatible with the null hypotheses of neutral-ity and with the tested demographic history of human pop-ulations inferred by Laval et al (2010 tables 2ndash4) we canspeculate and envisage four additional evolutionary scenarios
Table 1 Allele Frequencies of Nonsynonymous Polymorphisms in NADPH Oxidase Genes
Genes rs Minor Allele(Amino Acid)
PolyphenPrediction
African European Asian Hispanic
CYBA
Y72H rs4673 T (Y) Possibly damaging 046 032 017 022
V174A rs1049254 T (V) Benign 017 048 048 018
NCF2
K181R rs2274064 G (R) Benign 035 037 041 048
T279M rs13306581 T (T) Probably damaging 000 000 005 000
V297A rs35937854 C (A) Benign 004 000 000 000
T361S Chr1181799289 NCBI36hg18 T (S) mdash 000 000 002 000
H389Q rs17849502 A (Q) Benign 000 005 000 007
R395W rs13306575 T (W) Possibly damaging 000 000 000 007
N419I rs35012521 T (I) Probably damaging 000 002 004 000
P454S rs55761650 T (S) mdash 000 000 000 002
L487S Chr1181795862 NCBI36hg18 C (S) mdash 002 000 000 000
NCF4
T85N rs112306225 A (N) mdash 000 000 002 000
A304E rs5995361 A (E) Benign 004 000 000 000
2162
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
that may have contributed to shape the pattern of NCF2diversity in Asians 1) the observed trend is suggestive of aselective sweep on NCF2 standing variation a neutral orweakly deleterious existing variant becomes beneficial andrapidly increases in frequency (together with its associatedhaplotypes ie incomplete sweep) reducing the nucleotidediversity in the surrounding region and rendering other stand-ing substitutions rare During this process new rare substitu-tions appear in the expanding positively selected haplotype2) The pattern of diversity of NCF2 in Asia may result from anincomplete selective sweep acting on ARPC5 which is locatedapproximately 35 kb downstream of NCF2 In a genome-widescan for recent positive selection Voight et al (2006) identi-fied a strong signature of an incomplete sweep for ARPC5 inAsia (P = 0009) characterized by a higher than expected long-range linkage disequilibrium summarized by very high iHS
statistics (see Haplotter results for the HapMap II data athttphaplotteruchicagoedu last accessed July 16 2013)SNPs in NCF2 also presents high iHS statistics (P = 002) al-though values are lower than for ARPC5 3) The differentiatedpattern of diversity of NCF2 in Asia may also have been gen-erated without the action of natural selection during the firstcolonization of Asia by modern humans In a process of geo-graphic population expansion specifically in the front wave ofthe expansion some rare alleleshaplotypes (ie surfing al-leles) may become common by chance mimicking the pat-tern of diversity generated by a selective sweep (Excoffier andRay 2008) 4) The excess of rare variants may be an artifact ofpooling individuals from different populations (Ptak andPrzeworski 2002) Consistent with evolutionary scenarios 1ndash4 that produce similar patterns of diversity the haplotypenetwork of NCF2 for Eurasians (fig 4b) shows the following1) a large differentiation between Asians and Europeans thatis compatible with the high observed FST values and 2) a star-like shape associated with the haplotype NCF2-E10 that iscommon in Asia and rare elsewhere which is compatiblewith the excess of rare alleles in the Asian populations
DiscussionBy analyzing 29 mammalian genomes and four human pop-ulations we show in this study that natural selection hasacted in different ways over time to shape the pattern ofdiversity of the phagocyte NADPH oxidase genes At the tem-poral scale of the evolution of mammals we have inferredrecurrent episodes of positive selection acting on the extra-cellular portion of gp-91 that have been important to shapethe pattern of interspecific diversity of this gene Our in-terspecific analyses did not show a similar pattern of naturalselection in any of the other phagocyte NADPH oxidasegenes Even if current knowledge on the biology of NADPHdoes not allow us to interpret our results in terms of functionwe propose that the extracellular region of gp-91 is function-ally relevant Our results also imply that this region is highlydifferentiated among mammals at the protein level and thisvariability should be considered when mammals models areused to study the structure and function of phagocyteNADPH components
In the time scale of human evolution our analyses of theNADPH oxidase genes suggest that CYBA has been a target ofbalancing natural selection Because we do not have evidenceof population-specific variants that faced selective pressurethe inferred natural selection may have acted on a standingvariation in ancestral populations This implies that the selec-tive pressure began after the appearance of the variant andpossibly acted in a specific geographic region (Barret andSchluter 2008) The signatures of natural selection acting ona new mutation and on standing variation differ In the case ofselective sweeps episodes of natural selection on standingvariation are associated to a larger variance in the allelic spec-trum with respect to natural selection on a new mutationAlso selection on standing variation may produce an excess ofalleles at intermediate frequencies that is not associated withhigh nucleotide diversity (Przeworski et al 2005 Peter et al2012) This pattern contrasts with the effect of balancing
Table 2 Intrapopulation Diversity Indexes in the Studied Populationsfor the NADPH Oxidase Genes Obtained from Resequencing Dataa
African European Asian Hispanic
Number of chromosomes
CYBBb 42 52 34 36
CYBA 48 62 48 46
NCF2 48 62 48 46
NCF4 48 62 48 46
Segregating sitessingletons
CYBB 218 70 103 135
CYBA 6122 333 335 347
NCF2 4613 3311 2816 3714
NCF4 4512 267 191 308
Haplotype structureNumber of inferred haplotypesc
CYBB 14 5 7 12
CYBA 39 39 32 26
NCF2 38 32 18 31
NCF4 36 30 17 22
Haplotype diversity plusmn SD
CYBB 088 plusmn 003 034 plusmn 008 053 plusmn 010 070 plusmn 008
CYBA 098 plusmn 001 098 plusmn 001 097 plusmn 000 096 plusmn 002
NCF2 099 plusmn 001 096 plusmn 001 087 plusmn 004 097 plusmn 001
NCF4 098 plusmn 001 093 plusmn 002 093 plusmn 002 093 plusmn 002
Recombination parameter (q 103 per site)c
CYBB 008 lt001 lt001 lt002
CYBA 291 148 096 088
NCF2 052 038 010 058
NCF4 137 103 039 022
h estimatorsd
p 103 per site
CYBB 036 012 015 028
CYBA 190 163 151 157
NCF2 081 047 043 057
NCF4 101 076 064 084
hW 103 per site
CYBB 042 013 021 027
CYBA 231 118 125 13
NCF2 102 069 062 083
NCF4 101 070 054 087
aMost analyses were performed using software DnaSP (Rozas 2009)bData for CYBB (Xp211) are from Tarazona-Santos et al (2008)cHaplotypes and inferred using the method by Stephens and Scheet (2005) andthe software PHASEd Tajima (1983) W Watterson (1975)
2163
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
natural selection which produces an excess of common allelesassociated with high genetic diversity Thus the observed pat-tern of CYBA diversity in Europeans is not consistent with aselective sweep on a standing variation but it is consistent witha scenario of balancing selection acting on standing variation
If we consider for CYBA that heterozygote advantage maybe the mechanisms of balancing selection we can speculatethat the biological basis for this mechanism may be the fol-lowing considering that p22-phox is not exclusive of thephagocyte NADPH oxidase but it is also part of Nox com-plexes expressed in other tissues the dependence of ROSproduction on CYBA variants has to be finely regulated IfCYBA variants induce high levels of ROS these variants mayfavor a phagocyte-dependent efficient response to pathogensbut may damage other endothelial tissues Alternativelytissue oxidative damage does not occur if ROS productionis low but this response may be associated with a weaker
phagocyte respiratory burst against pathogens In this con-text heterozygote individuals with a CYBA-dependent inter-mediate level of ROS production may have been favored bynatural selection
Our results contribute to the discussion regarding the rel-evance of balancing selection in shaping the diversity ofinnate immunity genes (Ferrer-Admetlla et al 2008 Barreiroet al 2009) Ferrer-Admetlla et al (2008) have associated therecurrent signatures of balancing selection on inflammatorygenes with the need for fine regulation CYBA is the onlyphagocytic NADPH oxidase gene that also encodesnonphagocyte Nox components thus CYBA has a role inROS cell signaling a potentially dangerous process due toits capability to produce oxidative damage to tissues Geneswith these characteristics likely need even tighter regulationThe interplay between pathogen-driven selective pressureon innate immunity genes and their concomitantnonimmunological functions is complex In addition toCYBA other interesting examples of this interplay can befound among the 10 human toll-like receptors (TLRs) thatshow a variety of signatures of natural selection TLR8 whichshows the strongest signature of purifying selection is alsoinvolved in neuronal development (Barreiro et al 2009) and itis difficult to discriminate the role of each function in deter-mining the observed signature of natural selection
With few exceptions the pathogens responsible for naturalselection on immune genes are difficult to specify In the caseof NADPH oxidase we can infer based on the spectrum ofinfections in CGD patients that catalase-positive bacteria andfungi such as Staphylococci Salmonella Candida Aspergillusand M tuberculosis may be the selective pathogensInteractions between the host and pathogens also includemechanisms of the latter to impair the respiratory burst ofthe former For example Leishmania donovani blocks the as-sembly of NADPH oxidase at the phagosome membrane(Lodge et al 2006) These mechanisms may constitute selec-tive pressures imposed by pathogens
The associations reported in GWAS between rs4821544 inNCF4 and Crohnrsquos disease (an idiopathic inflammatory boweldisease that predominantly involves the ileum and colonRioux et al 2007) and between rs10911363 in NCF2 and
Table 3 Pairwise FST Genetic Distances between Populations
CYBBa CYBA
Africa Europe Asia Hispanic Africa Europe Asia Hispanic
Africa mdash 0316 0264 0092 mdash 0074 0083 0065
Europe 0257 mdash 0000 0107 mdash 0002 0056
Asia 0211 0000 mdash 0073 mdash 0054
Hispanic 0070 0082 0056 mdash mdash
NCF2 NCF4
Africa mdash 0048 0058 0037 mdash 0128 0136 0158
Europe mdash 0069 0000 mdash 0026 0026
Asia mdash 0059 mdash 0005
Hispanic mdash mdash
aFor CYBB FST estimators are above the diagonal Below the diagonal are the FST values corrected as if the effective population sizes of X chromosome genes were equal toautosomal ones
Table 4 Results of Neutrality Tests for the NADPH Oxidase Genesand Their Significancea
African European Asian Hispanic
Tajimarsquos D
CYBB 0473 0274 0813 0084
CYBA 0580 1242 0684 0707
NCF2 0412 0939 0883 0987
NCF4 0750 0243 0556 0102
Fu and Lirsquos D
CYBB 1050 1110 0395 0977
CYBA 1485 1734 1188 0138
NCF2 0407 1105 1904 1398
NCF4 0308 0443 1893 0266
Fu and Lirsquos F
CYBB 0980 0811 0114 0814
CYBA 1382 1893 1205 0405
NCF2 0512 1252 1893 1567
NCF4 0557 0231 1225 0248
NOTEmdashUnderlined values represent significant results under the demographic modelinferred by Laval et al (2010) for human populations See details in supplementarytable S4 Supplementary Material onlineaThe McDonaldndashKreitman test is nonsignificant in any of the cases
Significant under the WrightndashFisher model of constant population size
2164
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
systemic lupus erythematosus (Cunninghame Graham et al2011) confirm the involvement of NADPH genes in the path-ogenesis of inflammatory-related common diseases Ourclaim that natural selection acted on CYBA (and maybe inNCF2) is relevant for biomedical studies because combiningevidence of natural selection with association analyses inimmune genes increases the statistical power to detect dis-ease-associated variants (Ayodo et al 2007) As a new gener-ation of association studies focusing on rare variation isemerging the combination of genes deemed interestingfrom GWAS and populations with an excess of rare variantsin these genes such as NCF2 in Asians are particularly inter-esting as a source of rare variants with clinical relevanceFinally by determining through molecular evolution mappingthat the extracellular portion of gp91 (encoded by CYBB inthe X-chromosome) has been subject to recurrent episodes ofpositive selection at the scale of mammals evolution we positthe hypothesis that this portion of the NADPH oxidase isrelevant for currently unknown biological processes thatonce revealed by structural and functional investigationswill contribute to understanding the role of NADPH oxidasein infectious autoimmune and cardiovascular diseases
Materials and Methods
Molecular Evolution Analysis of NADPH Genes
We used the maximum likelihood framework developed byYang (2007a) to estimate for the NADPH oxidase genes
under a variety of evolutionary models as implemented in thePAML software (Yang 2007b) This approach allows infer-ences about the evolution of a coding region along aninter-specific phylogeny mapping the codons that haveevolved under strongweak purifying selection neutrality oradaptive positive selection Further details about these anal-yses are available as supplementary material SupplementaryMaterial online
Human Population Genetics of NADPH Genes
For human population genetics analyses we conducted bidi-rectional Sanger sequencing of CYBB CYBA NCF2 and NCF4for a total of 35242 bp for each of 102 healthy individuals aspart of the SNP500 Cancer project (Packer et al 2006 seesupplementary fig S1 Supplementary Material online for de-tails) Human population resequencing data for CYBB werepublished in Tarazona-Santos et al (2008) The resequencingexperiments were performed as in Packer et al (2006) These102 unrelated individuals include the following 24 of Africanancestry (15 African Americans from the United States and 9Pygmies) 23 admixed Latin Americans (from Mexico PuertoRico and South America) 31 Europeans (from the CEPHUTAH pedigree and the NIEHS Environmental GenomeProject) and 24 AsiansndashOceanians (from Melanesia PakistanChina Cambodia Japan and Taiwan)
After controlling for multiple tests we confirmed that allSNPs fit the HardyndashWeinberg equilibrium by the Guo and
CYBA - A16
CYBA - A3
Y72H - rs4673
V174A - rs1049254 CYBA - E18
(a)
NCF2 - E10 H389Q - rs17849502
NCF2 - B3
NCF2 - D11
NCF2 - C1K181R - rs2274064(b)
FIG 4 Phylogenetic networks of (a) CYBA in Europeans and (b) NCF2 in Europeans (black) and Asians (yellow) The lengths of the branches areproportional to the number of mutations Only nonsynonymous mutations are shown The haplotype names correspond to the table of inferredhaplotypes in the supplementary material Supplementary Material online We only show the names of haplotypes with a frequency gt5 Ancestralhaplotypes (inferred as being the human haplotype or median vector most similar to the chimpanzee sequence) are indicated by an arrow Medianvectors are in red
2165
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Thompson (1992) test which was implemented in the soft-ware Arlequin 30 (Excoffier et al 2007) Insertion-deletions(INDELs) were excluded from population genetics analysesTo assess intrapopulation diversity we used two statistics the per-site mean number of pairwise differences betweensequences (Tajima 1983) and w based on the number ofsegregating sites (S) (Watterson 1975) We measured pairwisebetween-populations diversity by using the FST statistics cal-culated using the software DnaSP (Rozas 2009)
Haplotypes and the recombination parameter were in-ferred using the PHASE software (Stephens and Scheet 2005)and diversity indexes calculations (tables 2 and 3) as well asneutrality statistics (table 4) were estimated using DnaSP soft-ware We applied two kinds of neutrality tests 1) tests basedon the allelic spectrum which is the distribution of polymor-phisms across different classes of frequencies namelyTajimarsquos DT (Tajima 1989) and FundashLirsquos DFL and FFL (Fu andLi 1993) and 2) tests based on comparisons between thenumber of polymorphisms in human populations and fixeddifferences with the chimpanzee (ie outgroup) namely theMcDonald and Kreitman (1991) test and the adaptedKolmogorovndashSmirnoff test by McDonald (1998) For thefirst set of tests we used as null hypotheses both the classicWrightndashFisher model of neutrality with a constant popula-tion size as well as the more realistic evolutionary scenario forhuman populations inferred by Laval et al (2010) In the caseof the scenario of Laval et al (2010) we ignored interconti-nental gene flow within the Old World because these raregene flow events likely does not affect the level of significanceof the neutrality tests given its very low inferred values(13 105) Null distributions used to test the significanceof the neutrality tests under these evolutionary scenarios weregenerated using coalescent simulations and a significancelevel of 005 (Hudson 2002) The KolmogorovndashSmirnoff testof neutrality adapted by McDonald (1998) was performedusing Slider software available at httpudeledu~mcdonaldaboutdnasliderhtml (last accessed July 16 2013) Weperformed coalescent simulations using ms software(Hudson 2002) Further methodological details are availableas supplementary material Supplementary Material onlineWe constructed the CYBA and NCF2 networks using allSNP variants and applying the Median joining algorithmand the maximum parsimony option calculations as imple-mented in the software Network 46 (Bandelt et al 1999)
Supplementary MaterialSupplementary tables S1ndashS5 and figure S1 are available atMolecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)
Acknowledgments
SJC conceived the study ET-S MM FL RC AC CF LPand BS generated the dataanalyzed the sequences LB ACWCSM and MY provided tools for data generationand analyses ET-S MM FL WCSM and RR performedpopulation genetics analyses ET-S and MM prepared tablesand figures ET-S and SJC wrote the manuscript All the
authors contributed with discussions about the results Theauthors are grateful to Silvia Fuselli for discussions of ourresults to Fernando Levi Soares for technical assistance withthe figures and the Sequencing Group of the CoreGenotyping Facility (National Cancer Institute) for their tech-nical assistance This work was supported by the IntramuralResearch Program of the National Cancer Institute (NCI) andNational Institutes of Health (NIH) CF was supported by theUniversity of Bologna ET-S MM WCSM and FL weresupported by the Fogarty International Center and NationalCancer Institute (1R01TW007894) Conselho Nacional deDesenvolvimento Cientıfico e Tecnologico (Brazil Grant3075562012-3) Fundacao de Amparo a Pesquisa de MinasGerais (Brazil CBB PPM 0026512) and the Brazilian Ministryof Education (CAPES Agency grant PNPD 0256709-1)
ReferencesAyodo G Price AL Keinan A Ajwang A Otieno MF Orago AS
Patterson N Reich D 2007 Combining evidence of natural selectionwith association analysis increases power to detect malaria-resis-tance variants Am J Hum Genet 81(2)234ndash242
Bamshad M Wooding SP 2003 Signatures of natural selection in thehuman genome Nat Rev Genet 4(2)99ndash111
Bandelt H-J Forster P Rohl A 1999 Median-joining networks for infer-ring intraspecific phylogenies Mol Biol Evol 16(1)37ndash48
Barreiro LB Ben-Ali M Quach H et al (17 co-authors) 2009Evolutionary dynamics of human Toll-like receptors and their dif-ferent contributions to host defense PLoS Genet 5(7)e1000562
Barreiro LB Quintana-Murci L 2010 From evolutionary genetics tohuman immunology how selection shapes host defence genesNat Rev Genet 11(1)17ndash30
Barrett RD Schluter D 2008 Adaptation from standing genetic varia-tion Trends Ecol Evol 23(1)38ndash44
Bedard K Attar H Bonnefont J Jaquet V Borel C Plastre O Stasia MJAntonarakis SE Krause KH 2009 Three common polymorphisms inthe CYBA gene form a haplotype associated with decreased ROSgeneration Hum Mutat 30(7)1123ndash1133
Blekhman R Man O Herrmann L Boyko AR Indap A Kosiol CBustamante CD Teshima KM Przeworski M 2008 Natural selectionon genes that underlie human disease susceptibility Curr Biol18(12)883ndash889
Brandes RP Kreuzer J 2005 Vascular NADPH oxidases molecular mech-anisms of activation Cardiovasc Res 65(1)16ndash27
Buckley RH 2004 Pulmonary complications of primary immunodefi-ciencies Paediatr Respir Rev 5(Suppl A)S225ndashS233
Bustamante CD Fledel-Alon A Williamson S et al (13 co-authors)2005 Natural selection on protein-coding genes in the humangenome Nature 437(7062)1153ndash1157
Bustamante J Arias AA Vogt G et al (29 co-authors) 2011 GermlineCYBB mutations that selectively affect macrophages in kindredswith X-linked predisposition to tuberculous mycobacterial diseaseNat Immunol 12(3)213ndash221
Campbell MC Tishkoff SA 2008 African genetic diversity implicationsfor human demographic history modern human origins and com-plex disease mapping Annu Rev Genomics Hum Genet 9403ndash433
Chanock SJ Roesler J Zhan S Hopkins P Lee P Barrett DT ChristensenBL Curnutte JT Gorlach A 2000 Genomic structure of the humanp47-phox (NCF1) gene Blood Cells Mol Dis 26(1)37ndash46
Cunninghame Graham DS Morris DL Bhangale TR Criswell LASyvanen AC Ronnblom L Behrens TW Graham RR Vyse TJ2011 Association of NCF2 IKZF1 IRF8 IFIH1 and TYK2 with sys-temic lupus erythematosus PLoS Genet 7(10)e1002341
Excoffier L Laval G Schneider S 2007 Arlequin ver 30 an integratedsoftware package for population genetics data analysis EvolBioinform Online 147ndash50
2166
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Excoffier L Ray N 2008 Surfing during population expansions promotesgenetic revolutions and structuration Trends Ecol Evol23(7)347ndash351
Ferrer-Admetlla A Bosch E Sikora M Marques-Bonet T Ramırez-SorianoA Muntasell A Navarro A Lazarus R Calafell F Bertranpetit J Casals F2008 Balancing selection is the main force shaping the evolution ofinnate immunity genes J Immunol 181(2)1315ndash1322
Fu YX Li WH 1993 Statistical tests of neutrality of mutations Genetics133(3)693ndash709
The 1000 Genomes Project Consortium 2010 A map of humangenome variation from population-scale sequencing Nature467(7319)1061ndash1073
The 1000 Genomes Project Consortium Abecasis GR Auton A BrooksLD DePristo MA Durbin RM Handsaker RE Kang HM Marth GTMcVean GA 2012 An integrated map of genetic variation from1092 human genomes Nature 491(7422)56ndash65
Guo SW Thompson EA 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles Biometrics 48(2)361ndash372
Heyworth PG Cross AR Curnutte JT 2003 Chronic granulomatousdisease Curr Opin Immunol 15(5)578ndash584
Hudson RR 2002 Generating samples under a Wright-Fisher neutralmodel of genetic variation Bioinformatics 18(2)337ndash338
Jacob CO Eisenstein M Dinauer MC et al (21 co-authors) 2012 Lupus-associated causal mutation in neutrophil cytosolic factor 2 (NCF2)brings unique insights to the structure and function of NADPHoxidase Proc Natl Acad Sci U S A 109(2)E59ndashE67
Kimura M 1974 The neutral theory of molecular evolution CambridgeCambridge University Press
Kosiol C Vinar T da Fonseca RR Hubisz MJ Bustamante CD Nielsen RSiepel A 2008 Patterns of positive selection in six mammalian ge-nomes PLoS Genet 4(8)e1000144
Lacy F Kailasam MT OrsquoConnor DT Schmid-Schonbein GW Parmer RJ2000 Plasma hydrogen peroxide production in human essentialhypertension role of heredity gender and ethnicity Hypertension36(5)878ndash884
Laval G Patin E Barreiro LB Quintana-Murci L 2010 Formulating a his-torical and demographic model of recent human evolution based onresequencing data from noncoding regions PLoS One 5(4)e10284
Lewontin R 1972 The apportionment of human diversity Evol Biol 6391ndash398
Lindblad-Toh K Garber M Zuk O et al (88 co-authors) 2011 A high-resolution map of human evolutionary constraint using 29 mam-mals Nature 478(7370)476ndash482
Lodge R Diallo TO Descoteaux A 2006 Leishmania donovani lipopho-sphoglycan blocks NADPH oxidase assembly at the phagosomemembrane Cell Microbiol 8(12)1922ndash1931
Magalhaes WC Rodrigues MR Silva D Soares-Souza G Iannini MLCerqueira GC Faria-Campos AC Tarazona-Santos E 2012DIVERGENOME a bioinformatics platform to assist population ge-netics and genetic epidemiology studies Genet Epidemiol36(4)360ndash367
Marth JD Grewal PK 2008 Mammalian glycosylation in immunity NatRev Immunol 8(11)874ndash887
McDonald JH 1998 Improved tests for heterogeneity across a region ofDNA sequence in the ratio of polymorphism to divergence Mol BiolEvol 15377ndash384
McDonald JH Kreitman M 1991 Adaptive protein evolution at the Adhlocus in Drosophila Nature 351(6328)652ndash654
Nielsen R Bustamante C Clark AG et al (12 co-authors) 2005 A scanfor positively selected genes in the genomes of humans and chim-panzees PLoS Biol 3(6)e170
Olsson LM Lindqvist AK Kallberg H Padyukov L Burkhardt HAlfredsson L Klareskog L Holmdahl R 2007 A case-control studyof rheumatoid arthritis identifies an associated single nucleotidepolymorphism in the NCF4 gene supporting a role for the
NADPH-oxidase complex in autoimmunity Arthritis Res Ther9(5)R98
Packer BR Yeager M Burdett L et al (13 co-authors) 2006SNP500Cancer a public resource for sequence validation assay de-velopment and frequency analysis for genetic variation in candidategenes Nucleic Acids Res 34(Database issue)D617ndashD621
Peter BM Huerta-Sanchez E Nielsen R 2012 Distinguishing betweenselective sweeps from standing variation and from a de novo mu-tation PLoS Genet 8(10)e1003011
Przeworski M Coop G Wall JD 2005 The signature of positive selectionon standing genetic variation Evolution 59(11)2312ndash2323
Ptak SE Przeworski M 2002 Evidence for population growth in humansis confounded by fine-scale population structure Trends Genet18(11)559ndash563
Ramensky V Bork P Sunyaev S 2002 Human non-synonymous SNPsserver and survey Nucleic Acids Res 30(17)3894ndash3900
Rioux JD Xavier RJ Taylor KD et al (24 co-authors) 2007 Genome-wideassociation study identifies new susceptibility loci for Crohn diseaseand implicates autophagy in disease pathogenesis Nat Genet39(5)596ndash604
Roberts RL Hollis-Moffatt JE Gearry RB Kennedy MA Barclay MLMerriman TR 2008 Confirmation of association of IRGM andNCF4 with ileal Crohnrsquos disease in a population-based cohortGenes Immun 9(6)561ndash565
Rozas J 2009 DNA sequence polymorphism analysis using DnaSPMethods Mol Biol 537337ndash350
San Jose G Fortuno A Beloqui O Dıez J Zalba G 2008 NADPH oxidaseCYBA polymorphisms oxidative stress and cardiovascular diseasesClin Sci (Lond) 114(3)173ndash182
Santiago HC Gonzalez Lombana CZ Macedo JP et al (12 co-authors)2012 NADPH phagocyte oxidase knockout mice controlTrypanosoma cruzi proliferation but develop circulatory collapseand succumb to infection PLoS Negl Trop Dis 6(2)e1492
Stephens M Scheet P 2005 Accounting for decay of linkage disequilib-rium in haplotype inference and missing-data imputation Am JHum Genet 76(3)449ndash462
Sumimoto H Miyano K Takeya R 2005 Molecular composition andregulation of the Nox family NAD(P)H oxidases Biochem BiophysRes Commun 338(1)677ndash686
Tajima F 1983 Evolutionary relationship of DNA sequences in finitepopulations Genetics 105(2)437ndash460
Tajima F 1989 Statistical method for testing the neutral mutationhypothesis by DNA polymorphism Genetics 123(3)585ndash595
Tarazona-Santos E Bernig T Burdett L Magalhaes WC Fabbri C Liao JRedondo RA Welch R Yeager M Chanock SJ 2008 CYBB anNADPH-oxidase gene restricted diversity in humans and evidencefor differential long-term purifying selection on transmembrane andcytosolic domains Hum Mutat 29(5)623ndash632
Taylor RM Burritt JB Baniulis D Foubert TR Lord CI Dinauer MCParkos CA Jesaitis AJ 2004 Site-specific inhibitors of NADPH oxi-dase activity and structural probes of flavocytochrome b character-ization of six monoclonal antibodies to the p22phox subunitJ Immunol 173(12)7349ndash7357
Voight BF Kudaravalli S Wen X Pritchard JK 2006 A map of recentpositive selection in the human genome PLoS Biol 4(3)e72
Watterson GA 1975 On the number of segregating sites in geneticalmodels without recombination Theor Popul Biol 7(2) 256ndash276
Williamson SH Hernandez R Fledel-Alon A Zhu L Nielsen RBustamante CD 2005 Simultaneous inference of selection and pop-ulation growth from patterns of variation in the human genomeProc Natl Acad Sci U S A 102(22)7882ndash7887
Yang Z 2007a Adaptive molecular evolution In Balding DJ Bishop MCannings C editors Handbook of statistical genetics Vol 1 3rd edSusex (United Kingdom) John Wiley amp Sons p 377ndash406
Yang Z 2007b PAML 4 phylogenetic analysis by maximum likelihoodMol Biol Evol 241586ndash1591
2167
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
selection maintains the substitutions at low frequencies andprevents fixation at the same rate as synonymous substitu-tions resulting in dNlt dS and lt 1 On the other hand ifepisodes of positive natural selection (that raise the frequencyof beneficial variants) are frequent nonsynonymous substi-tutions increase in frequency and fix more rapidly than neu-tral synonymous substitutions thus dNgt dS and gt 1 Weused the maximum likelihood framework developed by Yang(2007a) to estimate for the NADPH oxidase genes under avariety of evolutionary models as implemented in the PAMLsoftware (Yang 2007b) This approach allows inferences aboutthe evolution of a coding region along an interspecific phy-logeny and maps the codons that have evolved under strongweak purifying selection neutrality or adaptive positive se-lection (see supplementary material Supplementary Materialonline for details)
In general PAML evolutionary models that allow a combi-nation of purifying selection and neutrality are reasonably re-alistic These models are nested with respect to models thatalso incorporate positive selection at the cost of adding newparameters We evaluated the improvements in the goodnessof fit of the data using the latter model with respect to theformer models by applying a likelihood ratio test (LRT) Afterfitting the data to the most appropriate evolutionary modelNaive Empirical Bayes (NEB) or Bayes Empirical Bayes (BEB)approaches were used to infer theparameter for each codon
For the 29 species of mammals considered we filteredbased on quality control (supplementary table S1 Supple-mentary Material online) and analyzed 570 codons of CYBBin 26 species 198 codons of CYBA in 16 species 526 codons ofNCF2 in 23 species and 339 codons of NCF4 in 20 species (seesupplementary material Supplementary Material online forfurther details including the species and sequences used forthe analyses the parameter estimations for the differentmodels and the LRT results) Here we only present the resultsof model M3 of Yang (2007a) with three (K = 3) classes of (fig 2) This model allows for different classes including thepossibility of positive selection and is a reasonable way ofpresenting the results for the four genes under the samemodel Moreover in all cases the data fit better withmodel M3 than the nested and simpler M0 or M2 modelspresented by Yang (2007a)
The results of this analysis for CYBB CYBA NCF2 and NCF4are presented in figure 2 which shows the type of naturalselection (ie based on the estimated ) for each codon thatmost likely predominated during mammalian evolution Forthis temporal scale CYBA NCF2 and NCF4 coding regionshave evolved driven by a combination of different levels ofpurifying natural selection Overall the average and standarddeviation values for these genes are NCF2 = 0256 plusmn 0227NCF4 = 0126 plusmn 0140 and CYBA = 0109 plusmn 0116
Our most striking result is for CYBB which presents a widespectrum of mutations that account for gt70 of CGD pa-tients Although we would predict that purifying selection ongenes involved in Mendelian diseases (Blekhman et al 2008)would yield similar results for CYBB and other NADPHcomponents we observed a different pattern In generalCYBB is a conserved gene but 6 of its codons show
evidence of positive natural selection (supplementary tableS2 Supplementary Material online fig 2) In a genome-widesurvey performed by Kosiol et al (2008) they have reportedCYBB as a gene showing a signal of positive natural selectionMore importantly by evolutionary mapping we show herefor the first time that most of these positive selection eventsmap to the small extracellular portion of this protein (fig 3)The proximity of these inferred episodes of positive naturalselection to glycosylation sites in gp91 is noteworthy consid-ering the importance of the glycome in immunity (Marth andGrewal 2008)
Population Genetics of NADPH Genes
We sequenced CYBB CYBA NCF2 and NCF4 in a publiclyavailable panel that includes 24 individuals of African ances-try 31 Europeans 24 AsianOceanians and 23 admixed LatinAmericans (ie Hispanics) This panel is a suboptimal repre-sentation of the worldwide population a limitation that iscommon to most human genomic diversity projects focusedon SNP genotyping or resequencing efforts However basedon how human genetic diversity is apportioned within(gt85) and between (lt15) populations (Lewontin 19721000 Genomes Project Consortium 2010) even studies usingsuboptimal sampling are informative about the genetic struc-ture of human populations and serve to critically identify therole of evolutionary factors in human genetic diversity(Kimura 1974 Nielsen et al 2005 1000 Genomes ProjectConsortium 2010) All the raw results are available as supple-mentary material Supplementary Material online and at theSNP500Cancer project homepage (httpvariantgpsncinihgovcgfseqpagessnp500do last accessed July 16 2013) orcan be downloaded from the DIVERGENOME platform(Magalhaes et al 2012 httpwwwpggeneticaicbufmgbrdivergenome last accessed July 16 2013)
To ascertain which combination of evolutionary factorshas shaped the diversity of NADPH genes we assessed thepattern of nonsynonymous and synonymous polymor-phisms as well as intra- and interpopulation diversity forNADPH genes and tested the null hypothesis of neutralitythat patterns of diversity may be explained by consideringonly the demographic history of human populations and themutation and recombination rates of each locus
Nonsynonymous polymorphisms are underrepresentedin the human genome and usually occur at low frequencieswhen present reflecting the action of purifying natural selec-tion (1000 Genomes Project Consortium 2010) By resequen-cing Tarazona-Santos et al (2008) did not observe commonnonsynonymous polymorphisms for CYBB This result is con-sistent with purifying natural selection acting on X-chromo-some genes due to the exposure of deleterious recessivemutations to natural selection in hemizygous males Thussubstitutions in the coding region of CYBB should be rarein human populations and are seldom captured by studieswith small sample sizes Interestingly the lack of CYBBnonsynonymous polymorphisms in our sample of humanpopulations contrasts with the recurrent episodes of positiveselection of the extracellular portion of gp91 during
2159
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
mammalian evolution as inferred in this study For the auto-somal NADPH oxidase components (table 1 and haplotypetables online Supplementary Material online) we observe inthis study two rare and conservative nonsynonymous substi-tutions (ie involving amino acids with similar chemical prop-erties T85N and A304E) in NCF4 But for NCF2 and CYBA weobserved patterns of nonsynonymous substitutions thatseldom occur in human genes NCF2 has nine nonsynony-mous substitutions three of them are common (with a fre-quency higher than 5 in at least one of the studiedpopulation samples) and six are rare On the other handtwo nonsynonymous substitutions in CYBA are commonand ubiquitous in human populations namely Y72H(rs4673) and V174A (rs1049254 in a position where variationamong mammalian species is also observed) Moreover thefollowing two of the five common amino acid changes ob-served in the autosomal NADPH genes are predicted to bepossibly damaging (ie radical) by the Polyphen resource(Ramensky et al 2002) the two changes are R395W inHispanic NCF2 (rs13306575) and Y72H (rs4673) in CYBA Ingeneral Polyphen accurately predicts the effect of nonsynon-ymous substitutions based on biochemical and evolutionarydata (Williamson et al 2005) Notably the ImmunodeficiencyMutations Database (httpbioinfutafiNCF2basecontent=pubIDbases last accessed July 16 2013) reports one395W395W autosomal recessive CGD patient but we andthe HapMap project (wwwhapmaporg last accessed July 16
2013) observed the W allele at frequencies between 5 and10 in Asians and admixed Latin Americans including onesupposedly healthy 395W395W Japanese HapMap individ-ual On the basis of these results we verified whether NativeAmericans who descend from an ancestral Pleistocene Asianpopulations that peopled the Americas by the Behring Straitsmore than 14000 years ago may have relatively higher fre-quencies of this variant We genotyped the variant 395Wusing a Taqman assay in 558 Native Americans (see supple-mentary table S3 Supplementary Material online for detailedresults) and observed an allele frequency of 12 being thisvariant always present in heterozygous individuals
For the four studied genes the diversity and levels of re-combination are higher in Africans than in non-Africans(table 2 and haplotype tables available as supplementaryfiles Supplementary Material online) This result is a conse-quence of the African origin of modern humans and the ldquooutof Africardquo migration that occurred 40000ndash80000 years agoafter a bottleneck leading to the peopling of other continents(Campbell and Tishkoff 2008 Laval et al 2010) Therefore thefirst divergence between continental human populations wasbetween Africans and ancestral non-Africans Consistentlywith this scenario we observed the highest between-popula-tion differentiation for CYBB CYBA and NCF4 between thesetwo groups Interestingly NCF2 does not match this pattern(table 3)
FIG 2 Inferred types of natural selection for codons of the NADPH genes at the evolutionary time scale of mammals Codons are represented along thehorizontal axis For each gene three classes of sites (black dark gray and light gray) are considered and each class evolved under different inferred values (presented for each gene in the figure at the top of each graphic) These classes correspond to the model M3 of Yang (2007a) with three classes ofsites Given our data this model is more likely than alternative models of evolution that assume simpler scenarios such as a unique for the entire gene(see supplementary material Supplementary Material online for details regarding methods and results using alternative models) The three classescorrespond to different types and levels of natural selection from strong purifying selection (in the lightest gray) to positive selection (gt 1) For eachcodon the probability of belonging to each of the three classes of corresponds to the height of the corresponding color in the vertical bar Forexample codon 173 of CYBB (indicated by a white arrow) has a 0000 probability of belonging to the= 00095 class (light gray a class corresponding tostrong purifying selection) a 0169 probability of belonging to the = 03085 class (dark gray) and a 0831 probability of belonging to the = 18987class (black a class that suggests positive selection) In this case reasonable evidence of positive selection on this codon exists
2160
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
FIG
3
Nat
ural
sele
ctio
nm
app
ing
acro
ssC
YBB
(en
codi
ng
gp91
)al
ong
mam
mal
ian
evol
utio
na
sid
enti
fied
usin
gth
ePA
ML
met
hod
byY
ang
(200
7a)
The
top
olog
ies
ofgp
91an
dp2
2ar
ere
pro
duce
dfr
omT
aylo
ret
al(
2004
Cop
yrig
ht20
04T
heA
mer
ican
Ass
ocia
tion
ofIm
mun
olog
ists
In
cU
sed
wit
hp
erm
issi
on)
Dar
kgr
ayam
ino
acid
sha
veev
olve
dun
der
pos
itiv
ese
lect
ion
wit
hgt
80
pro
babi
lity
Mos
tof
thes
eam
ino
acid
sar
ein
the
extr
acel
lula
rp
orti
onof
the
pro
tein
The
upp
erp
art
ofth
efig
ure
show
sth
ep
rote
inal
ign
men
tfo
rn
ine
mam
mal
sof
the
gp91
regi
onin
dica
ted
byth
ebl
ack
ellip
seI
nth
isre
gion
ahi
ghle
velo
fam
ino
acid
varia
tion
isfo
und
betw
een
spec
ies
and
seve
ralc
odon
ssh
owgt
1In
this
alig
nm
ent
gray
vert
ical
bars
corr
esp
ond
tova
riab
leam
ino
acid
site
sT
hep
rote
inal
ign
men
tof
mam
mal
ssh
ows
the
follo
win
gsp
ecie
sH
om(H
omo
sapi
ens
sapi
ens)
Pan
(Pan
trog
lody
tes)
Mac
(Mac
aca
mul
atta
)M
us(M
usm
uscu
lus)
Rat
(Rat
tus
norv
egic
us)
Cri
(Cri
cetu
lus
gris
eus)
Het
(Het
eroc
epha
lus
glab
er)
Cav
(Cav
iapo
rcel
lus)
an
dO
ry(O
ryct
olag
uscu
nicu
lus)
EC
ext
race
lula
ren
viro
nm
ent
TM
tra
nsm
embr
ane
laye
ran
dIC
in
trac
ellu
lar
envi
ron
men
t
2161
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
From the four studied genes NCF4 which encodes theregulatory protein p40-phox shows a pattern of diversitythat is typical for a gene that has evolved under neutralityIn addition to the features described in the previous para-graph the allelic spectra of NCF4 in the studied populationsare consistent with a neutral model of evolution (tables 2ndash4)
Although CYBB presented the most interesting evolution-ary history at the interspecific level with repeated episodes ofpositive natural selection recent human evolutionary historyhas resulted in interesting patterns of variation for CYBA andNCF2 CYBA encodes p22-phox which is a transmembraneprotein shared by different NADPH oxidases In addition toharboring two common nonsynonymous polymorphisms(V174A is also variable among different species of mammals)CYBA is the most variable and most affected by recombina-tion among the NADPH oxidase genes (tables 1 and 2) Inparticular CYBA diversity is very high in Europe comparedwith 329 genes resequenced in a European sample (httppgagswashingtonedusummary_statshtml last accessed July16 2013) CYBA ranks 11th (ie the 97th percentile)Moreover there are contrasting proportions of the totalnumber of polymorphismsnumber of singletons betweenAfricans (a low proportion) and Europeans (a high propor-tion) the latter showing an excess of common polymor-phisms with respect to the neutral expectation (see the DFL
test in table 4 and supplementary table S4 SupplementaryMaterial online) This excess of common variants inEuropeans is also significant when we conservatively testedit against a scenario of human evolution that incorporates theldquoOut of Africardquo bottleneck (Laval et al 2010) and the observedlevel of recombination in Europeans (CYBA = 807 for the se-quenced region) Because demographic forces and recombi-nation levels alone do not explain the high CYBA diversity andits excess of common polymorphisms we suggest that bal-ancing natural selection (that acts by maintaining differentalleles at high frequency in a population) has contributed to
shape the diversity of CYBA at least in the European popu-lation Indeed figure 4a shows that the haplotype network ofCYBA for the European population is consistent with theaction of balancing natural selection (Bamshad andWooding 2003) showing two well-differentiated commonclades that explain the observed high diversity and theexcess of common CYBA variants A comparative genomicanalysis confirms this inference the ratio of polymorphismsto differences fixed between human and chimpanzee is nothomogeneous along the gene in the different human popu-lations (Mc Donald 1998 supplementary table S5Supplementary Material online) as would be expectedunder neutral evolution Our inference of balancing naturalselection is consistent with the fact that 25ndash30 of the var-iation in levels of ROS production can be attributed to geneticfactors (Lacy et al 2000) and that ROS levels are associatedwith CYBA variants (Bedard et al 2009)
p67 encoded by NCF2 is a necessary cytosolic NADPHcomponent for phagocyte ROS production Asians show ahighly differentiated NCF2 haplotype structure (see frequen-cies of haplotypes NCF2-D11 and NCF2-E10 in the haplotypetables online and in the network shown in fig 4b) and thehighest FST values are observed in pairwise comparisons be-tween Asians and non-Asian populations (in particular withEuropeans table 3) and not between Africans and non-African populations as is usually observed in the humangenome We confirmed these results by analyzing data forNCF2 from the HapMap Project (supplementary materialSupplementary Material online) Moreover a trend towardan excess of rare polymorphisms exists in Asians that is notobserved elsewhere (tables 1 2 and 4 DFL =1904FFL =1893) Although we cannot exclude that this patternof diversity is compatible with the null hypotheses of neutral-ity and with the tested demographic history of human pop-ulations inferred by Laval et al (2010 tables 2ndash4) we canspeculate and envisage four additional evolutionary scenarios
Table 1 Allele Frequencies of Nonsynonymous Polymorphisms in NADPH Oxidase Genes
Genes rs Minor Allele(Amino Acid)
PolyphenPrediction
African European Asian Hispanic
CYBA
Y72H rs4673 T (Y) Possibly damaging 046 032 017 022
V174A rs1049254 T (V) Benign 017 048 048 018
NCF2
K181R rs2274064 G (R) Benign 035 037 041 048
T279M rs13306581 T (T) Probably damaging 000 000 005 000
V297A rs35937854 C (A) Benign 004 000 000 000
T361S Chr1181799289 NCBI36hg18 T (S) mdash 000 000 002 000
H389Q rs17849502 A (Q) Benign 000 005 000 007
R395W rs13306575 T (W) Possibly damaging 000 000 000 007
N419I rs35012521 T (I) Probably damaging 000 002 004 000
P454S rs55761650 T (S) mdash 000 000 000 002
L487S Chr1181795862 NCBI36hg18 C (S) mdash 002 000 000 000
NCF4
T85N rs112306225 A (N) mdash 000 000 002 000
A304E rs5995361 A (E) Benign 004 000 000 000
2162
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
that may have contributed to shape the pattern of NCF2diversity in Asians 1) the observed trend is suggestive of aselective sweep on NCF2 standing variation a neutral orweakly deleterious existing variant becomes beneficial andrapidly increases in frequency (together with its associatedhaplotypes ie incomplete sweep) reducing the nucleotidediversity in the surrounding region and rendering other stand-ing substitutions rare During this process new rare substitu-tions appear in the expanding positively selected haplotype2) The pattern of diversity of NCF2 in Asia may result from anincomplete selective sweep acting on ARPC5 which is locatedapproximately 35 kb downstream of NCF2 In a genome-widescan for recent positive selection Voight et al (2006) identi-fied a strong signature of an incomplete sweep for ARPC5 inAsia (P = 0009) characterized by a higher than expected long-range linkage disequilibrium summarized by very high iHS
statistics (see Haplotter results for the HapMap II data athttphaplotteruchicagoedu last accessed July 16 2013)SNPs in NCF2 also presents high iHS statistics (P = 002) al-though values are lower than for ARPC5 3) The differentiatedpattern of diversity of NCF2 in Asia may also have been gen-erated without the action of natural selection during the firstcolonization of Asia by modern humans In a process of geo-graphic population expansion specifically in the front wave ofthe expansion some rare alleleshaplotypes (ie surfing al-leles) may become common by chance mimicking the pat-tern of diversity generated by a selective sweep (Excoffier andRay 2008) 4) The excess of rare variants may be an artifact ofpooling individuals from different populations (Ptak andPrzeworski 2002) Consistent with evolutionary scenarios 1ndash4 that produce similar patterns of diversity the haplotypenetwork of NCF2 for Eurasians (fig 4b) shows the following1) a large differentiation between Asians and Europeans thatis compatible with the high observed FST values and 2) a star-like shape associated with the haplotype NCF2-E10 that iscommon in Asia and rare elsewhere which is compatiblewith the excess of rare alleles in the Asian populations
DiscussionBy analyzing 29 mammalian genomes and four human pop-ulations we show in this study that natural selection hasacted in different ways over time to shape the pattern ofdiversity of the phagocyte NADPH oxidase genes At the tem-poral scale of the evolution of mammals we have inferredrecurrent episodes of positive selection acting on the extra-cellular portion of gp-91 that have been important to shapethe pattern of interspecific diversity of this gene Our in-terspecific analyses did not show a similar pattern of naturalselection in any of the other phagocyte NADPH oxidasegenes Even if current knowledge on the biology of NADPHdoes not allow us to interpret our results in terms of functionwe propose that the extracellular region of gp-91 is function-ally relevant Our results also imply that this region is highlydifferentiated among mammals at the protein level and thisvariability should be considered when mammals models areused to study the structure and function of phagocyteNADPH components
In the time scale of human evolution our analyses of theNADPH oxidase genes suggest that CYBA has been a target ofbalancing natural selection Because we do not have evidenceof population-specific variants that faced selective pressurethe inferred natural selection may have acted on a standingvariation in ancestral populations This implies that the selec-tive pressure began after the appearance of the variant andpossibly acted in a specific geographic region (Barret andSchluter 2008) The signatures of natural selection acting ona new mutation and on standing variation differ In the case ofselective sweeps episodes of natural selection on standingvariation are associated to a larger variance in the allelic spec-trum with respect to natural selection on a new mutationAlso selection on standing variation may produce an excess ofalleles at intermediate frequencies that is not associated withhigh nucleotide diversity (Przeworski et al 2005 Peter et al2012) This pattern contrasts with the effect of balancing
Table 2 Intrapopulation Diversity Indexes in the Studied Populationsfor the NADPH Oxidase Genes Obtained from Resequencing Dataa
African European Asian Hispanic
Number of chromosomes
CYBBb 42 52 34 36
CYBA 48 62 48 46
NCF2 48 62 48 46
NCF4 48 62 48 46
Segregating sitessingletons
CYBB 218 70 103 135
CYBA 6122 333 335 347
NCF2 4613 3311 2816 3714
NCF4 4512 267 191 308
Haplotype structureNumber of inferred haplotypesc
CYBB 14 5 7 12
CYBA 39 39 32 26
NCF2 38 32 18 31
NCF4 36 30 17 22
Haplotype diversity plusmn SD
CYBB 088 plusmn 003 034 plusmn 008 053 plusmn 010 070 plusmn 008
CYBA 098 plusmn 001 098 plusmn 001 097 plusmn 000 096 plusmn 002
NCF2 099 plusmn 001 096 plusmn 001 087 plusmn 004 097 plusmn 001
NCF4 098 plusmn 001 093 plusmn 002 093 plusmn 002 093 plusmn 002
Recombination parameter (q 103 per site)c
CYBB 008 lt001 lt001 lt002
CYBA 291 148 096 088
NCF2 052 038 010 058
NCF4 137 103 039 022
h estimatorsd
p 103 per site
CYBB 036 012 015 028
CYBA 190 163 151 157
NCF2 081 047 043 057
NCF4 101 076 064 084
hW 103 per site
CYBB 042 013 021 027
CYBA 231 118 125 13
NCF2 102 069 062 083
NCF4 101 070 054 087
aMost analyses were performed using software DnaSP (Rozas 2009)bData for CYBB (Xp211) are from Tarazona-Santos et al (2008)cHaplotypes and inferred using the method by Stephens and Scheet (2005) andthe software PHASEd Tajima (1983) W Watterson (1975)
2163
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
natural selection which produces an excess of common allelesassociated with high genetic diversity Thus the observed pat-tern of CYBA diversity in Europeans is not consistent with aselective sweep on a standing variation but it is consistent witha scenario of balancing selection acting on standing variation
If we consider for CYBA that heterozygote advantage maybe the mechanisms of balancing selection we can speculatethat the biological basis for this mechanism may be the fol-lowing considering that p22-phox is not exclusive of thephagocyte NADPH oxidase but it is also part of Nox com-plexes expressed in other tissues the dependence of ROSproduction on CYBA variants has to be finely regulated IfCYBA variants induce high levels of ROS these variants mayfavor a phagocyte-dependent efficient response to pathogensbut may damage other endothelial tissues Alternativelytissue oxidative damage does not occur if ROS productionis low but this response may be associated with a weaker
phagocyte respiratory burst against pathogens In this con-text heterozygote individuals with a CYBA-dependent inter-mediate level of ROS production may have been favored bynatural selection
Our results contribute to the discussion regarding the rel-evance of balancing selection in shaping the diversity ofinnate immunity genes (Ferrer-Admetlla et al 2008 Barreiroet al 2009) Ferrer-Admetlla et al (2008) have associated therecurrent signatures of balancing selection on inflammatorygenes with the need for fine regulation CYBA is the onlyphagocytic NADPH oxidase gene that also encodesnonphagocyte Nox components thus CYBA has a role inROS cell signaling a potentially dangerous process due toits capability to produce oxidative damage to tissues Geneswith these characteristics likely need even tighter regulationThe interplay between pathogen-driven selective pressureon innate immunity genes and their concomitantnonimmunological functions is complex In addition toCYBA other interesting examples of this interplay can befound among the 10 human toll-like receptors (TLRs) thatshow a variety of signatures of natural selection TLR8 whichshows the strongest signature of purifying selection is alsoinvolved in neuronal development (Barreiro et al 2009) and itis difficult to discriminate the role of each function in deter-mining the observed signature of natural selection
With few exceptions the pathogens responsible for naturalselection on immune genes are difficult to specify In the caseof NADPH oxidase we can infer based on the spectrum ofinfections in CGD patients that catalase-positive bacteria andfungi such as Staphylococci Salmonella Candida Aspergillusand M tuberculosis may be the selective pathogensInteractions between the host and pathogens also includemechanisms of the latter to impair the respiratory burst ofthe former For example Leishmania donovani blocks the as-sembly of NADPH oxidase at the phagosome membrane(Lodge et al 2006) These mechanisms may constitute selec-tive pressures imposed by pathogens
The associations reported in GWAS between rs4821544 inNCF4 and Crohnrsquos disease (an idiopathic inflammatory boweldisease that predominantly involves the ileum and colonRioux et al 2007) and between rs10911363 in NCF2 and
Table 3 Pairwise FST Genetic Distances between Populations
CYBBa CYBA
Africa Europe Asia Hispanic Africa Europe Asia Hispanic
Africa mdash 0316 0264 0092 mdash 0074 0083 0065
Europe 0257 mdash 0000 0107 mdash 0002 0056
Asia 0211 0000 mdash 0073 mdash 0054
Hispanic 0070 0082 0056 mdash mdash
NCF2 NCF4
Africa mdash 0048 0058 0037 mdash 0128 0136 0158
Europe mdash 0069 0000 mdash 0026 0026
Asia mdash 0059 mdash 0005
Hispanic mdash mdash
aFor CYBB FST estimators are above the diagonal Below the diagonal are the FST values corrected as if the effective population sizes of X chromosome genes were equal toautosomal ones
Table 4 Results of Neutrality Tests for the NADPH Oxidase Genesand Their Significancea
African European Asian Hispanic
Tajimarsquos D
CYBB 0473 0274 0813 0084
CYBA 0580 1242 0684 0707
NCF2 0412 0939 0883 0987
NCF4 0750 0243 0556 0102
Fu and Lirsquos D
CYBB 1050 1110 0395 0977
CYBA 1485 1734 1188 0138
NCF2 0407 1105 1904 1398
NCF4 0308 0443 1893 0266
Fu and Lirsquos F
CYBB 0980 0811 0114 0814
CYBA 1382 1893 1205 0405
NCF2 0512 1252 1893 1567
NCF4 0557 0231 1225 0248
NOTEmdashUnderlined values represent significant results under the demographic modelinferred by Laval et al (2010) for human populations See details in supplementarytable S4 Supplementary Material onlineaThe McDonaldndashKreitman test is nonsignificant in any of the cases
Significant under the WrightndashFisher model of constant population size
2164
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
systemic lupus erythematosus (Cunninghame Graham et al2011) confirm the involvement of NADPH genes in the path-ogenesis of inflammatory-related common diseases Ourclaim that natural selection acted on CYBA (and maybe inNCF2) is relevant for biomedical studies because combiningevidence of natural selection with association analyses inimmune genes increases the statistical power to detect dis-ease-associated variants (Ayodo et al 2007) As a new gener-ation of association studies focusing on rare variation isemerging the combination of genes deemed interestingfrom GWAS and populations with an excess of rare variantsin these genes such as NCF2 in Asians are particularly inter-esting as a source of rare variants with clinical relevanceFinally by determining through molecular evolution mappingthat the extracellular portion of gp91 (encoded by CYBB inthe X-chromosome) has been subject to recurrent episodes ofpositive selection at the scale of mammals evolution we positthe hypothesis that this portion of the NADPH oxidase isrelevant for currently unknown biological processes thatonce revealed by structural and functional investigationswill contribute to understanding the role of NADPH oxidasein infectious autoimmune and cardiovascular diseases
Materials and Methods
Molecular Evolution Analysis of NADPH Genes
We used the maximum likelihood framework developed byYang (2007a) to estimate for the NADPH oxidase genes
under a variety of evolutionary models as implemented in thePAML software (Yang 2007b) This approach allows infer-ences about the evolution of a coding region along aninter-specific phylogeny mapping the codons that haveevolved under strongweak purifying selection neutrality oradaptive positive selection Further details about these anal-yses are available as supplementary material SupplementaryMaterial online
Human Population Genetics of NADPH Genes
For human population genetics analyses we conducted bidi-rectional Sanger sequencing of CYBB CYBA NCF2 and NCF4for a total of 35242 bp for each of 102 healthy individuals aspart of the SNP500 Cancer project (Packer et al 2006 seesupplementary fig S1 Supplementary Material online for de-tails) Human population resequencing data for CYBB werepublished in Tarazona-Santos et al (2008) The resequencingexperiments were performed as in Packer et al (2006) These102 unrelated individuals include the following 24 of Africanancestry (15 African Americans from the United States and 9Pygmies) 23 admixed Latin Americans (from Mexico PuertoRico and South America) 31 Europeans (from the CEPHUTAH pedigree and the NIEHS Environmental GenomeProject) and 24 AsiansndashOceanians (from Melanesia PakistanChina Cambodia Japan and Taiwan)
After controlling for multiple tests we confirmed that allSNPs fit the HardyndashWeinberg equilibrium by the Guo and
CYBA - A16
CYBA - A3
Y72H - rs4673
V174A - rs1049254 CYBA - E18
(a)
NCF2 - E10 H389Q - rs17849502
NCF2 - B3
NCF2 - D11
NCF2 - C1K181R - rs2274064(b)
FIG 4 Phylogenetic networks of (a) CYBA in Europeans and (b) NCF2 in Europeans (black) and Asians (yellow) The lengths of the branches areproportional to the number of mutations Only nonsynonymous mutations are shown The haplotype names correspond to the table of inferredhaplotypes in the supplementary material Supplementary Material online We only show the names of haplotypes with a frequency gt5 Ancestralhaplotypes (inferred as being the human haplotype or median vector most similar to the chimpanzee sequence) are indicated by an arrow Medianvectors are in red
2165
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Thompson (1992) test which was implemented in the soft-ware Arlequin 30 (Excoffier et al 2007) Insertion-deletions(INDELs) were excluded from population genetics analysesTo assess intrapopulation diversity we used two statistics the per-site mean number of pairwise differences betweensequences (Tajima 1983) and w based on the number ofsegregating sites (S) (Watterson 1975) We measured pairwisebetween-populations diversity by using the FST statistics cal-culated using the software DnaSP (Rozas 2009)
Haplotypes and the recombination parameter were in-ferred using the PHASE software (Stephens and Scheet 2005)and diversity indexes calculations (tables 2 and 3) as well asneutrality statistics (table 4) were estimated using DnaSP soft-ware We applied two kinds of neutrality tests 1) tests basedon the allelic spectrum which is the distribution of polymor-phisms across different classes of frequencies namelyTajimarsquos DT (Tajima 1989) and FundashLirsquos DFL and FFL (Fu andLi 1993) and 2) tests based on comparisons between thenumber of polymorphisms in human populations and fixeddifferences with the chimpanzee (ie outgroup) namely theMcDonald and Kreitman (1991) test and the adaptedKolmogorovndashSmirnoff test by McDonald (1998) For thefirst set of tests we used as null hypotheses both the classicWrightndashFisher model of neutrality with a constant popula-tion size as well as the more realistic evolutionary scenario forhuman populations inferred by Laval et al (2010) In the caseof the scenario of Laval et al (2010) we ignored interconti-nental gene flow within the Old World because these raregene flow events likely does not affect the level of significanceof the neutrality tests given its very low inferred values(13 105) Null distributions used to test the significanceof the neutrality tests under these evolutionary scenarios weregenerated using coalescent simulations and a significancelevel of 005 (Hudson 2002) The KolmogorovndashSmirnoff testof neutrality adapted by McDonald (1998) was performedusing Slider software available at httpudeledu~mcdonaldaboutdnasliderhtml (last accessed July 16 2013) Weperformed coalescent simulations using ms software(Hudson 2002) Further methodological details are availableas supplementary material Supplementary Material onlineWe constructed the CYBA and NCF2 networks using allSNP variants and applying the Median joining algorithmand the maximum parsimony option calculations as imple-mented in the software Network 46 (Bandelt et al 1999)
Supplementary MaterialSupplementary tables S1ndashS5 and figure S1 are available atMolecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)
Acknowledgments
SJC conceived the study ET-S MM FL RC AC CF LPand BS generated the dataanalyzed the sequences LB ACWCSM and MY provided tools for data generationand analyses ET-S MM FL WCSM and RR performedpopulation genetics analyses ET-S and MM prepared tablesand figures ET-S and SJC wrote the manuscript All the
authors contributed with discussions about the results Theauthors are grateful to Silvia Fuselli for discussions of ourresults to Fernando Levi Soares for technical assistance withthe figures and the Sequencing Group of the CoreGenotyping Facility (National Cancer Institute) for their tech-nical assistance This work was supported by the IntramuralResearch Program of the National Cancer Institute (NCI) andNational Institutes of Health (NIH) CF was supported by theUniversity of Bologna ET-S MM WCSM and FL weresupported by the Fogarty International Center and NationalCancer Institute (1R01TW007894) Conselho Nacional deDesenvolvimento Cientıfico e Tecnologico (Brazil Grant3075562012-3) Fundacao de Amparo a Pesquisa de MinasGerais (Brazil CBB PPM 0026512) and the Brazilian Ministryof Education (CAPES Agency grant PNPD 0256709-1)
ReferencesAyodo G Price AL Keinan A Ajwang A Otieno MF Orago AS
Patterson N Reich D 2007 Combining evidence of natural selectionwith association analysis increases power to detect malaria-resis-tance variants Am J Hum Genet 81(2)234ndash242
Bamshad M Wooding SP 2003 Signatures of natural selection in thehuman genome Nat Rev Genet 4(2)99ndash111
Bandelt H-J Forster P Rohl A 1999 Median-joining networks for infer-ring intraspecific phylogenies Mol Biol Evol 16(1)37ndash48
Barreiro LB Ben-Ali M Quach H et al (17 co-authors) 2009Evolutionary dynamics of human Toll-like receptors and their dif-ferent contributions to host defense PLoS Genet 5(7)e1000562
Barreiro LB Quintana-Murci L 2010 From evolutionary genetics tohuman immunology how selection shapes host defence genesNat Rev Genet 11(1)17ndash30
Barrett RD Schluter D 2008 Adaptation from standing genetic varia-tion Trends Ecol Evol 23(1)38ndash44
Bedard K Attar H Bonnefont J Jaquet V Borel C Plastre O Stasia MJAntonarakis SE Krause KH 2009 Three common polymorphisms inthe CYBA gene form a haplotype associated with decreased ROSgeneration Hum Mutat 30(7)1123ndash1133
Blekhman R Man O Herrmann L Boyko AR Indap A Kosiol CBustamante CD Teshima KM Przeworski M 2008 Natural selectionon genes that underlie human disease susceptibility Curr Biol18(12)883ndash889
Brandes RP Kreuzer J 2005 Vascular NADPH oxidases molecular mech-anisms of activation Cardiovasc Res 65(1)16ndash27
Buckley RH 2004 Pulmonary complications of primary immunodefi-ciencies Paediatr Respir Rev 5(Suppl A)S225ndashS233
Bustamante CD Fledel-Alon A Williamson S et al (13 co-authors)2005 Natural selection on protein-coding genes in the humangenome Nature 437(7062)1153ndash1157
Bustamante J Arias AA Vogt G et al (29 co-authors) 2011 GermlineCYBB mutations that selectively affect macrophages in kindredswith X-linked predisposition to tuberculous mycobacterial diseaseNat Immunol 12(3)213ndash221
Campbell MC Tishkoff SA 2008 African genetic diversity implicationsfor human demographic history modern human origins and com-plex disease mapping Annu Rev Genomics Hum Genet 9403ndash433
Chanock SJ Roesler J Zhan S Hopkins P Lee P Barrett DT ChristensenBL Curnutte JT Gorlach A 2000 Genomic structure of the humanp47-phox (NCF1) gene Blood Cells Mol Dis 26(1)37ndash46
Cunninghame Graham DS Morris DL Bhangale TR Criswell LASyvanen AC Ronnblom L Behrens TW Graham RR Vyse TJ2011 Association of NCF2 IKZF1 IRF8 IFIH1 and TYK2 with sys-temic lupus erythematosus PLoS Genet 7(10)e1002341
Excoffier L Laval G Schneider S 2007 Arlequin ver 30 an integratedsoftware package for population genetics data analysis EvolBioinform Online 147ndash50
2166
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Excoffier L Ray N 2008 Surfing during population expansions promotesgenetic revolutions and structuration Trends Ecol Evol23(7)347ndash351
Ferrer-Admetlla A Bosch E Sikora M Marques-Bonet T Ramırez-SorianoA Muntasell A Navarro A Lazarus R Calafell F Bertranpetit J Casals F2008 Balancing selection is the main force shaping the evolution ofinnate immunity genes J Immunol 181(2)1315ndash1322
Fu YX Li WH 1993 Statistical tests of neutrality of mutations Genetics133(3)693ndash709
The 1000 Genomes Project Consortium 2010 A map of humangenome variation from population-scale sequencing Nature467(7319)1061ndash1073
The 1000 Genomes Project Consortium Abecasis GR Auton A BrooksLD DePristo MA Durbin RM Handsaker RE Kang HM Marth GTMcVean GA 2012 An integrated map of genetic variation from1092 human genomes Nature 491(7422)56ndash65
Guo SW Thompson EA 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles Biometrics 48(2)361ndash372
Heyworth PG Cross AR Curnutte JT 2003 Chronic granulomatousdisease Curr Opin Immunol 15(5)578ndash584
Hudson RR 2002 Generating samples under a Wright-Fisher neutralmodel of genetic variation Bioinformatics 18(2)337ndash338
Jacob CO Eisenstein M Dinauer MC et al (21 co-authors) 2012 Lupus-associated causal mutation in neutrophil cytosolic factor 2 (NCF2)brings unique insights to the structure and function of NADPHoxidase Proc Natl Acad Sci U S A 109(2)E59ndashE67
Kimura M 1974 The neutral theory of molecular evolution CambridgeCambridge University Press
Kosiol C Vinar T da Fonseca RR Hubisz MJ Bustamante CD Nielsen RSiepel A 2008 Patterns of positive selection in six mammalian ge-nomes PLoS Genet 4(8)e1000144
Lacy F Kailasam MT OrsquoConnor DT Schmid-Schonbein GW Parmer RJ2000 Plasma hydrogen peroxide production in human essentialhypertension role of heredity gender and ethnicity Hypertension36(5)878ndash884
Laval G Patin E Barreiro LB Quintana-Murci L 2010 Formulating a his-torical and demographic model of recent human evolution based onresequencing data from noncoding regions PLoS One 5(4)e10284
Lewontin R 1972 The apportionment of human diversity Evol Biol 6391ndash398
Lindblad-Toh K Garber M Zuk O et al (88 co-authors) 2011 A high-resolution map of human evolutionary constraint using 29 mam-mals Nature 478(7370)476ndash482
Lodge R Diallo TO Descoteaux A 2006 Leishmania donovani lipopho-sphoglycan blocks NADPH oxidase assembly at the phagosomemembrane Cell Microbiol 8(12)1922ndash1931
Magalhaes WC Rodrigues MR Silva D Soares-Souza G Iannini MLCerqueira GC Faria-Campos AC Tarazona-Santos E 2012DIVERGENOME a bioinformatics platform to assist population ge-netics and genetic epidemiology studies Genet Epidemiol36(4)360ndash367
Marth JD Grewal PK 2008 Mammalian glycosylation in immunity NatRev Immunol 8(11)874ndash887
McDonald JH 1998 Improved tests for heterogeneity across a region ofDNA sequence in the ratio of polymorphism to divergence Mol BiolEvol 15377ndash384
McDonald JH Kreitman M 1991 Adaptive protein evolution at the Adhlocus in Drosophila Nature 351(6328)652ndash654
Nielsen R Bustamante C Clark AG et al (12 co-authors) 2005 A scanfor positively selected genes in the genomes of humans and chim-panzees PLoS Biol 3(6)e170
Olsson LM Lindqvist AK Kallberg H Padyukov L Burkhardt HAlfredsson L Klareskog L Holmdahl R 2007 A case-control studyof rheumatoid arthritis identifies an associated single nucleotidepolymorphism in the NCF4 gene supporting a role for the
NADPH-oxidase complex in autoimmunity Arthritis Res Ther9(5)R98
Packer BR Yeager M Burdett L et al (13 co-authors) 2006SNP500Cancer a public resource for sequence validation assay de-velopment and frequency analysis for genetic variation in candidategenes Nucleic Acids Res 34(Database issue)D617ndashD621
Peter BM Huerta-Sanchez E Nielsen R 2012 Distinguishing betweenselective sweeps from standing variation and from a de novo mu-tation PLoS Genet 8(10)e1003011
Przeworski M Coop G Wall JD 2005 The signature of positive selectionon standing genetic variation Evolution 59(11)2312ndash2323
Ptak SE Przeworski M 2002 Evidence for population growth in humansis confounded by fine-scale population structure Trends Genet18(11)559ndash563
Ramensky V Bork P Sunyaev S 2002 Human non-synonymous SNPsserver and survey Nucleic Acids Res 30(17)3894ndash3900
Rioux JD Xavier RJ Taylor KD et al (24 co-authors) 2007 Genome-wideassociation study identifies new susceptibility loci for Crohn diseaseand implicates autophagy in disease pathogenesis Nat Genet39(5)596ndash604
Roberts RL Hollis-Moffatt JE Gearry RB Kennedy MA Barclay MLMerriman TR 2008 Confirmation of association of IRGM andNCF4 with ileal Crohnrsquos disease in a population-based cohortGenes Immun 9(6)561ndash565
Rozas J 2009 DNA sequence polymorphism analysis using DnaSPMethods Mol Biol 537337ndash350
San Jose G Fortuno A Beloqui O Dıez J Zalba G 2008 NADPH oxidaseCYBA polymorphisms oxidative stress and cardiovascular diseasesClin Sci (Lond) 114(3)173ndash182
Santiago HC Gonzalez Lombana CZ Macedo JP et al (12 co-authors)2012 NADPH phagocyte oxidase knockout mice controlTrypanosoma cruzi proliferation but develop circulatory collapseand succumb to infection PLoS Negl Trop Dis 6(2)e1492
Stephens M Scheet P 2005 Accounting for decay of linkage disequilib-rium in haplotype inference and missing-data imputation Am JHum Genet 76(3)449ndash462
Sumimoto H Miyano K Takeya R 2005 Molecular composition andregulation of the Nox family NAD(P)H oxidases Biochem BiophysRes Commun 338(1)677ndash686
Tajima F 1983 Evolutionary relationship of DNA sequences in finitepopulations Genetics 105(2)437ndash460
Tajima F 1989 Statistical method for testing the neutral mutationhypothesis by DNA polymorphism Genetics 123(3)585ndash595
Tarazona-Santos E Bernig T Burdett L Magalhaes WC Fabbri C Liao JRedondo RA Welch R Yeager M Chanock SJ 2008 CYBB anNADPH-oxidase gene restricted diversity in humans and evidencefor differential long-term purifying selection on transmembrane andcytosolic domains Hum Mutat 29(5)623ndash632
Taylor RM Burritt JB Baniulis D Foubert TR Lord CI Dinauer MCParkos CA Jesaitis AJ 2004 Site-specific inhibitors of NADPH oxi-dase activity and structural probes of flavocytochrome b character-ization of six monoclonal antibodies to the p22phox subunitJ Immunol 173(12)7349ndash7357
Voight BF Kudaravalli S Wen X Pritchard JK 2006 A map of recentpositive selection in the human genome PLoS Biol 4(3)e72
Watterson GA 1975 On the number of segregating sites in geneticalmodels without recombination Theor Popul Biol 7(2) 256ndash276
Williamson SH Hernandez R Fledel-Alon A Zhu L Nielsen RBustamante CD 2005 Simultaneous inference of selection and pop-ulation growth from patterns of variation in the human genomeProc Natl Acad Sci U S A 102(22)7882ndash7887
Yang Z 2007a Adaptive molecular evolution In Balding DJ Bishop MCannings C editors Handbook of statistical genetics Vol 1 3rd edSusex (United Kingdom) John Wiley amp Sons p 377ndash406
Yang Z 2007b PAML 4 phylogenetic analysis by maximum likelihoodMol Biol Evol 241586ndash1591
2167
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
mammalian evolution as inferred in this study For the auto-somal NADPH oxidase components (table 1 and haplotypetables online Supplementary Material online) we observe inthis study two rare and conservative nonsynonymous substi-tutions (ie involving amino acids with similar chemical prop-erties T85N and A304E) in NCF4 But for NCF2 and CYBA weobserved patterns of nonsynonymous substitutions thatseldom occur in human genes NCF2 has nine nonsynony-mous substitutions three of them are common (with a fre-quency higher than 5 in at least one of the studiedpopulation samples) and six are rare On the other handtwo nonsynonymous substitutions in CYBA are commonand ubiquitous in human populations namely Y72H(rs4673) and V174A (rs1049254 in a position where variationamong mammalian species is also observed) Moreover thefollowing two of the five common amino acid changes ob-served in the autosomal NADPH genes are predicted to bepossibly damaging (ie radical) by the Polyphen resource(Ramensky et al 2002) the two changes are R395W inHispanic NCF2 (rs13306575) and Y72H (rs4673) in CYBA Ingeneral Polyphen accurately predicts the effect of nonsynon-ymous substitutions based on biochemical and evolutionarydata (Williamson et al 2005) Notably the ImmunodeficiencyMutations Database (httpbioinfutafiNCF2basecontent=pubIDbases last accessed July 16 2013) reports one395W395W autosomal recessive CGD patient but we andthe HapMap project (wwwhapmaporg last accessed July 16
2013) observed the W allele at frequencies between 5 and10 in Asians and admixed Latin Americans including onesupposedly healthy 395W395W Japanese HapMap individ-ual On the basis of these results we verified whether NativeAmericans who descend from an ancestral Pleistocene Asianpopulations that peopled the Americas by the Behring Straitsmore than 14000 years ago may have relatively higher fre-quencies of this variant We genotyped the variant 395Wusing a Taqman assay in 558 Native Americans (see supple-mentary table S3 Supplementary Material online for detailedresults) and observed an allele frequency of 12 being thisvariant always present in heterozygous individuals
For the four studied genes the diversity and levels of re-combination are higher in Africans than in non-Africans(table 2 and haplotype tables available as supplementaryfiles Supplementary Material online) This result is a conse-quence of the African origin of modern humans and the ldquooutof Africardquo migration that occurred 40000ndash80000 years agoafter a bottleneck leading to the peopling of other continents(Campbell and Tishkoff 2008 Laval et al 2010) Therefore thefirst divergence between continental human populations wasbetween Africans and ancestral non-Africans Consistentlywith this scenario we observed the highest between-popula-tion differentiation for CYBB CYBA and NCF4 between thesetwo groups Interestingly NCF2 does not match this pattern(table 3)
FIG 2 Inferred types of natural selection for codons of the NADPH genes at the evolutionary time scale of mammals Codons are represented along thehorizontal axis For each gene three classes of sites (black dark gray and light gray) are considered and each class evolved under different inferred values (presented for each gene in the figure at the top of each graphic) These classes correspond to the model M3 of Yang (2007a) with three classes ofsites Given our data this model is more likely than alternative models of evolution that assume simpler scenarios such as a unique for the entire gene(see supplementary material Supplementary Material online for details regarding methods and results using alternative models) The three classescorrespond to different types and levels of natural selection from strong purifying selection (in the lightest gray) to positive selection (gt 1) For eachcodon the probability of belonging to each of the three classes of corresponds to the height of the corresponding color in the vertical bar Forexample codon 173 of CYBB (indicated by a white arrow) has a 0000 probability of belonging to the= 00095 class (light gray a class corresponding tostrong purifying selection) a 0169 probability of belonging to the = 03085 class (dark gray) and a 0831 probability of belonging to the = 18987class (black a class that suggests positive selection) In this case reasonable evidence of positive selection on this codon exists
2160
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
FIG
3
Nat
ural
sele
ctio
nm
app
ing
acro
ssC
YBB
(en
codi
ng
gp91
)al
ong
mam
mal
ian
evol
utio
na
sid
enti
fied
usin
gth
ePA
ML
met
hod
byY
ang
(200
7a)
The
top
olog
ies
ofgp
91an
dp2
2ar
ere
pro
duce
dfr
omT
aylo
ret
al(
2004
Cop
yrig
ht20
04T
heA
mer
ican
Ass
ocia
tion
ofIm
mun
olog
ists
In
cU
sed
wit
hp
erm
issi
on)
Dar
kgr
ayam
ino
acid
sha
veev
olve
dun
der
pos
itiv
ese
lect
ion
wit
hgt
80
pro
babi
lity
Mos
tof
thes
eam
ino
acid
sar
ein
the
extr
acel
lula
rp
orti
onof
the
pro
tein
The
upp
erp
art
ofth
efig
ure
show
sth
ep
rote
inal
ign
men
tfo
rn
ine
mam
mal
sof
the
gp91
regi
onin
dica
ted
byth
ebl
ack
ellip
seI
nth
isre
gion
ahi
ghle
velo
fam
ino
acid
varia
tion
isfo
und
betw
een
spec
ies
and
seve
ralc
odon
ssh
owgt
1In
this
alig
nm
ent
gray
vert
ical
bars
corr
esp
ond
tova
riab
leam
ino
acid
site
sT
hep
rote
inal
ign
men
tof
mam
mal
ssh
ows
the
follo
win
gsp
ecie
sH
om(H
omo
sapi
ens
sapi
ens)
Pan
(Pan
trog
lody
tes)
Mac
(Mac
aca
mul
atta
)M
us(M
usm
uscu
lus)
Rat
(Rat
tus
norv
egic
us)
Cri
(Cri
cetu
lus
gris
eus)
Het
(Het
eroc
epha
lus
glab
er)
Cav
(Cav
iapo
rcel
lus)
an
dO
ry(O
ryct
olag
uscu
nicu
lus)
EC
ext
race
lula
ren
viro
nm
ent
TM
tra
nsm
embr
ane
laye
ran
dIC
in
trac
ellu
lar
envi
ron
men
t
2161
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
From the four studied genes NCF4 which encodes theregulatory protein p40-phox shows a pattern of diversitythat is typical for a gene that has evolved under neutralityIn addition to the features described in the previous para-graph the allelic spectra of NCF4 in the studied populationsare consistent with a neutral model of evolution (tables 2ndash4)
Although CYBB presented the most interesting evolution-ary history at the interspecific level with repeated episodes ofpositive natural selection recent human evolutionary historyhas resulted in interesting patterns of variation for CYBA andNCF2 CYBA encodes p22-phox which is a transmembraneprotein shared by different NADPH oxidases In addition toharboring two common nonsynonymous polymorphisms(V174A is also variable among different species of mammals)CYBA is the most variable and most affected by recombina-tion among the NADPH oxidase genes (tables 1 and 2) Inparticular CYBA diversity is very high in Europe comparedwith 329 genes resequenced in a European sample (httppgagswashingtonedusummary_statshtml last accessed July16 2013) CYBA ranks 11th (ie the 97th percentile)Moreover there are contrasting proportions of the totalnumber of polymorphismsnumber of singletons betweenAfricans (a low proportion) and Europeans (a high propor-tion) the latter showing an excess of common polymor-phisms with respect to the neutral expectation (see the DFL
test in table 4 and supplementary table S4 SupplementaryMaterial online) This excess of common variants inEuropeans is also significant when we conservatively testedit against a scenario of human evolution that incorporates theldquoOut of Africardquo bottleneck (Laval et al 2010) and the observedlevel of recombination in Europeans (CYBA = 807 for the se-quenced region) Because demographic forces and recombi-nation levels alone do not explain the high CYBA diversity andits excess of common polymorphisms we suggest that bal-ancing natural selection (that acts by maintaining differentalleles at high frequency in a population) has contributed to
shape the diversity of CYBA at least in the European popu-lation Indeed figure 4a shows that the haplotype network ofCYBA for the European population is consistent with theaction of balancing natural selection (Bamshad andWooding 2003) showing two well-differentiated commonclades that explain the observed high diversity and theexcess of common CYBA variants A comparative genomicanalysis confirms this inference the ratio of polymorphismsto differences fixed between human and chimpanzee is nothomogeneous along the gene in the different human popu-lations (Mc Donald 1998 supplementary table S5Supplementary Material online) as would be expectedunder neutral evolution Our inference of balancing naturalselection is consistent with the fact that 25ndash30 of the var-iation in levels of ROS production can be attributed to geneticfactors (Lacy et al 2000) and that ROS levels are associatedwith CYBA variants (Bedard et al 2009)
p67 encoded by NCF2 is a necessary cytosolic NADPHcomponent for phagocyte ROS production Asians show ahighly differentiated NCF2 haplotype structure (see frequen-cies of haplotypes NCF2-D11 and NCF2-E10 in the haplotypetables online and in the network shown in fig 4b) and thehighest FST values are observed in pairwise comparisons be-tween Asians and non-Asian populations (in particular withEuropeans table 3) and not between Africans and non-African populations as is usually observed in the humangenome We confirmed these results by analyzing data forNCF2 from the HapMap Project (supplementary materialSupplementary Material online) Moreover a trend towardan excess of rare polymorphisms exists in Asians that is notobserved elsewhere (tables 1 2 and 4 DFL =1904FFL =1893) Although we cannot exclude that this patternof diversity is compatible with the null hypotheses of neutral-ity and with the tested demographic history of human pop-ulations inferred by Laval et al (2010 tables 2ndash4) we canspeculate and envisage four additional evolutionary scenarios
Table 1 Allele Frequencies of Nonsynonymous Polymorphisms in NADPH Oxidase Genes
Genes rs Minor Allele(Amino Acid)
PolyphenPrediction
African European Asian Hispanic
CYBA
Y72H rs4673 T (Y) Possibly damaging 046 032 017 022
V174A rs1049254 T (V) Benign 017 048 048 018
NCF2
K181R rs2274064 G (R) Benign 035 037 041 048
T279M rs13306581 T (T) Probably damaging 000 000 005 000
V297A rs35937854 C (A) Benign 004 000 000 000
T361S Chr1181799289 NCBI36hg18 T (S) mdash 000 000 002 000
H389Q rs17849502 A (Q) Benign 000 005 000 007
R395W rs13306575 T (W) Possibly damaging 000 000 000 007
N419I rs35012521 T (I) Probably damaging 000 002 004 000
P454S rs55761650 T (S) mdash 000 000 000 002
L487S Chr1181795862 NCBI36hg18 C (S) mdash 002 000 000 000
NCF4
T85N rs112306225 A (N) mdash 000 000 002 000
A304E rs5995361 A (E) Benign 004 000 000 000
2162
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
that may have contributed to shape the pattern of NCF2diversity in Asians 1) the observed trend is suggestive of aselective sweep on NCF2 standing variation a neutral orweakly deleterious existing variant becomes beneficial andrapidly increases in frequency (together with its associatedhaplotypes ie incomplete sweep) reducing the nucleotidediversity in the surrounding region and rendering other stand-ing substitutions rare During this process new rare substitu-tions appear in the expanding positively selected haplotype2) The pattern of diversity of NCF2 in Asia may result from anincomplete selective sweep acting on ARPC5 which is locatedapproximately 35 kb downstream of NCF2 In a genome-widescan for recent positive selection Voight et al (2006) identi-fied a strong signature of an incomplete sweep for ARPC5 inAsia (P = 0009) characterized by a higher than expected long-range linkage disequilibrium summarized by very high iHS
statistics (see Haplotter results for the HapMap II data athttphaplotteruchicagoedu last accessed July 16 2013)SNPs in NCF2 also presents high iHS statistics (P = 002) al-though values are lower than for ARPC5 3) The differentiatedpattern of diversity of NCF2 in Asia may also have been gen-erated without the action of natural selection during the firstcolonization of Asia by modern humans In a process of geo-graphic population expansion specifically in the front wave ofthe expansion some rare alleleshaplotypes (ie surfing al-leles) may become common by chance mimicking the pat-tern of diversity generated by a selective sweep (Excoffier andRay 2008) 4) The excess of rare variants may be an artifact ofpooling individuals from different populations (Ptak andPrzeworski 2002) Consistent with evolutionary scenarios 1ndash4 that produce similar patterns of diversity the haplotypenetwork of NCF2 for Eurasians (fig 4b) shows the following1) a large differentiation between Asians and Europeans thatis compatible with the high observed FST values and 2) a star-like shape associated with the haplotype NCF2-E10 that iscommon in Asia and rare elsewhere which is compatiblewith the excess of rare alleles in the Asian populations
DiscussionBy analyzing 29 mammalian genomes and four human pop-ulations we show in this study that natural selection hasacted in different ways over time to shape the pattern ofdiversity of the phagocyte NADPH oxidase genes At the tem-poral scale of the evolution of mammals we have inferredrecurrent episodes of positive selection acting on the extra-cellular portion of gp-91 that have been important to shapethe pattern of interspecific diversity of this gene Our in-terspecific analyses did not show a similar pattern of naturalselection in any of the other phagocyte NADPH oxidasegenes Even if current knowledge on the biology of NADPHdoes not allow us to interpret our results in terms of functionwe propose that the extracellular region of gp-91 is function-ally relevant Our results also imply that this region is highlydifferentiated among mammals at the protein level and thisvariability should be considered when mammals models areused to study the structure and function of phagocyteNADPH components
In the time scale of human evolution our analyses of theNADPH oxidase genes suggest that CYBA has been a target ofbalancing natural selection Because we do not have evidenceof population-specific variants that faced selective pressurethe inferred natural selection may have acted on a standingvariation in ancestral populations This implies that the selec-tive pressure began after the appearance of the variant andpossibly acted in a specific geographic region (Barret andSchluter 2008) The signatures of natural selection acting ona new mutation and on standing variation differ In the case ofselective sweeps episodes of natural selection on standingvariation are associated to a larger variance in the allelic spec-trum with respect to natural selection on a new mutationAlso selection on standing variation may produce an excess ofalleles at intermediate frequencies that is not associated withhigh nucleotide diversity (Przeworski et al 2005 Peter et al2012) This pattern contrasts with the effect of balancing
Table 2 Intrapopulation Diversity Indexes in the Studied Populationsfor the NADPH Oxidase Genes Obtained from Resequencing Dataa
African European Asian Hispanic
Number of chromosomes
CYBBb 42 52 34 36
CYBA 48 62 48 46
NCF2 48 62 48 46
NCF4 48 62 48 46
Segregating sitessingletons
CYBB 218 70 103 135
CYBA 6122 333 335 347
NCF2 4613 3311 2816 3714
NCF4 4512 267 191 308
Haplotype structureNumber of inferred haplotypesc
CYBB 14 5 7 12
CYBA 39 39 32 26
NCF2 38 32 18 31
NCF4 36 30 17 22
Haplotype diversity plusmn SD
CYBB 088 plusmn 003 034 plusmn 008 053 plusmn 010 070 plusmn 008
CYBA 098 plusmn 001 098 plusmn 001 097 plusmn 000 096 plusmn 002
NCF2 099 plusmn 001 096 plusmn 001 087 plusmn 004 097 plusmn 001
NCF4 098 plusmn 001 093 plusmn 002 093 plusmn 002 093 plusmn 002
Recombination parameter (q 103 per site)c
CYBB 008 lt001 lt001 lt002
CYBA 291 148 096 088
NCF2 052 038 010 058
NCF4 137 103 039 022
h estimatorsd
p 103 per site
CYBB 036 012 015 028
CYBA 190 163 151 157
NCF2 081 047 043 057
NCF4 101 076 064 084
hW 103 per site
CYBB 042 013 021 027
CYBA 231 118 125 13
NCF2 102 069 062 083
NCF4 101 070 054 087
aMost analyses were performed using software DnaSP (Rozas 2009)bData for CYBB (Xp211) are from Tarazona-Santos et al (2008)cHaplotypes and inferred using the method by Stephens and Scheet (2005) andthe software PHASEd Tajima (1983) W Watterson (1975)
2163
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
natural selection which produces an excess of common allelesassociated with high genetic diversity Thus the observed pat-tern of CYBA diversity in Europeans is not consistent with aselective sweep on a standing variation but it is consistent witha scenario of balancing selection acting on standing variation
If we consider for CYBA that heterozygote advantage maybe the mechanisms of balancing selection we can speculatethat the biological basis for this mechanism may be the fol-lowing considering that p22-phox is not exclusive of thephagocyte NADPH oxidase but it is also part of Nox com-plexes expressed in other tissues the dependence of ROSproduction on CYBA variants has to be finely regulated IfCYBA variants induce high levels of ROS these variants mayfavor a phagocyte-dependent efficient response to pathogensbut may damage other endothelial tissues Alternativelytissue oxidative damage does not occur if ROS productionis low but this response may be associated with a weaker
phagocyte respiratory burst against pathogens In this con-text heterozygote individuals with a CYBA-dependent inter-mediate level of ROS production may have been favored bynatural selection
Our results contribute to the discussion regarding the rel-evance of balancing selection in shaping the diversity ofinnate immunity genes (Ferrer-Admetlla et al 2008 Barreiroet al 2009) Ferrer-Admetlla et al (2008) have associated therecurrent signatures of balancing selection on inflammatorygenes with the need for fine regulation CYBA is the onlyphagocytic NADPH oxidase gene that also encodesnonphagocyte Nox components thus CYBA has a role inROS cell signaling a potentially dangerous process due toits capability to produce oxidative damage to tissues Geneswith these characteristics likely need even tighter regulationThe interplay between pathogen-driven selective pressureon innate immunity genes and their concomitantnonimmunological functions is complex In addition toCYBA other interesting examples of this interplay can befound among the 10 human toll-like receptors (TLRs) thatshow a variety of signatures of natural selection TLR8 whichshows the strongest signature of purifying selection is alsoinvolved in neuronal development (Barreiro et al 2009) and itis difficult to discriminate the role of each function in deter-mining the observed signature of natural selection
With few exceptions the pathogens responsible for naturalselection on immune genes are difficult to specify In the caseof NADPH oxidase we can infer based on the spectrum ofinfections in CGD patients that catalase-positive bacteria andfungi such as Staphylococci Salmonella Candida Aspergillusand M tuberculosis may be the selective pathogensInteractions between the host and pathogens also includemechanisms of the latter to impair the respiratory burst ofthe former For example Leishmania donovani blocks the as-sembly of NADPH oxidase at the phagosome membrane(Lodge et al 2006) These mechanisms may constitute selec-tive pressures imposed by pathogens
The associations reported in GWAS between rs4821544 inNCF4 and Crohnrsquos disease (an idiopathic inflammatory boweldisease that predominantly involves the ileum and colonRioux et al 2007) and between rs10911363 in NCF2 and
Table 3 Pairwise FST Genetic Distances between Populations
CYBBa CYBA
Africa Europe Asia Hispanic Africa Europe Asia Hispanic
Africa mdash 0316 0264 0092 mdash 0074 0083 0065
Europe 0257 mdash 0000 0107 mdash 0002 0056
Asia 0211 0000 mdash 0073 mdash 0054
Hispanic 0070 0082 0056 mdash mdash
NCF2 NCF4
Africa mdash 0048 0058 0037 mdash 0128 0136 0158
Europe mdash 0069 0000 mdash 0026 0026
Asia mdash 0059 mdash 0005
Hispanic mdash mdash
aFor CYBB FST estimators are above the diagonal Below the diagonal are the FST values corrected as if the effective population sizes of X chromosome genes were equal toautosomal ones
Table 4 Results of Neutrality Tests for the NADPH Oxidase Genesand Their Significancea
African European Asian Hispanic
Tajimarsquos D
CYBB 0473 0274 0813 0084
CYBA 0580 1242 0684 0707
NCF2 0412 0939 0883 0987
NCF4 0750 0243 0556 0102
Fu and Lirsquos D
CYBB 1050 1110 0395 0977
CYBA 1485 1734 1188 0138
NCF2 0407 1105 1904 1398
NCF4 0308 0443 1893 0266
Fu and Lirsquos F
CYBB 0980 0811 0114 0814
CYBA 1382 1893 1205 0405
NCF2 0512 1252 1893 1567
NCF4 0557 0231 1225 0248
NOTEmdashUnderlined values represent significant results under the demographic modelinferred by Laval et al (2010) for human populations See details in supplementarytable S4 Supplementary Material onlineaThe McDonaldndashKreitman test is nonsignificant in any of the cases
Significant under the WrightndashFisher model of constant population size
2164
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
systemic lupus erythematosus (Cunninghame Graham et al2011) confirm the involvement of NADPH genes in the path-ogenesis of inflammatory-related common diseases Ourclaim that natural selection acted on CYBA (and maybe inNCF2) is relevant for biomedical studies because combiningevidence of natural selection with association analyses inimmune genes increases the statistical power to detect dis-ease-associated variants (Ayodo et al 2007) As a new gener-ation of association studies focusing on rare variation isemerging the combination of genes deemed interestingfrom GWAS and populations with an excess of rare variantsin these genes such as NCF2 in Asians are particularly inter-esting as a source of rare variants with clinical relevanceFinally by determining through molecular evolution mappingthat the extracellular portion of gp91 (encoded by CYBB inthe X-chromosome) has been subject to recurrent episodes ofpositive selection at the scale of mammals evolution we positthe hypothesis that this portion of the NADPH oxidase isrelevant for currently unknown biological processes thatonce revealed by structural and functional investigationswill contribute to understanding the role of NADPH oxidasein infectious autoimmune and cardiovascular diseases
Materials and Methods
Molecular Evolution Analysis of NADPH Genes
We used the maximum likelihood framework developed byYang (2007a) to estimate for the NADPH oxidase genes
under a variety of evolutionary models as implemented in thePAML software (Yang 2007b) This approach allows infer-ences about the evolution of a coding region along aninter-specific phylogeny mapping the codons that haveevolved under strongweak purifying selection neutrality oradaptive positive selection Further details about these anal-yses are available as supplementary material SupplementaryMaterial online
Human Population Genetics of NADPH Genes
For human population genetics analyses we conducted bidi-rectional Sanger sequencing of CYBB CYBA NCF2 and NCF4for a total of 35242 bp for each of 102 healthy individuals aspart of the SNP500 Cancer project (Packer et al 2006 seesupplementary fig S1 Supplementary Material online for de-tails) Human population resequencing data for CYBB werepublished in Tarazona-Santos et al (2008) The resequencingexperiments were performed as in Packer et al (2006) These102 unrelated individuals include the following 24 of Africanancestry (15 African Americans from the United States and 9Pygmies) 23 admixed Latin Americans (from Mexico PuertoRico and South America) 31 Europeans (from the CEPHUTAH pedigree and the NIEHS Environmental GenomeProject) and 24 AsiansndashOceanians (from Melanesia PakistanChina Cambodia Japan and Taiwan)
After controlling for multiple tests we confirmed that allSNPs fit the HardyndashWeinberg equilibrium by the Guo and
CYBA - A16
CYBA - A3
Y72H - rs4673
V174A - rs1049254 CYBA - E18
(a)
NCF2 - E10 H389Q - rs17849502
NCF2 - B3
NCF2 - D11
NCF2 - C1K181R - rs2274064(b)
FIG 4 Phylogenetic networks of (a) CYBA in Europeans and (b) NCF2 in Europeans (black) and Asians (yellow) The lengths of the branches areproportional to the number of mutations Only nonsynonymous mutations are shown The haplotype names correspond to the table of inferredhaplotypes in the supplementary material Supplementary Material online We only show the names of haplotypes with a frequency gt5 Ancestralhaplotypes (inferred as being the human haplotype or median vector most similar to the chimpanzee sequence) are indicated by an arrow Medianvectors are in red
2165
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Thompson (1992) test which was implemented in the soft-ware Arlequin 30 (Excoffier et al 2007) Insertion-deletions(INDELs) were excluded from population genetics analysesTo assess intrapopulation diversity we used two statistics the per-site mean number of pairwise differences betweensequences (Tajima 1983) and w based on the number ofsegregating sites (S) (Watterson 1975) We measured pairwisebetween-populations diversity by using the FST statistics cal-culated using the software DnaSP (Rozas 2009)
Haplotypes and the recombination parameter were in-ferred using the PHASE software (Stephens and Scheet 2005)and diversity indexes calculations (tables 2 and 3) as well asneutrality statistics (table 4) were estimated using DnaSP soft-ware We applied two kinds of neutrality tests 1) tests basedon the allelic spectrum which is the distribution of polymor-phisms across different classes of frequencies namelyTajimarsquos DT (Tajima 1989) and FundashLirsquos DFL and FFL (Fu andLi 1993) and 2) tests based on comparisons between thenumber of polymorphisms in human populations and fixeddifferences with the chimpanzee (ie outgroup) namely theMcDonald and Kreitman (1991) test and the adaptedKolmogorovndashSmirnoff test by McDonald (1998) For thefirst set of tests we used as null hypotheses both the classicWrightndashFisher model of neutrality with a constant popula-tion size as well as the more realistic evolutionary scenario forhuman populations inferred by Laval et al (2010) In the caseof the scenario of Laval et al (2010) we ignored interconti-nental gene flow within the Old World because these raregene flow events likely does not affect the level of significanceof the neutrality tests given its very low inferred values(13 105) Null distributions used to test the significanceof the neutrality tests under these evolutionary scenarios weregenerated using coalescent simulations and a significancelevel of 005 (Hudson 2002) The KolmogorovndashSmirnoff testof neutrality adapted by McDonald (1998) was performedusing Slider software available at httpudeledu~mcdonaldaboutdnasliderhtml (last accessed July 16 2013) Weperformed coalescent simulations using ms software(Hudson 2002) Further methodological details are availableas supplementary material Supplementary Material onlineWe constructed the CYBA and NCF2 networks using allSNP variants and applying the Median joining algorithmand the maximum parsimony option calculations as imple-mented in the software Network 46 (Bandelt et al 1999)
Supplementary MaterialSupplementary tables S1ndashS5 and figure S1 are available atMolecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)
Acknowledgments
SJC conceived the study ET-S MM FL RC AC CF LPand BS generated the dataanalyzed the sequences LB ACWCSM and MY provided tools for data generationand analyses ET-S MM FL WCSM and RR performedpopulation genetics analyses ET-S and MM prepared tablesand figures ET-S and SJC wrote the manuscript All the
authors contributed with discussions about the results Theauthors are grateful to Silvia Fuselli for discussions of ourresults to Fernando Levi Soares for technical assistance withthe figures and the Sequencing Group of the CoreGenotyping Facility (National Cancer Institute) for their tech-nical assistance This work was supported by the IntramuralResearch Program of the National Cancer Institute (NCI) andNational Institutes of Health (NIH) CF was supported by theUniversity of Bologna ET-S MM WCSM and FL weresupported by the Fogarty International Center and NationalCancer Institute (1R01TW007894) Conselho Nacional deDesenvolvimento Cientıfico e Tecnologico (Brazil Grant3075562012-3) Fundacao de Amparo a Pesquisa de MinasGerais (Brazil CBB PPM 0026512) and the Brazilian Ministryof Education (CAPES Agency grant PNPD 0256709-1)
ReferencesAyodo G Price AL Keinan A Ajwang A Otieno MF Orago AS
Patterson N Reich D 2007 Combining evidence of natural selectionwith association analysis increases power to detect malaria-resis-tance variants Am J Hum Genet 81(2)234ndash242
Bamshad M Wooding SP 2003 Signatures of natural selection in thehuman genome Nat Rev Genet 4(2)99ndash111
Bandelt H-J Forster P Rohl A 1999 Median-joining networks for infer-ring intraspecific phylogenies Mol Biol Evol 16(1)37ndash48
Barreiro LB Ben-Ali M Quach H et al (17 co-authors) 2009Evolutionary dynamics of human Toll-like receptors and their dif-ferent contributions to host defense PLoS Genet 5(7)e1000562
Barreiro LB Quintana-Murci L 2010 From evolutionary genetics tohuman immunology how selection shapes host defence genesNat Rev Genet 11(1)17ndash30
Barrett RD Schluter D 2008 Adaptation from standing genetic varia-tion Trends Ecol Evol 23(1)38ndash44
Bedard K Attar H Bonnefont J Jaquet V Borel C Plastre O Stasia MJAntonarakis SE Krause KH 2009 Three common polymorphisms inthe CYBA gene form a haplotype associated with decreased ROSgeneration Hum Mutat 30(7)1123ndash1133
Blekhman R Man O Herrmann L Boyko AR Indap A Kosiol CBustamante CD Teshima KM Przeworski M 2008 Natural selectionon genes that underlie human disease susceptibility Curr Biol18(12)883ndash889
Brandes RP Kreuzer J 2005 Vascular NADPH oxidases molecular mech-anisms of activation Cardiovasc Res 65(1)16ndash27
Buckley RH 2004 Pulmonary complications of primary immunodefi-ciencies Paediatr Respir Rev 5(Suppl A)S225ndashS233
Bustamante CD Fledel-Alon A Williamson S et al (13 co-authors)2005 Natural selection on protein-coding genes in the humangenome Nature 437(7062)1153ndash1157
Bustamante J Arias AA Vogt G et al (29 co-authors) 2011 GermlineCYBB mutations that selectively affect macrophages in kindredswith X-linked predisposition to tuberculous mycobacterial diseaseNat Immunol 12(3)213ndash221
Campbell MC Tishkoff SA 2008 African genetic diversity implicationsfor human demographic history modern human origins and com-plex disease mapping Annu Rev Genomics Hum Genet 9403ndash433
Chanock SJ Roesler J Zhan S Hopkins P Lee P Barrett DT ChristensenBL Curnutte JT Gorlach A 2000 Genomic structure of the humanp47-phox (NCF1) gene Blood Cells Mol Dis 26(1)37ndash46
Cunninghame Graham DS Morris DL Bhangale TR Criswell LASyvanen AC Ronnblom L Behrens TW Graham RR Vyse TJ2011 Association of NCF2 IKZF1 IRF8 IFIH1 and TYK2 with sys-temic lupus erythematosus PLoS Genet 7(10)e1002341
Excoffier L Laval G Schneider S 2007 Arlequin ver 30 an integratedsoftware package for population genetics data analysis EvolBioinform Online 147ndash50
2166
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Excoffier L Ray N 2008 Surfing during population expansions promotesgenetic revolutions and structuration Trends Ecol Evol23(7)347ndash351
Ferrer-Admetlla A Bosch E Sikora M Marques-Bonet T Ramırez-SorianoA Muntasell A Navarro A Lazarus R Calafell F Bertranpetit J Casals F2008 Balancing selection is the main force shaping the evolution ofinnate immunity genes J Immunol 181(2)1315ndash1322
Fu YX Li WH 1993 Statistical tests of neutrality of mutations Genetics133(3)693ndash709
The 1000 Genomes Project Consortium 2010 A map of humangenome variation from population-scale sequencing Nature467(7319)1061ndash1073
The 1000 Genomes Project Consortium Abecasis GR Auton A BrooksLD DePristo MA Durbin RM Handsaker RE Kang HM Marth GTMcVean GA 2012 An integrated map of genetic variation from1092 human genomes Nature 491(7422)56ndash65
Guo SW Thompson EA 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles Biometrics 48(2)361ndash372
Heyworth PG Cross AR Curnutte JT 2003 Chronic granulomatousdisease Curr Opin Immunol 15(5)578ndash584
Hudson RR 2002 Generating samples under a Wright-Fisher neutralmodel of genetic variation Bioinformatics 18(2)337ndash338
Jacob CO Eisenstein M Dinauer MC et al (21 co-authors) 2012 Lupus-associated causal mutation in neutrophil cytosolic factor 2 (NCF2)brings unique insights to the structure and function of NADPHoxidase Proc Natl Acad Sci U S A 109(2)E59ndashE67
Kimura M 1974 The neutral theory of molecular evolution CambridgeCambridge University Press
Kosiol C Vinar T da Fonseca RR Hubisz MJ Bustamante CD Nielsen RSiepel A 2008 Patterns of positive selection in six mammalian ge-nomes PLoS Genet 4(8)e1000144
Lacy F Kailasam MT OrsquoConnor DT Schmid-Schonbein GW Parmer RJ2000 Plasma hydrogen peroxide production in human essentialhypertension role of heredity gender and ethnicity Hypertension36(5)878ndash884
Laval G Patin E Barreiro LB Quintana-Murci L 2010 Formulating a his-torical and demographic model of recent human evolution based onresequencing data from noncoding regions PLoS One 5(4)e10284
Lewontin R 1972 The apportionment of human diversity Evol Biol 6391ndash398
Lindblad-Toh K Garber M Zuk O et al (88 co-authors) 2011 A high-resolution map of human evolutionary constraint using 29 mam-mals Nature 478(7370)476ndash482
Lodge R Diallo TO Descoteaux A 2006 Leishmania donovani lipopho-sphoglycan blocks NADPH oxidase assembly at the phagosomemembrane Cell Microbiol 8(12)1922ndash1931
Magalhaes WC Rodrigues MR Silva D Soares-Souza G Iannini MLCerqueira GC Faria-Campos AC Tarazona-Santos E 2012DIVERGENOME a bioinformatics platform to assist population ge-netics and genetic epidemiology studies Genet Epidemiol36(4)360ndash367
Marth JD Grewal PK 2008 Mammalian glycosylation in immunity NatRev Immunol 8(11)874ndash887
McDonald JH 1998 Improved tests for heterogeneity across a region ofDNA sequence in the ratio of polymorphism to divergence Mol BiolEvol 15377ndash384
McDonald JH Kreitman M 1991 Adaptive protein evolution at the Adhlocus in Drosophila Nature 351(6328)652ndash654
Nielsen R Bustamante C Clark AG et al (12 co-authors) 2005 A scanfor positively selected genes in the genomes of humans and chim-panzees PLoS Biol 3(6)e170
Olsson LM Lindqvist AK Kallberg H Padyukov L Burkhardt HAlfredsson L Klareskog L Holmdahl R 2007 A case-control studyof rheumatoid arthritis identifies an associated single nucleotidepolymorphism in the NCF4 gene supporting a role for the
NADPH-oxidase complex in autoimmunity Arthritis Res Ther9(5)R98
Packer BR Yeager M Burdett L et al (13 co-authors) 2006SNP500Cancer a public resource for sequence validation assay de-velopment and frequency analysis for genetic variation in candidategenes Nucleic Acids Res 34(Database issue)D617ndashD621
Peter BM Huerta-Sanchez E Nielsen R 2012 Distinguishing betweenselective sweeps from standing variation and from a de novo mu-tation PLoS Genet 8(10)e1003011
Przeworski M Coop G Wall JD 2005 The signature of positive selectionon standing genetic variation Evolution 59(11)2312ndash2323
Ptak SE Przeworski M 2002 Evidence for population growth in humansis confounded by fine-scale population structure Trends Genet18(11)559ndash563
Ramensky V Bork P Sunyaev S 2002 Human non-synonymous SNPsserver and survey Nucleic Acids Res 30(17)3894ndash3900
Rioux JD Xavier RJ Taylor KD et al (24 co-authors) 2007 Genome-wideassociation study identifies new susceptibility loci for Crohn diseaseand implicates autophagy in disease pathogenesis Nat Genet39(5)596ndash604
Roberts RL Hollis-Moffatt JE Gearry RB Kennedy MA Barclay MLMerriman TR 2008 Confirmation of association of IRGM andNCF4 with ileal Crohnrsquos disease in a population-based cohortGenes Immun 9(6)561ndash565
Rozas J 2009 DNA sequence polymorphism analysis using DnaSPMethods Mol Biol 537337ndash350
San Jose G Fortuno A Beloqui O Dıez J Zalba G 2008 NADPH oxidaseCYBA polymorphisms oxidative stress and cardiovascular diseasesClin Sci (Lond) 114(3)173ndash182
Santiago HC Gonzalez Lombana CZ Macedo JP et al (12 co-authors)2012 NADPH phagocyte oxidase knockout mice controlTrypanosoma cruzi proliferation but develop circulatory collapseand succumb to infection PLoS Negl Trop Dis 6(2)e1492
Stephens M Scheet P 2005 Accounting for decay of linkage disequilib-rium in haplotype inference and missing-data imputation Am JHum Genet 76(3)449ndash462
Sumimoto H Miyano K Takeya R 2005 Molecular composition andregulation of the Nox family NAD(P)H oxidases Biochem BiophysRes Commun 338(1)677ndash686
Tajima F 1983 Evolutionary relationship of DNA sequences in finitepopulations Genetics 105(2)437ndash460
Tajima F 1989 Statistical method for testing the neutral mutationhypothesis by DNA polymorphism Genetics 123(3)585ndash595
Tarazona-Santos E Bernig T Burdett L Magalhaes WC Fabbri C Liao JRedondo RA Welch R Yeager M Chanock SJ 2008 CYBB anNADPH-oxidase gene restricted diversity in humans and evidencefor differential long-term purifying selection on transmembrane andcytosolic domains Hum Mutat 29(5)623ndash632
Taylor RM Burritt JB Baniulis D Foubert TR Lord CI Dinauer MCParkos CA Jesaitis AJ 2004 Site-specific inhibitors of NADPH oxi-dase activity and structural probes of flavocytochrome b character-ization of six monoclonal antibodies to the p22phox subunitJ Immunol 173(12)7349ndash7357
Voight BF Kudaravalli S Wen X Pritchard JK 2006 A map of recentpositive selection in the human genome PLoS Biol 4(3)e72
Watterson GA 1975 On the number of segregating sites in geneticalmodels without recombination Theor Popul Biol 7(2) 256ndash276
Williamson SH Hernandez R Fledel-Alon A Zhu L Nielsen RBustamante CD 2005 Simultaneous inference of selection and pop-ulation growth from patterns of variation in the human genomeProc Natl Acad Sci U S A 102(22)7882ndash7887
Yang Z 2007a Adaptive molecular evolution In Balding DJ Bishop MCannings C editors Handbook of statistical genetics Vol 1 3rd edSusex (United Kingdom) John Wiley amp Sons p 377ndash406
Yang Z 2007b PAML 4 phylogenetic analysis by maximum likelihoodMol Biol Evol 241586ndash1591
2167
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
FIG
3
Nat
ural
sele
ctio
nm
app
ing
acro
ssC
YBB
(en
codi
ng
gp91
)al
ong
mam
mal
ian
evol
utio
na
sid
enti
fied
usin
gth
ePA
ML
met
hod
byY
ang
(200
7a)
The
top
olog
ies
ofgp
91an
dp2
2ar
ere
pro
duce
dfr
omT
aylo
ret
al(
2004
Cop
yrig
ht20
04T
heA
mer
ican
Ass
ocia
tion
ofIm
mun
olog
ists
In
cU
sed
wit
hp
erm
issi
on)
Dar
kgr
ayam
ino
acid
sha
veev
olve
dun
der
pos
itiv
ese
lect
ion
wit
hgt
80
pro
babi
lity
Mos
tof
thes
eam
ino
acid
sar
ein
the
extr
acel
lula
rp
orti
onof
the
pro
tein
The
upp
erp
art
ofth
efig
ure
show
sth
ep
rote
inal
ign
men
tfo
rn
ine
mam
mal
sof
the
gp91
regi
onin
dica
ted
byth
ebl
ack
ellip
seI
nth
isre
gion
ahi
ghle
velo
fam
ino
acid
varia
tion
isfo
und
betw
een
spec
ies
and
seve
ralc
odon
ssh
owgt
1In
this
alig
nm
ent
gray
vert
ical
bars
corr
esp
ond
tova
riab
leam
ino
acid
site
sT
hep
rote
inal
ign
men
tof
mam
mal
ssh
ows
the
follo
win
gsp
ecie
sH
om(H
omo
sapi
ens
sapi
ens)
Pan
(Pan
trog
lody
tes)
Mac
(Mac
aca
mul
atta
)M
us(M
usm
uscu
lus)
Rat
(Rat
tus
norv
egic
us)
Cri
(Cri
cetu
lus
gris
eus)
Het
(Het
eroc
epha
lus
glab
er)
Cav
(Cav
iapo
rcel
lus)
an
dO
ry(O
ryct
olag
uscu
nicu
lus)
EC
ext
race
lula
ren
viro
nm
ent
TM
tra
nsm
embr
ane
laye
ran
dIC
in
trac
ellu
lar
envi
ron
men
t
2161
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
From the four studied genes NCF4 which encodes theregulatory protein p40-phox shows a pattern of diversitythat is typical for a gene that has evolved under neutralityIn addition to the features described in the previous para-graph the allelic spectra of NCF4 in the studied populationsare consistent with a neutral model of evolution (tables 2ndash4)
Although CYBB presented the most interesting evolution-ary history at the interspecific level with repeated episodes ofpositive natural selection recent human evolutionary historyhas resulted in interesting patterns of variation for CYBA andNCF2 CYBA encodes p22-phox which is a transmembraneprotein shared by different NADPH oxidases In addition toharboring two common nonsynonymous polymorphisms(V174A is also variable among different species of mammals)CYBA is the most variable and most affected by recombina-tion among the NADPH oxidase genes (tables 1 and 2) Inparticular CYBA diversity is very high in Europe comparedwith 329 genes resequenced in a European sample (httppgagswashingtonedusummary_statshtml last accessed July16 2013) CYBA ranks 11th (ie the 97th percentile)Moreover there are contrasting proportions of the totalnumber of polymorphismsnumber of singletons betweenAfricans (a low proportion) and Europeans (a high propor-tion) the latter showing an excess of common polymor-phisms with respect to the neutral expectation (see the DFL
test in table 4 and supplementary table S4 SupplementaryMaterial online) This excess of common variants inEuropeans is also significant when we conservatively testedit against a scenario of human evolution that incorporates theldquoOut of Africardquo bottleneck (Laval et al 2010) and the observedlevel of recombination in Europeans (CYBA = 807 for the se-quenced region) Because demographic forces and recombi-nation levels alone do not explain the high CYBA diversity andits excess of common polymorphisms we suggest that bal-ancing natural selection (that acts by maintaining differentalleles at high frequency in a population) has contributed to
shape the diversity of CYBA at least in the European popu-lation Indeed figure 4a shows that the haplotype network ofCYBA for the European population is consistent with theaction of balancing natural selection (Bamshad andWooding 2003) showing two well-differentiated commonclades that explain the observed high diversity and theexcess of common CYBA variants A comparative genomicanalysis confirms this inference the ratio of polymorphismsto differences fixed between human and chimpanzee is nothomogeneous along the gene in the different human popu-lations (Mc Donald 1998 supplementary table S5Supplementary Material online) as would be expectedunder neutral evolution Our inference of balancing naturalselection is consistent with the fact that 25ndash30 of the var-iation in levels of ROS production can be attributed to geneticfactors (Lacy et al 2000) and that ROS levels are associatedwith CYBA variants (Bedard et al 2009)
p67 encoded by NCF2 is a necessary cytosolic NADPHcomponent for phagocyte ROS production Asians show ahighly differentiated NCF2 haplotype structure (see frequen-cies of haplotypes NCF2-D11 and NCF2-E10 in the haplotypetables online and in the network shown in fig 4b) and thehighest FST values are observed in pairwise comparisons be-tween Asians and non-Asian populations (in particular withEuropeans table 3) and not between Africans and non-African populations as is usually observed in the humangenome We confirmed these results by analyzing data forNCF2 from the HapMap Project (supplementary materialSupplementary Material online) Moreover a trend towardan excess of rare polymorphisms exists in Asians that is notobserved elsewhere (tables 1 2 and 4 DFL =1904FFL =1893) Although we cannot exclude that this patternof diversity is compatible with the null hypotheses of neutral-ity and with the tested demographic history of human pop-ulations inferred by Laval et al (2010 tables 2ndash4) we canspeculate and envisage four additional evolutionary scenarios
Table 1 Allele Frequencies of Nonsynonymous Polymorphisms in NADPH Oxidase Genes
Genes rs Minor Allele(Amino Acid)
PolyphenPrediction
African European Asian Hispanic
CYBA
Y72H rs4673 T (Y) Possibly damaging 046 032 017 022
V174A rs1049254 T (V) Benign 017 048 048 018
NCF2
K181R rs2274064 G (R) Benign 035 037 041 048
T279M rs13306581 T (T) Probably damaging 000 000 005 000
V297A rs35937854 C (A) Benign 004 000 000 000
T361S Chr1181799289 NCBI36hg18 T (S) mdash 000 000 002 000
H389Q rs17849502 A (Q) Benign 000 005 000 007
R395W rs13306575 T (W) Possibly damaging 000 000 000 007
N419I rs35012521 T (I) Probably damaging 000 002 004 000
P454S rs55761650 T (S) mdash 000 000 000 002
L487S Chr1181795862 NCBI36hg18 C (S) mdash 002 000 000 000
NCF4
T85N rs112306225 A (N) mdash 000 000 002 000
A304E rs5995361 A (E) Benign 004 000 000 000
2162
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
that may have contributed to shape the pattern of NCF2diversity in Asians 1) the observed trend is suggestive of aselective sweep on NCF2 standing variation a neutral orweakly deleterious existing variant becomes beneficial andrapidly increases in frequency (together with its associatedhaplotypes ie incomplete sweep) reducing the nucleotidediversity in the surrounding region and rendering other stand-ing substitutions rare During this process new rare substitu-tions appear in the expanding positively selected haplotype2) The pattern of diversity of NCF2 in Asia may result from anincomplete selective sweep acting on ARPC5 which is locatedapproximately 35 kb downstream of NCF2 In a genome-widescan for recent positive selection Voight et al (2006) identi-fied a strong signature of an incomplete sweep for ARPC5 inAsia (P = 0009) characterized by a higher than expected long-range linkage disequilibrium summarized by very high iHS
statistics (see Haplotter results for the HapMap II data athttphaplotteruchicagoedu last accessed July 16 2013)SNPs in NCF2 also presents high iHS statistics (P = 002) al-though values are lower than for ARPC5 3) The differentiatedpattern of diversity of NCF2 in Asia may also have been gen-erated without the action of natural selection during the firstcolonization of Asia by modern humans In a process of geo-graphic population expansion specifically in the front wave ofthe expansion some rare alleleshaplotypes (ie surfing al-leles) may become common by chance mimicking the pat-tern of diversity generated by a selective sweep (Excoffier andRay 2008) 4) The excess of rare variants may be an artifact ofpooling individuals from different populations (Ptak andPrzeworski 2002) Consistent with evolutionary scenarios 1ndash4 that produce similar patterns of diversity the haplotypenetwork of NCF2 for Eurasians (fig 4b) shows the following1) a large differentiation between Asians and Europeans thatis compatible with the high observed FST values and 2) a star-like shape associated with the haplotype NCF2-E10 that iscommon in Asia and rare elsewhere which is compatiblewith the excess of rare alleles in the Asian populations
DiscussionBy analyzing 29 mammalian genomes and four human pop-ulations we show in this study that natural selection hasacted in different ways over time to shape the pattern ofdiversity of the phagocyte NADPH oxidase genes At the tem-poral scale of the evolution of mammals we have inferredrecurrent episodes of positive selection acting on the extra-cellular portion of gp-91 that have been important to shapethe pattern of interspecific diversity of this gene Our in-terspecific analyses did not show a similar pattern of naturalselection in any of the other phagocyte NADPH oxidasegenes Even if current knowledge on the biology of NADPHdoes not allow us to interpret our results in terms of functionwe propose that the extracellular region of gp-91 is function-ally relevant Our results also imply that this region is highlydifferentiated among mammals at the protein level and thisvariability should be considered when mammals models areused to study the structure and function of phagocyteNADPH components
In the time scale of human evolution our analyses of theNADPH oxidase genes suggest that CYBA has been a target ofbalancing natural selection Because we do not have evidenceof population-specific variants that faced selective pressurethe inferred natural selection may have acted on a standingvariation in ancestral populations This implies that the selec-tive pressure began after the appearance of the variant andpossibly acted in a specific geographic region (Barret andSchluter 2008) The signatures of natural selection acting ona new mutation and on standing variation differ In the case ofselective sweeps episodes of natural selection on standingvariation are associated to a larger variance in the allelic spec-trum with respect to natural selection on a new mutationAlso selection on standing variation may produce an excess ofalleles at intermediate frequencies that is not associated withhigh nucleotide diversity (Przeworski et al 2005 Peter et al2012) This pattern contrasts with the effect of balancing
Table 2 Intrapopulation Diversity Indexes in the Studied Populationsfor the NADPH Oxidase Genes Obtained from Resequencing Dataa
African European Asian Hispanic
Number of chromosomes
CYBBb 42 52 34 36
CYBA 48 62 48 46
NCF2 48 62 48 46
NCF4 48 62 48 46
Segregating sitessingletons
CYBB 218 70 103 135
CYBA 6122 333 335 347
NCF2 4613 3311 2816 3714
NCF4 4512 267 191 308
Haplotype structureNumber of inferred haplotypesc
CYBB 14 5 7 12
CYBA 39 39 32 26
NCF2 38 32 18 31
NCF4 36 30 17 22
Haplotype diversity plusmn SD
CYBB 088 plusmn 003 034 plusmn 008 053 plusmn 010 070 plusmn 008
CYBA 098 plusmn 001 098 plusmn 001 097 plusmn 000 096 plusmn 002
NCF2 099 plusmn 001 096 plusmn 001 087 plusmn 004 097 plusmn 001
NCF4 098 plusmn 001 093 plusmn 002 093 plusmn 002 093 plusmn 002
Recombination parameter (q 103 per site)c
CYBB 008 lt001 lt001 lt002
CYBA 291 148 096 088
NCF2 052 038 010 058
NCF4 137 103 039 022
h estimatorsd
p 103 per site
CYBB 036 012 015 028
CYBA 190 163 151 157
NCF2 081 047 043 057
NCF4 101 076 064 084
hW 103 per site
CYBB 042 013 021 027
CYBA 231 118 125 13
NCF2 102 069 062 083
NCF4 101 070 054 087
aMost analyses were performed using software DnaSP (Rozas 2009)bData for CYBB (Xp211) are from Tarazona-Santos et al (2008)cHaplotypes and inferred using the method by Stephens and Scheet (2005) andthe software PHASEd Tajima (1983) W Watterson (1975)
2163
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
natural selection which produces an excess of common allelesassociated with high genetic diversity Thus the observed pat-tern of CYBA diversity in Europeans is not consistent with aselective sweep on a standing variation but it is consistent witha scenario of balancing selection acting on standing variation
If we consider for CYBA that heterozygote advantage maybe the mechanisms of balancing selection we can speculatethat the biological basis for this mechanism may be the fol-lowing considering that p22-phox is not exclusive of thephagocyte NADPH oxidase but it is also part of Nox com-plexes expressed in other tissues the dependence of ROSproduction on CYBA variants has to be finely regulated IfCYBA variants induce high levels of ROS these variants mayfavor a phagocyte-dependent efficient response to pathogensbut may damage other endothelial tissues Alternativelytissue oxidative damage does not occur if ROS productionis low but this response may be associated with a weaker
phagocyte respiratory burst against pathogens In this con-text heterozygote individuals with a CYBA-dependent inter-mediate level of ROS production may have been favored bynatural selection
Our results contribute to the discussion regarding the rel-evance of balancing selection in shaping the diversity ofinnate immunity genes (Ferrer-Admetlla et al 2008 Barreiroet al 2009) Ferrer-Admetlla et al (2008) have associated therecurrent signatures of balancing selection on inflammatorygenes with the need for fine regulation CYBA is the onlyphagocytic NADPH oxidase gene that also encodesnonphagocyte Nox components thus CYBA has a role inROS cell signaling a potentially dangerous process due toits capability to produce oxidative damage to tissues Geneswith these characteristics likely need even tighter regulationThe interplay between pathogen-driven selective pressureon innate immunity genes and their concomitantnonimmunological functions is complex In addition toCYBA other interesting examples of this interplay can befound among the 10 human toll-like receptors (TLRs) thatshow a variety of signatures of natural selection TLR8 whichshows the strongest signature of purifying selection is alsoinvolved in neuronal development (Barreiro et al 2009) and itis difficult to discriminate the role of each function in deter-mining the observed signature of natural selection
With few exceptions the pathogens responsible for naturalselection on immune genes are difficult to specify In the caseof NADPH oxidase we can infer based on the spectrum ofinfections in CGD patients that catalase-positive bacteria andfungi such as Staphylococci Salmonella Candida Aspergillusand M tuberculosis may be the selective pathogensInteractions between the host and pathogens also includemechanisms of the latter to impair the respiratory burst ofthe former For example Leishmania donovani blocks the as-sembly of NADPH oxidase at the phagosome membrane(Lodge et al 2006) These mechanisms may constitute selec-tive pressures imposed by pathogens
The associations reported in GWAS between rs4821544 inNCF4 and Crohnrsquos disease (an idiopathic inflammatory boweldisease that predominantly involves the ileum and colonRioux et al 2007) and between rs10911363 in NCF2 and
Table 3 Pairwise FST Genetic Distances between Populations
CYBBa CYBA
Africa Europe Asia Hispanic Africa Europe Asia Hispanic
Africa mdash 0316 0264 0092 mdash 0074 0083 0065
Europe 0257 mdash 0000 0107 mdash 0002 0056
Asia 0211 0000 mdash 0073 mdash 0054
Hispanic 0070 0082 0056 mdash mdash
NCF2 NCF4
Africa mdash 0048 0058 0037 mdash 0128 0136 0158
Europe mdash 0069 0000 mdash 0026 0026
Asia mdash 0059 mdash 0005
Hispanic mdash mdash
aFor CYBB FST estimators are above the diagonal Below the diagonal are the FST values corrected as if the effective population sizes of X chromosome genes were equal toautosomal ones
Table 4 Results of Neutrality Tests for the NADPH Oxidase Genesand Their Significancea
African European Asian Hispanic
Tajimarsquos D
CYBB 0473 0274 0813 0084
CYBA 0580 1242 0684 0707
NCF2 0412 0939 0883 0987
NCF4 0750 0243 0556 0102
Fu and Lirsquos D
CYBB 1050 1110 0395 0977
CYBA 1485 1734 1188 0138
NCF2 0407 1105 1904 1398
NCF4 0308 0443 1893 0266
Fu and Lirsquos F
CYBB 0980 0811 0114 0814
CYBA 1382 1893 1205 0405
NCF2 0512 1252 1893 1567
NCF4 0557 0231 1225 0248
NOTEmdashUnderlined values represent significant results under the demographic modelinferred by Laval et al (2010) for human populations See details in supplementarytable S4 Supplementary Material onlineaThe McDonaldndashKreitman test is nonsignificant in any of the cases
Significant under the WrightndashFisher model of constant population size
2164
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
systemic lupus erythematosus (Cunninghame Graham et al2011) confirm the involvement of NADPH genes in the path-ogenesis of inflammatory-related common diseases Ourclaim that natural selection acted on CYBA (and maybe inNCF2) is relevant for biomedical studies because combiningevidence of natural selection with association analyses inimmune genes increases the statistical power to detect dis-ease-associated variants (Ayodo et al 2007) As a new gener-ation of association studies focusing on rare variation isemerging the combination of genes deemed interestingfrom GWAS and populations with an excess of rare variantsin these genes such as NCF2 in Asians are particularly inter-esting as a source of rare variants with clinical relevanceFinally by determining through molecular evolution mappingthat the extracellular portion of gp91 (encoded by CYBB inthe X-chromosome) has been subject to recurrent episodes ofpositive selection at the scale of mammals evolution we positthe hypothesis that this portion of the NADPH oxidase isrelevant for currently unknown biological processes thatonce revealed by structural and functional investigationswill contribute to understanding the role of NADPH oxidasein infectious autoimmune and cardiovascular diseases
Materials and Methods
Molecular Evolution Analysis of NADPH Genes
We used the maximum likelihood framework developed byYang (2007a) to estimate for the NADPH oxidase genes
under a variety of evolutionary models as implemented in thePAML software (Yang 2007b) This approach allows infer-ences about the evolution of a coding region along aninter-specific phylogeny mapping the codons that haveevolved under strongweak purifying selection neutrality oradaptive positive selection Further details about these anal-yses are available as supplementary material SupplementaryMaterial online
Human Population Genetics of NADPH Genes
For human population genetics analyses we conducted bidi-rectional Sanger sequencing of CYBB CYBA NCF2 and NCF4for a total of 35242 bp for each of 102 healthy individuals aspart of the SNP500 Cancer project (Packer et al 2006 seesupplementary fig S1 Supplementary Material online for de-tails) Human population resequencing data for CYBB werepublished in Tarazona-Santos et al (2008) The resequencingexperiments were performed as in Packer et al (2006) These102 unrelated individuals include the following 24 of Africanancestry (15 African Americans from the United States and 9Pygmies) 23 admixed Latin Americans (from Mexico PuertoRico and South America) 31 Europeans (from the CEPHUTAH pedigree and the NIEHS Environmental GenomeProject) and 24 AsiansndashOceanians (from Melanesia PakistanChina Cambodia Japan and Taiwan)
After controlling for multiple tests we confirmed that allSNPs fit the HardyndashWeinberg equilibrium by the Guo and
CYBA - A16
CYBA - A3
Y72H - rs4673
V174A - rs1049254 CYBA - E18
(a)
NCF2 - E10 H389Q - rs17849502
NCF2 - B3
NCF2 - D11
NCF2 - C1K181R - rs2274064(b)
FIG 4 Phylogenetic networks of (a) CYBA in Europeans and (b) NCF2 in Europeans (black) and Asians (yellow) The lengths of the branches areproportional to the number of mutations Only nonsynonymous mutations are shown The haplotype names correspond to the table of inferredhaplotypes in the supplementary material Supplementary Material online We only show the names of haplotypes with a frequency gt5 Ancestralhaplotypes (inferred as being the human haplotype or median vector most similar to the chimpanzee sequence) are indicated by an arrow Medianvectors are in red
2165
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Thompson (1992) test which was implemented in the soft-ware Arlequin 30 (Excoffier et al 2007) Insertion-deletions(INDELs) were excluded from population genetics analysesTo assess intrapopulation diversity we used two statistics the per-site mean number of pairwise differences betweensequences (Tajima 1983) and w based on the number ofsegregating sites (S) (Watterson 1975) We measured pairwisebetween-populations diversity by using the FST statistics cal-culated using the software DnaSP (Rozas 2009)
Haplotypes and the recombination parameter were in-ferred using the PHASE software (Stephens and Scheet 2005)and diversity indexes calculations (tables 2 and 3) as well asneutrality statistics (table 4) were estimated using DnaSP soft-ware We applied two kinds of neutrality tests 1) tests basedon the allelic spectrum which is the distribution of polymor-phisms across different classes of frequencies namelyTajimarsquos DT (Tajima 1989) and FundashLirsquos DFL and FFL (Fu andLi 1993) and 2) tests based on comparisons between thenumber of polymorphisms in human populations and fixeddifferences with the chimpanzee (ie outgroup) namely theMcDonald and Kreitman (1991) test and the adaptedKolmogorovndashSmirnoff test by McDonald (1998) For thefirst set of tests we used as null hypotheses both the classicWrightndashFisher model of neutrality with a constant popula-tion size as well as the more realistic evolutionary scenario forhuman populations inferred by Laval et al (2010) In the caseof the scenario of Laval et al (2010) we ignored interconti-nental gene flow within the Old World because these raregene flow events likely does not affect the level of significanceof the neutrality tests given its very low inferred values(13 105) Null distributions used to test the significanceof the neutrality tests under these evolutionary scenarios weregenerated using coalescent simulations and a significancelevel of 005 (Hudson 2002) The KolmogorovndashSmirnoff testof neutrality adapted by McDonald (1998) was performedusing Slider software available at httpudeledu~mcdonaldaboutdnasliderhtml (last accessed July 16 2013) Weperformed coalescent simulations using ms software(Hudson 2002) Further methodological details are availableas supplementary material Supplementary Material onlineWe constructed the CYBA and NCF2 networks using allSNP variants and applying the Median joining algorithmand the maximum parsimony option calculations as imple-mented in the software Network 46 (Bandelt et al 1999)
Supplementary MaterialSupplementary tables S1ndashS5 and figure S1 are available atMolecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)
Acknowledgments
SJC conceived the study ET-S MM FL RC AC CF LPand BS generated the dataanalyzed the sequences LB ACWCSM and MY provided tools for data generationand analyses ET-S MM FL WCSM and RR performedpopulation genetics analyses ET-S and MM prepared tablesand figures ET-S and SJC wrote the manuscript All the
authors contributed with discussions about the results Theauthors are grateful to Silvia Fuselli for discussions of ourresults to Fernando Levi Soares for technical assistance withthe figures and the Sequencing Group of the CoreGenotyping Facility (National Cancer Institute) for their tech-nical assistance This work was supported by the IntramuralResearch Program of the National Cancer Institute (NCI) andNational Institutes of Health (NIH) CF was supported by theUniversity of Bologna ET-S MM WCSM and FL weresupported by the Fogarty International Center and NationalCancer Institute (1R01TW007894) Conselho Nacional deDesenvolvimento Cientıfico e Tecnologico (Brazil Grant3075562012-3) Fundacao de Amparo a Pesquisa de MinasGerais (Brazil CBB PPM 0026512) and the Brazilian Ministryof Education (CAPES Agency grant PNPD 0256709-1)
ReferencesAyodo G Price AL Keinan A Ajwang A Otieno MF Orago AS
Patterson N Reich D 2007 Combining evidence of natural selectionwith association analysis increases power to detect malaria-resis-tance variants Am J Hum Genet 81(2)234ndash242
Bamshad M Wooding SP 2003 Signatures of natural selection in thehuman genome Nat Rev Genet 4(2)99ndash111
Bandelt H-J Forster P Rohl A 1999 Median-joining networks for infer-ring intraspecific phylogenies Mol Biol Evol 16(1)37ndash48
Barreiro LB Ben-Ali M Quach H et al (17 co-authors) 2009Evolutionary dynamics of human Toll-like receptors and their dif-ferent contributions to host defense PLoS Genet 5(7)e1000562
Barreiro LB Quintana-Murci L 2010 From evolutionary genetics tohuman immunology how selection shapes host defence genesNat Rev Genet 11(1)17ndash30
Barrett RD Schluter D 2008 Adaptation from standing genetic varia-tion Trends Ecol Evol 23(1)38ndash44
Bedard K Attar H Bonnefont J Jaquet V Borel C Plastre O Stasia MJAntonarakis SE Krause KH 2009 Three common polymorphisms inthe CYBA gene form a haplotype associated with decreased ROSgeneration Hum Mutat 30(7)1123ndash1133
Blekhman R Man O Herrmann L Boyko AR Indap A Kosiol CBustamante CD Teshima KM Przeworski M 2008 Natural selectionon genes that underlie human disease susceptibility Curr Biol18(12)883ndash889
Brandes RP Kreuzer J 2005 Vascular NADPH oxidases molecular mech-anisms of activation Cardiovasc Res 65(1)16ndash27
Buckley RH 2004 Pulmonary complications of primary immunodefi-ciencies Paediatr Respir Rev 5(Suppl A)S225ndashS233
Bustamante CD Fledel-Alon A Williamson S et al (13 co-authors)2005 Natural selection on protein-coding genes in the humangenome Nature 437(7062)1153ndash1157
Bustamante J Arias AA Vogt G et al (29 co-authors) 2011 GermlineCYBB mutations that selectively affect macrophages in kindredswith X-linked predisposition to tuberculous mycobacterial diseaseNat Immunol 12(3)213ndash221
Campbell MC Tishkoff SA 2008 African genetic diversity implicationsfor human demographic history modern human origins and com-plex disease mapping Annu Rev Genomics Hum Genet 9403ndash433
Chanock SJ Roesler J Zhan S Hopkins P Lee P Barrett DT ChristensenBL Curnutte JT Gorlach A 2000 Genomic structure of the humanp47-phox (NCF1) gene Blood Cells Mol Dis 26(1)37ndash46
Cunninghame Graham DS Morris DL Bhangale TR Criswell LASyvanen AC Ronnblom L Behrens TW Graham RR Vyse TJ2011 Association of NCF2 IKZF1 IRF8 IFIH1 and TYK2 with sys-temic lupus erythematosus PLoS Genet 7(10)e1002341
Excoffier L Laval G Schneider S 2007 Arlequin ver 30 an integratedsoftware package for population genetics data analysis EvolBioinform Online 147ndash50
2166
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Excoffier L Ray N 2008 Surfing during population expansions promotesgenetic revolutions and structuration Trends Ecol Evol23(7)347ndash351
Ferrer-Admetlla A Bosch E Sikora M Marques-Bonet T Ramırez-SorianoA Muntasell A Navarro A Lazarus R Calafell F Bertranpetit J Casals F2008 Balancing selection is the main force shaping the evolution ofinnate immunity genes J Immunol 181(2)1315ndash1322
Fu YX Li WH 1993 Statistical tests of neutrality of mutations Genetics133(3)693ndash709
The 1000 Genomes Project Consortium 2010 A map of humangenome variation from population-scale sequencing Nature467(7319)1061ndash1073
The 1000 Genomes Project Consortium Abecasis GR Auton A BrooksLD DePristo MA Durbin RM Handsaker RE Kang HM Marth GTMcVean GA 2012 An integrated map of genetic variation from1092 human genomes Nature 491(7422)56ndash65
Guo SW Thompson EA 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles Biometrics 48(2)361ndash372
Heyworth PG Cross AR Curnutte JT 2003 Chronic granulomatousdisease Curr Opin Immunol 15(5)578ndash584
Hudson RR 2002 Generating samples under a Wright-Fisher neutralmodel of genetic variation Bioinformatics 18(2)337ndash338
Jacob CO Eisenstein M Dinauer MC et al (21 co-authors) 2012 Lupus-associated causal mutation in neutrophil cytosolic factor 2 (NCF2)brings unique insights to the structure and function of NADPHoxidase Proc Natl Acad Sci U S A 109(2)E59ndashE67
Kimura M 1974 The neutral theory of molecular evolution CambridgeCambridge University Press
Kosiol C Vinar T da Fonseca RR Hubisz MJ Bustamante CD Nielsen RSiepel A 2008 Patterns of positive selection in six mammalian ge-nomes PLoS Genet 4(8)e1000144
Lacy F Kailasam MT OrsquoConnor DT Schmid-Schonbein GW Parmer RJ2000 Plasma hydrogen peroxide production in human essentialhypertension role of heredity gender and ethnicity Hypertension36(5)878ndash884
Laval G Patin E Barreiro LB Quintana-Murci L 2010 Formulating a his-torical and demographic model of recent human evolution based onresequencing data from noncoding regions PLoS One 5(4)e10284
Lewontin R 1972 The apportionment of human diversity Evol Biol 6391ndash398
Lindblad-Toh K Garber M Zuk O et al (88 co-authors) 2011 A high-resolution map of human evolutionary constraint using 29 mam-mals Nature 478(7370)476ndash482
Lodge R Diallo TO Descoteaux A 2006 Leishmania donovani lipopho-sphoglycan blocks NADPH oxidase assembly at the phagosomemembrane Cell Microbiol 8(12)1922ndash1931
Magalhaes WC Rodrigues MR Silva D Soares-Souza G Iannini MLCerqueira GC Faria-Campos AC Tarazona-Santos E 2012DIVERGENOME a bioinformatics platform to assist population ge-netics and genetic epidemiology studies Genet Epidemiol36(4)360ndash367
Marth JD Grewal PK 2008 Mammalian glycosylation in immunity NatRev Immunol 8(11)874ndash887
McDonald JH 1998 Improved tests for heterogeneity across a region ofDNA sequence in the ratio of polymorphism to divergence Mol BiolEvol 15377ndash384
McDonald JH Kreitman M 1991 Adaptive protein evolution at the Adhlocus in Drosophila Nature 351(6328)652ndash654
Nielsen R Bustamante C Clark AG et al (12 co-authors) 2005 A scanfor positively selected genes in the genomes of humans and chim-panzees PLoS Biol 3(6)e170
Olsson LM Lindqvist AK Kallberg H Padyukov L Burkhardt HAlfredsson L Klareskog L Holmdahl R 2007 A case-control studyof rheumatoid arthritis identifies an associated single nucleotidepolymorphism in the NCF4 gene supporting a role for the
NADPH-oxidase complex in autoimmunity Arthritis Res Ther9(5)R98
Packer BR Yeager M Burdett L et al (13 co-authors) 2006SNP500Cancer a public resource for sequence validation assay de-velopment and frequency analysis for genetic variation in candidategenes Nucleic Acids Res 34(Database issue)D617ndashD621
Peter BM Huerta-Sanchez E Nielsen R 2012 Distinguishing betweenselective sweeps from standing variation and from a de novo mu-tation PLoS Genet 8(10)e1003011
Przeworski M Coop G Wall JD 2005 The signature of positive selectionon standing genetic variation Evolution 59(11)2312ndash2323
Ptak SE Przeworski M 2002 Evidence for population growth in humansis confounded by fine-scale population structure Trends Genet18(11)559ndash563
Ramensky V Bork P Sunyaev S 2002 Human non-synonymous SNPsserver and survey Nucleic Acids Res 30(17)3894ndash3900
Rioux JD Xavier RJ Taylor KD et al (24 co-authors) 2007 Genome-wideassociation study identifies new susceptibility loci for Crohn diseaseand implicates autophagy in disease pathogenesis Nat Genet39(5)596ndash604
Roberts RL Hollis-Moffatt JE Gearry RB Kennedy MA Barclay MLMerriman TR 2008 Confirmation of association of IRGM andNCF4 with ileal Crohnrsquos disease in a population-based cohortGenes Immun 9(6)561ndash565
Rozas J 2009 DNA sequence polymorphism analysis using DnaSPMethods Mol Biol 537337ndash350
San Jose G Fortuno A Beloqui O Dıez J Zalba G 2008 NADPH oxidaseCYBA polymorphisms oxidative stress and cardiovascular diseasesClin Sci (Lond) 114(3)173ndash182
Santiago HC Gonzalez Lombana CZ Macedo JP et al (12 co-authors)2012 NADPH phagocyte oxidase knockout mice controlTrypanosoma cruzi proliferation but develop circulatory collapseand succumb to infection PLoS Negl Trop Dis 6(2)e1492
Stephens M Scheet P 2005 Accounting for decay of linkage disequilib-rium in haplotype inference and missing-data imputation Am JHum Genet 76(3)449ndash462
Sumimoto H Miyano K Takeya R 2005 Molecular composition andregulation of the Nox family NAD(P)H oxidases Biochem BiophysRes Commun 338(1)677ndash686
Tajima F 1983 Evolutionary relationship of DNA sequences in finitepopulations Genetics 105(2)437ndash460
Tajima F 1989 Statistical method for testing the neutral mutationhypothesis by DNA polymorphism Genetics 123(3)585ndash595
Tarazona-Santos E Bernig T Burdett L Magalhaes WC Fabbri C Liao JRedondo RA Welch R Yeager M Chanock SJ 2008 CYBB anNADPH-oxidase gene restricted diversity in humans and evidencefor differential long-term purifying selection on transmembrane andcytosolic domains Hum Mutat 29(5)623ndash632
Taylor RM Burritt JB Baniulis D Foubert TR Lord CI Dinauer MCParkos CA Jesaitis AJ 2004 Site-specific inhibitors of NADPH oxi-dase activity and structural probes of flavocytochrome b character-ization of six monoclonal antibodies to the p22phox subunitJ Immunol 173(12)7349ndash7357
Voight BF Kudaravalli S Wen X Pritchard JK 2006 A map of recentpositive selection in the human genome PLoS Biol 4(3)e72
Watterson GA 1975 On the number of segregating sites in geneticalmodels without recombination Theor Popul Biol 7(2) 256ndash276
Williamson SH Hernandez R Fledel-Alon A Zhu L Nielsen RBustamante CD 2005 Simultaneous inference of selection and pop-ulation growth from patterns of variation in the human genomeProc Natl Acad Sci U S A 102(22)7882ndash7887
Yang Z 2007a Adaptive molecular evolution In Balding DJ Bishop MCannings C editors Handbook of statistical genetics Vol 1 3rd edSusex (United Kingdom) John Wiley amp Sons p 377ndash406
Yang Z 2007b PAML 4 phylogenetic analysis by maximum likelihoodMol Biol Evol 241586ndash1591
2167
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
From the four studied genes NCF4 which encodes theregulatory protein p40-phox shows a pattern of diversitythat is typical for a gene that has evolved under neutralityIn addition to the features described in the previous para-graph the allelic spectra of NCF4 in the studied populationsare consistent with a neutral model of evolution (tables 2ndash4)
Although CYBB presented the most interesting evolution-ary history at the interspecific level with repeated episodes ofpositive natural selection recent human evolutionary historyhas resulted in interesting patterns of variation for CYBA andNCF2 CYBA encodes p22-phox which is a transmembraneprotein shared by different NADPH oxidases In addition toharboring two common nonsynonymous polymorphisms(V174A is also variable among different species of mammals)CYBA is the most variable and most affected by recombina-tion among the NADPH oxidase genes (tables 1 and 2) Inparticular CYBA diversity is very high in Europe comparedwith 329 genes resequenced in a European sample (httppgagswashingtonedusummary_statshtml last accessed July16 2013) CYBA ranks 11th (ie the 97th percentile)Moreover there are contrasting proportions of the totalnumber of polymorphismsnumber of singletons betweenAfricans (a low proportion) and Europeans (a high propor-tion) the latter showing an excess of common polymor-phisms with respect to the neutral expectation (see the DFL
test in table 4 and supplementary table S4 SupplementaryMaterial online) This excess of common variants inEuropeans is also significant when we conservatively testedit against a scenario of human evolution that incorporates theldquoOut of Africardquo bottleneck (Laval et al 2010) and the observedlevel of recombination in Europeans (CYBA = 807 for the se-quenced region) Because demographic forces and recombi-nation levels alone do not explain the high CYBA diversity andits excess of common polymorphisms we suggest that bal-ancing natural selection (that acts by maintaining differentalleles at high frequency in a population) has contributed to
shape the diversity of CYBA at least in the European popu-lation Indeed figure 4a shows that the haplotype network ofCYBA for the European population is consistent with theaction of balancing natural selection (Bamshad andWooding 2003) showing two well-differentiated commonclades that explain the observed high diversity and theexcess of common CYBA variants A comparative genomicanalysis confirms this inference the ratio of polymorphismsto differences fixed between human and chimpanzee is nothomogeneous along the gene in the different human popu-lations (Mc Donald 1998 supplementary table S5Supplementary Material online) as would be expectedunder neutral evolution Our inference of balancing naturalselection is consistent with the fact that 25ndash30 of the var-iation in levels of ROS production can be attributed to geneticfactors (Lacy et al 2000) and that ROS levels are associatedwith CYBA variants (Bedard et al 2009)
p67 encoded by NCF2 is a necessary cytosolic NADPHcomponent for phagocyte ROS production Asians show ahighly differentiated NCF2 haplotype structure (see frequen-cies of haplotypes NCF2-D11 and NCF2-E10 in the haplotypetables online and in the network shown in fig 4b) and thehighest FST values are observed in pairwise comparisons be-tween Asians and non-Asian populations (in particular withEuropeans table 3) and not between Africans and non-African populations as is usually observed in the humangenome We confirmed these results by analyzing data forNCF2 from the HapMap Project (supplementary materialSupplementary Material online) Moreover a trend towardan excess of rare polymorphisms exists in Asians that is notobserved elsewhere (tables 1 2 and 4 DFL =1904FFL =1893) Although we cannot exclude that this patternof diversity is compatible with the null hypotheses of neutral-ity and with the tested demographic history of human pop-ulations inferred by Laval et al (2010 tables 2ndash4) we canspeculate and envisage four additional evolutionary scenarios
Table 1 Allele Frequencies of Nonsynonymous Polymorphisms in NADPH Oxidase Genes
Genes rs Minor Allele(Amino Acid)
PolyphenPrediction
African European Asian Hispanic
CYBA
Y72H rs4673 T (Y) Possibly damaging 046 032 017 022
V174A rs1049254 T (V) Benign 017 048 048 018
NCF2
K181R rs2274064 G (R) Benign 035 037 041 048
T279M rs13306581 T (T) Probably damaging 000 000 005 000
V297A rs35937854 C (A) Benign 004 000 000 000
T361S Chr1181799289 NCBI36hg18 T (S) mdash 000 000 002 000
H389Q rs17849502 A (Q) Benign 000 005 000 007
R395W rs13306575 T (W) Possibly damaging 000 000 000 007
N419I rs35012521 T (I) Probably damaging 000 002 004 000
P454S rs55761650 T (S) mdash 000 000 000 002
L487S Chr1181795862 NCBI36hg18 C (S) mdash 002 000 000 000
NCF4
T85N rs112306225 A (N) mdash 000 000 002 000
A304E rs5995361 A (E) Benign 004 000 000 000
2162
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
that may have contributed to shape the pattern of NCF2diversity in Asians 1) the observed trend is suggestive of aselective sweep on NCF2 standing variation a neutral orweakly deleterious existing variant becomes beneficial andrapidly increases in frequency (together with its associatedhaplotypes ie incomplete sweep) reducing the nucleotidediversity in the surrounding region and rendering other stand-ing substitutions rare During this process new rare substitu-tions appear in the expanding positively selected haplotype2) The pattern of diversity of NCF2 in Asia may result from anincomplete selective sweep acting on ARPC5 which is locatedapproximately 35 kb downstream of NCF2 In a genome-widescan for recent positive selection Voight et al (2006) identi-fied a strong signature of an incomplete sweep for ARPC5 inAsia (P = 0009) characterized by a higher than expected long-range linkage disequilibrium summarized by very high iHS
statistics (see Haplotter results for the HapMap II data athttphaplotteruchicagoedu last accessed July 16 2013)SNPs in NCF2 also presents high iHS statistics (P = 002) al-though values are lower than for ARPC5 3) The differentiatedpattern of diversity of NCF2 in Asia may also have been gen-erated without the action of natural selection during the firstcolonization of Asia by modern humans In a process of geo-graphic population expansion specifically in the front wave ofthe expansion some rare alleleshaplotypes (ie surfing al-leles) may become common by chance mimicking the pat-tern of diversity generated by a selective sweep (Excoffier andRay 2008) 4) The excess of rare variants may be an artifact ofpooling individuals from different populations (Ptak andPrzeworski 2002) Consistent with evolutionary scenarios 1ndash4 that produce similar patterns of diversity the haplotypenetwork of NCF2 for Eurasians (fig 4b) shows the following1) a large differentiation between Asians and Europeans thatis compatible with the high observed FST values and 2) a star-like shape associated with the haplotype NCF2-E10 that iscommon in Asia and rare elsewhere which is compatiblewith the excess of rare alleles in the Asian populations
DiscussionBy analyzing 29 mammalian genomes and four human pop-ulations we show in this study that natural selection hasacted in different ways over time to shape the pattern ofdiversity of the phagocyte NADPH oxidase genes At the tem-poral scale of the evolution of mammals we have inferredrecurrent episodes of positive selection acting on the extra-cellular portion of gp-91 that have been important to shapethe pattern of interspecific diversity of this gene Our in-terspecific analyses did not show a similar pattern of naturalselection in any of the other phagocyte NADPH oxidasegenes Even if current knowledge on the biology of NADPHdoes not allow us to interpret our results in terms of functionwe propose that the extracellular region of gp-91 is function-ally relevant Our results also imply that this region is highlydifferentiated among mammals at the protein level and thisvariability should be considered when mammals models areused to study the structure and function of phagocyteNADPH components
In the time scale of human evolution our analyses of theNADPH oxidase genes suggest that CYBA has been a target ofbalancing natural selection Because we do not have evidenceof population-specific variants that faced selective pressurethe inferred natural selection may have acted on a standingvariation in ancestral populations This implies that the selec-tive pressure began after the appearance of the variant andpossibly acted in a specific geographic region (Barret andSchluter 2008) The signatures of natural selection acting ona new mutation and on standing variation differ In the case ofselective sweeps episodes of natural selection on standingvariation are associated to a larger variance in the allelic spec-trum with respect to natural selection on a new mutationAlso selection on standing variation may produce an excess ofalleles at intermediate frequencies that is not associated withhigh nucleotide diversity (Przeworski et al 2005 Peter et al2012) This pattern contrasts with the effect of balancing
Table 2 Intrapopulation Diversity Indexes in the Studied Populationsfor the NADPH Oxidase Genes Obtained from Resequencing Dataa
African European Asian Hispanic
Number of chromosomes
CYBBb 42 52 34 36
CYBA 48 62 48 46
NCF2 48 62 48 46
NCF4 48 62 48 46
Segregating sitessingletons
CYBB 218 70 103 135
CYBA 6122 333 335 347
NCF2 4613 3311 2816 3714
NCF4 4512 267 191 308
Haplotype structureNumber of inferred haplotypesc
CYBB 14 5 7 12
CYBA 39 39 32 26
NCF2 38 32 18 31
NCF4 36 30 17 22
Haplotype diversity plusmn SD
CYBB 088 plusmn 003 034 plusmn 008 053 plusmn 010 070 plusmn 008
CYBA 098 plusmn 001 098 plusmn 001 097 plusmn 000 096 plusmn 002
NCF2 099 plusmn 001 096 plusmn 001 087 plusmn 004 097 plusmn 001
NCF4 098 plusmn 001 093 plusmn 002 093 plusmn 002 093 plusmn 002
Recombination parameter (q 103 per site)c
CYBB 008 lt001 lt001 lt002
CYBA 291 148 096 088
NCF2 052 038 010 058
NCF4 137 103 039 022
h estimatorsd
p 103 per site
CYBB 036 012 015 028
CYBA 190 163 151 157
NCF2 081 047 043 057
NCF4 101 076 064 084
hW 103 per site
CYBB 042 013 021 027
CYBA 231 118 125 13
NCF2 102 069 062 083
NCF4 101 070 054 087
aMost analyses were performed using software DnaSP (Rozas 2009)bData for CYBB (Xp211) are from Tarazona-Santos et al (2008)cHaplotypes and inferred using the method by Stephens and Scheet (2005) andthe software PHASEd Tajima (1983) W Watterson (1975)
2163
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
natural selection which produces an excess of common allelesassociated with high genetic diversity Thus the observed pat-tern of CYBA diversity in Europeans is not consistent with aselective sweep on a standing variation but it is consistent witha scenario of balancing selection acting on standing variation
If we consider for CYBA that heterozygote advantage maybe the mechanisms of balancing selection we can speculatethat the biological basis for this mechanism may be the fol-lowing considering that p22-phox is not exclusive of thephagocyte NADPH oxidase but it is also part of Nox com-plexes expressed in other tissues the dependence of ROSproduction on CYBA variants has to be finely regulated IfCYBA variants induce high levels of ROS these variants mayfavor a phagocyte-dependent efficient response to pathogensbut may damage other endothelial tissues Alternativelytissue oxidative damage does not occur if ROS productionis low but this response may be associated with a weaker
phagocyte respiratory burst against pathogens In this con-text heterozygote individuals with a CYBA-dependent inter-mediate level of ROS production may have been favored bynatural selection
Our results contribute to the discussion regarding the rel-evance of balancing selection in shaping the diversity ofinnate immunity genes (Ferrer-Admetlla et al 2008 Barreiroet al 2009) Ferrer-Admetlla et al (2008) have associated therecurrent signatures of balancing selection on inflammatorygenes with the need for fine regulation CYBA is the onlyphagocytic NADPH oxidase gene that also encodesnonphagocyte Nox components thus CYBA has a role inROS cell signaling a potentially dangerous process due toits capability to produce oxidative damage to tissues Geneswith these characteristics likely need even tighter regulationThe interplay between pathogen-driven selective pressureon innate immunity genes and their concomitantnonimmunological functions is complex In addition toCYBA other interesting examples of this interplay can befound among the 10 human toll-like receptors (TLRs) thatshow a variety of signatures of natural selection TLR8 whichshows the strongest signature of purifying selection is alsoinvolved in neuronal development (Barreiro et al 2009) and itis difficult to discriminate the role of each function in deter-mining the observed signature of natural selection
With few exceptions the pathogens responsible for naturalselection on immune genes are difficult to specify In the caseof NADPH oxidase we can infer based on the spectrum ofinfections in CGD patients that catalase-positive bacteria andfungi such as Staphylococci Salmonella Candida Aspergillusand M tuberculosis may be the selective pathogensInteractions between the host and pathogens also includemechanisms of the latter to impair the respiratory burst ofthe former For example Leishmania donovani blocks the as-sembly of NADPH oxidase at the phagosome membrane(Lodge et al 2006) These mechanisms may constitute selec-tive pressures imposed by pathogens
The associations reported in GWAS between rs4821544 inNCF4 and Crohnrsquos disease (an idiopathic inflammatory boweldisease that predominantly involves the ileum and colonRioux et al 2007) and between rs10911363 in NCF2 and
Table 3 Pairwise FST Genetic Distances between Populations
CYBBa CYBA
Africa Europe Asia Hispanic Africa Europe Asia Hispanic
Africa mdash 0316 0264 0092 mdash 0074 0083 0065
Europe 0257 mdash 0000 0107 mdash 0002 0056
Asia 0211 0000 mdash 0073 mdash 0054
Hispanic 0070 0082 0056 mdash mdash
NCF2 NCF4
Africa mdash 0048 0058 0037 mdash 0128 0136 0158
Europe mdash 0069 0000 mdash 0026 0026
Asia mdash 0059 mdash 0005
Hispanic mdash mdash
aFor CYBB FST estimators are above the diagonal Below the diagonal are the FST values corrected as if the effective population sizes of X chromosome genes were equal toautosomal ones
Table 4 Results of Neutrality Tests for the NADPH Oxidase Genesand Their Significancea
African European Asian Hispanic
Tajimarsquos D
CYBB 0473 0274 0813 0084
CYBA 0580 1242 0684 0707
NCF2 0412 0939 0883 0987
NCF4 0750 0243 0556 0102
Fu and Lirsquos D
CYBB 1050 1110 0395 0977
CYBA 1485 1734 1188 0138
NCF2 0407 1105 1904 1398
NCF4 0308 0443 1893 0266
Fu and Lirsquos F
CYBB 0980 0811 0114 0814
CYBA 1382 1893 1205 0405
NCF2 0512 1252 1893 1567
NCF4 0557 0231 1225 0248
NOTEmdashUnderlined values represent significant results under the demographic modelinferred by Laval et al (2010) for human populations See details in supplementarytable S4 Supplementary Material onlineaThe McDonaldndashKreitman test is nonsignificant in any of the cases
Significant under the WrightndashFisher model of constant population size
2164
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
systemic lupus erythematosus (Cunninghame Graham et al2011) confirm the involvement of NADPH genes in the path-ogenesis of inflammatory-related common diseases Ourclaim that natural selection acted on CYBA (and maybe inNCF2) is relevant for biomedical studies because combiningevidence of natural selection with association analyses inimmune genes increases the statistical power to detect dis-ease-associated variants (Ayodo et al 2007) As a new gener-ation of association studies focusing on rare variation isemerging the combination of genes deemed interestingfrom GWAS and populations with an excess of rare variantsin these genes such as NCF2 in Asians are particularly inter-esting as a source of rare variants with clinical relevanceFinally by determining through molecular evolution mappingthat the extracellular portion of gp91 (encoded by CYBB inthe X-chromosome) has been subject to recurrent episodes ofpositive selection at the scale of mammals evolution we positthe hypothesis that this portion of the NADPH oxidase isrelevant for currently unknown biological processes thatonce revealed by structural and functional investigationswill contribute to understanding the role of NADPH oxidasein infectious autoimmune and cardiovascular diseases
Materials and Methods
Molecular Evolution Analysis of NADPH Genes
We used the maximum likelihood framework developed byYang (2007a) to estimate for the NADPH oxidase genes
under a variety of evolutionary models as implemented in thePAML software (Yang 2007b) This approach allows infer-ences about the evolution of a coding region along aninter-specific phylogeny mapping the codons that haveevolved under strongweak purifying selection neutrality oradaptive positive selection Further details about these anal-yses are available as supplementary material SupplementaryMaterial online
Human Population Genetics of NADPH Genes
For human population genetics analyses we conducted bidi-rectional Sanger sequencing of CYBB CYBA NCF2 and NCF4for a total of 35242 bp for each of 102 healthy individuals aspart of the SNP500 Cancer project (Packer et al 2006 seesupplementary fig S1 Supplementary Material online for de-tails) Human population resequencing data for CYBB werepublished in Tarazona-Santos et al (2008) The resequencingexperiments were performed as in Packer et al (2006) These102 unrelated individuals include the following 24 of Africanancestry (15 African Americans from the United States and 9Pygmies) 23 admixed Latin Americans (from Mexico PuertoRico and South America) 31 Europeans (from the CEPHUTAH pedigree and the NIEHS Environmental GenomeProject) and 24 AsiansndashOceanians (from Melanesia PakistanChina Cambodia Japan and Taiwan)
After controlling for multiple tests we confirmed that allSNPs fit the HardyndashWeinberg equilibrium by the Guo and
CYBA - A16
CYBA - A3
Y72H - rs4673
V174A - rs1049254 CYBA - E18
(a)
NCF2 - E10 H389Q - rs17849502
NCF2 - B3
NCF2 - D11
NCF2 - C1K181R - rs2274064(b)
FIG 4 Phylogenetic networks of (a) CYBA in Europeans and (b) NCF2 in Europeans (black) and Asians (yellow) The lengths of the branches areproportional to the number of mutations Only nonsynonymous mutations are shown The haplotype names correspond to the table of inferredhaplotypes in the supplementary material Supplementary Material online We only show the names of haplotypes with a frequency gt5 Ancestralhaplotypes (inferred as being the human haplotype or median vector most similar to the chimpanzee sequence) are indicated by an arrow Medianvectors are in red
2165
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Thompson (1992) test which was implemented in the soft-ware Arlequin 30 (Excoffier et al 2007) Insertion-deletions(INDELs) were excluded from population genetics analysesTo assess intrapopulation diversity we used two statistics the per-site mean number of pairwise differences betweensequences (Tajima 1983) and w based on the number ofsegregating sites (S) (Watterson 1975) We measured pairwisebetween-populations diversity by using the FST statistics cal-culated using the software DnaSP (Rozas 2009)
Haplotypes and the recombination parameter were in-ferred using the PHASE software (Stephens and Scheet 2005)and diversity indexes calculations (tables 2 and 3) as well asneutrality statistics (table 4) were estimated using DnaSP soft-ware We applied two kinds of neutrality tests 1) tests basedon the allelic spectrum which is the distribution of polymor-phisms across different classes of frequencies namelyTajimarsquos DT (Tajima 1989) and FundashLirsquos DFL and FFL (Fu andLi 1993) and 2) tests based on comparisons between thenumber of polymorphisms in human populations and fixeddifferences with the chimpanzee (ie outgroup) namely theMcDonald and Kreitman (1991) test and the adaptedKolmogorovndashSmirnoff test by McDonald (1998) For thefirst set of tests we used as null hypotheses both the classicWrightndashFisher model of neutrality with a constant popula-tion size as well as the more realistic evolutionary scenario forhuman populations inferred by Laval et al (2010) In the caseof the scenario of Laval et al (2010) we ignored interconti-nental gene flow within the Old World because these raregene flow events likely does not affect the level of significanceof the neutrality tests given its very low inferred values(13 105) Null distributions used to test the significanceof the neutrality tests under these evolutionary scenarios weregenerated using coalescent simulations and a significancelevel of 005 (Hudson 2002) The KolmogorovndashSmirnoff testof neutrality adapted by McDonald (1998) was performedusing Slider software available at httpudeledu~mcdonaldaboutdnasliderhtml (last accessed July 16 2013) Weperformed coalescent simulations using ms software(Hudson 2002) Further methodological details are availableas supplementary material Supplementary Material onlineWe constructed the CYBA and NCF2 networks using allSNP variants and applying the Median joining algorithmand the maximum parsimony option calculations as imple-mented in the software Network 46 (Bandelt et al 1999)
Supplementary MaterialSupplementary tables S1ndashS5 and figure S1 are available atMolecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)
Acknowledgments
SJC conceived the study ET-S MM FL RC AC CF LPand BS generated the dataanalyzed the sequences LB ACWCSM and MY provided tools for data generationand analyses ET-S MM FL WCSM and RR performedpopulation genetics analyses ET-S and MM prepared tablesand figures ET-S and SJC wrote the manuscript All the
authors contributed with discussions about the results Theauthors are grateful to Silvia Fuselli for discussions of ourresults to Fernando Levi Soares for technical assistance withthe figures and the Sequencing Group of the CoreGenotyping Facility (National Cancer Institute) for their tech-nical assistance This work was supported by the IntramuralResearch Program of the National Cancer Institute (NCI) andNational Institutes of Health (NIH) CF was supported by theUniversity of Bologna ET-S MM WCSM and FL weresupported by the Fogarty International Center and NationalCancer Institute (1R01TW007894) Conselho Nacional deDesenvolvimento Cientıfico e Tecnologico (Brazil Grant3075562012-3) Fundacao de Amparo a Pesquisa de MinasGerais (Brazil CBB PPM 0026512) and the Brazilian Ministryof Education (CAPES Agency grant PNPD 0256709-1)
ReferencesAyodo G Price AL Keinan A Ajwang A Otieno MF Orago AS
Patterson N Reich D 2007 Combining evidence of natural selectionwith association analysis increases power to detect malaria-resis-tance variants Am J Hum Genet 81(2)234ndash242
Bamshad M Wooding SP 2003 Signatures of natural selection in thehuman genome Nat Rev Genet 4(2)99ndash111
Bandelt H-J Forster P Rohl A 1999 Median-joining networks for infer-ring intraspecific phylogenies Mol Biol Evol 16(1)37ndash48
Barreiro LB Ben-Ali M Quach H et al (17 co-authors) 2009Evolutionary dynamics of human Toll-like receptors and their dif-ferent contributions to host defense PLoS Genet 5(7)e1000562
Barreiro LB Quintana-Murci L 2010 From evolutionary genetics tohuman immunology how selection shapes host defence genesNat Rev Genet 11(1)17ndash30
Barrett RD Schluter D 2008 Adaptation from standing genetic varia-tion Trends Ecol Evol 23(1)38ndash44
Bedard K Attar H Bonnefont J Jaquet V Borel C Plastre O Stasia MJAntonarakis SE Krause KH 2009 Three common polymorphisms inthe CYBA gene form a haplotype associated with decreased ROSgeneration Hum Mutat 30(7)1123ndash1133
Blekhman R Man O Herrmann L Boyko AR Indap A Kosiol CBustamante CD Teshima KM Przeworski M 2008 Natural selectionon genes that underlie human disease susceptibility Curr Biol18(12)883ndash889
Brandes RP Kreuzer J 2005 Vascular NADPH oxidases molecular mech-anisms of activation Cardiovasc Res 65(1)16ndash27
Buckley RH 2004 Pulmonary complications of primary immunodefi-ciencies Paediatr Respir Rev 5(Suppl A)S225ndashS233
Bustamante CD Fledel-Alon A Williamson S et al (13 co-authors)2005 Natural selection on protein-coding genes in the humangenome Nature 437(7062)1153ndash1157
Bustamante J Arias AA Vogt G et al (29 co-authors) 2011 GermlineCYBB mutations that selectively affect macrophages in kindredswith X-linked predisposition to tuberculous mycobacterial diseaseNat Immunol 12(3)213ndash221
Campbell MC Tishkoff SA 2008 African genetic diversity implicationsfor human demographic history modern human origins and com-plex disease mapping Annu Rev Genomics Hum Genet 9403ndash433
Chanock SJ Roesler J Zhan S Hopkins P Lee P Barrett DT ChristensenBL Curnutte JT Gorlach A 2000 Genomic structure of the humanp47-phox (NCF1) gene Blood Cells Mol Dis 26(1)37ndash46
Cunninghame Graham DS Morris DL Bhangale TR Criswell LASyvanen AC Ronnblom L Behrens TW Graham RR Vyse TJ2011 Association of NCF2 IKZF1 IRF8 IFIH1 and TYK2 with sys-temic lupus erythematosus PLoS Genet 7(10)e1002341
Excoffier L Laval G Schneider S 2007 Arlequin ver 30 an integratedsoftware package for population genetics data analysis EvolBioinform Online 147ndash50
2166
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Excoffier L Ray N 2008 Surfing during population expansions promotesgenetic revolutions and structuration Trends Ecol Evol23(7)347ndash351
Ferrer-Admetlla A Bosch E Sikora M Marques-Bonet T Ramırez-SorianoA Muntasell A Navarro A Lazarus R Calafell F Bertranpetit J Casals F2008 Balancing selection is the main force shaping the evolution ofinnate immunity genes J Immunol 181(2)1315ndash1322
Fu YX Li WH 1993 Statistical tests of neutrality of mutations Genetics133(3)693ndash709
The 1000 Genomes Project Consortium 2010 A map of humangenome variation from population-scale sequencing Nature467(7319)1061ndash1073
The 1000 Genomes Project Consortium Abecasis GR Auton A BrooksLD DePristo MA Durbin RM Handsaker RE Kang HM Marth GTMcVean GA 2012 An integrated map of genetic variation from1092 human genomes Nature 491(7422)56ndash65
Guo SW Thompson EA 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles Biometrics 48(2)361ndash372
Heyworth PG Cross AR Curnutte JT 2003 Chronic granulomatousdisease Curr Opin Immunol 15(5)578ndash584
Hudson RR 2002 Generating samples under a Wright-Fisher neutralmodel of genetic variation Bioinformatics 18(2)337ndash338
Jacob CO Eisenstein M Dinauer MC et al (21 co-authors) 2012 Lupus-associated causal mutation in neutrophil cytosolic factor 2 (NCF2)brings unique insights to the structure and function of NADPHoxidase Proc Natl Acad Sci U S A 109(2)E59ndashE67
Kimura M 1974 The neutral theory of molecular evolution CambridgeCambridge University Press
Kosiol C Vinar T da Fonseca RR Hubisz MJ Bustamante CD Nielsen RSiepel A 2008 Patterns of positive selection in six mammalian ge-nomes PLoS Genet 4(8)e1000144
Lacy F Kailasam MT OrsquoConnor DT Schmid-Schonbein GW Parmer RJ2000 Plasma hydrogen peroxide production in human essentialhypertension role of heredity gender and ethnicity Hypertension36(5)878ndash884
Laval G Patin E Barreiro LB Quintana-Murci L 2010 Formulating a his-torical and demographic model of recent human evolution based onresequencing data from noncoding regions PLoS One 5(4)e10284
Lewontin R 1972 The apportionment of human diversity Evol Biol 6391ndash398
Lindblad-Toh K Garber M Zuk O et al (88 co-authors) 2011 A high-resolution map of human evolutionary constraint using 29 mam-mals Nature 478(7370)476ndash482
Lodge R Diallo TO Descoteaux A 2006 Leishmania donovani lipopho-sphoglycan blocks NADPH oxidase assembly at the phagosomemembrane Cell Microbiol 8(12)1922ndash1931
Magalhaes WC Rodrigues MR Silva D Soares-Souza G Iannini MLCerqueira GC Faria-Campos AC Tarazona-Santos E 2012DIVERGENOME a bioinformatics platform to assist population ge-netics and genetic epidemiology studies Genet Epidemiol36(4)360ndash367
Marth JD Grewal PK 2008 Mammalian glycosylation in immunity NatRev Immunol 8(11)874ndash887
McDonald JH 1998 Improved tests for heterogeneity across a region ofDNA sequence in the ratio of polymorphism to divergence Mol BiolEvol 15377ndash384
McDonald JH Kreitman M 1991 Adaptive protein evolution at the Adhlocus in Drosophila Nature 351(6328)652ndash654
Nielsen R Bustamante C Clark AG et al (12 co-authors) 2005 A scanfor positively selected genes in the genomes of humans and chim-panzees PLoS Biol 3(6)e170
Olsson LM Lindqvist AK Kallberg H Padyukov L Burkhardt HAlfredsson L Klareskog L Holmdahl R 2007 A case-control studyof rheumatoid arthritis identifies an associated single nucleotidepolymorphism in the NCF4 gene supporting a role for the
NADPH-oxidase complex in autoimmunity Arthritis Res Ther9(5)R98
Packer BR Yeager M Burdett L et al (13 co-authors) 2006SNP500Cancer a public resource for sequence validation assay de-velopment and frequency analysis for genetic variation in candidategenes Nucleic Acids Res 34(Database issue)D617ndashD621
Peter BM Huerta-Sanchez E Nielsen R 2012 Distinguishing betweenselective sweeps from standing variation and from a de novo mu-tation PLoS Genet 8(10)e1003011
Przeworski M Coop G Wall JD 2005 The signature of positive selectionon standing genetic variation Evolution 59(11)2312ndash2323
Ptak SE Przeworski M 2002 Evidence for population growth in humansis confounded by fine-scale population structure Trends Genet18(11)559ndash563
Ramensky V Bork P Sunyaev S 2002 Human non-synonymous SNPsserver and survey Nucleic Acids Res 30(17)3894ndash3900
Rioux JD Xavier RJ Taylor KD et al (24 co-authors) 2007 Genome-wideassociation study identifies new susceptibility loci for Crohn diseaseand implicates autophagy in disease pathogenesis Nat Genet39(5)596ndash604
Roberts RL Hollis-Moffatt JE Gearry RB Kennedy MA Barclay MLMerriman TR 2008 Confirmation of association of IRGM andNCF4 with ileal Crohnrsquos disease in a population-based cohortGenes Immun 9(6)561ndash565
Rozas J 2009 DNA sequence polymorphism analysis using DnaSPMethods Mol Biol 537337ndash350
San Jose G Fortuno A Beloqui O Dıez J Zalba G 2008 NADPH oxidaseCYBA polymorphisms oxidative stress and cardiovascular diseasesClin Sci (Lond) 114(3)173ndash182
Santiago HC Gonzalez Lombana CZ Macedo JP et al (12 co-authors)2012 NADPH phagocyte oxidase knockout mice controlTrypanosoma cruzi proliferation but develop circulatory collapseand succumb to infection PLoS Negl Trop Dis 6(2)e1492
Stephens M Scheet P 2005 Accounting for decay of linkage disequilib-rium in haplotype inference and missing-data imputation Am JHum Genet 76(3)449ndash462
Sumimoto H Miyano K Takeya R 2005 Molecular composition andregulation of the Nox family NAD(P)H oxidases Biochem BiophysRes Commun 338(1)677ndash686
Tajima F 1983 Evolutionary relationship of DNA sequences in finitepopulations Genetics 105(2)437ndash460
Tajima F 1989 Statistical method for testing the neutral mutationhypothesis by DNA polymorphism Genetics 123(3)585ndash595
Tarazona-Santos E Bernig T Burdett L Magalhaes WC Fabbri C Liao JRedondo RA Welch R Yeager M Chanock SJ 2008 CYBB anNADPH-oxidase gene restricted diversity in humans and evidencefor differential long-term purifying selection on transmembrane andcytosolic domains Hum Mutat 29(5)623ndash632
Taylor RM Burritt JB Baniulis D Foubert TR Lord CI Dinauer MCParkos CA Jesaitis AJ 2004 Site-specific inhibitors of NADPH oxi-dase activity and structural probes of flavocytochrome b character-ization of six monoclonal antibodies to the p22phox subunitJ Immunol 173(12)7349ndash7357
Voight BF Kudaravalli S Wen X Pritchard JK 2006 A map of recentpositive selection in the human genome PLoS Biol 4(3)e72
Watterson GA 1975 On the number of segregating sites in geneticalmodels without recombination Theor Popul Biol 7(2) 256ndash276
Williamson SH Hernandez R Fledel-Alon A Zhu L Nielsen RBustamante CD 2005 Simultaneous inference of selection and pop-ulation growth from patterns of variation in the human genomeProc Natl Acad Sci U S A 102(22)7882ndash7887
Yang Z 2007a Adaptive molecular evolution In Balding DJ Bishop MCannings C editors Handbook of statistical genetics Vol 1 3rd edSusex (United Kingdom) John Wiley amp Sons p 377ndash406
Yang Z 2007b PAML 4 phylogenetic analysis by maximum likelihoodMol Biol Evol 241586ndash1591
2167
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
that may have contributed to shape the pattern of NCF2diversity in Asians 1) the observed trend is suggestive of aselective sweep on NCF2 standing variation a neutral orweakly deleterious existing variant becomes beneficial andrapidly increases in frequency (together with its associatedhaplotypes ie incomplete sweep) reducing the nucleotidediversity in the surrounding region and rendering other stand-ing substitutions rare During this process new rare substitu-tions appear in the expanding positively selected haplotype2) The pattern of diversity of NCF2 in Asia may result from anincomplete selective sweep acting on ARPC5 which is locatedapproximately 35 kb downstream of NCF2 In a genome-widescan for recent positive selection Voight et al (2006) identi-fied a strong signature of an incomplete sweep for ARPC5 inAsia (P = 0009) characterized by a higher than expected long-range linkage disequilibrium summarized by very high iHS
statistics (see Haplotter results for the HapMap II data athttphaplotteruchicagoedu last accessed July 16 2013)SNPs in NCF2 also presents high iHS statistics (P = 002) al-though values are lower than for ARPC5 3) The differentiatedpattern of diversity of NCF2 in Asia may also have been gen-erated without the action of natural selection during the firstcolonization of Asia by modern humans In a process of geo-graphic population expansion specifically in the front wave ofthe expansion some rare alleleshaplotypes (ie surfing al-leles) may become common by chance mimicking the pat-tern of diversity generated by a selective sweep (Excoffier andRay 2008) 4) The excess of rare variants may be an artifact ofpooling individuals from different populations (Ptak andPrzeworski 2002) Consistent with evolutionary scenarios 1ndash4 that produce similar patterns of diversity the haplotypenetwork of NCF2 for Eurasians (fig 4b) shows the following1) a large differentiation between Asians and Europeans thatis compatible with the high observed FST values and 2) a star-like shape associated with the haplotype NCF2-E10 that iscommon in Asia and rare elsewhere which is compatiblewith the excess of rare alleles in the Asian populations
DiscussionBy analyzing 29 mammalian genomes and four human pop-ulations we show in this study that natural selection hasacted in different ways over time to shape the pattern ofdiversity of the phagocyte NADPH oxidase genes At the tem-poral scale of the evolution of mammals we have inferredrecurrent episodes of positive selection acting on the extra-cellular portion of gp-91 that have been important to shapethe pattern of interspecific diversity of this gene Our in-terspecific analyses did not show a similar pattern of naturalselection in any of the other phagocyte NADPH oxidasegenes Even if current knowledge on the biology of NADPHdoes not allow us to interpret our results in terms of functionwe propose that the extracellular region of gp-91 is function-ally relevant Our results also imply that this region is highlydifferentiated among mammals at the protein level and thisvariability should be considered when mammals models areused to study the structure and function of phagocyteNADPH components
In the time scale of human evolution our analyses of theNADPH oxidase genes suggest that CYBA has been a target ofbalancing natural selection Because we do not have evidenceof population-specific variants that faced selective pressurethe inferred natural selection may have acted on a standingvariation in ancestral populations This implies that the selec-tive pressure began after the appearance of the variant andpossibly acted in a specific geographic region (Barret andSchluter 2008) The signatures of natural selection acting ona new mutation and on standing variation differ In the case ofselective sweeps episodes of natural selection on standingvariation are associated to a larger variance in the allelic spec-trum with respect to natural selection on a new mutationAlso selection on standing variation may produce an excess ofalleles at intermediate frequencies that is not associated withhigh nucleotide diversity (Przeworski et al 2005 Peter et al2012) This pattern contrasts with the effect of balancing
Table 2 Intrapopulation Diversity Indexes in the Studied Populationsfor the NADPH Oxidase Genes Obtained from Resequencing Dataa
African European Asian Hispanic
Number of chromosomes
CYBBb 42 52 34 36
CYBA 48 62 48 46
NCF2 48 62 48 46
NCF4 48 62 48 46
Segregating sitessingletons
CYBB 218 70 103 135
CYBA 6122 333 335 347
NCF2 4613 3311 2816 3714
NCF4 4512 267 191 308
Haplotype structureNumber of inferred haplotypesc
CYBB 14 5 7 12
CYBA 39 39 32 26
NCF2 38 32 18 31
NCF4 36 30 17 22
Haplotype diversity plusmn SD
CYBB 088 plusmn 003 034 plusmn 008 053 plusmn 010 070 plusmn 008
CYBA 098 plusmn 001 098 plusmn 001 097 plusmn 000 096 plusmn 002
NCF2 099 plusmn 001 096 plusmn 001 087 plusmn 004 097 plusmn 001
NCF4 098 plusmn 001 093 plusmn 002 093 plusmn 002 093 plusmn 002
Recombination parameter (q 103 per site)c
CYBB 008 lt001 lt001 lt002
CYBA 291 148 096 088
NCF2 052 038 010 058
NCF4 137 103 039 022
h estimatorsd
p 103 per site
CYBB 036 012 015 028
CYBA 190 163 151 157
NCF2 081 047 043 057
NCF4 101 076 064 084
hW 103 per site
CYBB 042 013 021 027
CYBA 231 118 125 13
NCF2 102 069 062 083
NCF4 101 070 054 087
aMost analyses were performed using software DnaSP (Rozas 2009)bData for CYBB (Xp211) are from Tarazona-Santos et al (2008)cHaplotypes and inferred using the method by Stephens and Scheet (2005) andthe software PHASEd Tajima (1983) W Watterson (1975)
2163
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
natural selection which produces an excess of common allelesassociated with high genetic diversity Thus the observed pat-tern of CYBA diversity in Europeans is not consistent with aselective sweep on a standing variation but it is consistent witha scenario of balancing selection acting on standing variation
If we consider for CYBA that heterozygote advantage maybe the mechanisms of balancing selection we can speculatethat the biological basis for this mechanism may be the fol-lowing considering that p22-phox is not exclusive of thephagocyte NADPH oxidase but it is also part of Nox com-plexes expressed in other tissues the dependence of ROSproduction on CYBA variants has to be finely regulated IfCYBA variants induce high levels of ROS these variants mayfavor a phagocyte-dependent efficient response to pathogensbut may damage other endothelial tissues Alternativelytissue oxidative damage does not occur if ROS productionis low but this response may be associated with a weaker
phagocyte respiratory burst against pathogens In this con-text heterozygote individuals with a CYBA-dependent inter-mediate level of ROS production may have been favored bynatural selection
Our results contribute to the discussion regarding the rel-evance of balancing selection in shaping the diversity ofinnate immunity genes (Ferrer-Admetlla et al 2008 Barreiroet al 2009) Ferrer-Admetlla et al (2008) have associated therecurrent signatures of balancing selection on inflammatorygenes with the need for fine regulation CYBA is the onlyphagocytic NADPH oxidase gene that also encodesnonphagocyte Nox components thus CYBA has a role inROS cell signaling a potentially dangerous process due toits capability to produce oxidative damage to tissues Geneswith these characteristics likely need even tighter regulationThe interplay between pathogen-driven selective pressureon innate immunity genes and their concomitantnonimmunological functions is complex In addition toCYBA other interesting examples of this interplay can befound among the 10 human toll-like receptors (TLRs) thatshow a variety of signatures of natural selection TLR8 whichshows the strongest signature of purifying selection is alsoinvolved in neuronal development (Barreiro et al 2009) and itis difficult to discriminate the role of each function in deter-mining the observed signature of natural selection
With few exceptions the pathogens responsible for naturalselection on immune genes are difficult to specify In the caseof NADPH oxidase we can infer based on the spectrum ofinfections in CGD patients that catalase-positive bacteria andfungi such as Staphylococci Salmonella Candida Aspergillusand M tuberculosis may be the selective pathogensInteractions between the host and pathogens also includemechanisms of the latter to impair the respiratory burst ofthe former For example Leishmania donovani blocks the as-sembly of NADPH oxidase at the phagosome membrane(Lodge et al 2006) These mechanisms may constitute selec-tive pressures imposed by pathogens
The associations reported in GWAS between rs4821544 inNCF4 and Crohnrsquos disease (an idiopathic inflammatory boweldisease that predominantly involves the ileum and colonRioux et al 2007) and between rs10911363 in NCF2 and
Table 3 Pairwise FST Genetic Distances between Populations
CYBBa CYBA
Africa Europe Asia Hispanic Africa Europe Asia Hispanic
Africa mdash 0316 0264 0092 mdash 0074 0083 0065
Europe 0257 mdash 0000 0107 mdash 0002 0056
Asia 0211 0000 mdash 0073 mdash 0054
Hispanic 0070 0082 0056 mdash mdash
NCF2 NCF4
Africa mdash 0048 0058 0037 mdash 0128 0136 0158
Europe mdash 0069 0000 mdash 0026 0026
Asia mdash 0059 mdash 0005
Hispanic mdash mdash
aFor CYBB FST estimators are above the diagonal Below the diagonal are the FST values corrected as if the effective population sizes of X chromosome genes were equal toautosomal ones
Table 4 Results of Neutrality Tests for the NADPH Oxidase Genesand Their Significancea
African European Asian Hispanic
Tajimarsquos D
CYBB 0473 0274 0813 0084
CYBA 0580 1242 0684 0707
NCF2 0412 0939 0883 0987
NCF4 0750 0243 0556 0102
Fu and Lirsquos D
CYBB 1050 1110 0395 0977
CYBA 1485 1734 1188 0138
NCF2 0407 1105 1904 1398
NCF4 0308 0443 1893 0266
Fu and Lirsquos F
CYBB 0980 0811 0114 0814
CYBA 1382 1893 1205 0405
NCF2 0512 1252 1893 1567
NCF4 0557 0231 1225 0248
NOTEmdashUnderlined values represent significant results under the demographic modelinferred by Laval et al (2010) for human populations See details in supplementarytable S4 Supplementary Material onlineaThe McDonaldndashKreitman test is nonsignificant in any of the cases
Significant under the WrightndashFisher model of constant population size
2164
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
systemic lupus erythematosus (Cunninghame Graham et al2011) confirm the involvement of NADPH genes in the path-ogenesis of inflammatory-related common diseases Ourclaim that natural selection acted on CYBA (and maybe inNCF2) is relevant for biomedical studies because combiningevidence of natural selection with association analyses inimmune genes increases the statistical power to detect dis-ease-associated variants (Ayodo et al 2007) As a new gener-ation of association studies focusing on rare variation isemerging the combination of genes deemed interestingfrom GWAS and populations with an excess of rare variantsin these genes such as NCF2 in Asians are particularly inter-esting as a source of rare variants with clinical relevanceFinally by determining through molecular evolution mappingthat the extracellular portion of gp91 (encoded by CYBB inthe X-chromosome) has been subject to recurrent episodes ofpositive selection at the scale of mammals evolution we positthe hypothesis that this portion of the NADPH oxidase isrelevant for currently unknown biological processes thatonce revealed by structural and functional investigationswill contribute to understanding the role of NADPH oxidasein infectious autoimmune and cardiovascular diseases
Materials and Methods
Molecular Evolution Analysis of NADPH Genes
We used the maximum likelihood framework developed byYang (2007a) to estimate for the NADPH oxidase genes
under a variety of evolutionary models as implemented in thePAML software (Yang 2007b) This approach allows infer-ences about the evolution of a coding region along aninter-specific phylogeny mapping the codons that haveevolved under strongweak purifying selection neutrality oradaptive positive selection Further details about these anal-yses are available as supplementary material SupplementaryMaterial online
Human Population Genetics of NADPH Genes
For human population genetics analyses we conducted bidi-rectional Sanger sequencing of CYBB CYBA NCF2 and NCF4for a total of 35242 bp for each of 102 healthy individuals aspart of the SNP500 Cancer project (Packer et al 2006 seesupplementary fig S1 Supplementary Material online for de-tails) Human population resequencing data for CYBB werepublished in Tarazona-Santos et al (2008) The resequencingexperiments were performed as in Packer et al (2006) These102 unrelated individuals include the following 24 of Africanancestry (15 African Americans from the United States and 9Pygmies) 23 admixed Latin Americans (from Mexico PuertoRico and South America) 31 Europeans (from the CEPHUTAH pedigree and the NIEHS Environmental GenomeProject) and 24 AsiansndashOceanians (from Melanesia PakistanChina Cambodia Japan and Taiwan)
After controlling for multiple tests we confirmed that allSNPs fit the HardyndashWeinberg equilibrium by the Guo and
CYBA - A16
CYBA - A3
Y72H - rs4673
V174A - rs1049254 CYBA - E18
(a)
NCF2 - E10 H389Q - rs17849502
NCF2 - B3
NCF2 - D11
NCF2 - C1K181R - rs2274064(b)
FIG 4 Phylogenetic networks of (a) CYBA in Europeans and (b) NCF2 in Europeans (black) and Asians (yellow) The lengths of the branches areproportional to the number of mutations Only nonsynonymous mutations are shown The haplotype names correspond to the table of inferredhaplotypes in the supplementary material Supplementary Material online We only show the names of haplotypes with a frequency gt5 Ancestralhaplotypes (inferred as being the human haplotype or median vector most similar to the chimpanzee sequence) are indicated by an arrow Medianvectors are in red
2165
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Thompson (1992) test which was implemented in the soft-ware Arlequin 30 (Excoffier et al 2007) Insertion-deletions(INDELs) were excluded from population genetics analysesTo assess intrapopulation diversity we used two statistics the per-site mean number of pairwise differences betweensequences (Tajima 1983) and w based on the number ofsegregating sites (S) (Watterson 1975) We measured pairwisebetween-populations diversity by using the FST statistics cal-culated using the software DnaSP (Rozas 2009)
Haplotypes and the recombination parameter were in-ferred using the PHASE software (Stephens and Scheet 2005)and diversity indexes calculations (tables 2 and 3) as well asneutrality statistics (table 4) were estimated using DnaSP soft-ware We applied two kinds of neutrality tests 1) tests basedon the allelic spectrum which is the distribution of polymor-phisms across different classes of frequencies namelyTajimarsquos DT (Tajima 1989) and FundashLirsquos DFL and FFL (Fu andLi 1993) and 2) tests based on comparisons between thenumber of polymorphisms in human populations and fixeddifferences with the chimpanzee (ie outgroup) namely theMcDonald and Kreitman (1991) test and the adaptedKolmogorovndashSmirnoff test by McDonald (1998) For thefirst set of tests we used as null hypotheses both the classicWrightndashFisher model of neutrality with a constant popula-tion size as well as the more realistic evolutionary scenario forhuman populations inferred by Laval et al (2010) In the caseof the scenario of Laval et al (2010) we ignored interconti-nental gene flow within the Old World because these raregene flow events likely does not affect the level of significanceof the neutrality tests given its very low inferred values(13 105) Null distributions used to test the significanceof the neutrality tests under these evolutionary scenarios weregenerated using coalescent simulations and a significancelevel of 005 (Hudson 2002) The KolmogorovndashSmirnoff testof neutrality adapted by McDonald (1998) was performedusing Slider software available at httpudeledu~mcdonaldaboutdnasliderhtml (last accessed July 16 2013) Weperformed coalescent simulations using ms software(Hudson 2002) Further methodological details are availableas supplementary material Supplementary Material onlineWe constructed the CYBA and NCF2 networks using allSNP variants and applying the Median joining algorithmand the maximum parsimony option calculations as imple-mented in the software Network 46 (Bandelt et al 1999)
Supplementary MaterialSupplementary tables S1ndashS5 and figure S1 are available atMolecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)
Acknowledgments
SJC conceived the study ET-S MM FL RC AC CF LPand BS generated the dataanalyzed the sequences LB ACWCSM and MY provided tools for data generationand analyses ET-S MM FL WCSM and RR performedpopulation genetics analyses ET-S and MM prepared tablesand figures ET-S and SJC wrote the manuscript All the
authors contributed with discussions about the results Theauthors are grateful to Silvia Fuselli for discussions of ourresults to Fernando Levi Soares for technical assistance withthe figures and the Sequencing Group of the CoreGenotyping Facility (National Cancer Institute) for their tech-nical assistance This work was supported by the IntramuralResearch Program of the National Cancer Institute (NCI) andNational Institutes of Health (NIH) CF was supported by theUniversity of Bologna ET-S MM WCSM and FL weresupported by the Fogarty International Center and NationalCancer Institute (1R01TW007894) Conselho Nacional deDesenvolvimento Cientıfico e Tecnologico (Brazil Grant3075562012-3) Fundacao de Amparo a Pesquisa de MinasGerais (Brazil CBB PPM 0026512) and the Brazilian Ministryof Education (CAPES Agency grant PNPD 0256709-1)
ReferencesAyodo G Price AL Keinan A Ajwang A Otieno MF Orago AS
Patterson N Reich D 2007 Combining evidence of natural selectionwith association analysis increases power to detect malaria-resis-tance variants Am J Hum Genet 81(2)234ndash242
Bamshad M Wooding SP 2003 Signatures of natural selection in thehuman genome Nat Rev Genet 4(2)99ndash111
Bandelt H-J Forster P Rohl A 1999 Median-joining networks for infer-ring intraspecific phylogenies Mol Biol Evol 16(1)37ndash48
Barreiro LB Ben-Ali M Quach H et al (17 co-authors) 2009Evolutionary dynamics of human Toll-like receptors and their dif-ferent contributions to host defense PLoS Genet 5(7)e1000562
Barreiro LB Quintana-Murci L 2010 From evolutionary genetics tohuman immunology how selection shapes host defence genesNat Rev Genet 11(1)17ndash30
Barrett RD Schluter D 2008 Adaptation from standing genetic varia-tion Trends Ecol Evol 23(1)38ndash44
Bedard K Attar H Bonnefont J Jaquet V Borel C Plastre O Stasia MJAntonarakis SE Krause KH 2009 Three common polymorphisms inthe CYBA gene form a haplotype associated with decreased ROSgeneration Hum Mutat 30(7)1123ndash1133
Blekhman R Man O Herrmann L Boyko AR Indap A Kosiol CBustamante CD Teshima KM Przeworski M 2008 Natural selectionon genes that underlie human disease susceptibility Curr Biol18(12)883ndash889
Brandes RP Kreuzer J 2005 Vascular NADPH oxidases molecular mech-anisms of activation Cardiovasc Res 65(1)16ndash27
Buckley RH 2004 Pulmonary complications of primary immunodefi-ciencies Paediatr Respir Rev 5(Suppl A)S225ndashS233
Bustamante CD Fledel-Alon A Williamson S et al (13 co-authors)2005 Natural selection on protein-coding genes in the humangenome Nature 437(7062)1153ndash1157
Bustamante J Arias AA Vogt G et al (29 co-authors) 2011 GermlineCYBB mutations that selectively affect macrophages in kindredswith X-linked predisposition to tuberculous mycobacterial diseaseNat Immunol 12(3)213ndash221
Campbell MC Tishkoff SA 2008 African genetic diversity implicationsfor human demographic history modern human origins and com-plex disease mapping Annu Rev Genomics Hum Genet 9403ndash433
Chanock SJ Roesler J Zhan S Hopkins P Lee P Barrett DT ChristensenBL Curnutte JT Gorlach A 2000 Genomic structure of the humanp47-phox (NCF1) gene Blood Cells Mol Dis 26(1)37ndash46
Cunninghame Graham DS Morris DL Bhangale TR Criswell LASyvanen AC Ronnblom L Behrens TW Graham RR Vyse TJ2011 Association of NCF2 IKZF1 IRF8 IFIH1 and TYK2 with sys-temic lupus erythematosus PLoS Genet 7(10)e1002341
Excoffier L Laval G Schneider S 2007 Arlequin ver 30 an integratedsoftware package for population genetics data analysis EvolBioinform Online 147ndash50
2166
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Excoffier L Ray N 2008 Surfing during population expansions promotesgenetic revolutions and structuration Trends Ecol Evol23(7)347ndash351
Ferrer-Admetlla A Bosch E Sikora M Marques-Bonet T Ramırez-SorianoA Muntasell A Navarro A Lazarus R Calafell F Bertranpetit J Casals F2008 Balancing selection is the main force shaping the evolution ofinnate immunity genes J Immunol 181(2)1315ndash1322
Fu YX Li WH 1993 Statistical tests of neutrality of mutations Genetics133(3)693ndash709
The 1000 Genomes Project Consortium 2010 A map of humangenome variation from population-scale sequencing Nature467(7319)1061ndash1073
The 1000 Genomes Project Consortium Abecasis GR Auton A BrooksLD DePristo MA Durbin RM Handsaker RE Kang HM Marth GTMcVean GA 2012 An integrated map of genetic variation from1092 human genomes Nature 491(7422)56ndash65
Guo SW Thompson EA 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles Biometrics 48(2)361ndash372
Heyworth PG Cross AR Curnutte JT 2003 Chronic granulomatousdisease Curr Opin Immunol 15(5)578ndash584
Hudson RR 2002 Generating samples under a Wright-Fisher neutralmodel of genetic variation Bioinformatics 18(2)337ndash338
Jacob CO Eisenstein M Dinauer MC et al (21 co-authors) 2012 Lupus-associated causal mutation in neutrophil cytosolic factor 2 (NCF2)brings unique insights to the structure and function of NADPHoxidase Proc Natl Acad Sci U S A 109(2)E59ndashE67
Kimura M 1974 The neutral theory of molecular evolution CambridgeCambridge University Press
Kosiol C Vinar T da Fonseca RR Hubisz MJ Bustamante CD Nielsen RSiepel A 2008 Patterns of positive selection in six mammalian ge-nomes PLoS Genet 4(8)e1000144
Lacy F Kailasam MT OrsquoConnor DT Schmid-Schonbein GW Parmer RJ2000 Plasma hydrogen peroxide production in human essentialhypertension role of heredity gender and ethnicity Hypertension36(5)878ndash884
Laval G Patin E Barreiro LB Quintana-Murci L 2010 Formulating a his-torical and demographic model of recent human evolution based onresequencing data from noncoding regions PLoS One 5(4)e10284
Lewontin R 1972 The apportionment of human diversity Evol Biol 6391ndash398
Lindblad-Toh K Garber M Zuk O et al (88 co-authors) 2011 A high-resolution map of human evolutionary constraint using 29 mam-mals Nature 478(7370)476ndash482
Lodge R Diallo TO Descoteaux A 2006 Leishmania donovani lipopho-sphoglycan blocks NADPH oxidase assembly at the phagosomemembrane Cell Microbiol 8(12)1922ndash1931
Magalhaes WC Rodrigues MR Silva D Soares-Souza G Iannini MLCerqueira GC Faria-Campos AC Tarazona-Santos E 2012DIVERGENOME a bioinformatics platform to assist population ge-netics and genetic epidemiology studies Genet Epidemiol36(4)360ndash367
Marth JD Grewal PK 2008 Mammalian glycosylation in immunity NatRev Immunol 8(11)874ndash887
McDonald JH 1998 Improved tests for heterogeneity across a region ofDNA sequence in the ratio of polymorphism to divergence Mol BiolEvol 15377ndash384
McDonald JH Kreitman M 1991 Adaptive protein evolution at the Adhlocus in Drosophila Nature 351(6328)652ndash654
Nielsen R Bustamante C Clark AG et al (12 co-authors) 2005 A scanfor positively selected genes in the genomes of humans and chim-panzees PLoS Biol 3(6)e170
Olsson LM Lindqvist AK Kallberg H Padyukov L Burkhardt HAlfredsson L Klareskog L Holmdahl R 2007 A case-control studyof rheumatoid arthritis identifies an associated single nucleotidepolymorphism in the NCF4 gene supporting a role for the
NADPH-oxidase complex in autoimmunity Arthritis Res Ther9(5)R98
Packer BR Yeager M Burdett L et al (13 co-authors) 2006SNP500Cancer a public resource for sequence validation assay de-velopment and frequency analysis for genetic variation in candidategenes Nucleic Acids Res 34(Database issue)D617ndashD621
Peter BM Huerta-Sanchez E Nielsen R 2012 Distinguishing betweenselective sweeps from standing variation and from a de novo mu-tation PLoS Genet 8(10)e1003011
Przeworski M Coop G Wall JD 2005 The signature of positive selectionon standing genetic variation Evolution 59(11)2312ndash2323
Ptak SE Przeworski M 2002 Evidence for population growth in humansis confounded by fine-scale population structure Trends Genet18(11)559ndash563
Ramensky V Bork P Sunyaev S 2002 Human non-synonymous SNPsserver and survey Nucleic Acids Res 30(17)3894ndash3900
Rioux JD Xavier RJ Taylor KD et al (24 co-authors) 2007 Genome-wideassociation study identifies new susceptibility loci for Crohn diseaseand implicates autophagy in disease pathogenesis Nat Genet39(5)596ndash604
Roberts RL Hollis-Moffatt JE Gearry RB Kennedy MA Barclay MLMerriman TR 2008 Confirmation of association of IRGM andNCF4 with ileal Crohnrsquos disease in a population-based cohortGenes Immun 9(6)561ndash565
Rozas J 2009 DNA sequence polymorphism analysis using DnaSPMethods Mol Biol 537337ndash350
San Jose G Fortuno A Beloqui O Dıez J Zalba G 2008 NADPH oxidaseCYBA polymorphisms oxidative stress and cardiovascular diseasesClin Sci (Lond) 114(3)173ndash182
Santiago HC Gonzalez Lombana CZ Macedo JP et al (12 co-authors)2012 NADPH phagocyte oxidase knockout mice controlTrypanosoma cruzi proliferation but develop circulatory collapseand succumb to infection PLoS Negl Trop Dis 6(2)e1492
Stephens M Scheet P 2005 Accounting for decay of linkage disequilib-rium in haplotype inference and missing-data imputation Am JHum Genet 76(3)449ndash462
Sumimoto H Miyano K Takeya R 2005 Molecular composition andregulation of the Nox family NAD(P)H oxidases Biochem BiophysRes Commun 338(1)677ndash686
Tajima F 1983 Evolutionary relationship of DNA sequences in finitepopulations Genetics 105(2)437ndash460
Tajima F 1989 Statistical method for testing the neutral mutationhypothesis by DNA polymorphism Genetics 123(3)585ndash595
Tarazona-Santos E Bernig T Burdett L Magalhaes WC Fabbri C Liao JRedondo RA Welch R Yeager M Chanock SJ 2008 CYBB anNADPH-oxidase gene restricted diversity in humans and evidencefor differential long-term purifying selection on transmembrane andcytosolic domains Hum Mutat 29(5)623ndash632
Taylor RM Burritt JB Baniulis D Foubert TR Lord CI Dinauer MCParkos CA Jesaitis AJ 2004 Site-specific inhibitors of NADPH oxi-dase activity and structural probes of flavocytochrome b character-ization of six monoclonal antibodies to the p22phox subunitJ Immunol 173(12)7349ndash7357
Voight BF Kudaravalli S Wen X Pritchard JK 2006 A map of recentpositive selection in the human genome PLoS Biol 4(3)e72
Watterson GA 1975 On the number of segregating sites in geneticalmodels without recombination Theor Popul Biol 7(2) 256ndash276
Williamson SH Hernandez R Fledel-Alon A Zhu L Nielsen RBustamante CD 2005 Simultaneous inference of selection and pop-ulation growth from patterns of variation in the human genomeProc Natl Acad Sci U S A 102(22)7882ndash7887
Yang Z 2007a Adaptive molecular evolution In Balding DJ Bishop MCannings C editors Handbook of statistical genetics Vol 1 3rd edSusex (United Kingdom) John Wiley amp Sons p 377ndash406
Yang Z 2007b PAML 4 phylogenetic analysis by maximum likelihoodMol Biol Evol 241586ndash1591
2167
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
natural selection which produces an excess of common allelesassociated with high genetic diversity Thus the observed pat-tern of CYBA diversity in Europeans is not consistent with aselective sweep on a standing variation but it is consistent witha scenario of balancing selection acting on standing variation
If we consider for CYBA that heterozygote advantage maybe the mechanisms of balancing selection we can speculatethat the biological basis for this mechanism may be the fol-lowing considering that p22-phox is not exclusive of thephagocyte NADPH oxidase but it is also part of Nox com-plexes expressed in other tissues the dependence of ROSproduction on CYBA variants has to be finely regulated IfCYBA variants induce high levels of ROS these variants mayfavor a phagocyte-dependent efficient response to pathogensbut may damage other endothelial tissues Alternativelytissue oxidative damage does not occur if ROS productionis low but this response may be associated with a weaker
phagocyte respiratory burst against pathogens In this con-text heterozygote individuals with a CYBA-dependent inter-mediate level of ROS production may have been favored bynatural selection
Our results contribute to the discussion regarding the rel-evance of balancing selection in shaping the diversity ofinnate immunity genes (Ferrer-Admetlla et al 2008 Barreiroet al 2009) Ferrer-Admetlla et al (2008) have associated therecurrent signatures of balancing selection on inflammatorygenes with the need for fine regulation CYBA is the onlyphagocytic NADPH oxidase gene that also encodesnonphagocyte Nox components thus CYBA has a role inROS cell signaling a potentially dangerous process due toits capability to produce oxidative damage to tissues Geneswith these characteristics likely need even tighter regulationThe interplay between pathogen-driven selective pressureon innate immunity genes and their concomitantnonimmunological functions is complex In addition toCYBA other interesting examples of this interplay can befound among the 10 human toll-like receptors (TLRs) thatshow a variety of signatures of natural selection TLR8 whichshows the strongest signature of purifying selection is alsoinvolved in neuronal development (Barreiro et al 2009) and itis difficult to discriminate the role of each function in deter-mining the observed signature of natural selection
With few exceptions the pathogens responsible for naturalselection on immune genes are difficult to specify In the caseof NADPH oxidase we can infer based on the spectrum ofinfections in CGD patients that catalase-positive bacteria andfungi such as Staphylococci Salmonella Candida Aspergillusand M tuberculosis may be the selective pathogensInteractions between the host and pathogens also includemechanisms of the latter to impair the respiratory burst ofthe former For example Leishmania donovani blocks the as-sembly of NADPH oxidase at the phagosome membrane(Lodge et al 2006) These mechanisms may constitute selec-tive pressures imposed by pathogens
The associations reported in GWAS between rs4821544 inNCF4 and Crohnrsquos disease (an idiopathic inflammatory boweldisease that predominantly involves the ileum and colonRioux et al 2007) and between rs10911363 in NCF2 and
Table 3 Pairwise FST Genetic Distances between Populations
CYBBa CYBA
Africa Europe Asia Hispanic Africa Europe Asia Hispanic
Africa mdash 0316 0264 0092 mdash 0074 0083 0065
Europe 0257 mdash 0000 0107 mdash 0002 0056
Asia 0211 0000 mdash 0073 mdash 0054
Hispanic 0070 0082 0056 mdash mdash
NCF2 NCF4
Africa mdash 0048 0058 0037 mdash 0128 0136 0158
Europe mdash 0069 0000 mdash 0026 0026
Asia mdash 0059 mdash 0005
Hispanic mdash mdash
aFor CYBB FST estimators are above the diagonal Below the diagonal are the FST values corrected as if the effective population sizes of X chromosome genes were equal toautosomal ones
Table 4 Results of Neutrality Tests for the NADPH Oxidase Genesand Their Significancea
African European Asian Hispanic
Tajimarsquos D
CYBB 0473 0274 0813 0084
CYBA 0580 1242 0684 0707
NCF2 0412 0939 0883 0987
NCF4 0750 0243 0556 0102
Fu and Lirsquos D
CYBB 1050 1110 0395 0977
CYBA 1485 1734 1188 0138
NCF2 0407 1105 1904 1398
NCF4 0308 0443 1893 0266
Fu and Lirsquos F
CYBB 0980 0811 0114 0814
CYBA 1382 1893 1205 0405
NCF2 0512 1252 1893 1567
NCF4 0557 0231 1225 0248
NOTEmdashUnderlined values represent significant results under the demographic modelinferred by Laval et al (2010) for human populations See details in supplementarytable S4 Supplementary Material onlineaThe McDonaldndashKreitman test is nonsignificant in any of the cases
Significant under the WrightndashFisher model of constant population size
2164
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
systemic lupus erythematosus (Cunninghame Graham et al2011) confirm the involvement of NADPH genes in the path-ogenesis of inflammatory-related common diseases Ourclaim that natural selection acted on CYBA (and maybe inNCF2) is relevant for biomedical studies because combiningevidence of natural selection with association analyses inimmune genes increases the statistical power to detect dis-ease-associated variants (Ayodo et al 2007) As a new gener-ation of association studies focusing on rare variation isemerging the combination of genes deemed interestingfrom GWAS and populations with an excess of rare variantsin these genes such as NCF2 in Asians are particularly inter-esting as a source of rare variants with clinical relevanceFinally by determining through molecular evolution mappingthat the extracellular portion of gp91 (encoded by CYBB inthe X-chromosome) has been subject to recurrent episodes ofpositive selection at the scale of mammals evolution we positthe hypothesis that this portion of the NADPH oxidase isrelevant for currently unknown biological processes thatonce revealed by structural and functional investigationswill contribute to understanding the role of NADPH oxidasein infectious autoimmune and cardiovascular diseases
Materials and Methods
Molecular Evolution Analysis of NADPH Genes
We used the maximum likelihood framework developed byYang (2007a) to estimate for the NADPH oxidase genes
under a variety of evolutionary models as implemented in thePAML software (Yang 2007b) This approach allows infer-ences about the evolution of a coding region along aninter-specific phylogeny mapping the codons that haveevolved under strongweak purifying selection neutrality oradaptive positive selection Further details about these anal-yses are available as supplementary material SupplementaryMaterial online
Human Population Genetics of NADPH Genes
For human population genetics analyses we conducted bidi-rectional Sanger sequencing of CYBB CYBA NCF2 and NCF4for a total of 35242 bp for each of 102 healthy individuals aspart of the SNP500 Cancer project (Packer et al 2006 seesupplementary fig S1 Supplementary Material online for de-tails) Human population resequencing data for CYBB werepublished in Tarazona-Santos et al (2008) The resequencingexperiments were performed as in Packer et al (2006) These102 unrelated individuals include the following 24 of Africanancestry (15 African Americans from the United States and 9Pygmies) 23 admixed Latin Americans (from Mexico PuertoRico and South America) 31 Europeans (from the CEPHUTAH pedigree and the NIEHS Environmental GenomeProject) and 24 AsiansndashOceanians (from Melanesia PakistanChina Cambodia Japan and Taiwan)
After controlling for multiple tests we confirmed that allSNPs fit the HardyndashWeinberg equilibrium by the Guo and
CYBA - A16
CYBA - A3
Y72H - rs4673
V174A - rs1049254 CYBA - E18
(a)
NCF2 - E10 H389Q - rs17849502
NCF2 - B3
NCF2 - D11
NCF2 - C1K181R - rs2274064(b)
FIG 4 Phylogenetic networks of (a) CYBA in Europeans and (b) NCF2 in Europeans (black) and Asians (yellow) The lengths of the branches areproportional to the number of mutations Only nonsynonymous mutations are shown The haplotype names correspond to the table of inferredhaplotypes in the supplementary material Supplementary Material online We only show the names of haplotypes with a frequency gt5 Ancestralhaplotypes (inferred as being the human haplotype or median vector most similar to the chimpanzee sequence) are indicated by an arrow Medianvectors are in red
2165
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Thompson (1992) test which was implemented in the soft-ware Arlequin 30 (Excoffier et al 2007) Insertion-deletions(INDELs) were excluded from population genetics analysesTo assess intrapopulation diversity we used two statistics the per-site mean number of pairwise differences betweensequences (Tajima 1983) and w based on the number ofsegregating sites (S) (Watterson 1975) We measured pairwisebetween-populations diversity by using the FST statistics cal-culated using the software DnaSP (Rozas 2009)
Haplotypes and the recombination parameter were in-ferred using the PHASE software (Stephens and Scheet 2005)and diversity indexes calculations (tables 2 and 3) as well asneutrality statistics (table 4) were estimated using DnaSP soft-ware We applied two kinds of neutrality tests 1) tests basedon the allelic spectrum which is the distribution of polymor-phisms across different classes of frequencies namelyTajimarsquos DT (Tajima 1989) and FundashLirsquos DFL and FFL (Fu andLi 1993) and 2) tests based on comparisons between thenumber of polymorphisms in human populations and fixeddifferences with the chimpanzee (ie outgroup) namely theMcDonald and Kreitman (1991) test and the adaptedKolmogorovndashSmirnoff test by McDonald (1998) For thefirst set of tests we used as null hypotheses both the classicWrightndashFisher model of neutrality with a constant popula-tion size as well as the more realistic evolutionary scenario forhuman populations inferred by Laval et al (2010) In the caseof the scenario of Laval et al (2010) we ignored interconti-nental gene flow within the Old World because these raregene flow events likely does not affect the level of significanceof the neutrality tests given its very low inferred values(13 105) Null distributions used to test the significanceof the neutrality tests under these evolutionary scenarios weregenerated using coalescent simulations and a significancelevel of 005 (Hudson 2002) The KolmogorovndashSmirnoff testof neutrality adapted by McDonald (1998) was performedusing Slider software available at httpudeledu~mcdonaldaboutdnasliderhtml (last accessed July 16 2013) Weperformed coalescent simulations using ms software(Hudson 2002) Further methodological details are availableas supplementary material Supplementary Material onlineWe constructed the CYBA and NCF2 networks using allSNP variants and applying the Median joining algorithmand the maximum parsimony option calculations as imple-mented in the software Network 46 (Bandelt et al 1999)
Supplementary MaterialSupplementary tables S1ndashS5 and figure S1 are available atMolecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)
Acknowledgments
SJC conceived the study ET-S MM FL RC AC CF LPand BS generated the dataanalyzed the sequences LB ACWCSM and MY provided tools for data generationand analyses ET-S MM FL WCSM and RR performedpopulation genetics analyses ET-S and MM prepared tablesand figures ET-S and SJC wrote the manuscript All the
authors contributed with discussions about the results Theauthors are grateful to Silvia Fuselli for discussions of ourresults to Fernando Levi Soares for technical assistance withthe figures and the Sequencing Group of the CoreGenotyping Facility (National Cancer Institute) for their tech-nical assistance This work was supported by the IntramuralResearch Program of the National Cancer Institute (NCI) andNational Institutes of Health (NIH) CF was supported by theUniversity of Bologna ET-S MM WCSM and FL weresupported by the Fogarty International Center and NationalCancer Institute (1R01TW007894) Conselho Nacional deDesenvolvimento Cientıfico e Tecnologico (Brazil Grant3075562012-3) Fundacao de Amparo a Pesquisa de MinasGerais (Brazil CBB PPM 0026512) and the Brazilian Ministryof Education (CAPES Agency grant PNPD 0256709-1)
ReferencesAyodo G Price AL Keinan A Ajwang A Otieno MF Orago AS
Patterson N Reich D 2007 Combining evidence of natural selectionwith association analysis increases power to detect malaria-resis-tance variants Am J Hum Genet 81(2)234ndash242
Bamshad M Wooding SP 2003 Signatures of natural selection in thehuman genome Nat Rev Genet 4(2)99ndash111
Bandelt H-J Forster P Rohl A 1999 Median-joining networks for infer-ring intraspecific phylogenies Mol Biol Evol 16(1)37ndash48
Barreiro LB Ben-Ali M Quach H et al (17 co-authors) 2009Evolutionary dynamics of human Toll-like receptors and their dif-ferent contributions to host defense PLoS Genet 5(7)e1000562
Barreiro LB Quintana-Murci L 2010 From evolutionary genetics tohuman immunology how selection shapes host defence genesNat Rev Genet 11(1)17ndash30
Barrett RD Schluter D 2008 Adaptation from standing genetic varia-tion Trends Ecol Evol 23(1)38ndash44
Bedard K Attar H Bonnefont J Jaquet V Borel C Plastre O Stasia MJAntonarakis SE Krause KH 2009 Three common polymorphisms inthe CYBA gene form a haplotype associated with decreased ROSgeneration Hum Mutat 30(7)1123ndash1133
Blekhman R Man O Herrmann L Boyko AR Indap A Kosiol CBustamante CD Teshima KM Przeworski M 2008 Natural selectionon genes that underlie human disease susceptibility Curr Biol18(12)883ndash889
Brandes RP Kreuzer J 2005 Vascular NADPH oxidases molecular mech-anisms of activation Cardiovasc Res 65(1)16ndash27
Buckley RH 2004 Pulmonary complications of primary immunodefi-ciencies Paediatr Respir Rev 5(Suppl A)S225ndashS233
Bustamante CD Fledel-Alon A Williamson S et al (13 co-authors)2005 Natural selection on protein-coding genes in the humangenome Nature 437(7062)1153ndash1157
Bustamante J Arias AA Vogt G et al (29 co-authors) 2011 GermlineCYBB mutations that selectively affect macrophages in kindredswith X-linked predisposition to tuberculous mycobacterial diseaseNat Immunol 12(3)213ndash221
Campbell MC Tishkoff SA 2008 African genetic diversity implicationsfor human demographic history modern human origins and com-plex disease mapping Annu Rev Genomics Hum Genet 9403ndash433
Chanock SJ Roesler J Zhan S Hopkins P Lee P Barrett DT ChristensenBL Curnutte JT Gorlach A 2000 Genomic structure of the humanp47-phox (NCF1) gene Blood Cells Mol Dis 26(1)37ndash46
Cunninghame Graham DS Morris DL Bhangale TR Criswell LASyvanen AC Ronnblom L Behrens TW Graham RR Vyse TJ2011 Association of NCF2 IKZF1 IRF8 IFIH1 and TYK2 with sys-temic lupus erythematosus PLoS Genet 7(10)e1002341
Excoffier L Laval G Schneider S 2007 Arlequin ver 30 an integratedsoftware package for population genetics data analysis EvolBioinform Online 147ndash50
2166
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Excoffier L Ray N 2008 Surfing during population expansions promotesgenetic revolutions and structuration Trends Ecol Evol23(7)347ndash351
Ferrer-Admetlla A Bosch E Sikora M Marques-Bonet T Ramırez-SorianoA Muntasell A Navarro A Lazarus R Calafell F Bertranpetit J Casals F2008 Balancing selection is the main force shaping the evolution ofinnate immunity genes J Immunol 181(2)1315ndash1322
Fu YX Li WH 1993 Statistical tests of neutrality of mutations Genetics133(3)693ndash709
The 1000 Genomes Project Consortium 2010 A map of humangenome variation from population-scale sequencing Nature467(7319)1061ndash1073
The 1000 Genomes Project Consortium Abecasis GR Auton A BrooksLD DePristo MA Durbin RM Handsaker RE Kang HM Marth GTMcVean GA 2012 An integrated map of genetic variation from1092 human genomes Nature 491(7422)56ndash65
Guo SW Thompson EA 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles Biometrics 48(2)361ndash372
Heyworth PG Cross AR Curnutte JT 2003 Chronic granulomatousdisease Curr Opin Immunol 15(5)578ndash584
Hudson RR 2002 Generating samples under a Wright-Fisher neutralmodel of genetic variation Bioinformatics 18(2)337ndash338
Jacob CO Eisenstein M Dinauer MC et al (21 co-authors) 2012 Lupus-associated causal mutation in neutrophil cytosolic factor 2 (NCF2)brings unique insights to the structure and function of NADPHoxidase Proc Natl Acad Sci U S A 109(2)E59ndashE67
Kimura M 1974 The neutral theory of molecular evolution CambridgeCambridge University Press
Kosiol C Vinar T da Fonseca RR Hubisz MJ Bustamante CD Nielsen RSiepel A 2008 Patterns of positive selection in six mammalian ge-nomes PLoS Genet 4(8)e1000144
Lacy F Kailasam MT OrsquoConnor DT Schmid-Schonbein GW Parmer RJ2000 Plasma hydrogen peroxide production in human essentialhypertension role of heredity gender and ethnicity Hypertension36(5)878ndash884
Laval G Patin E Barreiro LB Quintana-Murci L 2010 Formulating a his-torical and demographic model of recent human evolution based onresequencing data from noncoding regions PLoS One 5(4)e10284
Lewontin R 1972 The apportionment of human diversity Evol Biol 6391ndash398
Lindblad-Toh K Garber M Zuk O et al (88 co-authors) 2011 A high-resolution map of human evolutionary constraint using 29 mam-mals Nature 478(7370)476ndash482
Lodge R Diallo TO Descoteaux A 2006 Leishmania donovani lipopho-sphoglycan blocks NADPH oxidase assembly at the phagosomemembrane Cell Microbiol 8(12)1922ndash1931
Magalhaes WC Rodrigues MR Silva D Soares-Souza G Iannini MLCerqueira GC Faria-Campos AC Tarazona-Santos E 2012DIVERGENOME a bioinformatics platform to assist population ge-netics and genetic epidemiology studies Genet Epidemiol36(4)360ndash367
Marth JD Grewal PK 2008 Mammalian glycosylation in immunity NatRev Immunol 8(11)874ndash887
McDonald JH 1998 Improved tests for heterogeneity across a region ofDNA sequence in the ratio of polymorphism to divergence Mol BiolEvol 15377ndash384
McDonald JH Kreitman M 1991 Adaptive protein evolution at the Adhlocus in Drosophila Nature 351(6328)652ndash654
Nielsen R Bustamante C Clark AG et al (12 co-authors) 2005 A scanfor positively selected genes in the genomes of humans and chim-panzees PLoS Biol 3(6)e170
Olsson LM Lindqvist AK Kallberg H Padyukov L Burkhardt HAlfredsson L Klareskog L Holmdahl R 2007 A case-control studyof rheumatoid arthritis identifies an associated single nucleotidepolymorphism in the NCF4 gene supporting a role for the
NADPH-oxidase complex in autoimmunity Arthritis Res Ther9(5)R98
Packer BR Yeager M Burdett L et al (13 co-authors) 2006SNP500Cancer a public resource for sequence validation assay de-velopment and frequency analysis for genetic variation in candidategenes Nucleic Acids Res 34(Database issue)D617ndashD621
Peter BM Huerta-Sanchez E Nielsen R 2012 Distinguishing betweenselective sweeps from standing variation and from a de novo mu-tation PLoS Genet 8(10)e1003011
Przeworski M Coop G Wall JD 2005 The signature of positive selectionon standing genetic variation Evolution 59(11)2312ndash2323
Ptak SE Przeworski M 2002 Evidence for population growth in humansis confounded by fine-scale population structure Trends Genet18(11)559ndash563
Ramensky V Bork P Sunyaev S 2002 Human non-synonymous SNPsserver and survey Nucleic Acids Res 30(17)3894ndash3900
Rioux JD Xavier RJ Taylor KD et al (24 co-authors) 2007 Genome-wideassociation study identifies new susceptibility loci for Crohn diseaseand implicates autophagy in disease pathogenesis Nat Genet39(5)596ndash604
Roberts RL Hollis-Moffatt JE Gearry RB Kennedy MA Barclay MLMerriman TR 2008 Confirmation of association of IRGM andNCF4 with ileal Crohnrsquos disease in a population-based cohortGenes Immun 9(6)561ndash565
Rozas J 2009 DNA sequence polymorphism analysis using DnaSPMethods Mol Biol 537337ndash350
San Jose G Fortuno A Beloqui O Dıez J Zalba G 2008 NADPH oxidaseCYBA polymorphisms oxidative stress and cardiovascular diseasesClin Sci (Lond) 114(3)173ndash182
Santiago HC Gonzalez Lombana CZ Macedo JP et al (12 co-authors)2012 NADPH phagocyte oxidase knockout mice controlTrypanosoma cruzi proliferation but develop circulatory collapseand succumb to infection PLoS Negl Trop Dis 6(2)e1492
Stephens M Scheet P 2005 Accounting for decay of linkage disequilib-rium in haplotype inference and missing-data imputation Am JHum Genet 76(3)449ndash462
Sumimoto H Miyano K Takeya R 2005 Molecular composition andregulation of the Nox family NAD(P)H oxidases Biochem BiophysRes Commun 338(1)677ndash686
Tajima F 1983 Evolutionary relationship of DNA sequences in finitepopulations Genetics 105(2)437ndash460
Tajima F 1989 Statistical method for testing the neutral mutationhypothesis by DNA polymorphism Genetics 123(3)585ndash595
Tarazona-Santos E Bernig T Burdett L Magalhaes WC Fabbri C Liao JRedondo RA Welch R Yeager M Chanock SJ 2008 CYBB anNADPH-oxidase gene restricted diversity in humans and evidencefor differential long-term purifying selection on transmembrane andcytosolic domains Hum Mutat 29(5)623ndash632
Taylor RM Burritt JB Baniulis D Foubert TR Lord CI Dinauer MCParkos CA Jesaitis AJ 2004 Site-specific inhibitors of NADPH oxi-dase activity and structural probes of flavocytochrome b character-ization of six monoclonal antibodies to the p22phox subunitJ Immunol 173(12)7349ndash7357
Voight BF Kudaravalli S Wen X Pritchard JK 2006 A map of recentpositive selection in the human genome PLoS Biol 4(3)e72
Watterson GA 1975 On the number of segregating sites in geneticalmodels without recombination Theor Popul Biol 7(2) 256ndash276
Williamson SH Hernandez R Fledel-Alon A Zhu L Nielsen RBustamante CD 2005 Simultaneous inference of selection and pop-ulation growth from patterns of variation in the human genomeProc Natl Acad Sci U S A 102(22)7882ndash7887
Yang Z 2007a Adaptive molecular evolution In Balding DJ Bishop MCannings C editors Handbook of statistical genetics Vol 1 3rd edSusex (United Kingdom) John Wiley amp Sons p 377ndash406
Yang Z 2007b PAML 4 phylogenetic analysis by maximum likelihoodMol Biol Evol 241586ndash1591
2167
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
systemic lupus erythematosus (Cunninghame Graham et al2011) confirm the involvement of NADPH genes in the path-ogenesis of inflammatory-related common diseases Ourclaim that natural selection acted on CYBA (and maybe inNCF2) is relevant for biomedical studies because combiningevidence of natural selection with association analyses inimmune genes increases the statistical power to detect dis-ease-associated variants (Ayodo et al 2007) As a new gener-ation of association studies focusing on rare variation isemerging the combination of genes deemed interestingfrom GWAS and populations with an excess of rare variantsin these genes such as NCF2 in Asians are particularly inter-esting as a source of rare variants with clinical relevanceFinally by determining through molecular evolution mappingthat the extracellular portion of gp91 (encoded by CYBB inthe X-chromosome) has been subject to recurrent episodes ofpositive selection at the scale of mammals evolution we positthe hypothesis that this portion of the NADPH oxidase isrelevant for currently unknown biological processes thatonce revealed by structural and functional investigationswill contribute to understanding the role of NADPH oxidasein infectious autoimmune and cardiovascular diseases
Materials and Methods
Molecular Evolution Analysis of NADPH Genes
We used the maximum likelihood framework developed byYang (2007a) to estimate for the NADPH oxidase genes
under a variety of evolutionary models as implemented in thePAML software (Yang 2007b) This approach allows infer-ences about the evolution of a coding region along aninter-specific phylogeny mapping the codons that haveevolved under strongweak purifying selection neutrality oradaptive positive selection Further details about these anal-yses are available as supplementary material SupplementaryMaterial online
Human Population Genetics of NADPH Genes
For human population genetics analyses we conducted bidi-rectional Sanger sequencing of CYBB CYBA NCF2 and NCF4for a total of 35242 bp for each of 102 healthy individuals aspart of the SNP500 Cancer project (Packer et al 2006 seesupplementary fig S1 Supplementary Material online for de-tails) Human population resequencing data for CYBB werepublished in Tarazona-Santos et al (2008) The resequencingexperiments were performed as in Packer et al (2006) These102 unrelated individuals include the following 24 of Africanancestry (15 African Americans from the United States and 9Pygmies) 23 admixed Latin Americans (from Mexico PuertoRico and South America) 31 Europeans (from the CEPHUTAH pedigree and the NIEHS Environmental GenomeProject) and 24 AsiansndashOceanians (from Melanesia PakistanChina Cambodia Japan and Taiwan)
After controlling for multiple tests we confirmed that allSNPs fit the HardyndashWeinberg equilibrium by the Guo and
CYBA - A16
CYBA - A3
Y72H - rs4673
V174A - rs1049254 CYBA - E18
(a)
NCF2 - E10 H389Q - rs17849502
NCF2 - B3
NCF2 - D11
NCF2 - C1K181R - rs2274064(b)
FIG 4 Phylogenetic networks of (a) CYBA in Europeans and (b) NCF2 in Europeans (black) and Asians (yellow) The lengths of the branches areproportional to the number of mutations Only nonsynonymous mutations are shown The haplotype names correspond to the table of inferredhaplotypes in the supplementary material Supplementary Material online We only show the names of haplotypes with a frequency gt5 Ancestralhaplotypes (inferred as being the human haplotype or median vector most similar to the chimpanzee sequence) are indicated by an arrow Medianvectors are in red
2165
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Thompson (1992) test which was implemented in the soft-ware Arlequin 30 (Excoffier et al 2007) Insertion-deletions(INDELs) were excluded from population genetics analysesTo assess intrapopulation diversity we used two statistics the per-site mean number of pairwise differences betweensequences (Tajima 1983) and w based on the number ofsegregating sites (S) (Watterson 1975) We measured pairwisebetween-populations diversity by using the FST statistics cal-culated using the software DnaSP (Rozas 2009)
Haplotypes and the recombination parameter were in-ferred using the PHASE software (Stephens and Scheet 2005)and diversity indexes calculations (tables 2 and 3) as well asneutrality statistics (table 4) were estimated using DnaSP soft-ware We applied two kinds of neutrality tests 1) tests basedon the allelic spectrum which is the distribution of polymor-phisms across different classes of frequencies namelyTajimarsquos DT (Tajima 1989) and FundashLirsquos DFL and FFL (Fu andLi 1993) and 2) tests based on comparisons between thenumber of polymorphisms in human populations and fixeddifferences with the chimpanzee (ie outgroup) namely theMcDonald and Kreitman (1991) test and the adaptedKolmogorovndashSmirnoff test by McDonald (1998) For thefirst set of tests we used as null hypotheses both the classicWrightndashFisher model of neutrality with a constant popula-tion size as well as the more realistic evolutionary scenario forhuman populations inferred by Laval et al (2010) In the caseof the scenario of Laval et al (2010) we ignored interconti-nental gene flow within the Old World because these raregene flow events likely does not affect the level of significanceof the neutrality tests given its very low inferred values(13 105) Null distributions used to test the significanceof the neutrality tests under these evolutionary scenarios weregenerated using coalescent simulations and a significancelevel of 005 (Hudson 2002) The KolmogorovndashSmirnoff testof neutrality adapted by McDonald (1998) was performedusing Slider software available at httpudeledu~mcdonaldaboutdnasliderhtml (last accessed July 16 2013) Weperformed coalescent simulations using ms software(Hudson 2002) Further methodological details are availableas supplementary material Supplementary Material onlineWe constructed the CYBA and NCF2 networks using allSNP variants and applying the Median joining algorithmand the maximum parsimony option calculations as imple-mented in the software Network 46 (Bandelt et al 1999)
Supplementary MaterialSupplementary tables S1ndashS5 and figure S1 are available atMolecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)
Acknowledgments
SJC conceived the study ET-S MM FL RC AC CF LPand BS generated the dataanalyzed the sequences LB ACWCSM and MY provided tools for data generationand analyses ET-S MM FL WCSM and RR performedpopulation genetics analyses ET-S and MM prepared tablesand figures ET-S and SJC wrote the manuscript All the
authors contributed with discussions about the results Theauthors are grateful to Silvia Fuselli for discussions of ourresults to Fernando Levi Soares for technical assistance withthe figures and the Sequencing Group of the CoreGenotyping Facility (National Cancer Institute) for their tech-nical assistance This work was supported by the IntramuralResearch Program of the National Cancer Institute (NCI) andNational Institutes of Health (NIH) CF was supported by theUniversity of Bologna ET-S MM WCSM and FL weresupported by the Fogarty International Center and NationalCancer Institute (1R01TW007894) Conselho Nacional deDesenvolvimento Cientıfico e Tecnologico (Brazil Grant3075562012-3) Fundacao de Amparo a Pesquisa de MinasGerais (Brazil CBB PPM 0026512) and the Brazilian Ministryof Education (CAPES Agency grant PNPD 0256709-1)
ReferencesAyodo G Price AL Keinan A Ajwang A Otieno MF Orago AS
Patterson N Reich D 2007 Combining evidence of natural selectionwith association analysis increases power to detect malaria-resis-tance variants Am J Hum Genet 81(2)234ndash242
Bamshad M Wooding SP 2003 Signatures of natural selection in thehuman genome Nat Rev Genet 4(2)99ndash111
Bandelt H-J Forster P Rohl A 1999 Median-joining networks for infer-ring intraspecific phylogenies Mol Biol Evol 16(1)37ndash48
Barreiro LB Ben-Ali M Quach H et al (17 co-authors) 2009Evolutionary dynamics of human Toll-like receptors and their dif-ferent contributions to host defense PLoS Genet 5(7)e1000562
Barreiro LB Quintana-Murci L 2010 From evolutionary genetics tohuman immunology how selection shapes host defence genesNat Rev Genet 11(1)17ndash30
Barrett RD Schluter D 2008 Adaptation from standing genetic varia-tion Trends Ecol Evol 23(1)38ndash44
Bedard K Attar H Bonnefont J Jaquet V Borel C Plastre O Stasia MJAntonarakis SE Krause KH 2009 Three common polymorphisms inthe CYBA gene form a haplotype associated with decreased ROSgeneration Hum Mutat 30(7)1123ndash1133
Blekhman R Man O Herrmann L Boyko AR Indap A Kosiol CBustamante CD Teshima KM Przeworski M 2008 Natural selectionon genes that underlie human disease susceptibility Curr Biol18(12)883ndash889
Brandes RP Kreuzer J 2005 Vascular NADPH oxidases molecular mech-anisms of activation Cardiovasc Res 65(1)16ndash27
Buckley RH 2004 Pulmonary complications of primary immunodefi-ciencies Paediatr Respir Rev 5(Suppl A)S225ndashS233
Bustamante CD Fledel-Alon A Williamson S et al (13 co-authors)2005 Natural selection on protein-coding genes in the humangenome Nature 437(7062)1153ndash1157
Bustamante J Arias AA Vogt G et al (29 co-authors) 2011 GermlineCYBB mutations that selectively affect macrophages in kindredswith X-linked predisposition to tuberculous mycobacterial diseaseNat Immunol 12(3)213ndash221
Campbell MC Tishkoff SA 2008 African genetic diversity implicationsfor human demographic history modern human origins and com-plex disease mapping Annu Rev Genomics Hum Genet 9403ndash433
Chanock SJ Roesler J Zhan S Hopkins P Lee P Barrett DT ChristensenBL Curnutte JT Gorlach A 2000 Genomic structure of the humanp47-phox (NCF1) gene Blood Cells Mol Dis 26(1)37ndash46
Cunninghame Graham DS Morris DL Bhangale TR Criswell LASyvanen AC Ronnblom L Behrens TW Graham RR Vyse TJ2011 Association of NCF2 IKZF1 IRF8 IFIH1 and TYK2 with sys-temic lupus erythematosus PLoS Genet 7(10)e1002341
Excoffier L Laval G Schneider S 2007 Arlequin ver 30 an integratedsoftware package for population genetics data analysis EvolBioinform Online 147ndash50
2166
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Excoffier L Ray N 2008 Surfing during population expansions promotesgenetic revolutions and structuration Trends Ecol Evol23(7)347ndash351
Ferrer-Admetlla A Bosch E Sikora M Marques-Bonet T Ramırez-SorianoA Muntasell A Navarro A Lazarus R Calafell F Bertranpetit J Casals F2008 Balancing selection is the main force shaping the evolution ofinnate immunity genes J Immunol 181(2)1315ndash1322
Fu YX Li WH 1993 Statistical tests of neutrality of mutations Genetics133(3)693ndash709
The 1000 Genomes Project Consortium 2010 A map of humangenome variation from population-scale sequencing Nature467(7319)1061ndash1073
The 1000 Genomes Project Consortium Abecasis GR Auton A BrooksLD DePristo MA Durbin RM Handsaker RE Kang HM Marth GTMcVean GA 2012 An integrated map of genetic variation from1092 human genomes Nature 491(7422)56ndash65
Guo SW Thompson EA 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles Biometrics 48(2)361ndash372
Heyworth PG Cross AR Curnutte JT 2003 Chronic granulomatousdisease Curr Opin Immunol 15(5)578ndash584
Hudson RR 2002 Generating samples under a Wright-Fisher neutralmodel of genetic variation Bioinformatics 18(2)337ndash338
Jacob CO Eisenstein M Dinauer MC et al (21 co-authors) 2012 Lupus-associated causal mutation in neutrophil cytosolic factor 2 (NCF2)brings unique insights to the structure and function of NADPHoxidase Proc Natl Acad Sci U S A 109(2)E59ndashE67
Kimura M 1974 The neutral theory of molecular evolution CambridgeCambridge University Press
Kosiol C Vinar T da Fonseca RR Hubisz MJ Bustamante CD Nielsen RSiepel A 2008 Patterns of positive selection in six mammalian ge-nomes PLoS Genet 4(8)e1000144
Lacy F Kailasam MT OrsquoConnor DT Schmid-Schonbein GW Parmer RJ2000 Plasma hydrogen peroxide production in human essentialhypertension role of heredity gender and ethnicity Hypertension36(5)878ndash884
Laval G Patin E Barreiro LB Quintana-Murci L 2010 Formulating a his-torical and demographic model of recent human evolution based onresequencing data from noncoding regions PLoS One 5(4)e10284
Lewontin R 1972 The apportionment of human diversity Evol Biol 6391ndash398
Lindblad-Toh K Garber M Zuk O et al (88 co-authors) 2011 A high-resolution map of human evolutionary constraint using 29 mam-mals Nature 478(7370)476ndash482
Lodge R Diallo TO Descoteaux A 2006 Leishmania donovani lipopho-sphoglycan blocks NADPH oxidase assembly at the phagosomemembrane Cell Microbiol 8(12)1922ndash1931
Magalhaes WC Rodrigues MR Silva D Soares-Souza G Iannini MLCerqueira GC Faria-Campos AC Tarazona-Santos E 2012DIVERGENOME a bioinformatics platform to assist population ge-netics and genetic epidemiology studies Genet Epidemiol36(4)360ndash367
Marth JD Grewal PK 2008 Mammalian glycosylation in immunity NatRev Immunol 8(11)874ndash887
McDonald JH 1998 Improved tests for heterogeneity across a region ofDNA sequence in the ratio of polymorphism to divergence Mol BiolEvol 15377ndash384
McDonald JH Kreitman M 1991 Adaptive protein evolution at the Adhlocus in Drosophila Nature 351(6328)652ndash654
Nielsen R Bustamante C Clark AG et al (12 co-authors) 2005 A scanfor positively selected genes in the genomes of humans and chim-panzees PLoS Biol 3(6)e170
Olsson LM Lindqvist AK Kallberg H Padyukov L Burkhardt HAlfredsson L Klareskog L Holmdahl R 2007 A case-control studyof rheumatoid arthritis identifies an associated single nucleotidepolymorphism in the NCF4 gene supporting a role for the
NADPH-oxidase complex in autoimmunity Arthritis Res Ther9(5)R98
Packer BR Yeager M Burdett L et al (13 co-authors) 2006SNP500Cancer a public resource for sequence validation assay de-velopment and frequency analysis for genetic variation in candidategenes Nucleic Acids Res 34(Database issue)D617ndashD621
Peter BM Huerta-Sanchez E Nielsen R 2012 Distinguishing betweenselective sweeps from standing variation and from a de novo mu-tation PLoS Genet 8(10)e1003011
Przeworski M Coop G Wall JD 2005 The signature of positive selectionon standing genetic variation Evolution 59(11)2312ndash2323
Ptak SE Przeworski M 2002 Evidence for population growth in humansis confounded by fine-scale population structure Trends Genet18(11)559ndash563
Ramensky V Bork P Sunyaev S 2002 Human non-synonymous SNPsserver and survey Nucleic Acids Res 30(17)3894ndash3900
Rioux JD Xavier RJ Taylor KD et al (24 co-authors) 2007 Genome-wideassociation study identifies new susceptibility loci for Crohn diseaseand implicates autophagy in disease pathogenesis Nat Genet39(5)596ndash604
Roberts RL Hollis-Moffatt JE Gearry RB Kennedy MA Barclay MLMerriman TR 2008 Confirmation of association of IRGM andNCF4 with ileal Crohnrsquos disease in a population-based cohortGenes Immun 9(6)561ndash565
Rozas J 2009 DNA sequence polymorphism analysis using DnaSPMethods Mol Biol 537337ndash350
San Jose G Fortuno A Beloqui O Dıez J Zalba G 2008 NADPH oxidaseCYBA polymorphisms oxidative stress and cardiovascular diseasesClin Sci (Lond) 114(3)173ndash182
Santiago HC Gonzalez Lombana CZ Macedo JP et al (12 co-authors)2012 NADPH phagocyte oxidase knockout mice controlTrypanosoma cruzi proliferation but develop circulatory collapseand succumb to infection PLoS Negl Trop Dis 6(2)e1492
Stephens M Scheet P 2005 Accounting for decay of linkage disequilib-rium in haplotype inference and missing-data imputation Am JHum Genet 76(3)449ndash462
Sumimoto H Miyano K Takeya R 2005 Molecular composition andregulation of the Nox family NAD(P)H oxidases Biochem BiophysRes Commun 338(1)677ndash686
Tajima F 1983 Evolutionary relationship of DNA sequences in finitepopulations Genetics 105(2)437ndash460
Tajima F 1989 Statistical method for testing the neutral mutationhypothesis by DNA polymorphism Genetics 123(3)585ndash595
Tarazona-Santos E Bernig T Burdett L Magalhaes WC Fabbri C Liao JRedondo RA Welch R Yeager M Chanock SJ 2008 CYBB anNADPH-oxidase gene restricted diversity in humans and evidencefor differential long-term purifying selection on transmembrane andcytosolic domains Hum Mutat 29(5)623ndash632
Taylor RM Burritt JB Baniulis D Foubert TR Lord CI Dinauer MCParkos CA Jesaitis AJ 2004 Site-specific inhibitors of NADPH oxi-dase activity and structural probes of flavocytochrome b character-ization of six monoclonal antibodies to the p22phox subunitJ Immunol 173(12)7349ndash7357
Voight BF Kudaravalli S Wen X Pritchard JK 2006 A map of recentpositive selection in the human genome PLoS Biol 4(3)e72
Watterson GA 1975 On the number of segregating sites in geneticalmodels without recombination Theor Popul Biol 7(2) 256ndash276
Williamson SH Hernandez R Fledel-Alon A Zhu L Nielsen RBustamante CD 2005 Simultaneous inference of selection and pop-ulation growth from patterns of variation in the human genomeProc Natl Acad Sci U S A 102(22)7882ndash7887
Yang Z 2007a Adaptive molecular evolution In Balding DJ Bishop MCannings C editors Handbook of statistical genetics Vol 1 3rd edSusex (United Kingdom) John Wiley amp Sons p 377ndash406
Yang Z 2007b PAML 4 phylogenetic analysis by maximum likelihoodMol Biol Evol 241586ndash1591
2167
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Thompson (1992) test which was implemented in the soft-ware Arlequin 30 (Excoffier et al 2007) Insertion-deletions(INDELs) were excluded from population genetics analysesTo assess intrapopulation diversity we used two statistics the per-site mean number of pairwise differences betweensequences (Tajima 1983) and w based on the number ofsegregating sites (S) (Watterson 1975) We measured pairwisebetween-populations diversity by using the FST statistics cal-culated using the software DnaSP (Rozas 2009)
Haplotypes and the recombination parameter were in-ferred using the PHASE software (Stephens and Scheet 2005)and diversity indexes calculations (tables 2 and 3) as well asneutrality statistics (table 4) were estimated using DnaSP soft-ware We applied two kinds of neutrality tests 1) tests basedon the allelic spectrum which is the distribution of polymor-phisms across different classes of frequencies namelyTajimarsquos DT (Tajima 1989) and FundashLirsquos DFL and FFL (Fu andLi 1993) and 2) tests based on comparisons between thenumber of polymorphisms in human populations and fixeddifferences with the chimpanzee (ie outgroup) namely theMcDonald and Kreitman (1991) test and the adaptedKolmogorovndashSmirnoff test by McDonald (1998) For thefirst set of tests we used as null hypotheses both the classicWrightndashFisher model of neutrality with a constant popula-tion size as well as the more realistic evolutionary scenario forhuman populations inferred by Laval et al (2010) In the caseof the scenario of Laval et al (2010) we ignored interconti-nental gene flow within the Old World because these raregene flow events likely does not affect the level of significanceof the neutrality tests given its very low inferred values(13 105) Null distributions used to test the significanceof the neutrality tests under these evolutionary scenarios weregenerated using coalescent simulations and a significancelevel of 005 (Hudson 2002) The KolmogorovndashSmirnoff testof neutrality adapted by McDonald (1998) was performedusing Slider software available at httpudeledu~mcdonaldaboutdnasliderhtml (last accessed July 16 2013) Weperformed coalescent simulations using ms software(Hudson 2002) Further methodological details are availableas supplementary material Supplementary Material onlineWe constructed the CYBA and NCF2 networks using allSNP variants and applying the Median joining algorithmand the maximum parsimony option calculations as imple-mented in the software Network 46 (Bandelt et al 1999)
Supplementary MaterialSupplementary tables S1ndashS5 and figure S1 are available atMolecular Biology and Evolution online (httpwwwmbeoxfordjournalsorg)
Acknowledgments
SJC conceived the study ET-S MM FL RC AC CF LPand BS generated the dataanalyzed the sequences LB ACWCSM and MY provided tools for data generationand analyses ET-S MM FL WCSM and RR performedpopulation genetics analyses ET-S and MM prepared tablesand figures ET-S and SJC wrote the manuscript All the
authors contributed with discussions about the results Theauthors are grateful to Silvia Fuselli for discussions of ourresults to Fernando Levi Soares for technical assistance withthe figures and the Sequencing Group of the CoreGenotyping Facility (National Cancer Institute) for their tech-nical assistance This work was supported by the IntramuralResearch Program of the National Cancer Institute (NCI) andNational Institutes of Health (NIH) CF was supported by theUniversity of Bologna ET-S MM WCSM and FL weresupported by the Fogarty International Center and NationalCancer Institute (1R01TW007894) Conselho Nacional deDesenvolvimento Cientıfico e Tecnologico (Brazil Grant3075562012-3) Fundacao de Amparo a Pesquisa de MinasGerais (Brazil CBB PPM 0026512) and the Brazilian Ministryof Education (CAPES Agency grant PNPD 0256709-1)
ReferencesAyodo G Price AL Keinan A Ajwang A Otieno MF Orago AS
Patterson N Reich D 2007 Combining evidence of natural selectionwith association analysis increases power to detect malaria-resis-tance variants Am J Hum Genet 81(2)234ndash242
Bamshad M Wooding SP 2003 Signatures of natural selection in thehuman genome Nat Rev Genet 4(2)99ndash111
Bandelt H-J Forster P Rohl A 1999 Median-joining networks for infer-ring intraspecific phylogenies Mol Biol Evol 16(1)37ndash48
Barreiro LB Ben-Ali M Quach H et al (17 co-authors) 2009Evolutionary dynamics of human Toll-like receptors and their dif-ferent contributions to host defense PLoS Genet 5(7)e1000562
Barreiro LB Quintana-Murci L 2010 From evolutionary genetics tohuman immunology how selection shapes host defence genesNat Rev Genet 11(1)17ndash30
Barrett RD Schluter D 2008 Adaptation from standing genetic varia-tion Trends Ecol Evol 23(1)38ndash44
Bedard K Attar H Bonnefont J Jaquet V Borel C Plastre O Stasia MJAntonarakis SE Krause KH 2009 Three common polymorphisms inthe CYBA gene form a haplotype associated with decreased ROSgeneration Hum Mutat 30(7)1123ndash1133
Blekhman R Man O Herrmann L Boyko AR Indap A Kosiol CBustamante CD Teshima KM Przeworski M 2008 Natural selectionon genes that underlie human disease susceptibility Curr Biol18(12)883ndash889
Brandes RP Kreuzer J 2005 Vascular NADPH oxidases molecular mech-anisms of activation Cardiovasc Res 65(1)16ndash27
Buckley RH 2004 Pulmonary complications of primary immunodefi-ciencies Paediatr Respir Rev 5(Suppl A)S225ndashS233
Bustamante CD Fledel-Alon A Williamson S et al (13 co-authors)2005 Natural selection on protein-coding genes in the humangenome Nature 437(7062)1153ndash1157
Bustamante J Arias AA Vogt G et al (29 co-authors) 2011 GermlineCYBB mutations that selectively affect macrophages in kindredswith X-linked predisposition to tuberculous mycobacterial diseaseNat Immunol 12(3)213ndash221
Campbell MC Tishkoff SA 2008 African genetic diversity implicationsfor human demographic history modern human origins and com-plex disease mapping Annu Rev Genomics Hum Genet 9403ndash433
Chanock SJ Roesler J Zhan S Hopkins P Lee P Barrett DT ChristensenBL Curnutte JT Gorlach A 2000 Genomic structure of the humanp47-phox (NCF1) gene Blood Cells Mol Dis 26(1)37ndash46
Cunninghame Graham DS Morris DL Bhangale TR Criswell LASyvanen AC Ronnblom L Behrens TW Graham RR Vyse TJ2011 Association of NCF2 IKZF1 IRF8 IFIH1 and TYK2 with sys-temic lupus erythematosus PLoS Genet 7(10)e1002341
Excoffier L Laval G Schneider S 2007 Arlequin ver 30 an integratedsoftware package for population genetics data analysis EvolBioinform Online 147ndash50
2166
Tarazona-Santos et al doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Excoffier L Ray N 2008 Surfing during population expansions promotesgenetic revolutions and structuration Trends Ecol Evol23(7)347ndash351
Ferrer-Admetlla A Bosch E Sikora M Marques-Bonet T Ramırez-SorianoA Muntasell A Navarro A Lazarus R Calafell F Bertranpetit J Casals F2008 Balancing selection is the main force shaping the evolution ofinnate immunity genes J Immunol 181(2)1315ndash1322
Fu YX Li WH 1993 Statistical tests of neutrality of mutations Genetics133(3)693ndash709
The 1000 Genomes Project Consortium 2010 A map of humangenome variation from population-scale sequencing Nature467(7319)1061ndash1073
The 1000 Genomes Project Consortium Abecasis GR Auton A BrooksLD DePristo MA Durbin RM Handsaker RE Kang HM Marth GTMcVean GA 2012 An integrated map of genetic variation from1092 human genomes Nature 491(7422)56ndash65
Guo SW Thompson EA 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles Biometrics 48(2)361ndash372
Heyworth PG Cross AR Curnutte JT 2003 Chronic granulomatousdisease Curr Opin Immunol 15(5)578ndash584
Hudson RR 2002 Generating samples under a Wright-Fisher neutralmodel of genetic variation Bioinformatics 18(2)337ndash338
Jacob CO Eisenstein M Dinauer MC et al (21 co-authors) 2012 Lupus-associated causal mutation in neutrophil cytosolic factor 2 (NCF2)brings unique insights to the structure and function of NADPHoxidase Proc Natl Acad Sci U S A 109(2)E59ndashE67
Kimura M 1974 The neutral theory of molecular evolution CambridgeCambridge University Press
Kosiol C Vinar T da Fonseca RR Hubisz MJ Bustamante CD Nielsen RSiepel A 2008 Patterns of positive selection in six mammalian ge-nomes PLoS Genet 4(8)e1000144
Lacy F Kailasam MT OrsquoConnor DT Schmid-Schonbein GW Parmer RJ2000 Plasma hydrogen peroxide production in human essentialhypertension role of heredity gender and ethnicity Hypertension36(5)878ndash884
Laval G Patin E Barreiro LB Quintana-Murci L 2010 Formulating a his-torical and demographic model of recent human evolution based onresequencing data from noncoding regions PLoS One 5(4)e10284
Lewontin R 1972 The apportionment of human diversity Evol Biol 6391ndash398
Lindblad-Toh K Garber M Zuk O et al (88 co-authors) 2011 A high-resolution map of human evolutionary constraint using 29 mam-mals Nature 478(7370)476ndash482
Lodge R Diallo TO Descoteaux A 2006 Leishmania donovani lipopho-sphoglycan blocks NADPH oxidase assembly at the phagosomemembrane Cell Microbiol 8(12)1922ndash1931
Magalhaes WC Rodrigues MR Silva D Soares-Souza G Iannini MLCerqueira GC Faria-Campos AC Tarazona-Santos E 2012DIVERGENOME a bioinformatics platform to assist population ge-netics and genetic epidemiology studies Genet Epidemiol36(4)360ndash367
Marth JD Grewal PK 2008 Mammalian glycosylation in immunity NatRev Immunol 8(11)874ndash887
McDonald JH 1998 Improved tests for heterogeneity across a region ofDNA sequence in the ratio of polymorphism to divergence Mol BiolEvol 15377ndash384
McDonald JH Kreitman M 1991 Adaptive protein evolution at the Adhlocus in Drosophila Nature 351(6328)652ndash654
Nielsen R Bustamante C Clark AG et al (12 co-authors) 2005 A scanfor positively selected genes in the genomes of humans and chim-panzees PLoS Biol 3(6)e170
Olsson LM Lindqvist AK Kallberg H Padyukov L Burkhardt HAlfredsson L Klareskog L Holmdahl R 2007 A case-control studyof rheumatoid arthritis identifies an associated single nucleotidepolymorphism in the NCF4 gene supporting a role for the
NADPH-oxidase complex in autoimmunity Arthritis Res Ther9(5)R98
Packer BR Yeager M Burdett L et al (13 co-authors) 2006SNP500Cancer a public resource for sequence validation assay de-velopment and frequency analysis for genetic variation in candidategenes Nucleic Acids Res 34(Database issue)D617ndashD621
Peter BM Huerta-Sanchez E Nielsen R 2012 Distinguishing betweenselective sweeps from standing variation and from a de novo mu-tation PLoS Genet 8(10)e1003011
Przeworski M Coop G Wall JD 2005 The signature of positive selectionon standing genetic variation Evolution 59(11)2312ndash2323
Ptak SE Przeworski M 2002 Evidence for population growth in humansis confounded by fine-scale population structure Trends Genet18(11)559ndash563
Ramensky V Bork P Sunyaev S 2002 Human non-synonymous SNPsserver and survey Nucleic Acids Res 30(17)3894ndash3900
Rioux JD Xavier RJ Taylor KD et al (24 co-authors) 2007 Genome-wideassociation study identifies new susceptibility loci for Crohn diseaseand implicates autophagy in disease pathogenesis Nat Genet39(5)596ndash604
Roberts RL Hollis-Moffatt JE Gearry RB Kennedy MA Barclay MLMerriman TR 2008 Confirmation of association of IRGM andNCF4 with ileal Crohnrsquos disease in a population-based cohortGenes Immun 9(6)561ndash565
Rozas J 2009 DNA sequence polymorphism analysis using DnaSPMethods Mol Biol 537337ndash350
San Jose G Fortuno A Beloqui O Dıez J Zalba G 2008 NADPH oxidaseCYBA polymorphisms oxidative stress and cardiovascular diseasesClin Sci (Lond) 114(3)173ndash182
Santiago HC Gonzalez Lombana CZ Macedo JP et al (12 co-authors)2012 NADPH phagocyte oxidase knockout mice controlTrypanosoma cruzi proliferation but develop circulatory collapseand succumb to infection PLoS Negl Trop Dis 6(2)e1492
Stephens M Scheet P 2005 Accounting for decay of linkage disequilib-rium in haplotype inference and missing-data imputation Am JHum Genet 76(3)449ndash462
Sumimoto H Miyano K Takeya R 2005 Molecular composition andregulation of the Nox family NAD(P)H oxidases Biochem BiophysRes Commun 338(1)677ndash686
Tajima F 1983 Evolutionary relationship of DNA sequences in finitepopulations Genetics 105(2)437ndash460
Tajima F 1989 Statistical method for testing the neutral mutationhypothesis by DNA polymorphism Genetics 123(3)585ndash595
Tarazona-Santos E Bernig T Burdett L Magalhaes WC Fabbri C Liao JRedondo RA Welch R Yeager M Chanock SJ 2008 CYBB anNADPH-oxidase gene restricted diversity in humans and evidencefor differential long-term purifying selection on transmembrane andcytosolic domains Hum Mutat 29(5)623ndash632
Taylor RM Burritt JB Baniulis D Foubert TR Lord CI Dinauer MCParkos CA Jesaitis AJ 2004 Site-specific inhibitors of NADPH oxi-dase activity and structural probes of flavocytochrome b character-ization of six monoclonal antibodies to the p22phox subunitJ Immunol 173(12)7349ndash7357
Voight BF Kudaravalli S Wen X Pritchard JK 2006 A map of recentpositive selection in the human genome PLoS Biol 4(3)e72
Watterson GA 1975 On the number of segregating sites in geneticalmodels without recombination Theor Popul Biol 7(2) 256ndash276
Williamson SH Hernandez R Fledel-Alon A Zhu L Nielsen RBustamante CD 2005 Simultaneous inference of selection and pop-ulation growth from patterns of variation in the human genomeProc Natl Acad Sci U S A 102(22)7882ndash7887
Yang Z 2007a Adaptive molecular evolution In Balding DJ Bishop MCannings C editors Handbook of statistical genetics Vol 1 3rd edSusex (United Kingdom) John Wiley amp Sons p 377ndash406
Yang Z 2007b PAML 4 phylogenetic analysis by maximum likelihoodMol Biol Evol 241586ndash1591
2167
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from
Excoffier L Ray N 2008 Surfing during population expansions promotesgenetic revolutions and structuration Trends Ecol Evol23(7)347ndash351
Ferrer-Admetlla A Bosch E Sikora M Marques-Bonet T Ramırez-SorianoA Muntasell A Navarro A Lazarus R Calafell F Bertranpetit J Casals F2008 Balancing selection is the main force shaping the evolution ofinnate immunity genes J Immunol 181(2)1315ndash1322
Fu YX Li WH 1993 Statistical tests of neutrality of mutations Genetics133(3)693ndash709
The 1000 Genomes Project Consortium 2010 A map of humangenome variation from population-scale sequencing Nature467(7319)1061ndash1073
The 1000 Genomes Project Consortium Abecasis GR Auton A BrooksLD DePristo MA Durbin RM Handsaker RE Kang HM Marth GTMcVean GA 2012 An integrated map of genetic variation from1092 human genomes Nature 491(7422)56ndash65
Guo SW Thompson EA 1992 Performing the exact test of Hardy-Weinberg proportion for multiple alleles Biometrics 48(2)361ndash372
Heyworth PG Cross AR Curnutte JT 2003 Chronic granulomatousdisease Curr Opin Immunol 15(5)578ndash584
Hudson RR 2002 Generating samples under a Wright-Fisher neutralmodel of genetic variation Bioinformatics 18(2)337ndash338
Jacob CO Eisenstein M Dinauer MC et al (21 co-authors) 2012 Lupus-associated causal mutation in neutrophil cytosolic factor 2 (NCF2)brings unique insights to the structure and function of NADPHoxidase Proc Natl Acad Sci U S A 109(2)E59ndashE67
Kimura M 1974 The neutral theory of molecular evolution CambridgeCambridge University Press
Kosiol C Vinar T da Fonseca RR Hubisz MJ Bustamante CD Nielsen RSiepel A 2008 Patterns of positive selection in six mammalian ge-nomes PLoS Genet 4(8)e1000144
Lacy F Kailasam MT OrsquoConnor DT Schmid-Schonbein GW Parmer RJ2000 Plasma hydrogen peroxide production in human essentialhypertension role of heredity gender and ethnicity Hypertension36(5)878ndash884
Laval G Patin E Barreiro LB Quintana-Murci L 2010 Formulating a his-torical and demographic model of recent human evolution based onresequencing data from noncoding regions PLoS One 5(4)e10284
Lewontin R 1972 The apportionment of human diversity Evol Biol 6391ndash398
Lindblad-Toh K Garber M Zuk O et al (88 co-authors) 2011 A high-resolution map of human evolutionary constraint using 29 mam-mals Nature 478(7370)476ndash482
Lodge R Diallo TO Descoteaux A 2006 Leishmania donovani lipopho-sphoglycan blocks NADPH oxidase assembly at the phagosomemembrane Cell Microbiol 8(12)1922ndash1931
Magalhaes WC Rodrigues MR Silva D Soares-Souza G Iannini MLCerqueira GC Faria-Campos AC Tarazona-Santos E 2012DIVERGENOME a bioinformatics platform to assist population ge-netics and genetic epidemiology studies Genet Epidemiol36(4)360ndash367
Marth JD Grewal PK 2008 Mammalian glycosylation in immunity NatRev Immunol 8(11)874ndash887
McDonald JH 1998 Improved tests for heterogeneity across a region ofDNA sequence in the ratio of polymorphism to divergence Mol BiolEvol 15377ndash384
McDonald JH Kreitman M 1991 Adaptive protein evolution at the Adhlocus in Drosophila Nature 351(6328)652ndash654
Nielsen R Bustamante C Clark AG et al (12 co-authors) 2005 A scanfor positively selected genes in the genomes of humans and chim-panzees PLoS Biol 3(6)e170
Olsson LM Lindqvist AK Kallberg H Padyukov L Burkhardt HAlfredsson L Klareskog L Holmdahl R 2007 A case-control studyof rheumatoid arthritis identifies an associated single nucleotidepolymorphism in the NCF4 gene supporting a role for the
NADPH-oxidase complex in autoimmunity Arthritis Res Ther9(5)R98
Packer BR Yeager M Burdett L et al (13 co-authors) 2006SNP500Cancer a public resource for sequence validation assay de-velopment and frequency analysis for genetic variation in candidategenes Nucleic Acids Res 34(Database issue)D617ndashD621
Peter BM Huerta-Sanchez E Nielsen R 2012 Distinguishing betweenselective sweeps from standing variation and from a de novo mu-tation PLoS Genet 8(10)e1003011
Przeworski M Coop G Wall JD 2005 The signature of positive selectionon standing genetic variation Evolution 59(11)2312ndash2323
Ptak SE Przeworski M 2002 Evidence for population growth in humansis confounded by fine-scale population structure Trends Genet18(11)559ndash563
Ramensky V Bork P Sunyaev S 2002 Human non-synonymous SNPsserver and survey Nucleic Acids Res 30(17)3894ndash3900
Rioux JD Xavier RJ Taylor KD et al (24 co-authors) 2007 Genome-wideassociation study identifies new susceptibility loci for Crohn diseaseand implicates autophagy in disease pathogenesis Nat Genet39(5)596ndash604
Roberts RL Hollis-Moffatt JE Gearry RB Kennedy MA Barclay MLMerriman TR 2008 Confirmation of association of IRGM andNCF4 with ileal Crohnrsquos disease in a population-based cohortGenes Immun 9(6)561ndash565
Rozas J 2009 DNA sequence polymorphism analysis using DnaSPMethods Mol Biol 537337ndash350
San Jose G Fortuno A Beloqui O Dıez J Zalba G 2008 NADPH oxidaseCYBA polymorphisms oxidative stress and cardiovascular diseasesClin Sci (Lond) 114(3)173ndash182
Santiago HC Gonzalez Lombana CZ Macedo JP et al (12 co-authors)2012 NADPH phagocyte oxidase knockout mice controlTrypanosoma cruzi proliferation but develop circulatory collapseand succumb to infection PLoS Negl Trop Dis 6(2)e1492
Stephens M Scheet P 2005 Accounting for decay of linkage disequilib-rium in haplotype inference and missing-data imputation Am JHum Genet 76(3)449ndash462
Sumimoto H Miyano K Takeya R 2005 Molecular composition andregulation of the Nox family NAD(P)H oxidases Biochem BiophysRes Commun 338(1)677ndash686
Tajima F 1983 Evolutionary relationship of DNA sequences in finitepopulations Genetics 105(2)437ndash460
Tajima F 1989 Statistical method for testing the neutral mutationhypothesis by DNA polymorphism Genetics 123(3)585ndash595
Tarazona-Santos E Bernig T Burdett L Magalhaes WC Fabbri C Liao JRedondo RA Welch R Yeager M Chanock SJ 2008 CYBB anNADPH-oxidase gene restricted diversity in humans and evidencefor differential long-term purifying selection on transmembrane andcytosolic domains Hum Mutat 29(5)623ndash632
Taylor RM Burritt JB Baniulis D Foubert TR Lord CI Dinauer MCParkos CA Jesaitis AJ 2004 Site-specific inhibitors of NADPH oxi-dase activity and structural probes of flavocytochrome b character-ization of six monoclonal antibodies to the p22phox subunitJ Immunol 173(12)7349ndash7357
Voight BF Kudaravalli S Wen X Pritchard JK 2006 A map of recentpositive selection in the human genome PLoS Biol 4(3)e72
Watterson GA 1975 On the number of segregating sites in geneticalmodels without recombination Theor Popul Biol 7(2) 256ndash276
Williamson SH Hernandez R Fledel-Alon A Zhu L Nielsen RBustamante CD 2005 Simultaneous inference of selection and pop-ulation growth from patterns of variation in the human genomeProc Natl Acad Sci U S A 102(22)7882ndash7887
Yang Z 2007a Adaptive molecular evolution In Balding DJ Bishop MCannings C editors Handbook of statistical genetics Vol 1 3rd edSusex (United Kingdom) John Wiley amp Sons p 377ndash406
Yang Z 2007b PAML 4 phylogenetic analysis by maximum likelihoodMol Biol Evol 241586ndash1591
2167
Evolutionary Dynamics of Human NADPH Oxidase Genes doi101093molbevmst119 MBE by guest on A
ugust 13 2016httpm
beoxfordjournalsorgD
ownloaded from