Accepted Manuscript

Characterization of Genetic Loci That Affect Susceptibility to Inflammatory Bowel Diseases in African Americans

Chengrui Huang, Talin Haritunians, David T. Okou, David J. Cutler, Michael E. Zwick, Kent D. Taylor, Lisa W. Datta, Joseph C. Maranville, Zhenqiu Liu, Shannon Ellis, Pankaj Chopra, Jonathan S. Alexander, Robert N. Baldassano, Raymond K. Cross, Themistocles Dassopoulos, Tanvi A. Dhere, Richard H. Duerr, John S. Hanson, Jason K. Hou, Sunny Z. Hussain, Kim L. Isaacs, Kelly E. Kachelries, Howard Kader, Michael D. Kappelman, Jeffrey Katz, Richard Kellermayer, Barbara S. Kirschner, John F. Kuemmerle, Archana Kumar, John H. Kwon, Mark Lazarev, Peter Mannon, Dedrick E. Moulton, Bankole O. Osuntokun, Ashish Patel, John D. Rioux, Jerome I. Rotter, Shehzad Saeed, Ellen J. Scherl, Mark S. Silverberg, Ann Silverman, Stephan R. Targan, John Valentine, Ming-Hsi Wang, Claire L. Simpson, S. Louis Bridges, Robert P. Kimberly, Stephen S. Rich, Judy H. Cho, Anna Di Rienzo, Linda W.H. Kao, Dermot P.B. McGovern, Steven R. Brant, Subra Kugathasan

PII: S0016-5085(15)01103-8

DOI: 10.1053/j.gastro.2015.07.065

Reference: YGAST 59961

To appear in: Gastroenterology

Accepted Date: 28 July 2015

Please cite this article as: Huang C, Haritunians T, Okou DT, Cutler DJ, Zwick ME, Taylor KD, Datta LW, Maranville JC, Liu Z, Ellis S, Chopra P, Alexander JS, Baldassano RN, Cross RK, Dassopoulos T, Dhere TA, Duerr RH, Hanson JS, Hou JK, Hussain SZ, Isaacs KL, Kachelries KE, Kader H, Kappelman MD, Katz J, Kellermayer R, Kirschner BS, Kuemmerle JF, Kumar A, Kwon JH, Lazarev M, Mannon P, Moulton DE, Osuntokun BO, Patel A, Rioux JD, Rotter JI, Saeed S, Scherl EJ, Silverberg MS, Silverman A, Targan SR, Valentine J, Wang M-H, Simpson CL, Bridges SL, Kimberly RP, Rich SS, Cho JH, Di Rienzo A, Kao LWH, McGovern DPB, Brant SR, Kugathasan S, Characterization of Genetic Loci That Affect Susceptibility to Inflammatory Bowel Diseases in African Americans, Gastroenterology (2015), doi: 10.1053/j.gastro.2015.07.065.

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

All studies published in Gastroenterology are embargoed until 3PM ET of the day they are published as corrected proofs on-line. Studies cannot be publicized as accepted manuscripts or uncorrected proofs.

Characterization of Genetic Loci That Affect Susceptibility to Inflammatory Bowel Diseases in African Americans

Short Title: Major Genetic Loci for IBD in African Americans

Authors: Chengrui Huang1*, Talin Haritunians2*, David T. Okou3*, David J. Cutler5, Michael E. Zwick5, Kent D. Taylor4, Lisa W. Datta6, Joseph C. Maranville7, Zhenqiu Liu2, Shannon Ellis6, Pankaj Chopra5, Jonathan S. Alexander8, Robert N. Baldassano9, Raymond K. Cross10, Themistocles Dassopoulos11, Tanvi A. Dhere12, Richard H. Duerr13, John S. Hanson14, Jason K. Hou15, Sunny Z. Hussain16, Kim L. Isaacs17, Kelly E. Kachelries9, Howard Kader18, Michael D. Kappelman19, Jeffrey Katz20, Richard Kellermayer21, Barbara S. Kirschner22, John F. Kuemmerle23, Archana Kumar3, John H. Kwon24, Mark Lazarev6, Peter Mannon25, Dedrick E. Moulton26, Bankole O. Osuntokun27, Ashish Patel28, John D. Rioux29, Jerome I. Rotter4, Shehzad Saeed30, Ellen J. Scherl31, Mark S. Silverberg32, Ann Silverman33, Stephan R. Targan2, John Valentine34, Ming-Hsi Wang6, Claire L. Simpson35, S. Louis Bridges36, Robert P. Kimberly36, Stephen S. Rich37, Judy H. Cho38, Anna Di Rienzo7, Linda W.H. Kao1, Dermot P.B. McGovern2,*, Steven R. Brant1,6,*†, Subra Kugathasan3,*

1. Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD 21231, USA;

2. F. Widjaja Foundation Inflammatory Bowel and Immunobiology Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90049, USA;

3. Department of Pediatrics, Emory University School of Medicine, Atlanta, GA 30322, USA;

4. Institute for Translational Genomics and Population Sciences and Division of Genomic Outcomes, Departments of Pediatrics and Medicine, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, CA 90502, USA;

5. Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA;

6. Meyerhoff Inflammatory Bowel Disease Center, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21231, USA;

7. Committee on Clinical Pharmacology and Pharmacogenomics, and the Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA;

8. Department of Molecular and Cellular Physiology, Louisiana State University Health Sciences Center, Shreveport, LA 71130, USA;

9. Division of Gastroenterology and Nutrition, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA;

10. Division of Gastroenterology, University of Maryland, Baltimore, MD 21201, USA;

11. Department of Medicine, Washington University School of Medicine, St Louis, MO 63110, USA;

12. Department of Medicine, Emory University School of Medicine, Atlanta, GA 30322, USA;

13. Division of Gastroenterology, Hepatology and Nutrition, Department of Medicine, University of Pittsburgh School of Medicine, and Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261, USA;

14. Charlotte Gastroenterology and Hepatology, PLLC, Charlotte, NC 28207, USA;

15. Department of Medicine, Baylor College of Medicine; VA HSR&D Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey VA Medical Center, Houston, TX 77030, USA;

16. Department of Pediatrics, Willis-Knighton Physician Network, Shreveport, LA 71118, USA;

17. Division of Gastroenterology and Hepatology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA;

18. Department of Pediatrics, University of Maryland School of Medicine, Baltimore, MD 21201, USA;

19. Department of Pediatrics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA;

20. Division of Gastroenterology, Case Western Reserve University, Cleveland, OH 44106, USA;

21. Section of Pediatric Gastroenterology, Baylor College of Medicine, Houston, TX 77030, USA;

22. Department of Pediatrics, University of Chicago Comer Children's Hospital, Chicago, IL 60637, USA;

23. Departments of Medicine and Physiology and Biophysics, VCU Program in Enteric Neuromuscular Sciences, Medical College of Virginia Campus of Virginia Commonwealth University, Richmond, VA 23298, USA;

24. Section of Gastroenterology, Department of Medicine, University of Chicago, Chicago, IL 60637, USA;

25. Department of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA;

26. Division of Gastroenterology, Vanderbilt Children’s Hospital, Nashville, TN 37212, USA;

27. Department of Pediatrics, Cook Children’s Medical Center, Fort Worth, TX 76104, USA;

28. Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA;

29. Universite de Montreal and the Montreal Heart Institute, Research Center, Montreal, Quebec H1T 1C8, Canada;

30. Division of Gastroenterology, Hepatology and Nutrition, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA;

31. Department of Medicine, Weill Cornell Medical College, New York, NY 10065, USA;

32. Departments of Medicine, Surgery, Public Health Sciences, Immunology, and Molecular and Medical Genetics, University of Toronto, Samuel Lunenfeld Research Institute and Mount Sinai Hospital, Toronto General Hospital Research Institute, Toronto, Ontario M5S 2J7, Canada;

33. Department of Gastroenterology, Henry Ford Health System, Detroit, MI 48208, USA;

34. Departments and Physiology and Biophysics of Division of Gastroenterology, Hepatology & Nutrition, University of Florida, Gainesville, FL 32611, USA;

35. Statistical Genetics Section, Inherited Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Baltimore, MD 21224, USA;

36. Division of Clinical Immunology & Rheumatology, University of Alabama at Birmingham, Birmingham, AL 35294, USA;

37. Center for Public Health Genomics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA;

38. Department of Medicine and Genetics, Yale University, New Haven, CT 06520, USA.

*These authors contributed equally to this work.

† Corresponding author

Grant Support: NIH Grants DK062431 (S.R.B.), DK087694 (S.K.), DK062413 (D.P.B.M. and K.T.), DK046763-19, AI067068 and U54DE023789-01 (D.P.B.M.), DK062429 and DK062422 (J.H.C.), DK062420 (R.H.D.), DK062432 (J.D.R.), and DK062423 (M.S.S.). Additional support from Harvey M. and Lynn P. Meyerhoff Inflammatory Bowel Disease Center, the Morton Hyatt Family, the Buford and Linda Lewis family (S.R.B.); Endowed professorship from Marcus Foundation (S.K.); The Joshua L and Lisa Z Greer Endowed Chair, HS021747 from the Agency for Healthcare Research and Quality, grant 305479 from the European Union, and The Leona M. and Harry B. Helmsley Charitable Trust (D.P.B.M.); from the Veterans Administration HSR&D Center for Innovations in Quality, Effectiveness and Safety (#CIN 13-413) and the Michael E. DeBakey VA Medical Center (J.K.H.). PARC control samples supported by NIH/NHLBI grant HL06957 (Ronald M. Krauss, PI). RA control samples recruited and supported by the Consortium for the Longitudinal Evaluation of African-Americans with Early Rheumatoid Arthritis (CLEAR), NIH grants N01-AR-02247 and AR-6-2278 (S.L.B.) and the University of Alabama GCRC (M01-RR-00032); SLE control samples recruited and supported by the PROFILE Study group coordinated at the University of Alabama Birmingham and supported by NIH grants P01-AR49084 (R.P.K.) and M01-RR-00032; T1D control samples recruited and supported by the Type 1 Diabetes Genetics Consortium U01 DK062418 (S.S.R.), and sponsored by NIDDK, NIAID, NHGRI, NICHD and the Juvenile Diabetes Research Foundation (JDRF).

Abbreviations: admixture linkage disequilibrium (ALD); African Americans (AAs); ancestry informative markers (AIMs); Caucasian Americans (CAs); Caucasian Immunochip study (CIS); Crohn’s disease (CD); European ancestry (CEU); expression quantitative trait locus (eQTL); false discovery rate (FDR); genetics research center (GRC); genome wide association (GWA); immune-mediated diseases (IMD); inflammatory bowel disease (IBD); inflammatory bowel disease type undetermined (IBDU); odds ratio (OR); peripheral blood mononuclear cells (PBMCs); quality control (QC); rheumatoid arthritis (RA); the most statistically significant SNPs at each locus in the Caucasian Immunochip study and present on the Immunochip genotyping array (Maximal-CIS SNPs); systemic lupus erythematosus (SLE); type 1 diabetes (T1D); ulcerative colitis (UC); West African ancestry (YRI).

Correspondence: Steven R. Brant, M.D., Johns Hopkins University School of Medicine, Meyerhoff Inflammatory Bowel Disease Center, 1501 E. Jefferson St., B136, Baltimore, MD 21231. Email: [email protected]; Phone: 410-955-9679; Fax: 410-502-9913

Disclosures: No authors have conflicts of interest to declare.

Author Contributions: Study concept and design: SRB SK DPBM LWHK JHC CH DJC. Analysis and interpretation of data: CH TH DTO DJC MEZ KDT JCM ZL CLS LWHK DPBM SRB SK. Data acquisition and material support: LWD JSA RNB JHC RKC TD TAD RHD JSH JKH SZH KLI KEK HK MK MDK JK RK BSK JFK AK JHK ML PM DEM BOO AP JDR JIR SS EJS MSS AS SRT JV MHW SLB RPK SSR ADR DPBM SRB SK. Drafting manuscript: CH SRB. Critical revision of the manuscript for important intellectual content: CH TH DTO HK CLS MHW SSR ADR JHC LWHK DPBM SRB SK.

Abstract

Background & Aims: Inflammatory bowel disease (IBD) has familial aggregation in African Americans (AAs), but little is known about the molecular genetic susceptibility. Mapping studies using the Immunochip genotyping array have expanded the number of susceptibility loci for IBD in Caucasians to 163, but the contribution of the 163 loci and European admixture to IBD risk in AAs is unclear. We performed a genetic mapping study using the Immunochip to determine whether IBD susceptibility loci in Caucasians also affect risk in AAs and to identify new associated loci.

Methods: We recruited AAs with IBD and without IBD (controls) from 34 IBD centers in the US; additional controls were collected from 4 other Immunochip studies. Association and admixture loci were mapped for 1088 patients with Crohn’s disease (CD), 361 with ulcerative colitis (UC), 62 with IBD type undetermined (IBDU), and 1797 controls; 130,241 autosomal single-nucleotide polymorphisms (SNPs) were analyzed.

Results: The strongest associations were observed between UC and HLA rs9271366 (P=7.5e-6), CD and 5p13.1 rs254855 (P=3.0e-6), and IBD and KAT2A rs730086 (P=2.3e-6). Additional suggestive associations (P<4.2e-5) were observed between CD and IBD and African-specific SNPs in STAT5A and STAT3; between IBD and SNPs in IL23R, IL12B, and chromosome 2 open reading frame 43 (C2orf43); and between UC and SNPs near HDAC11 and near LINC00994. The latter 3 loci have not been previously associated with IBD, but require replication. Established Caucasian associations were replicated in AAs (P<3.1e-4) at NOD2, IL23R, 5p13.1, and IKZF3. Significant admixture (P<3.9e-4) was observed for 17q12–17q21.31 (IKZF3 through STAT3), 10q11.23–10q21.2, 15q22.2–15q23, and 16p12.2–16p12.1. Network analyses showed significant enrichment (false discovery rate <1e-5) in genes that encode members of the JAK-STAT, cytokine, and chemokine signaling pathways, as well as those involved in pathogenesis of measles.

Conclusions: In a genetic analysis of 1511 AAs with IBD, we found that many variants associated with IBD in Caucasians also showed association evidence with these diseases in AAs; we found evidence for loci and variants not previously associated with IBD. The complex genetic factors that determine risk for or protection from IBD in different populations require further study.

KEYWORDS: race, ethnicity, genetic variant, intestinal inflammation

Introduction

Inflammatory bowel disease (IBD) is a complex genetic disorder of immune dysregulation causing chronic idiopathic inflammation of the gastrointestinal tract, estimated to affect about 1.4 million Americans. It comprises two major, genetically related phenotypes: Crohn’s disease (CD) and ulcerative colitis (UC).

IBD shares many clinical and immunological characteristics with other complex genetic immune-mediated diseases (IMDs), especially with seronegative auto-inflammatory diseases1, such as ankylosing spondylitis, psoriasis and primary immune-deficiencies2, indicating overlapping etiological factors. To facilitate genetic studies on IMDs, a custom Illumina array of ~200,000 SNPs, the Immunochip, was designed based on genome wide association (GWA) analyses of 12 IMDs, including IBD, in Caucasian populations3. The main purposes were to fine map established associations and to replicate suggestive, but not yet proven, associations3. The Immunochip also contains ancestry informative markers (AIMs), allowing for genome-wide admixture estimates and adjustment for population stratification.

IBD GWA studies, including those performed using the Immunochip, have expanded the number of IBD susceptibility loci to 163 (including 30 CD- and 23 UC-specific loci)2, and have enhanced our understanding of IBD immunopathogenesis by identifying key cellular pathways, both known (such as barrier function, the role of T cell subsets and cytokine-cytokine receptor signaling) and unknown (such as autophagy, regulation of interleukin 23 (IL23) signaling, and host defense)4. However, compared to hundreds of IBD genetic studies in Caucasian populations, including massive GWA mega-analyses and replication studies like the Caucasian Immunochip Study (CIS)2, only a handful of IBD-associated gene variations have been evaluated in African Americans (AAs), in relatively small samples of a few hundred cases and controls, and only for CD, not UC5-7.

AAs are a recently admixed population derived from an average of approximately 80% West African and 20% European ancestries8. IBD prevalence is lower in AAs than in Caucasian Americans (CAs), possibly as a result of both genetic and environmental differences9-11. IBD sibling risk in AA IBD patients is relatively high (2.5%), albeit lower than that observed for Caucasians (4.6%)12, suggesting that underlying genetic risk factors are in part responsible for IBD in AAs.

An admixed population is one where two or more previously separated populations have interbred. Loci that have different allele frequencies in the founder populations become correlated because of gene flow, a phenomenon known as admixture linkage disequilibrium (ALD). AAs are a typical example of an admixed population, where there has been recent introduction of European genetic lineages into a West-African-derived population. A powerful technique called mapping by ALD (MALD) can be used on admixed populations such as AAs; the underlying assumption is that the difference in disease frequency between the founder populations is due in part to differences in allele frequencies of causal variants.
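
As a rough illustration of the reasoning behind MALD, the sketch below computes the expected allele frequency in an admixed population as an ancestry-weighted average of the founder frequencies. The function name and the example allele frequencies are hypothetical; only the approximately 80%/20% ancestry proportions come from the text.

```python
def admixed_allele_frequency(p_west_african, p_european, west_african_fraction):
    """Expected allele frequency in a two-way admixed population.

    The result is the ancestry-weighted average of the two founder
    allele frequencies (a simplifying assumption for illustration).
    """
    if not 0.0 <= west_african_fraction <= 1.0:
        raise ValueError("ancestry fraction must lie in [0, 1]")
    european_fraction = 1.0 - west_african_fraction
    return (west_african_fraction * p_west_african
            + european_fraction * p_european)


# Hypothetical risk allele: rare in West African founders (5%),
# common in European founders (30%), mixed at 80% / 20% ancestry.
expected = admixed_allele_frequency(0.05, 0.30, 0.80)
print(f"expected frequency in the admixed population: {expected:.3f}")
```

If such an allele raises disease risk, cases would be expected to carry more European ancestry than controls around that locus, which is the kind of signal admixture mapping looks for.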

We therefore undertook an evaluation of the Immunochip in ~4000 AA IBD cases and controls, primarily to determine the importance of the 163 established CIS IBD loci in the understudied AA population and to identify novel IBD loci, including loci identified by MALD.

Patients and Methods

Study Population and Phenotyping

The study population included unrelated self-identified non-Hispanic AA volunteers recruited from three coordinating centers: (1) Johns Hopkins: Multicenter African American IBD Study (MAAIS) coordinated by Johns Hopkins IBD Genetics Research Center (GRC) of the NIDDK IBD Genetics Consortium (IBDGC) with recruitment from 13 collaborating IBD centers and 4 other IBDGC GRCs5, and additional AA control samples drawn from the controls of rheumatoid arthritis (RA), systemic lupus erythematosus (SLE) and type 1 diabetes (T1D) Immunochip studies; (2) Emory: GENESIS AA cohort, an ancillary study of the NIDDK IBDGC, coordinated by Emory University with recruitment of IBD cases and matched controls from 12 of their collaborating IBD centers; and (3) Cedars: Cedars-Sinai Medical Center recruited IBD cases and controls with additional controls from the Pharmacogenetics and Risk of Cardiovascular disease Study Group (PARC) from San Francisco General Hospital and University of California, Los Angeles13.

All subjects gave informed consent to participate in genetics research studies in protocols approved by each site's institutional review board. Cases were confirmed as CD, UC, or IBD type undetermined (IBDU) in accordance with the NIDDK IBDGC phenotyping manual14. Details regarding controls are described in the Supplement. See Acknowledgements for a listing of all recruitment centers.

Genotyping and Quality Control (QC)

DNA samples were derived from whole blood. All DNA samples were genotyped using the Immunochip3 and genotype determinations (allele calls) were made using GenomeStudio version 2011.1 and Genotyping Module Version 1.9.4. All MAAIS samples and the RA, SLE, and T1D controls were genotyped at Feinstein Institute for Medical Research. All Emory, Cedars-Sinai and PARC samples were genotyped at Cedars-Sinai Medical Center Genetics Institute.

Several SNP-wise and sample-wise quality filters were applied (Supplementary Figure 1). Samples were excluded if they had <99% data completeness, differed by >3 standard deviations from the mean heterozygosity for the study, had discrepant gender, or were unexpectedly the first-degree relative of any other sample in the study. Cases and controls were matched by determining principal components (PCs) using the software EIGENSTRAT15, plotting PC1 against PC2, and then visually inspecting the plot and eliminating outlier samples. Three successive rounds were necessary to obtain a satisfactory matching of cases and controls (Supplementary Figure 2).
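
The sample-wise exclusion rules above can be sketched as a simple filter. This is a minimal illustration rather than the pipeline used in the study: the dictionary keys and the function name are hypothetical, and every sample flagged as a first-degree relative is dropped here, whereas a real pipeline would typically retain one member of each related pair.

```python
from statistics import mean, stdev


def filter_samples(samples, min_call_rate=0.99, max_het_sd=3.0):
    """Return the samples that pass the sample-wise QC rules.

    Each sample is a dict with the hypothetical keys 'id', 'call_rate',
    'heterozygosity', 'sex_concordant' and 'first_degree_relative'.
    """
    het_values = [s["heterozygosity"] for s in samples]
    het_mean = mean(het_values)
    het_sd = stdev(het_values)  # sample standard deviation across the study

    kept = []
    for s in samples:
        if s["call_rate"] < min_call_rate:
            continue  # less than 99% data completeness
        if abs(s["heterozygosity"] - het_mean) > max_het_sd * het_sd:
            continue  # more than 3 SD from the mean heterozygosity
        if not s["sex_concordant"]:
            continue  # recorded and genotype-inferred sex disagree
        if s["first_degree_relative"]:
            continue  # unexpected first-degree relative in the study
        kept.append(s)
    return kept
```

The heterozygosity mean and standard deviation are computed once over all samples before filtering, so the order in which the rules are applied does not change which samples are kept.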

Local and Global Ancestry Estimation

Because AAs are well-modeled as linear combinations of West African (YRI) and European (CEU) ancestries16, we estimated locus-specific local ancestry with the WINPOP model in the LAMP package17, chosen for its fast computation and low error rate18. Average West African (global YRI) ancestry was estimated using ADMIXTURE19.

Study-wide Association and Admixture, and Replication of 163 SNPs Established in Caucasian Immunochip Study (CIS)

For all analyses, we compared three phenotypes (all IBD, CD, and UC) against the same set of AA controls. We performed both association and admixture mapping under generalized linear models. Sex, recruitment coordinating center (Hopkins, Emory or Cedars), GRC (Feinstein or Cedars), global YRI ancestry, and the first 10 PCs were included in a multiple regression with IBD affection status, and those significantly associated with IBD were used as covariates in association and admixture mapping.

We estimated the testing burdens empirically by assessing the autocorrelation (of genotypes for association or local ancestry proportions for admixture) of all the SNPs on each autosomal chromosome for each individual, and then summing over the 22 chromosomes and averaging across individuals20. The testing burdens of independent SNPs or admixture regions, specific to our dataset, were 23,639.4 and 128.4 for association and admixture mapping, respectively. Association peaks with p<2.1e-6 (5% false positive rate corrected for the number of independent SNPs, i.e. 0.05/23,639.4) were marked as significant, while those with 2.1e-6≤p<4.2e-5 were marked as suggestive (i.e. one false positive per study-wide test burden, 1/23,639.4). Admixture peaks with p<3.9e-4 (0.05/128.4) were marked as significant.
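
The thresholds above follow directly from the two testing burdens. The short sketch below reproduces the arithmetic and labels a p-value accordingly; the constant and function names are illustrative and do not come from the study's code.

```python
ALPHA = 0.05
ASSOCIATION_BURDEN = 23639.4  # effective number of independent SNPs
ADMIXTURE_BURDEN = 128.4      # effective number of independent admixture regions

# 5% false positive rate corrected for the effective number of tests.
ASSOC_SIGNIFICANT = ALPHA / ASSOCIATION_BURDEN  # about 2.1e-6
# One expected false positive per study-wide test burden.
ASSOC_SUGGESTIVE = 1.0 / ASSOCIATION_BURDEN     # about 4.2e-5
ADMIX_SIGNIFICANT = ALPHA / ADMIXTURE_BURDEN    # about 3.9e-4


def classify_association(p_value):
    """Label an association p-value using the study-wide thresholds."""
    if p_value < ASSOC_SIGNIFICANT:
        return "significant"
    if p_value < ASSOC_SUGGESTIVE:
        return "suggestive"
    return "not significant"


print(f"{ASSOC_SIGNIFICANT:.1e} {ASSOC_SUGGESTIVE:.1e} {ADMIX_SIGNIFICANT:.1e}")
```

Under these cut-offs, a p-value of 2.3e-6 falls just short of study-wide significance and is labeled suggestive.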

For chromosomes with multiple significant/suggestive association SNPs, conditional regression was performed to determine the number of independent signals. Specifically, for m SNPs, all possible m*(m-1)/2 pairwise combinations were tested. For each pair, if the target SNP remained significant in the presence of the conditional SNP, the two SNPs were considered independent of each other; otherwise, they were considered to have arisen from the same signal.
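
The pairwise procedure can be sketched as follows. The conditional regression itself is replaced by a caller-supplied function, `are_independent` (a hypothetical stand-in), and grouping non-independent SNPs into connected components is an assumption made for this illustration; the text does not specify how pairs were merged into signals.

```python
from itertools import combinations


def count_independent_signals(snps, are_independent):
    """Test every unordered SNP pair and count the resulting signals.

    `are_independent(a, b)` stands in for the conditional regression:
    it should return True when the target SNP stays significant after
    conditioning on the other SNP. Returns (pairs tested, signals).
    """
    parent = {snp: snp for snp in snps}

    def find(snp):
        # Follow parent links to the representative of the SNP's group.
        while parent[snp] != snp:
            parent[snp] = parent[parent[snp]]
            snp = parent[snp]
        return snp

    n_pairs = 0
    for snp_a, snp_b in combinations(snps, 2):
        n_pairs += 1
        if not are_independent(snp_a, snp_b):
            # Same underlying signal: merge the two groups.
            parent[find(snp_a)] = find(snp_b)

    n_signals = len({find(snp) for snp in snps})
    return n_pairs, n_signals
```

For m SNPs the loop visits m*(m-1)/2 pairs, matching the count given in the text.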

To evaluate the 163 loci with reported genome-wide significance in the CIS, we tested the SNPs that were most statistically significant at each locus and present on the Immunochip (Maximal-CIS SNPs). The criterion for replication was p<3.1e-4 (0.05/163).

Gene Ontology was utilized to functionally annotate the genes within each significant admixture region, using the PANTHER Classification System (http://pantherdb.org). Genes with immunological or gastrointestinal-relevant functions were considered to be suggestive candidate genes within the region.

eQTL Evaluation

For significant/suggestive association SNPs we also performed local (cis-) and distant (trans-) expression quantitative trait locus (eQTL) mapping with the R package MatrixEQTL21 on peripheral blood mononuclear cells (PBMCs) from a separate set of 85 unrelated AA study subjects, profiled with Illumina HumanHT-12 v4 Expression BeadChips and genotyped on Illumina HumanOmni1 or HumanOmni2.5 BeadChips22. We defined gene-SNP pairs within 1 Mb of one another as local and all other pairs as distant; pairs with false discovery rate (FDR) <0.05 were considered significant.
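
The local/distant split and the FDR cut-off can be sketched in a few lines. This is a plain-Python illustration, not the MatrixEQTL interface: the function names and coordinates are hypothetical, "within 1 Mb" is read here as a distance of at most 1,000,000 bp on the same chromosome, and the Benjamini-Hochberg procedure is assumed as the FDR method because the text does not name one.

```python
def classify_pair(snp_chrom, snp_pos, gene_chrom, gene_pos, window=1_000_000):
    """Label a gene-SNP pair as 'local' or 'distant'."""
    if snp_chrom == gene_chrom and abs(snp_pos - gene_pos) <= window:
        return "local"
    return "distant"


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_pairs(p_values, fdr=0.05):
    """Indices of tests whose adjusted p-value is below the FDR cut-off."""
    return [i for i, q in enumerate(benjamini_hochberg(p_values)) if q < fdr]
```

In practice the local and distant tests are usually corrected as separate families, since far more distant pairs than local pairs are tested.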

Network Analysis

We first selected genes with association p<0.005, and then annotated the identified genes with multiple biological function databases, including Reactome23, the Human Protein Reference Database (www.hprd.org), and the NCI/Nature Pathway Interaction Database. The networks were then constructed from the known interactions in those databases. In addition, we identified the top KEGG pathways associated with each network using the enrichment analysis tool in STRING (http://string-db.org/). See Supplement for additional details.

Results

After QC measures were performed, 3,308 samples (1,511 IBD cases and 1,797 controls) and 130,241 unique autosomal SNPs remained for association and admixture analyses (Supplementary Figures 1 and 2). Average West African ancestry was higher in controls than cases (81.7±9.7% vs. 80.0±10.3%, p<1e-5), and also differed by coordinating centers (p<1e-10), with YRI proportion lowest for AAs from the American West Coast, similar to other genetic studies24 (Supplementary Table 1).

Five variables were significantly associated with IBD (p<0.05) and therefore were included as covariates for all subsequent analyses: sex (p=5e-7), recruitment coordinating center (p=2e-5), GRC (p=4e-4), PC2 (p=0.015), and global ancestry (p=0.03). Quantile-quantile plots for association mappings and genomic inflation factors are shown in Supplementary Figure 3.

Replication of Maximal-CIS SNPs

Of the 163 Maximal-CIS SNPs, 152 passed QC. The remaining 11 SNPs did not have any tagging SNPs (all r²<0.6) that could be used as alternative markers. Five SNPs met criteria for replication (p<3.1e-4): rs5743289, which tags the three common NOD2 mutations; rs11209026, which encodes the IL23R R381Q protective variant; rs1801274, which encodes the FCGR2A H167R risk variant; rs11742570 in the 5p13.1 gene desert near PTGER4; and rs12946510 on 17q21 just 3’ of IKZF3 (Table 1).

For most of the 152 SNPs, the AA odds ratios (ORs) were in the same direction as those from the CIS (correlation=0.78, Figure 1). For 81 SNPs, the AA ORs fell within the 95% confidence intervals (CI) reported in the CIS, including one replicated SNP (rs11742570). Nominal levels of significance (p<0.05 and same risk allele) were observed for 41 of the 152 (27%) variants (Supplementary Table 2).
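
The two criteria used in this comparison, directional consistency of the odds ratios and nominal replication, can be written out explicitly. The sketch below is illustrative only: the function names are hypothetical, and both ORs are assumed to be reported for the same allele.

```python
import math


def same_direction(or_aa, or_cis):
    """True when both odds ratios lie on the same side of 1."""
    return math.log(or_aa) * math.log(or_cis) > 0


def nominally_replicated(p_value, or_aa, or_cis, alpha=0.05):
    """Nominal replication: p below alpha and the same risk allele."""
    return p_value < alpha and same_direction(or_aa, or_cis)


def fraction_replicated(results, alpha=0.05):
    """Share of (p_value, or_aa, or_cis) tuples meeting the nominal criterion."""
    hits = sum(nominally_replicated(p, a, c, alpha) for p, a, c in results)
    return hits / len(results)
```

Working on the log scale treats risk (OR>1) and protective (OR<1) effects symmetrically, so an OR of exactly 1 in either study counts as neither.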

Study-wide Association

Association results for IBD, CD and UC are shown in Figure 2 (left panels). Suggestive association for IBD was observed at the following 7 loci for 15 SNPs (Table 2): 1p at IL23R; 2p near the p-telomere and 4 Mb upstream of an associated locus in the CIS near C2orf43; the 5p13.1 gene desert (5 SNPs in tight LD, pairwise r² ranging from 0.52 to 1.00); 5q, 70 kb telomeric of IL12B (2 SNPs in tight LD, r²=0.87); and chromosome 17 in the STAT5A/STAT3 region (6 SNPs in 3 LD blocks). Conditional regression revealed 3 independent signals from the latter region: three STAT5A SNPs (rs7220367, rs7217884 and rs13380828, all monomorphic in CEU, r² ranging from 0.46 to 0.99) in one block; two SNPs 200 kb apart (rs730086 in KAT2A and rs1053004 in STAT3, r²=0.44); and rs7224339. IBD, being the phenotype with the largest sample size, had the greatest power to detect association evidence. However, 72% of cases had CD, and not surprisingly SNPs at 5p13.1 (PTGER4), IL12B, KAT2A and STAT5A also showed suggestive association for CD, with CD consistently showing greater association evidence at 5p13.1 than all IBD (Table 2).

For UC, the major peak was located in the HLA region, with the greatest association for rs9271366 (p=7e-6). Conditional analysis revealed that all suggestive HLA SNPs could be accounted for by rs9271366 (Supplementary Figure 4). Two SNPs showed suggestive associations on chromosome 3, at 13 Mb and at 64 Mb (near HDAC11 and LINC00994), neither within 5 Mb of any established IBD loci.

Raw genotyping intensities were visually examined for all replicated/suggestive SNPs and only rs35990859 (at IL12B) had sub-optimal cluster separation (data not shown).

Admixture Association

We observed four areas of significant admixture association (Figure 2, right panels and Table 3), all with significant evidence for both IBD and CD except for 16p12 (IBD only). For all of these regions, CEU (European) ancestry increased risk and YRI (West African) ancestry was protective. Just below significance was an area with increased CD risk from CEU ancestry, maximal at rs2111112 (p=4.9e-4) and 3 Mb from NOD2 (with increased CEU ALD extending through NOD2). Conditioning on the NOD2 genotype at rs5743289 weakened this ALD evidence (p=0.017). A list of annotated genes with immune-related functions that map to each region is included in Supplementary Table 3.

eQTL Evaluation

We conducted PBMC eQTL mapping for 26 (5 significant replication and 21 suggestive study-wide association) SNPs with eQTL data. After controlling for multiple comparisons, 0 distant and 6 local gene-SNP pairs showed significant association (Supplementary Table 4): rs1053004 on chromosome 17 was associated with KRT19 (p=1.7e-4) and with TTC25 (p=8.0e-4) expression; 3 SNPs in tight LD (r²>0.99) at 5p13.1 (rs1876141 [p=5.5e-4], rs6866402 [p=7.5e-4], and rs1505994 [p=7.5e-4]) were associated with Complement Component 6 (C6) expression; and rs1801274 was associated with CD84 expression (p=1.2e-3). PTGER4 expression was evaluated but was not associated with any SNP evaluated (p>0.05).

Network Analysis

We found multiple networks demonstrating significant influence for IBD, CD and UC in AAs even after

FDR and Bonferroni corrections (Table 4). Cytokine-cytokine receptor interaction was the dominant


pathway for CD and IBD overall, but for UC was second in effect to Measles (which also played a

significant but less prominent role in IBD and CD). Jak-STAT and Chemokine signaling pathways were

among the top 3 networks for IBD and CD (and Jak-STAT alone for UC). In addition, the T cell receptor

signaling pathway was one of the top pathways in UC and IBD, but not CD.

Discussion

In this study of IBD genetics in the AA population, we assembled the largest set of cases and controls

(more than 4 times larger than any prior AA IBD study and the first genetic study of AA UC), evaluated

for replication the majority of established IBD associations and interrogated the majority of known IBD

loci for novel associations, and performed pathway and eQTL analyses to further inform about the nature

of AA IBD genetics. Although our genotyping platform (not being GWA) limited association mapping

primarily to replicating known associations and interrogating established, immunologically related loci,

we also used the more powerful method of MALD to identify novel loci throughout the autosomal

genome and to complement the association mapping.

We replicated, using stringent criteria (p<3.1e-4), five CIS loci. As our study has more CD than UC

cases, most of our replications were for CD. Not surprisingly we replicated loci with greater impact in

other populations. Three of the five replications, NOD2, IL23R and 5p13.1, account for the greatest

degree of CD variance estimated in CIS. The two other replicated loci also have relatively high OR for

Caucasian IBD: FCGR2A and IKZF3 (17th and 47th ranked ORs of the 163 CIS loci).
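The replication threshold quoted above (p<3.1e-4) is numerically consistent with a Bonferroni correction of a 0.05 family-wise error rate across the 163 CIS loci. The sketch below only checks that arithmetic; the 0.05 level and the use of 163 as the number of tests are assumptions inferred from the figures in the text, not a description of the authors' procedure.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold alpha / n_tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Assumed inputs (not stated as such in the text): a family-wise alpha of
# 0.05 and one test per CIS locus, i.e. 163 tests.
replication_threshold = bonferroni_threshold(0.05, 163)
print(f"replication threshold ~ {replication_threshold:.2e}")
```

Under these assumptions the threshold is about 3.07e-4, which rounds to the 3.1e-4 used in the text.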

Three replicated CIS SNPs are more frequent in, and hence IBD risk is more likely to arise from, CEU

than YRI genome: the 3 common NOD2 CD-associated mutations and rs5743289 that tags them, the

protective variant R381Q at rs11209026 that influences IL23R phosphorylation, and rs12946510 at IKZF3


(Table 1). In contrast, the two other replicated SNPs, rs11742570 at 5p13.1 and the rs1801274 at

FCGR2A with functional variant H167R, have similar presence in CEU and YRI genomes.
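The contrast between CEU-enriched and equally frequent alleles can be illustrated with a naive two-population mixture: if the frequency in AA controls is a weighted average of the YRI and CEU frequencies, the weight on CEU is the European fraction implied at that SNP. This is a rough sketch rather than the admixture method used in the study; it treats the HapMap panels as exact proxies for the ancestral populations and ignores sampling error. The function name is hypothetical, and the frequencies are taken from Table 1.

```python
def implied_european_fraction(freq_admixed: float, freq_yri: float, freq_ceu: float) -> float:
    """Solve freq_admixed = a * freq_ceu + (1 - a) * freq_yri for a."""
    if freq_ceu == freq_yri:
        raise ValueError("allele frequencies must differ between the two panels")
    return (freq_admixed - freq_yri) / (freq_ceu - freq_yri)


# rs12946510 (IKZF3), Table 1: YRI 0.119, CEU 0.500, AA controls 0.201.
a = implied_european_fraction(0.201, 0.119, 0.500)
print(f"implied European fraction at rs12946510: {a:.3f}")
```

The estimate is undefined for SNPs such as rs1801274, where YRI and CEU frequencies are nearly identical, which is why such SNPs carry little ancestry information.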

Given our sample size, we had 80% power to replicate Maximal-CIS SNPs with OR>1.3 and MAF>0.2.

Supplementary Figure 5 illustrates our power at different ORs/MAFs at three levels of significance.

Although our study was underpowered a priori for genome-wide association (and obtaining enough AA

individuals for genetic studies as statistically powerful as those in Caucasians may be impractical for years

to come), we have identified SNPs in the AA population at a level of association that warrants targeting for

independent replication.

Two replicated loci also had additional SNPs with suggestive association: 5p13.1 and IL23R. At 5p13.1

rs10043340 showed greatest association with CD (p=3.25e-6), only marginally below study-wide

significance (p<2.1e-6). The suggestive SNPs are in LD, equally frequent in YRI and CEU, and, in our

PBMC eQTL analysis, correlated with complement component 6 (C6) instead of PTGER4, the associated

gene in the original discovery in lymphoblastoid cell lines25. C6 is a reasonable candidate gene, given that

C2 and other complement-related genes are associated with numerous IMDs. For rs7515029 at IL23R the

protective allele is almost twice as frequent in YRI (9%) as in CEU (5%), indicating an African-derived

protective variant in addition to the well-established R381Q replication.

Our strongest UC signal was in the HLA region, similar to that observed in all examined populations. The

most significant SNP, rs9271366, known to tag the HLA class II allele DRB1*1502 in other populations,

is the same SNP found most statistically significant in both Japanese26 and Korean27 UC GWASs (p=1e-

70 and p=1e-18 respectively). Hence, our rs9271366 UC association (p=7.47e-6) is justified as replication

of an established UC SNP. It also accounted for nearby associations according to conditional analysis

(Supplementary Figure 4). Interestingly, it was found to be the most associated HLA SNP in AA SLE, a

disease 4-fold more frequent in AAs than CAs28.


The strongest IBD association was observed for rs730086, with OR=1.4 (p=2.26e-6, just below study-wide

significance), in an intron of KAT2A, a histone acetyltransferase gene recently linked to repression

of IFN-beta and innate antiviral immunity through inhibition of TBK129. In LD with rs730086 is

rs1053004 (r2=0.44), located in the 3’ UTR of STAT3, which is an eQTL for two genes (FDR=0.038): KRT19,

a gene that produces a major type 1 keratin expressed in ileal and colonic epithelium with a broad

distribution in simple and stratified epithelia; and TTC25, a gene important for ciliogenesis expressed in

bone marrow but not in intestine. The risk variants are much more frequent in CEU than YRI.

We observed two other independent signals for IBD and CD on chromosome 17 centered about

STAT5A/STAT3 adjacent genes. These two signals are protective and African-specific (monomorphic in

CEU). We found no association (p>0.1) but marginal admixture evidence (p=4.4e-4) for rs12942547, the

CIS SNP located in a STAT3 intron with nearly identical frequencies in CEU and YRI. Complementing

these findings is significant ALD that overlaps this chromosome 17 region (and extends through

IKZF3 and STAT3) with the CEU genome producing IBD risk and YRI being protective.

Conditioning on the IKZF3 replicated SNP (rs12946510) and the 3 independent STAT5A/STAT3

associations eliminated this ALD (p=0.8). An important implication of our association and

admixture results is that the common CEU STAT5A/STAT3 haplotype likely contributes to IBD

risk in Caucasians – a finding made possible by studying an admixed population. Sequencing

studies in AAs will be important to identify potential functional variants in LD with the YRI-

specific protective alleles.

We found three other regions with significant ALD. Although ALD regions should be unbiased to known

loci (as Immunochip contained AIMs throughout the genome) all four regions overlap CIS loci. The most

highly associated region is on chromosome 10, includes the IPMK/CISD1/UBE2D1/TFAM locus and

extends to within 800 kb of ZNF365 (using a genome-wide admixture p=0.10 cut-off). A region with


ALD associated with CD and IBD on chromosome 15 contains SMAD3, a gene associated with need for

multiple surgeries in CD30. A region on chromosome 16 contains PRKCB, IL4R and IL21R, and extends

centromeric just to SBK1. Each of the ALD regions contains additional potential candidate genes, as

noted in Supplementary Table 3.

The top three KEGG pathways in AA IBD (cytokine, Jak-STAT and chemokines) were among the top

four KEGG pathways found in CIS. However, not observed in AA IBD was the Leishmania infection

pathway – second most significant pathway in the CIS; whereas AA UC, IBD, and CD all showed strong

evidence for measles pathway and UC also showed significant evidence for African trypanosomiasis –

both pathways not observed in CIS.

The high frequency of genes associated with infectious disease supports the hypothesis that IBD genetic

susceptibility may be a by-product of evolutionary adaptation to human pathogens. Hence the differential

infectious disease pathways associated with AA and Caucasian IBD may be related to adaptation to

geographically distinct infectious diseases. In total, these findings suggest that, while major pathways are

similar across populations, some are distinct and may be targets for personalized IBD therapies.

This first, in-depth characterization of AA IBD genetics provides a basis to compare IBD genetics with

Caucasian2 and East Asian26,27,31-36 populations where multiple GWAS, replication and Immunochip

studies have been performed (Figure 3). Five of the AA loci with suggestive/replication evidence are

associated in all three populations: IL23R, FCGR2A, IL12B, HLA and STAT5A/3, with the functional

SNPs for IL23R and FCGR2A the same. The higher OR loci tend to be observed in more than one

population. The STAT5A/STAT3 locus stands out in AAs: the ORs for the three independent associations

are 0.76, 1.40 and 0.55, suggesting that this region likely has a stronger influence in AA than Caucasian

IBD (OR=1.1 for rs12942547 in CIS). A Japanese CD GWAS found relatively strong influence for

STAT333, maximal at rs9891119, but we found no evidence in AAs for this SNP equally frequent in YRI,

CEU and East Asians.


In summary, our study reveals that in AAs HLA SNPs demonstrate the dominant signal for UC;

STAT5A/STAT3 shows disproportionate association with IBD and CD including novel African protective

haplotypes and common European risk haplotypes that, together with an IKZF3 CD replication 3 Mb

centromeric, likely give rise to significant chromosome 17 regional ALD. We identified three other

chromosomal regions contributing significant IBD risk from European admixture. We replicated

Maximal-CIS SNPs for NOD2, IL23R, 5p13.1, and FCGR2A, and identified additional risk variants at 5p13.1

and IL23R. We observed a strong correlation in ORs for Maximal-CIS SNPs between our study and the

CIS suggesting that additional Caucasian established loci likely play a role in AAs. We also demonstrated

other known (IL12B) and novel (C2orf43, HDAC11 and LINC00994) suggestive areas of association

interest. The new suggestive associations will need to be validated by independent replication. Finally, a

network analysis showed that, as in other studies, cytokines, Jak-STAT signaling and chemokine

pathways play major roles, but also suggests that measles and African trypanosomiasis pathways may be

important for further investigation. This study has yielded vital information not only on the

etiopathogenesis of IBD specifically for AAs, but also about risk variants and ancestral chromosomal

regions that may also contribute to IBD pathogenesis in Caucasians, as we continue our exploration of

AA IBD genetics.


Figure Legends

Figure 1. Comparison of log2(Odds ratios) estimated in this study with those in Caucasian Immunochip

Study (CIS) for 110 IBD-general (correlation=0.78), 26 CD-specific (correlation=0.77) and 23 UC-

specific (correlation=0.61) Maximal-CIS SNPs.

Figure 2. Association mapping (left) and admixture (right) mapping p-values (in -log10) for IBD (top

panel), CD (middle panel) and UC (bottom panel).

Figure 3. Comparison of log10(Odds Ratios) at the risk alleles by chromosomal position (in Mb, Genome

Build 37) between Caucasian (black), East Asian (green) and African American (red) loci for IBD

(hatched), CD (solid) and UC (stippled). Bars with black outlines represent loci with reported genome-

wide significant association evidence in the CIS (black, limited to OR>1.1) or in East Asian studies

(green). Bars without outlines are present for AA or East Asian replications (at p<3.1e-4) of CIS-Maximal

SNPs, or for SNPs evaluated in these populations with suggestive evidence for association at established

CIS loci (p<4.2e-5) or borderline of genome-wide significance for novel loci (p<1e-6). Asterisks denote

the same associated SNPs in AA and additional populations. Red arrows denote overlaps between

significant admixture regions and association loci.
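The comparison described in the Figure 1 legend is a correlation of log2-transformed odds ratios between this study and the CIS. As a minimal sketch of that calculation, the example below uses only the five replicated SNPs from Table 1, so the result is not comparable with the correlations reported in the legend: these SNPs were selected for replication, and the AA and CIS columns do not always refer to the same phenotype. The helper name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Odds ratios from Table 1, in row order:
# rs11209026, rs1801274, rs11742570, rs5743289, rs12946510.
aa_or = [0.219, 1.211, 0.812, 1.763, 1.319]    # this study (OR column)
cis_or = [0.497, 1.192, 0.773, 1.557, 1.157]   # CIS (CIS OR column)

r = pearson([math.log2(v) for v in aa_or], [math.log2(v) for v in cis_or])
print(f"correlation of log2(OR), five replicated SNPs: {r:.2f}")
```

Taking logarithms first makes protective (OR<1) and risk (OR>1) effects symmetric about zero, so an OR of 0.5 and an OR of 2 contribute equally to the correlation.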


References

1. Parkes M, Cortes A, van Heel DA, Brown MA. Genetic insights into common pathways and complex relationships among immune-mediated diseases. Nat Rev Genet 2013;14:661-673.

2. Jostins L, Ripke S, Weersma RK, et al. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature 2012;491:119-124.

3. Cortes A, Brown MA. Promise and pitfalls of the Immunochip. Arthritis Res Ther 2011;13:101.

4. Knights D, Lassen KG, Xavier RJ. Advances in inflammatory bowel disease pathogenesis: linking host genetics and the microbiome. Gut 2013;62:1505-1510.

5. Wang MH, Okazaki T, Kugathasan S, et al. Contribution of higher risk genes and European admixture to Crohn's disease in African Americans. Inflamm Bowel Dis 2012;18:2277-2287.

6. Adeyanju O, Okou DT, Huang C, et al. Common NOD2 risk variants in African Americans with Crohn's disease are due exclusively to recent Caucasian admixture. Inflamm Bowel Dis 2012;18:2357-2359.

7. Kanaan Z, Ahmad S, Roberts H, et al. Crohn's disease in Caucasians and African Americans, as defined by clinical predictors and single nucleotide polymorphisms. J Natl Med Assoc 2012;104:420-427.

8. Patterson N, Hattangadi N, Lane B, et al. Methods for high-density admixture mapping of disease genes. Am J Hum Genet 2004;74:979-1000.

9. Wang YR, Loftus EV Jr, Cangemi JR, Picco MF. Racial/Ethnic and regional differences in the prevalence of inflammatory bowel disease in the United States. Digestion 2013;88:20-25.

10. Betteridge JD, Armbruster SP, Maydonovitch C, Veerappan GR. Inflammatory bowel disease prevalence by age, gender, race, and geographic location in the U.S. military health care population. Inflamm Bowel Dis 2013;19:1421-1427.

11. Malaty HM, Hou JK, Thirumurthi S. Epidemiology of inflammatory bowel disease among an indigent multi-ethnic population in the United States. Clin Exp Gastroenterol 2010;3:165-170.

12. Nguyen GC, Torres EA, Regueiro M, et al. Inflammatory bowel disease characteristics among African Americans, Hispanics, and non-Hispanic Whites: characterization of a large North American cohort. Am J Gastroenterol 2006;101:1012-1023.

13. Simon JA, Lin F, Hulley SB, et al. Phenotypic predictors of response to simvastatin therapy among African-Americans and Caucasians: the Cholesterol and Pharmacogenetics (CAP) Study. Am J Cardiol 2006;97:843-850.

14. Dassopoulos T, Nguyen GC, Bitton A, et al. Assessment of reliability and validity of IBD phenotyping within the National Institutes of Diabetes and Digestive and Kidney Diseases (NIDDK) IBD Genetics Consortium (IBDGC). Inflamm Bowel Dis 2007;13:975-983.


15. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 2006;38:904-909.

16. Price AL, Tandon A, Patterson N, et al. Sensitive detection of chromosomal segments of distinct ancestry in admixed populations. PLoS Genet 2009;5:e1000519.

17. Pasaniuc B, Sankararaman S, Kimmel G, Halperin E. Inference of locus-specific ancestry in closely related populations. Bioinformatics 2009;25:i213-21.

18. Seldin MF, Pasaniuc B, Price AL. New approaches to disease mapping in admixed populations. Nat Rev Genet 2011;12:523-528.

19. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res 2009;19:1655-1664.

20. Shriner D, Adeyemo A, Rotimi CN. Joint ancestry and association testing in admixed individuals. PLoS Comput Biol 2011;7:e1002325.

21. Shabalin AA. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 2012;28:1353-1358.

22. Maranville JC, Baxter SS, Witonsky DB, Chase MA, Di Rienzo A. Genetic mapping with multiple levels of phenotypic information reveals determinants of lymphocyte glucocorticoid sensitivity. Am J Hum Genet 2013;93:735-743.

23. Matthews L, Gopinath G, Gillespie M, et al. Reactome knowledgebase of human biological pathways and processes. Nucleic Acids Res 2009;37:D619-22.

24. Chen GK, Millikan RC, John EM, et al. The potential for enhancing the power of genetic association studies in African Americans through the reuse of existing genotype data. PLoS Genet 2010;6:e1001096.

25. Libioulle C, Louis E, Hansoul S, et al. Novel Crohn disease locus identified by genome-wide association maps to a gene desert on 5p13.1 and modulates expression of PTGER4. PLoS Genet 2007;3:e58.

26. Okada Y, Yamazaki K, Umeno J, et al. HLA-Cw*1202-B*5201-DRB1*1502 haplotype increases risk for ulcerative colitis but reduces risk for Crohn's disease. Gastroenterology 2011;141:864-871.e1-5.

27. Yang SK, Hong M, Zhao W, et al. Genome-wide association study of ulcerative colitis in Koreans suggests extensive overlapping of genetic susceptibility with Caucasians. Inflamm Bowel Dis 2013;19:954-966.

28. Ruiz-Narvaez EA, Fraser PA, Palmer JR, et al. MHC region and risk of systemic lupus erythematosus in African American women. Hum Genet 2011;130:807-815.

29. Jin Q, Zhuang L, Lai B, et al. Gcn5 and PCAF negatively regulate interferon-beta production through HAT-independent inhibition of TBK1. EMBO Rep 2014;15:1192-1201.


30. Fowler SA, Ananthakrishnan AN, Gardet A, et al. SMAD3 gene variant is a risk factor for recurrent surgery in patients with Crohn's disease. J Crohns Colitis 2014;8:845-851.

31. Asano K, Matsushita T, Umeno J, et al. A genome-wide association study identifies three new susceptibility loci for ulcerative colitis in the Japanese population. Nat Genet 2009;41:1325-1329.

32. Hirano A, Yamazaki K, Umeno J, et al. Association study of 71 European Crohn's disease susceptibility loci in a Japanese population. Inflamm Bowel Dis 2013;19:526-533.

33. Yamazaki K, Umeno J, Takahashi A, et al. A genome-wide association study identifies 2 susceptibility Loci for Crohn's disease in a Japanese population. Gastroenterology 2013;144:781-788.

34. Hong SN, Park C, Park SJ, et al. Deep resequencing of 131 Crohn's disease associated genes in pooled DNA confirmed three reported variants and identified eight novel variants. Gut 2015;0:1–9.

35. Yang SK, Hong M, Choi H, et al. Immunochip analysis identification of 6 additional susceptibility Loci for Crohn's disease in Koreans. Inflamm Bowel Dis 2015;21:1-7.

36. Liu JZ, van Sommeren S, Huang H, et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat Genet 2015.

Author names in bold designate shared co-first authors


Acknowledgements

We are grateful to all of the patients and the controls that volunteered to join this study. This study is

dedicated to Dr. Linda Kao who helped plan and direct the study and passed away just prior to the study’s

completion.

IBDGC sample recruitment was from the Meyerhoff Inflammatory Bowel Disease Center at The Johns

Hopkins Hospital and Johns Hopkins University School of Medicine, Departments of Medicine and

Surgery (Patricia Ushry, Sharon Dudley-Brown, Theodore M. Bayless, Christina Ha, Jonathon Efron;

Susan Gearhart, and Michael Marohn); and the Division of Pediatric Gastroenterology and Nutrition at

The Johns Hopkins Children’s Center (Maria Oliva-Hemker and Carmen Cuffari); and funded satellite

centers at Henry Ford Health System (Martin Zonca, Qiana Samuels and Aref Araya); University of

North Carolina Department of Medicine (Dolly Walkup); Baylor College of Medicine; Virginia

Commonwealth University (Kasiah Banks; Alisa Maibauer; Amy Newcombe); Washington Hospital

Center (Michael S. Gold and Averell Sherker); University of Florida; Howard University (Duane T.

Smoot); University of Alabama (Toni Seay; Tajuanna Lucious); Perelman School of Medicine at the

University of Pennsylvania (James D. Lewis); University of Maryland Department of Pediatrics;

University of Chicago (Lici Shen); Cornell University; and Columbia University (Arun Swaminath).

IBDGC Genetic Research Centers at Yale University, University of Pittsburgh, University of Toronto,

and University of Montreal (John D. Rioux holds a Canada Research Chair). The database was developed

with assistance from Phil Schumm (University of Chicago). Additional DNA samples on African

American IBD cases were provided by Material Transfer Agreements from Washington University

Inflammatory Bowel Disease Program with assistance from Rodney Newberry and Ellen Li.

Emory samples were recruited from: Emory University School of Medicine and funded satellite centers at

The Children's Hospital of Philadelphia, Cincinnati Children's Hospital Medical Center, Case Western

Reserve University, University of Maryland, Vanderbilt Children’s Hospital, University of Texas


Southwestern Medical Center, University of North Carolina Chapel Hill Department of Pediatrics,

University of Chicago Comer Children's Hospital (Thomas Mangatu; Kathleen Van’t Hof, The Barnett

and Alscher Families), Louisiana State University Health Sciences Center, Cook Childrens Medical

Center, and Willis-Knighton Physician Network.

Cedars samples were recruited from: Cedars-Sinai Medical Center F. Widjaja Foundation Inflammatory

Bowel and Immunobiology Research Institute (Stephan Targan, Andrew Ippoliti, Eric Vasiliauskas,

David Shih, Gil Melmed, Marla Dubinsky), Charlotte Gastroenterology and Hepatology PLLC; and the

Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center Institute for Translational

Genomics and Population Sciences.

We thank Suna Onengut-Gumuscu for assistance with extraction of control data from RA, SLE and T1D

Immunochip studies. We thank Feng Zhou for assistance with Figure 3 and compiling IBD association

information in East Asian populations.


Table 1 Significant association for replication (p<3.1e-4) for SNPs established in Caucasian Immunochip Study (CIS)

SNP Chr Mbc Nearest Gene A1d A2d YRIe CEUe Casee Ctrle CISe Sigf pf ORf L95CIf U95CIf CIS Sigg CIS ORg CIS L95CIg CIS U95CIg

rs11209026 a,h 1p31.3 67.71 IL23R A G 0.009 0.041 0.007 0.014 0.067 CD 7.53E-05 0.219 0.103 0.464 IBD 0.497 0.465 0.531

rs1801274 a 1q23 161.48 FCGR2A A G 0.496 0.491 0.506 0.453 0.525 IBD 2.60E-04 1.211 1.093 1.343 UC 1.192 1.15 1.234

rs11742570 a 5p13.1 40.41 PTGER4 A G 0.385 0.354 0.342 0.390 0.39 IBD 1.29E-04 0.812 0.729 0.903 CD 0.773 0.747 0.799

rs5743289 b,h 16q21 50.76 NOD2 A G 0 0.226 0.052 0.032 0.178 CD 6.12E-05 1.763 1.336 2.326 CD 1.557 1.497 1.618

rs12946510 a 17q21 37.91 IKZF3 A G 0.119 0.500 0.233 0.201 0.467 CD 9.31E-05 1.319 1.148 1.516 IBD 1.157 1.124 1.19

a. IBD-general SNP in CIS;

b. Crohn’s disease-specific SNP in CIS;

c. SNP position in mega base-pair (Mb, Genome Build 37);

d. Minor (A1) and major (A2) allele in current dataset respectively;

e. Frequency corresponding to minor allele (A1) for Hapmap African samples (YRI), Hapmap European samples (CEU), current African American IBD case

(Case) and control (Ctrl) samples, and Immunochip case/control samples used in CIS respectively;

f. Most significant phenotype (Sig), corresponding p-value (p), minor allele odds ratio (OR) with upper (U95CI) and lower (L95CI) 95% confidence interval.

g. The most significant phenotype (Sig) and the corresponding odds ratio (OR), lower (L95CI) and upper (U95CI) bounds for 95% confidence interval for odds

ratio in CIS;

h. SNP without expression data available in eQTL evaluation.
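As a reading aid for the frequency and odds-ratio columns of Table 1, a crude allelic odds ratio can be computed directly from the case and control minor allele frequencies. This is only a rough sketch: the reported ORs presumably come from a fitted association model for the most significant phenotype, whereas the Case column pools all IBD cases, so the crude value is expected to differ from the tabulated one. The function name is hypothetical.

```python
def crude_allelic_odds_ratio(freq_case: float, freq_control: float) -> float:
    """Unadjusted allelic OR: odds of the minor allele in cases over controls."""
    for f in (freq_case, freq_control):
        if not 0.0 < f < 1.0:
            raise ValueError("frequencies must lie strictly between 0 and 1")
    odds_case = freq_case / (1.0 - freq_case)
    odds_control = freq_control / (1.0 - freq_control)
    return odds_case / odds_control


# rs5743289 (NOD2), Table 1: Case 0.052, Ctrl 0.032; reported CD OR = 1.763.
crude = crude_allelic_odds_ratio(0.052, 0.032)
print(f"crude allelic OR for rs5743289: {crude:.2f}")
```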


Table 2 Suggestive Study-wide Association (p<4.2e-5)

SNP Chr Mba A1b A2b YRIb CEUb Caseb Ctrlb Nearest Gene Location Sigc pc ORc L95CIc U95CIc

rs7515029g 1 67.60 G A 0.093 0.049 0.075 0.101 IL23R intergenic IBD 3.17E-05 0.670 0.555 0.809

rs3072g 2 20.88 G A 0.027 0.354 0.100 0.130 C2orf43 intergenic IBD 2.15E-05 0.697 0.590 0.823

rs10043340g 5 40.49 C A 0.466 0.275 0.361 0.434 PTGER4 intergenic CDd 3.25E-06 0.754 0.669 0.849

rs4286721 5 40.50 G A 0.385 0.252 0.319 0.386 PTGER4 intergenic CDd 2.97E-06 0.748 0.662 0.845

rs6866402 5 40.52 A G 0.442 0.257 0.368 0.428 PTGER4 intergenic CDd 4.05E-05 0.782 0.695 0.879

rs1876141 5 40.53 G A 0.442 0.267 0.367 0.428 PTGER4 intergenic CDd 4.00E-05 0.782 0.695 0.879

rs1505994 5 40.53 A G 0.442 0.267 0.368 0.429 PTGER4 intergenic CDd 3.65E-05 0.781 0.694 0.878

rs35990859g 5 158.80 G A - 0.092e 0.014 0.027 IL12B intergenic IBDd 6.62E-06 0.393 0.261 0.590

rs36048684g 5 158.82 A T - 0.050e 0.013 0.027 IL12B intergenic IBDd 9.04E-06 0.398 0.265 0.598

rs730086 17 40.27 G A 0.097 0.664 0.238 0.177 KAT2A intron IBDd 2.26E-06 1.373 1.204 1.567

rs7220367 17 40.44 G C 0.392 0 0.285 0.349 STAT5A intergenic CD 2.23E-05 0.758 0.667 0.862

rs7217884g 17 40.45 G A 0.270 0 0.211 0.267 STAT5A intron IBDd 2.21E-05 0.765 0.675 0.866

rs13380828g 17 40.45 A T 0.297 0 0.210 0.266 STAT5A intron IBD 2.57E-05 0.766 0.677 0.867

rs1053004 17 40.47 A G 0.119f 0.611 0.169 0.118 STAT3 utr-3 IBD 1.08E-05 1.409 1.209 1.641

rs7224339g 17 40.49 A G 0 0 0.033 0.058 STAT3 intron IBD 8.76E-06 0.552 0.425 0.717

rs2655211 3 13.49 G A 0.179 0.431 0.190 0.265 HDAC11 intergenic UC 6.97E-06 0.619 0.503 0.763


rs254855 3 64.07 A G 0.261 0.301 0.287 0.214 LINC00994 coding UC 3.13E-05 1.491 1.235 1.799

rs2395178g 6 32.41 C G 0.533 0.602 0.415 0.500 HLA-DRA intergenic UC 1.49E-05 0.692 0.586 0.818

rs9270986g 6 32.57 A C 0.173 0.204 0.218 0.156 HLA-DRB1 intergenic UC 1.85E-05 1.559 1.272 1.91

rs9271366g 6 32.59 G A 0.168 0.192 0.202 0.141 HLA-DQA1 intergenic UC 7.47E-06 1.627 1.315 2.013

rs2097431g 6 32.59 A G 0.332 0.394 0.445 0.364 HLA-DQA1 intergenic UC 3.34E-05 1.436 1.210 1.703

a. SNP position in mega base-pair (Mb, Genome Build 37);

b. Frequency corresponding to minor allele (A1) for Hapmap African (YRI), Hapmap European (CEU), and current African American case (Case) and control

(Ctrl) samples respectively;

c. The most significant phenotype (Sig) and corresponding p-values (p), odds ratio (OR) with lower (L95CI) and upper (U95CI) bounds for 95% confidence

intervals;

d. SNP with suggestive association (p<4.2e-5) for both IBD and CD;

e. Minor allele frequency (MAF) for samples from 1000 Genomes pilot 1 CEU low coverage panel;

f. MAF for samples from the human variation panel of African Americans;

g. SNP without expression data available in eQTL evaluation.
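The p-value, odds ratio and confidence-interval columns of Table 2 can be cross-checked against one another if the intervals are assumed to be standard Wald intervals, symmetric on the log-odds scale. The sketch below recovers the standard error from the interval width and then the two-sided p-value; the Wald assumption and the function name are not taken from the source.

```python
import math


def wald_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover the Wald z statistic and two-sided p-value from an OR and its 95% CI.

    Assumes the interval was built as log(OR) +/- z_crit * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # erfc(|z| / sqrt(2)) equals the two-sided normal tail probability.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# rs9271366 (UC), Table 2: OR 1.627, 95% CI 1.315 to 2.013, reported p = 7.47E-06.
z, p = wald_from_ci(1.627, 1.315, 2.013)
print(f"z = {z:.2f}, p = {p:.2e}")
```

For this SNP the recovered p-value lands close to the reported 7.47E-06, which suggests the tabulated interval and p-value are mutually consistent under the Wald assumption.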


Table 3 Significant (p<3.9e-4) admixture association regions

Maximal SNP Chromosomala Positiona Phenotypeb pb ORb L95CIb U95CIb Candidate Genesd

rs1684909 10q11.23-10q21.2 49.94-63.25 IBDc 1.47E-05 0.542 0.472 0.715 ZNF365, REEP3, MBL2

rs603439 15q22.2-15q23 62.76-70.75 CDc 1.80E-04 0.549 0.484 0.751 TRIP4, SMAD6, SMAD3, MAP2K1, IGDCC4, IGDCC3

rs1423086 16p12.2-16p12.1 21.74-28.39 IBD 2.67E-04 0.586 0.465 0.781 IL4R, IL21R, SLC5A11, PRKCB, ITFG1

rs2227322 17q12-17q21.31 37.39-40.93 IBDc 1.10E-04 0.580 0.519 0.765 ORMDL3, CSF3, CCR7, Keratin cluster

a. Window containing SNPs with suggestive admixture (p<7.8e-4) by chromosomal banding and physical positions in Mb (Genome Build 37).

b. The most significant phenotype (Phenotype) and the corresponding: p-value (p); odds ratio with each additional African ancestry (YRI) allele (OR); lower

(L95CI) and upper (U95CI) bounds for 95% confidence interval for YRI odds ratio.

c. SNP with significant admixture (p<3.9e-4) for both IBD and CD.

d. Candidate genes highlighted based on literature review of genes located in physical window.


Table 4 Significant KEGG pathways for IBD, CD and UC

Top KEGG Pathways Phenotypea Genesb p-valuec FDRc Bonferronic

Cytokine-cytokine receptor interaction IBD 12 1.55E-12 3.66E-10 3.66E-10

Jak-STAT signaling pathway IBD 8 4.07E-09 4.82E-07 9.64E-07

Chemokine signaling pathway IBD 8 1.59E-08 1.26E-06 3.77E-06

Measles IBD 7 3.32E-08 1.97E-06 7.88E-06

Pathways in cancer IBD 9 8.65E-08 4.10E-06 2.05E-05

T cell receptor signaling pathway IBD 5 6.33E-06 2.50E-04 1.50E-03

Focal adhesion IBD 6 9.10E-06 3.08E-04 2.16E-03

Gap junction IBD 4 5.97E-05 1.77E-03 1.42E-02

Fc gamma R-mediated phagocytosis IBD 4 7.47E-05 1.97E-03 1.77E-02

Vascular smooth muscle contraction IBD 4 1.74E-04 4.13E-03 4.13E-02

Cytokine-cytokine receptor interaction CD 18 3.11E-21 7.37E-19 7.37E-19

Chemokine signaling pathway CD 10 4.13E-11 4.89E-09 9.79E-09

Jak-STAT signaling pathway CD 9 2.29E-10 1.81E-08 5.42E-08

Measles CD 6 1.15E-06 6.84E-05 2.74E-04

Pathways in cancer CD 8 1.76E-06 8.32E-05 4.16E-04


Measles UC 7 1.97E-09 4.67E-07 4.67E-07

Cytokine-cytokine receptor interaction UC 8 1.03E-08 1.22E-06 2.43E-06

Jak-STAT signaling pathway UC 6 2.05E-07 1.62E-05 4.87E-05

T cell receptor signaling pathway UC 5 8.88E-07 5.26E-05 2.11E-04

African trypanosomiasis UC 3 2.33E-05 1.11E-03 5.53E-03

NOD-like receptor signaling pathway UC 3 1.15E-04 4.56E-03 2.74E-02

Adipocytokine signaling pathway UC 3 1.88E-04 6.38E-03 4.47E-02

a. Associated phenotype;

b. Number of genes in the pathway;

c. Nominal p-value, false discovery rate (FDR), and Bonferroni-corrected p-value respectively.
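The FDR and Bonferroni columns of Table 4 are related to the nominal p-values by the total number of pathways tested, m. The sketch below implements the Benjamini-Hochberg step-up adjustment and the Bonferroni adjustment in plain Python. The value m = 237 is an assumption inferred from the ratio of the reported Bonferroni values to the nominal p-values; it is not stated in the source, and the source does not say which FDR procedure was used.

```python
def bonferroni_adjust(pvals, m=None):
    """Bonferroni-adjusted p-values, capped at 1; m defaults to len(pvals)."""
    m = len(pvals) if m is None else m
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals, m=None):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    If m exceeds len(pvals), the supplied values are assumed to be the
    smallest p-values among m tests (the rest are unknown and ignored).
    """
    n = len(pvals)
    m = n if m is None else m
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest known p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Top three IBD pathways from Table 4 (nominal p-values), with assumed m = 237.
ibd_p = [1.55e-12, 4.07e-09, 1.59e-08]
print([f"{q:.2e}" for q in benjamini_hochberg(ibd_p, m=237)])
print([f"{q:.2e}" for q in bonferroni_adjust(ibd_p, m=237)])
```

With m = 237 the adjusted values fall close to the tabulated FDR and Bonferroni entries for these rows; small differences are expected because the nominal p-values in the table are rounded.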


[Figure 3. Y-axis: log10(Odds Ratio), 0 to 0.7. Labeled loci: IL23R, FCGR2A, PTGER4, IL12B, HLA, NOD2, IKZF3, STAT5A/3. Key: Caucasian IBD, CD and UC genome-wide significant; Asian IBD and CD genome-wide significant; Asian CD and UC replication or borderline significant; AA IBD, CD and UC replication or borderline significant; overlapping admixture locus.]


IC_SNP CHR POS GWA_SNP R2_GWA_IC Type A1 A2 Frq_CEU Frq_YRI MAF Frq_Aff Frq_Unaff IBD_pval IBD_OR IBD_L95 IBD_U95 CD_pval CD_OR CD_L95 CD_U95 UC_pval UC_OR UC_L95 UC_U95
rs12103 1 12,47,494 rs12103 1 IBD-all G A NA NA 0.198 0.210 0.189 4.05E-01 1.059 0.926 1.210 3.15E-01 1.079 0.930 1.252 9.13E-01 0.988 0.796 1.227
rs6667605 1 25,02,780 rs10797432 0.971 UC-only G A 0.531 0.354 0.428 0.419 0.432 3.66E-01 0.953 0.860 1.057 2.69E-01 0.937 0.835 1.052 7.96E-01 1.022 0.868 1.202
rs3766606 1 80,22,197 rs35675666 1 IBD-all A C 0.168 0.354 0.343 0.342 0.345 6.25E-01 1.028 0.921 1.146 9.38E-01 0.995 0.880 1.125 2.15E-01 1.115 0.939 1.323
rs6426833 1 2,01,71,860 rs6426833 1 UC-only A G 0.509 0.434 0.461 0.465 0.456 2.32E-01 1.065 0.960 1.181 1.33E-01 1.093 0.973 1.227 8.44E-01 0.984 0.834 1.160
rs12568930 1 2,27,02,231 rs12568930 1 IBD-all G A 0.186 0.416 0.309 0.299 0.315 2.21E-01 0.932 0.832 1.043 9.12E-01 0.993 0.875 1.126 6.42E-03 0.773 0.643 0.930
rs11209026 1 6,77,05,958 rs11209026 1 IBD-all A G 0.041 0.009 0.011 0.007 0.014 4.08E-04 0.376 0.219 0.647 7.53E-05 0.219 0.103 0.464 7.72E-01 0.907 0.467 1.760
rs2651244 1 7,09,95,562 rs2651244 1 IBD-all A G 0.367 0.084 0.134 0.134 0.134 5.60E-01 0.955 0.818 1.115 6.45E-01 0.960 0.807 1.142 9.27E-01 0.989 0.780 1.254
rs17391694 1 7,86,23,626 rs17391694 1 CD-only A G 0.119 0 0.023 0.022 0.024 6.79E-02 0.722 0.508 1.024 7.51E-02 0.695 0.466 1.037 5.15E-01 0.838 0.493 1.425
rs6679677 1 ########## rs6679677 1 CD-only A C 0.115 0 0.015 0.016 0.015 8.79E-01 0.968 0.633 1.480 9.94E-01 0.998 0.618 1.613 6.22E-01 1.170 0.627 2.185
rs2641348 1 ########## rs3897478 0.97 CD-only G A 0.097 0.385 0.317 0.309 0.324 4.84E-01 0.961 0.860 1.074 2.58E-01 0.931 0.822 1.054 3.58E-01 1.085 0.912 1.290
rs4845604 1 ########## rs4845604 1 IBD-all A G 0.133 0.274 0.259 0.243 0.272 7.16E-03 0.849 0.754 0.957 2.71E-02 0.861 0.754 0.983 8.37E-02 0.845 0.699 1.023
rs670523 1 ########## rs670523 1 IBD-all G A 0.637 0.004 0.145 0.157 0.134 5.10E-01 1.054 0.902 1.231 8.14E-01 1.021 0.858 1.215 3.45E-01 1.122 0.883 1.425
rs4656958 1 ########## rs4656958 1 IBD-all A G 0.354 0.403 0.430 0.417 0.440 1.39E-01 0.924 0.831 1.026 5.39E-01 0.964 0.857 1.084 1.45E-01 0.882 0.746 1.044
rs1801274 1 ########## rs1801274 1 IBD-all A G 0.491 0.496 0.477 0.506 0.453 2.60E-04 1.211 1.093 1.343 7.61E-03 1.169 1.042 1.312 3.16E-03 1.281 1.087 1.510
rs7517810 1 ########## rs9286879 1 CD-only A G 0.204 0.226 0.281 0.295 0.270 2.49E-02 1.142 1.017 1.282 2.35E-02 1.161 1.020 1.320 1.56E-01 1.143 0.950 1.374
rs2488389 1 ########## rs2488389 1 IBD-all A G 0.207 0.261 0.265 0.280 0.252 2.92E-02 1.137 1.013 1.277 6.11E-02 1.132 0.994 1.289 1.87E-01 1.130 0.943 1.354
rs2816958 1 ########## rs2816958 1 UC-only A G 0.086 0.375 0.268 0.256 0.277 2.18E-01 0.929 0.826 1.045 3.36E-01 0.938 0.822 1.069 1.40E-01 0.867 0.717 1.048
rs7554511 1 ########## rs7554511 1 IBD-all A C 0.327 0.183 0.162 0.149 0.172 8.62E-04 0.784 0.680 0.905 5.01E-03 0.795 0.678 0.933 2.64E-02 0.768 0.608 0.970
rs3024505 1 ########## rs3024505 1 IBD-all A G 0.181 0.018 0.051 0.057 0.046 3.53E-02 1.289 1.018 1.633 6.11E-02 1.287 0.988 1.677 2.07E-01 1.265 0.878 1.821

rs13407913 2 2,50,97,644 rs6545800 0.949 IBD-all A G 0.584 0.102 0.179 0.169 0.187 3.10E-02 0.858 0.747 0.986 1.79E-02 0.828 0.708 0.968 6.93E-01 0.957 0.770 1.189
rs1260326 2 2,77,30,940 rs1728918 0.498 CD-only A G 0.42 0.103 0.149 0.151 0.148 2.80E-01 0.922 0.795 1.069 2.24E-01 0.903 0.766 1.064 6.35E-01 0.944 0.746 1.196
rs925255 2 2,86,14,794 rs925255 1 IBD-all A G 0.473 0.257 0.271 0.265 0.276 2.42E-01 0.932 0.829 1.048 5.13E-01 0.957 0.839 1.091 1.52E-01 0.870 0.720 1.052
rs10495903 2 4,38,06,918 rs10495903 1 IBD-all A G 0.146 0.181 0.154 0.159 0.151 4.83E-01 1.053 0.911 1.218 6.44E-01 1.039 0.883 1.222 8.06E-01 1.029 0.817 1.296
rs7608910 2 6,12,04,856 rs7608910 1 IBD-all G A 0.379 0.415 0.401 0.401 0.401 9.11E-01 0.994 0.894 1.105 6.91E-01 0.976 0.867 1.099 5.49E-01 1.053 0.889 1.248
rs10865331 2 6,25,51,472 rs10865331 1 CD-only G A 0.712 0.442 0.493 0.478 0.505 1.44E-02 0.878 0.792 0.975 3.37E-02 0.882 0.785 0.990 1.62E-01 0.890 0.756 1.048
rs6740462 2 6,56,67,272 rs6740462 1 IBD-all C A 0.265 0.062 0.113 0.113 0.114 3.77E-01 0.927 0.783 1.097 5.58E-01 0.945 0.784 1.141 1.26E-01 0.805 0.609 1.063
rs6708413 2 ########## rs917997 1 IBD-all G A 0.204 0.044 0.115 0.122 0.109 4.11E-01 1.071 0.910 1.261 3.90E-02 1.207 1.010 1.443 1.91E-02 0.710 0.533 0.946
rs2111485 2 ########## rs2111485 1 IBD-all G A 0.615 0.111 0.240 0.243 0.240 7.00E-01 0.976 0.864 1.103 3.98E-01 0.943 0.821 1.081 8.33E-01 1.021 0.843 1.236
rs1517352 2 ########## rs1517352 1 IBD-all C A 0.624 0.027 0.137 0.136 0.137 4.26E-01 0.939 0.803 1.097 6.92E-01 0.966 0.812 1.149 3.92E-01 0.896 0.698 1.151
rs1440088 2 ########## rs1016883 0.895 UC-only C A 0.164 0.23 0.213 0.205 0.220 3.10E-01 0.936 0.823 1.064 6.33E-01 0.966 0.837 1.114 3.35E-01 0.904 0.737 1.110
rs17229285 2 ########## rs17229285 1 UC-only A G 0.562 0.044 0.151 0.158 0.146 4.98E-01 1.053 0.907 1.221 5.57E-01 1.051 0.890 1.241 4.78E-01 1.088 0.862 1.372
rs2382817 2 ########## rs2382817 1 IBD-all A C 0.345 0.301 0.369 0.377 0.363 3.30E-01 1.056 0.946 1.178 5.45E-01 1.038 0.919 1.173 3.39E-01 1.088 0.916 1.292
rs6716753 2 ########## rs6716753 1 CD-only G A 0.212 0.159 0.206 0.214 0.201 3.33E-01 1.066 0.937 1.212 5.60E-02 1.148 0.996 1.323 8.12E-01 0.975 0.791 1.202
rs12994997 2 ########## rs12994997 1 CD-only A G 0.575 0.279 0.330 0.344 0.318 2.78E-01 1.063 0.952 1.186 3.34E-01 1.063 0.940 1.202 4.15E-01 1.074 0.905 1.274
rs4256159 3 1,87,67,404 rs4256159 1 IBD-all A G 0.164 0.058 0.063 0.069 0.058 4.50E-02 1.237 1.005 1.523 4.72E-02 1.265 1.003 1.596 2.65E-01 1.197 0.872 1.643
rs3197999 3 4,97,21,532 rs3197999 1 IBD-all A G 0.261 0.204 0.250 0.249 0.250 2.83E-01 0.937 0.831 1.056 3.81E-01 0.942 0.823 1.077 3.87E-01 0.919 0.759 1.112
rs9847710 3 5,30,62,661 rs9847710 1 UC-only A G 0.518 0.451 0.489 0.487 0.493 7.78E-01 0.985 0.888 1.093 5.91E-01 1.032 0.919 1.159 3.89E-01 0.930 0.789 1.097
rs2457996 4 7,48,56,535 rs2472649 0.547 IBD-all G A 0.142 0.478 0.375 0.373 0.375 5.00E-01 1.037 0.933 1.154 4.99E-01 1.042 0.925 1.174 6.74E-01 0.964 0.815 1.142

rs13126505 4 ########## rs13126505 1 CD-only A G 0.076 0 0.011 0.014 0.009 2.28E-01 1.348 0.830 2.189 2.11E-01 1.393 0.829 2.342 8.18E-01 0.903 0.377 2.162
rs3774937 4 ########## rs3774959 0.905 UC-only G A 0.341 0.004 0.082 0.084 0.081 5.05E-01 0.937 0.773 1.135 4.59E-01 0.922 0.744 1.143 6.08E-01 0.924 0.682 1.252
rs11739663 5 5,94,083 rs11739663 1 UC-only G A 0.248 0.504 0.466 0.461 0.472 8.66E-01 0.991 0.892 1.100 4.00E-01 0.951 0.846 1.069 7.40E-01 1.029 0.871 1.215
rs2930047 5 1,06,95,526 rs2930047 1 IBD-all A G 0.588 0.23 0.294 0.293 0.297 1.83E-01 0.926 0.827 1.037 1.75E-01 0.916 0.808 1.040 5.47E-01 0.947 0.793 1.131
rs11742570 5 4,04,10,584 rs11742570 1 IBD-all A G 0.354 0.385 0.369 0.342 0.390 1.29E-04 0.812 0.729 0.903 3.14E-04 0.802 0.711 0.904 4.95E-02 0.845 0.714 1.000
rs10065637 5 5,54,38,851 rs10065637 1 CD-only A G 0.186 0.009 0.049 0.048 0.050 1.35E-01 0.830 0.650 1.060 2.14E-01 0.841 0.640 1.105 6.39E-01 0.913 0.625 1.335
rs10061469 5 7,25,18,148 rs7702331 0.821 CD-only G A 0.35 0.571 0.461 0.443 0.477 4.44E-02 0.898 0.809 0.997 7.61E-02 0.900 0.801 1.011 3.82E-02 0.839 0.710 0.990
rs1363907 5 9,62,52,803 rs1363907 1 IBD-all A G 0.442 0.341 0.370 0.368 0.371 4.76E-01 0.962 0.864 1.071 6.13E-01 0.969 0.860 1.093 8.63E-01 0.985 0.831 1.168
rs10051722 5 ########## rs4836519 0.631 IBD-all C A 0.339 0.362 0.370 0.380 0.363 3.42E-01 1.053 0.946 1.172 4.58E-01 1.047 0.928 1.180 1.37E-01 1.135 0.961 1.342
rs2188962 5 ########## rs2188962 1 IBD-all A G 0.402 0.009 0.104 0.113 0.096 4.12E-01 1.074 0.905 1.276 2.15E-01 1.127 0.933 1.361 6.01E-01 0.928 0.701 1.228
rs254560 5 ########## rs254560 1 UC-only A G 0.363 0.044 0.142 0.147 0.138 9.90E-01 0.999 0.859 1.162 9.90E-01 0.999 0.844 1.182 9.52E-01 1.007 0.793 1.280
rs6863411 5 ########## rs6863411 1 IBD-all A T NA NA 0.236 0.241 0.232 6.65E-01 1.027 0.910 1.159 7.33E-01 1.024 0.895 1.172 6.24E-01 1.049 0.867 1.268
rs11741861 5 ########## rs11741861 1 IBD-all G A 0.044 0.084 0.072 0.071 0.072 4.33E-01 0.922 0.751 1.130 9.54E-01 0.993 0.794 1.243 2.85E-01 0.832 0.594 1.165
rs6871626 5 ########## rs6871626 1 IBD-all A C 0.367 0.177 0.207 0.223 0.194 7.63E-02 1.125 0.988 1.282 5.95E-02 1.149 0.994 1.328 3.73E-01 1.097 0.894 1.346
rs4976646 5 ########## rs12654812 0.947 IBD-all A G 0.637 0.389 0.447 0.445 0.447 4.17E-01 0.957 0.861 1.064 5.34E-01 0.963 0.856 1.084 6.04E-01 0.957 0.810 1.131
rs17119 6 1,47,19,496 rs17119 1 IBD-all G A 0.23 0.46 0.412 0.402 0.421 1.68E-01 0.929 0.837 1.031 4.03E-02 0.885 0.787 0.995 5.60E-01 1.050 0.892 1.235


rs9358372 6 2,08,12,588 rs9358372 1 IBD-all G A 0.35 0.323 0.342 0.348 0.337 6.04E-01 1.030 0.922 1.150 5.20E-01 1.041 0.920 1.179 5.43E-01 0.948 0.797 1.127
rs12663356 6 2,14,30,728 rs12663356 1 CD-only A G 0.403 0.46 0.464 0.462 0.464 7.58E-01 1.017 0.916 1.129 9.31E-01 1.005 0.895 1.129 7.12E-01 0.969 0.820 1.145
rs9264942 6 3,12,74,380 rs9264942 1 CD-only G A 0.341 0.226 0.308 0.319 0.300 7.09E-02 1.110 0.991 1.242 4.99E-02 1.134 1.000 1.286 2.13E-01 1.119 0.937 1.337
rs477515 6 3,25,69,691 rs6927022 0.523 UC-only A G 0.425 0.212 0.211 0.209 0.213 5.12E-01 0.958 0.843 1.089 3.61E-01 1.068 0.927 1.230 6.46E-03 0.740 0.596 0.919
rs1847472 6 9,09,73,159 rs1847472 1 IBD-all A C 0.308 0.106 0.122 0.117 0.125 1.63E-01 0.892 0.760 1.047 2.59E-01 0.902 0.755 1.079 4.73E-02 0.761 0.581 0.997
rs7746082 6 ########## rs6568421 0.989 IBD-all C G NA NA 0.053 0.057 0.050 4.09E-01 1.102 0.875 1.388 6.49E-01 1.062 0.820 1.374 5.52E-01 1.116 0.777 1.603
rs3851228 6 ########## rs3851228 1 IBD-all A T NA NA 0.171 0.170 0.173 7.47E-01 0.978 0.852 1.122 6.49E-01 0.965 0.826 1.126 5.70E-01 1.064 0.859 1.317
rs2503322 6 ########## rs9491697 0.655 CD-only A G 0.42 0.407 0.398 0.394 0.402 4.92E-01 0.963 0.866 1.072 4.88E-01 0.959 0.850 1.080 2.08E-01 0.896 0.755 1.063
rs13204742 6 ########## rs13204742 1 CD-only A C 0.146 0.009 0.041 0.040 0.041 4.79E-01 0.910 0.701 1.181 4.66E-01 1.110 0.839 1.467 2.45E-02 0.558 0.336 0.928
rs6920220 6 ########## rs6920220 1 IBD-all A G 0.165 0.119 0.114 0.123 0.105 3.09E-02 1.196 1.017 1.407 1.56E-01 1.141 0.951 1.369 6.12E-03 1.406 1.102 1.793
rs12199775 6 ########## rs12199775 1 IBD-all G A 0.075 0 0.012 0.012 0.012 7.71E-01 0.932 0.581 1.496 5.03E-01 0.830 0.482 1.431 6.73E-01 0.845 0.386 1.850
rs212388 6 ########## rs212388 1 CD-only A G 0.562 0.323 0.340 0.333 0.347 1.70E-01 0.926 0.830 1.033 5.06E-01 0.959 0.849 1.084 4.48E-02 0.834 0.698 0.996
rs1819333 6 ########## rs1819333 1 IBD-all A C 0.491 0.221 0.278 0.277 0.278 2.19E-01 0.930 0.827 1.044 5.52E-01 0.962 0.845 1.094 2.49E-01 0.896 0.742 1.080
rs1182188 7 28,69,985 rs798502 0.94 UC-only G A 0.288 0.088 0.171 0.168 0.174 4.59E-01 0.949 0.826 1.090 5.62E-01 0.955 0.818 1.116 5.25E-01 0.932 0.749 1.159
rs4722672 7 2,72,31,762 rs4722672 1 UC-only G A 0.177 0.487 0.432 0.438 0.429 4.59E-01 1.040 0.938 1.153 9.56E-01 1.003 0.893 1.127 2.92E-01 1.092 0.927 1.285

rs864745 7 2,81,80,556 rs864745 1 CD-only G A 0.513 0.228 0.252 0.246 0.256 1.52E-01 0.916 0.812 1.033 1.21E-01 0.898 0.784 1.029 8.78E-01 0.985 0.816 1.190
rs1456896 7 5,03,04,461 rs1456896 1 IBD-all G A 0.354 0.385 0.399 0.399 0.398 9.08E-01 0.994 0.894 1.104 9.31E-01 1.005 0.894 1.131 6.37E-01 0.961 0.814 1.134
rs9297145 7 9,87,59,117 rs9297145 1 IBD-all A C 0.695 0.35 0.491 0.494 0.488 8.52E-01 0.990 0.893 1.098 5.26E-01 1.038 0.924 1.166 1.28E-01 0.880 0.747 1.037
rs1734907 7 ########## rs1734907 1 IBD-all A G 0.162 0.192 0.169 0.167 0.171 5.40E-01 0.958 0.834 1.100 6.24E-01 0.962 0.825 1.122 9.23E-01 0.989 0.794 1.232
rs4380874 7 ########## rs4380874 1 UC-only A G 0.438 0.15 0.206 0.210 0.202 2.12E-01 1.085 0.954 1.234 4.11E-01 1.062 0.920 1.226 1.46E-01 1.160 0.950 1.417
rs38911 7 ########## rs38904 1 IBD-all G A 0.522 0.317 0.395 0.402 0.390 8.59E-01 1.010 0.909 1.121 7.98E-01 1.015 0.903 1.142 9.00E-01 0.990 0.840 1.166
rs4728142 7 ########## rs4728142 1 UC-only A G 0.398 0.239 0.282 0.280 0.283 2.90E-01 0.940 0.837 1.055 2.47E-01 0.927 0.815 1.054 7.33E-01 1.032 0.861 1.237
rs921720 8 ########## rs921720 1 IBD-all G A 0.619 0.075 0.190 0.211 0.174 2.60E-02 1.163 1.018 1.327 1.34E-02 1.204 1.039 1.395 3.26E-01 1.112 0.900 1.374
rs6651252 8 ########## rs6651252 1 CD-only G A 0.124 0.381 0.321 0.320 0.322 9.85E-01 1.001 0.894 1.121 9.79E-01 1.002 0.884 1.135 3.98E-01 1.080 0.903 1.293
rs13277237 8 ########## rs1991866 0.902 IBD-all A G 0.549 0.398 0.445 0.444 0.446 7.29E-01 0.981 0.883 1.091 8.95E-01 1.008 0.896 1.134 3.84E-01 0.929 0.786 1.097
rs4743820 9 9,39,28,416 rs4743820 1 IBD-all A G 0.699 0.283 0.402 0.407 0.399 7.28E-01 0.981 0.882 1.092 7.82E-01 0.983 0.873 1.108 6.77E-01 0.965 0.815 1.142
rs4246905 9 ########## rs4246905 1 IBD-all A G 0.324 0.022 0.068 0.063 0.072 1.08E-01 0.843 0.685 1.038 5.91E-02 0.797 0.630 1.009 3.35E-01 0.848 0.606 1.186
rs10781499 9 ########## rs10781499 1 IBD-all A G 0.487 0.257 0.298 0.311 0.287 1.95E-01 1.079 0.962 1.210 4.87E-01 1.047 0.920 1.190 1.07E-01 1.160 0.968 1.389
rs12722515 10 60,81,230 rs12722515 1 IBD-all A C 0.15 0.125 0.104 0.108 0.100 7.28E-01 1.030 0.872 1.217 3.67E-01 1.089 0.905 1.309 8.52E-01 0.975 0.747 1.273
rs11010067 10 3,52,95,431 rs11010067 1 IBD-all G C NA NA 0.420 0.428 0.412 1.58E-01 1.081 0.970 1.204 1.76E-01 1.087 0.963 1.226 6.59E-01 1.040 0.875 1.236
rs2790216 10 5,99,97,926 rs2790216 1 IBD-all G A 0.779 0.305 0.413 0.432 0.397 4.62E-02 1.114 1.002 1.240 2.42E-02 1.146 1.018 1.291 8.30E-01 1.019 0.860 1.207
rs10761659 10 6,44,45,564 rs10761659 1 IBD-all G A 0.553 0.035 0.150 0.169 0.135 6.96E-03 1.226 1.057 1.421 1.27E-02 1.231 1.045 1.450 7.28E-02 1.229 0.981 1.539
rs2227551 10 7,56,69,190 rs2227564 0.812 IBD-all C A 0.274 0.482 0.402 0.392 0.409 2.61E-01 0.941 0.847 1.046 2.51E-01 0.933 0.830 1.050 4.64E-01 0.939 0.795 1.110
rs1250546 10 8,10,32,532 rs1250546 1 IBD-all G A 0.442 0.155 0.235 0.238 0.233 8.48E-01 1.012 0.895 1.144 8.97E-01 1.009 0.879 1.158 6.50E-01 1.046 0.862 1.269
rs7097656 10 8,22,50,831 rs6586030 0.794 IBD-all A G 0.204 0.022 0.054 0.055 0.054 4.61E-01 0.917 0.728 1.155 3.71E-01 0.888 0.685 1.152 4.40E-01 0.863 0.593 1.255
rs12778642 10 9,44,64,307 rs7911264 0.757 IBD-all A C 0.438 0.407 0.433 0.423 0.441 1.11E-01 0.919 0.828 1.020 5.21E-02 0.891 0.792 1.001 9.77E-01 0.998 0.847 1.175
rs4409764 10 ########## rs4409764 1 IBD-all C A 0.545 0.478 0.431 0.416 0.444 6.70E-02 0.906 0.815 1.007 1.53E-01 0.917 0.814 1.033 2.43E-01 0.905 0.766 1.070

rs11229555 11 5,84,08,687 rs10896794 0.87 IBD-all A C 0.288 0.195 0.141 0.139 0.143 3.35E-01 0.929 0.799 1.079 2.27E-01 0.901 0.760 1.067 9.06E-01 0.986 0.780 1.246
rs11230563 11 6,07,76,209 rs11230563 1 IBD-all G A 0.637 0.323 0.457 0.466 0.449 2.18E-01 1.069 0.962 1.187 2.92E-01 1.065 0.947 1.198 1.83E-01 1.120 0.948 1.323
rs174537 11 6,15,52,680 rs4246215 0.89 IBD-all A C 0.345 0.013 0.095 0.102 0.089 3.68E-01 1.087 0.906 1.304 5.71E-01 1.060 0.866 1.298 3.91E-01 1.128 0.856 1.487
rs559928 11 6,41,50,370 rs559928 1 IBD-all A G 0.204 0.429 0.365 0.370 0.360 5.61E-01 1.033 0.926 1.152 1.48E-01 1.094 0.969 1.235 1.76E-01 0.887 0.745 1.055
rs568617 11 6,56,53,242 rs2231884 0.742 IBD-all A G 0.221 0.531 0.454 0.450 0.456 9.58E-01 1.003 0.903 1.113 7.74E-01 1.017 0.905 1.143 8.14E-01 1.020 0.864 1.205
rs2155219 11 7,62,99,194 rs2155219 1 IBD-all C A 0.513 0.416 0.415 0.402 0.426 4.36E-02 0.897 0.807 0.997 6.84E-02 0.895 0.795 1.008 6.42E-02 0.852 0.720 1.009
rs2226628 11 8,70,91,845 rs6592362 0.501 IBD-all A G 0.274 0.363 0.309 0.308 0.312 7.13E-01 0.979 0.875 1.096 1.99E-01 0.920 0.811 1.045 1.33E-01 1.145 0.960 1.365
rs483905 11 9,60,23,427 rs483905 1 UC-only A G 0.319 0.181 0.216 0.208 0.224 6.49E-02 0.887 0.781 1.007 1.05E-01 0.889 0.771 1.025 2.02E-01 0.877 0.716 1.073
rs561722 11 ########## rs561722 1 UC-only A G 0.301 0.372 0.318 0.310 0.323 1.86E-01 0.927 0.829 1.037 2.74E-01 0.932 0.822 1.057 1.59E-01 0.879 0.735 1.052
rs566416 11 ########## rs630923 0.427 IBD-all C A 0.223 0 0.056 0.050 0.061 1.06E-02 0.740 0.587 0.932 3.56E-02 0.759 0.587 0.982 6.19E-02 0.692 0.470 1.019
rs11054935 12 1,26,48,843 rs11612508 0.988 IBD-all G A 0.279 0.009 0.075 0.082 0.069 1.44E-01 1.160 0.950 1.416 3.49E-01 1.113 0.889 1.393 2.37E-01 1.200 0.887 1.625
rs11564258 12 4,07,92,300 rs11564258 1 IBD-all A G 0.027 0 0.010 0.014 0.008 3.77E-02 1.734 1.032 2.916 5.20E-02 1.760 0.995 3.113 1.04E-01 1.873 0.879 3.990
rs11168249 12 4,82,08,368 rs11168249 1 IBD-all A G 0.496 0.363 0.361 0.355 0.366 2.29E-01 0.935 0.839 1.043 1.25E-01 0.908 0.804 1.027 8.29E-01 1.019 0.859 1.209
rs7134599 12 6,85,00,075 rs7134599 1 IBD-all A G 0.394 0.319 0.301 0.309 0.295 2.32E-01 1.071 0.957 1.197 6.95E-01 1.025 0.904 1.163 2.08E-02 1.226 1.031 1.458
rs17085007 13 2,75,31,267 rs17085007 1 IBD-all G A 0.15 0.066 0.108 0.109 0.107 7.43E-01 0.972 0.819 1.153 6.24E-01 0.953 0.786 1.155 9.26E-01 0.987 0.756 1.289
rs941823 13 4,10,13,977 rs941823 1 IBD-all A G 0.221 0.217 0.213 0.216 0.209 5.23E-01 1.043 0.917 1.185 4.22E-01 1.060 0.920 1.222 8.71E-01 1.017 0.830 1.247
rs3764147 13 4,44,57,925 rs3764147 1 CD-only G A 0.243 0.35 0.321 0.333 0.310 9.24E-02 1.100 0.984 1.230 5.64E-02 1.129 0.997 1.278 3.92E-01 1.079 0.906 1.286
rs3742130 13 9,99,07,341 rs9557195 0.987 IBD-all A G 0.221 0.022 0.067 0.063 0.070 1.16E-01 0.845 0.686 1.042 1.72E-01 0.849 0.670 1.074 2.94E-01 0.834 0.594 1.171


rs194749 14 6,92,73,905 rs194749 1 IBD-all G A 0.235 0.192 0.207 0.215 0.199 6.82E-02 1.127 0.991 1.281 2.76E-01 1.083 0.938 1.251 5.39E-02 1.217 0.997 1.487
rs1569328 14 7,57,41,751 rs4899554 0.538 IBD-all A G 0.159 0 0.031 0.032 0.029 7.70E-01 1.045 0.777 1.406 9.10E-01 1.019 0.729 1.426 7.38E-01 1.082 0.682 1.717
rs8005161 14 8,84,72,595 rs8005161 1 IBD-all A G 0.14 0.438 0.365 0.374 0.357 1.55E-01 1.081 0.971 1.203 1.04E-01 1.105 0.980 1.246 5.35E-01 1.055 0.890 1.251
rs16967103 15 3,88,99,190 rs16967103 1 CD-only G A 0.241 0.279 0.295 0.285 0.302 1.56E-01 0.920 0.821 1.032 2.62E-01 0.929 0.818 1.056 1.38E-01 0.870 0.724 1.046
rs28374715 15 4,15,63,950 rs28374715 1 UC-only G A 0.195 0.327 0.269 0.263 0.274 3.35E-01 0.945 0.841 1.061 2.48E-01 0.926 0.813 1.055 9.10E-01 0.989 0.824 1.189
rs17293632 15 6,74,42,596 rs17293632 1 IBD-all A G 0.23 0.031 0.072 0.077 0.067 3.14E-01 1.109 0.907 1.358 2.30E-01 1.147 0.917 1.434 6.09E-01 0.916 0.654 1.283
rs529866 16 1,13,73,320 rs529866 1 IBD-all A G 0.195 0.181 0.211 0.203 0.216 3.51E-01 0.942 0.830 1.068 3.06E-01 0.928 0.805 1.070 6.69E-01 1.044 0.859 1.268
rs7404095 16 2,38,64,590 rs7404095 1 IBD-all A G 0.363 0.407 0.405 0.398 0.409 2.09E-01 0.935 0.842 1.038 4.77E-01 0.959 0.853 1.077 1.68E-01 0.890 0.754 1.050
rs26528 16 2,85,17,709 rs26528 1 IBD-all G A 0.451 0.31 0.386 0.389 0.382 4.54E-01 1.041 0.937 1.158 3.77E-01 1.055 0.937 1.189 8.38E-01 1.018 0.860 1.205
rs11150589 16 3,04,82,494 rs11150589 1 UC-only G A 0.544 0.155 0.200 0.215 0.186 2.89E-02 1.158 1.015 1.321 2.64E-02 1.180 1.020 1.366 2.32E-01 1.135 0.922 1.398
rs5743289 16 5,07,56,774 rs5743289 1 CD-only A G 0.226 0 0.041 0.052 0.032 6.52E-03 1.440 1.107 1.872 6.12E-05 1.763 1.336 2.326 2.64E-01 0.749 0.451 1.244
rs1728785 16 6,85,91,230 rs1728785 1 UC-only A C 0.164 0.128 0.128 0.127 0.130 2.54E-01 0.914 0.782 1.067 3.59E-01 0.922 0.775 1.097 8.96E-01 0.984 0.773 1.252
rs2361755 16 8,60,09,686 rs10521318 0.754 IBD-all C G NA NA 0.119 0.116 0.120 6.67E-01 0.965 0.822 1.134 3.58E-01 0.917 0.763 1.102 2.35E-01 1.158 0.909 1.474
rs2945412 17 2,58,43,643 rs2945412 1 CD-only G A 0.416 0.208 0.231 0.223 0.237 1.19E-01 0.905 0.799 1.026 2.02E-01 0.913 0.795 1.050 5.41E-01 0.941 0.773 1.144
rs3091315 17 3,25,93,665 rs3091316 1 IBD-all G A 0.31 0.451 0.358 0.341 0.373 5.18E-03 0.858 0.770 0.955 5.06E-03 0.841 0.745 0.949 6.24E-01 0.958 0.809 1.136

rs12946510 17 3,79,12,377 rs12946510 1 IBD-all A G 0.5 0.119 0.216 0.233 0.201 4.00E-03 1.203 1.061 1.364 9.31E-05 1.319 1.148 1.516 8.13E-01 0.976 0.795 1.197
rs12942547 17 4,05,27,544 rs12942547 1 IBD-all G A 0.455 0.456 0.412 0.406 0.419 1.43E-01 0.924 0.831 1.027 4.91E-01 0.959 0.853 1.079 1.36E-01 0.879 0.743 1.041
rs1292053 17 5,79,63,537 rs1292053 1 IBD-all A G 0.549 0.504 0.493 0.491 0.494 5.96E-01 0.972 0.876 1.079 6.04E-01 0.970 0.863 1.089 7.63E-01 1.026 0.870 1.209
rs17780256 17 7,06,42,923 rs7210086 0.968 UC-only C A 0.205 0.093 0.137 0.134 0.140 6.82E-01 0.969 0.835 1.125 7.74E-01 1.024 0.869 1.207 2.21E-01 0.858 0.672 1.096
rs1893217 18 1,28,09,340 rs1893217 1 IBD-all G A 0.115 0.049 0.062 0.065 0.060 7.81E-01 0.970 0.782 1.204 9.58E-01 1.006 0.793 1.277 9.07E-01 0.980 0.699 1.375
rs7240004 18 4,63,95,022 rs7240004 1 IBD-all A G 0.626 0.478 0.474 0.478 0.471 4.97E-01 1.037 0.934 1.151 1.59E-01 1.087 0.968 1.222 3.55E-01 0.925 0.783 1.092
rs727088 18 6,75,30,439 rs727088 1 IBD-all A G 0.5 0.254 0.299 0.303 0.297 4.39E-01 0.956 0.855 1.071 5.40E-01 0.961 0.847 1.091 2.12E-01 0.891 0.743 1.068
rs2024092 19 11,24,031 rs2024092 1 CD-only A G 0.212 0.388 0.303 0.299 0.307 4.94E-01 0.961 0.858 1.077 5.95E-01 0.966 0.850 1.098 9.38E-01 1.007 0.843 1.204
rs11879191 19 1,05,12,911 rs11879191 1 IBD-all A G 0.199 0.128 0.123 0.125 0.121 4.64E-01 1.061 0.905 1.245 8.01E-01 1.023 0.856 1.223 4.14E-01 1.109 0.865 1.424
rs17694108 19 3,37,31,551 rs17694108 1 IBD-all A G 0.305 0.013 0.065 0.069 0.061 7.01E-01 1.042 0.845 1.285 7.17E-01 1.044 0.827 1.318 9.15E-01 0.982 0.703 1.371
rs4802307 19 4,68,49,806 rs4802307 1 CD-only A C 0.283 0.049 0.094 0.090 0.098 2.06E-02 0.809 0.676 0.968 1.04E-02 0.765 0.623 0.939 7.18E-01 0.951 0.726 1.247
rs11083840 19 4,71,19,910 rs1126510 0.782 UC-only A C 0.593 0.42 0.452 0.454 0.448 6.05E-01 1.028 0.926 1.142 4.55E-01 1.046 0.930 1.175 7.38E-01 0.972 0.824 1.147
rs516246 19 4,92,06,172 rs516246 1 CD-only G A 0.469 0.473 0.493 0.480 0.504 4.37E-02 0.899 0.810 0.997 3.35E-02 0.882 0.785 0.990 5.94E-01 0.956 0.812 1.127
rs4243971 20 3,08,49,517 rs6142618 0.932 IBD-all A C 0.469 0.204 0.242 0.242 0.242 4.47E-01 0.954 0.845 1.077 4.62E-01 0.950 0.829 1.089 3.49E-01 0.912 0.751 1.106
rs6087990 20 3,13,49,908 rs4911259 0.876 IBD-all A G 0.619 0.19 0.294 0.292 0.296 1.83E-01 0.926 0.826 1.037 3.06E-01 0.936 0.823 1.063 4.21E-02 0.828 0.691 0.993
rs6088765 20 3,37,99,280 rs6088765 1 UC-only A C 0.584 0.133 0.251 0.261 0.242 1.93E-01 1.083 0.961 1.220 2.33E-01 1.085 0.949 1.242 3.63E-01 1.091 0.905 1.315
rs6017342 20 4,30,65,028 rs6017342 1 UC-only A C 0.438 0.195 0.269 0.272 0.267 7.68E-01 0.982 0.872 1.107 7.14E-01 1.025 0.898 1.171 1.18E-01 0.857 0.707 1.040
rs6074022 20 4,47,40,196 rs1569723 1 IBD-all G A 0.239 0.004 0.065 0.070 0.061 6.46E-01 1.051 0.851 1.297 5.95E-01 1.066 0.843 1.347 7.91E-01 1.046 0.751 1.455
rs913678 20 4,89,55,424 rs913678 1 IBD-all A G 0.606 0.022 0.168 0.183 0.155 7.98E-02 1.137 0.985 1.312 4.48E-02 1.176 1.004 1.378 4.44E-01 1.089 0.875 1.357
rs259964 20 5,78,24,309 rs259964 1 IBD-all G A 0.536 0.305 0.332 0.321 0.342 7.40E-02 0.906 0.813 1.010 6.32E-02 0.891 0.789 1.006 1.58E-01 0.883 0.743 1.049

rs6062504 20 6,23,48,907 rs6062504 1 IBD-all A G 0.345 0.243 0.286 0.267 0.301 5.91E-03 0.850 0.758 0.954 1.19E-02 0.847 0.744 0.964 1.25E-01 0.865 0.719 1.041
rs2823286 21 1,68,17,938 rs2823286 1 IBD-all A G 0.319 0.261 0.275 0.263 0.285 1.22E-01 0.914 0.815 1.024 1.53E-01 0.911 0.801 1.035 5.83E-01 0.951 0.794 1.139
rs2284553 21 3,47,76,695 rs2284553 1 CD-only A G 0.442 0.053 0.114 0.112 0.116 6.17E-02 0.854 0.723 1.008 1.65E-01 0.877 0.729 1.056 1.09E-01 0.802 0.613 1.051
rs2836878 21 4,04,65,534 rs2836878 1 IBD-all A G 0.288 0.075 0.129 0.114 0.143 1.08E-03 0.770 0.658 0.901 4.49E-02 0.838 0.705 0.996 4.03E-04 0.609 0.463 0.802
rs7282490 21 4,56,15,741 rs7282490 1 IBD-all G A 0.429 0.199 0.251 0.264 0.240 1.88E-02 1.154 1.024 1.301 3.31E-03 1.222 1.069 1.396 9.44E-01 0.993 0.818 1.205
rs2266959 22 2,19,22,904 rs2266959 1 IBD-all A C 0.168 0.031 0.062 0.067 0.057 1.29E-01 1.183 0.952 1.470 2.26E-01 1.162 0.911 1.483 2.32E-01 1.226 0.877 1.714
rs5763767 22 3,04,93,882 rs2412970 1 IBD-all G A 0.615 0.208 0.310 0.296 0.321 7.07E-03 0.854 0.761 0.958 8.97E-03 0.842 0.740 0.958 5.62E-02 0.834 0.693 1.005
rs2413583 22 3,96,59,773 rs2413583 1 IBD-all A G 0.146 0.367 0.274 0.250 0.295 7.90E-04 0.819 0.728 0.920 2.09E-03 0.812 0.711 0.927 2.56E-01 0.900 0.750 1.080
rs3749171 2 ########## rs6837335 0.968 CD-only
rs7438704 4 4,83,63,245 rs17695092 1 CD-only
rs7657746 4 ########## rs10486483 1 CD-only
rs17695092 5 ########## rs7015630 1 CD-only
rs10486483 7 2,68,92,440 rs3749171 1 IBD-all
rs7015630 8 9,08,75,918 rs7657746 1 IBD-all
rs10758669 9 49,81,602 rs10758669 1 IBD-all
rs1042058 10 3,07,28,101 rs1042058 1 IBD-all
rs907611 11 18,74,072 rs907611 1 IBD-all
rs7495132 15 9,11,72,901 rs7495132 1 IBD-all
rs1654644 19 5,53,73,362 rs11672983 0.866 IBD-all

Chr_Region HUGO_ID Uniprot_ID NCBI_ID Gene_Name Function

10q11.23-10q21.2 19400 Q8N328 NM_170753,NM_001277059 PIGGYBAC TRANSPOSABLE ELEMENT-DERIVED PROTEIN 2-RELATED (PTHR28576:SF3)
10q11.23-10q21.2 9414 Q13976 NM_006258,NM_001098512 SUBFAMILY NOT NAMED (PTHR24353:SF68) non-receptor serine/threonine protein kinase;non-receptor serine/threonine protein kinase
10q11.23-10q21.2 19351 Q9H694 NM_001080512 PROTEIN BICAUDAL C HOMOLOG 1 (PTHR10627:SF38) apolipoprotein;RNA binding protein
10q11.23-10q21.2 26470 Q8IW00 NM_144984,NM_001031746 V-SET AND TRANSMEMBRANE DOMAIN-CONTAINING PROTEIN 4 (PTHR12207:SF8)
10q11.23-10q21.2 23466 Q5VW22 NM_001077665 ARF-GAP WITH GTPASE, ANK REPEAT AND PH DOMAIN-CONTAINING PROTEIN 10-RELATED (PTHR23180:SF213) nucleic acid binding;G-protein modulator
10q11.23-10q21.2 20739 Q8NFU5 NM_152230 INOSITOL POLYPHOSPHATE MULTIKINASE (PTHR12400:SF51) kinase;kinase
10q11.23-10q21.2 18738 O94844 NM_001242359,NM_014836 RHO-RELATED BTB DOMAIN-CONTAINING PROTEIN 1 (PTHR24072:SF120) small GTPase
10q11.23-10q21.2 28550 Q6ZUK4 NM_178505 TRANSMEMBRANE PROTEIN 26 (PTHR22168:SF3)
10q11.23-10q21.2 29799 Q86VZ5 NM_147156 PHOSPHATIDYLCHOLINE:CERAMIDE CHOLINEPHOSPHOTRANSFERASE 1 (PTHR21290:SF28)
10q11.23-10q21.2 29378 Q96FC7 NM_032439,NM_001143774 PHYTANOYL-COA HYDROXYLASE-INTERACTING PROTEIN-LIKE (PTHR15698:SF8)
10q11.23-10q21.2 27421 Q8N6V4 NM_182554,NM_001042427 UPF0728 PROTEIN C10ORF53 (PTHR28448:SF1)
10q11.23-10q21.2 8605 Q86W56 NM_003631 POLY(ADP-RIBOSE) GLYCOHYDROLASE (PTHR12837:SF0) glycosidase
10q11.23-10q21.2 7671 Q13772 NM_001145261,NM_001145262,NM_001145263,NM_001145260,NM_005437 NUCLEAR RECEPTOR COACTIVATOR 4 (PTHR17085:SF3)
10q11.23-10q21.2 18860 Q9NR71 NM_019893,NM_001143974 NEUTRAL CERAMIDASE-RELATED (PTHR12670:SF1)
10q11.23-10q21.2 14674 Q96QU1 NM_033056 PROTOCADHERIN-15 (PTHR24028:SF11) G-protein coupled receptor;cadherin
10q11.23-10q21.2 30880 Q9NZ45 NM_018464 CDGSH IRON-SULFUR DOMAIN-CONTAINING PROTEIN 1 (PTHR13680:SF2)
10q11.23-10q21.2 23520 Q7RTY1 NM_194298 MONOCARBOXYLATE TRANSPORTER 9 (PTHR11360:SF158) transporter
10q11.23-10q21.2 494 Q12955 NM_001204403,NM_001204404,NM_001149,NM_020987 ANKYRIN-3 (PTHR24123:SF22) cytoskeletal protein
10q11.23-10q21.2 23199 Q8N456 NM_001006939 LEUCINE-RICH REPEAT-CONTAINING PROTEIN 18 (PTHR23155:SF593)
10q11.23-10q21.2 25590 Q9ULD0 NM_001143997,NM_018245,NM_001143996 2-OXOGLUTARATE DEHYDROGENASE-LIKE, MITOCHONDRIAL (PTHR23152:SF5)
10q11.23-10q21.2 7372 P08118 NM_002443,NM_138634 BETA-MICROSEMINOPROTEIN (PTHR10500:SF0) peptide hormone
10q11.23-10q21.2 23416 Q641Q2 NM_001005751 WASH COMPLEX SUBUNIT FAM21-RELATED (PTHR21669:SF4)
10q11.23-10q21.2 24086 Q9NQ94 NM_001198820,NM_001198818,NM_001198819,NM_014576,NM_138932,NM_138933 APOBEC1 COMPLEMENTATION FACTOR (PTHR24012:SF313)

10q11.23-10q21.2 6922 P11226 NM_000242 MANNOSE-BINDING PROTEIN C; MBL2 ; (PTHR24020:SF0) defense/immunity protein
10q11.23-10q21.2 21536 A6NNA5 NM_001276451 DORSAL ROOT GANGLIA HOMEOBOX PROTEIN (PTHR24329:SF373) homeobox transcription factor;DNA binding protein
10q11.23-10q21.2 1912 P28329 NM_001142933,NM_020984,NM_001142934,NM_020549,NM_001142929,NM_020985,NM_020986 CHOLINE O-ACETYLTRANSFERASE (PTHR22589:SF14) acetyltransferase;acyltransferase
10q11.23-10q21.2 10936 Q16572 NM_003055 VESICULAR ACETYLCHOLINE TRANSPORTER (PTHR23506:SF13)
10q11.23-10q21.2 23456 P0C7U1 NM_001079516 NEUTRAL CERAMIDASE-RELATED (PTHR12670:SF1)
10q11.23-10q21.2 37162 P0CJ72 NM_001190478 HUMANIN-LIKE PROTEIN 5 (PTHR33895:SF4)
10q11.23-10q21.2 19371 Q8NE31 NM_198215,NM_001166698,NM_001001971,NM_001143773 PROTEIN FAM13C (PTHR15904:SF19)
10q11.23-10q21.2 18782 Q16204 NM_005436 COILED-COIL DOMAIN-CONTAINING PROTEIN 6 (PTHR15276:SF0)
10q11.23-10q21.2 19736 A6NMN3 NM_001164484 PROTEIN FAM170B-RELATED (PTHR33517:SF2)
10q11.23-10q21.2 27274 Q5T292 NM_001288740,NM_001010863,NM_001288743 PROTEIN 1810011H11RIK (PTHR37857:SF1)
10q11.23-10q21.2 2891 O94907 NM_012242 DICKKOPF-RELATED PROTEIN 1 (PTHR12113:SF11)
10q11.23-10q21.2 13195 O95229 NM_007057,NM_032997,NM_001005413 ZW10 INTERACTOR (PTHR31504:SF1)
10q11.23-10q21.2 11741 Q00059 NM_003201,NM_001270782 TRANSCRIPTION FACTOR A, MITOCHONDRIAL (PTHR13711:SF49) HMG box transcription factor;signaling molecule;chromatin/chromatin-binding protein
10q11.23-10q21.2 1722 P06493 NM_033379,NM_001786 CYCLIN-DEPENDENT KINASE 1 (PTHR24056:SF80) non-receptor serine/threonine protein kinase;non-receptor tyrosine protein kinase;non-receptor serine/threonine protein kinase;non-receptor tyrosine protein kinase
10q11.23-10q21.2 29323 Q6ZS81 NM_020945 WD REPEAT- AND FYVE DOMAIN-CONTAINING PROTEIN 4 (PTHR13743:SF85) enzyme modulator
10q11.23-10q21.2 26973 Q711Q0 NM_001135196 PROTEIN 3425401B19RIK (PTHR33775:SF2)
10q11.23-10q21.2 17312 O14925 NM_006327 MITOCHONDRIAL IMPORT INNER MEMBRANE TRANSLOCASE SUBUNIT TIM23 (PTHR15371:SF0) amino acid transporter
10q11.23-10q21.2 17086 Q9H0L4 NM_015235 CLEAVAGE STIMULATION FACTOR SUBUNIT 2 TAU VARIANT (PTHR23139:SF55) mRNA splicing factor
10q11.23-10q21.2 12474 P51668 NM_003338 UBIQUITIN-CONJUGATING ENZYME E2 D1 (PTHR24068:SF36) ligase

15q22.2-15q23 1371 O43570 NM_206925,NM_001218 CARBONIC ANHYDRASE 12 (PTHR18952:SF19) dehydratase
15q22.2-15q23 29666 Q96DP5 NM_139242 METHIONYL-TRNA FORMYLTRANSFERASE, MITOCHONDRIAL (PTHR11138:SF0) methyltransferase
15q22.2-15q23 6840 Q02750 NM_002755 DUAL SPECIFICITY MITOGEN-ACTIVATED PROTEIN KINASE KINASE 1 (PTHR24361:SF370)
15q22.2-15q23 10353 P36578 NM_000968 60S RIBOSOMAL PROTEIN L4 (PTHR19431:SF0)
15q22.2-15q23 29645 Q9NXK6 NM_001104554,NM_017705 MEMBRANE PROGESTIN RECEPTOR GAMMA (PTHR20855:SF38) G-protein coupled receptor
15q22.2-15q23 10372 P05386 NM_213725,NM_001003 60S ACIDIC RIBOSOMAL PROTEIN P1 (PTHR21141:SF42) ribosomal protein
15q22.2-15q23 24080 Q8WW43 NM_031301,NM_001145646 GAMMA-SECRETASE SUBUNIT APH-1B (PTHR12889:SF1) enzyme modulator
15q22.2-15q23 26235 Q9H5X1 NM_032231,NM_001014812 MIP18 FAMILY PROTEIN FAM96A (PTHR12377:SF2)
15q22.2-15q23 2454 Q9HCP0 NM_022048 CASEIN KINASE I ISOFORM GAMMA-1-RELATED (PTHR11909:SF151) non-receptor serine/threonine protein kinase;non-receptor serine/threonine protein kinase
15q22.2-15q23 28961 Q15004 NM_014736,NM_001029989 PCNA-ASSOCIATED FACTOR (PTHR15679:SF8)
15q22.2-15q23 8096 O95190 NM_002537 ORNITHINE DECARBOXYLASE ANTIZYME 2 (PTHR10279:SF6) enzyme modulator
15q22.2-15q23 26220 Q9H611 NM_025049,NM_001286496,NM_001286497,NM_001286499 ATP-DEPENDENT DNA HELICASE PIF1 (PTHR23274:SF11)
15q22.2-15q23 20373 Q9NZD8 NM_001127889,NM_016630,NM_001127890 MASPARDIN (PTHR15913:SF0)
15q22.2-15q23 25721 Q86VS3 NM_001284347,NM_001284349,NM_001284348,NM_022784,NM_001031715 IQ DOMAIN-CONTAINING PROTEIN H (PTHR14465:SF0)
15q22.2-15q23 3649 Q9UK73 NM_015322 PROTEIN FEM-1 HOMOLOG B (PTHR24173:SF18)

15q22.2-15q23 6136 Q9UKX5 NM_001004439 INTEGRIN ALPHA-11 (PTHR23220:SF21)
15q22.2-15q23 2256 Q9UQ03 NM_006091,NM_001190457,NM_001190456 CORONIN-2B (PTHR10856:SF17) non-motor actin binding protein

15q22.2-15q23 30026 Q8TD55 NM_025201,NM_001195059 PLECKSTRIN HOMOLOGY DOMAIN-CONTAINING FAMILY O MEMBER 2 (PTHR15871:SF2)
15q22.2-15q23 30750 Q9BVW5 NM_017858 TIMELESS-INTERACTING PROTEIN (PTHR13220:SF11) transcription factor
15q22.2-15q23 2077 Q9NWW5 NM_017882 CEROID-LIPOFUSCINOSIS NEURONAL PROTEIN 6 (PTHR16244:SF2)
15q22.2-15q23 11839 Q04726 NM_001282982,NM_001105192,NM_005078,NM_020908,NM_001282979,NM_001282980,NM_001282981 TRANSDUCIN-LIKE ENHANCER PROTEIN 3 (PTHR10814:SF24) transcription cofactor
15q22.2-15q23 16468 P83111 NM_171846,NM_032857,NM_001288585 SERINE BETA-LACTAMASE-LIKE PROTEIN LACTB, MITOCHONDRIAL (PTHR22935:SF37) serine protease;serine protease
15q22.2-15q23 29956 Q86UW2 NM_178859 ORGANIC SOLUTE TRANSPORTER SUBUNIT BETA (PTHR36129:SF1)
15q22.2-15q23 24175 Q9P035 NM_016395 VERY-LONG-CHAIN (3R)-3-HYDROXYACYL-[ACYL-CARRIER PROTEIN] DEHYDRATASE 3 (PTHR11035:SF20)
15q22.2-15q23 15484 O75971 NM_006049 SNRNA-ACTIVATING PROTEIN COMPLEX SUBUNIT 5 (PTHR15333:SF2)
15q22.2-15q23 6769 P84022 NM_001145104,NM_005902,NM_001145102,NM_001145103 MOTHERS AGAINST DECAPENTAPLEGIC HOMOLOG 3 (PTHR13703:SF25) transcription factor
15q22.2-15q23 14874 Q96PH1 NM_001184780,NM_001184779,NM_024505 NADPH OXIDASE 5 (PTHR11972:SF58) oxidase
15q22.2-15q23 15447 Q9Y4G6 NM_015059 TALIN-2 (PTHR19981:SF15) actin family cytoskeletal protein;cell adhesion molecule
15q22.2-15q23 29003 O15014 NM_015042 ZINC FINGER PROTEIN 609 (PTHR21564:SF3)
15q22.2-15q23 13770 Q8TDY8 NM_020962 IMMUNOGLOBULIN SUPERFAMILY DCC SUBCLASS MEMBER 4; IGDCC4; (PTHR10489:SF40) immunoglobulin receptor superfamily;protein phosphatase;protein phosphatase;immunoglobulin receptor superfamily;immunoglobulin superfamily cell adhesion molecule

15q22.2-15q23 25468 Q9H900 NM_001287822,NM_001287823,NM_017975,NM_001287821 PROTEIN ZWILCH HOMOLOG (PTHR15995:SF1)

15q22.2-15q23 25662 Q6PD74 NM_001271885,NM_024666,NM_001271886 ALPHA- AND GAMMA-ADAPTIN-BINDING PROTEIN P34 (PTHR14659:SF1)

15q22.2-15q23 21326 P84550 NM_001258024 SKI FAMILY TRANSCRIPTIONAL COREPRESSOR 1 (PTHR10005:SF8) transcription factor

15q22.2-15q23 17855 O94923 NM_015554 D-GLUCURONYL C5-EPIMERASE (PTHR13174:SF3)

15q22.2-15q23 12010 P09493 NM_001018004,NM_001018005,NM_000366,NM_001018020,NM_001018008,NM_001018006,NM_001018007 TROPOMYOSIN ALPHA-1 CHAIN (PTHR19269:SF41) actin binding motor protein

15q22.2-15q23 12626 Q9Y6I4 NM_001256702,NM_006537 UBIQUITIN CARBOXYL-TERMINAL HYDROLASE 3 (PTHR24006:SF356)

15q22.2-15q23 4867 Q15751 NM_003922 E3 UBIQUITIN-PROTEIN LIGASE HERC1-RELATED (PTHR22870:SF188) chromatin/chromatin-binding protein;guanyl-nucleotide exchange factor

15q22.2-15q23 11172 Q13596 NM_003099,NM_001242933,NM_148955 SORTING NEXIN-1 (PTHR10555:SF129) membrane trafficking regulatory protein

15q22.2-15q23 12310 Q15650 NM_016213 ACTIVATING SIGNAL COINTEGRATOR 1 (PTHR12963:SF0) transcription cofactor

15q22.2-15q23 28002 Q495B1 NM_182703 ANKYRIN REPEAT AND DEATH DOMAIN-CONTAINING PROTEIN 1A (PTHR24125:SF0)

15q22.2-15q23 2088 O76031 NM_006660 ATP-DEPENDENT CLP PROTEASE ATP-BINDING SUBUNIT CLPX-LIKE, MITOCHONDRIAL (PTHR11262:SF4) chaperone

15q22.2-15q23 1980 O75339 NM_003613 CARTILAGE INTERMEDIATE LAYER PROTEIN 1 (PTHR15031:SF3)

15q22.2-15q23 26040 Q8N5Y8 NM_017851 MONO [ADP-RIBOSE] POLYMERASE PARP16 (PTHR21328:SF24)

15q22.2-15q23 29635 A6BM72 NM_032445 MULTIPLE EPIDERMAL GROWTH FACTOR-LIKE DOMAINS PROTEIN 11 (PTHR24035:SF10) extracellular matrix protein

15q22.2-15q23 2752 O75925 NM_016166 E3 SUMO-PROTEIN LIGASE PIAS1 (PTHR10782:SF11) ligase

15q22.2-15q23 16315 Q96L94 NM_024798 SORTING NEXIN-22 (PTHR15813:SF8)

15q22.2-15q23 9255 P23284 NM_000942 PEPTIDYL-PROLYL CIS-TRANS ISOMERASE B (PTHR11071:SF267) isomerase

15q22.2-15q23 37227 C9JR72 NM_001101362 KELCH REPEAT AND BTB DOMAIN-CONTAINING PROTEIN 13 (PTHR24412:SF57)

15q22.2-15q23 40028 F5GYI3 NM_001163692 UBIQUITIN-ASSOCIATED PROTEIN 1-LIKE (PTHR15960:SF3)

15q22.2-15q23 25372 Q96SY0 NM_001207058,NM_001207059 VON WILLEBRAND FACTOR A DOMAIN-CONTAINING PROTEIN 9 (PTHR13532:SF3)

15q22.2-15q23 10975 O60721 NM_004727 SODIUM/POTASSIUM/CALCIUM EXCHANGER 1 (PTHR10846:SF36) transporter

15q22.2-15q23 9760 P62491 NM_004663,NM_001206836 RAS-RELATED PROTEIN RAB-11A (PTHR24073:SF313)

15q22.2-15q23 28698 Q8TF46 NM_133375,NM_001143688 DIS3-LIKE EXONUCLEASE 1 (PTHR23355:SF30) endoribonuclease;exoribonuclease;nuclease;hydrolase

15q22.2-15q23 6772 O43541 NM_005585 MOTHERS AGAINST DECAPENTAPLEGIC HOMOLOG 6 (PTHR13703:SF28) transcription factor

15q22.2-15q23 18445 Q96GE6 NM_001031733,NM_033429,NM_001286694, CALMODULIN-LIKE PROTEIN 4 (PTHR23050:SF169) calmodulin

15q22.2-15q23 13233 P39687 NM_006305 ACIDIC LEUCINE-RICH NUCLEAR PHOSPHOPROTEIN 32 FAMILY phosphatase inhibitor

15q22.2-15q23 15570 Q6UW49 NM_145658 SPERM EQUATORIAL SEGMENT PROTEIN 1 (PTHR31667:SF2)

15q22.2-15q23 18476 Q71UM5 NM_015920 40S RIBOSOMAL PROTEIN S27-LIKE (PTHR11594:SF2) ribosomal protein

15q22.2-15q23 30273 Q92930 NM_016530 RAS-RELATED PROTEIN RAB-8B (PTHR24073:SF22)

15q22.2-15q23 30289 Q9NYN1 NM_016563 RAS-LIKE PROTEIN FAMILY MEMBER 12 (PTHR24070:SF252) small GTPase

15q22.2-15q23 9700 Q8IVU1 NM_004884 IMMUNOGLOBULIN SUPERFAMILY DCC SUBCLASS MEMBER 3; IGDCC3 ; immunoglobulin receptor superfamily;protein phosphatase;protein

15q22.2-15q23 16490 Q6V1X1 NM_130434,NM_017743,NM_197960,NM_197 DIPEPTIDYL PEPTIDASE 8 (PTHR11731:SF98) serine protease;serine protease

15q22.2-15q23 24321 Q7Z401 NM_001144823,NM_005848 C-MYC PROMOTER-BINDING PROTEIN (PTHR12296:SF16)

15q22.2-15q23 15583 Q6UWM7 NM_207338 LACTASE-LIKE PROTEIN (PTHR10353:SF24) glucosidase;glycosidase

15q22.2-15q23 34453 A6NNL5 NM_001143936 SUBFAMILY NOT NAMED (PTHR34651:SF1)

15q22.2-15q23 6845 Q13163 NM_001206804,NM_145160,NM_002757 DUAL SPECIFICITY MITOGEN-ACTIVATED PROTEIN KINASE KINASE 5

15q22.2-15q23 6392 Q02241 NM_138555,NM_004856 KINESIN-LIKE PROTEIN KIF23 (PTHR24115:SF467) microtubule binding motor protein

16p12.2-16p11.2 28740 Q8IXQ8 NM_173806 PDZ DOMAIN-CONTAINING PROTEIN 9 (PTHR22698:SF1) non-receptor serine/threonine protein kinase;non-receptor

16p12.2-16p11.2 24615 O00418 NM_013302 EUKARYOTIC ELONGATION FACTOR 2 KINASE (PTHR14187:SF56) ligase

16p12.2-16p11.2 29419 Q5JPH6 NM_001083614 GLUTAMATE--TRNA LIGASE, MITOCHONDRIAL-RELATED (PTHR11451:SF41) microtubule binding motor protein

16p12.2-16p11.2 24594 Q9BTE1 NM_032486,NM_001199743,NM_001199011 DYNACTIN SUBUNIT 5 (PTHR13061:SF5) KRAB box transcription factor

16p12.2-16p11.2 25677 Q63HK3 NM_001012981 ZINC FINGER PROTEIN WITH KRAB AND SCAN DOMAINS 2 (PTHR10032:SF206)

16p12.2-16p11.2 5200 Q9Y661 NM_006040 HEPARAN SULFATE GLUCOSAMINE 3-O-SULFOTRANSFERASE 4 (PTHR10605:SF11)

cytoskeletal protein

16p12.2-16p11.2 28283 Q6UXU4 NM_144675,NM_001109763 GERM CELL-SPECIFIC GENE 1-LIKE PROTEIN (PTHR10671:SF35) protein phosphatase;protein phosphatase;calmodulin

16p12.2-16p11.2 24927 O43745 NM_022097 CALCINEURIN B HOMOLOGOUS PROTEIN 2 (PTHR23056:SF49) transporter;transfer/carrier protein

16p12.2-16p11.2 642 O94778 NM_001169 AQUAPORIN-8 (PTHR19139:SF178)

16p12.2-16p11.2 30755 Q7Z2V1 NM_001145545 PROTEIN TNT (PTHR40139:SF1)

16p12.2-16p11.2 25840 Q8N371 NM_024773,NM_001145348 LYSINE-SPECIFIC DEMETHYLASE 8 (PTHR12461:SF38)

16p12.2-16p11.2 29897 Q8WV22 NM_145080 NON-STRUCTURAL MAINTENANCE OF CHROMOSOMES ELEMENT 1 HOMOLOG (PTHR20973:SF0)

16p12.2-16p11.2 4664 Q12789 NM_001520,NM_001286242 GENERAL TRANSCRIPTION FACTOR 3C POLYPEPTIDE 1 (PTHR15180:SF1) metalloprotease;reductase;esterase;metalloprotease

16p12.2-16p11.2 12586 P22695 NM_003366 CYTOCHROME B-C1 COMPLEX SUBUNIT 2, MITOCHONDRIAL (PTHR11851:SF142)

ion channel

16p12.2-16p11.2 10602 P51170 NM_001039 AMILORIDE-SENSITIVE SODIUM CHANNEL SUBUNIT GAMMA (PTHR11690:SF19)

16p12.2-16p11.2 30565 O14562 NM_019116 UBIQUITIN DOMAIN-CONTAINING PROTEIN UBFD1 (PTHR16470:SF0) voltage-gated calcium channel;voltage-gated ion channel

16p12.2-16p11.2 1407 O60359 NM_006539 VOLTAGE-DEPENDENT CALCIUM CHANNEL GAMMA-3 SUBUNIT (PTHR12107:SF5)

nuclease

16p12.2-16p11.2 9889 Q7Z6E9 NM_006910,NM_032626,NM_018703 E3 UBIQUITIN-PROTEIN LIGASE RBBP6 (PTHR15439:SF3) type I cytokine receptor

16p12.2-16p11.2 6015 P24394 NM_001257406,NM_001257407,NM_000418 INTERLEUKIN-4 RECEPTOR SUBUNIT ALPHA; IRL4 ; (PTHR23037:SF32)

16p12.2-16p11.2 29068 O60303 NM_015202 PROTEIN K04F10.2 (PTHR21534:SF0) DNA-directed RNA polymerase


16p12.2-16p11.2 30347 Q9NVU0 NM_018119,NM_001258034,NM_001258033,NM_001258036,NM_001258035

DNA-DIRECTED RNA POLYMERASE III SUBUNIT RPC5 (PTHR12069:SF0) methyltransferase

16p12.2-16p11.2 17557 Q9UIC8 NM_001032391,NM_016309 LEUCINE CARBOXYL METHYLTRANSFERASE 1 (PTHR13600:SF21) type I cytokine receptor

16p12.2-16p11.2 6006 Q9HBE5 NM_021798,NM_181079,NM_181078 INTERLEUKIN-21 RECEPTOR; IL21R ; (PTHR23037:SF7) DNA binding protein

16p12.2-16p11.2 1799 Q01850 NM_001802 CEREBELLAR DEGENERATION-RELATED PROTEIN 2 (PTHR19232:SF1) ion channel

16p12.2-16p11.2 10600 P51168 NM_000336 AMILORIDE-SENSITIVE SODIUM CHANNEL SUBUNIT BETA (PTHR11690:SF18)

transfer/carrier protein

16p12.2-16p11.2 7694 O14561 NM_005003 ACYL CARRIER PROTEIN, MITOCHONDRIAL (PTHR20863:SF5)

16p12.2-16p11.2 11969 Q8NDV7 NM_014494 TRINUCLEOTIDE REPEAT-CONTAINING GENE 6A PROTEIN (PTHR13020:SF28) translation initiation factor

16p12.2-16p11.2 26347 B5ME19 NM_001099661 EUKARYOTIC TRANSLATION INITIATION FACTOR 3 SUBUNIT C-RELATED (PTHR13937:SF0)

translation initiation factor

16p12.2-16p11.2 3279 Q99613 NM_001037808,NM_003752,NM_001199142,NM_001286478,NM_001267574

EUKARYOTIC TRANSLATION INITIATION FACTOR 3 SUBUNIT C-RELATED (PTHR13937:SF0)

extracellular matrix glycoprotein

16p12.2-16p11.2 16378 Q7RTW8 NM_001161683,NM_170664 OTOANCORIN (PTHR23412:SF18)

16p12.2-16p11.2 27088 A6NCI4 NM_173615 VON WILLEBRAND FACTOR A DOMAIN-CONTAINING PROTEIN 3A (PTHR10338:SF95)

16p12.2-16p11.2 20060 Q70CQ4 NM_020718 UBIQUITIN CARBOXYL-TERMINAL HYDROLASE 31 (PTHR24006:SF462) transporter;membrane traffic protein;kinase activator;cytoskeletal protein

16p12.2-16p11.2 16064 Q9UJY4 NM_015044 ADP-RIBOSYLATION FACTOR-BINDING PROTEIN GGA2 (PTHR13856:SF74)

16p12.2-16p11.2 26144 Q86YC2 NM_024675 PARTNER AND LOCALIZER OF BRCA2 (PTHR14662:SF2) non-receptor serine/threonine protein kinase;transfer/carrier protein;non-receptor serine/threonine protein kinase;annexin;calmodulin

16p12.2-16p11.2 9395 P05771 NM_002738,NM_212535 PROTEIN KINASE C BETA TYPE (PTHR24356:SF188) carbohydrate transporter;cation transporter

16p12.2-16p11.2 23091 Q8WWX8 NM_001258412,NM_001258411,NM_0012584 SODIUM/MYO-INOSITOL COTRANSPORTER 2 (PTHR11819:SF127)

16p12.2-16p11.2 18239 Q68EM7 NM_018054,NM_001006634 RHO GTPASE-ACTIVATING PROTEIN 17 (PTHR14130:SF3) protein kinase;protein kinase

16p12.2-16p11.2 16942 Q76MJ5 NM_033266 SERINE/THREONINE-PROTEIN KINASE/ENDORIBONUCLEASE IRE2 protein kinase;protein kinase

16p12.2-16p11.2 17699 Q52WX2 NM_001024401 SERINE/THREONINE-PROTEIN KINASE SBK1 (PTHR24359:SF0)

16p12.2-16p11.2 5195 Q9Y278 NM_006043 HEPARAN SULFATE GLUCOSAMINE 3-O-SULFOTRANSFERASE 2

16p12.2-16p11.2 18622 P83436 NM_153603 CONSERVED OLIGOMERIC GOLGI COMPLEX SUBUNIT 7 (PTHR21443:SF0)

16p12.2-16p11.2 9077 P53350 NM_005030 SERINE/THREONINE-PROTEIN KINASE PLK1 (PTHR24345:SF0)

16p12.2-16p11.2 19733 Q96QU8 NM_001270940,NM_015171 EXPORTIN-6 (PTHR21452:SF4)

16p12.2-16p11.2 37454 E9PJ23 NM_001282524 NUCLEAR PORE COMPLEX-INTERACTING PROTEIN FAMILY MEMBER

17q12-17q21.31 13311 Q96QA5 NM_178171 GASDERMIN-A (PTHR16399:SF18)

17q12-17q21.31 2438 P09919 NM_172220,NM_001178147,NM_000759,NM_ GRANULOCYTE COLONY-STIMULATING FACTOR (PTHR10511:SF2) cytokine

17q12-17q21.31 17428 Q9UHV5 NM_016339 RAP GUANINE NUCLEOTIDE EXCHANGE FACTOR-LIKE 1 guanyl-nucleotide exchange factor

17q12-17q21.31 28695 Q8N1A0 NM_152349 KERATIN-LIKE PROTEIN KRT222 (PTHR23239:SF217) structural protein;intermediate filament

17q12-17q21.31 18527 Q2M2I5 NM_019016 KERATIN, TYPE I CYTOSKELETAL 24 (PTHR23239:SF207) structural protein;intermediate filament

17q12-17q21.31 16778 Q9BYR8 NM_031958 KERATIN-ASSOCIATED PROTEIN 3-1 (PTHR23260:SF3)

17q12-17q21.31 6456 O76015 NM_006771 KERATIN, TYPE I CUTICULAR HA8 (PTHR23239:SF166) structural protein;intermediate filament

17q12-17q21.31 6415 P13646 NM_002274,NM_153490 KERATIN, TYPE I CYTOSKELETAL 13 (PTHR23239:SF121) structural protein;intermediate filament

17q12-17q21.31 6436 P08727 NM_002276 KERATIN, TYPE I CYTOSKELETAL 19 (PTHR23239:SF14) structural protein;intermediate filament

17q12-17q21.31 4164 P01350 NM_000805 GASTRIN (PTHR19309:SF0) peptide hormone

17q12-17q21.31 28300 Q969T7 NM_052935 7-METHYLGUANOSINE PHOSPHATE-SPECIFIC 5'-NUCLEOTIDASE (PTHR13045:SF3) esterase

17q12-17q21.31 9287 Q9UD71 NM_001242464,NM_032192,NM_181505 PROTEIN PHOSPHATASE 1 REGULATORY SUBUNIT 1B (PTHR15417:SF2) signaling molecule;phosphatase inhibitor

17q12-17q21.31 17579 Q14849 NM_001165938,NM_001165937,NM_006804 STAR-RELATED LIPID TRANSFER PROTEIN 3 (PTHR12136:SF51) transfer/carrier protein;membrane traffic protein

17q12-17q21.31 4567 Q14451 NM_001030002,NM_001242442,NM_001242443,NM_005310 GROWTH FACTOR RECEPTOR-BOUND PROTEIN 7 (PTHR11243:SF25) transmembrane receptor regulatory/adaptor protein

17q12-17q21.31 11796 P10827 NM_003250,NM_001190918,NM_001190919,NM_199334 THYROID HORMONE RECEPTOR ALPHA (PTHR24082:SF42) nuclear hormone receptor;receptor;nucleic acid binding

17q12-17q21.31 11109 Q969G3 NM_003079 SWI/SNF-RELATED MATRIX-ASSOCIATED ACTIN-DEPENDENT REGULATOR OF CHROMATIN SUBFAMILY E MEMBER 1 (PTHR13711:SF206) HMG box transcription factor;signaling molecule;chromatin/chromatin-binding protein

17q12-17q21.31 26707 Q6A162 NM_182497 KERATIN, TYPE I CYTOSKELETAL 40 (PTHR23239:SF90) structural protein;intermediate filament

17q12-17q21.31 16772 Q07627 NM_030967 KERATIN-ASSOCIATED PROTEIN 1-1-RELATED (PTHR23262:SF58)

17q12-17q21.31 18907 Q9BYQ7 NM_033060 KERATIN-ASSOCIATED PROTEIN 4-1 (PTHR23262:SF71)

17q12-17q21.31 16927 Q9BYQ3 NM_031962 KERATIN-ASSOCIATED PROTEIN 9-3 (PTHR23262:SF46)

17q12-17q21.31 6455 O76014 NM_003770 KERATIN, TYPE I CUTICULAR HA7 (PTHR23239:SF197) structural protein;intermediate filament

17q12-17q21.31 6423 P08779 NM_005557 KERATIN, TYPE I CYTOSKELETAL 16 (PTHR23239:SF105) structural protein;intermediate filament

17q12-17q21.31 9785 P51148 NM_201434,NM_004583,NM_001252039 RAS-RELATED PROTEIN RAB-5C (PTHR24073:SF366)

17q12-17q21.31 11366 P42229 NM_003152,NM_001288719,NM_001288718,NM_001288720 SIGNAL TRANSDUCER AND ACTIVATOR OF TRANSCRIPTION 5A (PTHR11801:SF47) transcription factor;nucleic acid binding

17q12-17q21.31 7632 P54802 NM_000263 ALPHA-N-ACETYLGLUCOSAMINIDASE (PTHR12872:SF1)

17q12-17q21.31 3526 Q92800 NM_001991 HISTONE-LYSINE N-METHYLTRANSFERASE EZH1 (PTHR22884:SF333) methyltransferase;DNA binding protein

17q12-17q21.31 24224 Q9NYV4 NM_015083,NM_016507 CYCLIN-DEPENDENT KINASE 12 (PTHR24056:SF126) non-receptor serine/threonine protein kinase;non-receptor tyrosine protein kinase;non-receptor serine/threonine protein kinase;non-receptor tyrosine protein kinase

17q12-17q21.31 28230 Q9BRT3 NM_032339 MIGRATION AND INVASION ENHANCER 1 (PTHR15124:SF15)

17q12-17q21.31 1744 Q99741 NM_001254 CELL DIVISION CONTROL PROTEIN 6 HOMOLOG (PTHR10763:SF26) replication origin binding protein

17q12-17q21.31 1608 P32248 NM_001838 C-C CHEMOKINE RECEPTOR TYPE 7; CCR7 ; (PTHR10489:SF635) G-protein coupled receptor;immunoglobulin receptor superfamily;protein phosphatase;protein phosphatase;immunoglobulin receptor superfamily;immunoglobulin superfamily cell adhesion molecule

17q12-17q21.31 18890 Q9BYR6 NM_033185 KERATIN-ASSOCIATED PROTEIN 3-3 (PTHR23260:SF1)

17q12-17q21.31 16777 Q9BYS1 NM_031957 KERATIN-ASSOCIATED PROTEIN 1-3-RELATED (PTHR23262:SF34)

17q12-17q21.31 18905 Q9BYT5 NM_033032 KERATIN-ASSOCIATED PROTEIN 2-2-RELATED (PTHR23262:SF6)

17q12-17q21.31 16776 Q9BQ66 NM_031854 KERATIN-ASSOCIATED PROTEIN 4-12-RELATED (PTHR23262:SF54)

17q12-17q21.31 18900 Q9BYR5 NM_033062 KERATIN-ASSOCIATED PROTEIN 4-12-RELATED (PTHR23262:SF54)

17q12-17q21.31 16926 Q9BYQ4 NM_031961 KERATIN-ASSOCIATED PROTEIN 9-2-RELATED (PTHR23262:SF56)

17q12-17q21.31 18915 A8MTY7 NM_001277332 KERATIN-ASSOCIATED PROTEIN 9-2-RELATED (PTHR23262:SF56)

17q12-17q21.31 6450 O76009 NM_004138 KERATIN, TYPE I CUTICULAR HA3-I (PTHR23239:SF98) structural protein;intermediate filament

17q12-17q21.31 6452 O76011 NM_021013 KERATIN, TYPE I CUTICULAR HA4 (PTHR23239:SF165) structural protein;intermediate filament

17q12-17q21.31 6447 P35527 NM_000226 KERATIN, TYPE I CYTOSKELETAL 9 (PTHR23239:SF96) structural protein;intermediate filament

17q12-17q21.31 6427 Q04695 NM_000422 KERATIN, TYPE I CYTOSKELETAL 17 (PTHR23239:SF180) structural protein;intermediate filament

17q12-17q21.31 19008 Q9NVR0 NM_018143 KELCH-LIKE PROTEIN 11 (PTHR24413:SF115) transcription cofactor;serine protease;serine protease;non-motor actin binding protein

17q12-17q21.31 27258 Q86VR2 NM_178126 PROTEIN FAM134C (PTHR28659:SF1)

17q12-17q21.31 8011 P78357 NM_003632 CONTACTIN-ASSOCIATED PROTEIN 1 (PTHR10127:SF4) transporter;apolipoprotein;membrane-bound signaling molecule;receptor;metalloprotease;serine protease;oxidase;metalloprotease;serine protease;extracellular matrix protein;enzyme modulator;cell adhesion molecule

17q12-17q21.31 13178 Q9UKT9 NM_183228,NM_183229,NM_012481,NM_183231,NM_183232,NM_183230,NM_001257410,NM_001257411,NM_001284514,NM_001257412,NM_001257413,NM_001257414,NM_001284515,NM_001257408,NM_001257409 ZINC FINGER PROTEIN AIOLOS (PTHR24404:SF23) KRAB box transcription factor

17q12-17q21.31 9864 P10276 NM_001145302,NM_001024809,NM_000964,NM_001145301 RETINOIC ACID RECEPTOR ALPHA (PTHR24082:SF115) nuclear hormone receptor;receptor;nucleic acid binding

17q12-17q21.31 30840 Q7Z3Y9 NM_181539 KERATIN, TYPE I CYTOSKELETAL 26 (PTHR23239:SF162) structural protein;intermediate filament


17q12-17q21.31 30842 Q7Z3Y7 NM_181535 KERATIN, TYPE I CYTOSKELETAL 28 (PTHR23239:SF215) structural protein;intermediate filament

17q12-17q21.31 6413 P13645 NM_000421 KERATIN, TYPE I CYTOSKELETAL 10 (PTHR23239:SF137) structural protein;intermediate filament

17q12-17q21.31 20412 P35900 NM_019010 KERATIN, TYPE I CYTOSKELETAL 20 (PTHR23239:SF167) structural protein;intermediate filament

17q12-17q21.31 6438 Q9C075 NM_015515 KERATIN, TYPE I CYTOSKELETAL 23 (PTHR23239:SF44) structural protein;intermediate filament

17q12-17q21.31 18911 Q9BYQ6 NM_033059 KERATIN-ASSOCIATED PROTEIN 4-11-RELATED (PTHR23262:SF1)

17q12-17q21.31 18908 Q9BYR4 NM_033187 KERATIN-ASSOCIATED PROTEIN 4-3 (PTHR23262:SF14)

17q12-17q21.31 18902 Q9BYQ2 NM_033191 KERATIN-ASSOCIATED PROTEIN 9-4 (PTHR23262:SF65)

17q12-17q21.31 18914 A8MVA2 NM_001277331 KERATIN-ASSOCIATED PROTEIN 9-2-RELATED (PTHR23262:SF56)

17q12-17q21.31 6448 Q15323 NM_002277 KERATIN, TYPE I CUTICULAR HA1 (PTHR23239:SF97) structural protein;intermediate filament

17q12-17q21.31 6449 Q14532 NM_002278 KERATIN, TYPE I CUTICULAR HA2 (PTHR23239:SF155) structural protein;intermediate filament

17q12-17q21.31 6207 P14923 NM_002230,NM_021991 JUNCTION PLAKOGLOBIN (PTHR23315:SF12) storage protein;signaling molecule;cytoskeletal protein;cell adhesion molecule

17q12-17q21.31 16946 Q92791 NM_006455 SYNAPTONEMAL COMPLEX PROTEIN SC65 (PTHR13986:SF4) nucleic acid binding;extracellular matrix protein

17q12-17q21.31 12392 Q99615 NM_001144766,NM_003315 DNAJ HOMOLOG SUBFAMILY C MEMBER 7 (PTHR24078:SF139)

17q12-17q21.31 4201 Q92830 NM_021078 HISTONE ACETYLTRANSFERASE KAT2A (PTHR22880:SF124) acetyltransferase;chromatin/chromatin-binding protein

17q12-17q21.31 6253 Q9UQ05 NM_012285 POTASSIUM VOLTAGE-GATED CHANNEL SUBFAMILY H MEMBER 4 (PTHR10217:SF378) cyclic nucleotide-gated ion channel;voltage-gated potassium channel;voltage-gated ion channel;cyclic nucleotide-gated ion channel

17q12-17q21.31 11367 P51692 NM_012448 SIGNAL TRANSDUCER AND ACTIVATOR OF TRANSCRIPTION 5B (PTHR11801:SF39) transcription factor;nucleic acid binding

17q12-17q21.31 5210 P14061 NM_000413 ESTRADIOL 17-BETA-DEHYDROGENASE 1 (PTHR24322:SF577) dehydrogenase;reductase

17q12-17q21.31 17928 Q9P2W1 NM_013290,NM_001256014,NM_001256015,NM_001256016,NM_016556 HOMOLOGOUS-PAIRING PROTEIN 2 HOMOLOG (PTHR15938:SF0) signaling molecule;DNA binding protein;enzyme modulator

17q12-17q21.31 26105 Q7Z736 NM_024927 PLECKSTRIN HOMOLOGY DOMAIN-CONTAINING FAMILY H MEMBER 3 (PTHR22903:SF17)

17q12-17q21.31 28122 Q9BRG1 NM_032353 VACUOLAR PROTEIN-SORTING-ASSOCIATED PROTEIN 25 (PTHR13149:SF0)

17q12-17q21.31 9234 Q15648 NM_004774 MEDIATOR OF RNA POLYMERASE II TRANSCRIPTION SUBUNIT 1 (PTHR12881:SF10)

17q12-17q21.31 11610 O15273 NM_003673 TELETHONIN (PTHR15143:SF0) cytoskeletal protein

17q12-17q21.31 23719 Q96FM1 NM_033419 POST-GPI ATTACHMENT TO PROTEINS FACTOR 3 (PTHR13148:SF0)

17q12-17q21.31 22963 O75448 NM_001267797,NM_014815,NM_001079518 MEDIATOR OF RNA POLYMERASE II TRANSCRIPTION SUBUNIT 24 (PTHR12898:SF1)

17q12-17q21.31 7962 P20393 NM_021724 NUCLEAR RECEPTOR SUBFAMILY 1 GROUP D MEMBER 1 (PTHR24082:SF113) nuclear hormone receptor;receptor;nucleic acid binding

17q12-17q21.31 11989 P11388 NM_001067 DNA TOPOISOMERASE 2-ALPHA (PTHR10169:SF46) DNA topoisomerase;isomerase;enzyme modulator

17q12-17q21.31 30839 Q7Z3Z0 NM_181534 KERATIN, TYPE I CYTOSKELETAL 25 (PTHR23239:SF160) structural protein;intermediate filament

17q12-17q21.31 6414 Q99456 NM_000223 KERATIN, TYPE I CYTOSKELETAL 12 (PTHR23239:SF115) structural protein;intermediate filament

17q12-17q21.31 16775 Q9BYU5 NM_001123387 KERATIN-ASSOCIATED PROTEIN 2-1-RELATED (PTHR23262:SF7)

17q12-17q21.31 16928 Q9BYR3 NM_032524 KERATIN-ASSOCIATED PROTEIN 4-4 (PTHR23262:SF57)

17q12-17q21.31 18912 A8MXZ3 NM_001190460 KERATIN-ASSOCIATED PROTEIN 9-1 (PTHR23262:SF13)

17q12-17q21.31 17231 Q9BYQ0 NM_031963 KERATIN-ASSOCIATED PROTEIN 9-2-RELATED (PTHR23262:SF56)

17q12-17q21.31 6454 O76013 NM_003771 KERATIN, TYPE I CUTICULAR HA6 (PTHR23239:SF204) structural protein;intermediate filament

17q12-17q21.31 18829 Q6JEL2 NM_152467 KELCH-LIKE PROTEIN 10 (PTHR24412:SF172)

17q12-17q21.31 25280 Q96NG3 NM_031421 TETRATRICOPEPTIDE REPEAT PROTEIN 25 (PTHR23040:SF1)

17q12-17q21.31 2158 P09543 NM_033133 2',3'-CYCLIC-NUCLEOTIDE 3'-PHOSPHODIESTERASE (PTHR10156:SF0) phosphodiesterase

17q12-17q21.31 30589 Q9BQS6 NM_033194 HEAT SHOCK PROTEIN BETA-9 (PTHR11527:SF111) chaperone

17q12-17q21.31 4847 O43612 NM_001524 OREXIN (PTHR15173:SF2)

17q12-17q21.31 24438 Q8N2G8 NM_032484,NM_001142623 GH3 DOMAIN-CONTAINING PROTEIN (PTHR31901:SF5)

17q12-17q21.31 29932 Q13057 NM_001042532,NM_025233,NM_001042529 BIFUNCTIONAL COENZYME A SYNTHASE (PTHR10695:SF20) kinase;kinase

17q12-17q21.31 12419 Q9NRH3 NM_016437 TUBULIN GAMMA-2 CHAIN (PTHR11588:SF79) tubulin

17q12-17q21.31 9844 O60895 NM_005854 RECEPTOR ACTIVITY-MODIFYING PROTEIN 2 (PTHR14076:SF9) receptor

17q12-17q21.31 3430 P04626 NM_004448 RECEPTOR TYROSINE-PROTEIN KINASE ERBB-2 (PTHR24416:SF137)

17q12-17q21.31 27905 Q68DK7 NM_001012241 MALE-SPECIFIC LETHAL 1 HOMOLOG (PTHR21656:SF2)

17q12-17q21.31 24352 Q8IZW8 NM_032865 TENSIN-4 (PTHR12305:SF53) protein phosphatase;protein phosphatase

17q12-17q21.31 16779 Q9BYR7 NM_031959 KERATIN-ASSOCIATED PROTEIN 3-2 (PTHR23260:SF4)

17q12-17q21.31 18909 Q9BYQ5 NM_030976 KERATIN-ASSOCIATED PROTEIN 4-6-RELATED (PTHR23262:SF60)

17q12-17q21.31 18169 Q96AY3 NM_021939 PEPTIDYL-PROLYL CIS-TRANS ISOMERASE FKBP10 (PTHR10516:SF247) isomerase;chaperone;calcium-binding protein

17q12-17q21.31 12417 P23258 NM_001070 TUBULIN GAMMA-1 CHAIN (PTHR11588:SF62) tubulin

17q12-17q21.31 9160 P11086 NM_002686 PHENYLETHANOLAMINE N-METHYLTRANSFERASE (PTHR10867:SF18) methyltransferase

17q12-17q21.31 20678 Q6X784 NM_198844,NM_199321 ZONA PELLUCIDA-BINDING PROTEIN 2 (PTHR15443:SF4)

17q12-17q21.31 16038 Q8N138 NM_139280 ORM1-LIKE PROTEIN 3 (PTHR12665:SF11)

17q12-17q21.31 30923 Q8TF74 NM_133264 WAS/WASL-INTERACTING PROTEIN FAMILY MEMBER 2 (PTHR23202:SF3)

17q12-17q21.31 5473 P22692 NM_001552 INSULIN-LIKE GROWTH FACTOR-BINDING PROTEIN 4 (PTHR11551:SF7)

17q12-17q21.31 30841 Q7Z3Y8 NM_181537 KERATIN, TYPE I CYTOSKELETAL 27 (PTHR23239:SF120) structural protein;intermediate filament

17q12-17q21.31 16771 Q8IUG1 NM_030966 KERATIN-ASSOCIATED PROTEIN 1-3-RELATED (PTHR23262:SF34)

17q12-17q21.31 18906 P0C7H8 NM_001165252,NM_033184 KERATIN-ASSOCIATED PROTEIN 2-2-RELATED (PTHR23262:SF6)

17q12-17q21.31 17230 Q9BYQ9 NM_031960 KERATIN-ASSOCIATED PROTEIN 4-8 (PTHR23262:SF67)

17q12-17q21.31 18899 Q9BYR2 NM_033188 KERATIN-ASSOCIATED PROTEIN 4-5 (PTHR23262:SF49)

17q12-17q21.31 18916 A8MUX0 NM_001146182 KERATIN-ASSOCIATED PROTEIN 16-1 (PTHR23262:SF55)

17q12-17q21.31 6453 Q92764 NM_002280 KERATIN, TYPE I CUTICULAR HA5 (PTHR23239:SF193) structural protein;intermediate filament

17q12-17q21.31 3249 P41567 NM_005801 EUKARYOTIC TRANSLATION INITIATION FACTOR 1 (PTHR10388:SF10)

17q12-17q21.31 4812 P54257 NM_001079870,NM_001079871,NM_177977 HUNTINGTIN-ASSOCIATED PROTEIN 1 (PTHR15751:SF14) membrane traffic protein

17q12-17q21.31 115 P53396 NM_001096,NM_198830 ATP-CITRATE SYNTHASE (PTHR23118:SF0) transferase;lyase;ligase

17q12-17q21.31 17898 Q9NYR9 NM_017595,NM_001144927,NM_001144928,NM_001001349,NM_001144929 NF-KAPPA-B INHIBITOR-INTERACTING RAS-LIKE PROTEIN 2 (PTHR24070:SF236) small GTPase

17q12-17q21.31 11364 P40763 NM_003150,NM_139276,NM_213662 SIGNAL TRANSDUCER AND ACTIVATOR OF TRANSCRIPTION 3 (PTHR11801:SF2) transcription factor;nucleic acid binding

17q12-17q21.31 7763 Q15784 NM_006160 NEUROGENIC DIFFERENTIATION FACTOR 2 (PTHR19290:SF83) basic helix-loop-helix transcription factor;nuclease

17q12-17q21.31 23690 Q8TAX9 NM_018530,NM_001165958,NM_001165959,NM_001042471 GASDERMIN-B (PTHR16399:SF20)

17q12-17q21.31 9560 O43242 NM_002809 26S PROTEASOME NON-ATPASE REGULATORY SUBUNIT 3 (PTHR10758:SF2) enzyme modulator

17q12-17q21.31 17040 O15234 NM_007359 PROTEIN CASC3 (PTHR13434:SF0)

17q12-17q21.31 19147 Q8N144 NM_152219 GAP JUNCTION DELTA-3 PROTEIN (PTHR11984:SF5) gap junction

17q12-17q21.31 32971 Q6A163 NM_213656 KERATIN, TYPE I CYTOSKELETAL 39 (PTHR23239:SF106) structural protein;intermediate filament

17q12-17q21.31 18904 P0C5Y4 NM_001257305 KERATIN-ASSOCIATED PROTEIN 1-1-RELATED (PTHR23262:SF58)

17q12-17q21.31 18891 Q9BYR9 NM_001165252,NM_033184 KERATIN-ASSOCIATED PROTEIN 2-1-RELATED (PTHR23262:SF7)

17q12-17q21.31 18910 Q9BYQ8 NM_001146041 KERATIN-ASSOCIATED PROTEIN 4-11-RELATED (PTHR23262:SF1)

17q12-17q21.31 34211 A8MX34 NM_001257309 KERATIN-ASSOCIATED PROTEIN 29-1 (PTHR32378:SF9)

17q12-17q21.31 6451 Q14525 NM_002279 KERATIN, TYPE I CUTICULAR HA3-II (PTHR23239:SF99) structural protein;intermediate filament

17q12-17q21.31 6421 P19012 NM_002275 KERATIN, TYPE I CYTOSKELETAL 15 (PTHR23239:SF164) structural protein;intermediate filament

17q12-17q21.31 6416 P02533 NM_000526 KERATIN, TYPE I CYTOSKELETAL 14 (PTHR23239:SF201) structural protein;intermediate filament

17q12-17q21.31 29517 Q96C10 NM_024119 ATP-DEPENDENT RNA HELICASE DHX58-RELATED (PTHR14074:SF7)

17q12-17q21.31 9688 Q6NZI2 NM_012232 POLYMERASE I AND TRANSCRIPT RELEASE FACTOR (PTHR15240:SF3) transcription factor

17q12-17q21.31 865 Q93050 NM_005177,NM_001130020,NM_001130021 V-TYPE PROTON ATPASE 116 KDA SUBUNIT A ISOFORM 1 (PTHR11629:SF68) ATP synthase


17q12-17q21.31 11645 Q9UH92 NM_198205,NM_198204,NM_170607 MAX-LIKE PROTEIN X (PTHR15741:SF25) basic helix-loop-helix transcription factor;nucleic acid binding