a fast and accurate method for genome-wide time-to-event … · 2020. 6. 25. · the american...
The American Journal of Human Genetics, Volume 107
Supplemental Data
A Fast and Accurate Method for Genome-Wide
Time-to-Event Data Analysis and Its Application
to UK Biobank
Wenjian Bi, Lars G. Fritsche, Bhramar Mukherjee, Sehee Kim, and Seunggeun Lee
Figure S1. Comparisons between Score test and SPACox-NoSPA based on $\widehat{\mathrm{Var}}_{\mathrm{emp}}(S)$.
(A) Comparison between $\widehat{\mathrm{Var}}(S)$ estimated from the observed information matrix and the
empirical variance $\widehat{\mathrm{Var}}_{\mathrm{emp}}(S)$, (B) comparison of p values between the Score test and
SPACox-NoSPA, (C) QQ plot of -log10(p values) of the Score test and SPACox-NoSPA. We simulated
$2 \times 10^5$ replications under three event rates (ERs) of 1%, 10%, and 50%. The sample size was 4,000
and we considered common variants (MAF = 0.3, expected MAC = 2,400) and low-frequency variants
(MAF = 0.01, expected MAC = 80). MAF: Minor Allele Frequency; MAC: Minor Allele Count. The Score
test and SPACox-NoSPA use $\widehat{\mathrm{Var}}(S)$ and $\widehat{\mathrm{Var}}_{\mathrm{emp}}(S)$, respectively, to standardize the
score statistics and calculate p values.
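The expected MACs quoted in this caption follow from expected MAC = 2 × n × MAF for diploid genotypes, and both tests turn a standardized score statistic into a two-sided p value through a normal approximation, differing only in the variance estimate plugged in. The following is a minimal Python sketch of those two calculations; the function names are illustrative and are not taken from the SPACox software.

```python
import math


def expected_mac(n_samples: int, maf: float) -> float:
    """Expected minor allele count for diploid genotypes: 2 * n * MAF."""
    return 2 * n_samples * maf


def normal_p_value(score: float, variance: float) -> float:
    """Two-sided p value of a score statistic under a normal approximation.

    `variance` is whichever variance estimate the test uses (for example,
    the information-based or the empirical estimate described above).
    """
    z = score / math.sqrt(variance)
    # P(|Z| > |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))
```

With n = 4,000, `expected_mac` reproduces the values in the caption: about 2,400 for MAF = 0.3 and about 80 for MAF = 0.01.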
Figure S2. Comparisons between Score test and SPACox-NoSPA based on $\widehat{\mathrm{Var}}_{\mathrm{emp}}(S)$|�̇�.
(A) Comparison between $\widehat{\mathrm{Var}}(S)$ estimated from the observed information matrix and the
empirical variance $\widehat{\mathrm{Var}}_{\mathrm{emp}}(S)$|�̇�, (B) comparison of p values between the Score test and
SPACox-NoSPA with variance $\widehat{\mathrm{Var}}_{\mathrm{emp}}(S)$|�̇�, (C) QQ plot of -log10(P) of the Score test and
SPACox-NoSPA. We simulated $2 \times 10^5$ replications under three event rates (ERs) of 1%, 10%, and
50%. The sample size was 4,000 and we considered common variants (MAF = 0.3, expected MAC =
2,400) and low-frequency variants (MAF = 0.01, expected MAC = 80). MAF: Minor Allele Frequency;
MAC: Minor Allele Count. The Score test and SPACox-NoSPA use $\widehat{\mathrm{Var}}(S)$ and
$\widehat{\mathrm{Var}}_{\mathrm{emp}}(S)$|�̇�, respectively, to standardize the score statistics and calculate p values.
Figure S3. Empirical Type I Error Rates of Wald Test Based on Signs of $\hat{\gamma}$.
From left to right, the plots considered 5 event rates (ERs) of 0.2%, 1%, 10%, 20%, and 50%. Top
and bottom plots are for empirical type I error rates at $\alpha = 5 \times 10^{-5}$ and $5 \times 10^{-8}$, respectively.
Sample size $n$ = 100,000. For each pair of MAF and event rate, we simulated $10^9$ replications.
Figure S4. Empirical Powers of SPACox, Firth, Wald, Score, and SPACC Tests when $\gamma$ is
Negative.
From left to right, the plots considered 3 MAFs of 0.01, 0.05, and 0.3. From top to bottom, the
plots considered 5 event rates (ERs) of 0.2%, 1%, 10%, 20%, and 50%. Empirical powers were
evaluated at the significance level $5 \times 10^{-8}$. Sample size $n$ = 100,000. For each pair of MAF and
event rate, we simulated 1,000 replications.
Figure S5. Empirical Powers of SPACox, Firth, Wald, Score, and SPACC Tests when Testing
Rare Variants with MAF of 0.001.
From top to bottom, the plots considered 5 event rates (ERs) of 0.2%, 1%, 10%, 20%, and 50%.
Empirical powers were evaluated at the significance level $5 \times 10^{-8}$. Sample size $n$ = 100,000. For
each event rate, we simulated 1,000 replications.
Figure S6. QQ Plots for 12 Diseases from UK Biobank.
QQ plots were based on p values calculated with the SPACox method. The red line represents the
genome-wide significance level $\alpha = 5 \times 10^{-8}$.
Figure S7. Manhattan Plots for 12 Diseases from UK Biobank (SPACox-NoSPA).
Manhattan plots were based on p values calculated with the SPACox-NoSPA method. The red line
represents the genome-wide significance level $\alpha = 5 \times 10^{-8}$. SPACox-NoSPA uses a normal
approximation (based on the empirical variance $\widehat{\mathrm{Var}}_{\mathrm{emp}}(S)$) for all SNPs.
Figure S8. QQ Plots for 12 Diseases from UK Biobank (SPACox-NoSPA).
QQ plots were based on p values calculated with the SPACox-NoSPA method. The red line
represents the genome-wide significance level $\alpha = 5 \times 10^{-8}$. SPACox-NoSPA uses a normal
approximation (based on the empirical variance $\widehat{\mathrm{Var}}_{\mathrm{emp}}(S)$) for all SNPs.
Figure S9. QQ Plots and Manhattan Plots for Three Diseases (Essential Hypertension,
Asthma, and Alzheimer's Disease) from UK Biobank (Wald).
Upper plots are QQ plots and lower plots are Manhattan plots. P values were calculated
from a hybrid-version Wald test in which the Wald test is used when the SPACox p value is < 5e-3.
The Wald test was performed via the coxph function in the R package survival. The red line
represents the genome-wide significance level $\alpha = 5 \times 10^{-8}$.
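The hybrid strategy in this caption can be read as a screening rule: the expensive Wald test is only run for variants whose SPACox p value falls below 5e-3. The Python sketch below illustrates that control flow under one assumption that the caption does not state, namely that variants failing the screen simply keep their SPACox p value. The function name and the `wald_test` callback are hypothetical placeholders, not part of the SPACox or survival packages.

```python
from typing import Callable

SCREEN_THRESHOLD = 5e-3  # SPACox p value cutoff quoted in the caption


def hybrid_wald_p_value(spacox_p: float, wald_test: Callable[[], float]) -> float:
    """Return the Wald p value only for variants that pass the SPACox screen.

    `wald_test` is a zero-argument callable that fits the Cox model and
    returns the Wald p value; it is only invoked when needed.
    Assumption: variants that fail the screen keep their SPACox p value.
    """
    if spacox_p < SCREEN_THRESHOLD:
        return wald_test()
    return spacox_p
```

Passing the Wald test as a callable keeps the costly model fit lazy, which is the point of screening first.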
Figure S10. Hazard Ratios and MAFs of the 611 Significant Loci from UK Biobank.
SPACox identified 611 significant loci for the 12 diseases. Variants within a region of 200 kb or
in the same gene were treated as the same locus.
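The locus definition above (variants within 200 kb of each other, or in the same gene, count as one locus) can be sketched as a single pass over position-sorted variants. The Python example below is only an illustration: the caption does not say how the 200 kb window is anchored, so the sketch assumes each variant is compared with the previous variant on the same chromosome, and the `Variant` type is hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    rsid: str
    chrom: str
    pos: int   # base-pair position
    gene: str  # annotated gene name


def group_into_loci(variants: List[Variant], window_bp: int = 200_000) -> List[List[Variant]]:
    """Greedily merge position-sorted variants into loci.

    Assumption: a variant joins the current locus when it is on the same
    chromosome as the previous variant and is either within `window_bp`
    of it or annotated to the same gene.
    """
    loci: List[List[Variant]] = []
    for v in sorted(variants, key=lambda x: (x.chrom, x.pos)):
        prev = loci[-1][-1] if loci else None
        if (
            prev is not None
            and prev.chrom == v.chrom
            and (v.pos - prev.pos <= window_bp or v.gene == prev.gene)
        ):
            loci[-1].append(v)
        else:
            loci.append([v])
    return loci
```

Counting `len(group_into_loci(...))` over all genome-wide significant variants would give the number of loci under this reading of the rule.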
Figure S11. P values of SPACC and SPACox for the 611 Significant Loci from UK
Biobank.
(A) SPACC uses the top 4 PCs, sex, and birth year as covariates, (B) SPACC uses the top 4 PCs, sex,
and time-to-event as covariates. SPACox identified 611 significant loci for the 12 diseases. The
red line represents the genome-wide significance level $\alpha = 5 \times 10^{-8}$. PC: principal component.
Figure S12. Cumulative Risk Curves of the most Significant SNPs for 12 diseases.
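Table S1. Empirical Type I Error Rates of SPACox, SPACox-NoSPA, Wald, Firth, and Score
Tests.

| Significance Level | Event Rate | MAF | SPACox | SPACox-NoSPA | Firth | Wald | Score |
|---|---|---|---|---|---|---|---|
| 5.00E-05 | 0.2% | 0.001 | 5.24E-05 | 0.004 | 6.43E-05 | 0.0006 | 0.0008 |
| 5.00E-05 | 0.2% | 0.01 | 4.72E-05 | 0.0004 | 4.07E-05 | 0.0003 | 0.0005 |
| 5.00E-05 | 0.2% | 0.3 | 4.32E-05 | 5.08E-05 | 4.94E-05 | 4.44E-05 | 5.13E-05 |
| 5.00E-05 | 1% | 0.001 | 4.90E-05 | 0.0007 | 5.01E-05 | 0.0004 | 0.0007 |
| 5.00E-05 | 1% | 0.01 | 4.88E-05 | 0.0001 | 4.84E-05 | 0.0001 | 0.0001 |
| 5.00E-05 | 1% | 0.3 | 4.83E-05 | 4.96E-05 | 4.96E-05 | 4.88E-05 | 5.01E-05 |
| 5.00E-05 | 10% | 0.001 | 5.02E-05 | 7.43E-05 | 5.22E-05 | 0.0001 | 0.0002 |
| 5.00E-05 | 10% | 0.01 | 4.93E-05 | 5.16E-05 | 4.98E-05 | 5.84E-05 | 6.16E-05 |
| 5.00E-05 | 10% | 0.3 | 5.02E-05 | 5.03E-05 | 5.04E-05 | 5.04E-05 | 5.05E-05 |
| 5.00E-05 | 20% | 0.001 | 5.02E-05 | 6.89E-05 | 5.27E-05 | 0.0001 | 0.0001 |
| 5.00E-05 | 20% | 0.01 | 4.92E-05 | 5.10E-05 | 4.98E-05 | 5.53E-05 | 5.68E-05 |
| 5.00E-05 | 20% | 0.3 | 4.95E-05 | 4.96E-05 | 4.99E-05 | 4.98E-05 | 4.99E-05 |
| 5.00E-05 | 50% | 0.001 | 5.01E-05 | 8.27E-05 | 5.22E-05 | 7.94E-05 | 8.69E-05 |
| 5.00E-05 | 50% | 0.01 | 5.01E-05 | 5.32E-05 | 5.00E-05 | 5.28E-05 | 5.34E-05 |
| 5.00E-05 | 50% | 0.3 | 5.04E-05 | 5.05E-05 | 5.00E-05 | 5.00E-05 | 5.01E-05 |
| 5.00E-08 | 0.2% | 0.001 | 2.11E-08 | 0.0005 | 6.68E-08 | 3.76E-05 | 0.0004 |
| 5.00E-08 | 0.2% | 0.01 | 3.94E-08 | 1.11E-05 | 4.31E-08 | 3.22E-06 | 1.25E-05 |
| 5.00E-08 | 0.2% | 0.3 | 2.13E-08 | 4.25E-08 | 3.30E-08 | 4.54E-08 | 5.64E-08 |
| 5.00E-08 | 1% | 0.001 | 4.75E-08 | 2.61E-05 | 7.10E-08 | 9.14E-06 | 5.04E-05 |
| 5.00E-08 | 1% | 0.01 | 4.87E-08 | 8.48E-07 | 5.15E-08 | 6.60E-07 | 1.22E-06 |
| 5.00E-08 | 1% | 0.3 | 3.39E-08 | 3.74E-08 | 3.65E-08 | 4.36E-08 | 4.64E-08 |
| 5.00E-08 | 10% | 0.001 | 4.61E-08 | 2.02E-07 | 5.16E-08 | 1.10E-06 | 2.11E-06 |
| 5.00E-08 | 10% | 0.01 | 4.44E-08 | 4.61E-08 | 4.43E-08 | 1.23E-07 | 1.42E-07 |
| 5.00E-08 | 10% | 0.3 | 5.25E-08 | 5.25E-08 | 5.39E-08 | 5.39E-08 | 5.53E-08 |
| 5.00E-08 | 20% | 0.001 | 3.72E-08 | 1.40E-07 | 6.47E-08 | 6.07E-07 | 9.47E-07 |
| 5.00E-08 | 20% | 0.01 | 6.50E-08 | 7.55E-08 | 6.50E-08 | 8.16E-08 | 9.40E-08 |
| 5.00E-08 | 20% | 0.3 | 4.82E-08 | 4.82E-08 | 5.32E-08 | 4.76E-08 | 4.76E-08 |
| 5.00E-08 | 50% | 0.001 | 3.32E-08 | 3.35E-07 | 4.19E-08 | 2.74E-07 | 3.49E-07 |
| 5.00E-08 | 50% | 0.01 | 4.36E-08 | 6.11E-08 | 4.61E-08 | 7.46E-08 | 7.60E-08 |
| 5.00E-08 | 50% | 0.3 | 5.71E-08 | 5.71E-08 | 4.85E-08 | 4.71E-08 | 4.85E-08 |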
Table S2. UK Biobank Inpatient Data Provider Information.
The ICD9 and ICD10 columns give the periods covered by each version of the International
Classification of Disease.

| Hospital Admissions (Inpatients) | Data Provider | ICD9 | ICD10 | Censoring | Total Sample Size | Sample Size in UK Biobank Analysis |
|---|---|---|---|---|---|---|
| Hospital Episode Statistics for England | NHS Digital | | 1996 onwards | 31 March 2017 | 366,439 | 248,992 |
| Scottish Morbidity Record | Information and Statistics Division, Scotland | 1987-1996 | 1996 onwards | 31 October 2016 | 31,135 | 22,193 |
| Patient Episode Database for Wales | Secure Anonymized Information Linkage, Wales | | 1999 onwards | 29 February 2016 | 16,115 | 11,686 |
Table S3: Summary Information of the 38 Loci Significant Based on SPACox but not Significant Based on SPACC. We use a
Genome-wide Significance Level $5 \times 10^{-8}$. (UTR = untranslated region, ncRNA = non-coding RNA)

| Phenotype | RSID | CHR | REF | ALT | MAF | Gene.refGene | Func.refGene | Cox PH Wald HR (a) | Cox PH Wald SE (a) | Cox PH Wald p value | SPACC p value | SPACox p value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Asthma | rs5758324 | 22 | G | T | 0.2145 | TEF | intronic | 1.06 | 0.011 | 2.49E-08 | 1.19E-07 | 2.39E-08 |
| Asthma | rs2844649 | 6 | A | G | 0.1135 | SFTA2; DPCR1 [49] | intergenic | 1.08 | 0.014 | 5.03E-08 | 1.63E-07 | 4.45E-08 |
| Cardiac dysrhythmias | rs412768 | 14 | A | G | 0.3109 | MYH6 [45] | intronic | 1.05 | 0.009 | 2.42E-08 | 5.51E-08 | 2.46E-08 |
| Cataract | rs149821426 | 4 | G | A | 0.0033 | LINC02429; MIR548AG1 | intergenic | 1.49 | 0.071 | 1.87E-08 | 6.64E-08 | 4.83E-08 |
| Cataract | rs1043618 | 6 | G | C | 0.3792 | HSPA1A [46] | UTR5 | 1.05 | 0.010 | 3.76E-08 | 7.12E-08 | 3.82E-08 |
| Coronary atherosclerosis | rs9515203 | 13 | T | C | 0.2641 | COL4A2 [41] | intronic | 0.94 | 0.012 | 1.22E-08 | 5.99E-08 | 1.28E-08 |
| Coronary atherosclerosis | rs112043140 | 3 | C | T | 0.2211 | LRRC2 | intronic | 1.07 | 0.012 | 4.59E-08 | 9.66E-08 | 4.88E-08 |
| Essential hypertension | rs2304615 | 19 | A | G | 0.2064 | REXO1 | intronic | 0.97 | 0.006 | 4.87E-08 | 5.03E-08 | 4.30E-08 |
| Essential hypertension | rs752520449 | 12 | A | G | 0.0005 | LINC02400; GXYLT1 | intergenic | 1.82 | 0.106 | 1.85E-08 | 5.75E-08 | 4.65E-08 |
| Essential hypertension | rs10838835 | 11 | A | G | 0.1494 | OR4B1; OR4X2 | intergenic | 0.96 | 0.007 | 1.43E-08 | 6.27E-08 | 1.15E-08 |
| Essential hypertension | rs7763581 | 6 | T | G | 0.4866 | FOXC1 | downstream | 0.97 | 0.005 | 3.06E-08 | 6.36E-08 | 2.93E-08 |
| Essential hypertension | rs73094438 | 7 | T | C | 0.0069 | LOC401324; HERPUD2 | intergenic | 1.19 | 0.031 | 1.34E-08 | 6.86E-08 | 1.85E-08 |
| Essential hypertension | rs2814949 | 6 | A | G | 0.3595 | C6orf106 | intronic | 1.03 | 0.005 | 9.44E-09 | 6.90E-08 | 9.01E-09 |
| Essential hypertension | rs76702537 | 12 | C | A | 0.0316 | LINC02468; PDE3A [47] | intergenic | 0.92 | 0.015 | 1.91E-08 | 6.93E-08 | 1.35E-08 |
| Essential hypertension | rs9932220 | 16 | G | A | 0.2181 | SALL1; LINC01571 [44] | intergenic | 0.97 | 0.006 | 2.48E-08 | 6.98E-08 | 2.13E-08 |
| Essential hypertension | rs10828266 | 10 | A | G | 0.2844 | DNAJC1 | intronic | 0.97 | 0.006 | 1.01E-08 | 7.21E-08 | 9.89E-09 |
| Essential hypertension | rs2725371 | 8 | A | G | 0.3044 | PURG | UTR3 | 0.97 | 0.006 | 2.92E-08 | 7.51E-08 | 2.89E-08 |
| Essential hypertension | rs9683944 | 4 | A | G | 0.2034 | LINC02510; PCDH18 | intergenic | 1.04 | 0.006 | 3.64E-08 | 8.76E-08 | 3.63E-08 |
| Essential hypertension | rs2282143 | 6 | C | T | 0.0155 | SLC22A1 | exonic | 1.12 | 0.020 | 1.77E-08 | 9.23E-08 | 2.52E-08 |
| Essential hypertension | rs4678408 | 3 | A | G | 0.3708 | NME9; MRAS | intergenic | 0.97 | 0.005 | 9.43E-09 | 9.89E-08 | 9.20E-09 |
| Essential hypertension | rs10756197 | 9 | G | A | 0.4848 | PTPRD-AS2; TYRP1 [43] | intergenic | 1.03 | 0.005 | 2.71E-08 | 1.00E-07 | 2.64E-08 |
| Essential hypertension | rs433750 | 5 | T | G | 0.4215 | LOC101927078 | ncRNA_intronic | 0.97 | 0.005 | 3.63E-08 | 1.05E-07 | 3.41E-08 |
| Essential hypertension | rs11657730 | 17 | C | T | 0.3621 | LOC100130370; BAHCC1 | intergenic | 0.97 | 0.005 | 1.97E-08 | 1.07E-07 | 1.78E-08 |
| Essential hypertension | rs10067451 | 5 | G | A | 0.1094 | LINC00461 | ncRNA_intronic | 0.95 | 0.008 | 3.03E-08 | 1.12E-07 | 2.33E-08 |
| Essential hypertension | rs35275911 | 7 | G | C | 0.1947 | LOC101926943 | ncRNA_intronic | 1.04 | 0.006 | 2.67E-08 | 1.29E-07 | 2.75E-08 |
| Essential hypertension | rs13062241 | 3 | C | T | 0.4700 | FGD5 [39] | intronic | 0.97 | 0.005 | 5.20E-09 | 1.39E-07 | 5.06E-09 |
| Essential hypertension | rs2046301 | 11 | A | G | 0.1426 | OR4C3; OR4C45 | intergenic | 0.96 | 0.007 | 3.47E-08 | 1.55E-07 | 2.79E-08 |
| Essential hypertension | rs2517521 | 6 | A | G | 0.1453 | HCG22 [48] | UTR3 | 1.04 | 0.007 | 5.02E-08 | 2.04E-07 | 4.26E-08 |
| Essential hypertension | rs28724242 | 6 | A | G | 0.2964 | HLA-DQB1 [42] | intronic | 1.03 | 0.006 | 4.36E-08 | 2.05E-07 | 4.26E-08 |
| Essential hypertension | rs9603420 | 13 | G | T | 0.4888 | B3GLCT; RXFP2 | intergenic | 1.03 | 0.005 | 2.41E-08 | 4.64E-07 | 2.27E-08 |
| Hyperlipidemia | rs2517521 | 6 | A | G | 0.1453 | HCG22 | UTR3 | 1.06 | 0.011 | 3.70E-08 | 1.01E-07 | 3.39E-08 |
| Osteoarthrosis | rs114786346 | 12 | T | C | 0.0892 | RFLNA | intronic | 1.08 | 0.014 | 3.78E-08 | 6.16E-08 | 4.19E-08 |
| Osteoarthrosis | rs62063281 | 17 | A | G | 0.2219 | MAPT | intronic | 1.06 | 0.010 | 2.61E-08 | 6.18E-08 | 2.71E-08 |
| Osteoarthrosis | rs1724411 | 17 | T | C | 0.2303 | LRRC37A4P; MAPK8IP1P2 | intergenic | 1.06 | 0.010 | 2.64E-08 | 6.68E-08 | 2.73E-08 |
| Osteoarthrosis | rs4841411 | 8 | C | G | 0.4298 | RP1L1; MIR4286 | intergenic | 1.05 | 0.008 | 5.18E-08 | 7.59E-08 | 4.88E-08 |
| Osteoarthrosis | rs2532386 | 17 | G | A | 0.2253 | KANSL1; LRRC37A | intergenic | 1.06 | 0.010 | 4.29E-08 | 9.18E-08 | 4.39E-08 |
| Type 2 diabetes | rs646123 | 17 | G | A | 0.2867 | MLX | intronic | 1.06 | 0.011 | 3.11E-08 | 6.30E-08 | 3.33E-08 |
| Type 2 diabetes | rs146886108 | 5 | C | T | 0.0071 | ANKH | exonic | 0.66 | 0.074 | 3.73E-08 | 6.64E-08 | 3.55E-08 |

(a) HR: hazard ratio, exp($\gamma$); SE: standard error.
Table S4: Summary Information of the 17 Loci Significant Based on SPACC but not Significant Based on SPACox. We use a
Genome-wide Significance Level $5 \times 10^{-8}$. (UTR = untranslated region, ncRNA = non-coding RNA)

| Phenotype | RSID | CHR | REF | ALT | MAF | Gene.refGene | Func.refGene | Cox PH Wald HR (a) | Cox PH Wald SE (a) | Cox PH Wald p value | SPACC p value | SPACox p value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Type 2 diabetes | rs3810291 | 19 | G | A | 0.3225 | ZC3H4 | UTR3 | 1.06 | 0.011 | 8.04E-08 | 4.24E-08 | 7.94E-08 |
| Hyperlipidemia | rs2287029 | 19 | C | T | 0.1885 | DNM2 | intronic | 0.95 | 0.010 | 5.04E-08 | 2.38E-08 | 5.01E-08 |
| Essential hypertension | rs11857726 | 15 | G | A | 0.3793 | CHP1 | intronic | 1.03 | 0.005 | 5.23E-08 | 4.74E-08 | 5.12E-08 |
| Essential hypertension | rs12902197 | 15 | T | A | 0.1873 | MIR4713HG | ncRNA_intronic | 1.04 | 0.007 | 5.89E-08 | 1.25E-08 | 5.59E-08 |
| Essential hypertension | rs11249906 | 8 | T | G | 0.3053 | PPP1R3B; LOC101929128 | intergenic | 1.03 | 0.006 | 5.91E-08 | 1.65E-08 | 5.61E-08 |
| Essential hypertension | rs11998678 | 8 | C | T | 0.4651 | CTSB; DEFB136 | intergenic | 1.03 | 0.005 | 6.47E-08 | 4.42E-09 | 6.00E-08 |
| Essential hypertension | rs7115856 | 11 | A | C | 0.4633 | HSD17B12 | intronic | 0.97 | 0.005 | 7.56E-08 | 3.09E-08 | 7.21E-08 |
| Essential hypertension | rs142076278 | 16 | A | G | 0.0148 | LINC01571; C16orf97 | intergenic | 1.12 | 0.022 | 1.04E-07 | 3.63E-08 | 1.24E-07 |
| Essential hypertension | rs2867695 | 4 | C | T | 0.1069 | ANTXR2; PRDM8 | intergenic | 0.96 | 0.008 | 1.10E-07 | 3.02E-08 | 9.82E-08 |
| Essential hypertension | rs1242765 | 14 | G | A | 0.2353 | UNC79 | intronic | 0.97 | 0.006 | 1.20E-07 | 4.55E-08 | 1.08E-07 |
| Essential hypertension | rs2251473 | 8 | C | A | 0.4452 | MTMR9 | intronic | 1.03 | 0.005 | 1.31E-07 | 7.96E-09 | 1.21E-07 |
| Essential hypertension | rs2341599 | 4 | G | A | 0.3406 | MAP9; GUCY1A3 | intergenic | 0.97 | 0.005 | 1.34E-07 | 9.78E-09 | 1.29E-07 |
| Essential hypertension | rs8184986 | 22 | A | T | 0.1345 | CHEK2 | intronic | 1.04 | 0.007 | 1.65E-07 | 2.81E-08 | 1.67E-07 |
| Coronary atherosclerosis | rs2073532 | 7 | G | C | 0.3197 | ETV1 | intronic | 1.06 | 0.011 | 5.08E-08 | 4.14E-08 | 5.31E-08 |
| Coronary atherosclerosis | rs8003602 | 14 | T | C | 0.2604 | HHIPL1; CYP46A1 | intergenic | 1.07 | 0.012 | 5.48E-08 | 3.81E-08 | 5.54E-08 |
| Coronary atherosclerosis | rs10841443 | 12 | C | G | 0.3315 | LINC02398 | ncRNA_intronic | 1.06 | 0.011 | 5.90E-08 | 4.81E-08 | 6.08E-08 |
| Asthma | rs6835638 | 4 | C | T | 0.1516 | IL21-AS1 | ncRNA_intronic | 1.07 | 0.012 | 1.87E-07 | 4.25E-08 | 1.73E-07 |

(a) HR: hazard ratio, exp($\gamma$); SE: standard error.
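Table S5. Empirical Type I Error Rates of SPACox When Covariates are Time-Varying.

| Event Rate | MAF | Significance Level | SPACox Type I Error Rate |
|---|---|---|---|
| 0.2% | 0.001 | 5.00E-05 | 5.99E-05 |
| 0.2% | 0.001 | 5.00E-08 | 5.00E-08 |
| 0.2% | 0.01 | 5.00E-05 | 4.75E-05 |
| 0.2% | 0.01 | 5.00E-08 | 4.20E-08 |
| 0.2% | 0.3 | 5.00E-05 | 4.38E-05 |
| 0.2% | 0.3 | 5.00E-08 | 2.01E-08 |
| 1% | 0.001 | 5.00E-05 | 4.96E-05 |
| 1% | 0.001 | 5.00E-08 | 4.00E-08 |
| 1% | 0.01 | 5.00E-05 | 4.93E-05 |
| 1% | 0.01 | 5.00E-08 | 4.00E-08 |
| 1% | 0.3 | 5.00E-05 | 4.84E-05 |
| 1% | 0.3 | 5.00E-08 | 5.45E-08 |
| 10% | 0.001 | 5.00E-05 | 4.98E-05 |
| 10% | 0.001 | 5.00E-08 | 4.20E-08 |
| 10% | 0.01 | 5.00E-05 | 5.00E-05 |
| 10% | 0.01 | 5.00E-08 | 5.60E-08 |
| 10% | 0.3 | 5.00E-05 | 5.01E-05 |
| 10% | 0.3 | 5.00E-08 | 4.11E-08 |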
Supplementary Methods
Section A: Discussion about the time-varying covariates
One of the strengths of the Cox PH model is its ability to accommodate covariates that change over
time. The survival package provides a detailed vignette [55] describing how to incorporate
time-varying covariates into a Cox PH model. SPACox follows the same approach and uses time
intervals to code time-varying covariates. Suppose that a subject has multiple time intervals. After
fitting a null Cox PH model, we sum the martingale residuals corresponding to these time intervals
to obtain the overall martingale residual of this subject, and then calculate the empirical SPA of
the martingale residuals. In step 2, we use the weighted mean of the covariates to calculate the
centered covariate-adjusted genotype $\tilde{G}$.
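The per-subject aggregation described above can be sketched in a few lines of Python. The first function sums interval-level martingale residuals into one residual per subject, as the text states. The second computes a weighted covariate mean; the text does not say which weights are used, so weighting by interval length is an assumption made here purely for illustration, and both function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def subject_level_residuals(rows: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum martingale residuals over all time intervals of each subject.

    `rows` holds one (subject_id, martingale_residual) pair per interval,
    as produced by a null Cox PH fit on interval-coded data.
    """
    totals: Dict[str, float] = defaultdict(float)
    for subject_id, residual in rows:
        totals[subject_id] += residual
    return dict(totals)


def weighted_covariate_mean(intervals: List[Tuple[float, float, float]]) -> float:
    """Weighted mean of a time-varying covariate for one subject.

    `intervals` holds (start, stop, value) triples.
    Assumption: each value is weighted by its interval length (stop - start).
    """
    total_length = sum(stop - start for start, stop, _ in intervals)
    weighted_sum = sum((stop - start) * value for start, stop, value in intervals)
    return weighted_sum / total_length
```

For example, a subject with residuals 0.25 and -0.5 on two intervals receives an overall residual of -0.25.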
We carried out simulation studies to evaluate the type I error rates of SPACox. As in the
main text, for subject $i$, we first simulated the censoring time $C_i$ and the underlying failure time
$T_i^*$, and then calculated $T_i = \min(T_i^*, C_i)$ and $\delta_i = I(T_i^* \le C_i)$. The censoring time $C_i$ was
simulated from a Weibull distribution with a scale parameter of 0.15 and a shape
parameter of 1. The survival time $T_i$ was simulated from a Cox proportional hazards model with a
Weibull baseline hazard function and two time-varying covariates as

$$
T_i(X_{i1}, X_{i2}) =
\begin{cases}
\sqrt{\dfrac{\lambda^2 \cdot (-\log U_i)}{\exp(\eta_i^0)}}, & -\log U_i < \dfrac{\exp(\eta_i^0)\, t_S^2}{\lambda^2} \\[2ex]
\sqrt{\dfrac{\lambda^2 \cdot (-\log U_i)}{\exp(\eta_i^1)} - \dfrac{t_S^2 \exp(\eta_i^0)}{\exp(\eta_i^1)} + t_S^2}, & -\log U_i \ge \dfrac{\exp(\eta_i^0)\, t_S^2}{\lambda^2}
\end{cases}
$$

where $U_i$ was simulated from a uniform distribution on the interval (0,1), and $t_S = 0.2$ was the
time point at which covariate $X_{i1}$ changed from $x_{i1}^0$ to $x_{i1}^1$ and covariate $X_{i2}$ changed from
$x_{i2}^0$ to $x_{i2}^1$. We simulated $x_{i1}^j$, $j = 0, 1$, from a standard normal distribution and
$x_{i2}^j$, $j = 0, 1$, from a Bernoulli distribution with a probability of 0.5. The linear predictor was
$\eta_i^j = 0.5 x_{i1}^j + 0.5 x_{i2}^j$, $j = 0, 1$, and the scale parameter $\lambda$ was selected to
correspond to fixed event rates.
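The failure-time formula above is an inverse-transform draw: the cumulative hazard is exp(η⁰)t²/λ² before t_S and grows at rate exp(η¹) afterwards, and T solves "cumulative hazard = −log U". The Python sketch below implements that draw together with the censoring step, using only the standard library. It is an illustration under stated assumptions, not the authors' simulation code: the function names are hypothetical, and the value of `lam` must be supplied by the caller because the text only says that λ was tuned to hit target event rates.

```python
import math
import random


def failure_time_from_uniform(u, eta0, eta1, lam, t_s=0.2):
    """Invert the piecewise cumulative hazard at -log(u)."""
    e = -math.log(u)
    threshold = math.exp(eta0) * t_s**2 / lam**2  # cumulative hazard at t_s
    if e < threshold:
        return math.sqrt(lam**2 * e / math.exp(eta0))
    return math.sqrt(
        lam**2 * e / math.exp(eta1)
        - t_s**2 * math.exp(eta0) / math.exp(eta1)
        + t_s**2
    )


def cumulative_hazard(t, eta0, eta1, lam, t_s=0.2):
    """Cumulative hazard implied by the formula (used to check the inversion)."""
    if t < t_s:
        return math.exp(eta0) * t**2 / lam**2
    return (math.exp(eta0) * t_s**2 + math.exp(eta1) * (t**2 - t_s**2)) / lam**2


def simulate_subject(lam, rng, t_s=0.2):
    """Simulate one subject's observed time and event indicator."""
    # Covariate values before (j = 0) and after (j = 1) the change point t_s.
    x1 = [rng.gauss(0.0, 1.0) for _ in range(2)]           # standard normal
    x2 = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(2)]  # Bernoulli(0.5)
    eta = [0.5 * x1[j] + 0.5 * x2[j] for j in range(2)]
    u = 1.0 - rng.random()                                 # in (0, 1], avoids log(0)
    t_star = failure_time_from_uniform(u, eta[0], eta[1], lam, t_s)
    c = rng.weibullvariate(0.15, 1.0)                      # scale 0.15, shape 1
    return min(t_star, c), int(t_star <= c)
```

A call such as `simulate_subject(lam=0.5, rng=random.Random(1))` returns one `(time, delta)` pair; the value 0.5 for `lam` is arbitrary and chosen only to make the example runnable.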
We considered common, low-frequency, and rare variants with MAFs of 0.3, 0.01, and 0.001,
and simulated $10^6$ genetic variants for each MAF. We considered three event rates of 0.2%, 1%, and
10%, and simulated 1,000 datasets of time-to-event phenotypes for each event rate. Hence, for
each pair of MAF and event rate, a total of $10^9$ replications were evaluated. The type I error rates
of SPACox are presented in Table S5. In all parameter settings, the type I error rates are well
controlled.
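The replication count above is the product of the number of variants and the number of phenotype datasets. As a reading aid for Table S5, the Python sketch below also computes the expected number of false positives and the binomial Monte Carlo standard error of an empirical type I error rate at a given significance level. The standard-error formula is a standard result and is not stated in the source, and the function name is illustrative.

```python
import math


def monte_carlo_summary(n_variants: int, n_datasets: int, alpha: float):
    """Return (number of tests, expected false positives, standard error).

    The standard error is that of a binomial proportion with success
    probability `alpha`, i.e. sqrt(alpha * (1 - alpha) / n_tests).
    """
    n_tests = n_variants * n_datasets
    expected_hits = n_tests * alpha
    std_error = math.sqrt(alpha * (1 - alpha) / n_tests)
    return n_tests, expected_hits, std_error
```

With 10^6 variants and 1,000 datasets this gives 10^9 tests, about 50 expected false positives at a significance level of 5 × 10^-8, and a standard error of roughly 7 × 10^-9.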
55. Therneau, T., Crowson, C., and Atkinson, E. (2017). Using time dependent covariates and
time dependent coefficients in the Cox model. Survival Vignettes.