mendelian randomization: concepts and scope

39
Mendelian Randomization: Concepts and Scope Rebecca C. Richmond 1,2 and George Davey Smith 1,2,3 1 MRC Integrative Epidemiology Unit, University of Bristol, Bristol BS8 2BN, United Kingdom 2 Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, United Kingdom 3 NIHR Bristol Biomedical Research Centre, University Hospitals Bristol NHS Foundation Trust and University of Bristol, Bristol BS1 3NU, United Kingdom Correspondence: [email protected] Mendelian randomization (MR) is a method of studying the causal effects of modifiable exposures (i.e., potential risk factors) on health, social, and economic outcomes using genetic variants associated with the specific exposures of interest. MR provides a more robust understanding of the influence of these exposures on outcomes because germline genetic variants are randomly inherited from parents to offspring and, as a result, should not be related to potential confounding factors that influence exposureoutcome associa- tions. The genetic variant can therefore be used as a tool to link the proposed risk factor and outcome, and to estimate this effect with less confounding and bias than conventional epi- demiological approaches. We describe the scope of MR, highlighting the range of applica- tions being made possible as genetic data sets and resources become larger and more freely available. We outline the MR approach in detail, covering concepts, assumptions, and esti- mation methods. We cover some common misconceptions, provide strategies forovercoming violation of assumptions, and discuss future prospects for extending the clinical applicability, methodological innovations, robustness, and generalizability of MR findings. M endelian randomization (MR) was devel- oped as a method to help provide a robust understanding of environmentally modiable inuences on disease (Davey Smith and Ebra- him 2003). It was proposed to offer a more reli- able strategy than conventional observational epidemiological studies that have traditionally been plagued by issues such as confounding (in which a common cause of an exposure X and outcome Y may distort the association between X and Y), reverse causation (in which Yor the disease process leading to Yinu- ences X) and other forms of bias, thus resulting in potentially misleading causal inference (Davey Smith and Ebrahim 2002). The clearest examples are shown through observational epi- demiological studies that have indicated an ap- parent causal effect that has later failed to be conrmed in large-scale randomized controlled trials (RCTs) (Davey Smith et al. 2020). The proposed protective effects of vitamin and anti- oxidant supplements on cardiovascular disease (CVD) (Rimm et al. 1993; Myung et al. 2013), β-carotene on lung cancer (Menkes et al. 1986; Heinonen et al. 1994), and selenium on prostate cancer (Yoshizawa et al. 1998; Lippman et al. Editors: George Davey Smith, Rebecca Richmond, and Jean-Baptiste Pingault Additional Perspectives on Combining Human Genetics and Causal Inference to Understand Human Disease and Development available at www.perspectivesinmedicine.org Copyright © 2021 Cold Spring Harbor Laboratory Press; all rights reserved Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 1 www.perspectivesinmedicine.org on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/ Downloaded from

Upload: others

Post on 24-Apr-2022

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Mendelian Randomization: Concepts and Scope

Mendelian Randomization: Concepts and Scope

Rebecca C. Richmond1,2 and George Davey Smith1,2,3

1MRC Integrative Epidemiology Unit, University of Bristol, Bristol BS8 2BN, United Kingdom2Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, United Kingdom3NIHR Bristol Biomedical Research Centre, University Hospitals Bristol NHS Foundation Trust and University ofBristol, Bristol BS1 3NU, United Kingdom

Correspondence: [email protected]

Mendelian randomization (MR) is a method of studying the causal effects of modifiableexposures (i.e., potential risk factors) on health, social, and economic outcomes usinggenetic variants associated with the specific exposures of interest. MR provides a morerobust understanding of the influence of these exposures on outcomes because germlinegenetic variants are randomly inherited from parents to offspring and, as a result, shouldnot be related to potential confounding factors that influence exposure–outcome associa-tions. The genetic variant can therefore be used as a tool to link the proposed risk factor andoutcome, and to estimate this effect with less confounding and bias than conventional epi-demiological approaches. We describe the scope of MR, highlighting the range of applica-tions being made possible as genetic data sets and resources become larger and more freelyavailable. We outline the MR approach in detail, covering concepts, assumptions, and esti-mationmethods.We cover somecommonmisconceptions, provide strategies for overcomingviolation of assumptions, and discuss future prospects for extending the clinical applicability,methodological innovations, robustness, and generalizability of MR findings.

Mendelian randomization (MR) was devel-oped as a method to help provide a robust

understanding of environmentally modifiableinfluences on disease (Davey Smith and Ebra-him 2003). It was proposed to offer a more reli-able strategy than conventional observationalepidemiological studies that have traditionallybeen plagued by issues such as confounding(in which a common cause of an exposure Xand outcome Y may distort the associationbetween X and Y), reverse causation (in whichY—or the disease process leading to Y—influ-ences X) and other forms of bias, thus resulting

in potentially misleading causal inference(Davey Smith and Ebrahim 2002). The clearestexamples are shown through observational epi-demiological studies that have indicated an ap-parent causal effect that has later failed to beconfirmed in large-scale randomized controlledtrials (RCTs) (Davey Smith et al. 2020). Theproposed protective effects of vitamin and anti-oxidant supplements on cardiovascular disease(CVD) (Rimm et al. 1993; Myung et al. 2013),β-carotene on lung cancer (Menkes et al. 1986;Heinonen et al. 1994), and selenium on prostatecancer (Yoshizawa et al. 1998; Lippman et al.

Editors: George Davey Smith, Rebecca Richmond, and Jean-Baptiste PingaultAdditional Perspectives on Combining Human Genetics and Causal Inference to Understand Human Disease and Developmentavailable at www.perspectivesinmedicine.org

Copyright © 2021 Cold Spring Harbor Laboratory Press; all rights reservedAdvanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

1

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 2: Mendelian Randomization: Concepts and Scope

2009) are noteworthy examples. Such spuriousfindings from observational studies have hadnegative consequences, including the launch ofexpensive trials based on inadequate evidence,and increased uptake of nutritional supple-ments in the general population, some of whichhave subsequently been found to have adverseeffects (Heinonen et al. 1994; Lippman et al.2009).

MR uses genetic variants robustly associatedwith exposures to strengthen inference regard-ing their potential causal influence on a partic-ular outcome (Davey Smith and Ebrahim 2003;Davey Smith and Hemani 2014). The online“MR Dictionary” (Lawlor et al. 2019) offers afull description and definitions of terminologyspecific toMR, which will be useful to refer to aswe elaborate on the concepts and scope of theapproach in this paper.

The MR approach draws on Mendel’s lawsof segregation and independent assortment,whereby genetic variants are allocated indepen-dently of environment and other genetic factors(except those in close physical proximity to thevariant of interest, which tend to be inheritedtogether through linkage disequilibrium [LD])(Davey Smith et al. 2020). Based on the premisethat the random inheritance of genetic variantsfrom parents to offspring is reflected at a popu-lation level, genetic variants can identify groupsthat differ, on average, by a modifiable exposure.Here, group membership should not be as-sociated with a range of behavioral, social, andphysiological factors thatmay confound observa-tional associations (Davey Smith et al. 2007). Bydesign, genetic associations should therefore belargely free from confounding, thus any differ-ence in outcomes between genetically definedgroups can be directly attributed to the exposure.

The association between an outcome and agenetic variant known to proxy a particular riskfactor mimics the link between the outcome andthe proposed risk factor, and can be used toestimate this relationship with less confoundingand bias than conventional epidemiological ap-proaches. Other qualities of (germline) geneticvariants that make them useful in causal infer-ence analysis are that they (1) can be robustlyassociated with modifiable exposures (i.e., can

serve as genetic proxies); (2) are fixed at concep-tion and not influenced by disease processes(i.e., are less susceptible to reverse causation);and (3) are subject to relatively little measure-ment error and typically have long-term effects(i.e., are less liable to the underestimation of theexposure–outcome association, referred to as re-gression dilution bias) (Davey Smith and Ebra-him 2004).

Exposures of interest are typically modifi-able and so evidence of causality can—in prin-ciple—be used to infer that intervening on anexposure will lead to a change in the outcomeunder investigation. Making such inference de-pends on considering it reasonable to accept theprinciple of gene–environment equivalence:that perturbation of a phenotype by either a(hypothetical) change in genotype or by envi-ronmental change would produce the samedownstream effect on an outcome (Ames 1999;West-Eberhard 2003; Ebrahim andDavey Smith2008; Davey Smith 2012a). For example, underthis assumption, we would anticipate genotypicinfluence on circulating cholesterol level wouldlead to the same effect on coronary heart disease(CHD) as would a similar change in cholesterollevel induced by dietary influences. Althoughmany exposures can be closely proxied bygenetic variation, for others it is unlikely thatgenetic variation will mimic environment exact-ly, for example, in capturing aspects of years ofeducation (Davies et al. 2019b). Gene–environ-ment equivalence is a fundamental principle inMR that also brings to the fore the issue of thetime-depth of the exposure that is being exam-ined, because genetic variants that influence aphenotype will do so over an extended period.Wewill come back to the issue of time, discussedat length in the MR literature since its inception(Davey Smith and Ebrahim 2003, 2004; Holmeset al. 2017).

Within a causal inference framework, MRcan be implemented as a form of instrumentalvariable (IV) analysis in which the genetic var-iants serve as proxies or IVs for the modifiablefactors of interest (Fig. 1; Lawlor et al. 2008). Ifwe suppose X and Y are the exposure and out-come of interest, C is a set of variables that affectsX and Y (i.e., potential confounding factors),

R.C. Richmond and G. Davey Smith

2 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 3: Mendelian Randomization: Concepts and Scope

andU is a further set of variables that affect Y, wecan use a further variable G (the genetic variantof interest) as an IV to establish the causal effectof X on Y if it satisfies the following assumptions(Hernan and Robins 2020):

1. G is robustly associated with X (“relevance”);

2. G does not share common causes (C and U)with Y (“independence” or “exchangeabili-ty”); and

3. G affects Y exclusively through its effect on X(“exclusion restriction”).

These assumptions are described in moredetail in the section “Assumptions ofMendelianRandomization” and “Instrumental VariableAnalysis.”

SCOPE OF MENDELIAN RANDOMIZATION

MR has been used to:

• appraise the causal relevance of both endoge-nous (e.g., blood pressure, low-density lipo-protein [LDL] cholesterol) and exogenous ex-posures (e.g., alcohol, smoking),

• confirm and uncover causal effects for knownrisk factors of clinical relevance,

• establish the causal role of behavioral traits,

• evaluate causality in relation to social and eco-nomic factors,

• assess life-course effects,

• elucidate intergenerational influences,

• characterize difficult to measure environmen-tal exposures,

• proxy for modifiers of environmental expo-sure (e.g., metabolism or detoxification),

• mimic drug targets,

• evaluate the role of modifiable mediators be-tween upstream exposures and disease out-comes, and

• evaluate the effects of genetic liability to aparticular disease.

A selection of studies in Table 1 shows howMR has been previously used across a wide va-riety of contexts.

When the basic principles ofMRwere initial-ly formalized there were few examples of geneticvariants that had robust associations with poten-tiallymodifiable exposures, and it was recognizedthat the future potential of MR would dependupon identifying such associations (Davey Smithand Ebrahim2003). There has been very substan-tial progress in this area. Improvements and costreductions in array-based genotyping techniques,complemented byDNA sequencing and imputa-tion of information from human genome refer-ence sets, have led to a dramatic increase in ourunderstanding of the genetic contribution to dis-ease risk. Such improvements have alsopermittedthe widespread use of genome-wide associationstudies (GWAS), which have been successful atdetecting replicable associations between com-mon genetic variants and a host of traits in ahypothesis-free approach.

U

C

XG Y

Figure 1.Directedacyclic graph forMendelian randomizationanalysis.Thegenetic variant (G) is associatedwith theexposure of interest (X); there are noconfounders (C,U)of the associationbetween genetic variant (G) andoutcome(Y); and the genetic variant (G) does not affect the outcome (Y) except through its effect on the exposure (X).

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 3

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 4: Mendelian Randomization: Concepts and Scope

Table1.

Aselectionof

Men

delia

nrand

omizationstud

ieswith

diffe

rent

applications

App

lication

Expo

sure(s)

Outco

me(s)

Con

clusions

Referenc

esClin

ical

Vitam

inD

Multiplesclerosis(M

S)Lo

wer

25-hydroxyvitamin

Dhasacausaleffecton

increasedMSsusceptibility

Mokry

etal.2015

Metabolicfactors

Pancreaticcancer

Causalroleof

body

massindex(BMI)andfasting

insulin

inpancreaticcancer

etiology

Carreras-Torresetal.

2017

Creactive

protein(CRP)

Coron

aryheartdisease(CHD)

CRPconcentrationisun

likelyto

bean

impo

rtant

causalfactor

inCHD

Wensley

etal.2011

Behavioral

Cannabisuse

Schizoph

renia

Someevidence

forasm

allcausaleffectof

cann

abis

initiation

onschizoph

reniarisk

Gageetal.2017

Alcoh

olintake

Vasculardisease

Alcoh

olincreasesbloodpressure

andstroke

risk

acrossthedistribu

tion

ofintake

Chenetal.2008;

Millwoodetal.

2019

Smokingheaviness

Anx

ietyanddepression

Limited

supp

ortfor

acausalroleofsm

okingheaviness

indepression

andanxiety

Bjorngaardetal.

2013;T

ayloretal.

2014

BMIandsm

oking

COVID

-19

ElevatedBMIandsm

okingincrease

susceptibilityto

severe

COVID

-19andsepsis

Ponsford

etal.2020

Socioecono

mic

Edu

cation

Myopia

Moreyearsin

educationcontribu

testo

increasedrisk

ofmyopia

Cuellar-Partidaetal.

2016;M

ountjoy

etal.2018

HeightandBMI

Socioecono

micpo

sition

Earlierfind

ings

suggesting

thatheight

andBMImay

influenceaspectso

fsocioecon

omicpo

sition

may

bebiased

byfamily-leveldynasticeffects

Tyrrelletal.2016a;

Brumpton

etal.

2020

Age

atmenarche

Edu

cation

Positive

causaleffectofageatmenarcheon

timespent

ineducation

Gill

etal.2017

Edu

cation

Cardiovasculardisease(CVD)

BMI,systolicbloodpressure

andsm

okingmediatea

substantialp

ropo

rtionof

theprotective

effectof

educationon

therisk

ofCVD

Carteretal.2019

Environ

mental

Cop

perandzinc

Ischem

icheartdisease(IHD)

Protectiveeffectof

copp

eranddetrim

entaleffectof

zinc

onIH

DKod

alietal.2018

Arsenicmetabolism

Skin

lesion

sEvidenceforacausalrelation

ship

betweenarsenic

metabolism

andskin

lesion

risk

Pierceetal.2013

Continu

ed

R.C. Richmond and G. Davey Smith

4 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 5: Mendelian Randomization: Concepts and Scope

Table1.

Con

tinue

d

App

lication

Expo

sure(s)

Outco

me(s)

Con

clusions

Referenc

es

Malaria

Stun

ting

Use

ofgeneticvariationforsicklecelltraitto

demon

strateincreasedrisk

ofstun

ting

following

malaria

Kangetal.2013

Lifecourse

Childho

odadiposity

Type1diabetes

(T1D

)Supp

ortforan

effectof

child

hood

adiposityon

T1D

risk

Censinetal.2017

Age

atmenarche

BMI

Evidencethatearlierageatmenarchecauses

higher

BMIinlaterlife

may

beconfou

nded

byprepub

ertal

BMI

Belletal.2018;Gill

etal.2018

Childho

odbody

size

Coron

aryartery

disease(CAD),

type

2diabetes

(T2D

),breast

andprostatecancer

Whereas

thecausaleffectof

child

hood

body

size

onCADandT2D

risk

islargelymediatedby

adulthoodbody

size,smallerchild

hood

body

size

might

increase

therisk

ofbreastcancer

regardless

ofbody

size

inadulthood

Richardsonetal.

2020b

Intergenerational

MaternalB

MIandglucose

Birth

weight

MaternalB

MIandbloodglucoselevelscausally

relatedto

higher

offspringbirthw

eight

Tyrrelletal.2016b

Maternalalcoh

olintake

Edu

cation

Causaladverse

effectof

alcoho

lintakedu

ring

pregnancyon

offspringeducationalattainm

ent

Zuccolo

etal.2013

Parentaleducation

Offspring

education

Similareffectof

paternalandmaternalfam

ilyenvironm

entalinfl

uences

onoffspringeducational

attainment

Kon

getal.2018;

Wangetal.2021a

Drugtargets

Niemann–

PickC1-Like

1(N

PC1L

1)andHMG-CoA

redu

ctase(H

MGCR)

CHD

Validationof

ezetim

ibeandstatin

useforredu

cing

risk

ofCHDusingdrug

targetsin

NPC1L

1and

HMGCR

Ferenceetal.2015

HMGCR

Bon

emineraldensity(BMD)

LDL-CloweringeffectofHMGCRon

increasedBMD

suggestsrepu

rposingop

portun

ityforstatins

Lietal.2020

HMGCRandprop

rotein

convertase

subsilisin-kexin

type

9(PCSK

9)

T2D

StatinsandPCSK

9inhibitorsprescribed

forredu

cing

risk

ofCHDmay

increase

risk

ofT2D

Ferenceetal.2016

Angiotensin-con

verting

enzyme(ACE)

T2D

CausalprotectiveeffectofACEinhibitorson

T2D

risk

Pigeyre

etal.2020

Interleukin6receptor

(IL-6R

)COVID

-19

Potentialroleof

IL-6Rblockage

inCOVID

-19

severity

Bovijn

etal.2020

Continu

ed

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 5

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 6: Mendelian Randomization: Concepts and Scope

Table1.

Con

tinue

d

App

lication

Expo

sure(s)

Outco

me(s)

Con

clusions

Referenc

es

Molecular

traits

DNAmethylation

BMI

DNAmethylation

changesarepredom

inantlythe

consequenceof

adiposity,rather

than

thecause

Wahletal.2017

Tissue-specificgene

expression

Cardiovasculartraits

Identified

anu

mbero

ftissue-specificeffectsatseveral

geno

micregion

son

cardiovascular

traits

Tayloretal.2019

Proteins

Cardiovascular,cancer,and

autoim

mun

eou

tcom

esAmon

gresults

wasthefind

ingofan

independ

entrole

forIL18R1andIL1R

L2on

atop

icderm

atitisrisk

Sunetal.2018

Disease

liability

Schizoph

reniarisk

Reprodu

ctivesuccess

Increasedgeneticliabilityforschizoph

reniado

esno

tconferafitnessadvantagebu

tdoesincreasemating

success

Lawnetal.2019

Coron

aryartery

disease

(CAD)risk

Metabolites

GeneticvariantsthatinfluenceCADrisk

inadultsare

associated

withlargeperturbation

sin

metabolite

levelsin

child

hood

Battram

etal.2018

Atopy

Glio

ma

Limited

evidence

foran

effectof

atop

icdisease

liabilityon

risk

ofglioma

Disney-Hoggetal.

2018

R.C. Richmond and G. Davey Smith

6 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 7: Mendelian Randomization: Concepts and Scope

The establishment of GWAS consortia, eachfocused on investigating different complex traitsand diseases, has encouraged numerous popula-tion-based studies to contribute genetic data formeta-analysis (Table 2). This has in turn in-creased sample sizes for the discovery and robustreplication of GWAS findings. Many of theseconsortia have also made their GWAS summarydata publicly available, which, aided by data re-sources hosting such summary data (Richardsonet al. 2020a), have catalyzed the development ofsummary data-based MR studies (described inmore detail in the section “MR Methods”).

The recent availability of massive genotypedand phenotyped data sets, including biobankresources (Table 2), has added considerably toGWAS efforts. GWAS of phenotypic data fromthese resources are increasingly performed in anautomated fashion, with summary statisticsmade freely available online (Fig. 2). Effortssuch as these have uncovered a host of geneticvariants related to a range of traits, which mayleverage greater explanatory power by acting asstronger genetic proxies or instruments in MR(Dudbridge 2020).

ASSUMPTIONS OF MENDELIANRANDOMIZATION AND INSTRUMENTALVARIABLE ANALYSIS

The key assumption of MR is that of gene–envi-ronment equivalence, as discussed above. Whenusing the properties of germline genetic variantsto strengthen causal inference, the confidencethat a particular modifiable exposure is implicat-ed in the causation of a disease can be enhancedby identifying the direction andmagnitude of theeffect. This can be estimated through IV analysis.The large majority of MR studies are now imple-mented within an IV framework, and thereforethe IV assumptions are central to MR analysis.

Relevance Assumption: The Genetic VariantMust Be Robustly Associated with theExposure

The most common method of deriving genet-ic instruments in recent MR studies is viaGWAS, whereby single-nucleotide polymor-

phisms (SNPs) that pass genome-wide signifi-cance ( p < 5 × 10−8) are typically considered forinclusion. However, it is important that thestrength of the instrument is tested separatelyto appraise the relevance assumption, which isoften done by means of the proportion of vari-ance explained (r2) and the related F-statistic,which additionally takes into account the sizeof the sample under investigation. Increasingly,multiple genetic variants are found to be inde-pendently associated with traits investigated inGWAS and these may be combined in geneticrisk scores or throughmeta-analysis approachesto explainmore variation in the trait (Dudbridge2020). This in turn can be used to increase pow-er, obtain more precise causal estimates andminimize risk of weak instrument bias (i.e., un-certainty in the SNP-exposure association thatcan bias causal estimates) (Pierce and Burgess2013).

Independence/Exchangeability Assumption:There Are No Confounders of the Associationbetween the Genetic Variant and Outcome

Because genetic variants are randomized at con-ception, they should be allocated independent ofenvironmental and other genetic variants ex-cluding those in LD. This means that at a familylevel, genetic associations should be largely freefrom conventional confounding. Although MRwas explicitly introduced in 2003 within a par-ent–offspring design, data availability did notgenerally allow use of such designs at the time.It was suggested, however, that population-based studies with appropriate control for pop-ulation stratification could approximate the par-ent–offspring design (Davey Smith andEbrahim2003; Davey Smith et al. 2020). Concerns aboutpotential violation of this assumption at a pop-ulation level relate to confounding by ancestry orpopulation stratification, which can influencevariation in both allele frequency and diseaserisk in population(s) being investigated (Fig.3). Approaches to limit spurious associa-tions generated because of population groupsinclude use of genetic associations derivedfrom homogeneous populations or with ade-quate control for population structure (e.g.,

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 7

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 8: Mendelian Randomization: Concepts and Scope

Table2.

Selectionof

stud

iesan

dco

nsortia

used

forg

enom

e-wideassociationan

alysis

Gen

etic

consortia

App

roximate

samplesize

App

roximatenu

mbe

rof

contribu

tingstud

ies

Summaryda

taavailableford

ownloa

dCoh

ortsforHeartandAging

Researchin

Genom

icEpidemiology

(CHARGE)

60,000

10Yes,viadb

GAP

EarlyGrowth

Genetics(EGG)andEarlyGeneticsandLifecourse

Epidemiology

(EAGLE

)170,000

40Yes

egg-consortium

.org,

www.eagle-con

sortium.org

Meta-Analysesof

Glucose

andInsulin

-related

traitsCon

sortium

(MAGIC)

130,000

55Yes

www.m

agicinvestigators.org

DIA

betesGeneticReplicationAnd

Meta-analysis(D

IAGRAM)

150,000

40Yes

www.diagram

-con

sortium.org

GeneticInvestigationof

Anthrop

ometricTraits(G

IANT)

340,000

60Yes

portals.broadinstitute.org/collaboration

/giant/index.ph

p/GIA

NT_con

sortium

UCL–

LSHTM–E

dinb

urgh–B

ristol

(UCLE

B)(cardiom

etabolictraits)

40,000

12No

Con

tactUCLE

Bsteering

committee

Coron

aryARtery

DIsease

Genom

e-wideReplicationandMeta-

analysis(CARDIoGRAM)plus

theCoron

aryartery

Disease

(C4D

)genetics

consortium

190,000

50Yes

www.cardiogramplusc4d.org

GeneticAssociation

sandMechanism

sin

Oncology(G

AMe-On)

500,000

33Yes,viadb

GAP

Reprodu

ctiveGenetics(ReproGen)

330,000

85Yes

www.reprogen.org

PsychiatricGenom

icsCon

sortium

(PGC)

600,000

>50

Yes

www.m

ed.unc.edu

/pgc

SocialScienceGeneticAssociation

Con

sortium

(SSG

AC)

1,000,000

100

Yes

www.th

essgac.org

Continu

ed

R.C. Richmond and G. Davey Smith

8 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 9: Mendelian Randomization: Concepts and Scope

Table2.

Con

tinue

d

Gen

etic

stud

ies

App

roximate

samplesize

Num

berw

ithgene

ticda

taavailable

Availa

bilityof

data

UKBiobank

500,000

500,000

Datashared

withbona

fide

researcherson

projectapproval;geneticsummarydatafreely

available:gw

as.m

rcieu.ac.uk,www.nealelab.is/uk-biobank

China

Kadoorie

Biobank

(CKB)

500,000

110,000

Period

ofexclusiveuseforthe

CKBcollaborative

team

,afterwhich

datawillbe

madeavailable

tothewider

research

commun

ity;dataaccessrequ

estsof

allbon

afide

researcherswill

beconsidered

MillionVeteran

Program

1,000,000

300,000

Availableto

Veteran

Affairs–affiliatedinvestigators

Biobank

Japan(BBJ)

270,000

200,000

Dataaccessto

bona

fide

researchersrequ

irescollaboration

andresearch

visitsto

BBJ;genetic

summarydatafreelyavailable

jenger.riken.jp/en

HUNT

70,000

70,000

Availableto

investigatorsaffiliatedwithaNorwegianresearch

institute

Finn

Gen

500,000

180,000

Dataaccessavailableto

consortium

partnerson

ly;geneticsummarydatafreelyavailable

www.finn

gen.fi/en/access_results

23andM

e>8

,000,000

>8,000,000

Collaboration

requ

estsbu

tindividu

aldatacann

otbe

distribu

ted,

shared,orsold

tothird

parties

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 9

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 10: Mendelian Randomization: Concepts and Scope

through principal components analysis or line-ar mixed models) (Loh et al. 2015). However,the independence assumption can also be vio-lated by dynastic effects (when parental geno-types directly affect offspring phenotypes), orby assortative mating (when individuals selecta partner based on a particular phenotype).These biases will likely differ depending on theexposure(s), outcome(s), and population(s) un-der study.

It is impossible to prove the independenceassumption in an MR study because, althoughattempts can be made to account for ancestryand examine how genetic variants relate to mea-sured confounders, associations with unknownconfounders cannot be demonstrated. In addi-tion, whereas previous recommendations havebeen to assess associations between the geneticinstrument and a wide range of factors thatcould bias exposure–outcome associations (Da-vey Smith et al. 2007), these associations are like-ly to be indicators of confounding by ancestry

(Fig. 3) or horizontal pleiotropy (Fig. 4), ratherthan reflecting conventional confounding.

Exclusion Restriction Assumption: TheGenetic Variant Should Only Influence theOutcome of Interest via the Exposure

Pleiotropy is the phenomenon whereby a geneticvariant influences multiple traits, and is a majorthreat to the exclusion restriction assumption.However, it is important to make the distinctionbetween vertical and horizontal pleiotropy (Da-vey Smith and Hemani 2014; Hemani et al.2018a). Vertical (or mediated) pleiotropy occurswhen the genetic variant (G) is associated withthe outcome (Y) because G affects Y entirelythrough the exposure (X). This fulfils the exclu-sion restriction assumption and is the essence ofthe MR approach. Horizontal (unmediated orbiological) pleiotropy occurs when G affectsboth X and Y but through different pathways.This can yield biased estimates inMR if a genetic

76

61

46

32

17

21 2 3 4 5 6 7 8 9 1110

Chromosome

Tea intakeN = 349,376

–lo

g 10 (

P)

12 1413 15 181716 19 22 232120

Figure 2.Manhattan plot from the “GWASbot” displaying genetic association signals for tea intake from the UKBiobank.

R.C. Richmond and G. Davey Smith

10 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 11: Mendelian Randomization: Concepts and Scope

instrument influences the outcome via a mecha-nism other than the exposure of interest (Ver-banck et al. 2018). Such pleiotropy can be direct,as in the path from G to Y (uncorrelated pleiot-ropy), or can be indirect (e.g., when G affects Xand Y through a shared confounder, U [correlat-ed pleiotropy]) (Fig. 4; Morrison et al. 2020). Thelatter may occur in cases of misspecifying theprimary phenotype, such as when a genetic var-iant is used to proxy for a trait secondary to thetrait with which it is directly associated.

Although it is not possible to prove thatthe exclusion restriction assumption holds inany MR study, various sensitivity analyses canbe applied to uncover deviations from theassumption.

Use a Functional Polymorphism for theExposure of Interest

One method of ensuring that the genetic variantis unlikely to influence the outcome via anotherpathway is to use a SNP that has knownbiologicalfunction or is located in a gene that directly codesfor the exposure of interest. For example, variantswithin or near the protein-encoding locus for C-reactive protein (CRP) are known to alter serumlevels ofCRPandare likely tohave a predominantinfluence on any outcomes via this pathway(Timpson et al. 2011).

Although SNPs serve as valid instruments insome situations, in other cases their use is lim-ited if variants do have a pleiotropic effect that

C

SA

G1

G2

SGA

YX

Figure 3. Violation of the independence and exclusion restriction assumptions caused by shared ancestry orpopulation structure. Shared ancestry (SA) can confound the relationship between shared genetic ancestry (SGA)and a potential confounder (C), violating the Mendelian randomization (MR) assumption of independence/exchangeability by inducing an association between a genetic instrument, G1 and C (red line). SGA can alsoinduce an association between the genetic instrument (G1) and another genetic variant (G2) (green line), thusviolating the MR assumption of exclusion restriction.

C

XG Y

Figure 4.Correlated and uncorrelated pleiotropy.Uncorrelated pleiotropyoccurs whenG affects X andY throughseparate mechanisms (green line). Correlated pleiotropy occurs when G affects X and Y through a sharedconfounder (C) (red line).

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 11

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 12: Mendelian Randomization: Concepts and Scope

cannot be directly estimated. This may be par-ticularly problematic if a variant

• is associated with multiple biomarkers onseparate biological pathways (e.g., genetic var-iants influencing the branched-chain α-ke-toacid dehydrogenase (BCKD) enzyme are as-sociated with different branch chain aminoacids [Lotta et al. 2016]);

• disrupts the normal function of the exposure(e.g., an IL6R variant increases circulating in-terleukin 6 [IL-6] but reduces aspects of IL-6signaling and that appears to decrease risk ofCHD [Swerdlow et al. 2012]); and

• is associated with multiple dependent traits onoverlapping pathways, and if those traits havedifferent roles in disease (e.g., ALDH2 is asso-ciated with both alcohol consumption and ac-etaldehyde level, a known carcinogen, whichmakes it difficult to disentangle the effects ofalcohol and acetaldehyde on risk of esophagealcarcinoma [Lewis and Davey Smith 2005]).

For a detailed description of these scenariosand other applied examples, see Holmes et al.(2017). Although functional SNP analyses maytherefore appear plausible, they can have theirdrawbacks and many of the sensitivity analysesused for evaluating pleiotropy cannot be appliedwith a single SNP (see the section “Methods forAssessing and Accounting for Horizontal Plei-otropy”). In cases where a single SNP is used, itis recommended that associations between theSNP and a wide range of traits are investigated,as described below.

Assess Associations betweenGenetic Variantsand Other Factors

The presence of associations between geneticvariants and other factors may reveal violationsto the independence and/or exclusion restric-tion assumption. A common approach to ap-praise this is to assess whether the genetic vari-ants used to instrument the exposure (and thosevariants in LDwith the genetic instrument) havebeen associated with other phenotypes inGWAS, for example, by searching PhenoScan-ner (Staley et al. 2016). Although this may high-

light genetic variants with horizontal pleiotropy,it can also pick up vertical pleiotropy (e.g., a SNPrelated to bodymass index [BMI] may appear ina GWAS of blood pressure via its influence onBMI). In addition, truly horizontal pleiotropicSNPs may not be detected by this method if theGWAS of the phenotype on the pleiotropic pathis absent or underpowered. As such, it is notsufficient to simply exclude the variants that ap-pear in other GWAS as a way to assess the ex-clusion restriction assumption.

Conduct Stratified Analysis in a Subgroup ofthe Population inWhich theGenetic Variant IsNot Associated with the Exposure of Interest

In some instances, conducting a stratified analysiscan provide evidence against the possibility ofhorizontal pleiotropy. When a genetic variant isnot related to the exposure of interest in a partic-ular subgroup of the population, this variantshould also not be associated with the outcomeof interest in this subgroup (given an absence ofthe association with the exposure). For example,ALDH2, coding for aldehyde dehydrogenase 2, isa common polymorphism in East Asian popula-tions that has been used as a genetic instrumentfor alcohol consumption (Lewis andDavey Smith2005; Chen et al. 2008; Millwood et al. 2019). InEast Asian populations, in which women aremuch less likely to drink alcohol than men, thispolymorphism is not strongly associated with al-cohol intake among women (Chen et al. 2008).This approach has been used to assess the pres-ence of pleiotropy and evaluate a causal relation-ship between alcohol consumption and increasedblood pressure (Chen et al. 2008) and risk of vas-cular disease (Millwood et al. 2019). For example,if the effects of alcohol consumption on bloodpressure and vascular disease are causal, wewould expect to find evidence of association be-tween variation in ALDH2 and the outcomes inEast Asian men, but not East Asian women. Anyassociation observed between ALDH2 and theoutcomes in East Asian women, in the absenceof alcohol intake, would indicate pleiotropy. Suchan approach can be considered a negative controldesign (Lipsitch et al. 2010; Davey Smith 2012b)andmodels built on this approach can detect and

R.C. Richmond and G. Davey Smith

12 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 13: Mendelian Randomization: Concepts and Scope

adjust for the pleiotropic effects and provide validestimates in such instances (Cho et al. 2015; Spil-ler et al. 2019) (see the section “Methods for As-sessing and Accounting for Horizontal Pleiotro-py”). However, genetic variants that are notassociated with the exposure in a subgroup of apopulation may be uncommon, and so such di-rect assessment of pleiotropy is oftennot possible.

Do Not Condition on the Exposure to AssessExclusion Restriction

Although it may seem intuitive to assess whetherstatistical adjustment for the exposure leads to

attenuation of the gene–outcome relationship,this is not a recommended approach for testingthe exclusion restriction assumption. This is be-cause adjusting for the exposure may induce col-lider bias, in which another factor that causes theexposure becomes correlated with the genetic in-strument by conditioning on the exposure in thismanner (Fig. 5; Cole et al. 2010). Stratifying bysex is not problematic in this context becausebiological sex is not caused by other factors. Itis, however, a potential problem in instanceswhere genetic effects are investigatedwithinothersubsets of the population, for example, if we wereto stratify on alcohol drinker status itself (Gage

Smoking, socio-

economic position,

depression

YG

A

B

C

U

Smoking, socio-

economic position,

depression

Study

participation

Induced association

Conditioning on a collider

Cardiovascular

disease

Cardiovascular

disease

Alcohol drinker

status

Alcohol

variant

Alcohol

variant

X

X

Figure 5.Collider bias. (A) Generalizable directed acyclic graph (DAG). (B) Collider bias induced from stratifyingon exposure. (C) Collider bias induced from stratifying on study participation (e.g., caused by selection or loss tofollow-up).

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 13

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 14: Mendelian Randomization: Concepts and Scope

et al. 2016). Bias induced through adjustment forthe exposure may be exacerbated by potentialmeasurement error in the exposure, in additionto introducing collider bias (VanderWeele et al.2012).

More advanced methods have been devel-oped to assess violation of the exclusion restric-tion, including techniques that explicitly modeland adjust for pleiotropy, and those that are nat-urally robust to pleiotropy (Hemani et al. 2018a).These are described in more detail in the nextsection.

MR Methods

Direct Genotype Associations

The simplest MR approach to evaluate the pres-ence of a causal relationship is to assess the as-sociation between a genetic variant known toinfluence or modify an exposure and the out-come of interest. However, this does not allowfor the magnitude of causal effect to be estimat-ed, which is most often the estimate of interest,especially when considering the translationalimplications and clinical utility of findings. Inaddition, multiple pathways can often explainthe association between a genetic variant and aparticular outcome, so more knowledge of the

exposure of interest and its association with thegenetic variant is generally required for a validinterpretation (Holmes et al. 2017).

Original Applications of One-SampleMendelian Randomization

In the pre-GWAS era, most examples of appliedMR were conducted within one data set (i.e., inwhich genetic variants, exposures, and out-comes of interest are obtained from individualsin the same sample) (Fig. 6A). In such a scenario,the causal effect of the exposure on the outcomecan typically be estimated using two-stage least-squares (2SLS) regression (Angrist and Imbens1995a). In the first stage, the exposure is re-gressed on the genetic instrument and in thesecond stage the outcome is regressed againstthe predicted values from the first stage regres-sion. The effect estimate can then be interpretedas the change in the outcome per unit increase inthe exposure. The genetic instrument used inone-sample analysis can be a SNP, multipleSNPs, or a genetic risk score (i.e., a summationof risk alleles for each individual) that can beunweighted or weighted to give those geneticvariantswith the strongest effect on the exposuremore weight (Dudbridge 2020).

G X

Sample 1

Y

X

Sample 1 Sample 2

Y

CA

BC

G

Figure 6.One-sample and two-sample Mendelian randomization (MR) study designs. (A) One-sample MR usesa data set in which genotype, exposure, and outcome have been assessed. (B) Two-sampleMR uses a genetic dataset in which the exposure has been measured (to derive SNP-exposure estimates, sample 1) and a second geneticdata set in which the outcome has been measured (to derive SNP-outcome estimates, sample 2).

R.C. Richmond and G. Davey Smith

14 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 15: Mendelian Randomization: Concepts and Scope

Studies with more individual-level data mayalso permit an assessment of associations be-tween genetic variants and confounders of theexposure–outcome relationship to interrogatethe independence and exclusion restriction as-sumptions. Additional approaches to evaluateviolation of the exclusion restriction assumptionin one-sample MR include the Sargan test (Sar-gan 1958), which evaluates heterogeneity of theindividual SNP estimates, and IV approaches,which can estimate the causal effect in the pres-ence of invalid (e.g., pleiotropic) instruments(Kang et al. 2016; Windmeijer et al. 2019).

Early one-sample MR studies suffered thelimitation of low power because few large datasets with relevant genotypic and phenotypicdata were available. To counteract this, anumber of MR studies were conducted usingmeta-analysis of causal estimates obtainedfrom independent studies, which was greatlyaided by the existence of large genetic consor-tia (Tyrrell et al. 2016a). However, the develop-

ment of two-sample MR analysis has vastly im-proved the scope of MR applied to large-scaledata sets.

Development of Two-Sample MendelianRandomization

It is possible to use MR to estimate causal effectsin which genetic associations with the exposureand the outcome have been estimated in differ-ent samples (Fig. 6B). This approach, nowknown as two-sample MR (Pierce and Burgess2013), has greatly increased both the scope andpopularity of MR analysis (Fig. 7). Although theinitial extended exposition of MR in 2003 (Da-vey Smith and Ebrahim 2003) included exam-ples of what is now called two-sample MR, therise in popularity in recent years is attributed tothe public availability of GWAS summary data,as well as the development of methods to har-monize and integrate data sets and computecausal estimates when the SNP–exposure and

2011

130

19

52

60

32

5

Nu

mb

er

of stu

die

s

39

04

55

52

05

85

One-sampleTwo-sample or subsample

65

065

0

20

30

40

50

Pro

port

ion (

%)

of stu

die

s u

sin

g th

e s

ub

sa

mp

le a

nd

/or

two

-sa

mp

le s

tra

teg

y

60

70

80

90

10

010

0

2013 2015 2017

Year

2019 2011 2013 2015 2017

Year

2019

Figure 7. Number of empirical one- and two-sample Mendelian randomization (MR) papers in PubMed from2011–2020. Figure has been adapted and updated since that produced by Hartwig et al. (2016). We performed aliterature search from 1 January 2011 to 31 December 2020 using the terms “Mendelian randomization” or“Mendelian randomisation” to identify the proportion of Mendelian randomization papers using the two-sample design.

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 15

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 16: Mendelian Randomization: Concepts and Scope

SNP–outcome associations come from differentstudies (Hemani et al. 2018b).

The two-sample approach eliminates the re-quirement to have access to raw genetic data onindividuals within a study and also makes per-forming MR less time consuming. In terms ofdata requirements, all that is needed are thedetails of the genetic association between thevariant(s) and the trait from the exposureGWAS (sample one) and the outcome GWAS(sample two). This typically includes informa-tion on the effect and other allele, effect allelefrequency, effect estimate, and standard errorfrom both GWAS. In addition, the developmentof web software and code for summary-leveldata makes MR very straightforward to imple-ment (see the section “Novel InformaticTools”). It also increases the scope of MR anal-yses, with a wealth of data on exposures andoutcomes available for interrogation, whichmay be infeasible or expensive to measure inthe same set of individuals. Furthermore, a se-ries of methods have been developed within thissetting in recent years to assess and correct forpotential pleiotropy (see the section “Methodsfor Assessing and Accounting for HorizontalPleiotropy”).

The simplest approach for using summary-level data in an MR framework is to derive aWald ratio for a SNP. This is the effect estimatefor the SNP–outcome association (from sample2) divided by the coefficient of the SNP–expo-sure association (from sample 1), with the stan-dard error of theWald ratio often approximatedby the delta method (Thomas et al. 2007). Inthe presence of multiple genetic instruments,a meta-analysis approach (usually inverse-weighted [IVW] meta-analysis) may be usedto combine Wald ratio estimates of the causaleffect obtained from different SNPs (Dudbridge2020). The point estimates from an IVWMRareequivalent to a weighted linear regression of theSNP–outcome associations on SNP–exposureassociations when the intercept is constrainedto zero. The effect estimates obtained shouldalso be equivalent to the effect estimated in2SLS when sample sizes are large, SNPs inde-pendent, and there is limited heterogeneity inthe Wald ratios.

One- versus Two-Sample MR

Despite its ease of application, there are variouslimitations of the two-sample MR that also re-quire consideration. These have been discussed indetail elsewhere (Haycock et al. 2016; Zheng et al.2017) and are also summarized inTable 3. In partbecause of these limitations, and also because ofthe recent availability of large-scale genotypedand phenotyped data sets (Table 1), there hasbeen a recent resurgence of one-sampleMR.Ma-jor benefits of the one-sample MR approach arethe flexibility to perform rigorous MR, and theability to assess the independence and exclusionrestriction assumptions through assessment ofindividual-level confounders.

However, weak instrument bias may threat-en the estimation of causal effects in one-sampledata sets (Burgess and Thompson 2011), inwhich uncertainty in the SNP–exposure associ-ation could bias the causal estimate (Table 3).Importantly, where weak instruments will biascausal estimates in the direction of the null in atwo-sample setting, weak instrument bias will betoward the observational association in a one-sample setting (Zheng et al. 2017). In addition,selection bias caused by winner’s curse couldlead to biased causal estimates if the genetic var-iants were discovered in the same sample underinvestigation. This is a phenomenon that occursin GWAS by using a P-value cut-off that canlead to chance overestimation of the effect sizeof SNPs with the strongest genetic signals in theGWAS discovery sample (Garner 2007).

Whereas two-sample methods can be usedfor one-sampleMR analysis (Minelli et al. 2021),these may produce biased estimates and type 1error rate inflation (i.e., incorrectly rejecting thenull hypothesis of no association), somethinglearned from two-sampleMR analyses when ge-netic consortia have overlapping samples (Bur-gess et al. 2016a). It is advised that the covariancebetween the SNP–exposure and SNP–outcomeassociation estimates are taken into account andthat external weights be used where possible tominimize the risk of bias. Specifically, geneticvariants can be weighted by the magnitude oftheir association with the exposure in an inde-pendent data set (Burgess andThompson 2013),

R.C. Richmond and G. Davey Smith

16 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 17: Mendelian Randomization: Concepts and Scope

in what could be described as a “one-and-a-halfsample MR” design.

Extensions to the Basic MR Approach

Other Methods

A series of developments have been made toextend the application of MR to

• consider the prevailing direction of causalitybetween two traits (bidirectional MR);

• evaluate intermediates on the causal pathwaybetween exposures and outcomes (two-step,network, or mediation MR);

• assess the causal role of closely related traitsand to establish independent effects of each(multivariable MR); and

Table 3.Comparison of strengths and limitations of one-sample and two-sampleMendelian randomization (MR)

One-sample MR Two-sample MR

Strengths Flexibility of the analytical strategy in terms ofregression models that can be performed aswell as covariates and participants that canbe included/excluded

Improved sample size and powerFlexibility and enhanced power to perform an

array of sensitivity analyses (e.g., pleiotropy-robust methods)

Permits thorough evaluation of confounders totest above assumption

Less time-consuming and easier to implement

Allows for comparison with observationalestimates in same study (e.g., throughDurbin-Wu-Hausman test)

Can evaluate causal relationships between arange of exposure and outcomes, which mightnot be possible in a single sample setting

Can model interactions, survival time, andother analyses (including MR analysis ofnonlinear effects)

Unable to thoroughly evaluate individual-levelconfounding factors

Assumes the two samples are exchangeable.Examples of where this is difficult to assert arewhere the samples are heterogeneous in termsof age, sex distributions or ancestry

Limitations Traditionally low power and thereforeimprecise causal estimates

Potential for selection bias caused by studysampling

Weak instrument bias is toward nullPotential for selection bias caused by studysampling

Winner’s curse in which the discovery GWASused to estimate the SNP-trait associationmayoverestimate the effect of the geneticinstrument relative to the exposure

Weak instrument bias is toward observationalestimate

Relative rigidity of the summary data available,which is limited by the original GWAS modelperformed (e.g., adjustment for unwantedcovariates and a lack of available data onsubgroups of interest (e.g., sex-specificestimates)

Winner’s curse in which the sample in thediscovery GWAS is the same as that used forMR, which can lead to overestimation of thestrength of association of the geneticinstrument with the exposure

SNP–exposure and SNP–outcome associationsshould be coded relative to the same effectallele, also known as “harmonization,” whichis nontrivial in the situation of palindromicSNPs (i.e., G/C and A/T SNPs) and in theabsence of information on allele frequencies

Need to have access to individual-level geneticand phenotypic data

Assumes no overlap between samples, whichcould bias estimates if this is not true

Direct comparison with observational estimatesnot as straightforward

Unable to model interactions, survival time, andother analyses (including nonlinear analyses)

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 17

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 18: Mendelian Randomization: Concepts and Scope

• evaluate combined causal effects of risk fac-tors (factorial MR or exposure interactions).

Descriptions, directed acyclic graphs(DAGs), and applied examples for each of thesemethods are outlined in Table 4.

Novel Informatic Tools

The recent widespread availability of GWASsummary data for a range of traits, with largedata repositories and bioinformatic resourcesfor performing MR, provides a powerful and

user-friendly way of investigating causal relation-ships between many different traits (Richardsonet al. 2020a). For example, MR-Base is a platformthat has retrospectively collected GWAS data setsfor more than 20,000 traits, as well as protein-,methylation-, and expression-quantitative traitloci (pQTL, mQTL, and eQTL) statistics fortens of thousands ofmolecularmarkers (Hemaniet al. 2018b). Together with its web-based inter-face, which allows the user to explore a range ofcausal relations, there is an accompanying Rpackage (TwoSampleMR), which allows for LDpruning of genetic instruments in the exposure

Table 4. Extensions to the basic Mendelian randomization (MR) approach

Method Description Directed acyclic graphs (DAGs) Applications

Bidirectional orreciprocal MR(Timpson et al.2011)

Used to evaluate the causaldirection(s) of effect betweentwo traits X and Y, with theuse of valid instruments GX

and GY

G1

G2

X

Y

Y

X

Body mass index(BMI) and vitaminD (Vimaleswaranet al. 2013)

Two-step MR(Relton andDavey Smith2012)

Used to assess the role of anintermediary factor (Z) inmediating the effect of X on Ywith the use of validinstruments GX and Gz

Y

G1

X Z

G2 DNA methylation,gene expression,and BMI(Mendelson et al.2017)

Network MR(Burgess et al.2015)

Extension of the two-step MRapproach to consider thecausal role of multiplemediators or causal networks

G1

X Z

G2

WG3

Y

Effect of education oncardiovasculardisease (CVD) viasmoking, BMI, andalcohol (Carter et al.2019)

Multivariable MR(Burgess andThompson2015)

Used to assess the role ofmultiple correlated exposuresusing genetic variants thatare associated with one ormultiple exposures to estimatethe independent causal effectof each exposure on theoutcome. Can also be adaptedto evaluate mediation (incombination with or separateto two-step MR)

X

Z

G1

G2

YG12

Lipid fractions andcoronary heartdisease (CHD)(Burgess et al. 2014)

Factorial MR/exposureinteractions(Rees et al.2020)

Used to determine the combinedcausal effects of two or morerisk factors for diseasewithin afactorial design

Y

G1 X

Z

XZ

G2

Statin (HMGCR),ezetimibe(NPC1L1) andCHD (Ference et al.2015)

R.C. Richmond and G. Davey Smith

18 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 19: Mendelian Randomization: Concepts and Scope

GWAS, the identification of SNPs and taggingSNPs for each instrument in the outcomeGWAS, the careful harmonization of summarystatistics between exposure and outcomeGWAS, as well as the use of sensitivity analysesto promote evaluation of the MR assumptions.

Methods for Assessing and Accounting forHorizontal Pleiotropy

Violation of the exclusion restriction assumptionvia horizontal pleiotropy is a major threat to thevalidity of MR analyses and so various methodshave been developed in recent years to try toovercome this. These methods (1) can test forpotential pleiotropy (e.g., heterogeneity and out-lier tests); (2) can directly model and correct forpleiotropy (e.g., MR Egger regression [Bowdenet al. 2015]) and MR-TRYX [Cho et al. 2020]);or (3) are “naturally robust” to pleiotropy (e.g.,mode and median estimators [Bowden et al.2016a; Hartwig et al. 2017]). These typically useIVestimates as the basis of the sensitivity analysesand can be used to explore how robust MR find-ings are to the assumption that the genetic instru-ments used have nohorizontal pleiotropic effects.

If the estimate obtained for a causal effect isof a consistent magnitude across multiple inde-pendent variants, then pleiotropy is less likely tobe a concern. However, often effect estimates arenot consistent across independent instruments,with some “outlying” variants having an ob-

served association with the outcome that is sub-stantially different to that expected given theirassociation with the exposure. If the instrumentis valid, it should have an effect on the outcomethat is proportional to the effect on the exposure.Formal tests for examining heterogeneity in-clude Sargan’s test for 2SLS and Cochran’s Qstatistic, Rucker’s Q, and likelihood ratiotests in two-sample MR (Bowden et al. 2018b;Hemani et al. 2018a). For detecting outliers,the following approaches can be considered:leave-one-out analysis, Cook’s distance, studen-tized residuals, Q-contribution, and the MR-PRESSO global and outlier tests (Verbancket al. 2018).

Graphical assessment is also helpful for as-sessing potential pleiotropy. Heterogeneity canbe visualized in scatter plots (Fig. 8), in whichestimates derived from each genetic variant donot align with the regression line (i.e., do notconverge to the same causal estimate), or in forestplots inwhich there is clear variation in the causalestimates obtained from each variant. Funnel plotdisplays of MR estimates of individual geneticvariants against their precision will show asym-metry if some variants have unusually strong ef-fects, indicative of pleiotropy. Leave-one-outplots can be used to assess the influence of indi-vidual outliers; and the Galbraith radial plot canbe considered in place of the scatter plot for de-tecting outliers and influential data points (Bow-den et al. 2018a).

SNP-exposure

Y-interceptdifferent to 0

Pleiotropic SNP Nonpleiotropic SNP

True slope

SN

P-o

utco

me MR-Egger: valid causal estimate

IVW: invalid causal estimate

Figure 8. Inverse-variance weighted (IVW) and Mendelian randomization (MR)-Egger regression.

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 19

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 20: Mendelian Randomization: Concepts and Scope

Whereas random variation could result indifferent effects estimated by the individual vari-ants, the presence of heterogeneity in the causaleffect estimated by individual SNPs could alsoindicate horizontal pleiotropy. The simplestmethod of accounting for this is with the use ofa random effects IVW meta-analysis model(Dudbridge 2020). However, this approach canonly be used where horizontal pleiotropy is “ba-lanced” (i.e., where the random effects have zeroweighted mean) (Hemani et al. 2018a). The firstmethod developed for assessing and counteract-ing the extent of “unbalanced” or “directional”pleiotropy was the application of the Egger re-gression technique to MR analysis (Bowdenet al. 2015). This approach, first introduced inthe context of small-study bias withinmeta-anal-ysis, allows the intercept of the weighted linearregression line of the SNP–outcome on theSNP–exposure association to vary freely (Fig. 8).Directional pleiotropy can be tested by assessingthe extent to which the intercept deviates when itis not constrained to the origin (as in IVW), andthe gradient of the line can be used to provide acausal estimate in the presence of directional plei-otropyusing theMR-Eggerapproach. It is impor-tant to report both estimates.

Additional pleiotropy-robust approaches in-clude the modal and median based estimatorsthat both avoid the contribution of some invalidinstruments (Bowden et al. 2016a; Hartwig et al.2017). Both of these methods can be viewed asimplicit outlier correction techniques, becausethey only allow certain more “reliable” SNPs tocontribute to the overall estimate. Use of aweighted approach for both of these methods istypically advocated tomaximize statistical power.

Additional methods intended to account forpleiotropy include direct outlier removal (e.g.,MR-PRESSO [Verbanck et al. 2018], generalizedsummary MR [GSMR] [Zhu et al. 2018], andCook’s distance [Corbin et al. 2016]), outlierpenalization (e.g., MR-TRYX [Cho et al.2020]), “no-relevance point” approaches includ-ing negative controls (Gage et al. 2017), gener-alized gene–environment interaction models(MRGxE [Spiller et al. 2019]), pleiotropy-robustMR (van Kippersluis and Rietveld 2018), andtechniques that attempt to directly model pleio-

tropic pathways including multivariable MR(Burgess and Thompson 2015), structural equa-tionmodeling (SEM) (Evans et al. 2019), and thedirection of causation approach (MR-DOC)(Evans et al. 2019; Hwang et al. 2020). Thesemethods are described in more detail elsewhere(Hemani et al. 2018a; Burgess et al. 2019; Sloband Burgess 2020).

It should be emphasized that while theseapproaches can relax or bypass the exclusionrestriction assumption of conventional MRanalysis, they in turn come with their own as-sumptions (Table 5; Hemani et al. 2018a). Inaddition, these approaches are typically lesswell powered than the IVW approach. As such,methods for assessing and accounting for plei-otropy should be viewed as sensitivity analysesto conventional approaches such as IVW.

COMMON MISCONCEPTIONS

With rising popularity of the MR approach,which is now becoming a common element ofGWAS papers, there are a number of commonmisconceptions that require debunking to en-sure that the findings from MR analysis are in-terpreted appropriately.

There Are Three Assumptions on Which MRRelies for Estimating Causal Effects

Although there are three core IV assumptionsthat apply to manyMR studies (relevance, inde-pendence, and exclusion restriction), additionalassumptions are needed to quantify the effect orto consider that the study is informative abouteffects that may be produced bymanipulation ofthe exposure. The latter is made under the gene–environment equivalence assumption alreadydiscussed.

Another assumption in instrumental variableanalysis that is relevant for effect estimation, isthe assumption of homogeneity (Swanson 2017).This assumption relates to assessing whether thecausal effect obtained in MR is uniform acrossthe population, and so represents an averagetreatment effect (ATE). For example, if investi-gating the effect of BMI on CVD, we wouldassume that a kg/m2 increase in BMI would el-

R.C. Richmond and G. Davey Smith

20 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 21: Mendelian Randomization: Concepts and Scope

evate risk of CVD irrespective of the person’sgender or age. The homogeneity assumptioncan also be replaced by the less stringent mono-tonicity assumption, which assumes that an in-crease in the number of risk alleles will neverlower the likelihood of exposure (Swanson andHernán 2018) (e.g., a BMI genetic risk scoreshould not increase BMI for some individualsand decrease it for others). If this assumptionholds, then estimation of a local average treat-ment effect (LATE) among those individualswhose exposure level is affected by their geno-type is possible. Such assumptions must bemade for the effect estimate obtained from anyMR analysis to be interpreted as the causal effectof the exposure on the outcome. Although theycannot be explicitly tested, the assumptions canbe interrogated through various means. Onepossibility is to estimate causal effects in differ-ent subpopulations and to evaluate whether theestimated effects differ. Alternatively, as nonho-mogeneity in the genetic variant–exposure asso-ciation would lead to nonhomogeneity in thegenetic variant–outcome association, evaluatingthe variance of exposure and outcome groups bygenotype would provide a test for the presenceand degree of violation of this assumption (Mills

et al. 2020). Large GWAS allow variance to becharacterized well, and so can be interrogated toinvestigate this (Young et al. 2018).

An instance in which it is difficult to obtainrelevant treatment effects fromMRestimates is inthe presence of binary exposures (Burgess andLabrecque 2018). As the assumptions of homo-geneity and monotonicity are less likely to holdwhen interpreting causal estimates with binaryexposures using MR, it is often simpler to reporton the existence and direction (rather than themagnitude) of the causal effect (Burgess and La-brecque 2018). If these assumptions can bemade,there are options for causal estimation with abinary exposure that allow estimates to be con-verted onto amore clinically meaningful scale. Inone-sample data with timed events, it may bepossible in principle to estimate the causal effectof a binary exposure, but this has not yet beendemonstrated. Even if these assumptions can bemade, interpretation is to the liability to the bi-nary exposure, rather than the binary exposureitself (Davey Smith 2019; Richmond and DaveySmith 2019).

Interpretation of causal effects on binary out-comes is also challenging (Palmer et al. 2008).Although it is empirically possible to calculate

Table 5. Additional assumptions of Mendelian randomization (MR) sensitivity analyses

Approach Assumption Description References

Inverse-varianceweighted (IVW)and MR-Eggerregression

No measurement error(NOME)

There is no measurement error in theassociation between the single-nucleotide polymorphism (SNP)and the exposure

Bowden et al.2016b

MR-Egger regression Instrument strengthindependent of direct effect(InSIDE)

The strength of the SNP–exposureassociation should not correlatewith the strength of the pleiotropyeffect

Bowden et al.2015

Modal estimator Zero modal pleiotropyassumption (ZEMPA)

The most common causal effectestimate is a consistent estimate ofthe true causal effect, even if themajority of instruments are invalid

Hartwiget al. 2017

Median estimator Simple median = the causaleffect is provided by themedian SNP estimate

Simple median = at least 50% of theinstruments are valid (i.e., notpleiotropic)

Bowden et al.2016a

Weighted median = the causaleffect is provided by theweighted median SNPestimate

Weightedmedian = at least 50% of theweight in the analysis stems fromvariants that are valid instruments(i.e., not pleiotropic)

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 21

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 22: Mendelian Randomization: Concepts and Scope

causal estimates in a similar manner to those forcontinuous variables, for example, with use of thelog odds scale, theMReffect estimatemayonly beapproximate in the case of a binary outcome(Burgess et al. 2016c). This is because the “non-collapsibility” of odds ratios means that these es-timates may not be constant across strata, and socannot be used to obtain a precise causal effect(Greenland et al. 1999). Alternatively, analysescan be conducted on the risk difference scale,which reduces the risk of bias caused by noncol-lapsibility.

As mentioned, whereas sensitivity analysesmay relax some of the core IV assumptions (e.g.,exclusion restriction in the case of pleiotropy-robust methods), they impose their own set of(albeit weaker) assumptions (Table 5). Two-sample MR also imposes additional assump-tions to the one-sampleMR approach, includingexchangeability of the two samples (i.e., whetherthey are both drawn from the same underlyingpopulation), as well as the assumption that thetwo samples are nonoverlapping (Lawlor 2016).

MR Is Analogous to a Randomized ControlledTrial

An analogy has often beenmade between anMRstudy and an RCT, in which genotypes are said torandomize participants into different levels ofexposure or treatment, independent of con-founding (Davey Smith and Ebrahim 2005). Inparticular, it is this random allocation of geneticvariants from parents to offspring that can beviewed as analogous to an RCT (Davey Smithand Ebrahim 2003; Davey Smith et al. 2020). Of-ten this analogy is useful, particularly when it ispossible to scale causal effect estimates to that ofinterventions, for example, in the case of (retro-spectively) predicting the null effect of seleniumon risk of prostate cancer using MR (RR 1.01[95% CI 0.89, 1.13] per 114 µg/L in MR vs. RR1.04 [95% CI 0.91, 1.19] per 114 µg/L in RCT)(Yarmolinsky et al. 2018). However, the analogyis not perfect because RCTs typically involve in-terventions of short duration, whereas an indi-vidual’s genotype could reflect lifelong exposures,time-dependent or critical period effects, as wellas potential developmental compensation that

may arise over time (Davey Smith and Ebrahim2003; Holmes et al. 2017). Causal estimates ob-tained from MR analyses may therefore differ inmagnitude to those seen or anticipated in anRCT, although can also be useful in providingadded value regarding life-course effects. For ex-ample, knowledge that cholesterol lowering inearlier life is likely to be important for preventingCVD (Ference et al. 2014). However, inMR anal-ysis conducted at a population, rather than fam-ily-level, the analogy with RCTs is only approx-imate (Davey Smith and Ebrahim 2003), asdescribed below.

Genetic Variants Are Not Influenced byConfounding Factors

Under Mendel’s laws of segregation and inde-pendent assortment, it is assumed that geneticvariants should be inherited independently ofother genetic and environmental factors. Al-though population-level genetic variants aretypically much less associated withmany poten-tial confounders than directly measured expo-sures (Davey Smith et al. 2007), the randominheritance of genetic variants from parents tooffspring does not guarantee that genetic vari-ants and confounders will be independent insamples of unrelated individuals. For example,an obvious violation of this is created because ofpopulation stratification that can introduce con-founding of genotype–disease associations byfactors related to subpopulation groupmember-ship within the overall population. This mightoccur even within groups of relatively homoge-neous ancestry, as a result of underlying sub-structure (Abdellaoui et al. 2019; Haworthet al. 2019; Lawson et al. 2020), or also at thefamily level, for example, caused by assortativemating (Hartwig et al. 2018). One potential vi-olation of the second IV assumption is that of“dynastic” or “genetic nurture” effects, in whichparental genotypes affect the offspring via theenvironment that parents create for their off-spring by affecting the parental phenotype (Da-vies et al. 2019; Brumpton et al. 2020). As aresult, a genetic instrument in the offspringwill be correlated with the environment createdby the parents. One solution to this problem is to

R.C. Richmond and G. Davey Smith

22 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 23: Mendelian Randomization: Concepts and Scope

performMR analysis between siblings who havea shared family background and whose geno-type differences will not be confounded by pa-rental or family factors (within-familyMR) (Da-vies et al. 2019a; Brumpton et al. 2020). In theinitial presentation of MR, it was suggested thatthe most robust form would be within families(Davey Smith and Ebrahim 2003) and with in-creasing sources of data for family-based studies,this approach offers potential for elucidatingcausal effects for those traits that are most likelyto be influenced by dynastic effects (e.g., socio-economic and behavioral factors).

The Exclusion Restriction AssumptionIs Violated because of Pleiotropy

The exclusion restriction assumption is some-times referred to as the “no pleiotropy assump-tion,” although it can be violated in a number ofother ways, including timing effects, interac-tions, reverse causation, collider bias, and LD,as previously described (VanderWeele et al.2014).

In particular, the following sources of collid-er bias may induce spurious associations be-tween a genetic variant and factors other thanthe exposure of interest that may influence theoutcome (Fig. 5):

• The use of instruments generated from aGWAS that adjusts for another phenotype;

• Ascertainment bias in case-control studies;

• Selection and loss to follow-up bias in cohortstudies;

• Survival bias when investigating later-life out-comes; and

• Evaluation of disease progression in a case-only setting.

When there are only moderate effects of aphenotype on selection, bias is generally small(Gkatzionis and Burgess 2019). However, wherecollider bias is likely to exist, it is recommendedthat sensitivity and simulation studies are car-ried out to evaluate the extent to which biasmight distort estimates (Hughes et al. 2019).In addition, alternative approaches such as in-

verse-probability weighting, modeling compet-ing risks, adjusting for index event bias (Dud-bridge et al. 2019; Mahmoud et al. 2020), andthe use of negative control outcomes (Sandersonet al. 2021) can also be considered.

Another way in which the exclusion restric-tion may be violated is when genetic instru-ments are in LD with other variants thathave an effect on the outcome not via the expo-sure. In this instance, genetic colocalization ap-proaches (Giambartolomei et al. 2014) may beused that attempt to evaluate whether two traitsshare the same causal variant at a particular lo-cus, and thus contribute to evaluation of wheth-er the exclusion restriction assumption is likelyto hold.

Reverse Causation Is Not an Issuefor MR

Because germline genotypes are fixed at concep-tion, they cannot be influenced by reverse cau-sation, and therefore it is often claimed that re-verse causation is not an issue forMR. Althoughthis is true, a scenario in which reverse causationmight pose a problem for MR is where the ge-netic instruments for two traits (X and Y), GX

and GY, are not well characterized (Table 4; Da-vey Smith andHemani 2014). If trait X influenc-es trait Y, then a GWAS with adequate statisticalpower will identify a genetic variant with its pri-mary influence of trait X as being associatedwithtrait Y (for example, the CHRNA5 variant relat-ed to smoking intensity has been identified atgenome-wide significance in relation to lungcancer [Amos et al. 2008]). This will lead tospurious conclusions if this variant were thenused as a genetic instrument for trait Y (e.g.,that lung cancer causes smoking), that is, bymisspecifying the primary phenotype. One wayto minimize this problem is to ensure that thetwo sets of instruments are independent of eachother by excluding the genetic variants that theyhave in common. However, this may also in-crease the risk of type II error if variants areexcluded from the genetic instruments that re-flect vertical pleiotropy (e.g., if the CHRNA5variant were removed from both the smokingand lung cancer instruments). In situations

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 23

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 24: Mendelian Randomization: Concepts and Scope

where an outcome Y influences trait X, for ex-ample, developing coronary heart disease in-creases CRP levels, it may be possible to investi-gate the number of “cases” of the outcome in thesample used to run the exposure GWAS. Here, alow prevalence of the outcome in the exposuresample would minimize risk of such a reversesignal. Another approach that can be used inthis context is the Steiger method applied toMR (Hemani et al. 2017b). This has been devel-oped to investigate whether the genetic variantsbeing used to instrument trait X are morestrongly correlated with trait Y than X, in whichcase they will be excluded from the instrumentfor X.

One way in which this kind of reverse cau-sation can be leveraged within an MR study iswith the notion of “reverse MR” (Holmes andDavey Smith 2019). Here, a disease-associatedgenetic risk score would be expected to associatewith causes of the disease (e.g., a genetic riskscore for lung cancer would be associated withsmoking because the CHRNA5 variant is in-cluded in the score). If the outcome of interestcannot plausibly cause the exposure being con-sidered (for example, in a subgroup in which theoutcome is not prevalent, e.g., among youngstudy participants in the case of lung cancer),then this situation can be used to confirm causalexposures. This principle has been applied toinvestigate perturbations in proteins andmetab-olites in relation to later cardiometabolic diseaserisk (Battram 2018; Bell et al. 2020; Ritchie et al.2021). However, alternative scenarios such aspleiotropy, a causal effect of disease liability, orearly stages of disease that influence the exposure(e.g., prediagnostic cholesterol lowering in cancer[Ahn et al. 2009]) also need to be considered.

OVERCOMING LIMITATIONS

Limitations of the MR approach have been dis-cussed extensively elsewhere (Davey Smith andEbrahim 2003, 2004; Davey Smith and Hemani2014; VanderWeele et al. 2014; Haycock et al.2016; Zheng et al. 2017; Davey Smith et al.2020). Although some of the early concerns ofthe MR approach such as a lack of genetic in-struments, horizontal pleiotropy, and low power

have been ameliorated with larger data sets andnovel methods, some limitations remain andnew constraints have been recognized.

Lack of Reliable Polymorphisms for StudyingModifiable Exposures of Interest

Genetic instruments extracted from a singlegenewith awell-understood biological function,and therefore most likely to meet the MR as-sumptions, are not available for all exposures.Although the proliferation of GWAS has in-creased the availability of genetic variants touse as potential instruments, the role of the var-iants identified often requires careful consider-ation to assess their validity in MR. Polygenicinfluences onmost phenotypes imply individualSNPs of small effect size, each with a marginalcontribution to the variance explained in a trait.This is both a threat to the exclusion restrictionassumption and can lead to problems of weakinstruments, unless the variants can be com-bined into a genetic risk score or applied in largesample sizes.

The use of genetic variants may sometimeslead to counterintuitive results. For example,while it would be expected that longer-term IL-6 levels would elevate the risk of CHD (Daneshet al. 2008), genetic variation in the IL-6 receptorthat increases circulating IL-6 levels has actuallybeen found to decrease risk of CHD (Swerdlowet al. 2012). This can be explained because car-riers of this polymorphism exhibit higher circu-lating IL-6 levels but reduced membrane-boundIL-6, with reduced IL-6 signaling, which in turnreduces risk of CHD (Swerdlowet al. 2012). Sim-ilarly, if genetic polymorphisms are associatedwith multiple aspects of a single trait, for exam-ple, variation inCHRNA5 that is associated withdifferent dimensions of smoking behavior (e.g.,number of puffs per cigarette, depth of inhala-tion) (Lassi et al. 2016), this can also lead toproblems in the interpretation of causal effectsfor any particular dimension of the trait.

A further limitation posed by a lack of un-derstanding of genetic variants is that of “con-tamination” of the instrument by variants asso-ciated with traits “upstream” from the trait ofinterest, leading to misspecification of the pri-

R.C. Richmond and G. Davey Smith

24 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 25: Mendelian Randomization: Concepts and Scope

mary phenotype (Davey Smith and Hemani2014). This is a particular concern as GWASsample sizes increase, because it increases thepower to detect genetic variants that act indi-rectly on the trait of interest. For example, ge-netic variants with a primary influence on BMIappear among the top hits in GWAS of C-reac-tive protein (Dehghan et al. 2011), but shouldnot be used as instruments for CRP levels. Thishas already been discussed within the context ofa bidirectional relationship, in which geneticvariants that influence the exposure through re-verse causemay be picked up. In addition, itmayreintroduce confounding if genetic variants as-sociated with confounders are picked up as ge-netic variants for the exposure. For example, ge-netic variants identified in relation to drinkingbehavior have been found to be strongly relatedto socioeconomic factors (Rosoff et al. 2021).This may lead to confounding in MR studies ifthe genetic variants used to instrument the ex-posure (here drinking behavior) are primarilyassociated with factors upstream from the expo-sure (e.g., educational attainment) and may ex-plain the opposite genetic correlations observedfor alcohol quantity and intake frequency(which are differentially associated with educa-tional attainment) in relation to various healthoutcomes in a recent study (Marees et al. 2020).MultivariableMR can be used in this instance toestimate the “true” effect of the phenotype beinginvestigated, for example, accounting for educa-tional attainment to estimate direct effects ofdrinking behavior, but this requires knowingthe structure of the underlying phenotypes.

Horizontal Pleiotropy

There is a clear trade-off between includingmore variants in a genetic instrument to maxi-mize variance explained in the exposure and theincreased risk of pleiotropy by including morepoorly characterized variants. However, the po-tential advantage of including more variants isthe ability to use the suite of approaches alreadydescribed. These approaches relax the exclusionrestriction assumption but each has its own setsof assumptions (Table 5). If results are largelyconsistent across methods that have orthogonal

biases (Munafo et al. 2020), there can be moreconfidence in drawing robust conclusions. Al-ternatively, in the presence of inconsistentresults across methods, a Bayesian model aver-aging framework may be used as a basis for effi-cient estimation in the presence of pleiotropy(Shapland et al. 2020).

Lack of Independent Instruments

Although it is not necessary that a genetic vari-ant used to instrument a modifiable exposure ofinterest in MR should be the causal variant forthat trait, it is important to assess whether mul-tiple SNPs used in the instrument are likely to betagging the same causal variant. This is becauseincluding correlated variants will typically leadto erroneous precision in the causal estimateobtained. LD-based clumping or pruning canbe used to exclude variants in strong LD, whichcan ensure independence of the instruments.Another approach is to use a weighted general-ized linear regression method that takes into ac-count correlation of multiple nonindependentSNPs (Burgess et al. 2016b). This approach isparticularly useful when assessing causality ofmolecular phenotypes (e.g., DNA methylation,gene expression, protein levels) that are oftencharacterized by few independent instrumentsin cis (genetic variants located close to the targetlocus/gene). A method that is often used in con-junction with MR when there is, for example,one independent variant to instrument a molec-ular trait in cis, is the approach of genetic colo-calization mentioned previously. Alternatively,the inclusion of trans instruments (genetic var-iants more distal to the target locus/gene) in theMR analysis can be considered, although it ismore likely that these variants will violate theexclusion restriction criteria via horizontal plei-otropy.

Need for Optimal Precision

A failure to recognize the importance of bothsample size and instrument strength in MRstudies for the detection of expected effects hasin the past led to uninformative findings thatlack adequate precision to support or reject hy-

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 25

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 26: Mendelian Randomization: Concepts and Scope

potheses about causal effects. Genotyping inlarge-scale epidemiological studies as well asthe availability of GWAS summary statisticshas vastly improved the power of MR studiesand has revealed an increasing number of genet-ic variants that explain a larger proportion oftrait variance. Nonetheless, it is recommendedthat power calculations for MR studies be con-ducted a priori (Brion et al. 2013). Equivalencetesting may also be used to evaluate observedeffects within a priori defined equivalencebounds, to distinguish effects that are largeenough in magnitude to be deemed robust.

It is important to recognize that several of theextensions of the conventional MR approach,such as factorial MR, multivariable MR, sensitiv-ity analyses such asMR-Egger, andwithin-familyMR analyses all suffer from precision constraintsthat should be taken into consideration. This canbe evaluated through additional tests, for exam-ple, with the Sanderson-Windmeijer conditionalF-statistic in the case of multivariable MR, whichcan be used to assess instrument strength formultiple exposures when estimated jointly (San-derson et al. 2019, 2020). Furthermore, precisioncan be limited in specific contexts, for example,when evaluating intergenerational causal effectsthat have previously been limited to studies withgenetic data available in two generations (Lawloret al. 2017). In this context, new structural equa-tion model (SEM) approaches have been devel-oped that allow effects of parental exposures onoffspring outcomes to be inferred without havingto have intergenerational genetic data, and whichleverage power from large GWAS summary datain a two-sample approach (Evans et al. 2019;Hwang et al. 2020).

Weak Instrument Bias

Methods to overcome potential weak instru-ment bias in MR include the use of SIMEX-cor-rected estimates when the assumption of nomeasurement error (NOME) cannot be met(Bowden et al. 2016b), the use of robust-adjust-ed profile scores (Zhao et al. 2018; Wang et al.2021b), as well as a weak instrument and pleiot-ropy robust estimation method for use in mul-tivariable MR (Sanderson et al. 2020).

Winner’s Curse

It is recommended that the GWAS discoverysample is independent of the sample(s) used toconduct the MR analysis (Haycock et al. 2016).Ideally, the genetic variants used as instrumentsin an MR analysis will also have been replicatedin an independent sample to further minimizerisk of Winner’s curse. However, there is a cleartrade-off between maximizing sample sizes ofGWAS for discovery of genetic variants andavoiding the problem of Winner’s curse by re-taining a sample for replication and implemen-tation of the MR approach. In the largest datasets, it may be possible to perform a split-sample(Angrist and Krueger 1995b) or jackknife anal-ysis (Angrist et al. 1999), whereby the data set ispartitioned to avoid problems of sample overlapand Winner’s curse.

Canalization and Time-Varying Effects

The notion of canalization or developmentalcompensation is a potential limitation of MRforwhich there is no simple empirical assessment(Davey Smith and Ebrahim 2003). It refers to thebuffering of genetic effects during developmentthat may bias MR estimates and vitiate gene–environment equivalence (i.e., that perturbationscaused by genotype have the same downstreameffects as if they were caused by modifiable expo-sures) (Davey Smith 2012a). Canalization is awidespread phenomenon in gene knockout stud-ies (Davey Smith and Ebrahim 2003), although itis currently unclear whether the generally smallphenotypic differences induced by commonfunctional polymorphisms are sufficient to in-duce compensatory responses. A related consid-eration is the often-stated assumption that genet-ic variants have lifelong effects, which has beenused previously to explain large point estimatesobtained from MR compared with other epide-miological approaches (Ference et al. 2012).There are clear examples of MR in which expo-sures are time limited, and in which canalizationis therefore less likely to be an issue. For example,when assessing causal effects of exposures inutero, the maternal genetic variants being usedas instruments will only have an effect on the

R.C. Richmond and G. Davey Smith

26 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 27: Mendelian Randomization: Concepts and Scope

offspring viamechanisms during the intrauterineperiod (Lawlor et al. 2017), and when assessingcausal effects of exposures that occur predomi-nantly in adulthood (e.g., alcohol consumption,childbearing), the genetic variants will only havean effect after the developmental stage in whichcanalization is most likely to occur.

Although it is often difficult to model suchtime-varying effects in MR, they can bias causalestimates (Labrecque and Swanson 2019) andmay have implications for determining optimaltiming of interventions. New GWAS studieshave started to reveal genetic variants with dis-tinct timing effects (Couto Alves et al. 2019)that may be leveraged to investigate timevarying effects in an MR context. For example,a recent multivariable MR study used geneticvariants with distinct effects on body size inchildhood and adulthood to separate the causaleffects of this trait at two stages of the life courseon risk of chronic disease (Richardson et al.2020b).

FUTURE PROSPECTS

Although the scope of MR has grown massivelyin recent years, there are several priority researchareas that have not yet been fully evaluated.Withincreasing methodological and bioinformaticinnovation, there is great potential to makeprogress in these areas. However, subject-specif-ic knowledge, methodological insight, and im-proved reporting of MR findings are required toensure the robustness and reproducibility offindings.

Extending Clinical Applicability

Identifying Factors Underlying DiseaseProgression

To date, the majority of GWAS have sought toidentify genetic variants associated with risk ofdisease occurrence. Such variants are informa-tive for disease prevention, but not necessarilyfor treatment aimed at influencing disease pro-gression because the same genetic factors willnot necessarily influence both disease onsetand progression of the disease (Davey Smith et

al. 2017; Paternoster et al. 2017). In 2017, just 8%of genetic association hits in the GWAS Cataloghad attempted to identify variants associatedwith disease progression or severity, and mostwith modest sample size (Paternoster et al.2017). Nonetheless, an increasing number ofprogression GWAS are being carried out, instudies such as the Genetics and SubsequentCoronary Heart Disease Consortium (GE-NIUS-CHD; CHD, n > 270 k cases) (Patel et al.2019), the Breast Cancer Association Consor-tium (BCAC; breast cancer, n > 47 k cases) (Es-cala-Garcia et al. 2019), and the Prostate CancerAssociation Group to Investigate Cancer Asso-ciated Alterations in the Genome consortium(PRACTICAL; prostate cancer, n > 45 k cases)(Szulkin et al. 2015), which should allow MRstudies of progression to be conducted.

However, determining true causal effects ondisease progression using MR in case-only datasets is made more challenging because of theissue of collider bias (Paternoster et al. 2017;Hughes et al. 2019). In particular, when a groupof participants are selected based on certaincharacteristics (e.g., presence of disease), thiswill introduce a spurious association betweenindependent risk factors influencing selectionthat will then distort the relationships betweeneach risk factor and disease progression (Fig. 5).This is a threat to both conventional observa-tional associations and to studies of genetic in-fluences, with confounding being reintroducedthat can lead to violation of the MR assump-tions. Methods for alleviating such biases arecurrently indevelopment (Dudbridge et al. 2019;Hughes et al. 2019;Mahmoud et al. 2020). Thesemethods attempt to estimate the bias adjustmentfactor based on estimates for the association ofgenetic variants with both incidence and pro-gression.

Drug Trials

Whereas RCTs remain the gold standard ap-proach for testing the efficacy and safety of anew drug, RCTs can be complemented by MRin terms of prioritizing drug targets, predictingthe outcome of trials and optimizing trial design(Ference et al. 2021; Schmidt et al. 2021). Hu-

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 27

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 28: Mendelian Randomization: Concepts and Scope

man genetic evidence is a strong predictor ofdrug success (Nelson et al. 2015) andMR studiesof proteins and metabolites are becoming fun-damental in drug discovery and development(Ference et al. 2021; Gill et al. 2021; Holmeset al. 2021; Schmidt et al. 2021). In particular,cis-acting variants may serve as genetic proxiesfor protein drug targets, and selection of suchvariants may be optimized to evaluate the po-tential causal relationships between those drugtargets and a range of diseases (Sun et al. 2018;Zheng et al. 2019; Schmidt et al. 2020). A prom-ising application ofMR is in the prioritization oftargets for disease prevention, for example, re-vealing the role of PCKS9 inhibitors for reducingLDL cholesterol (Ference et al. 2016), as well asdeprioritizing interventions based on null re-sults from MR, for example, showing that CRPconcentration is unlikely to be a causal factor inCHD (Wensley et al. 2011).MRcan also indicatepotential side effects of drugs, including the el-evated risk of type 2 diabetes with use of somelipid-lowering drugs (Ference et al. 2012), whichhas also been shown in the case of statin trials(Swerdlow et al. 2015). MR has also been usedto identify potential repurposing opportunitiesof existing drugs (Li et al. 2020). Most recent-ly, genetic variation in IL6R has been associ-ated with lower risk of hospitalization fromCOVID-19, which is in line with findings fromIL-6R therapeutic inhibition trials (Bovijn et al.2020). In addition to predicting the consequenc-es of pharmacotherapy, MR has the potential tobe used to optimize trial design, for example, inrelation to segmenting patients who are mostlikely to benefit from the drug or giving insightsinto the timing of drug initiation (Ference et al.2012, 2021). However, this typically requires theavailability of prospective data sets of target pop-ulations with genetic data available (Schmidtet al. 2021).

Scaling up Feasibility Trials Using MR toRobustly Infer Causal Effects on Clinical EndPoints

A novel application of MR has recently beenproposed that may enhance the value of feasi-bility studies of interventions (Sandu et al.

2019). This approach uses MR to predict causaleffect of these interventions on long-term clin-ical outcomes via short-term intermediate bio-markers. Feasibility trials are small-scale studiesthat aim to assess the practicality and acceptabil-ity of implementing an intervention in a clinicalsetting. These are not usually powered to evalu-ate effects on clinical outcomes and are not typ-ically followed up for enough time to assesslong-term outcomes. However, intermediatemeasures may be collected that can serve as sur-rogate end points in such studies. These mea-sures may lie on the causal pathway to clinicaloutcomes and MR can be used in this context toappraise the causal effect of those intermediatemeasures altered by the intervention on long-term outcomes, using a larger study base tothe feasibility study in question to bolster power.Whereas these intermediate biomarkers serve assurrogate end points, which are well known tohave limitations (Prentice 1989), the advantageof using MR in this context is that it is possibleto explore both expected and unanticipatedeffects of manipulating an intermediate traiton a range of outcomes, including relativelyrare ones, and with larger sample sizes. Further-more, the approach may be used to uncoverother causal intermediates that could validatechoice of surrogate markers for use in futurefeasibility studies.

Uncovering Molecular Mechanisms

Building on the success of GWAS and the avail-ability of cost-effective and robust technologiesis the scaling up of other -omic technologieswithin population health. This is largely con-cerned with understanding how gene regulatorymechanisms or gene products influence health-related outcomes and is useful for investigatingthemolecular pathways thatmay underpin caus-al effects. In particular, such -omic measures areinfluenced by many environmental and endog-enous factors and so can be considered as inter-mediate phenotypes through which causal ef-fects may be investigated (Davey Smith andHemani 2014).

Of particular utility are large-scale -omicscans for formulating novel hypotheses on bio-

R.C. Richmond and G. Davey Smith

28 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 29: Mendelian Randomization: Concepts and Scope

logical processes underpinning complex traitsand diseases. However, in contrast to germlinegenetic variation, -omic signatures are largelyphenotypic, and are therefore subject to thesame potential problems of confounding andreverse causation that afflict conventional epide-miology (Relton and Davey Smith 2010). MR isbeing increasingly applied to elucidate causalityfor a range of molecular data, including epige-netics, transcriptomics, gene expression, metab-olomics, and proteomics (Porcu et al. 2021). Forthis, approaches such as two-step, network, andmultivariable MR are of particular utility fordetermining whether these markers lie on thecausal pathway between risk factors and disease(Table 4). Such approaches are being applied inincreasingly complex and innovative ways, toconsider the causal nature of a large number ofmolecular markers (Wahl et al. 2017), integrat-ing several types of -omics data to evaluate mo-lecular pathways (Mendelson et al. 2017), as wellas considering the tissue-specific nature of caus-al effects (Taylor et al. 2019; Richardson et al.2020c).

Increasing Ethnic Diversity in MR Studies

Approximately two-thirds of all previousGWAS have been performed in individuals ofEuropean ancestry (Duncan et al. 2019). Dif-ferences in allele frequencies and LD patternsbetween populations threaten the validity ofidentified genetic variants and therefore trans-ferability of MR findings to other populations(Martin et al. 2017; Duncan et al. 2019). Al-though restricting GWAS and MR analysis tomore homogeneous ancestries can help to re-duce the threat of population stratification, oth-er approaches such as Bayesian mixture modelanalysis can be taken to overcome this limita-tion while ensuring greater diversity in geneticstudies (Loh et al. 2015). Greater diversity isimportant as it allows for improved causal in-ference of risk factors and the clinical transla-tion of genetic findings in other ethnic groups.Furthermore, allowing for diversity in MR stud-ies can help to identify genetic variants that aretypically rare in Europeans, where more com-mon variation in other ethnic groups can bol-

ster the power of MR for determining causaleffects (Kang et al. 2013; Millwood et al.2019). Some large-scale, non-European bio-banks are available for genetic analysis (Table2), with a recent GWAS of∼200,000 individualsin Biobank Japan identifying a number of novelloci important for elucidating biology in EastAsian populations (Ishigaki et al. 2020).

Methodological Innovations

A number of methodological extensions of theoriginal MR approach have been discussedthroughout this review and an increasing num-ber are being developed. In particular, severalrecently developed whole-genome-based ap-proaches, including Genetic Instrumental Vari-able regression (GIV) (DiPrete et al. 2018) andCausal Analysis Using Summary Effect esti-mates (CAUSE) (Morrison et al. 2020), areseemingly less vulnerable to environmental con-founders that are correlated with genes thanmany of the methods already described (Figs. 3and 4). Whereas the previously outlined meth-ods are specifically designed to account forhorizontal pleiotropy of the genetic instru-ments, GIV and CAUSE make use of fullGWAS data to also account for other sourcesof bias. For example, if the primary phenotypehas been misspecified, the Instrument StrengthIndependent of Direct Effect (InSIDE) assump-tion of theMR-Egger sensitivity analysis is likelyto be violated, and so the CAUSE method maybe used as an additional test to determine thepresence of correlated pleiotropy in this in-stance.

Although the range of methods now avail-able allows for rigorous analysis and robustcausal inference to be made, it can be difficultto navigate the various approaches and appraisetheir relative strengths and limitations. Howev-er, it has been emphasized that the best choice ofmethod can often be context specific (Koellingerand de Vlaming 2019).More generally, applyinga number of approaches each with orthogonalbiases can be helpful in “triangulating” evidenceif the estimates obtained from the approachesconverge on a similar causal estimate (Munafoet al. 2020).

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 29

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 30: Mendelian Randomization: Concepts and Scope

Automation

The development of bioinformatic platformsand software supports the systematic applicationof MR (Hemani et al. 2018b; Richardson et al.2020a). There is scope to automate MR analysesto evaluate a multitude of causal relationships ina time-efficient manner. This can aid in the ac-celerated identification, prioritization, and eval-uation of intervention targets, for example,through phenome-wide association studies(MR-PhEWAS) (Millard et al. 2015; Richardsonet al. 2019), and the examination of causality inincreasingly complex networks with the integra-tion of molecular data (Hemani et al. 2017a).However, limitations of these agnostic ap-proaches include the multiple testing burdenimposed as well as the possibility of false posi-tives, which require careful follow-up in terms ofevaluating patterns of bias and ensuring robust-ness of findings to the various assumptions. Al-though machine learning and Bayesian modelalgorithms have been developed to help selectthe most appropriate model for evaluation (He-mani et al. 2017a; Dudbridge 2020; Howey et al.2020; Shapland et al. 2020; Zuber et al. 2020),users should be careful that the use of automa-tion and data repositories do not trivialize theanalysis being conducted and interpretation ofresults (Burgess and Davey Smith 2019).

Improving Reproducibility and Reportingof MR Results

The relative ease at which MR analysis can nowbe performed can also threaten the design, con-duct, and reporting of the approach. This maylead to spurious and/or nonreproducible resultsand may encourage data fishing or the selectivecherry-picking of findings, which can lead tostudy bias in the literature. Guidelines that de-scribe and emphasize the importance of analyti-cal choice considerations and appraise the trans-parencyofMR reporting should help tomaintainand improve the quality of MR studies being per-formed (Davies et al. 2018; Burgess et al. 2019;Davey Smith et al. 2019). Code sharing and im-proved reproducibility of findings, for example,emulating recommendations in GWAS to pro-

vide independent replication before publication,should also be encouraged.

CONCLUSIONS

This paper provides an overarching summary ofthe Mendelian randomization approach, whichuses genetic variants reliably related to modifi-able exposures to provide a more robust under-standing of the influence of these exposures ondisease-relevant outcomes. The development ofcomputational tools and availability of largeGWAS data sets has enabled the automation ofMR analyses for evaluating amultitude of causalrelationships in a time-efficient manner, pre-dominantly via the two-sample MR approach.This has led to the rapid expansion of MR pub-lications and is accelerating the identification,prioritization, and evaluation of interventiontargets, the detection of novel causal relation-ships and the integration of molecular data toexamine causality in increasingly complex net-works. However, as MR is increasingly easy toimplement, it may lead to the proliferation ofpoorly thought-out and conducted studies. Itis therefore important that anyone applyingthe approach is well versed in its assumptionsand limitations. We have discussed the currentstate of the field, highlighting current best prac-tice methodology, methods of assessing the MRassumptions, attempts to overcome potentialpitfalls, and some exciting future prospects. Sev-eral of the other papers in this collection elabo-rate on some of the novel methodological ap-proaches, including multivariable MR andthe use of MR for assessing mediation (Sander-son 2021), polygenic MR methods for assessingpleiotropy (Dudbridge 2020), as well as within-family MR methods (Hwang et al. 2020). Otherpapers describe the application of the approachfor extending clinical applicability (Ference et al.2021; Schmidt et al. 2021) and uncovering mo-lecular mechanisms (Porcu et al. 2021).

ACKNOWLEDGMENTS

We thank Frank Dudbridge, Floriaan Schmidt,Jean-Baptiste Pingault, Fernanda Morales Ber-stein, Grace Power, and Shah Ebrahim for their

R.C. Richmond and G. Davey Smith

30 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 31: Mendelian Randomization: Concepts and Scope

thoughtful comments on a previous version ofthe paper. We also thank Fernando Hartwig forhis help in producing Figure 7. This work wassupported by the MRC Integrative Epidemiolo-gy Unit which receives funding from the UKMedical Research Council and the Universityof Bristol (MC_UU_00011/1). R.C.R. is a dePass Vice Chancellor's Research Fellow at theUniversity of Bristol.

REFERENCES�Reference is also in this collection.

Abdellaoui A, Hugh-Jones D, Yengo L, Kemper KE, NivardMG, Veul L, Holtz Y, Zietsch BP, Frayling TM,Wray NR,et al. 2019. Genetic correlates of social stratification inGreat Britain. Nat Hum Behav 3: 1332–1342. doi:10.1038/s41562-019-0757-5

Ahn J, LimU,Weinstein SJ, SchatzkinA,Hayes RB, VirtamoJ, Albanes D. 2009. Prediagnostic total and high-densitylipoprotein cholesterol and risk of cancer. Cancer Epide-miol Biomarkers Prev 18: 2814–2821. doi:10.1158/1055-9965.EPI-08-1248

Ames BN. 1999. Cancer prevention and diet: help from sin-gle nucleotide polymorphisms. Proc Natl Acad Sci 96:12216–12218. doi:10.1073/pnas.96.22.12216

Amos CI, Wu XF, Broderick P, Gorlov IP, Gu J, Eisen T,Dong Q, Zhang Q, Gu XJ, Vijayakrishnan J, et al. 2008.Genome-wide association scan of tag SNPs identifies asusceptibility locus for lung cancer at 15q25.1. Nat Genet40: 616–622. doi:10.1038/ng.109

Angrist JD, Imbens GW. 1995a. Two-stage least squares es-timation of average causal effects in models with variabletreatment intensity. J Am Stat Assoc 90: 431–442. doi:10.1080/01621459.1995.10476535

Angrist JD, Krueger AB. 1995b. Split-sample instrumentalvariables estimates of the return to schooling. J Bus EconStat 13: 225–235.

Angrist JD, ImbensGW,KruegerAB. 1999. Jackknife instru-mental variables estimation. J Appl Econom 14: 57–67.doi:10.1002/(SICI)1099-1255(199901/02)14:1<57::AID-JAE501>3.0.CO;2-G

Battram T, Hoskins L, Hughes DA, Kettunen J, Ring SM,Davey Smith G, Timpson NJ. 2018. Coronary artery dis-ease, genetic risk and the metabolome in young individ-uals.Wellcome Open Res 3: 114. doi:10.12688/wellcomeopenres.14788.1

Bell JA, Carslake D, Wade KH, Richmond RC, Langdon RJ,Vincent EE, Holmes MV, Timpson NJ, Davey Smith G.2018. Influence of puberty timing on adiposity and car-diometabolic traits: a Mendelian randomisation study.PLoS Med 15: e1002641. doi:10.1371/journal.pmed.1002641

Bell JA, Bull CJ, Gunter MJ, Carslake D, Mahajan A, DaveySmith G, Timpson NJ, Vincent EE. 2020. Early metabolicfeatures of genetic liability to type 2 diabetes: cohort studywith repeated metabolomics across early life. DiabetesCare 43: 1537–1545. doi:10.2337/dc19-2348

Bjorngaard JH, Gunnell D, Elvestad MB, Davey Smith G,Skorpen F, Krokan H, Vatten L, Romundstad P. 2013.The causal role of smoking in anxiety and depression: aMendelian randomization analysis of the HUNTstudy. Psychol Med 43: 711–719. doi:10.1017/S0033291712001274

Bovijn J, Lindgren CM, Holmes MV. 2020. Genetic variantsmimicking therapeutic inhibition of IL-6 receptor signal-ing and risk of COVID-19. Lancet Rheumatol 2: e658–e659. doi:10.1016/S2665-9913(20)30345-3

Bowden J, Davey Smith G, Burgess S. 2015. Mendelian ran-domization with invalid instruments: effect estimationand bias detection through Egger regression. Int J Epide-miol 44: 512–525. doi:10.1093/ije/dyv080

Bowden J, Davey Smith G, Haycock PC, Burgess S. 2016a.Consistent estimation in Mendelian randomization withsome invalid instruments using a weighted median esti-mator. Genet Epidemiol 40: 304–314. doi:10.1002/gepi.21965

Bowden J, DelGrecoMF,Minelli C,Davey SmithG, SheehanNA, Thompson JR. 2016b. Assessing the suitability ofsummary data for two-sample Mendelian randomizationanalyses using MR-Egger regression: the role of the I2

statistic. Int J Epidemiol 45: 1961–1974.Bowden J, Spiller W, Del Greco MF, Sheehan N, Thompson

J, Minelli C, Davey Smith G. 2018a. Improving the visu-alization, interpretation and analysis of two-sample sum-mary data Mendelian randomization via the radial plotand radial regression. Int J Epidemiol 47: 2100–2100.doi:10.1093/ije/dyy265

Bowden J, Hemani G, Davey Smith G. 2018b. Invited com-mentary: detecting individual and global horizontal plei-otropy inMendelian randomization-a job for the humbleheterogeneity statistic? Am J Epidemiol 187: 2681–2685.

Brion MJ, Shakhbazov K, Visscher PM. 2013. Calculatingstatistical power in Mendelian randomization studies.Int J Epidemiol 42: 1497–1501. doi:10.1093/ije/dyt179

Brumpton B, Sanderson E, Heilbron K, Hartwig FP, Harri-son S, VieGÅ, ChoY,Howe LD,HughesA, BoomsmaDI,et al. 2020. Avoiding dynastic, assortative mating, andpopulation stratification biases in Mendelian randomiza-tion through within-family analyses. Nat Commun 11:3519. doi:10.1038/s41467-020-17117-4

Burgess S, Davey Smith G. 2019. How humans can contrib-ute to Mendelian randomization analyses. Int J Epidemiol48: 661–664. doi:10.1093/ije/dyz152

Burgess S, Labrecque JA. 2018. Mendelian randomizationwith a binary exposure variable: interpretation and pre-sentation of causal estimates. Eur J Epidemiol 33: 947–952. doi:10.1007/s10654-018-0424-6

Burgess S, Thompson SG. 2011. Bias in causal estimatesfromMendelian randomization studies with weak instru-ments. Stat Med 30: 1312–1323. doi:10.1002/sim.4197

Burgess S, Thompson SG. 2013. Use of allele scores as in-strumental variables for Mendelian randomization. Int JEpidemiol 42: 1134–1144. doi:10.1093/ije/dyt093

Burgess S, Thompson SG. 2015. Multivariable Mendelianrandomization: the use of pleiotropic genetic variants toestimate causal effects. Am J Epidemiol 181: 251–260.doi:10.1093/aje/kwu283

Burgess S, Freitag DF, KhanH, GormanDN, Thompson SG.2014. Using multivariable Mendelian randomization to

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 31

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 32: Mendelian Randomization: Concepts and Scope

disentangle the causal effects of lipid fractions. PLoS ONE9: e108891. doi:10.1371/journal.pone.0108891

Burgess S, Daniel RM, Butterworth AS, Thompson SG;EPIC-InterAct Consortium. 2015. Network Mendelianrandomization: using genetic variants as instrumentalvariables to investigate mediation in causal pathways.Int J Epidemiol 44: 484–495. doi:10.1093/ije/dyu176

Burgess S, Davies NM, Thompson SG. 2016a. Bias due toparticipant overlap in two-sample Mendelian randomi-zation. Genet Epidemiol 40: 597–608. doi:10.1002/gepi.21998

Burgess S, Dudbridge F, Thompson SG. 2016b. Combininginformation on multiple instrumental variables in Men-delian randomization: comparison of allele score andsummarized data methods. Stat Med 35: 1880–1906.doi:10.1002/sim.6835

Burgess S, Thompson SG; CRP CHD Genetics Collabora-tion. 2016c. Methods for meta-analysis of individual par-ticipant data fromMendelian randomisation studies withbinary outcomes. Stat Methods Med Res 25: 272–293.doi:10.1177/0962280212451882

Burgess S, Davey Smith G, Davies NM, Dudbridge F, Gill D,Glymour MM, Hartwig FP, Holmes MV, Minelli C, Rel-ton C, et al. 2019. Guidelines for performing Mendelianrandomization investigations.Wellcome Open Res 4: 186.doi:10.12688/wellcomeopenres.15555.1

Carreras-Torres R, JohanssonM, GaborieauV, Haycock PC,Wade KH, Relton CL, Martin RM, Davey Smith G, Bren-nan P. 2017. The role of obesity, type 2 diabetes, andmetabolic factors in pancreatic cancer: a Mendelian ran-domization study. J Natl Cancer Inst 109: djx012. doi:10.1093/jnci/djx012

Carter AR, Gill D, Davies NM, Taylor AE, Tillmann T,Vaucher J, Wootton RE, Munafo MR, Hemani G, MalikR, et al. 2019. Understanding the consequences of educa-tion inequality on cardiovascular disease: Mendelian ran-domisation study. BMJ 365: l1855. doi:10.1136/bmj.l1855

Censin JC, Nowak C, Cooper N, Bergsten P, Todd JA, Fall T.2017. Childhood adiposity and risk of type 1 diabetes:a Mendelian randomization study. PLoS Med 14:e1002362. doi:10.1371/journal.pmed.1002362

Chen L, Davey Smith G, Harbord RM, Lewis SJ. 2008. Al-cohol intake and blood pressure: a systematic review im-plementing a Mendelian randomization approach. PLoSMed 5: e52. doi:10.1371/journal.pmed.0050052

Cho Y, Shin SY,Won S, Relton CL, Davey Smith G, ShinMJ.2015. Alcohol intake and cardiovascular risk factors: aMendelian randomisation study. Sci Rep 5: 18422.doi:10.1038/srep18422

Cho Y, Haycock PC, Sanderson E, Gaunt TR, Zheng J, Mor-ris AP, Davey Smith G, Hemani G. 2020. Exploiting hor-izontal pleiotropy to search for causal pathways within aMendelian randomization framework. Nat Commun 11:1010. doi:10.1038/s41467-020-14452-4

Cole SR, Platt RW, Schisterman EF, Chu HT, Westreich D,Richardson D, Poole C. 2010. Illustrating bias due to con-ditioning on a collider. Int J Epidemiol 39: 417–420.doi:10.1093/ije/dyp334

Corbin LJ, Richmond RC, Wade KH, Burgess S, Bowden J,Davey Smith G, Timpson NJ. 2016. BMI as a modifiablerisk factor for type 2 diabetes: refining and understanding

causal estimates using Mendelian randomization. Diabe-tes 65: 3002–3007. doi:10.2337/db16-0418

Couto Alves A, De Silva NMG, Karhunen V, Sovio U, Das S,Taal HR, Warrington NM, Lewin AM, Kaakinen M,Cousminer DL, et al. 2019. GWAS on longitudinal growthtraits reveals different genetic factors influencing infant,child, and adult BMI. Sci Adv 5: eaaw3095. doi:10.1126/sciadv.aaw3095

Cuellar-Partida G, Lu Y, Kho PF, Hewitt AW, WichmannHE, Yazar S, Stambolian D, Bailey-Wilson JE, Wojcie-chowski R, Wang JJ, et al. 2016. Assessing the geneticpredisposition of education on myopia: a Mendelian ran-domization study. Genet Epidemiol 40: 66–72. doi:10.1002/gepi.21936

Danesh J, Kaptoge S, Mann AG, Sarwar N, Wood A, Angle-man SB, Wensley F, Higgins JPT, Lennon L, EiriksdottirG, et al. 2008. Long-term interleukin-6 levels and subse-quent risk of coronary heart disease: two new prospectivestudies and a systematic review. PLoS Med 5: e78. doi:10.1371/journal.pmed.0050078

Davey Smith G. 2012a. Epigenesis for epidemiologists: doesevo-devo have implications for population health re-search and practice? Int J Epidemiol 41: 236–247. doi:10.1093/ije/dys016

Davey Smith G. 2012b. Negative control exposures in epide-miologic studies. Epidemiology 23: 350–351.

Davey Smith G. 2019. Does schizophrenia influence canna-bis use? How to report the influence of disease liability onoutcomes in Mendelian randomization studies. TARGBlog, University of Bristol, Bristol, UK. https://targ.blogs.bristol.ac.uk/author/kz-davey-smith [accessed Au-gust 18, 2021].

Davey Smith G, Ebrahim S. 2002. Data dredging, bias, orconfounding. BMJ 325: 1437–1438. doi:10.1136/bmj.325.7378.1437

Davey SmithG, Ebrahim S. 2003.Mendelian randomization:can genetic epidemiology contribute to understandingenvironmental determinants of disease? Int J Epidemiol32: 1–22. doi:10.1093/ije/dyg070

Davey SmithG, Ebrahim S. 2004.Mendelian randomization:prospects, potentials, and limitations. Int J Epidemiol 33:30–42.

Davey Smith G, Ebrahim S. 2005. What can Mendelian ran-domisation tell us about modifiable behavioural and en-vironmental exposures? BMJ 330: 1076–1079. doi:10.1136/bmj.330.7499.1076

Davey SmithG,Hemani G. 2014.Mendelian randomization:genetic anchors for causal inference in epidemiologicalstudies. Hum Mol Genet 23: R89–R98. doi:10.1093/hmg/ddu328

Davey Smith G, Lawlor DA, Harbord R, Timpson N, Day I,Ebrahim S. 2007. Clustered environments and random-ized genes: a fundamental distinction between conven-tional and genetic epidemiology. PLoSMed 4: 1985–1992.

Davey Smith G, Paternoster L, Relton C. 2017. When willMendelian randomization become relevant for clinicalpractice and public health? JAMA 317: 589–591. doi:10.1001/jama.2016.21189

Davey Smith G, Davies NM, Dimou N, Egger M, Gallo V,Golub R, Higgins JPT, Langenberg C, Loder EW,Richards JB, et al. 2019. STROBE-MR: guidelines forstrengthening the reporting of Mendelian randomization

R.C. Richmond and G. Davey Smith

32 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 33: Mendelian Randomization: Concepts and Scope

studies. PeerJ Preprints 7: e27857v1. doi:10.7287/peerj.preprints.27857v1

Davey Smith G, Holmes MV, Davies NM, Ebrahim S. 2020.Mendel’s laws, Mendelian randomization and causal in-ference in observational data: substantive and nomencla-tural issues. Eur J Epidemiol 35: 99–111. doi:10.1007/s10654-020-00622-7

Davies NM, Holmes MV, Davey Smith G. 2018. ReadingMendelian randomisation studies: a guide, glossary, andchecklist for clinicians. BMJ 362: k601. doi:10.1136/bmj.k601

Davies NM, Howe LJ, Brumpton B, Havdahl A, Evans DM,Davey Smith G. 2019a. Within family Mendelian ran-domization studies. Hum Mol Genet 28: R170–R179.doi:10.1093/hmg/ddz204

Davies N, Dickson M, Davey Smith G, Windmeijer F,van den Berg GJ. 2019b. The causal effects of educationon adult health, mortality and income: evidence fromMendelian randomization and the raising of the schoolleaving age. IZA Institute of Labor Economics, Bonn,Germany. www.iza.org/publications/dp/12192

Dehghan A, Dupuis J, Barbalic M, Bis JC, Eiriksdottir G, LuC, Pellikka N,Wallaschofski H, Kettunen J, Henneman P,et al. 2011. Meta-analysis of genome-wide associationstudies in >80 000 subjects identifies multiple loci forC-reactive protein levels. Circulation 123: 731–738.doi:10.1161/CIRCULATIONAHA.110.948570

DiPrete TA, Burik CAP, Koellinger PD. 2018. Genetic in-strumental variable regression: explaining socioeconomicand health outcomes in nonexperimental data. Proc NatlAcad Sci 115: E4970–E4979. doi:10.1073/pnas.1707388115

Disney-Hogg L, Cornish AJ, Sud A, Law PJ, Kinnersley B,Jacobs DI, Ostrom QT, Labreche K, Eckel-Passow JE,Armstrong GN, et al. 2018. Impact of atopy on risk ofglioma: a Mendelian randomisation study. BMCMed 16:42. doi:10.1186/s12916-018-1027-5

� Dudbridge F. 2020. Polygenic Mendelian randomization.Cold Spring Harb Perspect Med 11: a039586. doi:10.1101/cshperspect.a039586

Dudbridge F, Allen RJ, Sheehan NA, Schmidt AF, Lee JC,Jenkins RG, Wain LV, Hingorani AD, Patel RS. 2019.Adjustment for index event bias in genome-wide associ-ation studies of subsequent events. Nat Commun 10:1561. doi:10.1038/s41467-019-09381-w

Duncan L, Shen H, Gelaye B, Meijsen J, Ressler K, FeldmanM, Peterson R, Domingue B. 2019. Analysis of polygenicrisk score usage and performance in diverse human pop-ulations. Nat Commun 10: 3328. doi:10.1038/s41467-019-11112-0

Ebrahim S, Davey SmithG. 2008.Mendelian randomization:can genetic epidemiology help redress the failures of ob-servational epidemiology?HumGenet 123: 15–33. doi:10.1007/s00439-007-0448-6

Escala-Garcia M, Guo Q, Dörk T, Canisius S, Keeman R,Dennis J, Beesley J, Lecarpentier J, Bolla MK, Wang Q,et al. 2019. Genome-wide association study of germlinevariants and breast cancer-specific mortality. Br J Cancer120: 647–657. doi:10.1038/s41416-019-0393-x

Evans DM, Moen GH, Hwang LD, Lawlor DA, WarringtonNM. 2019. Elucidating the role of maternal environmen-tal exposures on offspring health and disease using two-

sample Mendelian randomization. Int J Epidemiol 48:861–875. doi:10.1093/ije/dyz019

Ference BA, Yoo W, Alesh I, Mahajan N, Mirowska KK,Mewada A, Kahn J, Afonso L, Williams KA Sr, FlackJM. 2012. Effect of long-term exposure to lower low-den-sity lipoprotein cholesterol beginning early in life on therisk of coronary heart disease: a Mendelian randomiza-tion analysis. J Am Coll Cardiol 60: 2631–2639. doi:10.1016/j.jacc.2012.09.017

Ference BA, Julius S, Mahajan N, Levy PD, Williams KA,Flack JM. 2014. Clinical effect of naturally random allo-cation to lower systolic blood pressure beginningbefore the development of hypertension. Hypertension63: 1182–1188. doi:10.1161/HYPERTENSIONAHA.113.02734

Ference BA, Majeed F, Penumetcha R, Flack JM, Brook RD.2015. Effect of naturally random allocation to lower low-density lipoprotein cholesterol on the risk of coronaryheart disease mediated by polymorphisms in NPC1L1,HMGCR, or both: a 2×2 factorial Mendelian randomiza-tion study. J AmColl Cardiol 65: 1552–1561. doi:10.1016/j.jacc.2015.02.020

Ference BA, Robinson JG, Brook RD, Catapano AL, Chap-manMJ, Neff DR, Voros S, Giugliano RP, Davey Smith G,Fazio S, et al. 2016. Variation in PCSK9 andHMGCR andrisk of cardiovascular disease and diabetes. N Engl J Med375: 2144–2153. doi:10.1056/NEJMoa1604304

� Ference BA, HolmesMV, Davey Smith G. 2021. UsingMen-delian randomization to directly inform and improve thedesign of randomized trials. Cold Spring Harb PerspectMed doi:10.1101/cshperspect.a040980

Gage SH,Davey SmithG,Ware JJ, Flint J, MunafoMR. 2016.G=E: what GWAS can tell us about the environment.PLoS Genet 12: e1005765. doi:10.1371/journal.pgen.1005765

Gage SH, Jones HJ, Burgess S, Bowden J, Davey Smith G,Zammit S, Munafò MR. 2017. Assessing causality in as-sociations between cannabis use and schizophrenia risk: atwo-sample Mendelian randomization study. PsycholMed 47: 971–980. doi:10.1017/S0033291716003172

Garner C. 2007. Upward bias in odds ratio estimates fromgenome-wide association studies. Genet Epidemiol 31:288–295. doi:10.1002/gepi.20209

Giambartolomei C, Vukcevic D, Schadt EE, Franke L, Hin-gorani AD, Wallace C, Plagnol V. 2014. Bayesian test forcolocalisation between pairs of genetic association studiesusing summary statistics. PLoS Genet 10: e1004383.doi:10.1371/journal.pgen.1004383

Gill D, Del Greco F, Rawson TM, Sivakumaran P, Brown A,Sheehan NA, Minelli C. 2017. Age at menarche and timespent in education: a Mendelian randomization study.Behav Genet 47: 480–485. doi:10.1007/s10519-017-9862-2

Gill D, Brewer CF, Del GrecoMF, Sivakumaran P, Bowden J,Sheehan NA, Minelli C. 2018. Age at menarche and adultbody mass index: a Mendelian randomization study. Int JObesity 42: 1574–1581. doi:10.1038/s41366-018-0048-7

Gill D, Georgakis MK, Walker VM, Schmidt AF, GkatzionisA, Freitag DF, Finan C, Hingorani AD, Howson JMM,Burgess S, et al. 2021. Mendelian randomization forstudying the effects of perturbing drug targets.WellcomeOpen Res 6: 16. doi:10.12688/wellcomeopenres.16544.2

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 33

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 34: Mendelian Randomization: Concepts and Scope

Gkatzionis A, Burgess S. 2019. Contextualizing selection biasin Mendelian randomization: how bad is it likely to be?Int J Epidemiol 48: 691–701. doi:10.1093/ije/dyy202

Greenland S, Pearl J, Robins JM. 1999. Confounding andcollapsibility in causal inference. Stat Sci 14: 29–46.doi:10.1214/ss/1009211805

Hartwig FP, Davies NM, Hemani G, Davey Smith G. 2016.Two-sample Mendelian randomization: avoiding thedownsides of a powerful, widely applicable but potentiallyfallible technique. Int J Epidemiol 45: 1717–1726.

Hartwig FP, Davey Smith G, Bowden J. 2017. Robust infer-ence in summary data Mendelian randomization via thezero modal pleiotropy assumption. Int J Epidemiol 46:1985–1998. doi:10.1093/ije/dyx102

Hartwig FP, Davies NM, Davey Smith G. 2018. Bias in Men-delian randomization due to assortative mating. GenetEpidemiol 42: 608–620. doi:10.1002/gepi.22138

Haworth S, Mitchell R, Corbin L, Wade KH, Dudding T,Budu-Aggrey A, Carslake D, Hemani G, Paternoster L,Smith GD, et al. 2019. Apparent latent structure withinthe UK biobank sample has implications for epidemio-logical analysis. Nat Commun 10: 333. doi:10.1038/s41467-018-08219-1

Haycock PC, Burgess S, Wade KH, Bowden J, Relton C,Davey Smith G. 2016. Best (but oft-forgotten) practices:the design, analysis, and interpretation of Mendelian ran-domization studies. Am J Clin Nutr 103: 965–978. doi:10.3945/ajcn.115.118216

Heinonen OP, Huttunen JK, Albanes D, Haapakoski J,Palmgren J, Pietinen P, Pikkarainen J, Rautalahti M, Vir-tamo J, Edwards BK, et al. 1994. Effect of vitamin E andβ carotene on the incidence of lung cancer and othercancers in male smokers. N Engl J Med 330: 1029–1035.doi:10.1056/NEJM199404143301501

Hemani G, Bowden J, Haycock P, Zheng J, Davis O, Flach P,Gaunt T, Davey Smith G. 2017a. Automating Mendelianrandomization through machine learning to construct aputative causal map of the human phenome. bioRxivdoi:10.1101/173682

Hemani G, Tilling K, Davey Smith G. 2017b. Orienting thecausal relationship between imprecisely measured traitsusing GWAS summary data. PLoS Genet 13: e1007081.doi:10.1371/journal.pgen.1007081

Hemani G, Bowden J, Davey Smith G. 2018a. Evaluating thepotential role of pleiotropy in Mendelian randomizationstudies. Hum Mol Genet 27: R195–R208. doi:10.1093/hmg/ddy163

Hemani G, Zheng J, Wade KH, Laurin C, Elsworth B, Bur-gess S, Bowden J, Langdon R, Tan V, Yarmolinsky J, et al.2018b. TheMR-Base platform supports systematic causalinference across the human phenome. eLife 7: e34408.doi:10.7554/eLife.34408

Hernan MA, Robins JM. 2020. Causal inference: what if?Chapman & Hall/CRC, Boca Raton, FL.

Hill WD, Hagenaars SP, Marioni RE, Harris SE, LiewaldDCM,Davies G,OkbayA,McIntoshAM,Gale CR,DearyIJ. 2016. Molecular genetic contributions to social depri-vation and household income in UK biobank. Curr Biol26: 3083–3089. doi:10.1016/j.cub.2016.09.035

HolmesMV, Davey SmithG. 2019. CanMendelian random-ization shift into reverse gear? Clin Chem 65: 363–366.doi:10.1373/clinchem.2018.296806

Holmes MV, Ala-Korpela M, Davey Smith G. 2017. Mende-lian randomization in cardiometabolic disease: challengesin evaluating causality. Nat Rev Cardiol 14: 577–590.doi:10.1038/nrcardio.2017.78

Holmes MV, Richardson TG, Ference BA, Davies NM, Da-vey Smith G. 2021. Integrating genomics with biomarkersand therapeutic targets to invigorate cardiovascular drugdevelopment. Nat Rev Cardiol 18: 435–453. doi:10.1038/s41569-020-00493-1

Howey R, Shin SY, Relton C, Davey Smith G, Cordell HJ.2020. Bayesian network analysis incorporating geneticanchors complements conventional Mendelian random-ization approaches for exploratory analysis of causal rela-tionships in complex data. PLoS Genet 16: e1008198.doi:10.1371/journal.pgen.1008198

Hughes RA, Davies NM, Davey Smith G, Tilling K. 2019.Selection bias when estimating average treatment effectsusing one-sample instrumental variable analysis. Epide-miology 30: 350–357. doi:10.1097/EDE.0000000000000972

� Hwang LD, Davies NM, Warrington NM, Evans DM. 2020.Integrating family-based and Mendelian randomizationdesigns. Cold Spring Harb Perspect Med 11: a039503.doi:10.1101/cshperspect.a039503

Ishigaki K, AkiyamaM,KanaiM, TakahashiA, Kawakami E,Sugishita H, Sakaue S, Matoba N, Low SK, Okada Y, et al.2020. Large-scale genome-wide association study in aJapanese population identifies novel susceptibility lociacross different diseases. Nat Genet 52: 669–679. doi:10.1038/s41588-020-0640-3

Kang H, Kreuels B, Adjei O, Krumkamp R, May J, Small DS.2013. The causal effect of malaria on stunting: a Mende-lian randomization and matching approach. Int J Epide-miol 42: 1390–1398. doi:10.1093/ije/dyt116

Kang H, Zhang AR, Cai TT, Small DS. 2016. Instrumentalvariables estimation with some invalid instruments andits application to Mendelian randomization. J Am StatAssoc 111: 132–144. doi:10.1080/01621459.2014.994705

Kodali HP, Pavilonis BT, Schooling CM. 2018. Effects ofcopper and zinc on ischemic heart disease andmyocardialinfarction: a Mendelian randomization study. Am J ClinNutrition 108: 237–242. doi:10.1093/ajcn/nqy129

Koellinger PD, de Vlaming R. 2019. Mendelian randomiza-tion: the challenge of unobserved environmental con-founds. Int J Epidemiol 48: 665–671. doi:10.1093/ije/dyz138

Kong A, Thorleifsson G, FriggeML, Vilhjalmsson BJ, YoungAI, Thorgeirsson TE, Benonisdottir S, Oddsson A, Hall-dorsson BV,Masson G, et al. 2018. The nature of nurture:effects of parental genotypes. Science 359: 424–428.doi:10.1126/science.aan6877

Labrecque JA, Swanson SA. 2019. Interpretation and poten-tial biases of Mendelian randomization estimates withtime-varying exposures. Am J Epidemiol 188: 231–238.doi:10.1093/aje/kwy204

Lassi G, Taylor AE, TimpsonNJ, Kenny PJ,Mather RJ, EisenT, Munafò MR. 2016. The CHRNA5-A3-B4 gene clusterand smoking: fromdiscovery to therapeutics.TrendsNeu-rosci 39: 851–861. doi:10.1016/j.tins.2016.10.005

LawlorDA. 2016. Commentary: two-sampleMendelian ran-domization: opportunities and challenges. Int J Epidemiol45: 908–915. doi:10.1093/ije/dyw127

R.C. Richmond and G. Davey Smith

34 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 35: Mendelian Randomization: Concepts and Scope

Lawlor DA, Harbord RM, Sterne JAC, Timpson N, DaveySmith G. 2008. Mendelian randomization: using genes asinstruments for making causal inferences in epidemiolo-gy. Stat Med 27: 1133–1163. doi:10.1002/sim.3034

Lawlor D, Richmond R,Warrington N, McMahon G, DaveySmith G, Bowden J, Evans DM. 2017. Using Mendelianrandomization to determine causal effects of maternalpregnancy (intrauterine) exposures on offspring out-comes: sources of bias and methods for assessing them.Wellcome Open Res 2: 11. doi:10.12688/wellcomeopenres.10567.1

Lawlor DA, Wade K, Borges MC, Palmer T, Hartwig FP,Hemani G, Bowden J. 2019. A Mendelian randomizationdictionary: useful definitions and descriptions for under-taking, understanding and interpreting Mendelian ran-domization studies. Working Paper, OSF Preprints.doi:10.31219/osf.io/6yzs7

Lawson DJ, Davies NM, Haworth S, Ashraf B, Howe L,Crawford A, Hemani G, Davey Smith G, Timpson NJ.2020. Is population structure in the genetic biobank erairrelevant, a challenge, or an opportunity? Hum Genet139: 23–41.

Lawn RB, Sallis HM, Taylor AE, Wootton RE, Davey SmithG, Davies NM, Hemani G, Fraser A, Penton-Voak IS,Munafò MR. 2019. Schizophrenia risk and reproductivesuccess: aMendelian randomization study.R SocOpen Sci6: 181049. doi:10.1098/rsos.181049

Lewis SJ, Davey SmithG. 2005. Alcohol,ALDH2, and esoph-ageal cancer: a meta-analysis which illustrates the poten-tials and limitations of a Mendelian randomization ap-proach. Cancer Epidem Biomar 14: 1967–1971. doi:10.1158/1055-9965.EPI-05-0196

Li GH, Cheung CL, Au PC, Tan KC, Wong IC, Sham PC.2020. Positive effects of low LDL-C and statins on bonemineral density: an integrated epidemiological observa-tion analysis and Mendelian randomization study. Int JEpidemiol 49: 1221–1235. doi:10.1093/ije/dyz145

Lippman SM, Klein EA, Goodman PJ, LuciaMS, ThompsonIM, Ford LG, Parnes HL, Minasian LM, Gaziano JM,Hartline JA, et al. 2009. Effect of selenium and vitaminE on risk of prostate cancer and other cancers the sele-nium and vitamin E cancer prevention trial (SELECT).JAMA 301: 39–51. doi:10.1001/jama.2008.864

Lipsitch M, Tchetgen Tchetgen E, Cohen T. 2010. Negativecontrols: a tool for detecting confounding and bias inobservational studies. Epidemiology 21: 383–388. doi:10.1097/EDE.0b013e3181d61eeb

Loh PR, Tucker G, Bulik-Sullivan BK, Vilhjálmsson BJ, Fi-nucane HK, Salem RM, Chasman DI, Ridker PM, NealeBM, Berger B, et al. 2015. Efficient Bayesianmixed-modelanalysis increases association power in large cohorts. NatGenet 47: 284–290. doi:10.1038/ng.3190

Lotta LA, Scott RA, Sharp SJ, Burgess S, Luan J, Tillin T,Schmidt AF, Imamura F, Stewart ID, Perry JR, et al. 2016.Genetic predisposition to an impaired metabolism of thebranched-chain amino acids and risk of type 2 diabetes: aMendelian randomisation analysis. PLoS Med 13:e1002179. doi:10.1371/journal.pmed.1002179

Mahmoud O, Dudbridge F, Davey Smith G, Munafo M,Tilling K. 2020. Slope-hunter: a robust method for in-dex-event bias correction in genome-wide association

studies of subsequent traits. bioRxiv doi:10.1101/2020.01.31.928077

Marees AT, Smit DJA, Ong JS, MacGregor S, An J, Denys D,Vorspan F, van den Brink W, Derks EM. 2020. Potentialinfluence of socioeconomic status on genetic correlationsbetween alcohol consumption measures and mentalhealth. Psychol Med 50: 484–498. doi:10.1017/S0033291719000357

MartinAR,GignouxCR,Walters RK,WojcikGL,Neale BM,Gravel S, Daly MJ, Bustamante CD, Kenny EE. 2017.Human demographic history impacts genetic risk predic-tion across diverse populations. Am J Hum Genet 100:635–649. doi:10.1016/j.ajhg.2017.03.004

Mendelson MM, Marioni RE, Joehanes R, Liu C, HedmanAK, Aslibekyan S, Demerath EW, GuanW, Zhi D, Yao C,et al. 2017. Association of body mass index with DNAmethylation and gene expression in blood cells and rela-tions to cardiometabolic disease: a Mendelian randomi-zation approach. PLoS Med 14: e1002215. doi:10.1371/journal.pmed.1002215

Menkes MS, Comstock GW, Vuilleumier JP, Helsing KJ,Rider AA, Brookmeyer R. 1986. Serum β-carotene, vita-mins A and E, selenium, and the risk of lung cancer.N Engl J Med 315: 1250–1254. doi:10.1056/NEJM198611133152003

Millard LA, Davies NM, Timpson NJ, Tilling K, Flach PA,Davey Smith G. 2015. MR-PheWAS: hypothesis prioriti-zation among potential causal effects of body mass indexon many outcomes, using Mendelian randomization. SciRep 5: 16645. doi:10.1038/srep16645

Mills HL, Higgins JP, Morris RW, Kessler D, Heron J, WilesN, Davey Smith G, Tilling K. 2020. Detecting heteroge-neity of intervention effects using analysis andmeta-anal-ysis of differences in variance between arms of a trial.Epidemiology (in press).

Millwood IY, Walters RG, Mei XW, Guo Y, Yang L, Bian Z,Bennett DA, Chen Y, Dong C, Hu R, et al. 2019. Conven-tional and genetic evidence on alcohol and vascular dis-ease aetiology: a prospective study of 500,000 men andwomen in China. Lancet 393: 1831–1842. doi:10.1016/S0140-6736(18)31772-0

Minelli C, Del GrecoMF, van der Plaat DA, Bowden J, Shee-hanNA, Thompson J. 2021. The use of two-samplemeth-ods forMendelian randomization analyses on single largedatasets. Int J Epidemiol dyab084. doi:10.1093/ije/dyab084

Mokry LE, Ross S, Ahmad OS, Forgetta V, Davey Smith G,Goltzman D, Leong A, Greenwood CM, Thanassoulis G,Richards JB. 2015. Vitamin D and risk of multiple scle-rosis: a Mendelian randomization study. PLoS Med 12:e1001866. doi:10.1371/journal.pmed.1001866

Morrison J, Knoblauch N, Marcus JH, Stephens M, He X.2020. Mendelian randomization accounting for correlat-ed and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat Genet 52: 740–747. doi:10.1038/s41588-020-0631-4

Mountjoy E, Davies NM, Plotnikov D, Davey Smith G,Rodriguez S, Williams CE, Guggenheim JA, Atan D.2018. Education and myopia: assessing the direction ofcausality by Mendelian randomisation. BMJ 361: k2022.doi:10.1136/bmj.k2022

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 35

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 36: Mendelian Randomization: Concepts and Scope

� Munafo MR, Higgins JPT, Davey Smith G. 2020. Triangu-lating evidence through the inclusion of genetically-in-formed designs. Cold Spring Harb Perspect Med doi:10.1101/cshperspect.a040659

Myung SK, JuW, Cho B, Oh SW, Park SM, Koo BK, Park BJ.2013. Grp KMAKS: efficacy of vitamin and antioxidantsupplements in prevention of cardiovascular disease: sys-tematic review and meta-analysis of randomised con-trolled trials. BMJ 346: f10. doi:10.1136/bmj.f10

NelsonMR, TipneyH, Painter JL, Shen J, Nicoletti P, Shen Y,Floratos A, Sham PC, Li MJ, Wang J, et al. 2015. Thesupport of human genetic evidence for approved drugindications. Nat Genet 47: 856–860. doi:10.1038/ng.3314

Palmer TM, Thompson JR, TobinMD, SheehanNA, BurtonPR. 2008. Adjusting for bias and unmeasured confound-ing in Mendelian randomization studies with binary re-sponses. Int J Epidemiol 37: 1161–1168. doi:10.1093/ije/dyn080

Patel RS, Tragante V, Schmidt AF, McCubrey RO, HolmesMV, Howe LJ, Direk K, Akerblom A, Leander K, ViraniSS, et al. 2019. Subsequent event risk in individuals withestablished coronary heart disease. Circ Genom PrecisMed 12: e002470.

Paternoster L, Tilling K, Davey Smith G. 2017. Genetic ep-idemiology and Mendelian randomization for informingdisease therapeutics: conceptual and methodologicalchallenges. PLoS Genet 13: e1006944. doi:10.1371/journal.pgen.1006944

Pierce BL, Burgess S. 2013. Efficient design for Mendelianrandomization studies: subsample and 2-sample instru-mental variable estimators. Am J Epidemiol 178: 1177–1184. doi:10.1093/aje/kwt084

Pierce BL, Tong L, Argos M, Gao JJ, Jasmine F, Roy S, Paul-Brutus R, Rahaman R, Rakibuz-ZamanM, Parvez F, et al.2013. Arsenic metabolism efficiency has a causal role inarsenic toxicity: Mendelian randomization and gene-en-vironment interaction. Int J Epidemiol 42: 1862–1872.doi:10.1093/ije/dyt182

Pigeyre M, Sjaarda J, Chong M, Hess S, Bosch J, Yusuf S,Gerstein H, Paré G. 2020. ACE and type 2 diabetes risk: aMendelian randomization study. Diabetes Care 43: 835–842. doi:10.2337/dc19-1973

PonsfordMJ, Gkatzionis A,Walker VM, Grant AJ,WoottonRE,Moore LSP, Fatumo S,MasonAM, Zuber V,Willer C,et al. 2020. Cardiometabolic traits, sepsis, and severeCOVID-19: a Mendelian randomization investigation.Circulation 142: 1791–1793. doi:10.1161/CIRCULATIONAHA.120.050753

� Porcu E, Sjaarda J, Lepik K, Carmeli C, Darrous L, Sulc J,Mounier N, Kutalik Z. 2021. Causal inference methods tointegrate omics and complex traits. Cold Spring HarbPerspect Med 11: a040493. doi:10.1101/cshperspect.a040493

Prentice RL. 1989. Surrogate endpoints in clinical trials: def-inition and operational criteria. Stat Med 8: 431–440.doi:10.1002/sim.4780080407

Rees JMB, Foley CN, Burgess S. 2020. Factorial Mendelianrandomization: using genetic variants to assess interac-tions. Int J Epidemiol 49: 1147–1158. doi:10.1093/ije/dyz161

Relton CL, Davey SmithG. 2010. Epigenetic epidemiology ofcommon complex disease: prospects for prediction, pre-

vention, and treatment. PLoS Med 7: e1000356. doi:10.1371/journal.pmed.1000356

Relton CL, Davey Smith G. 2012. Two-step epigenetic Men-delian randomization: a strategy for establishing the caus-al role of epigenetic processes in pathways to disease. Int JEpidemiol 41: 161–176. doi:10.1093/ije/dyr233

Richardson TG, Harrison S, Hemani G, Davey Smith G.2019. An atlas of polygenic risk score associations to high-light putative causal relationships across the human phe-nome. eLife 8: e43657. doi:10.7554/eLife.43657

� Richardson TG, Zheng J, Gaunt TG. 2020a. Computationaltools for causal inference in genetics. Cold Spring HarbPerspect Med 11: a039248. doi:10.1101/cshperspect.a039248

Richardson TG, Sanderson E, Elsworth B, Tilling K, DaveySmith G. 2020b. Use of genetic variation to separate theeffects of early and later life adiposity on disease risk:Mendelian randomisation study. BMJ 369: m1203.doi:10.1136/bmj.m1203

Richardson TG, Hemani G, Gaunt TR, Relton CL, DaveySmith G. 2020c. A transcriptome-wide Mendelian ran-domization study to uncover tissue-dependent regulatorymechanisms across the human phenome. Nat Commun11: 185.

Richmond RC, Davey Smith G. 2019. Commentary: orient-ing causal relationships between two phenotypes usingbidirectional Mendelian randomization. Int J Epidemiol48: 907–911. doi:10.1093/ije/dyz149

RimmEB, StampferMJ, Ascherio A, Giovannucci E, ColditzGA, Willett WC. 1993. Vitamin E consumption and therisk of coronary heart disease in men. N Engl J Med 328:1450–1456. doi:10.1056/NEJM199305203282004

Ritchie SC, Lambert SA, Arnold M, Teo SM, Lim S, Scepa-novic P, Marten J, Zahid S, Chaffin M, Liu Y, et al. 2021.Integrative analysis of the plasma proteome and polygen-ic risk of cardiometabolic diseases. bioRxiv doi:10.1101/2019.12.14.876474

Rosoff DB, Clarke TK, Adams MJ, McIntosh AM, DaveySmith G, Jung J, Lohoff FW. 2021. Educational attain-ment impacts drinking behaviors and risk for alcoholdependence: results from a two-sample Mendelian ran-domization study with ∼780,000 participants. Mol Psy-chiatry 26: 1119–1132. doi:10.1038/s41380-019-0535-9

� Sanderson E. 2021. Multivariable Mendelian randomizationand mediation. Cold Spring Harb Perspect Med 11:a038984. doi:10.1101/cshperspect.a038984

Sanderson E, Davey SmithG,Windmeijer F, Bowden J. 2019.An examination of multivariable Mendelian randomiza-tion in the single-sample and two-sample summary datasettings. Int J Epidemiol 48: 713–727. doi:10.1093/ije/dyy262

Sanderson E, Spiller W, Bowden J. 2020. Testing and cor-recting for weak and pleiotropic instruments in two-sam-ple multivariable Mendelian randomisation. bioRxivdoi:10.1101/2020.04.02.021980

Sanderson E, Richardson TG, Hemani G, Davey Smith G.2021. The use of negative control outcomes in Mendelianrandomization to detect potential population stratifica-tion. Int J Epidemiol dyaa288. doi:10.1093/ije/dyaa288

Sandu MR, Beynon RA, Richmond RC, Santos Ferreira DL,Hackshaw-McGeagh L, Davey Smith G, Metcalfe C, LaneJA, Martin RM. 2019. Two-step randomisation: applying

R.C. Richmond and G. Davey Smith

36 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 37: Mendelian Randomization: Concepts and Scope

the results of small feasibility studies of interventions tolarge-scale Mendelian randomisation studies to robustlyinfer causal effects on clinical endpoints. Preprints doi:10.20944/preprints201910.0276.v1

Sargan JD. 1958. The estimation of economic relationshipsusing instrumental variables. Econometrica 26: 393–415.doi:10.2307/1907619

Schmidt AF, Finan C, Gordillo-MaranonM, Asselbergs FW,Freitag DF, Patel RS, Tyl B, Chopade S, Faraway R, Zwier-zyna M, et al. 2020. Genetic drug target validation usingMendelian randomisation.Nat Commun 11: 3255. doi:10.1038/s41467-020-16969-0

� Schmidt AF, Hingorani AD, Finan C. 2021. Human geno-mics and drug development. Cold Spring Harb PerspectMed doi:10.1101/cshperspect.a039230

Shapland CY, Zhao Q, Bowden J. 2020. Profile-likelihoodBayesian model averaging for two-sample summarydata Mendelian randomization in the presence of hori-zontal pleiotropy. bioRxiv doi:10.1101/2020.02.11.943712

Slob EAW, Burgess S. 2020. A comparison of robust Men-delian randomization methods using summary data. Ge-net Epidemiol 44: 313–329. doi:10.1002/gepi.22295

Spiller W, Slichter D, Bowden J, Davey Smith G. 2019. De-tecting and correcting for bias in Mendelian randomiza-tion analyses using gene-by-environment interactions.Int J Epidemiol 48: 702–712. doi:10.1093/ije/dyy204

Staley JR, Blackshaw J, Kamat MA, Ellis S, Young R, Butter-worth AS. 2016. Phenoscanner: a database of human ge-notype-phenotype associations. Genet Epidemiol 40:664–664. doi:10.1093/bioinformatics/btw373

Sun BB, Maranville JC, Peters JE, Stacey D, Staley JR, Black-shaw J, Burgess S, Jiang T, Paige E, Surendran P, et al.2018. Genomic atlas of the human plasma proteome.Na-ture 558: 73–79. doi:10.1038/s41586-018-0175-2

Swanson SA. 2017. Commentary: can we see the forest forthe IVs?: Mendelian randomization studies with multiplegenetic variants. Epidemiology 28: 43–46. doi:10.1097/EDE.0000000000000558

Swanson SA, Hernán MA. 2018. The challenging interpre-tation of instrumental variable estimates under monoto-nicity. Int J Epidemiol 47: 1289–1297. doi:10.1093/ije/dyx038

Swerdlow DI, Holmes MV, Kuchenbaecker KB, EngmannJEL, Shah T, Sofat R, GuoYR, ChungC, Peasey A, Ster RP,et al. 2012. The interleukin-6 receptor as a target for pre-vention of coronary heart disease: a Mendelian random-isation analysis. Lancet 379: 1214–1224. doi:10.1016/S0140-6736(12)60110-X

Swerdlow DI, Preiss D, Kuchenbaecker KB, Holmes MV,Engmann JE, Shah T, Sofat R, Stender S, Johnson PC,Scott RA, et al. 2015. HMG-coenzyme A reductase inhi-bition, type 2 diabetes, and bodyweight: evidence fromgenetic analysis and randomised trials. Lancet 385:351–361. doi:10.1016/S0140-6736(14)61183-1

Szulkin R, Karlsson R, Whitington T, Aly M, Gronberg H,Eeles RA, Easton DF, Kote-Jarai Z, Al Olama AA, Benl-loch S, et al. 2015. Genome-wide association study ofprostate cancer-specific survival. Cancer Epidemiol Bio-markers Prev 24: 1796–1800. doi:10.1158/1055-9965.EPI-15-0543

Taylor AE, Fluharty ME, Bjørngaard JH, Gabrielsen ME,Skorpen F, Marioni RE, Campbell A, Engmann J, MirzaSS, Loukola A, et al. 2014. Investigating the possible causalassociation of smoking with depression and anxiety usingMendelian randomisation meta-analysis: the CARTAconsortium. BMJ Open 4: e006141. doi:10.1136/bmjopen-2014-006141

Taylor K, Davey Smith G, Relton CL, Gaunt TR, RichardsonTG. 2019. Prioritizing putative influential genes in car-diovascular disease susceptibility by applying tissue-spe-cific Mendelian randomization. Genome Med 11: 6.doi:10.1186/s13073-019-0613-2

Thomas DC, Lawlor DA, Thompson JR. 2007. Re: estima-tion of bias in nongenetic observational studies using“Mendelian triangulation” by Bautista et al. Ann Epide-miol 17: 511–513. doi:10.1016/j.annepidem.2006.12.005

TimpsonNJ, Nordestgaard BG, Harbord RM, Zacho J, Fray-ling TM, Tybjaerg-Hansen A, Davey Smith G. 2011. C-reactive protein levels and body mass index: elucidatingdirection of causation through reciprocal Mendelian ran-domization. Int J Obes (Lond) 35: 300–308. doi:10.1038/ijo.2010.137

Tyrrell J, Richmond RC, Palmer TM, Feenstra B, RangarajanJ, Metrustry S, Cavadino A, Paternoster L, Armstrong LL,De Silva NMG, et al. 2016a. Genetic evidence for causalrelationships between maternal obesity-related traits andbirth weight. JAMA 315: 1129–1140. doi:10.1001/jama.2016.1975

Tyrrell J, Richmond RC, Palmer TM, Feenstra B, RangarajanJ, Metrustry S, Cavadino A, Paternoster L, Armstrong LL,De Silva NMG, et al. 2016b. Genetic evidence for causalrelationships between maternal obesity-related traits andbirth weight. JAMA 315: 1129–1140. doi:10.1001/jama.2016.1975

VanderWeele TJ, Valeri L, Ogburn EL. 2012. Commentary:the role of measurement error and misclassification inmediation analysis: mediation and measurement er-ror. Epidemiology 23: 561–564. doi:10.1097/EDE.0b013e318258f5e4

VanderWeele TJ, TchetgenTchetgen EJ, CornelisM, Kraft P.2014. Methodological challenges in Mendelian random-ization. Epidemiology 25: 427–435. doi:10.1097/EDE.0000000000000081

van Kippersluis H, Rietveld CA. 2018. Pleiotropy-robustMendelian randomization. Int J Epidemiol 47: 1279–1288. doi:10.1093/ije/dyx002

Verbanck M, Chen CY, Neale B, Do R. 2018. Detection ofwidespread horizontal pleiotropy in causal relationshipsinferred from Mendelian randomization between com-plex traits and diseases. Nat Genet 50: 693–698. doi:10.1038/s41588-018-0099-7

Vimaleswaran KS, Berry DJ, Lu C, Tikkanen E, Pilz S, HirakiLT, Cooper JD, Dastani Z, Li R, Houston DK, et al. 2013.Causal relationship between obesity and vitaminD status:bi-directional Mendelian randomization analysis of mul-tiple cohorts. PLoS Med 10: e1001383. doi:10.1371/journal.pmed.1001383

Wahl S, Drong A, Lehne B, Loh M, Scott WR, Kunze S, TsaiPC, Ried JS, Zhang W, Yang Y, et al. 2017. Epigenome-wide association study of body mass index, and the ad-verse outcomes of adiposity. Nature 541: 81–86. doi:10.1038/nature20784

Mendelian Randomization

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501 37

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 38: Mendelian Randomization: Concepts and Scope

Wang B, Baldwin JR, Schoeler T, Cheesman R, BarkhuizenW, Dudbridge F, Bann D, Morris TT, Pingault JB. 2021a.Genetic nurture effects on education: a systematic reviewand meta-analysis. bioRxiv doi:10.1101/2021.01.15.426782

Wang J, ZhaoQ, Bowden J,HemaniG,Davey SmithG, SmallDS, ZhangNR. 2021b. Causal inference for heritable phe-notypic risk factors using heterogeneous genetic instru-ments. PLOS Genet doi:10.1371/journal.pgen.1009575

Wensley F, Gao P, Burgess S, Kaptoge S, Di Angelantonio E,Shah T, Engert JC, Clarke R, Davey Smith G, Nordest-gaard BG, et al. 2011. Association between C reactiveprotein and coronary heart disease: Mendelian random-isation analysis based on individual participant data. BMJ342: d548. doi:10.1136/bmj.d548

West-Eberhard MJ. 2003. Developmental plasticity and evo-lution. Oxford University Press, Oxford.

Windmeijer F, Farbmacher H, Davies N, Davey Smith G.2019. On the use of the Lasso for instrumental variablesestimation with some invalid instruments. J Am Stat As-soc 114: 1339–1350. doi:10.1080/01621459.2018.1498346

Yarmolinsky J, Bonilla C, Haycock PC, Langdon RJQ, LottaLA, Langenberg C, Relton CL, Lewis SJ, Evans DM,DaveySmith G, et al. 2018. Circulating selenium and prostatecancer risk: a Mendelian randomization analysis. J NatlCancer Inst 110: 1035–1038. doi:10.1093/jnci/djy081

Yoshizawa K, Willett WC, Morris SJ, Stampfer MJ, Spiegel-man D, Rimm EB, Giovannucci E. 1998. Study of pre-diagnostic selenium level in toenails and the risk of ad-vanced prostate cancer. J Natl Cancer Inst 90: 1219–1224.doi:10.1093/jnci/90.16.1219

Young AI, Wauthier FL, Donnelly P. 2018. Identifying lociaffecting trait variability and detecting interactions in ge-

nome-wide association studies.Nat Genet 50: 1608–1614.doi:10.1038/s41588-018-0225-6

Zhao Q, Wang J, Hemani G, Bowden J, Small DS. 2018.Statistical inference in two-sample summary-data Men-delian randomization using robust adjusted profile score.arXiv 1801.09652

Zheng J, Baird D, Borges MC, Bowden J, Hemani G, Hay-cock P, Evans DM, Davey Smith G. 2017. Recentdevelopments in Mendelian randomization studies.Curr Epidemiol Rep 4: 330–345. doi:10.1007/s40471-017-0128-6

Zheng J, Haberland V, Baird D, Walker V, Haycock P, Gut-teridge A, Richardson TG, Staley J, Elsworth B, Burgess S,et al. 2019. Phenome-wide Mendelian randomizationmapping the influence of the plasma proteome on com-plex diseases. bioRxiv doi:10.1101/627398

Zhu ZH, Zheng ZL, Zhang FT,Wu Y, TrzaskowskiM,MaierR, Robinson MR, McGrath JJ, Visscher PM, Wray NR,et al. 2018. Causal associations between risk factorsand common diseases inferred from GWAS summarydata. Nat Commun 9: 224. doi:10.1038/s41467-017-02317-2

Zuber V, Colijn JM, Klaver C, Burgess S. 2020. Selectinglikely causal risk factors from high-throughput ex-periments using multivariable Mendelian randomiza-tion. Nat Commun 11: 29. doi:10.1038/s41467-019-13870-3

Zuccolo L, Lewis SJ, Davey Smith G, Sayal K, Draper ES,Fraser R, Barrow M, Alati R, Ring S, Macleod J, et al.2013. Prenatal alcohol exposure and offspring cognitionand school performance. A “Mendelian randomization”natural experiment. Int J Epidemiol 42: 1358–1370. doi:10.1093/ije/dyt172

R.C. Richmond and G. Davey Smith

38 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a040501

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 39: Mendelian Randomization: Concepts and Scope

published online August 23, 2021Cold Spring Harb Perspect Med  Rebecca C. Richmond and George Davey Smith Mendelian Randomization: Concepts and Scope

Subject Collection Combining Human Genetics and Causal Inference to Understand Human Disease and Development

and FutureCausal Inference with Genetic Data: Past, Present,

George Davey SmithJean-Baptiste Pingault, Rebecca Richmond and

Human Genomics and Drug Development

FinanAmand F. Schmidt, Aroon D. Hingorani and Chris

Mendelian Randomization: Concepts and ScopeRebecca C. Richmond and George Davey Smith

Mendelian RandomizationEwan Birney

The Meaning of ''Cause'' in GeneticsKate E. Lynch Genetically Informed Designs

Triangulating Evidence through the Inclusion of

Davey SmithMarcus R. Munafò, Julian P.T. Higgins and George

Design of Randomized TrialsUsing Mendelian Randomization to Improve the

Davey SmithBrian A. Ference, Michael V. Holmes and George

ExperimentTwins and Causal Inference: Leveraging Nature's

Zavos, et al.Tom A. McAdams, Fruhling V. Rijsdijk, Helena M.S.

GeneticsComputational Tools for Causal Inference in

Tom G. Richardson, Jie Zheng and Tom R. GauntRandomization DesignsIntegrating Family-Based and Mendelian

Warrington, et al.Liang-Dar Hwang, Neil M. Davies, Nicole M.

Comparisons, and Adoption DesignsSibling Pairs, Maternal versus Paternal Exposures: In Vitro Fertilization, DiscordantFactors from Pre- and Postnatal Environmental Family-Based Designs that Disentangle Inherited

Anita Thapar and Frances Rice

Complex TraitsCausal Inference Methods to Integrate Omics and

al.Eleonora Porcu, Jennifer Sjaarda, Kaido Lepik, et

Polygenic Mendelian RandomizationFrank Dudbridge Mediation

Multivariable Mendelian Randomization and

Eleanor Sanderson

http://perspectivesinmedicine.cshlp.org/cgi/collection/ For additional articles in this collection, see

Copyright © 2021 Cold Spring Harbor Laboratory Press; all rights reserved

on April 23, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from