Sources of variation and co-variation in the population Jaakko Kaprio University of Helsinki

Upload: alannah-mckenzie

Post on 12-Jan-2016

Page 1: Sources of variation and co-variation in the population Jaakko Kaprio University of Helsinki

Sources of variation and co-variation in the population

Jaakko Kaprio

University of Helsinki

Page 2

Epidemiology examines determinants of disease in relation to place, time and person characteristics such as:

- genes
- behavior
- environment
- developmental stage

(Diagram labels: Place, Time, Person)

Page 3

Development of life expectancy (U.S. females in 1990, 1995 and projected)

Olshansky et al., Science 2001; 291:1491

Page 4

Among men, changes in cardiovascular risk factors explain 66% of the changes in mortality from stroke

Vartiainen et al. BMJ 1995;310:901-4

Page 5

Genes, developmental history and environment as determinants of health

In complex disease a person's susceptibility genotype and environmental history combine to establish present health status, and the genotype's norm of reaction determines future health trajectory

Page 6

The post-genomic era

Now that the full human genome sequence has been published, we have access to genetic information in an unprecedented manner:

– 3 billion base pairs in the human genome
– c. 20 000 to 30 000 genes

Thus, developments in molecular genetic analysis now make it possible to attempt identification of liability genes in complex, multifactorial traits, and to dissect out with new precision the role of genetic predisposition and environmental/lifestyle factors in these disorders.

New technologies and statistical tools are continuously being introduced

Nonetheless, quantitative genetic methods provide an overall picture of the role of familial and genetic factors

Page 7

Monogenic & Complex disorders

The majority of human diseases are complex, i.e. they have multiple genetic and non-genetic causes

Figure: Peltonen & McKusick, Science 2001

Page 8

Segregation and linkage

Do diseased family members share alleles at a locus more often than expected?

Are these alleles the same in many families?

Sibpairs or large pedigrees can be studied, depending on the disease or trait in question

Page 9

Types of genes

Rare inborn errors of metabolism and other Mendelian gene variants (e.g. familial hypercholesterolemia, FH) have a major impact on individuals and families, but little effect at the population level:

– FH accounts for 1% of serum cholesterol variability in the population
– see e.g. OMIM: http://www.ncbi.nlm.nih.gov/Omim/

However, they continue to account for only a small fraction of all cases

Page 10

Characteristics of complex traits

Trait values are determined by complex interactions among numerous metabolic and physiological systems, as well as demographic and lifestyle factors

Variation in a large number of genes can potentially influence interindividual variation of trait values

The impact of any one gene is likely to be small to moderate in size

For diseases: Monogenic diseases that mimic complex diseases typically account for a small fraction of disease cases (examples in obesity, hypertension, dyslipidemias)

Page 11

Susceptibility genes

Susceptibility genes increase disease risk only moderately and are context dependent.

– total heritability of cholesterol levels is typically c. 50%
– Apo E accounts for 5-10% of variability in serum cholesterol in many populations, but the effect of the Apo E4 allele is small in individuals
– presence of Apo E4 moderately increases CHD and AD risk in many populations

For example, the frequency of the Apo E4 allele (associated with CHD and Alzheimer's disease) is highest in nomadic populations [e.g. Pygmies (0.407), Khoi San (0.370), Papuans (0.368), some Native Americans (0.280), and Lapps (0.310)] compared to 0.10 to 0.15 in populations of Mediterranean descent.

Page 12

Genetic epidemiology and behavior genetics

Strategies for family studies:

Does disease or behavior aggregate in families?

What are the causes of familial aggregation?

What is the model of genetic inheritance and which genes are responsible?

How do genes interact with the environment?

Page 13

How to detect genetic effects and genes?

Family studies:
– provide estimates of heritability
– information on mode of inheritance
– adoption and twin studies as special cases

Molecular genetic studies:
– genome-wide association studies & SNP heritability
– linkage in families
– animal studies (e.g. 'knockouts')
– known functional variants

(Figure: linkage marker map, markers D1S1597 through ATA29C07)

Page 14

What is heritability?

Heritability is an estimate of the proportion of the total variance of a trait, or of liability to a disease, that is accounted for by genetic variance, i.e. interindividual genetic differences.

Genetic variance may arise from additive effects, due to different alleles at a locus, or may be due to dominance, the interaction of alleles at the same locus

Heritability is a characteristic of populations, not of individuals or families, and it is affected by both genetic and environmental effects
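
As a rough illustration of this definition, heritability can be written as a ratio of variance components. The sketch below assumes that the total phenotypic variance is simply the sum of additive (A), dominance (D), shared environmental (C) and unique environmental (E) variances, with no interaction or covariance terms. The function name and the example numbers are hypothetical and not taken from the slides.

```python
def heritability(var_a, var_d, var_c, var_e):
    """Return (narrow-sense, broad-sense) heritability.

    Assumes total variance = A + D + C + E, i.e. no gene-environment
    interaction or correlation (an illustrative simplification).
    """
    total = var_a + var_d + var_c + var_e
    if total <= 0:
        raise ValueError("total variance must be positive")
    narrow = var_a / total            # additive genetic share
    broad = (var_a + var_d) / total   # all genetic variance
    return narrow, broad


# Hypothetical variance components, for illustration only.
h2_narrow, h2_broad = heritability(var_a=4.0, var_d=1.0, var_c=2.0, var_e=3.0)
print(f"narrow-sense h2 = {h2_narrow:.2f}, broad-sense H2 = {h2_broad:.2f}")
```

Keeping the genetic variances fixed and increasing either environmental variance lowers both ratios, which is one way to see why heritability describes a population in a given environment rather than an individual.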

Page 15

FAMILY STUDY

Provides estimates of the degree of family aggregation

Risks to siblings, parents, offspring as well as to other relatives can be estimated

Similarity of different types of relatives can permit modelling of genetic versus non-genetic familial influences

Page 16

Obesity in families (Quebec Family Study, 1996)

(Figure: bar chart of BMI correlations for parent-child, sibling and spouse pairs; vertical axis from 0 to 0.3)

Page 17

Genetic epidemiology

To disentangle genes and experience, we study special family groups:

– either family members sharing experiences but differing in shared genes, e.g. twin studies, or
– family members sharing genes but differing in their shared experience, e.g. adoption studies

Page 18

ADOPTION DESIGN

Test for association between trait in adoptees and trait in biological parents (genetic correlation) & test for association between trait in adoptees and trait in adoptive parents.

STRENGTHS: relatively powerful

WEAKNESSES:
(1) poor generalizability
(2) adoptive parents likely to provide 'good homes'
(3) biological parents of adopted children may have had multiple forms of psychopathology - selection
(4) poor characterization of phenotypes of biological parents

Page 19

Adoption studies of obesity (Sörensen et al. 1998)

(Figure: bar chart of BMI correlations between adoptees and their biological mother, biological father, biological siblings and adoptive parents; vertical axis from 0 to 0.25)

Page 20

The Classical Twin Study

Monozygotic (MZ) pairs are genetically alike

Dizygotic (DZ) pairs, like siblings, share on average half of their segregating genes

DZ pairs can be same-sexed or opposite-sex (male-female)

Increased similarity of twin pairs compared to unrelated subjects suggests familial factors

Increased similarity of MZ pairs compared to DZ pairs provides evidence for genetic factors

Page 21

BMI in 25-year-old female twin pairs (rMZ = 0.78, rDZ = 0.37)

(Figure: scatter plots of BMI in twin 1 against BMI in twin 2, one panel each for MZ and DZ pairs; graphs by zygosity)

FinnTwin16 study
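
The twin correlations quoted on this slide can be turned into rough variance-component estimates with Falconer's classical formulas. This is a back-of-the-envelope sketch, not the model-fitting approach used in the study. It assumes an additive ACE model, equal environments for MZ and DZ pairs and random mating, and the function name is hypothetical.

```python
def falconer_estimates(r_mz, r_dz):
    """Falconer's approximations for standardized ACE components."""
    a2 = 2 * (r_mz - r_dz)   # additive genetic variance
    c2 = 2 * r_dz - r_mz     # shared environmental variance
    e2 = 1 - r_mz            # unique environment (incl. measurement error)
    return a2, c2, e2


# Correlations from the slide: rMZ = 0.78, rDZ = 0.37.
a2, c2, e2 = falconer_estimates(0.78, 0.37)
print(f"a2 = {a2:.2f}, c2 = {c2:.2f}, e2 = {e2:.2f}")
# prints: a2 = 0.82, c2 = -0.04, e2 = 0.22
```

The slightly negative c2 arises because rMZ is a little more than twice rDZ. Under these simple formulas that pattern gives no sign of shared environmental effects and may hint at non-additive genetic effects, which is the kind of question that formal model comparison addresses.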

Page 22

The classical twin study modelling

Model the contribution of additive (A) and non-additive (D) genetic effects, environmental effects shared by family members (C) and unshared effects (E) (i.e. unique to each family member)

Competing models, e.g. E, AE, ACE can be statistically compared and tested against actual data

Mx, a statistical program created by Mike Neale, is most commonly used in genetic modelling: http://views.vcu.edu/mx/

(Path diagram: latent factors A1, C1 and E1 for twin 1 and A2, C2 and E2 for twin 2; the correlation between A1 and A2 is 1.0 for MZ and 0.5 for DZ pairs, and the correlation between C1 and C2 is 1.0)
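
The path diagram implies specific expected variances and covariances for a twin pair, and it is these that model-fitting programs compare with the observed data. A minimal sketch for the ACE case follows, assuming path coefficients a, c and e from the latent factors to the trait. The function name and the numerical values are hypothetical.

```python
def expected_twin_covariance(a, c, e, zygosity):
    """Expected 2x2 covariance matrix for a twin pair under an ACE model.

    a, c, e are path coefficients; zygosity is "MZ" or "DZ".
    """
    r_a = {"MZ": 1.0, "DZ": 0.5}[zygosity]  # correlation between A1 and A2
    variance = a**2 + c**2 + e**2
    covariance = r_a * a**2 + c**2          # C factors correlate 1.0
    return [[variance, covariance],
            [covariance, variance]]


# Hypothetical path coefficients, for illustration only.
a, c, e = 0.8, 0.4, 0.45
for zyg in ("MZ", "DZ"):
    cov = expected_twin_covariance(a, c, e, zyg)
    print(zyg, "expected twin correlation:", round(cov[0][1] / cov[0][0], 3))
```

Dropping a path (for example setting c to 0 for an AE model) changes the expected matrices, and the fit of the competing models to the observed MZ and DZ covariances can then be compared.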

Page 23

Twin similarity for life span at very old age

Page 24

Page 25

Extensions of the classical twin study I

Effect modification by age, sex and environmental factors, e.g. smoking or obesity

Assess genetic covariance over time through longitudinal models

Assess sex effects by comparison of like-sexed and opposite-sexed DZ pairs

Assess social interaction effects

Page 26

Age dependence of genetic effects: CHD in twin brothers

Page 27

Bivariate analyses indicate the genetic and environmental contributions to the relationship of relative weight at birth and in adolescence (Pietiläinen et al., Obes Res 2002)

Variance components (m = males, f = females):

PI at birth:    a2 m: 0.20, f: 0.47    c2 m: 0.42, f: 0.18    e2 m: 0.39, f: 0.35
BMI at 16 yr:   a2 m: 0.84, f: 0.90    e2 m: 0.16, f: 0.10

Correlations between the two traits:

ra   m: 0.21, f: 0.13
re   m: 0.16, f: 0.07
m: 0.11, f: 0.09 (unlabelled)

(Path diagram: A, C and E factors for variable 1 and variable 2 in twin 1 and twin 2; across twins the A factors correlate 1.0 (0.5 in DZ pairs) and the C factors 1.0; ra, rc and re are the correlations between the A, C and E factors of the two variables)

FinnTwin16
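
In a bivariate model of this kind, the correlation between two traits can be split into parts transmitted through correlated genetic, shared environmental and unique environmental factors. The sketch below uses the standard decomposition, in which each part is the square root of the product of the two variance components multiplied by the corresponding factor correlation. The male estimates are used as read from the slide, and the pairing of labels with values is an assumption because the figure layout was lost. The function name is hypothetical.

```python
from math import sqrt


def correlation_contributions(comp1, comp2, r_a, r_c, r_e):
    """Genetic, shared and unique contributions to the correlation
    between two traits under a bivariate ACE model.

    comp1, comp2: dicts with standardized variance components a2, c2, e2.
    r_a, r_c, r_e: correlations between the A, C and E factors of the traits.
    """
    genetic = sqrt(comp1["a2"] * comp2["a2"]) * r_a
    shared = sqrt(comp1["c2"] * comp2["c2"]) * r_c
    unique = sqrt(comp1["e2"] * comp2["e2"]) * r_e
    return genetic, shared, unique


# Male estimates as read from the slide (label/value pairing is assumed).
pi_birth = {"a2": 0.20, "c2": 0.42, "e2": 0.39}
bmi_16 = {"a2": 0.84, "c2": 0.0, "e2": 0.16}
parts = correlation_contributions(pi_birth, bmi_16, r_a=0.21, r_c=0.0, r_e=0.16)
print("genetic, shared, unique:", [round(p, 3) for p in parts])
print("implied phenotypic correlation:", round(sum(parts), 3))
```

With these inputs most of the small implied correlation runs through the genetic path, although the rounded slide values should not be expected to reproduce the published figures exactly.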

Page 28

Different phenotypes, different effects of genes: smoking

Phenotype                                     Genetic effects    Non-genetic family effects
Experimentation (age 12)                      11%                73%
Initiation/ever smoker (adolescents)          20-36%             18-59%
Initiation/ever smoker (adults)               28-80%             4-50%
Persistence/cessation                         58-71%             None
Nicotine dependence (Fagerström or DSM-IV)    60-72%             None

Page 29

Models of Gene-Environment Interaction

Purcell, S., Variance components models for gene-environment interaction in twin analysis. Twin Research, 2002. 5: p. 554-571

(Path diagram: latent factors A, C and E act on trait T through moderated paths a + βX·M, c + βY·M and e + βZ·M, where M is the measured moderator; the trait mean includes the term βM·M)
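
To see what the moderated paths imply, the variance attributed to each factor can be computed as the square of its path at a given value of the moderator M. The following is a minimal sketch under that reading of the diagram. The function name and all parameter values are hypothetical and chosen only to show the mechanics.

```python
def moderated_components(a, c, e, beta_x, beta_y, beta_z, m):
    """Standardized A, C, E variance shares at moderator value m.

    Each path is a linear function of the moderator:
    (a + beta_x*m), (c + beta_y*m), (e + beta_z*m).
    """
    var_a = (a + beta_x * m) ** 2
    var_c = (c + beta_y * m) ** 2
    var_e = (e + beta_z * m) ** 2
    total = var_a + var_c + var_e
    return var_a / total, var_c / total, var_e / total


# Hypothetical parameter values, for illustration only.
for m in (0.0, 0.5, 1.0):
    a2, c2, e2 = moderated_components(0.5, 0.6, 0.6, 0.3, -0.3, 0.0, m)
    print(f"M = {m:.1f}: a2 = {a2:.2f}, c2 = {c2:.2f}, e2 = {e2:.2f}")
```

With a positive βX and a negative βY, as in this made-up example, the genetic share rises and the shared environmental share falls as the moderator increases. Testing whether the β terms differ from zero is how moderation is assessed in this framework.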

Page 30

Parental Monitoring and Smoking Quantity (Dick et al., J Abn. Psych, 2006)

(Figure: standardized variance components a2, c2 and e2, on a scale from 0 to 1, plotted against parental monitoring scores from 4 to 12, low to high)

Page 31

TWIN DESIGN: Weaknesses

(1) Generalizability
- having a same-age sibling??
- having a genetically identical same-age sibling??

(2) Relative rarity of twin pairs.

(3) Non-orthogonal design -- need large sample sizes.

(4) If major environmental risk-factors are not assessed, interaction of genetic effects and shared environmental effects will be confounded with genetic effects.

(5) Weak for detecting parent-to-offspring environmental influences.

Page 32

Assumptions of the classical twin study

Equality of environmental variances in MZ and DZ pairs. Differences may arise from:

– placentation and in utero effects (implications of the fetal programming hypothesis)
– differential parental treatment
– zygosity determination errors

Random mating

Page 33

Perinatal mortality among twins by zygosity and chorionicity

(Figure: bar chart of fetal, 1-7 day and perinatal mortality for DZ, MZ dichorionic (MZDC) and MZ monochorionic (MZMC) twins; vertical axis from 0 to 9)

Page 34

Birthweights of twins
East Flanders Prospective Twin Survey (Loos 1998)

                    DZ        MZDC      MZMC
% of pairs          64        10        26
Mean birthweight    2476 g    2401 g    2314 g

Page 35

FAMILY STUDY

Ultimately, sampling regular families must be a key part of any genetic epidemiologic approach.

* Provides tests of the generalizability of findings obtained with more specialized twin-family and adoption designs.

* Allows adequate representation of minority groups. The numbers of minority twin pairs available for study, e.g. Swedish-speaking twin pairs in Finland, are often small.

Page 36

How to detect genetic effects and genes?

Molecular genetic studies:
– candidate genes, genome-wide scans
– association studies & linkage
– animal studies (e.g. 'knockouts')

Family studies:
– provide estimates of heritability
– information on mode of inheritance
– adoption and twin studies as special cases

(Figure: linkage marker map, markers D1S1597 through ATA29C07)

Page 37

Increasing the genetic signal in the data...

ascertain pedigree units that are likely to segregate genes of relevance
– Ex: pedigrees with quasi-Mendelian disease transmission
– affected sib pair approach of linkage analysis

ascertain families on the basis of individuals with extreme or remarkable phenotypes
– Ex: extremely discordant sibpairs
– ascertain young individuals with the disease

ascertain individuals from isolated populations:
– more homogeneous genetically and culturally as well

ascertain intermediate phenotypes
– physiologic phenotype is "closer" to sequence variants

Page 38

Two basic Analysis Strategies

1. candidate gene analysis
motto: study a few good genes

2. whole-genome searches (genome scans)
motto: cast out a net that catches all the big fish

Page 39

Association studies: Case-control design

What is the difference between genes of cases (e.g. with disease or trait) and controls?

Selection of controls is a major challenge, as in all case-control studies

High rate of false-positive studies:
– many genes are available for study
– population admixture
– confounding factors

Publication bias

Page 40

Candidate Gene Studies

statistically straightforward: test the association between genotypes and phenotype with contingency tables, chi-square test, regression

principle: if an allele is more frequent in affected than in unaffected individuals, the gene may be close to a disease gene

candidacy of a gene can come from a number of different sources:
– biological insights (e.g. gene expressed in a certain tissue)
– homology to other genes
– functional studies in model organisms
– member of a relevant gene family

Challenge: greater biological understanding of the genes
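
The contingency-table test mentioned on this slide can be sketched in a few lines. The example below computes a Pearson chi-square statistic for a 2x2 table of allele carrier status by case-control status, without continuity correction, and obtains the p-value from the one-degree-of-freedom chi-square distribution via the complementary error function. The function name and the counts are hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test (1 df, no continuity correction) for a
    2x2 table [[a, b], [c, d]]; returns (statistic, p_value)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row = (a + b, c + d)
    col = (a + c, b + d)
    stat = 0.0
    for i, obs_row in enumerate(table):
        for j, observed in enumerate(obs_row):
            expected = row[i] * col[j] / n
            stat += (observed - expected) ** 2 / expected
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p_value


# Hypothetical counts: rows = cases, controls; columns = carrier, non-carrier.
stat, p = chi_square_2x2([[60, 40], [45, 55]])
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```

A statistics package would normally be used in practice. The point here is only that the test compares observed cell counts with those expected if genotype and phenotype were independent.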

Page 41

POPULATION STRATIFICATION
Hypothetical Example (by Andrew Heath)

NORTHERN EUROPEAN ANCESTRY (N=200): NO ASSOCIATION

                 NOT ROMAN CATHOLIC    ROMAN CATHOLIC    % of sample
NOT A1 allele    162                   18                90%
A1 allele        18                    2                 10%
% of sample      90%                   10%

SOUTHERN EUROPEAN ANCESTRY (N=200): NO ASSOCIATION

                 NOT ROMAN CATHOLIC    ROMAN CATHOLIC    % of sample
NOT A1 allele    35                    15                25%
A1 allele        105                   45                75%
% of sample      70%                   30%

MINGLED IN AUSTRALIAN POPULATION (N=400)

                 NOT ROMAN CATHOLIC    ROMAN CATHOLIC
NOT A1 allele    197                   33
A1 allele        123                   47

OR = 2.28, 95% CI 1.39 - 3.73

Falsely infer that the A1 allele is a risk factor for Roman Catholicism.
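
The arithmetic behind this example can be checked directly. The sketch below reads the slide's cell counts as 162, 18, 18 and 2 for the northern group and 35, 15, 105 and 45 for the southern group, computes the odds ratio within each ancestry group, and then pools the two groups. The function and variable names are hypothetical.

```python
def odds_ratio(table):
    """Odds ratio for a 2x2 table laid out as
    [[not_A1_not_RC, not_A1_RC], [A1_not_RC, A1_RC]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


# Cell counts as read from the slide.
northern = [[162, 18], [18, 2]]
southern = [[35, 15], [105, 45]]

# Pool the two ancestry groups cell by cell.
pooled = [[n + s for n, s in zip(n_row, s_row)]
          for n_row, s_row in zip(northern, southern)]

print("Northern OR:", odds_ratio(northern))            # 1.0
print("Southern OR:", odds_ratio(southern))            # 1.0
print("Pooled OR:  ", round(odds_ratio(pooled), 2))    # 2.28
```

Within each group the allele and the trait are independent, yet pooling produces an odds ratio of 2.28 because both the allele and the trait are more common in one ancestry group. Analysing within strata, or adjusting for ancestry, removes the spurious association.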

Page 42

Genome-wide association studies

Large-scale case-control series

For example MI patients and matched controls without MI

Use of very large numbers of SNPs to identify all possible genes associated with the disease

Typically 100,000 to 500,000 SNPs

Different technology platforms (Affymetrix, Illumina)

Page 43

Gene x Environment Interactions (Kendler & Eaves, 1986)

(Figure: three panels, each plotting liability to illness against environment, from protective to predisposing, for genotypes AA, Aa and aa)

Genes and environment have additive, independent effects

Genes control degree of sensitivity to environmental influence

Genes control susceptibility to environmental pathogenesis

Page 44

Gene-environment correlations refer to genetic effects on individual differences in liability to exposure to particular environmental circumstances. (Background is the extensive evidence that environmental risk exposure is far from randomly distributed)

Gene-environment interactions concern genetically influenced individual differences in the sensitivity to specific environmental factors. (Background is the extensive evidence of huge individual differences in vulnerability to all manner of environmental hazards)

Page 45

Examples of social x biological interactive effects

Biology controls sensitivity to environmental effects
– E.g., family stress x serotonin metabolism => depression and anxiety risk (Caspi, Science 2003)

Social context generates undifferentiated risk; biology constrains pathologic specificity
– E.g., childhood neglect => alcoholism in men, eating disorders in women

Biological susceptibilities are amplified during rapid or intense contextual change
– E.g., biological or gender-based vulnerabilities to depression and alcohol use as indexed by pubertal development

Biology controls liability to experiencing predisposing environments
– E.g. genes for skin color

Page 46

Integration of information at different levels

Developments in molecular genetics now make it possible to attempt identification of liability genes in complex, multifactorial traits, and to dissect out with new precision the role of genetic predisposition and environmental/lifestyle factors in these disorders. But an integrative framework is needed.

Complex picture

Gottesman I, Science 1997

Page 47

Complexity of Complex Diseases

Classical polygenic or "threshold" inheritance: a certain number of mutations at different loci must be present before a system is sufficiently challenged to result in disease.

Locus heterogeneity, in which defects in any of a number of genes or loci confer disease susceptibility independently of each other.

Epistasis, or gene interaction: interactive effects of mutations, genotypes, and/or their biologic products

Environmental vulnerability: gene products are influenced by environmental stimuli.

Gene × environment interactions: a gene has a deleterious effect only in the presence of a particular environmental stimulus.

Time-dependent expression of genes

General aging of the system

Page 48

Testing of epidemiological causal hypotheses – use of twins

Differences between MZ cotwins in a pair are due to environmental causes (in the very broadest sense):

– somatic mutations and other genetic changes during development
– prenatal environmental and birth order effects
– differential treatment in childhood
– different exposures (occupational, lifestyle)

Exposure/disease discordant DZ pairs are fully matched on early childhood effects, and partially on genetic factors

Studies of exposure discordant twin pairs have increased power compared to unmatched case-control series, depending on the degree of familiality of the exposure
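
One common way to analyse exposure-discordant pairs is a matched-pair (conditional) odds ratio, in which only pairs that are also discordant for disease carry information. The sketch below illustrates that idea and is not necessarily the analysis used in the studies behind these slides. The function name and the counts are hypothetical.

```python
def matched_pair_odds_ratio(pairs):
    """Conditional odds ratio from exposure-discordant twin pairs.

    pairs: list of (exposed_twin_diseased, unexposed_twin_diseased) booleans.
    Only disease-discordant pairs are informative.
    """
    exposed_only = sum(1 for e, u in pairs if e and not u)
    unexposed_only = sum(1 for e, u in pairs if u and not e)
    if unexposed_only == 0:
        raise ValueError("no pairs in which only the unexposed twin is diseased")
    return exposed_only / unexposed_only


# Hypothetical data: 30 pairs where only the exposed twin is diseased,
# 12 where only the unexposed twin is, and 58 disease-concordant pairs.
pairs = ([(True, False)] * 30 + [(False, True)] * 12
         + [(True, True)] * 8 + [(False, False)] * 50)
print("matched-pair OR:", matched_pair_odds_ratio(pairs))  # 2.5
```

Because each exposed twin is compared with his or her own co-twin, shared childhood environment is held constant, and genetic background is held constant fully in MZ pairs and partially in DZ pairs.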