overview - pdfs.semanticscholar.org · uk 1971 pbi maris nimrod uk 1972 pbi maris huntsman uk 1995...

39
Jon White, 15 March, 2011 Overview Association mapping Problems Statistical solutions Comments on design and power

Upload: leliem

Post on 28-Jul-2019

220 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Overview

• Association mapping

– Problems

– Statistical solutions

• Comments on design and power

Page 2: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Trait mapping using association

Allele a is found more frequently with

Allele A Allele a

Page 3: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Population structure and association

If there are unknown

subgroups or families,

if allele freqs differ

between subgroups,

if traits differ between

subgroups,

then:

spurious association

will be observed.

A A a A

A a A a

A A A A

a a a A

a a a A

a a A a

A A A a a A a a a a

Page 4: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Between marker association falls with rising genetic distance.

Association

without

linkage.

Page 5: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

LD mapping

Family based linkage mapping and LD mapping compared:

LD mapping exploits historic recombination in wild populations

and is best at fine mapping.

Linkage analysis exploits contemporary recombination in

experimental populations and is best at QTL detection.

Page 6: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Strengths and weaknesses of LD mapping

Kinship & pop structure largely solved

Low power need large pop sizes

Better precision LD decays more rapidly

Use of existing data historical collections

Need high marker density will be solved

Page 7: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Pedigree structure generates false +ve’s

1) Close kinship

Amounts to double counting: you have less data than you think.

2) Distant branches diverge: selection /drift /founder effects

Genotypes and phenotypes can differ between branches, causing associations

across the genome at multiple loci.

Any natural population will comprise a mixture of these effects.

Relative importance will vary with dataset.

Need to account for both.

In crops, problems associated with kinship effects are massive.

Not a problem in experimental mapping populations eg an F2

Page 8: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

UK30

UK60

UK40

UK50

UK70

UK80

UK00

UK90

US20

US00

US80

US90

US60

US70

US30

US50

US10

US40

Au50

Au60

Au70

Au80

Au90

Wheat varieties

Structure

UK 1969 PBNE Cama

UK 1994 CPB Cadenza

UK 1942 PBI Steadfast

UK 1956 WEIB Banco

UK 1965 RUMK Kloka

UK 1943 BLON Bersee

UK 1978 RPB Armada

UK 1984 RPB Mission

UK 1965 PBI Maris Widgeon

UK 1969 DESP West Desprez

UK 1972 DESP Bouquet

UK 1986 PBI Mercia

UK 1973 GUIL Atou

UK 1974 RPB Mega

UK 1944 VILM Vilmoren 27

UK 1953 DESP Capelle Desprez

UK 1968 PBI Maris Ranger

UK 1956 GART N59

UK 1950 Mar Hybrid 46

UK 1960 LEPU Elite Lepeuple

UK 1962 BEOI Champlein

UK 1974 LEGL Flinor

UK 1958 GART Milfast

UK 1947 GART Redman

UK 1950 CENE Staring

UK 1935 ESCA Steel

UK 1935 PBNE Juliana

UK 1931 PBNE Wilhelmina

UK 1947 GART Victor

UK 1935 PBI Holfast

UK 1931 PBI Little Joss

UK 1954 GART Masterpiece

UK 1931 PBI Yeoman

UK 1940 PBNE Wilma

UK 1940 GART Warden

UK 1952 PADE King II

UK 1957 GART Dominator

UK 1944 PBBE Jubilegem

UK 1988 BREU Apollo

UK 1950 GART Pilot

UK 1931 ESCA Iron III

UK 1963 WEIB Thor

UK 1931 PBI Squareheads Master [aka Standard Red]

UK 1942 GART Gartons 60

UK 1938 ESCA Chevalier

UK 1938 ESCA Crown

UK 1960 RPB Flamingo

UK 1975 PBI Maris Fundin

UK 1980 BENI Copain

UK 2003 CPB Robigus

UK 2005 STAA Glasgow

UK 1973 JORI Val

UK 1979 RPB Aquila

UK 1983 RPB Stetson

UK 1986 WEIB Slejpner

UK 1993 SERA Genesis

UK 1994 RPB Flame

UK 1991 PBI Hereward

UK 1997 PBI Charger

UK 2004 PBI Gladiator

UK 2001 ELSM Tanker

UK 1993 PBI Hunter

UK 1990 PBI Beaver

UK 1990 PBI Haven

UK 2004 CPB Cordiale

UK 1999 CPB Malacca

UK 1999 RPB Claire

UK 1993 ICI Brigadier

UK 1992 PBI Torfrida

UK 2004 ADVA Smuggler

UK 1938 DESP Desprez 80 [Joncquois]

UK 1977 RING Kador

UK 1983 PBI Galahad

UK 1981 GART Rapier

UK 1982 PBI Fenman

UK 1981 PBI Norman

UK 1983 PBI Longbow

UK 1989 PBI Riband

UK 2004 CEBE Dickson

UK 2001 PBI Option

UK 1979 PBI Brigand

UK 1980 PBI Avalon

UK 1964 RPB Rothwell Perdix

UK 1993 RPB Spark

UK 1954 PBNE Minister

UK 1979 PBI Virtue

UK 1978 PBI Hustler

UK 1960 PBNE Professeur Marchal

UK 1971 PBI Maris Nimrod

UK 1972 PBI Maris Huntsman

UK 1995 DESP Soissons

UK 1968 CAMB Joss Cambier

UK 1992 ICI Admiral

Pre-1970

Post-1970

Kinship

Pre-1970

Post-1970

Structure or kinship?

Kinship or Structure?

Page 9: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Experimental solutions

The transmission disequilibrium test humans

(also QTDT, PTD etc.)

Nested Association mapping - maize (Buckler)

MAGIC - mouse, Arabidopsis, wheat

Selection experiments

Page 10: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Analytical solutions

Genomic control: returning the mean of the distribution of the test

statistic to its expectation under the null.

Structured Association: simple linear regression but include covariates

to account for subpopulation membership. Use STRUCTURE to get

the covariates

PCA: Similar to SA but use PCA to adjust both and phenotype for

subpopulation membership in terms of top (20) eigenvectors of the

correlation matrix; measure association in terms of correlation

between the residuals from these models.

Mixed Model: currently the method of choice

Others

Page 11: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Raw association with winter/spring habit. Barley.

0

20

40

60

80

100

120

140

X0.0

0.1

.1

X47.4

7.1

.1

X61.5

3.1

.1

X106.6

0.1

X138.3

1.1

.1

X32.1

7.2

X54.9

5.2

.4

X70.5

4.2

X96.8

2.2

.1

X121.5

0.2

X147.1

2.2

X6.0

3.3

.1

X54.4

0.3

X56.4

0.3

.27

X74.7

8.3

X111.4

2.3

.2

X155.8

5.3

X19.5

2.4

.1

X48.5

0.4

.3

X62.1

0.4

X78.7

7.4

X117.6

0.4

.1

X26.2

8.5

X51.0

0.5

.7

X63.3

1.5

.2

X103.0

1.5

X129.4

1.5

.5

X146.0

0.5

.2

X159.7

9.5

.8

X189.6

0.5

.3

X31.7

3.6

.1

X54.6

0.6

.3

X63.2

7.6

X81.1

7.6

X112.3

2.6

.1

X1.8

9.7

.1

X60.6

9.7

X79.6

0.7

.8

X128.3

6.7

X0.0

0.7

L.2

Markers in genetic map order.

-Lo

g10 P

(No

asso

cia

tio

n)

Does this look like a credible genetic model? It can lead to strong association between

many loci and phenotype.

The effect of structure is genome-wide and

random with respect to loci.

Page 12: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Genomic control

Test locus Unlinked ‘null’ markers

( )2cE

c2 No stratification

( )2cE

c2

Stratification adjust test statistic

Stratification present

Page 13: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Genomic Control: Rationale

• There is association at nearly all markers.

• We know the expected distribution of the test statistic

under the null hypothesis.

• We really only expect association at a few markers.

• So we would not expect the observed distribution to be

much different from the expected.

Page 14: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Expected distribution of chi-squared.

0

0.1

0.2

0.3

0.4

0.5

0.6

0.5

1.5

2.5

3.5

4.5

5.5

6.5

7.5

8.5

9.5

10.5

11.5

12.5

13.5

14.5

Chi-Sq

Fre

qu

en

cy

Observed distribution of chi-squared

0

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

0.5

11.5

22.5

33.5

44.5

55.5

66.5

77.5

88.5

99.5

111

122

133

144

155

166

177

188

199

210

221

232

243

Chi-sq

Fre

qu

en

cy

Mean:

1.0

Median:

0.456

Mean:

94.07

Median:

57.28

Genomic control: Example using Chi-Squared, 1 d.f.

Page 15: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Genomic Control

456.057.28

iSqObservedCh

hiSqCorrectedC

dianChiSqExpectedMedianChiSqObservedMe

iSqObservedChhiSqCorrectedC

Some authors correct using observed and

expected mean.

In our example:

Page 16: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Genomic Control Corrected Observed vs. Expected.

0

0.1

0.2

0.3

0.4

0.5

0.6

0.5

1.5

2.5

3.5

4.5

5.5

6.5

7.5

8.5

9.5

10.5

11.5

12.5

13.5

14.5

Chi-sq

Fre

qu

en

cy

Obs (GC corr)

Expected

Page 17: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Before GC After GC

Page 18: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Raw association with winter/spring habit. Barley.

0

20

40

60

80

100

120

140

X0.0

0.1

.1

X47.4

7.1

.1

X61.5

3.1

.1

X106.6

0.1

X138.3

1.1

.1

X32.1

7.2

X54.9

5.2

.4

X70.5

4.2

X96.8

2.2

.1

X121.5

0.2

X147.1

2.2

X6.0

3.3

.1

X54.4

0.3

X56.4

0.3

.27

X74.7

8.3

X111.4

2.3

.2

X155.8

5.3

X19.5

2.4

.1

X48.5

0.4

.3

X62.1

0.4

X78.7

7.4

X117.6

0.4

.1

X26.2

8.5

X51.0

0.5

.7

X63.3

1.5

.2

X103.0

1.5

X129.4

1.5

.5

X146.0

0.5

.2

X159.7

9.5

.8

X189.6

0.5

.3

X31.7

3.6

.1

X54.6

0.6

.3

X63.2

7.6

X81.1

7.6

X112.3

2.6

.1

X1.8

9.7

.1

X60.6

9.7

X79.6

0.7

.8

X128.3

6.7

X0.0

0.7

L.2

Markers in genetic map order.

-Lo

g10 P

(No

asso

cia

tio

n)

Page 19: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Association with winter/spring habit following genomic control. Barley.

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

X0.0

0.1

.1

X47.4

7.1

.1

X61.5

3.1

.1

X106.6

0.1

X138.3

1.1

.1

X32.1

7.2

X54.9

5.2

.4

X70.5

4.2

X96.8

2.2

.1

X121.5

0.2

X147.1

2.2

X6.0

3.3

.1

X54.4

0.3

X56.4

0.3

.27

X74.7

8.3

X111.4

2.3

.2

X155.8

5.3

X19.5

2.4

.1

X48.5

0.4

.3

X62.1

0.4

X78.7

7.4

X117.6

0.4

.1

X26.2

8.5

X51.0

0.5

.7

X63.3

1.5

.2

X103.0

1.5

X129.4

1.5

.5

X146.0

0.5

.2

X159.7

9.5

.8

X189.6

0.5

.3

X31.7

3.6

.1

X54.6

0.6

.3

X63.2

7.6

X81.1

7.6

X112.3

2.6

.1

X1.8

9.7

.1

X60.6

9.7

X79.6

0.7

.8

X128.3

6.7

X0.0

0.7

L.2

Markers in genetic map order

-Lo

g10 P

(No

asso

cia

tio

n)

Page 20: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Genomic Control

• Corrects the symptoms of structure.

• Does not change the ranking of significance of

association.

• Loss of statistical power.

Key Benefit.

• Returns the distribution of the test statistic close

to its expectation: Allows us to work with

conventional significance thresholds*. (*Well almost!)

Page 21: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Structured Association

Estimate the ancestry of each individual in the sample. Most

common is to use the programme Structure.

Regress the phenotype on the ancestry coefficients (to adjust for

effects of population structure) and then on the test marker.

Does not correct adequately for recent coancestry – pedigree

relationships.

Page 22: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Structure View Software: Structure v2.2.

Variety K1 K2 K3 K4

A 0.1 0.0 0.0 0.9

B 0.5 0.0 0.0 0.5

C 0.9 0.1 0.0 0.0

Q Matrix of Fractional Sub-population

membership

Page 23: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

Effect of structure specific correction for population

structure

Bonferroni corrected.

(P=0.05)

PCA corrected Association.

Page 24: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Principal component analysis

PCA of genotype data gives an indicator of ancestry for each variety

(the eigenvector) for a population characterised by its the eigenvalue.

The deviation for each variety from a multiple regression of phenotype

on eigenvectors gives a a new phenotype adjusted for population

structure.

Deviations from regression of candidate markers on eigenvectors gives

adjusted genotypes in the same way.

Correlation between adjusted phenotype and adjusted genotype is a a

measure of association adjusted for the effect of population structure.

Advice is to include ~ 20 largest principle components for ancestry

Currently only works for bi-allelic markers.

Will not adjust for recent coancestry.

Page 25: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

PCA based correction (a.k.a. Eigenstrat).

• Define the population structure in terms of

co-variation between individuals.

• Simplify the information as principle

components: eigenvectors.

• Use an informative subset of these vectors to

predict genotype and phenotype.

• Calculate residuals

• Measure the correlation between the

residual phenotype the residual genotype for

each marker.

Page 26: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

-0.15

-0.1

-0.05

0

0.05

0.1

-0.08 -0.06 -0.04 -0.02 0 0.02 0.04 0.06 0.08

PCA 1. (19%)

PC

A 2

. (5

%)

Pop 1

Pop 2

Pop 3

Pop 4

Pop 5

Pop 6

Pop 7

Pop 8

Pop 9

No clear pop

Principle components contain a lot of structural

information.

Page 27: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Jon White, 15 March,

2011

0

5

10

15

20

25

X0.00

.1.1

X42.5

2.1.

2

X59.7

1.1

X92.8

0.1

X128.

14.1

.1

X10.9

4.2

X44.1

3.2

X58.2

4.2.

2

X71.5

6.2

X96.8

2.2.

2

X119.

05.2

X139.

65.2

.3

X0.00

.2L.

13

X39.4

5.3.

3

X56.4

0.3.

12

X64.1

9.3.

2

X81.6

6.3.

1

X114.

00.3

.5

X162.

15.3

.1

X19.5

2.4

X47.6

0.4

X55.6

3.4.

4

X68.2

1.4.

2

X99.2

8.4

X0.00

.4L.

3

X39.9

7.5.

1

X51.6

0.5.

2

X68.3

5.5

X103.

01.5

X129.

41.5

.2

X143.

92.5

.1

X159.

09.5

.1

X173.

08.5

X3.11

.6

X43.1

5.6.

1

X55.6

5.6.

4

X64.3

6.6.

3

X81.1

7.6.

1

X110.

32.6

X0.00

.6L.

6

X42.6

0.7

X77.8

5.7.

4

X88.6

5.7

X141.

76.7

X0.00

.7S.3

Markers (map order).

-lo

g1

0(p

-va

lue

)

Uncorrected association with awn

colouration in barley.

0

5

10

15

20

25

X0.00

.1.1

X42.5

2.1.

2

X59.7

1.1

X92.8

0.1

X128.

14.1

.1

X10.9

4.2

X44.1

3.2

X58.2

4.2.

2

X71.5

6.2

X96.8

2.2.

2

X119.

05.2

X139.

65.2

.3

X0.00

.2L.

13

X39.4

5.3.

3

X56.4

0.3.

12

X64.1

9.3.

2

X81.6

6.3.

1

X114.

00.3

.5

X162.

15.3

.1

X19.5

2.4

X47.6

0.4

X55.6

3.4.

4

X68.2

1.4.

2

X99.2

8.4

X0.00

.4L.

3

X39.9

7.5.

1

X51.6

0.5.

2

X68.3

5.5

X103.

01.5

X129.

41.5

.2

X143.

92.5

.1

X159.

09.5

.1

X173.

08.5

X3.11

.6

X43.1

5.6.

1

X55.6

5.6.

4

X64.3

6.6.

3

X81.1

7.6.

1

X110.

32.6

X0.00

.6L.

6

X42.6

0.7

X77.8

5.7.

4

X88.6

5.7

X141.

76.7

X0.00

.7S.3

Markers (map order).

-lo

g1

0(p

-va

lue

)

Principle components used to

correct for population structure.

Page 28: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Mixed models

Pedigree relationships mean that the error variances for each individual are no

longer independent.

In addition to error variances, we must include in the model error covariances

between related individuals.

If the relationships are known, these are fed into the model as expected genetic

covariances among individuals. This is the basis of the mixed model. Software

exists to do this automatically – GenStat, SAS, VCE and others.

If relationships are unknown, they can be estimated using markers, but these are

not so easily fed into standard software which exploit properties of known

pedigrees to greatly speed up computation. Use TASSEL, GenStat, EMMA, SAS

Mixed modelling adjusts for kinship yet still permits the inclusion of covariates to

adjust for differences in phenotype between subgroups.

Page 29: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Mixed effects modelling

• Commonly implemented using software

called TASSEL – we have found this very

difficult to use.

• EMMA (Efficient mixed model analysis) is

also available free and runs in R.

• A more rapid (and less temperamental)

implementation of the EMMA method is

soon to be available from Will Astle,

Imperial College.

Page 30: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Strengths and weaknesses of LD mapping

Kinship & pop structure largely solved

Low power need large pop sizes

Better precision LD decays more rapidly

Use of existing data historical collections

Need high marker density will be solved

Easy to publish “hits” educate on good design

Page 31: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Want a cheap

association

mapping

publication?

Do:

Use a small collection of cultivars.

Use a small number of “genome wide” markers.

Run STRUCTURE but with the default parameters.

Don’t:

Carry out power calculations.

Check for off-chromosome LD.

Check that “replicated QTLs” are no more than expected by chance.

Check the type 1 error rate.

Page 32: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Underpowered studies in crops, an exemplar:

Recently published:

7 chromosomes, 46 SSRs, 30 accessions.

Multiple traits scored on 5 plants per accession.

No LD plots,

No power calculations,

Many positive results:

Page 33: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

“Strong linkages … were

found for quantitative

traits”

“a few SSR markers were

linked to multiple traits”

“some of the associations identify

chromosomal regions containing

previously known genetic loci”

“variation in one trait …linked

to SSR markers from various

chromosomal regions”

“The present data also indicate

that a relatively high map

resolution can be achieved by

association mapping”

via screening only a small pool of

genotypes, possible loci conferring a

specific trait are detected in this

species, which often requires a large

pool of germplasm to be screened in

other species

“The findings also indicate that the

efficiency of association mapping

is much higher in …than in other

plant species.”

Page 34: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Sir Austin Bradford Hill.

(First demonstrated the

connection between smoking

and lung cancer.)

“The glitter of

the t table diverts

attention from

the inadequacies

of the fare.”

Page 35: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Extract from UK MREC application form:

13. Size of the study (including controls)

i) How many patients will be recruited?

ii) How many controls will be recruited?

iii) What is the primary end point?

iv) How was the size of the study determined?

v) What is the statistical power of the study?

Good study design is an ethical issue.

Page 36: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

A well designed association genetics study should consider:

marker density

LD decay

allele frequency distribution

number of samples

relatedness between samples

Are resources adequate for the objectives of the study?

What magnitude of effect are you likely to detect?

With what precision are you likely to locate QTL?

Are the results too good to be true?

Page 37: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Small studies can find major genes

10 cases, 10 controls,

(40 chromosomes)

recessive trait:

White spotting

Hair ridge.

Mapped to < 1 cM in ~ 20

dogs (40 chromsomes)

Nat Genet 2007 39 p1321

Page 38: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

The Wellcome Trust Case Control Consortium, Nature 2007

Genome-wide association study of 14,000 cases of seven

common diseases and 3,000 shared controls

24 genetic risk factors

Large collaborative effort: > 50 research groups

500 000 markers

But:

These explain only a small proportion of risk:

power is still low for the effect sizes in humans.

2007: GWA studies come of age.

Page 39: Overview - pdfs.semanticscholar.org · UK 1971 PBI Maris Nimrod UK 1972 PBI Maris Huntsman UK 1995 DESP Soissons UK 1968 CAMB Joss Cambier UK 1992 ICI Admiral Pre-1970 Post-1970 Kinship

Hundreds of variants clustered in genomic loci

and biological pathways affect human height.

Nature 2010 doi:10.1038/nature09410

h2 ~80%

183,727 individuals

>180 loci, 100s of genetic variants

“Our data explain approximately 10% of

the phenotypic variation in height”