Page 1: Topics in Statistical Genetics

Topics in Statistical Genetics

Karl W Broman

Department of Biostatistics
Johns Hopkins School of Public Health

Page 2: Topics in Statistical Genetics

Past

• QTLs in experimental crosses
• CEPH data: 8 families (~130 people), >8000 STRPs
  – Genetic maps
  – Recombinational variation
  – Autozygosity
  – Crossover interference
• Genotyping/pedigree errors
• Dog genetic maps
• Various disease projects

Page 3: Topics in Statistical Genetics

Traditional genetics: binary traits
  e.g. disease or no disease

Quantitative traits
  e.g. yield of tomato crop, # bristles on a fly’s abdomen
  – Many genes
  – Environmental variation

Why?
  – Biochemical basis of trait
  – Selection experiments
  – Evolution

Goal: find (some of) the genes

Page 4: Topics in Statistical Genetics

Backcross experiment

Page 5: Topics in Statistical Genetics

Genetic markers: STRPs or microsatellites

Page 6: Topics in Statistical Genetics

Data

Phenotypes (trait values)
  $y_i$ = phenotype for individual $i$

Marker genotypes
  $x_{ij}$ = 1/0 if individual $i$ is HL/LL at marker $j$

Genetic map
  Locations of markers

Models

Recombination: no interference

Phenotype/genotype connection:
  $y = \mu + \sum_j \beta_j z_j + \epsilon$, with $\epsilon \sim \text{Normal}(0, \sigma^2)$
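The two model assumptions above (no interference; phenotype equal to a mean plus additive genotype effects plus normal noise) can be turned into a small simulation. The following is a minimal sketch, not the procedure used in the talk. It assumes a single chromosome, places the QTLs exactly at markers, and uses the Haldane map function for the recombination fraction between adjacent markers. The function name `simulate_backcross` and all of its parameters are hypothetical.

```python
import numpy as np

def simulate_backcross(n_ind, marker_pos_cM, qtl_markers, effects,
                       mu=0.0, sigma=1.0, seed=0):
    """Simulate backcross genotypes (1 = HL, 0 = LL) and phenotypes.

    Assumptions (illustrative only): one chromosome, no crossover
    interference, QTLs located exactly at the markers in `qtl_markers`.
    """
    rng = np.random.default_rng(seed)
    pos = np.asarray(marker_pos_cM, dtype=float)

    # Inter-marker distances in Morgans, converted to recombination
    # fractions with the Haldane map function (valid under no interference).
    d = np.diff(pos) / 100.0
    r = 0.5 * (1.0 - np.exp(-2.0 * d))

    # Under no interference, genotypes along the chromosome form a
    # Markov chain: each interval recombines independently with prob r.
    x = np.empty((n_ind, pos.size), dtype=int)
    x[:, 0] = rng.integers(0, 2, size=n_ind)
    for j in range(1, pos.size):
        recomb = rng.random(n_ind) < r[j - 1]
        x[:, j] = np.where(recomb, 1 - x[:, j - 1], x[:, j - 1])

    # Additive phenotype model: y = mu + sum_j beta_j * genotype_j + noise.
    beta = np.zeros(pos.size)
    beta[list(qtl_markers)] = effects
    y = mu + x @ beta + rng.normal(0.0, sigma, size=n_ind)
    return x, y

# Example: 200 progeny, 5 markers spaced 10 cM apart, one QTL at marker 2.
x, y = simulate_backcross(200, [0, 10, 20, 30, 40], qtl_markers=[2], effects=[1.0])
```

Data simulated this way give a known truth against which a QTL-detection procedure can be checked.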

Page 7: Topics in Statistical Genetics

Problem

100 to 1000 backcross progeny
100 to 400 markers

$y = \mu + \sum_j \beta_j x_j + \epsilon$

Find the x's with $\beta_j \neq 0$

Errors:
• Miss important loci
• Include extraneous loci
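One way to make "find the x's with nonzero coefficients" concrete is forward selection scored by BIC. The sketch below is only an illustration of that general idea: the slides do not say which selection procedure or criterion was used, so BIC, the function names (`rss`, `bic`, `forward_select_bic`), and the toy data with independent markers are all assumptions.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares for a least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def bic(x, y, cols):
    """BIC (up to a constant) for the linear model using marker columns `cols`."""
    n = len(y)
    return n * np.log(rss(x[:, cols], y) / n) + (len(cols) + 1) * np.log(n)

def forward_select_bic(x, y, max_terms=10):
    """Add one marker at a time while doing so lowers the BIC."""
    chosen = []
    current = bic(x, y, chosen)
    while len(chosen) < max_terms:
        candidates = [j for j in range(x.shape[1]) if j not in chosen]
        if not candidates:
            break
        scores = [bic(x, y, chosen + [j]) for j in candidates]
        k = int(np.argmin(scores))
        if scores[k] >= current:
            break  # no candidate improves the criterion
        chosen.append(candidates[k])
        current = scores[k]
    return sorted(chosen)

# Toy data (hypothetical): 300 progeny, 20 unlinked markers,
# true effects at markers 3 and 11.
rng = np.random.default_rng(0)
x_demo = rng.integers(0, 2, size=(300, 20))
y_demo = 1.0 * x_demo[:, 3] - 1.0 * x_demo[:, 11] + rng.normal(0.0, 0.5, size=300)
selected = forward_select_bic(x_demo, y_demo)
print(selected)
```

The two error types listed above correspond to true markers missing from `selected` and extra markers appearing in it. A stricter penalty trades the second kind of error for the first.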

Page 8: Topics in Statistical Genetics

Discussion

• Identifying QTLs is model selection
• Simulation studies are necessary
  – compare procedures
  – understand a procedure’s performance
• Different situations will require different procedures

Page 9: Topics in Statistical Genetics

Human data

• www.marshmed.org/genetics
• 8 CEPH families
  – three generations
  – 11 to 15 progeny
  – 92 meioses
• ~8,000 STRP markers
  – 90 ± 7% typed
• Average spacing
  – female: 0.6 ± 1.2 cM
  – male: 0.4 ± 1.0 cM
  – sex-averaged: 0.5 ± 0.9 cM
• Data cleaning
  – Removed 764/964,425 (~0.08%) genotypes resulting in tight double recombinants
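A tight double recombinant is a marker whose inferred grandparental origin differs from both of its typed neighbours over a short stretch, which would require two crossovers very close together and is more plausibly a genotyping error. The sketch below shows one simple way such genotypes might be flagged. The slides do not give the rule actually used, so the 5 cM threshold, the 0/1/`None` phase encoding, and the function name `flag_tight_doubles` are all assumptions.

```python
def flag_tight_doubles(pos_cM, phase, max_span_cM=5.0):
    """Return indices of markers that form a single-marker tight double recombinant.

    pos_cM : marker positions in cM, in map order
    phase  : grandparental origin per marker for one meiotic product
             (0 or 1), or None where the marker is untyped/uninformative
    max_span_cM : hypothetical threshold on the distance between the two
                  flanking typed markers
    """
    typed = [i for i, p in enumerate(phase) if p is not None]
    flagged = []
    for k in range(1, len(typed) - 1):
        left, mid, right = typed[k - 1], typed[k], typed[k + 1]
        flanks_agree = phase[left] == phase[right]
        middle_differs = phase[mid] != phase[left]
        tight = (pos_cM[right] - pos_cM[left]) <= max_span_cM
        if flanks_agree and middle_differs and tight:
            flagged.append(mid)
    return flagged

# Marker 2 switches origin for a single marker within 2 cM, so it is flagged;
# the switch at marker 3 is flanked by markers 78 cM apart, so it is not.
positions = [0.0, 1.0, 2.0, 3.0, 40.0, 80.0]
origins = [0, 0, 1, 0, None, 1]
suspects = flag_tight_doubles(positions, origins)
```

This handles only isolated single-marker switches. Flagging short runs of several markers would need a different rule.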

Page 10: Topics in Statistical Genetics

Meiosis

Page 11: Topics in Statistical Genetics

Interference

• Strand choice
  → Chromatid interference
• Spacing
  → Chiasma (crossover) interference

Page 12: Topics in Statistical Genetics

Distance (cM) = average # crossovers in 100 meiotic products

Per Morgan: 2 chiasmata on 4-strand bundle, 1 crossover on meiotic product

Recombination fraction as a function of genetic distance

Haldane: $r(d) = \frac{1}{2} [1 - \exp(-2d)]$
Kosambi: $r(d) = \frac{1}{2} \tanh(2d)$
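The two map functions above are easy to compare numerically. The sketch below implements them and their inverses, taking the distance d in Morgans (so 10 cM is d = 0.1). The function names are illustrative only.

```python
import math

def haldane(d):
    """Recombination fraction under the Haldane map function (d in Morgans)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))

def kosambi(d):
    """Recombination fraction under the Kosambi map function (d in Morgans)."""
    return 0.5 * math.tanh(2.0 * d)

def haldane_inverse(r):
    """Genetic distance in Morgans for recombination fraction r < 0.5 (Haldane)."""
    return -0.5 * math.log(1.0 - 2.0 * r)

def kosambi_inverse(r):
    """Genetic distance in Morgans for recombination fraction r < 0.5 (Kosambi)."""
    return 0.25 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

# Compare the two functions over a range of distances (in cM).
for cM in (1, 10, 50, 100):
    d = cM / 100.0
    print(cM, round(haldane(d), 4), round(kosambi(d), 4))
```

For short distances both functions give r close to d. For the same d > 0, Kosambi gives the larger recombination fraction, and both approach ½ as d grows.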

Page 13: Topics in Statistical Genetics


Page 14: Topics in Statistical Genetics

[Figure: Crossovers in Father. Horizontal axis: Total Number of Crossovers (15 to 35). Vertical axis: CEPH Family (102, 884, 1331, 1332, 1347, 1362, 1413, 1416).]

Page 15: Topics in Statistical Genetics

[Figure: Crossovers in Mother. Horizontal axis: Total Number of Crossovers (20 to 60). Vertical axis: CEPH Family (102, 884, 1331, 1332, 1347, 1362, 1413, 1416).]

Page 16: Topics in Statistical Genetics

[Figure: Chromosome 1. Horizontal axis: Sex-averaged Location (0 to 300). Vertical axis: Female/Male Distance Ratio (0 to 5).]

Page 17: Topics in Statistical Genetics

[Figure: Chromosome 4. Horizontal axis: Sex-averaged Location (0 to 200). Vertical axis: Female/Male Distance Ratio (0 to 6).]

Page 18: Topics in Statistical Genetics

[Figure: Chromosome 6, Family 884. Rows labeled 3m through 14m; horizontal axis from 0 to 400.]

Page 19: Topics in Statistical Genetics

CEPH: homozygous regions

Individual             Number of regions   Length (cM)   Total length (cM)
884-01 (father)        10                  2 – 23        116
884-02 (mother)        11                  5 – 40        165
884: grandparents      2                   7             14
                       7 – 10              1 – 29        58 – 120
884: 12 progeny        5 – 18              2 – 36        62 – 204
102: 14 progeny        4 – 13              1 – 39        85 – 254
1331-12 (GP)           2                   1 – 6         7
1416-14 (GP)           4                   6 – 29        76
35/100 of the others   1 – 2               0 – 7

Page 20: Topics in Statistical Genetics

Present

• Finish the past
• Yellowstone wolves
• Sib pairs where parents are unavailable
• Various disease projects

Page 21: Topics in Statistical Genetics

[Figure: genotype labels. Top row: 130 / 134 and 134 / 138. Bottom row: 130 / 138 and 134 / 139.]

Page 22: Topics in Statistical Genetics

Future

• Large genetic/epidemiological studies
  – Many people
  – Many phenotypes
  – Many environmental covariates
• Highly computational statistical genetic methods
• Various disease projects