Post on 08-Apr-2015
Introduction to Animal Breeding with Examples of (Non-)Gaussian Traits
Gregor Gorjanc
University of Ljubljana, Biotechnical Faculty, Department of Animal Science, Slovenia
INLA for Animal Breeders “Project”, Trondheim, Norway, 30th August 2010
Thank you for the invitation to NTNU!!!
My department ...
Table of Contents
1. Animal breeding crash course
2. Categorical trait example
3. Survival analysis example
1. Animal Breeding Crash Course
Introduction
- Animal breeding = mixture(animal science, genetics, statistics, ...)
- Many species (cattle, chicken, pig, sheep, goat, horse, dog, salmon, shrimp, honeybee, ...)
- Many (complex) traits:
  - production (milk, meat, eggs, ...)
  - reproduction (no. of offspring, insemination success, ...)
  - conformation (body height, width, ...)
  - health & longevity
  - ...
- Genetic evaluation - to enhance selective breeding
Selective Breeding
- Measure phenotype in candidates and select those with the most favourable values (= “mass” selection)
- Selected candidates will breed the next (better) generation
- ..., but phenotype is not transmitted to the next generation
Decomposition of Phenotypic Value
[Diagram: Genotype, Environment -> Phenotype]
P = G + E + G × E
- Genetic evaluation = inference of genotypic value given the data and postulated model (= “BLUP” selection)
Postulated Model and Data
- Postulated model
  P = G + E + G × E = A + D + I + ...
  - A - additive (breeding) value
  - D - dominance
  - I - epistasis
- Data
  - phenotypes on various relatives (pedigree)
    - own performance test
    - progeny test
    - (half-)sib test
    - ...
  - recently also genotype marker data
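Evaluation via Pedigree based Mixed Models
- Not so standard example - “maternal animal model”

\[
\begin{aligned}
y \mid b, c, a_d, a_m, R &\sim N(Xb + Z_c c + Z_{a_d} a_d + Z_{a_m} a_m, R), & R &= I\sigma^2_e \\
b &\sim \text{const.} \\
c \mid C &\sim N(0, C), & C &= I\sigma^2_c \\
a = (a_d^T, a_m^T)^T \mid G &\sim N(0, G), & G &= G_0 \otimes A, \quad
G_0 = \begin{pmatrix} \sigma^2_{a_d} & \sigma_{a_d,a_m} \\ \text{sym.} & \sigma^2_{a_m} \end{pmatrix}
\end{aligned}
\]

data: $y$ (phenotypes), $X$, $Z_*$ (“covariates”), $A$ (pedigree)
parameters: $b$, $c$, $a$ (means); $\sigma^2_c$, $\sigma^2_{a_d}$, $\sigma_{a_d,a_m}$, $\sigma^2_{a_m}$, $\sigma^2_e$ (variances)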
Inference (for Gaussian models)
- “Standard”
  - means - solve Mixed Model (Normal) Equations (MME*), Henderson (1949+)
  - SE of means (needed for accuracies) - inversion of LHS or some approximation
  - variances - maximize Restricted Likelihood (REML), Patterson & Thompson (1971)
- “Powerful/Popular/Fancy/...” - McMC
- *MME

\[
LHS = \begin{pmatrix}
X^T R^{-1} X & X^T R^{-1} Z_c & X^T R^{-1} Z_a \\
 & Z_c^T R^{-1} Z_c + C^{-1} & Z_c^T R^{-1} Z_a \\
\text{sym.} & & Z_a^T R^{-1} Z_a + G_0^{-1} \otimes A^{-1}
\end{pmatrix}
\]

\[
RHS = \left( (X^T R^{-1} y)^T, (Z_c^T R^{-1} y)^T, (Z_a^T R^{-1} y)^T \right)^T
\]
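As a rough illustration of what "solving the MME" involves, the sketch below builds and solves the equations for a much simpler model than the maternal animal model: one overall mean, one additive genetic effect per animal, and R = I sigma2_e, so that multiplying through by sigma2_e leaves Z'Z + lam * A^{-1} with lam = sigma2_e / sigma2_a. The three phenotypes, the sire-dam-offspring trio behind `ainv`, the value of `lam`, and the helper `solve` are all assumptions made for this example, not values from the talk.

```python
def solve(lhs, rhs):
    """Solve lhs * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(lhs, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Hypothetical data: animals 0 and 1 are unrelated founders, animal 2 is their offspring.
y = [10.0, 12.0, 13.0]
ainv = [[1.5, 0.5, -1.0],
        [0.5, 1.5, -1.0],
        [-1.0, -1.0, 2.0]]      # inverse relationship matrix of the trio
lam = 2.0                       # assumed ratio sigma2_e / sigma2_a
n = len(y)

# X is a column of ones and Z is the identity, so X'X = n, X'Z = 1', Z'Z = I.
lhs = [[float(n)] + [1.0] * n]
for i in range(n):
    row = [(1.0 if i == j else 0.0) + lam * ainv[i][j] for j in range(n)]
    lhs.append([1.0] + row)
rhs = [sum(y)] + y

b_hat, *a_hat = solve(lhs, rhs)
print("mean:", b_hat)
print("breeding values:", a_hat)
```

With real data the LHS is large and sparse, so dense elimination as used here would be replaced by sparse or iterative solvers.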
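Graphical Model View of Pedigree Model

\[
A^{-1} = (T^{-1})^T W^{-1} T^{-1} = (I - \tfrac{1}{2}P)^T W^{-1} (I - \tfrac{1}{2}P)
\]

\[
W_{i,i} = 1 - \tfrac{1}{4}\left(1 + F_{f(i)}\right) - \tfrac{1}{4}\left(1 + F_{m(i)}\right)
\]

[Figure: graphical model (plate $i = 1 : n_I$) with nodes $\sigma^2_a$, $a_{f(i)}$, $a_{m(i)}$, $a_i$, $W_{i,i}$; edge weights 1/2, 1/2]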
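The factorisation of A^{-1} means it can be accumulated animal by animal without ever forming A. A minimal sketch, assuming no inbreeding (all F = 0, so W_{i,i} is 1, 3/4 or 1/2 for zero, one or two known parents) and a pedigree in which parents are listed before their offspring; the function name `a_inverse` and the three-animal pedigree are hypothetical.

```python
def a_inverse(pedigree):
    """Build A^{-1} from a list of (father, mother) index pairs.

    None marks an unknown parent. Assumes no inbreeding and that
    parents appear before their offspring in the list.
    """
    n = len(pedigree)
    ainv = [[0.0] * n for _ in range(n)]
    for i, (father, mother) in enumerate(pedigree):
        known = [p for p in (father, mother) if p is not None]
        # Mendelian sampling variance W_ii with F = 0: 1 - 1/4 per known parent
        w = 1.0 - 0.25 * len(known)
        # Non-zero entries of row i of (I - 1/2 P): 1 for the animal, -1/2 per parent
        coeffs = [(i, 1.0)] + [(p, -0.5) for p in known]
        for r, cr in coeffs:
            for c, cc in coeffs:
                ainv[r][c] += cr * cc / w
    return ainv


# Hypothetical trio: animals 0 and 1 are founders, animal 2 is their offspring.
trio = [(None, None), (None, None), (0, 1)]
for row in a_inverse(trio):
    print(row)
# prints [1.5, 0.5, -1.0], [0.5, 1.5, -1.0], [-1.0, -1.0, 2.0]
```

Each animal contributes at most nine non-zero entries, which is why A^{-1} stays sparse even for very large pedigrees.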
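Genetic Groups
- Different means in founders (usually due to different origin) = sort of hierarchical centering for pedigree model

\[
\begin{aligned}
& \ldots \\
a \mid G &\sim N(Z_a Q a_0, G) \\
a_0 &\sim \text{const.} \\
& \ldots
\end{aligned}
\]

after some "massage"

\[
LHS = \begin{pmatrix}
\ldots & \ldots & \ldots & 0 \\
 & \ldots & \ldots & 0 \\
 & & Z_a^T R^{-1} Z_a + G_0^{-1} \otimes A^{-1}_{i,i} & G_0^{-1} \otimes A^{-1}_{i,g} \\
\text{sym.} & & & G_0^{-1} \otimes A^{-1}_{g,g}
\end{pmatrix}
\]

$i$ - individuals, $g$ - genetic groups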
Genetic Groups - Graphical Model View
- Unknown (phantom) parents are represented with (few!) genetic group(s) - “graphical parent(s)”
- Algorithm to set up $A^{-1}$ directly available!!!
- Hierarchical prior can be put on genetic groups for stability/shrinkage

[Figure: graphical model (plate $i = 1 : n_I$) with nodes $\sigma^2_a$, $a_{f(i)}$, $a_{m(i)}$, $a_i$, $W_{i,i}$, $a_{0g(i)}$; edge weights 1/2, 1/2]
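Multi-trait = multi-variate

\[
\begin{aligned}
y &= (y_1^T, y_2^T)^T, \quad X = \ldots \\
y \mid \ldots &\sim N(Xb + Z_c c + Z_{a_d} a_d + Z_{a_m} a_m, R), & R &= R_0 \otimes I, \quad
R_0 = \begin{pmatrix} \sigma^2_{e_1} & \sigma_{e_1,e_2} \\ \text{sym.} & \sigma^2_{e_2} \end{pmatrix} \\
c \mid C &\sim N(0, C), & C &= C_0 \otimes I, \quad
C_0 = \begin{pmatrix} \sigma^2_{c_1} & \sigma_{c_1,c_2} \\ \text{sym.} & \sigma^2_{c_2} \end{pmatrix} \\
a_d, a_m \mid G &\sim N(0, G), & G &= G_0 \otimes A
\end{aligned}
\]

\[
G_0 = \begin{pmatrix}
\sigma^2_{a_{d1}} & \sigma_{a_{d1},a_{d2}} & \sigma_{a_{d1},a_{m1}} & \sigma_{a_{d1},a_{m2}} \\
 & \sigma^2_{a_{d2}} & \sigma_{a_{d2},a_{m1}} & \sigma_{a_{d2},a_{m2}} \\
 & & \sigma^2_{a_{m1}} & \sigma_{a_{m1},a_{m2}} \\
\text{sym.} & & & \sigma^2_{a_{m2}}
\end{pmatrix}
\]

- there are now 16 variance components!!!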
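The count of 16 follows from the number of distinct entries in each symmetric matrix: 3 in R_0, 3 in C_0 and 10 in the 4 x 4 G_0. The sketch below checks that arithmetic and shows how a Kronecker product such as G = G_0 ⊗ A expands a small covariance matrix over the relationship matrix. The helper names and the numeric values of `g0` and `a` are hypothetical, chosen only to keep the example small.

```python
def n_cov_components(k):
    """Number of distinct (co)variances in a symmetric k x k matrix."""
    return k * (k + 1) // 2


def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


# Residual (2 traits) + common environment (2 traits) + genetic (2 traits x direct/maternal)
total = n_cov_components(2) + n_cov_components(2) + n_cov_components(4)
print(total)

# Hypothetical 2 x 2 genetic covariance matrix and relationship matrix of a trio
g0 = [[1.0, -0.5],
      [-0.5, 2.0]]
a = [[1.0, 0.0, 0.5],
     [0.0, 1.0, 0.5],
     [0.5, 0.5, 1.0]]
g = kron(g0, a)   # 6 x 6: one 3 x 3 block per element of g0
print(len(g), len(g[0]))
```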
Non-Gaussian Traits
- Categorical (health status, calving ease score, ...)
  - threshold model = (ordered) probit model, cumulative link model, ...
  - multinomial categories mostly treated separately as binary traits
- Counts (no. of offspring, ...)
  - Poisson, but rarely used - replacements: threshold and/or Gaussian model
- Time (longevity)
  - survival (Weibull & Cox) models
- Mixtures
  - Gaussian components
  - zero-inflated (no. of black spots in sheep skin -> wool, cure model - bivariate threshold model)
2. Categorical Trait Example (Calving ease score)
Calving Ease Score
- Of great economic importance!!!
- We cannot measure calving difficulty -> subjective score
  - 1 = no problem
  - 2 = easy
  - 3 = difficult
  - 4 = mechanical help or caesarean
- Reasons for difficult calving?
  - sex (male calves bigger)
  - number of calves - data usually omitted
  - parity (more problems with the 1st calving)
  - age (especially in the 1st parity; younger cows more problems)
  - season?
  - environment (= herd, herd-year)
  - ...
Calving Ease Score II
- Reasons for difficult calving - genetics?
  - morphology of calf
    - “direct” genetic effect or “sire/bull” effect
    - genes expressed in calf
    - “origin” of genes - father and mother of a calf
  - morphology of cow's pelvic area
    - “maternal” genetic effect
    - genes expressed in cow
    - “origin” of genes - father and mother of a cow
- Negative genetic correlation
  - larger animals (↑direct effect -> bad) have larger pelvic area (↓maternal effect -> good)
- Parity specific genetic effects - 1st vs. 2nd+
Threshold Model
(Wright, ..., Gianola & Foulley, Sorensen, ...)

\[
l \mid b, c, a_d, a_m, R \sim N(Xb + Z_c c + Z_{a_d} a_d + Z_{a_m} a_m, R)
\]

\[
\Pr(y_i = k \mid \mu_i, t) = \Pr(t_{k-1} < l_i < t_k \mid \mu_i, t)
= \Phi\left(\frac{t_k - \mu_i}{\sigma}\right) - \Phi\left(\frac{t_{k-1} - \mu_i}{\sigma}\right)
\]

...
- Model $\sigma$ as well to improve model fit? $\log(\sigma) = \ldots$
- Methods: approx. EM-REML, Laplace approx., McMC
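To make the category probabilities concrete, the sketch below evaluates the difference of two normal CDFs for every category, treating the outermost thresholds as minus and plus infinity. The function names, the threshold values and the two values of the linear predictor are hypothetical; only the formula itself comes from the slide.

```python
import math


def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def category_probs(mu, thresholds, sigma=1.0):
    """Pr(y = k) for k = 1..K given liability mean mu and K-1 ordered thresholds."""
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [phi((t - mu) / sigma) for t in cuts]
    return [cdf[k] - cdf[k - 1] for k in range(1, len(cuts))]


# Hypothetical thresholds for a four-category score such as calving ease
thresholds = [0.0, 1.5, 3.0]
easy_case = category_probs(0.0, thresholds)   # low liability
hard_case = category_probs(1.0, thresholds)   # liability shifted upwards
print([round(p, 3) for p in easy_case])
print([round(p, 3) for p in hard_case])
```

Raising the liability mean moves probability mass towards the higher (more difficult) categories, while increasing `sigma` spreads it out, which is the motivation for modelling log(sigma) as well.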
Approximative (Gaussian) Model - Example
(joint work with Marija Špehar - Croatia)
- Dataset: ~150k phenotypes, ~200k animals, 10 dataset samples
- Homogenization of variance by region and period of recording - scale problems?
- Bi-variate (1st & 2nd+ parity) maternal animal model with heterogeneous (by sex within parity class) residual variance
- 18 variance components - with VCE-6 program
  - herd-year interaction (3) -> better with autoregressive prior?: $\sigma^2_{h_{1}}$, $\sigma^2_{h_{2+}}$, $\sigma_{h_{1},h_{2+}}$
  - permanent effect of a cow (repeated records) (1): $\sigma^2_{c_{2+}}$
  - direct & maternal genetic effect (10): $\sigma^2_{a_{d1}}$, $\sigma^2_{a_{d2+}}$, $\sigma_{a_{d1},a_{d2+}}$, ..., $\sigma^2_{a_{m2+}}$
  - residual (4): $\sigma^2_{e_{m1}}$, $\sigma^2_{e_{f1}}$, $\sigma^2_{e_{m2+}}$, $\sigma^2_{e_{f2+}}$
Approximative (Gaussian) Model - Example
- Residual variances: $\sigma^2_{e_{m1}} = 0.295$, $\sigma^2_{e_{f1}} = 0.204$, $\sigma^2_{e_{m2+}} = 0.228$, $\sigma^2_{e_{f2+}} = 0.162$
- Ratios and correlations (1st vs. 2nd+)

           Herd-year  Direct  Maternal  Perm.
    1st    27.545     4.548   3.548     /
    2nd+   24.445     9.948   4.248     5.1
    Corr.  20.845     0.548   0.743     /

- Genetic correlation between direct and maternal effect

                    Direct, 1st  Direct, 2nd+
    Maternal, 1st   -0.490       -0.433
    Maternal, 2nd+  -0.377       -0.730
A Look at my Data - Structure
- Dimensions
  - #records (= #calves) ~150k
  - #cows ~74k
  - #bulls ~1k
  - #pedigree records (all generations + pruning)
    - animal pedigree ~230k (basic set are calves + ancestors)
    - sire-dam pedigree ~115k (basic set are mothers and fathers of calves + ancestors)
    - two more options: sire-maternal grandsire pedigree, sire pedigree
- Distribution of scores
  - no problem 50.3%
  - problem 49.7%
    - easy 43.5%
    - difficult 6.1%
    - mechanical help or caesarean 0.1%
A Look at my Data - Sex & Parity
- Sex
  - females 52%
  - females 47%
- Parity
  - 1st 59%
  - 2nd 46%
  - 3rd 45%
  - 4th 45%
  - 5th 45%
A Look at my Data - Age within Parity
[Figure: average score (0.0 to 1.0) and #records against age at calving (20 to 100); series: Score 1st (male), Score 2nd..., #Records]
A Look at my Data - Age within Parity & Sex
[Figure: average score (0.0 to 1.0) and #records against age at calving (20 to 100); series: Score 1st (male), Score 1st (female), Score 2nd..., #Records]
A Look at my Data - Season
[Figure: average score (0.0 to 1.0) and #records against season (0 to 100); series: Score, #Records]
Analysis of my Data in R - Available Tools
- Bernoulli/binomial model
  - glm() - package stats
  - glmer() - package lme4
    - Laplace and adaptive Gauss-Hermite approximation (for more effects)
  - inla()
- threshold model
  - polr() - package MASS
  - clm() - package ordinal
    - location (additive) and scale (multiplicative) model
  - clmm() - package ordinal
    - location (additive) and scale (multiplicative) model
    - Laplace and adaptive Gauss-Hermite approximation (for one effect)
3. Survival Analysis Example (Longevity = Length of Productive Life)
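Model and Data
- Weibull model

\[
\begin{aligned}
y \mid b^*, h, a, \rho &\sim \text{Weibull}(Xb^* + Z_h h + Z_a a, \rho) \\
h(y \mid b^*, h, a, \rho) &= \rho y^{\rho - 1} \exp(Xb^* + Z_h h + Z_a a) \\
b^* &= (\rho \ln \lambda, b^T)^T \\
b^* &\sim \text{const.} \\
h \mid \gamma &\sim \text{Log-Gamma}(\gamma, \gamma) \\
a \mid G &\sim N(0, G), \quad G = A\sigma^2_a
\end{aligned}
\]

- Data
  - ~110k cows from ~4k herds, ~40% censoring
  - sire-maternal grandsire pedigree with ~3k bulls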
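The hazard on this slide can be evaluated directly once the linear predictor is collapsed into a single number eta (including the intercept rho * ln(lambda)). The sketch below does that and also gives the matching survival function S(y) = exp(-y^rho * exp(eta)), which follows from integrating the hazard. The function names and all numeric values are hypothetical.

```python
import math


def weibull_hazard(t, eta, rho):
    """Hazard rho * t^(rho - 1) * exp(eta) at time t > 0."""
    return rho * t ** (rho - 1.0) * math.exp(eta)


def weibull_survival(t, eta, rho):
    """Survival exp(-H(t)) with cumulative hazard H(t) = t^rho * exp(eta)."""
    return math.exp(-(t ** rho) * math.exp(eta))


# Hypothetical values: shape rho, baseline linear predictor, and one effect b
rho = 1.5
eta_baseline = -12.0
b = 0.4            # e.g. an unfavourable class of some covariate
t = 1000.0         # days of productive life

relative_risk = (weibull_hazard(t, eta_baseline + b, rho)
                 / weibull_hazard(t, eta_baseline, rho))
print(relative_risk, math.exp(b))   # the ratio does not depend on t
print(weibull_survival(t, eta_baseline, rho))
```

Because the effects enter through exp(eta), the ratio of two hazards depends only on the difference in linear predictors, which is why results are usually reported as relative risks.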
Implementation
- Survival Kit program
- Log-Gamma prior “integrated out”
- Laplace approximation for Normal prior
Time Independent Effect - Age at 1st Calving
[Figure: no. of records (all records, uncensored records) and relative risk (1.0 to 1.8, with baseline) against age at first calving (month), 19 to 49]
Time Dependent Effect - Parity*Stage
[Figure: no. of records (all records, uncensored records) and hazard function (0.0000 to 0.0020) against length of productive life (day), 0 to 2000]
Thank you!
Postulated Model and Data II
- Breeding value for individual = f(parent average, phenotype deviation, progeny contribution)

[Figure: pedigree graph with effects b1, b2, breeding values a1 to a10 and phenotypes y21, y22, y3, y4, y5, y6, y10]