on the use of markov chains and perron frobenuis …on the use of markov chains and perron frobenuis...

21
On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H¨ ossjer Dept. of Mathematics Stockholm University June 2015

Upload: others

Post on 25-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

On the use of Markov chains and PerronFrobenuis Theorem in Population Genetics

Ola HossjerDept. of MathematicsStockholm University

June 2015

Page 2: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Motivation and Outline

I Conservation biology: Protect genetic diversity in order toI Prevent inbreedingI Keep species viableI Improve environmental adjustment of species

I Outline:I Populations with substructureI Formulas for long term rate of loss of genetic variants

Page 3: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Effective population size Ne

I Population of size N

I Ne : Size of ideal population with same rate of loss of geneticvariants as studied population (smaller Ne → faster rate)

I Short term protection rule: Ne ≥ 50

I Ne/N varies between species

Page 4: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Genetic drift for size N populationI Discrete time t = 0, 1, 2, . . ..

I Genes g = 1, . . . , 2N.

I Two variants white and black

I νtg nr. of offspring of g , time t.

I Freq. of black, time t + 1, is

Xt+1 = 12N |{g ; g is black at t + 1}|

= 12N

∑g ;g black at time t νtg .

I {Xt} Markov chain, state space

X = {0, 12N , . . . ,

2N−12N , 1}

= {0} ∪ {1} ∪ { 12N , . . . ,

2N−12N }

= X0 ∪ X1 ∪ X2.

Absorbing states: X0, X1

Transient states: X2

2N = 10,Xt = 0.4,

Xt+1 = 0.5,νt2 = νt5 = νt6 = 0,νt1 = νt3 = νt4 = νt9 = νt,10 = 1,νt8 = 2,νt7 = 3.

Page 5: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Ideal population: Wright-Fisher (WF) model1Children pick parental genes independently;

{νtg}2Ng=1 ∼ Mult

(2N;

1

2N, . . . ,

1

2N

),

so that the transition kernel of {Xt} is

P = (P(x , y))x,y∈X ,

P(x , y) = P(Xt+1 = y |Xt = x) =(

2N2Ny

)x2Ny (1− x)2N(1−y).

Feller (1951) showed that the eigenvalues of P are

λ1 = λ2 = 1, λj =(2N − 1)(2N − 2) · · · (2N − j + 2)

(2N)j−2, j = 3, . . . , 2N+1,

so that the largest non-unit eigenvalue is

λ = λ3 = 1− 1

2N. (1)

It gives the asymtotic rate of fixation of one variant;

limt→∞

P(Xt ∈ X2)

λt= C , (0 < C <∞).

1Fisher (1921), Wright (1931).

Page 6: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Cannings modelCannings (1974) showed more generally that

λ1 = λ2 = 1, λj = E

(j−1∏g=1

νtg

), j = 3, . . . , 2N + 1,

if {νtg}2Ng=1 are exchangeable, so that in particular,

λ = λ3 = E (νt1νt2) = 1 + Cov(νt1, νt2) = 1− p,

with coalescence probability

p = −Cov(νt1, νt2)

= E [νt1(νt1−1)]2N−1

= 2N · E[(νt1

2

)]/(

2N2

)= P (two offspring have the same parent) .

Eigenvalue effective size2 NeE = size of a WF population with fixationrate λ:

λ = 1− 1

2NeE=⇒ NeE =

1

2(1− λ)=

N − 12

E [νt1(νt1 − 1)].

2Crow (1954), Ewens (1982).

Page 7: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Gene diversities

Introduce (predicted) gene diversities at time t = 0, 1, 2, . . .

Ht = 2Xt(1− Xt),= P(two genes picked with repl. have diff. variants|Xt)

ht = E (Ht)= P(two genes picked with repl. have diff. variants).

It can be shown thatht+1 = λht .

Hence gene diversities

ht = λth0 = (1− p)th0, (2)

tend to zero at the same multiplicative rate as non-fixationprobabilities P(Xt ∈ X2).

But gene diversities and coalescence probabilities (p) are easier toanalyze theoretically!!

Page 8: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Structured populationDivide into s subpopulations i = 1, . . . , s. Let

2Ni = number of genes in subpop. i , (∑

i Ni = N)Xti = fraction of black variants in subpop. i at time t,νtkig = nr. of offspring of gene g of subpop. k at time

t that end up in subpop. i at time t + 1,exchangeable for g = 1, . . . , 2Nk .

This gives dynamics

Xt+1,i =1

2Ni

s∑k=1

∑g ;g∈ subpop. k at time t,

g is black

νtkig .

Find, under suitable conditions,

NeE =1

2(1− λ),

where λ is rate of fixation, so that for some 0 < C <∞,

limt→∞

P(non-fixation at time t)

λt= C .

Page 9: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Structured population, contd.Let

Xt = (Xt1, . . . ,Xts).

If reproduction is time invariant, {Xt} is Markov chain with state space

X = {0, 12N1

, . . . , 2N1−12N1

, 1} × . . .× {0, 12Ns

, . . . , 2Ns−12Ns

, 1}= X0 ∪ X1 ∪ X2.

where

X0 = {(0, . . . , 0)},X1 = {(1, . . . , 1)},X2 = X \ (X0 ∪ X1),

and

P = (P(x, y))x,y∈X ,P(x, y) = P(Xt+1 = y|Xt = x),

λ = 3rd largest eigenvalue of P.

[c]

Page 10: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Gene diversities for structured populationIntroduce (predicted) gene diversities3 at time t = 0, 1, 2, . . .

Htij = Xti (1− Xtj) + (1− Xti )Xtj ,= P(two genes picked with repl. from i and j have diff. variants|Xt)

htij = E (Htij)= P(two genes picked with repl. from i and j have diff. variants)

between all pairs of subpopulations i and j . The column vector

ht = vec((htij)

si,j=1

)with s2 predicted gene diversities satisfies

ht+1 = Aht =⇒ ht = Ath0, (3)

whereA = (Aij,kl)ij,kl∈{1,...,s}×{1,...,s}

is a square matrix of order s2, and4

λ = λmax(A). (4)

3Maruyama (1970), Felsenstein (1972), Nei (1973).4By Perron-Frobenius Theorem applied to P.

Page 11: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Backward migration and coalescence theory to find A

If children pick parental subpopulations independently, with

Bik = probability by which genes in i pickparental subpopulations from k ∈ {1, . . . , s},

pijk = coalescence probability within k= P(two genes from i , j with parents in k , have same parent)

= Nk

2NiNjBikBjk

(E(νtki1(νtki1−1))

1− 12Ni

){i=j}

E (νtki1νtkj1){i 6=j}.

Then (3) holds, with

Aij,kl =

(1− 1

2Ni

){i=j}(

1− pijk

1− 12Nk

){k=l}

BikBjl . (5)

Page 12: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Motivation of (3) and (5)

We have that

ht+1,ij = P(genes from i and j at time t + 1 have different variants)= P(different genes picked from i and j at time t + 1)∑

k,l ·P(gene from i has parent from k at time t)

·P(gene from j has parent from l at time t)·P(different parents from k and l)·P(different variants of the different parents at time t)

= (1− 12Ni

){i=j}∑k,l BikBjl(1− pijk){k=l} · ht,kl/(1− 1

2Nk){k=l}

=∑

k,l Aij,klht,kl .

with Aij ,kl as in (5). In vector form this writes

ht+1 = Atht .

Page 13: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Computation of NeE

Subpop 1Subpop 2

Subpop 3

Subpop 4

Subpop 5

N4=400

N1=200

N3=50

N5=100

N2=400

5

2

10

3

4

252

5

(Ni )si=1 = (200, 400, 50, 400, 100),

N =∑s

i=1 Ni = 1150,

(Bik)si,k=1 =

0.94 0.05 0 0.01 00.0125 0.9825 0.005 0 00 0.1 0.82 0.08 00.005 0 0.0075 0.9875 00 0 0 0.05 0.95

,

Repr. = {(νtkig )2Nkg=1}

sk=1

∼ Mult(2Ni ;

Bi12N1, . . . , Bi1

2N1, . . . , Bis

2Ns, . . . , Bis

2Ns

)independently for i = 1, . . . , s,

pijk = 1/(2Nk),

NeE = 970,

Page 14: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Gene diversity effective size NeGLet

Wij = P(choose gene pair from i and j),

and collect them into row vector of length s2:

W = vec ((Wij)1≤i,j≤s)′ .

Predicted gene diversity for two randomly sampled genes at time t, is

ht = P(the two genes have different variants)=

∑1≤i,j≤s Wijhtij

= Wht

= WAth0.

It follows from (1) and (2), that for Wright-Fisher model

ht =

(1− 1

2N

)t

· h0. (6)

Gene diversity effective size over time interval [0, t] solves (6), i.e.

NeG ([0, t]) =1

2

[1−

(hth0

)1/t] t→∞→ 1

2(1− λ)= NeE .

Page 15: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Local/global NeG and NeE

Constant subpop. sizes

0 20 40 60 80 100

020

040

0600

800

100

0120

0

Time

Eff

ecti

veSiz

e

Subpop 1Subpop 2Subpop 3Subpop 4Subpop 5Subpop 1,2,3,4,5Asymptotic Limit

Solid - with replacementDashed - without replacement

Local bottleneck in 1

0 20 40 60 80 100

020

0400

600

800

100

0

Time

Eff

ecti

veSiz

eSubpop 1Subpop 2Subpop 3Subpop 4Subpop 5Subpop 1,2,3,4,5Asymptotic Limit

Solid - with replacementDashed - without replacement

Blocked migration 1-2

0 20 40 60 80 100

020

040

0600

800

100

0120

0

Time

Eff

ecti

veSiz

e

Subpop 1Subpop 2Subpop 3Subpop 4Subpop 5Subpop 1,2,3,4,5Asymptotic Limit

Solid - with replacementDashed - without replacement

Horizontal: NeE

Solid: t → NeG ([0, t]) for whole population and subpopulationsGlobal weights: Wij = 1/s2

Local weights, subpopulation k : Wij = 1{(i ,j)=(k,k)}

Page 16: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Proof of (4)

We have that

ht = Wht = WAth0 = Cλmax(A)t + o(λmax(A)t

)(7)

as t →∞. But also

ht =∑

ij Wijhtij=

∑ij WijE [Xti (1− Xtj) + Xtj(1− Xti )]

= E [φ(Xt)]= E [E (φ(Xt)|X0)]

=∑

x,y π(x)P(t)(x, y)φ(y),

(8)

where

φ(x) =∑

i ,j Wij [xi (1− xj) + xj(1− xi )] ,

π(x) = P(X0 = x),

Pt =(P(t)(x, y); x, y ∈ X

).

Page 17: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Perron-FrobeniusBlock decompose transition matrix as

P =

1 0 00 1 0

P20 P21 P22

=⇒ Pt =

1 0 00 1 0

P(t)20 P

(t)21 Pt

22

,

Since P22 is non-negative, irreducible and aperiodic, it has uniquelargest eigenvalue λ = λ3, and therefore

P(t)(x, y) = λtr(x)l(y) + o(λt), x, y ∈ X2, (9)

wherel = (l(x); x ∈ X2)r = (r(x); x ∈ X2)′

are left and right eigenvectors of P2 with eigenvalue λ andcomponents

l(x) > 0,r(x) > 0,

(10)

for all x ∈ X2.

Page 18: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Proof of (4), contd.

Use (8), (9), (10) and the fact that

φ(x) = 0, x ∈ X0 ∪ X1,φ(x) > 0, x ∈ X2 (if all Wij > 0),π(x) > 0, for some x ∈ X2,

to conclude

ht = λt∑

x,y∈X2π(x)r(x)l(y)φ(y) + o(λt)

= λt∑

x∈X2π(x)r(x) ·

∑y∈X2

l(y)φ(y) + o(λt)

= Cλt + o(λt),

with C > 0. Combining this with (7), we finally deduce

λ = λmax(A).

Page 19: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

Other topicsI Various types of structure (geographic, age, sex, combinations, . . . ).

I Computer program GESP (Olsson et al, 2015).

I Large population asymptotics:

NeE =N

C1+ o(N) = NeC + o(N) as N →∞,

where NeC is coalescence effective size5 and C1 a coalescence rate.

I Small migration asymptotics:

NeE =N

C2B+ o(B−1) as B → 0,

where B = long term rate of subpopulation change in ancestral line.

I Leading right eigenvector of A to assess subpopulationdifferentiation6

5Nordborg and Krone (2002), Sjodin et al. (2005).6FST of Wright (1943), GST of Nei (1973).

Page 20: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

References

Hossjer, O. (2011). Coalescence theory for a general class of structured populations with fast migration. Advancesin Probability Theory 43(4), 1027-1047.

Hossjer, O., Jorde, P.E. and Ryman, N. (2013). Quasi equilibrium approximations of the fixation index underneutrality: The island model. Theoretical Population Biology 84, 9-24.

Ryman, N., Allendorf, F.W., Jorde P.E., Laikre, L. and Hossjer, O. (2013). Samples from subdivided populationsyield biased estimates of effective size that overestimate the rate of loss of genetic variation. Molecular EcologyResources 14, 87-99.

Olsson, F. Hossjer, O., Laikre, L. and Ryman, N. (2013). Characteristics of the variance effective population sizeover time using an age structured model with variable size. Theoretical Population Biology 90, 91-103.

Hossjer, O. (2014). Spatial autocorrelation for subdivided populations with invariant migration schemes.Methodology and Computing in Applied Probability 16(4), 777-810. DOI 10.1007/s11009-013-9321-3.

Hossjer, O. and Ryman, N. (2014). Quasi equilibrium, variance effective population size and fixation index formodels with spatial structure. Journal of Mathematical Biology 69(5), 1057-1128. DOI 10.1007/s00285-013-0728-9.

Hossjer, O., Olsson, F., Laikre, L. and Ryman, N. (2014). A new general analytical approach for modeling patternsof genetic differentiation and effective size of subdivided populations over time. Mathematical Biosciences 258,113-133. DOI: 10.1016/j.mbs.2014.10.001.

Hossjer, O., Olsson, F., Laikre, L. and Ryman, N. (2015). Metapopulation Inbreeding Dynamics, Effective Size andSubpopulation Differentiation - a General Analytical Approach for Diploid Organisms. Theoretical PopulationBiology 102, 40-59. DOI:10.1016/j.tpb.2015.03.006.

Hossjer, O. (2015). On the eigenvalue effective size in structured populations. To appear in Journal ofMathematical Biology. Available online, DOI 10.1007/s00285-014-0832-5.

Page 21: On the use of Markov chains and Perron Frobenuis …On the use of Markov chains and Perron Frobenuis Theorem in Population Genetics Ola H ossjer Dept. of Mathematics Stockholm University

THANKS!