
Post on 26-Dec-2015


The Biological ESTEEM Project: Linear Algebra, Population Genetics, and Microsoft Excel

[Figure: plot against Generation (0 to 100), vertical axis 0 to 1]

$p' = p\,(p W_{AA} + q W_{AS}) / \bar{W}$

Anton E. Weisstein, Truman State University

BIO 2010: Transforming Undergraduate Education for Future Research Biologists

National Research Council (2003)

Recommendation #1: “Those selecting the new approaches should consider the importance of mathematics...”

Recommendation #2: “Concepts, examples, and techniques from mathematics…should be included in biology courses. …Faculty in biology, mathematics, and physical sciences must work collaboratively to find ways of integrating mathematics…into life science courses…”


Specific strategies:

• A strong interdisciplinary curriculum that includes physical science, information technology, and math.

• Meaningful laboratory experiences.

Biological Topics

• Spread of infectious diseases
• Tree growth
• Enzyme kinetics
• Population genetics

Mathematical Topics

• Random walks
• Optimization
• Linear algebra
• Graph theory

[Figure: p (0 to 1) versus Generation (0 to 100)]

Unpacking “ESTEEM”

• Excel: ubiquitous, easy, flexible, non-intimidating

• Exploratory: apply to real-world data; extend & improve

• Experiential: students engage directly with the math

Three Boxes

Black box: Hide the model

? $y = ax^b$

Glass box: Study the model

$y = ax^b$

No box: Build the model!

How do students interact with the mathematical model underlying the biology?

Copyleft

Users may freely download, use, modify, and share the software, with proper attribution.

More info available at Free Software Foundation website

Synthesizing and Applying Math Concepts Using Biological Cases

1. Intro to Population Genetics: Hardy-Weinberg Equilibrium and the Binomial Theorem

2. Evolutionary Analysis: Microevolution, Statistics, and Stability Analysis

3. Survival of the Slightly Better: Exploring an Evolutionary Paradox with Linear Algebra

$\hat{p} = \dfrac{W_{SS} - W_{AS}}{W_{AA} - 2W_{AS} + W_{SS}}$

Definitions

Allele: One variant of a specific gene.

Genotype: The set of alleles carried by an individual.

Phenotype: The detectable manifestations of a specific genotype.

Example: ABO blood type

Alleles: $I^A$, $I^B$, $i$

Genotype     Phenotype
$I^A I^A$    Type A
$I^A i$      Type A
$I^B I^B$    Type B
$I^B i$      Type B
$I^A I^B$    Type AB
$ii$         Type O

Life Cycle

• Gametes (eggs & sperm)
• Zygotes (fertilized eggs)
• Juveniles (reproductively immature)
• Adults (reproductively mature)


Recursion Equations

Let x = # AA adults; y = # Aa adults; z = # aa adults.

Define p = # A gametes = x + y/2; q = # a gametes = y/2 + z.

Determine expected # adults of each genotype in next generation.

(For now, feel free to make any simplifying assumptions.)

Hardy-Weinberg Equilibrium

Genotypes reach ratios $p^2 : 2pq : q^2$ in one generation, then stay there forever!

Assumptions?

• Gametes combine at random
• All individuals have equal chance of survival
• Each generation is a perfectly representative sample of the previous one
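The one-generation claim can be checked numerically. The following is a minimal sketch, assuming the three assumptions listed above hold and working with genotype proportions rather than raw counts; the function name `next_generation` and the starting proportions are illustrative choices, not part of the original slides.

```python
def next_generation(x, y, z):
    """Expected genotype proportions (AA, Aa, aa) after one round of random mating.

    x, y, z are the current amounts of AA, Aa and aa adults (counts or proportions).
    """
    total = x + y + z
    p = (x + y / 2) / total   # frequency of A among gametes
    q = (y / 2 + z) / total   # frequency of a among gametes
    # Random union of gametes gives the binomial expansion (p + q)^2.
    return p * p, 2 * p * q, q * q


# Hypothetical starting population that is NOT in Hardy-Weinberg proportions.
gen0 = (0.5, 0.2, 0.3)
gen1 = next_generation(*gen0)
gen2 = next_generation(*gen1)

print([round(v, 3) for v in gen1])
print([round(v, 3) for v in gen2])
```

With p = 0.6 and q = 0.4, both printed generations come out close to 0.36 : 0.48 : 0.16, so the proportions settle after a single round of random mating and do not move afterwards.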


The Case of the Sickled Cell

• The S allele for sickle-cell anemia has a frequency of ~11% in some African populations.

• Why is it so common?

• If it provides a selective advantage, why isn’t its frequency 100%?

Definitions

Reproductive fitness: The average number of offspring produced by an organism in a specific environment.

Examples:
• Antibiotic resistance
• Camouflage
• Resistance to infectious diseases

Natural selection: An evolutionary mechanism that tends to increase the frequency of traits that increase an organism’s fitness.

Source: Jeffrey Jeffords, DiveGallery.com

Selection and Sickle-Cell

Alleles:
A: “normal” hemoglobin
S: sickle-cell hemoglobin

Genotype   Fitness           Natural selection
AA         $W_{AA} = 0.9$    Malaria susceptibility: ~90% survive to reproductive age
AS         $W_{AS} = 1.0$
SS         $W_{SS} = 0.2$    Sickle-cell anemia: ~20% survive to reproductive age

Recursion Equations

p = # A gametes; q = # S gametes.

Life stage   AA ($W = 0.9$)          AS ($W = 1.0$)          SS ($W = 0.2$)
Zygote       $p^2$                   $2pq$                   $q^2$
Juvenile     $p^2$                   $2pq$                   $q^2$
Adult        $p^2 W_{AA}/\bar{W}$    $2pq W_{AS}/\bar{W}$    $q^2 W_{SS}/\bar{W}$

Normalization: $\bar{W} = p^2 W_{AA} + 2pq W_{AS} + q^2 W_{SS}$

$p' = p\,(p W_{AA} + q W_{AS}) / \bar{W}$
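To see what the recursion does over many generations, it can be iterated directly. This is a minimal sketch using the fitness values from the slides; the helper names `next_p` and `trajectory`, the starting frequency 0.5 and the 200-generation horizon are illustrative assumptions.

```python
# Genotype fitnesses from the sickle-cell example.
W_AA, W_AS, W_SS = 0.9, 1.0, 0.2


def next_p(p):
    """One generation of selection: p' = p (p W_AA + q W_AS) / W_bar."""
    q = 1.0 - p
    w_bar = p * p * W_AA + 2 * p * q * W_AS + q * q * W_SS  # mean fitness
    return p * (p * W_AA + q * W_AS) / w_bar


def trajectory(p0, generations):
    """Return [p0, p1, ..., p_generations] under repeated selection."""
    ps = [p0]
    for _ in range(generations):
        ps.append(next_p(ps[-1]))
    return ps


traj = trajectory(0.5, 200)
print(round(traj[1], 3), round(traj[-1], 3))
```

Starting from p = 0.5 the frequency of A climbs toward roughly 0.889 and stays there, which leaves the S allele at about 11%.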

Selection and Sickle-Cell

Genotype   Fitness
AA         $W_{AA} = 0.9$
AS         $W_{AS} = 1.0$
SS         $W_{SS} = 0.2$

Biological Question:

• How will this population evolve over time?

$p' = p\,(p W_{AA} + q W_{AS}) / \bar{W}$

Mathematical Question:

• What are the equilibria for this recursion equation?

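Solving for Equilibria

Set $p' = p$ and solve:

$$p = \frac{p\,(p W_{AA} + q W_{AS})}{\bar{W}}$$

so either $p = 0$ or

$$p W_{AA} + q W_{AS} = \bar{W} = p^2 W_{AA} + 2pq W_{AS} + q^2 W_{SS}$$

Substitute $q = 1 - p$ and factor:

$$0 = (1 - p)\,[\,W_{SS} - W_{AS} - p\,(W_{AA} - 2W_{AS} + W_{SS})\,]$$

Equilibria: $\hat{p} = 0$ or $\hat{p} = 1$ or

$$\hat{p} = \frac{W_{SS} - W_{AS}}{W_{AA} - 2W_{AS} + W_{SS}}$$

Nontrivial solution:

$$\hat{p} = \frac{0.2 - 1.0}{0.9 - 2 \cdot 1.0 + 0.2} = \frac{-0.8}{-0.9} = 0.889$$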
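The three equilibria can be confirmed by substituting them back into the recursion. This is a minimal sketch; the function names `equilibria` and `next_p` are illustrative, and the closed form assumes the denominator W_AA − 2W_AS + W_SS is nonzero.

```python
def next_p(p, W_AA, W_AS, W_SS):
    """One generation of selection on the frequency p of allele A."""
    q = 1.0 - p
    w_bar = p * p * W_AA + 2 * p * q * W_AS + q * q * W_SS
    return p * (p * W_AA + q * W_AS) / w_bar


def equilibria(W_AA, W_AS, W_SS):
    """Return the two trivial equilibria and the nontrivial one from the factorization."""
    interior = (W_SS - W_AS) / (W_AA - 2 * W_AS + W_SS)
    return [0.0, interior, 1.0]


fitnesses = (0.9, 1.0, 0.2)  # W_AA, W_AS, W_SS from the slides
for p_hat in equilibria(*fitnesses):
    # At an equilibrium, one more generation leaves p unchanged.
    print(round(p_hat, 3), round(next_p(p_hat, *fitnesses) - p_hat, 12))
```

Each line prints an equilibrium followed by the change in p after one more generation, which should be zero up to rounding error.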

Stability Analysis: NatSelDiffEqns (Tim Comar, Benedictine College)

Is q = 0.11 stable or unstable?
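One way to approach this question outside the NatSelDiffEqns workbook is the standard linearization test for a one-dimensional recursion p' = f(p): an equilibrium is locally stable when |f'(p̂)| < 1. The sketch below estimates the slope with a central difference; it is an illustration of that general test under the slides' fitness values, not a description of how the workbook itself works, and the names `next_p` and `slope` are assumptions.

```python
W_AA, W_AS, W_SS = 0.9, 1.0, 0.2


def next_p(p):
    """The selection recursion p' = f(p) for the sickle-cell example."""
    q = 1.0 - p
    w_bar = p * p * W_AA + 2 * p * q * W_AS + q * q * W_SS
    return p * (p * W_AA + q * W_AS) / w_bar


def slope(f, p, h=1e-6):
    """Central-difference estimate of f'(p)."""
    return (f(p + h) - f(p - h)) / (2 * h)


p_hat = (W_SS - W_AS) / (W_AA - 2 * W_AS + W_SS)  # about 0.889, i.e. q about 0.11

for name, p in [("p = 0", 0.0), ("p = p_hat", p_hat), ("p = 1", 1.0)]:
    s = slope(next_p, p)
    verdict = "stable" if abs(s) < 1 else "unstable"
    print(name, round(s, 3), verdict)
```

Under these values the slope at the interior equilibrium is about 0.90, so small perturbations shrink each generation, while the slopes at p = 0 and p = 1 exceed 1 and both fixation states are unstable.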

The Case of the Protective Protein

• HIV docks with the CCR5 surface protein present on some cells of the immune system

• The CCR5-Δ32 allele partially protects against HIV infection

Peterson 1999. JYI 2: ?

The Case of the Protective Protein

• Based on genetic evidence, Δ32 arose ~700 years ago.

• Present in ~10% of Caucasians; largely absent in other groups. Why?

Hypothesis: May also have protected vs. plague and/or smallpox.

Biological Question: How much selective advantage must Δ32 have given to become so common in only 700 years?

Mathematical Question: For what fitness values does 700 years lie within the 95% CI of Δ32’s age?

Definitions

Genetic drift: An evolutionary mechanism by which allele frequencies change due to chance alone, independent of those alleles’ effects on fitness.

Examples:

• Absence of blood type B in Native Americans

• Northern elephant seal: virtually no genetic variation 100 years after near-extinction

Modeling Genetic Drift

Let N = population size (constant).

Assume this population produces ∞ gametes: f(A) = p, f(B) = q.

But only 2N of those gametes (chosen at random) combine to form the zygotes that develop into the next generation!

$$p' = \frac{1}{2N}\,\mathrm{B}(2N, p) \approx \mathcal{N}\!\left(p, \frac{pq}{2N}\right)$$

Genetic Drift as a Random Walk

$$p' = \frac{1}{2N}\,\mathrm{B}(2N, p) \approx \mathcal{N}\!\left(p, \frac{pq}{2N}\right)$$

[Figure: three panels of p (0 to 1) versus Generation (0 to 100), for N = 2000, N = 200, and N = 20]

• Largest fluctuations in small pops.

• p = 0 and p = 1 are absorbing states
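The binomial sampling model can be simulated in a few lines. This is a minimal sketch, assuming a constant population size N and exact binomial sampling of 2N gametes each generation (no normal approximation); the function name `drift_trajectory`, the seed and the starting frequency 0.5 are illustrative choices.

```python
import random


def drift_trajectory(N, p0, generations, seed=None):
    """Simulate allele frequency p under pure drift in a population of size N."""
    rng = random.Random(seed)
    p = p0
    ps = [p]
    for _ in range(generations):
        # Draw 2N gametes; each carries allele A with probability p.
        count_A = sum(1 for _ in range(2 * N) if rng.random() < p)
        p = count_A / (2 * N)
        ps.append(p)
    return ps


for N in (2000, 200, 20):
    traj = drift_trajectory(N, 0.5, 100, seed=1)
    # The spread of p over 100 generations is typically widest for the smallest N.
    print(N, round(min(traj), 3), round(max(traj), 3))
```

Because `rng.random()` returns values in [0, 1), a population that reaches p = 0 or p = 1 can never leave it, which is the absorbing-state behavior noted above.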

Modeling Microevolution: Deme


Sickle Cell Strikes Back!

• In addition to the A and S alleles, there is also a C allele for hemoglobin!

• C confers even stronger malaria resistance than AS but with no anemia!

• But C is found only in a few isolated populations. Why might this happen?

Extend previous analysis to 3 alleles: some surprising results!

Selection and Sickle-Cell

Hemoglobin alleles: A, S, C

Genotype   Fitness   Effect
AA         0.9       Malaria susceptibility
AS         1.0
AC         0.9       Malaria susceptibility
SS         0.2       Sickle-cell anemia
SC         0.7       Mild anemia
CC         1.3       Strong malaria resistance

C is beneficial only when common!

Selection and Sickle-Cell

Recursion Equations:

$p' = p\,(p W_{AA} + q W_{AS} + r W_{AC}) / \bar{W}$

$q' = q\,(p W_{AS} + q W_{SS} + r W_{SC}) / \bar{W}$

$r' = r\,(p W_{AC} + q W_{SC} + r W_{CC}) / \bar{W}$

Equilibria: $p = D_A / D$, $q = D_S / D$, $r = D_C / D$

where

$D_A = (W_{AS} - W_{SS})(W_{AC} - W_{CC}) - (W_{AS} - W_{SC})(W_{AC} - W_{SC})$

$D_S = (W_{AS} - W_{AA})(W_{SC} - W_{CC}) - (W_{AS} - W_{AC})(W_{SC} - W_{AC})$

$D_C = (W_{AC} - W_{AA})(W_{SC} - W_{SS}) - (W_{AC} - W_{AS})(W_{SC} - W_{AS})$

$D = D_A + D_S + D_C$
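The three-allele recursion and its interior equilibrium can be explored numerically. This is a minimal sketch using the fitness table from the slides, with alleles ordered A, S, C; the function names, the two starting frequency vectors and the 500-generation horizon are illustrative assumptions.

```python
# Symmetric fitness matrix, rows/columns ordered A, S, C (values from the slides).
W = [[0.9, 1.0, 0.9],
     [1.0, 0.2, 0.7],
     [0.9, 0.7, 1.3]]


def step(freqs):
    """One generation of selection on the allele frequencies (p, q, r)."""
    marginal = [sum(W[i][j] * freqs[j] for j in range(3)) for i in range(3)]
    w_bar = sum(f * m for f, m in zip(freqs, marginal))  # mean fitness
    return [f * m / w_bar for f, m in zip(freqs, marginal)]


def iterate(freqs, generations):
    """Apply the recursion repeatedly and return the final frequencies."""
    for _ in range(generations):
        freqs = step(freqs)
    return freqs


def interior_equilibrium():
    """Equilibrium p = D_A/D, q = D_S/D, r = D_C/D from the formulas above."""
    (AA, AS, AC), (_, SS, SC), (_, _, CC) = W
    D_A = (AS - SS) * (AC - CC) - (AS - SC) * (AC - SC)
    D_S = (AS - AA) * (SC - CC) - (AS - AC) * (SC - AC)
    D_C = (AC - AA) * (SC - SS) - (AC - AS) * (SC - AS)
    D = D_A + D_S + D_C
    return [D_A / D, D_S / D, D_C / D]


print([round(v, 3) for v in interior_equilibrium()])
print([round(v, 3) for v in iterate([0.88, 0.11, 0.01], 500)])  # C starts rare
print([round(v, 3) for v in iterate([0.45, 0.05, 0.50], 500)])  # C starts common
```

With these values the interior equilibrium sits near (0.844, 0.089, 0.067). A rare C allele is driven out and the population returns to the two-allele equilibrium, while a common C allele goes on to fixation, which is one way to read "C is beneficial only when common".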

Plotting the Adaptive Landscape

2 alleles: Landscape $\bar{W}(p)$ is a curve in $\mathbb{R}^2$

3 alleles: Landscape $\bar{W}(p, q, r)$ is a sheet in $\mathbb{R}^3$

Constraint: $p + q + r = 1$
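Before plotting, the landscape can be tabulated on a grid of points that satisfy the constraint. This is a minimal sketch using the three-allele fitness values from the slides; the grid resolution of 100 steps and the names `mean_fitness`, `grid`, `best` and `edge_best` are illustrative assumptions.

```python
# Fitness values from the three-allele sickle-cell example.
W_AA, W_AS, W_AC = 0.9, 1.0, 0.9
W_SS, W_SC, W_CC = 0.2, 0.7, 1.3


def mean_fitness(p, q, r):
    """Mean fitness W_bar(p, q, r) for allele frequencies of A, S and C."""
    return (p * p * W_AA + q * q * W_SS + r * r * W_CC
            + 2 * p * q * W_AS + 2 * p * r * W_AC + 2 * q * r * W_SC)


steps = 100
# Every (p, q, r) on the grid satisfies p + q + r = 1 by construction.
grid = [(i / steps, j / steps, (steps - i - j) / steps)
        for i in range(steps + 1)
        for j in range(steps + 1 - i)]

best = max(grid, key=lambda f: mean_fitness(*f))
edge_best = max((f for f in grid if f[2] == 0), key=lambda f: mean_fitness(*f))

print(best, round(mean_fitness(*best), 3))
print(edge_best, round(mean_fitness(*edge_best), 3))
```

On this grid the highest point of the sheet is the corner where only C is present, and the highest point along the edge with no C lies next to the two-allele equilibrium at p of about 0.889.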

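Stability Analysis

1. Re-express $\bar{W}(p, q, r)$ as $\bar{W}(x, y)$

2. Calculate Hessian matrix:

$$\begin{bmatrix} \dfrac{\partial^2 \bar{W}}{\partial x^2} & \dfrac{\partial^2 \bar{W}}{\partial x\,\partial y} \\[2ex] \dfrac{\partial^2 \bar{W}}{\partial y\,\partial x} & \dfrac{\partial^2 \bar{W}}{\partial y^2} \end{bmatrix} = \begin{bmatrix} T & U \\ U & V \end{bmatrix}$$

where

$T = 2\,(W_{AA} - 2W_{AS} + W_{SS})$,

$U = \frac{2\sqrt{3}}{3}\,(W_{AA} - W_{SS} - 2W_{AC} + 2W_{SC})$,

$V = \frac{2}{3}\,(W_{AA} + W_{SS} + 4W_{CC} + 2W_{AS} - 4W_{AC} - 4W_{SC})$.

3. Take the determinant and apply the 2nd derivative test:

$TV > U^2$, $T < 0$, $T + V < 0$: Local max

$TV > U^2$, $T > 0$, $T + V > 0$: Local min

$TV < U^2$: Saddle point

$TV = U^2$: Higher-order tests needed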
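The second-derivative test can be carried out numerically for the sickle-cell fitness values. The sketch below is an illustration under one assumption the slides do not state: the triangle coordinates are taken to be x = q + r/2 and y = (√3/2) r, and the entries T, U, V in the code are the ones obtained by differentiating the mean fitness in those coordinates. The classification uses the standard rule that a negative definite Hessian marks a local maximum. All function names are illustrative.

```python
import math

# Fitness values from the three-allele example.
W_AA, W_AS, W_AC = 0.9, 1.0, 0.9
W_SS, W_SC, W_CC = 0.2, 0.7, 1.3


def hessian_entries():
    """T, U, V for the assumed coordinates x = q + r/2, y = (sqrt(3)/2) r."""
    T = 2 * (W_AA - 2 * W_AS + W_SS)
    U = (2 * math.sqrt(3) / 3) * (W_AA - W_SS - 2 * W_AC + 2 * W_SC)
    V = (2 / 3) * (W_AA + W_SS + 4 * W_CC + 2 * W_AS - 4 * W_AC - 4 * W_SC)
    return T, U, V


def classify(T, U, V):
    """Second-derivative test on the 2x2 Hessian [[T, U], [U, V]]."""
    det = T * V - U * U
    if det < 0:
        return "saddle point"
    if det == 0:
        return "higher-order tests needed"
    return "local max" if T < 0 else "local min"


def mean_fitness_xy(x, y):
    """Mean fitness written in the assumed triangle coordinates."""
    r = 2 * y / math.sqrt(3)
    q = x - r / 2
    p = 1 - q - r
    return (p * p * W_AA + q * q * W_SS + r * r * W_CC
            + 2 * p * q * W_AS + 2 * p * r * W_AC + 2 * q * r * W_SC)


def numeric_hessian(x, y, h=1e-3):
    """Finite-difference check of T, U, V (near exact, since W_bar is quadratic)."""
    f = mean_fitness_xy
    T = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    V = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    U = (f(x + h, y + h) - f(x + h, y - h)
         - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return T, U, V


T, U, V = hessian_entries()
print(round(T, 4), round(U, 4), round(V, 4), classify(T, U, V))
print([round(v, 4) for v in numeric_hessian(0.3, 0.2)])
```

For these fitness values T is negative and V is positive, so TV < U² and the interior equilibrium is classified as a saddle point. Because the mean fitness is quadratic, the Hessian is the same at every point of the triangle.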

Survival of the Slightly Better: DeFinetti

• Global maximum: only C allele present

• Local maximum: C allele eliminated

• Saddle point

Cases & Mathematics: Explicit Connections

• Binomial & Normal Distributions
• Combinatorics
• Equilibria & Stability Analysis
• Normalization
• Recursion & Difference Eqns.
• Stochasticity
• Geometry of Curves & Solids
• Matrix & Linear Algebra
• Partial Derivatives