chap. 6. molecular phylogeny. charles darwin, 1859 natural selection evolution change in frequency...

Chap. 6. Molecular Phylogeny

Upload: fay-melton

Post on 18-Dec-2015


Page 1: Chap. 6. Molecular Phylogeny. Charles Darwin, 1859 Natural selection Evolution Change in frequency of genes in a population Heritable changes in a population

Chap. 6. Molecular Phylogeny


Charles Darwin, 1859 Natural selection

Evolution Change in frequency of genes in a population

Heritable changes in a population over many generations

Process of mutation with selection

Two essential factors that define evolution

Error-prone self-replication Variation in success at self-replication

Evolution


Self-replication Whatever is evolving must have the ability to make copies of itself

Typical developments, aging etc., are not evolution

Genes can self-replicate in the context of cells that they reside in

A “replicator” is anything that can self-replicate

Asexual organisms like bacteria can self-replicate

Sexual organisms reproduce, but offspring inherit genes from both parents and are not copies of either

Dawkins focused on genes rather than organisms as the fundamental replicators

Error-prone Self-replication


Error-prone Copies are not always identical to the originals

Perfect copies will not foster evolution

In fact, current genes arose by gradual change from previous versions through the accumulation of slight errors

Errors are essential for evolution, provided they occur not too frequently

Error-prone Self-replication

Cell Replication

Replication: one double-stranded DNA molecule yields two identical double-stranded DNA molecules

Each daughter DNA molecule keeps one strand of the mother DNA (semi-conservative replication)


Replication step 1 Separate the two DNA strands

At origin of replication

Replication step 2

Synthesize new DNA on both template strands at the same time

DNA polymerase extends new chains only in the 5’ to 3’ direction

The original 3’-5’ strand (template of the leading strand) is copied continuously

The original 5’-3’ strand (template of the lagging strand) is copied discontinuously, in pieces of 1000-2000 bp (Okazaki fragments)

Replication step 3 Proofread and repair

Detect replication errors, which occur about once in $10^4$ to $10^5$ bases

Mismatch repair in E. coli

(a) Newly synthesized DNA (red) has a mismatch (G-T).

(b) MutH, MutS, and MutL link the mismatch with the nearest methylation site (blue).

(c) An exonuclease removes the mismatched stretch from the red strand.

(d) DNA polymerase replaces it.

How to find the origin and termination sites of replication?

Chargaff parity rules (CPR), 1951

# of A = # of T; # of C = # of G

CPR I: holds for double-stranded DNA

Obvious from the complementary relationship

CPR II: the same equalities hold approximately within a single strand of DNA

Cause is not known yet

Violation is called ‘skew’

GC skew: (G-C)/(G+C)


GC skew

Max or min of GC skew appears at ori or ter sites
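The GC skew formula can be tried out on sliding windows along a sequence. The following is a minimal sketch, not the method used to produce the slide's plot: the function names, the window and step sizes, and the toy sequence are all assumptions made for illustration.

```python
def gc_skew(segment):
    """Return (G - C) / (G + C) for one DNA segment; 0.0 if it has no G or C."""
    segment = segment.upper()
    g = segment.count("G")
    c = segment.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(sequence, window, step):
    """Return (start, skew) pairs for windows of the given size along the sequence."""
    return [
        (start, gc_skew(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]


# Hypothetical toy sequence: a G-rich half followed by a C-rich half.
toy_genome = "GGGA" * 5 + "CCCT" * 5
skews = windowed_gc_skew(toy_genome, window=10, step=10)
for start, skew in skews:
    print(start, skew)
```

On a real genome the windows would be thousands of bases long; the position where the skew switches sign is the candidate ori or ter region.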

Oligomer skew

$f_i$: number of occurrences of oligomer i in a segment

$OA_i = \ln(f_i / f_i')$

Most organisms can increase exponentially

If all organisms survived and multiplied at the same rate, there would be no change in the frequency of the variants, and thus no evolution

Growth is limited by food, space, predators, etc.

When population size is limited, not all variants survive

This opens the possibility of natural selection

Chance effects also exist

Two variants that start at equal numbers will not stay at equal numbers, even with the same degree of fitness

This chance effect is called random drift; eventually one variant takes over the whole population

This implies that evolution can occur even without natural selection, which is referred to as neutral evolution

Variation

Any change in a gene sequence that is passed on to offspring

Caused by

Damage to the DNA molecule (from radiation, etc.)

Errors in replication

Point mutation – the simplest form of mutation; it occurs all over DNA sequences

Transition – a substitution within the purines (A, G) or within the pyrimidines (C, T/U)

Transversion – a substitution between a purine and a pyrimidine

Effects depend on where mutations occur

Non-coding region – no effect on proteins, and usually neutral

But may have significant effects if occurring in a control region

Coding region

Synonymous substitution – the mutation does not change the amino acid (AA)

Non-synonymous substitution – the AA is replaced by another, or a stop codon is introduced

Mutation
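The two classifications above (transition versus transversion, synonymous versus non-synonymous) can be sketched in a few lines. This is an illustrative sketch only: the function names are made up here, and the codon table is the standard genetic code written in DNA letters, which is an assumption about the organism.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def substitution_type(old, new):
    """Classify a single-base change as a transition or a transversion."""
    if old == new:
        raise ValueError("no substitution: bases are identical")
    same_group = {old, new} <= PURINES or {old, new} <= PYRIMIDINES
    return "transition" if same_group else "transversion"


def codon_effect(old_codon, new_codon):
    """Classify a codon change by its effect on the encoded amino acid."""
    old_aa = CODON_TABLE[old_codon]
    new_aa = CODON_TABLE[new_codon]
    if old_aa == new_aa:
        return "synonymous"
    if new_aa == "*":
        return "non-synonymous: stop codon introduced"
    return "non-synonymous: amino acid replaced"


print(substitution_type("A", "G"), substitution_type("A", "T"))
print(codon_effect("CTT", "CTC"), "|", codon_effect("TAT", "TAA"))
```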

Models of nucleotide substitution

[Figure: the four nucleotides A, G, T, and C; A↔G and T↔C are transitions, and the changes between a purine and a pyrimidine are transversions]

Jukes and Cantor one-parameter model of nucleotide substitution (α = β: transitions and transversions occur at the same rate)

[Figure: A, G, T, and C connected by substitution arrows]

Kimura model of nucleotide substitution (assumes α ≠ β: transitions and transversions occur at different rates)

[Figure: A, G, T, and C connected by substitution arrows]


Jukes-Cantor (JC) Kimura 2P Tamura
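To make the difference between the first two models concrete, here is a sketch of the textbook distance corrections they lead to. The slides only name the models; the formulas below are the standard Jukes-Cantor and Kimura two-parameter corrections, and the function names and example sequences are assumptions. The Tamura model is not covered.

```python
import math

PURINES = {"A", "G"}


def proportions(seq1, seq2):
    """Return (P, Q): fractions of aligned sites that differ by a
    transition (P) or by a transversion (Q). Assumes equal-length,
    gap-free DNA sequences."""
    transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    n = len(seq1)
    return transitions / n, transversions / n


def jc_distance(p):
    """Jukes-Cantor distance from the fraction p of differing sites."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def k2p_distance(P, Q):
    """Kimura two-parameter distance from transition (P) and
    transversion (Q) fractions."""
    return -0.5 * math.log(1.0 - 2.0 * P - Q) - 0.25 * math.log(1.0 - 2.0 * Q)


# Hypothetical aligned sequences, for illustration only.
P, Q = proportions("AGCTAGCTAGCTAGCTAGCT", "GGCTAGCTAGTTAGCTAGCA")
print(P, Q, jc_distance(P + Q), k2p_distance(P, Q))
```

When transversions are exactly twice as frequent as transitions (Q = 2P), which is what equal rates for all changes predicts, the two distances coincide.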

Indel mutation

Small indels of a single base or a few bases are frequent

Caused by slippage during DNA replication

Particularly frequent in repeated sequences

GCGC…: a slight slippage inserts an extra GC or deletes one

The CAG repeat region in the huntingtin gene can expand, causing Huntington’s disease

Indels whose length is not a multiple of three cause a frame shift

Gene inversion

Whole genes are copied to offspring in the reverse direction

Translocation

Whole genes can be deleted from one place in a genome and inserted into another

Mutation
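The frame-shift rule for indels is easy to check by splitting a coding sequence into codons before and after an insertion. A small sketch with a made-up sequence; the helper names are hypothetical.

```python
def codons(seq):
    """Split a sequence into complete codons, dropping any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def insert(seq, position, inserted):
    """Return seq with `inserted` placed before index `position`."""
    return seq[:position] + inserted + seq[position:]


def causes_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


original = "ATGGCGCGCAAATAG"              # hypothetical coding sequence
slipped = insert(original, 5, "GC")      # extra GC in the GC repeat
in_frame = insert(original, 3, "GCG")    # three-base insertion

print(codons(original))
print(codons(slipped))    # downstream codons change
print(codons(in_frame))   # downstream codons are preserved
```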

Orthologs: members of a gene (protein) family in various organisms. This tree shows globin orthologs.

Mutation Example

Paralogs: members of a gene (protein) family within a species. This tree shows human globin paralogs.


Globin phylogeny by Dayhoff (1972)


Globin phylogeny by Dayhoff in evolutionary time (1972)

Mature insulin consists of an A chain and a B chain, a heterodimer connected by disulfide bridges

The signal peptide and C peptide are cleaved, and their sequences display fewer functional constraints.


Note the sequence divergence in the disulfide loop region of the A chain

Historical background: insulin

By the 1950s, it became clear that amino acid substitutions occur nonrandomly.

For example, Sanger and colleagues noted that most amino acid changes in the insulin A chain are restricted to a disulfide loop region.

Such differences are called “neutral” changes (Kimura, 1968; Jukes and Cantor, 1969)

Subsequent studies at the DNA level showed that the rate of nucleotide (and of amino acid) substitution is about six- to ten-fold higher in the C peptide, relative to the A and B chains.

[Figure: number of nucleotide substitutions/site/year for regions of the insulin gene; values shown are $0.1 \times 10^{-9}$, $0.1 \times 10^{-9}$, and $1 \times 10^{-9}$]

Surprisingly, insulin from the guinea pig (and from the related coypu) evolves seven times faster than insulin from other species. Why?

The answer is that guinea pig and coypu insulin do not bind two zinc ions, while insulin molecules from most other species do. There was a relaxation of the structural constraints on these molecules, and so the genes diverged rapidly.

Historical background: insulin

Guinea pig and coypu insulin have undergone an extremely rapid rate of evolutionary change

Arrows indicate positions at which guinea pig insulin (A chain and B chain) differs from both human and mouse

In the 1960s, sequence data were accumulated for small, abundant proteins such as globins, cytochromes c, and fibrinopeptides. Some proteins appeared to evolve slowly, while others evolved rapidly.

Linus Pauling, Emanuel Margoliash and others proposed the hypothesis of a molecular clock:

Molecular clock hypothesis

For every given protein, the rate of molecular evolution is approximately constant in all evolutionary lineages

[Figure: corrected amino acid changes per 100 residues (m) plotted against millions of years since divergence; Dickerson (1971)]

If protein sequences evolve at constant rates, they can be used to estimate the times that sequences diverged. This is analogous to dating geological specimens by radioactive decay.

Molecular clock hypothesis: implications
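The dating idea can be written as simple arithmetic. If two sequences have accumulated d substitutions per site since they split and each lineage changes at a constant rate r, then d = 2rt, because both lineages evolve for time t. A sketch under that constant-rate assumption; the function names and all numbers are hypothetical.

```python
def rate_from_calibration(distance, divergence_years):
    """Substitutions/site/year per lineage, from a pair of sequences whose
    divergence time is known (for example, from fossils)."""
    return distance / (2.0 * divergence_years)


def divergence_time(distance, rate):
    """Years since two sequences diverged, assuming a constant rate."""
    return distance / (2.0 * rate)


# Hypothetical calibration: 0.20 substitutions/site after 100 million years.
calib_rate = rate_from_calibration(distance=0.20, divergence_years=100e6)

# Hypothetical pair of sequences with 0.05 substitutions/site.
estimated_years = divergence_time(distance=0.05, rate=calib_rate)
print(calib_rate, estimated_years)
```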


[Figure: two example trees; one with nodes labeled A to I drawn along a time axis, and one with nodes labeled A to E, numeric branch lengths, and a scale bar of one unit]

Molecular phylogeny uses trees to depict evolutionary relationships among organisms. These trees are based upon DNA and protein sequence data.

Population Genetics

Genealogical Tree

Evolutionary tree of a gene without recombination (mtDNA, Y chromosome)

Starting from the current generation, all copies can be traced back to a single copy of the gene – the coalescence process

Example: human mtDNA is traced back to an African woman who lived about 200,000 years ago (1996)

Coalescence Model

Assumptions

Constant population of N throughout time

Each individual is equally fit (same expected number of offspring) – equally likely to have any of the individuals in the previous generation as mother

Pick two individuals in the present generation

Prob. of having the same mother = 1/N

Prob. that their most recent common ancestor lived T generations ago:

$P(T) = (1 - 1/N)^{T-1} (1/N) \approx e^{-T/N} / N$

The time until coalescence of the lines of descent of any two individuals is approximately exponentially distributed, with a mean of N generations
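The coalescence formula can be checked numerically and by direct simulation. The sketch below follows the stated assumptions (constant N, every individual picks its mother uniformly at random); the function names, the population size, the number of trials, and the random seed are illustrative choices.

```python
import math
import random


def coalescence_prob(T, N):
    """P(T): two lineages first share a mother exactly T generations back."""
    return (1.0 - 1.0 / N) ** (T - 1) * (1.0 / N)


def simulate_coalescence_time(N, rng):
    """Trace two lineages back in time; in each generation both pick a
    mother uniformly from N individuals and coalesce if they pick the same."""
    T = 1
    while rng.randrange(N) != rng.randrange(N):
        T += 1
    return T


N = 20
rng = random.Random(0)
times = [simulate_coalescence_time(N, rng) for _ in range(20000)]
mean_time = sum(times) / len(times)

exact = coalescence_prob(100, 1000)
approx = math.exp(-100 / 1000) / 1000
print(mean_time, exact, approx)
```

The simulated mean should land close to N generations, and the exact and exponential forms agree closely when N is large.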

Coalescence

Mitochondrial Eve

Used the highly variable non-coding part, called the D-loop

The average number of sites that differ between pairs of sequences: 61.1 out of 16,553 bases

The average pairwise difference is 76.7 between Africans, and 38.5 between non-Africans

Divergent populations have existed in Africa for much longer

A relatively small population left Africa and spread through the rest of the world

The earliest branch point – 170,000 ± 50,000 years ago

Non-African migration – 52,000 ± 27,000 years ago


Purple/Green – all Africans

Yellow/blue – non-Africans

Fixation in Neutral Model

Mutation 1 does not survive to the present generation

Mutation 2 has a chance to spread to the entire population (become fixed)

Most mutations die out

If a mutation is neutral, what is the prob. of becoming fixed, $p_{fix}$?

Assume N copies of a gene and that each one is equally likely to mutate

Prob. that the mutation occurred in the gene copy that is the ancestor of the present generation is $1/N = p_{fix}$

A new mutation takes place in each copy with prob. u

The rate of fixation of new mutations is the rate at which mutations occur, multiplied by the prob. that each mutation is fixed:

$u_{fix} = (Nu) \, p_{fix} = u$

This shows that the rate of fixation of neutral mutations is equal to the underlying mutation rate and is independent of the population size
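The claim that a new neutral mutation is fixed with probability 1/N can be tested by simulation. This sketch assumes random resampling of gene copies each generation (the Wright-Fisher scheme); the function names, population size, trial count, seed, and mutation rate are all illustrative.

```python
import random


def next_generation(m, N, rng):
    """Each of the N copies in the next generation descends from a mutant
    copy with probability m / N; return the new number of mutant copies."""
    a = m / N
    return sum(1 for _ in range(N) if rng.random() < a)


def becomes_fixed(N, rng):
    """Follow one new mutation (a single copy) until it is lost or fixed."""
    m = 1
    while 0 < m < N:
        m = next_generation(m, N, rng)
    return m == N


def estimate_p_fix(N, trials, seed=0):
    """Fraction of independent new mutations that reach fixation."""
    rng = random.Random(seed)
    return sum(becomes_fixed(N, rng) for _ in range(trials)) / trials


N = 10
p_hat = estimate_p_fix(N, trials=5000)

u = 1e-6                          # hypothetical mutation rate per copy
u_fix = (N * u) * (1.0 / N)       # rate of new mutations times p_fix
print(p_hat, 1.0 / N, u_fix)
```

The estimate should sit near 1/N, and the product (Nu)(1/N) returns u whatever value of N is used.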

Fixation in Neutral Model

The number of copies of a mutation in the population changes on a random basis

If there are m copies of a neutral mutant sequence at one generation, the number of copies at the next generation is n ≈ m

Wright-Fisher model

Each copy of the gene in the next generation is randomly selected from the genes in the previous generation

Prob. that a given copy is a mutant: a = m/N; prob. that it is not: 1 - a

Prob. of n mutant copies in the next generation: $p(n) = \binom{N}{n} a^n (1-a)^{N-n}$

The mean value: Na = m

Simulation with N = 200 over 2,000 generations
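A simulation of this kind might look like the sketch below. It is not the code behind the slide's figure: the starting count of 100 mutant copies, the seed, and the function names are assumptions; only N = 200 and 2,000 generations come from the slide.

```python
import math
import random


def p_n(n, m, N):
    """Binomial prob. of n mutant copies next generation, given m now."""
    a = m / N
    return math.comb(N, n) * a ** n * (1.0 - a) ** (N - n)


def trajectory(m0, N, generations, seed=1):
    """Mutant copy number over time under Wright-Fisher resampling."""
    rng = random.Random(seed)
    population = [1] * m0 + [0] * (N - m0)   # 1 = mutant copy, 0 = original
    counts = [m0]
    for _ in range(generations):
        # Every copy in the next generation picks a random parent copy.
        population = rng.choices(population, k=N)
        counts.append(sum(population))
    return counts


N, m = 200, 100                    # m = 100 is an assumed starting point
mean_n = sum(n * p_n(n, m, N) for n in range(N + 1))
counts = trajectory(m, N, generations=2000)
print(mean_n, counts[:10], counts[-1])
```

Repeating the run with different seeds shows different paths; a copy number that reaches 0 or N stays there, which is loss or fixation.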