Sequence alignment exercises: ClustalW, inserting FASTA sequences

Post on 16-Jan-2016


Sequence alignment exercises: insert the FASTA sequences into ClustalW

Request a multiple alignment

Analyze the result

Conserved and variable regions in proteins

Codons and amino acids

The 20 amino acids have overlapping properties

Small change

Big change

Relationship between physico-chemical difference and relative substitution frequency

[Figure: relative substitution frequency (vertical axis) plotted against physico-chemical difference (horizontal axis)]

Drastic changes are infrequent

Minor changes are more frequent

Kimura (1983) The neutral theory of molecular evolution.

Pseudogenes as a paradigm of neutral evolution

Li, Gojobori and Nei (1981) Nature 292: 237-239

Pseudogenes show an extremely high rate of nucleotide substitution.

Conservation in a ‘typical’ gene

[Figure labels: start of transcription, start of translation, splice sites, polyadenylation site]

On the basis of 3,165 human-mouse pairs. MGSC (2002) Nature 420: 520-562

Degeneracy of the Genetic Code

Colors represent amino acids

Each of the 61 sense codons can mutate in 9 different ways (3 positions × 3 alternative nucleotides), giving 61 × 9 = 549 possible changes. Of these, 134 are synonymous.
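The count above can be checked by enumerating every single-nucleotide change of every sense codon. The following is a minimal sketch, assuming the standard genetic code written with DNA letters (T instead of U); the names `CODE` and `count_point_changes` are illustrative and do not come from the slides.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_point_changes(code=CODE):
    """Classify all single-nucleotide changes of the sense codons."""
    counts = {"synonymous": 0, "nonsynonymous": 0, "nonsense": 0}
    for codon, aa in code.items():
        if aa == "*":
            continue  # only sense codons are mutated
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = code[mutant]
                if new_aa == "*":
                    counts["nonsense"] += 1
                elif new_aa == aa:
                    counts["synonymous"] += 1
                else:
                    counts["nonsynonymous"] += 1
    return counts


if __name__ == "__main__":
    print(count_point_changes())
```

Changes that create a stop codon are tallied separately here, which is a choice made for this sketch; the slide only distinguishes synonymous from nonsynonymous changes.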

nonsynonymous

synonymous

Synonymous changes can be neutral mutations

King, J. L., and Jukes, T. H. 1969. Non-Darwinian evolution, Science 164, 788-798.

• If most DNA changes were due to adaptive evolution, then one would expect most changes to occur in the first and second codon positions.

• If DNA divergence includes neutral mutations, then the third position should change more rapidly because synonymous mutations are more likely to be neutral.

The first 220 nucleotides of human and mouse renin binding protein

The third position of every codon is marked

Of the 31 changes:
4 - 1st position
4 - 2nd position
23 - 3rd position

Preponderance of changes in the 3rd position
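A tally of this kind can be sketched in a few lines. The example below assumes two aligned coding sequences of equal length that start at the first position of a codon; the function name `changes_by_codon_position` and the toy sequences are hypothetical and are not the renin binding protein data.

```python
def changes_by_codon_position(seq_a, seq_b):
    """Count differences at the 1st, 2nd and 3rd codon positions.

    Assumes both sequences are aligned, in frame, and of equal length.
    Columns containing a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = [0, 0, 0]
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a == "-" or b == "-":
            continue
        if a != b:
            counts[i % 3] += 1  # index 0, 1, 2 = codon position 1, 2, 3
    return counts


if __name__ == "__main__":
    # Toy sequences for illustration only.
    first, second, third = changes_by_codon_position("ATGGCTAAA", "ATGGCCAAG")
    print(first, second, third)
```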

Estimating separately the rate of synonymous change and non-synonymous change

• KS = number of Synonymous substitutions per synonymous site

• KA = number of non-synonymous (Altering) substitutions per non-synonymous site

One way of estimating Ks and Ka would be to examine each change individually and check if it is synonymous or not. In the following we present a method for doing this in a systematic manner.

Nucleotide sites can be classified into 3 types of degenerate sites

4-fold degenerate: every change of this nucleotide gives one of the 4 codons for the same AA (all changes are synonymous)

2-fold degenerate: changes of this nucleotide relate pairs of codons for the same AA (some changes are synonymous, the others altering)

0-fold degenerate: no change at this nucleotide leaves the codon coding for the same AA (all changes are altering)

(AA = amino acid)

4-fold degenerate sites are found at 32 of the 61 3rd-position sites

2-fold degenerate sites are found at 25 of the 3rd-position sites and 8 of the 1st-position sites

0-fold degenerate sites are found at the 2nd-position sites of all 61 codons and at 53 of the 1st-position sites
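The classification can be sketched as a small function that asks, for each position of a codon, how many of the three alternative nucleotides leave the amino acid unchanged. This is an illustration under stated assumptions: the standard genetic code with DNA letters, and sites with one or two synonymous alternatives (including the 3rd position of isoleucine) both counted as 2-fold. The names `site_degeneracy` and `degeneracy_string` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop)
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def site_degeneracy(codon, pos):
    """Return 0, 2 or 4 for position pos (0, 1 or 2) of a sense codon."""
    aa = CODE[codon]
    if aa == "*":
        raise ValueError("stop codons are not classified")
    same = sum(
        1
        for base in BASES
        if base != codon[pos] and CODE[codon[:pos] + base + codon[pos + 1:]] == aa
    )
    if same == 0:
        return 0
    if same == 3:
        return 4
    return 2  # assumption: 1 or 2 synonymous alternatives both count as 2-fold


def degeneracy_string(seq):
    """Label each codon of an in-frame sequence, e.g. 'CTA' -> '204'."""
    seq = seq.upper()
    labels = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if CODE[codon] == "*":
            labels.append("---")
        else:
            labels.append("".join(str(site_degeneracy(codon, p)) for p in range(3)))
    return " ".join(labels)


if __name__ == "__main__":
    print(degeneracy_string("ATGTTTCTAGGG"))
```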

Classify each site in a sequence according to the degeneracy of the sites.

[Figure: genetic code table in which each codon is labeled with the degeneracy of its three positions, for example 002, 202, 204 or 004; 000 for codons with no degenerate site; - - - for the three stop codons]

Sequence 1: 000 002 002 002 204 002 004 204 002 004 000 002 002 004 004 004 002 002 204 002 004 004 004

Sequence 2: 000 002 002 002 204 002 004 204 002 004 000 002 002 004 004 002 002 002 204 002 004 004 002


Counting the number of 4-, 2- and 0-fold sites (taking the average between the two sequences):

L0 = (45 + 45)/2 = 45
L2 = (13 + 15)/2 = 14
L4 = (10 + 8)/2 = 9
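The averaging step can be sketched as follows, assuming each sequence has already been turned into a string of degeneracy labels (characters 0, 2 and 4). The function names and the short label strings are hypothetical and are not the sequences from the slides.

```python
from collections import Counter


def count_site_classes(degeneracy):
    """Count the 0-, 2- and 4-fold sites in a string of degeneracy labels.

    Any character other than '0', '2' or '4' (spaces, '-') is ignored.
    """
    counts = Counter(ch for ch in degeneracy if ch in "024")
    return {k: counts[str(k)] for k in (0, 2, 4)}


def average_site_counts(deg_a, deg_b):
    """Average the site counts of two sequences, giving L0, L2 and L4."""
    a = count_site_classes(deg_a)
    b = count_site_classes(deg_b)
    return {k: (a[k] + b[k]) / 2 for k in (0, 2, 4)}


if __name__ == "__main__":
    # Toy label strings for illustration only.
    averages = average_site_counts("002 204 004", "002 202 004")
    for k in (0, 2, 4):
        print(f"L{k} = {averages[k]}")
```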

Classify the differences with another sequence by:
a. transition (S) or transversion (V)
b. degeneracy (0, 2, 4)

              0-fold  2-fold  4-fold
transition    S0      S2      S4
transversion  V0      V2      V4

The key simplification is the special relationship between transition/transversion and degeneracy:

Synonymous mutations: S2, S4, V4 (transitions at 2-fold sites and all changes at 4-fold sites)

Non-synonymous mutations: S0, V0, V2 (all changes at 0-fold sites and transversions at 2-fold sites)

(Exceptions: 1st position of arginine (CGA, CGG, AGA, AGG), last position of isoleucine (AUU, AUC, AUA).)

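A ↔ G and T ↔ C = transitions (purine ↔ purine, pyrimidine ↔ pyrimidine)

A ↔ T, A ↔ C, G ↔ T, G ↔ C = transversions (purine ↔ pyrimidine)

We distinguish between transitions and transversions according to the Kimura model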
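The distinction can be expressed directly in code. This is a minimal sketch with illustrative names (`substitution_type`, `count_ts_tv`), assuming DNA letters and two aligned sequences of equal length; it does not split the counts by degeneracy class.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(a, b):
    """Return 'transition', 'transversion', or None if the bases are equal."""
    a, b = a.upper(), b.upper()
    for base in (a, b):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if a == b:
        return None
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        kind = substitution_type(a, b)
        if kind == "transition":
            transitions += 1
        elif kind == "transversion":
            transversions += 1
    return transitions, transversions


if __name__ == "__main__":
    print(count_ts_tv("ACGT", "GCTT"))
```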

Use Kimura’s 2-parameter model to estimate the numbers of transitions (Ai) and transversions (Bi) per i-th type site.

Calculate the proportions of transitional and transversional differences:

Pi = Si/Li (12/70)
Qi = Vi/Li (3/70)

The Kimura model is used to correct for multiple hits:

Ai = (1/2) ln (1/(1 - 2Pi - Qi)) - (1/4) ln (1/(1 - 2Qi))
Bi = (1/2) ln (1/(1 - 2Qi))
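The two correction formulas translate directly into code. The sketch below uses a hypothetical function name, `kimura_two_parameter`, and plugs in the example proportions 12/70 and 3/70 quoted above; the slides do not say which site class those proportions belong to, so the printed values are only an illustration and need not match other figures on the slides.

```python
import math


def kimura_two_parameter(p, q):
    """Return (A, B): corrected transitions and transversions per site.

    p is the proportion of transitional differences, q the proportion of
    transversional differences, both between 0 and 1.
    """
    ts_term = 1 - 2 * p - q
    tv_term = 1 - 2 * q
    if ts_term <= 0 or tv_term <= 0:
        raise ValueError("sequences too divergent for the correction")
    a = 0.5 * math.log(1 / ts_term) - 0.25 * math.log(1 / tv_term)
    b = 0.5 * math.log(1 / tv_term)
    return a, b


if __name__ == "__main__":
    a, b = kimura_two_parameter(12 / 70, 3 / 70)
    print(f"A = {a:.3f}, B = {b:.3f}")
```

When p and q are both 0 the function returns 0 for both values, and the corrected values grow faster than the raw proportions as the sequences diverge, which is the point of correcting for multiple hits.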

(~6 times more transitions than transversions)

(0.242)(0.045)

The Kimura model is similar to the Jukes-Cantor model (from the previous lecture) but also takes into account that transitions and transversions occur at different frequencies.

Relationship between the number of nucleotide substitutions and the difference in the year of isolation for the H3 hemagglutinin gene of human influenza A viruses. All sequence comparisons were made with the strain isolated in 1968.

The Molecular Clock of Viral Evolution

Gojobori et al. (1990) PNAS 87: 10015-10018

Different rates
