
Post on 19-Dec-2015

Picormatics

Today’s goal:

Give you an overview of some recent technological bioinformatics developments that can be applied to picornaviruses.

Where possible in less than a day's work, I have applied those techniques, as an example, to 'my' virus: R14.

This seminar is available (without © ) from: http://swift.cmbi.ru.nl/gv/seminars/

Some notes up front

Your community is not very WWW oriented.

I conclude this from the low number of cross-pointers, the high number of dead links and incomplete sites, and the lack of update dates, contact addresses, references, etc.

Your community is not very bioinformatics oriented either. For example, www.iah.bbsrc.ac.uk holds a beautifully complete list of VP1 sequences, one by one...

Your 'simple' bioinformatics options

Many protein structures

Many protein sequences

Together these allow for very precise structure-based sequence alignments, which in turn allow for novel sequence analysis techniques:

1) Correlated mutation analysis

2) Sequence variability analysis

Structure-based alignment

This is the top-left corner of an alignment of ~1000 sequences of ~300 residues each.

Correlated mutations

APGADSFGDFHKM
ALGADSFRDFRRL
ARGLDPFGMNHSI
AGGLDPFRMNRRV

Gray is conserved.

Black is variable.

Red/green are correlated mutations.

Correlated mutations guarantee a function.

Function is determined by the position in the structure, not by the residue type.
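As a rough illustration of what "correlated" means for the four-sequence example above, the sketch below groups alignment columns that change residue type in exactly the same sequences. It is a toy criterion chosen for this example, not the scoring used for the real ~1000-sequence alignment; the function names and the rule of skipping columns where every sequence differs are assumptions.

```python
from collections import defaultdict

# The four example sequences from the slide.
alignment = [
    "APGADSFGDFHKM",
    "ALGADSFRDFRRL",
    "ARGLDPFGMNHSI",
    "AGGLDPFRMNRRV",
]

def column_pattern(column):
    """Relabel residues by order of first appearance, e.g. 'AALL' -> (0, 0, 1, 1)."""
    labels = {}
    return tuple(labels.setdefault(res, len(labels)) for res in column)

def correlated_groups(seqs):
    """Group 1-based column numbers that share the same substitution pattern.

    Assumption: conserved columns (one residue type) and columns in which
    every sequence differs are skipped, because they carry no pairing signal.
    """
    groups = defaultdict(list)
    for pos, column in enumerate(zip(*seqs), start=1):
        pattern = column_pattern(column)
        n_types = len(set(pattern))
        if 1 < n_types < len(seqs):
            groups[pattern].append(pos)
    return [cols for cols in groups.values() if len(cols) > 1]

groups = correlated_groups(alignment)
print(groups)
```

On this toy alignment the sketch reports two groups of co-varying columns, which correspond to the two colours used on the slide. With real data one would need a statistical score rather than exact pattern identity.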

Correlated mutations

A pilot study indicates that this works for VP1, VP2, and VP3 too.

Correlated mutations and drug design

Correlations between residues and ligands.

Automatic structure comparison

Rhino 9

Polio 2

FMDV 1

Mengo 1

R14 drug placed in R16

antagonist

agonist

Automatic structure comparison

Example from nuclear hormone receptor drug design study

Back to sequences

First rule of sequence analysis:

If a residue is conserved, it is important.

Sequence analysis (continued)

Second rule of sequence analysis:

If a residue is very conserved, it is very important.

But what about the variable residues

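$$E = -\sum_{i=1}^{20} p_i \ln(p_i)$$

where p_i is the fraction of sequences that have residue type i at the alignment position under consideration.

Sequence variability is the number of residue types that are present at a position in more than 0.5% of all sequences.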
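The two measures defined above can be computed per alignment column in a few lines. This is a minimal sketch: the function names are hypothetical, the entropy uses the standard Shannon form with a leading minus sign, and gap handling (which a real alignment would need) is left out.

```python
import math
from collections import Counter

def entropy(column):
    """Shannon entropy (natural log) of the residue types in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log(n / total) for n in counts.values())

def variability(column, threshold=0.005):
    """Number of residue types present in more than `threshold` (0.5%) of the sequences."""
    counts = Counter(column)
    total = len(column)
    return sum(1 for n in counts.values() if n / total > threshold)

# A column from a hypothetical 1000-sequence alignment:
# 990 Ala, 6 Gly, and 4 Ser (Ser falls below the 0.5% cut-off).
column = "A" * 990 + "G" * 6 + "S" * 4
print(entropy(column), variability(column))
```

A fully conserved column has entropy 0 and variability 1; a column split evenly between two residue types has entropy ln 2 and variability 2.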

Entropy - Variability

Entropy = Information

Variability = Chaos

Box 11: main function

Box 12: first shell around main function

Box 22: core residues (signal transduction)

Box 23: modulator

Box 33: mainly surface

Most information about mutations is carefully hidden in the literature.

Automatic extraction of this information is no longer science fiction.

More than 90% of the 2226 mutations used for the previous few slides were extracted automatically from the literature. We extracted 160 more mutations 'by hand'.

Problems are mainly related to protein/gene nomenclature, residue numbering, and unclear description of the effects.
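To give a feel for what automatic extraction involves, here is a minimal sketch that pulls point mutations written as "V188I" or "Ala150Val" out of free text with regular expressions. It is not the software used for the slides; the patterns, the function name, and the example sentence are all made up for illustration, and the sketch ignores exactly the hard parts listed above (gene names, numbering schemes, effect descriptions).

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE = "ACDEFGHIKLMNPQRSTVWY"
THREE = "|".join(THREE_TO_ONE)

# One-letter form such as V188I, and three-letter form such as Ala150Val or Ala-150-Val.
SINGLE_LETTER = re.compile(rf"\b([{ONE}])(\d+)([{ONE}])\b")
THREE_LETTER = re.compile(rf"\b({THREE})-?(\d+)-?({THREE})\b")

def find_mutations(text):
    """Return (wild type, position, mutant) tuples found in `text`.

    Caveat: tokens that merely look like mutations (strain or cell-line
    names, for instance) will also match the one-letter pattern.
    """
    found = [(wt, int(pos), mut) for wt, pos, mut in SINGLE_LETTER.findall(text)]
    found += [
        (THREE_TO_ONE[wt], int(pos), THREE_TO_ONE[mut])
        for wt, pos, mut in THREE_LETTER.findall(text)
    ]
    return found

# Made-up sentence, for illustration only.
sentence = "The V188I substitution and the Ala150Val mutant were compared."
mutations = find_mutations(sentence)
print(mutations)
```

Mapping both notations onto the same tuple is the easy step; deciding which protein and which numbering scheme a position refers to is where most of the work goes.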

Mutation information

Mutation data

[Figure: seven bar-chart panels (Diseases, Transcription, Coregulator, Dimerization, Ligand binding, No effect, No mutations), each showing a percentage for Box 11, Box 12, Box 22, Box 23, and Box 33.]

A PubMed search gives:

picornavirus mutation 1176 (2)

rhinovirus mutation 101 (62)

poliovirus mutation 600 (144)

mengovirus mutation 30 (29)

In a small, manually checked subset, about 1 in 5 abstracts contained identifiable mutation information, but unfortunately often in a nomenclature that 'our' software does not understand yet.
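Hit counts like the ones above can also be requested programmatically. The sketch below uses the NCBI E-utilities `esearch` endpoint, which returns the hit count as a string in the `esearchresult.count` field of its JSON reply. The helper names are hypothetical, and the counts returned today will differ from the numbers on the slide.

```python
import json
import urllib.parse
import urllib.request

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def esearch_url(term):
    """Build an esearch URL that asks PubMed only for the hit count of `term`."""
    query = urllib.parse.urlencode(
        {"db": "pubmed", "term": term, "retmode": "json", "retmax": 0}
    )
    return f"{ESEARCH}?{query}"

def parse_count(payload):
    """Extract the hit count from an esearch JSON reply."""
    return int(json.loads(payload)["esearchresult"]["count"])

def pubmed_count(term):
    """Query PubMed over the network and return the number of hits for `term`."""
    with urllib.request.urlopen(esearch_url(term)) as response:
        return parse_count(response.read().decode("utf-8"))
```

Calling `pubmed_count("rhinovirus mutation")` needs network access; NCBI asks users to keep the request rate low when no API key is supplied.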

Picorna mutation information

Now something totally different

Motion is the main ingredient of protein function, even if that function is as 'dumb' as being a container for the RNA.

For example, all early Rhino-directed drugs were aimed at reducing the mobility of its VP1…

The simulation of protein motion is normally called molecular dynamics, or MD.

MD is commonly known as a very difficult technique for which you need the help of an army of mathematicians.

That is no longer true.

Dynamite (based on Bert de Groot's CONCOORD software) predicts protein motions via the WWW.

Protein dynamics calculation

A short break for a word from our sponsors

Laerte Oliveira

Our industrial sponsor:

Florence Horn

Wilma Kuipers, Weesp
Bob Bywater, Copenhagen
Nora vd Wenden, The Hague
Mike Singer, New Haven
Ad IJzerman, Leiden
Margot Beukers, Leiden
Fabien Campagne, New York
Øyvind Edvardsen, Tromsø

Simon Folkertsma, Frisia
Henk-Jan Joosten, Wageningen
Joost van Durma, Brussels
David Lutje Hulsik, Utrecht
Tim Hulsen, Goffert
Manu Bettler, Lyon

Elmar Krieger

Simon Folkertsma

David

Tim

Adje

Margot

Fabien

Manu