Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics, Systems & Synthetic Biology Prof. Natalio Krasnogor Automated Scheduling, Optimisation and Planning Research Group School of Computer Science, University of Nottingham www.cs.nott.ac.uk/~nxk twitter.com/NKrasnogor Page 1 of 86 IEEE Congress on Evolutionary Computation - New Orleans, USA, 2011

Upload: natalio-krasnogor

Post on 12-May-2015


DESCRIPTION

In this talk I will overview ten years of research in the application of evolutionary computation ideas in the natural sciences. The talk will take us on a tour that will cover problems in nanoscience, e.g. controlling self-organizing systems, optimizing scanning probe microscopy, etc., problems arising in bioinformatics, such as predicting protein structures and their features, to challenges emerging in systems and synthetic biology. Although the algorithmic solutions involved in these problems are different from each other, at their core, they retain Darwin’s wonderful insights. I will conclude the talk by giving a personal view on why EC has been so successful and where, in my mind, the future lies.

TRANSCRIPT

Page 1: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics, Systems & Synthetic Biology

Prof. Natalio Krasnogor

Automated Scheduling, Optimisation and Planning Research Group

School of Computer Science, University of Nottingham

www.cs.nott.ac.uk/~nxk

twitter.com/NKrasnogor

IEEE Congress on Evolutionary Computation - New Orleans, USA, 2011

Page 2: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Outline

• Darwin’s Magic and Algorithmic Beauty
• Evolutionary Computation in the Natural Sciences
  – Self-Assembly and Scanning Probe Microscopy Optimisation
  – Structural Bioinformatics
  – Systems Biology & Synthetic Biology
• On Invariants, Decorations and the Future
• Conclusions


Page 4: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Darwin’s Magic

Thank you, YouTube

Page 5: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Algorithmic Beauty

1. Inheritable Instruction Set
2. Limited Resources
3. Imperfect Replication

A Powerful Secondary Effect: Selection

An awe-inspiring product:

Evolution by Natural Selection
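The three ingredients listed on this slide are enough to write a working evolutionary loop. The following is a minimal sketch, not code from the talk: the bit-string genome, the `evolve` function, its parameter values and the toy fitness (counting 1 bits) are all assumptions chosen for illustration.

```python
import random

def evolve(fitness, genome_len=20, pop_size=30, generations=60,
           mutation_rate=0.05, seed=0):
    """Minimal generational loop: replicate with errors, keep the fittest."""
    rng = random.Random(seed)
    # 1. Inheritable instruction set: each individual is a bit string.
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # 3. Imperfect replication: every parent produces one mutated copy.
        offspring = [[1 - g if rng.random() < mutation_rate else g
                      for g in parent]
                     for parent in population]
        # 2. Limited resources: only pop_size individuals survive,
        #    so selection emerges as a secondary effect.
        population = sorted(population + offspring, key=fitness,
                            reverse=True)[:pop_size]
    return population[0]

best = evolve(fitness=sum)  # toy fitness: number of 1 bits in the genome
```

Note that nothing in the loop mentions selection explicitly; it appears only because the population size is capped.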



Page 7: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Evolutionary Computation in the Natural Sciences

• A Research Vision: programmable algorithmic entry to the vast world of nanoscale physical, chemical & biological systems and processes

Algorithmic and Artificial Living Matter (ALMA)

[Figure labels: Computer Science; Embedded behavior; Information & Algorithms; Complexity; Robustness; Tradeoffs]

What does “The Logistics of Small Things” look like?

How (?) do you gain algorithmic entry into


Page 8: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

The Spatial Scales Involved


Page 9: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

ALMA & The Logistics of Small Things

How do you program complex nano/micro scale processes:

• through billions of tiny & simple distributed programs/processors?

• when there is no clear distinction between hardware and software?

• when the wetware is not simply a stochastic program:

• when wetware is poorly characterised and is likely to evolve, etc.

    function f1(p1,p2,p3,p4){
      if (p1<p2) and (rand<0.5)
        print p3
      else
        print p4
    }

    function f1(p1,p2,p3,p4){
      if (p1<p2) RND
        print p3 RND
      else RND
        print p4 RND
    }

    function f1(p1,p2,p3,p4){
      if (p1<p2) RND
        print p3 RND
      else RND
        print p4 RND
    }

    function f1(p1,p2,p3,p4){
      if (p1<p2) RND
        incr p3 RND
      else RND
        decr p4 RND
    }


Page 11: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

The Spatial Scales Involved


Page 12: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Molecular Tiles & Programmable Self-Assembly

Algorithmic Self-Assembly of DNA Sierpinski Triangles. P.W.K. Rothemund, N. Papadakis, E. Winfree. PLoS Biology 2:12 (2004)


Page 13: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Tile System

Supra-structure

Which is the correct input?

• Finite size square lattice (300x300)
• Fixed T = 4
• 10 tiles

How can we automatically design a tile system that self-assembles into a target shape?

      c0  c1  c2  c3  c4  c5  c6  c7  c8  c9
c0     3   7   2   8   1   5   0   6   3   9
c1     7   6   7   0   1   2   4   2   8   7
c2     2   7   0   5   3   3   5   7   1   6
c3     8   0   5   4   1   5   2   7   3   1
c4     1   1   3   1   9   2   3   7   6   5
c5     5   2   3   5   2   4   1   2   3   9
c6     0   4   5   2   3   1   0   6   5   1
c7     6   2   7   7   7   2   6   8   2   9
c8     3   8   3   3   6   3   5   2   7   8
c9     9   7   1   1   5   9   1   9   8   1

Glue strength matrix M
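To make the role of the glue strength matrix and the threshold T concrete, here is a small sketch of an attachment test. It assumes the standard abstract tile assembly rule (a tile may attach when the summed glue strength with its already placed neighbours is at least T), which the slide does not spell out. The 3-colour matrix `M`, the tile encoding and the helper names `binding_strength` and `can_attach` are hypothetical and much smaller than the 10-colour system on the slide.

```python
# Tile = glue colours on its (north, east, south, west) sides.
# Hypothetical symmetric 3-colour glue strength matrix, not the slide's matrix.
M = [[0, 0, 0],
     [0, 4, 2],
     [0, 2, 2]]
T = 4  # threshold, as fixed on the slide

OPPOSITE = {0: 2, 1: 3, 2: 0, 3: 1}                 # north<->south, east<->west
OFFSETS = {0: (0, 1), 1: (1, 0), 2: (0, -1), 3: (-1, 0)}

def binding_strength(lattice, pos, tile):
    """Sum glue strengths between `tile` at `pos` and its placed neighbours."""
    x, y = pos
    total = 0
    for side, (dx, dy) in OFFSETS.items():
        neighbour = lattice.get((x + dx, y + dy))
        if neighbour is not None:
            total += M[tile[side]][neighbour[OPPOSITE[side]]]
    return total

def can_attach(lattice, pos, tile, threshold=T):
    """A tile attaches to an empty site if its total binding reaches T."""
    return pos not in lattice and binding_strength(lattice, pos, tile) >= threshold

seed = {(0, 0): (1, 1, 1, 1)}
corner = {(0, 0): (1, 1, 1, 1), (1, 1): (1, 1, 1, 1)}
strong = can_attach(seed, (1, 0), (2, 2, 2, 1))       # one strong bond: 4 >= T
weak = can_attach(seed, (1, 0), (2, 2, 2, 2))         # one weak bond: 2 < T
cooperative = can_attach(corner, (1, 0), (2, 2, 2, 2))  # two weak bonds: 2 + 2 >= T
```

The last case shows why the design problem is hard: whether a tile binds depends on what has already assembled around the site, not on the tile alone.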

Evolving tiles for automated self-assembly design. G. Terrazas, M. Gheorghe, G. Kendall, and N. Krasnogor. Proceedings for the 2007 IEEE Congress on Evolutionary Computation, 2007. Best paper award.

Toward minimum size self-assembled counters by P. Moisset de Espanes, A. Goel. Nat Comput (2008) 7:317–334


Page 14: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Tiles with deterministic assembly (Model 1)

Tiles with probabilistic assembly (Model 2)


Page 15: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Phenotype – Fitness Mapping

Minkowski functionals (A, P, X)

A = 12, P = 24, X = 0

A = 100, P = 40, X = 1
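The Minkowski functionals of a lattice pattern (area A, perimeter P and Euler characteristic X) can be computed directly from the set of filled cells. The sketch below is an illustration rather than the authors' implementation: the function name and the two test shapes are assumptions, chosen so that they reproduce the two (A, P, X) triples shown on the slide; the actual shapes pictured there cannot be recovered from the transcript.

```python
def minkowski_functionals(cells):
    """Return (A, P, X) for a set of filled unit cells given as (x, y) pairs."""
    cells = set(cells)
    area = len(cells)
    # Perimeter: count cell sides that border an empty cell.
    perimeter = sum((x + dx, y + dy) not in cells
                    for x, y in cells
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    # Euler characteristic X = V - E + F of the union of closed unit squares.
    vertices = {(x + i, y + j) for x, y in cells for i in (0, 1) for j in (0, 1)}
    edges = set()
    for x, y in cells:
        edges.add(((x, y), (x + 1, y)))            # bottom side
        edges.add(((x, y + 1), (x + 1, y + 1)))    # top side
        edges.add(((x, y), (x, y + 1)))            # left side
        edges.add(((x + 1, y), (x + 1, y + 1)))    # right side
    euler = len(vertices) - len(edges) + area
    return area, perimeter, euler

square = {(x, y) for x in range(10) for y in range(10)}
ring = {(x, y) for x in range(4) for y in range(4)} - {(1, 1), (1, 2), (2, 1), (2, 2)}
```

A filled 10x10 square gives (100, 40, 1), and a 4x4 ring with a 2x2 hole gives (12, 24, 0): the hole lowers the Euler characteristic by one.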

Evolutionary Design Approach

Variable length individuals (Genotype)

Genotype-Phenotype Mapping

Phenotype

Randomly created Wang tiles

Bitwise mutation vs. one-point crossover

Population size = 100, Individuals length = [1,10], Generations = 300, Pcrossover = 0.7, PMutation = 0.1/0.05/0.01
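The two variation operators named on this slide can be sketched for variable-length individuals as follows. This is an assumed encoding, not the one used in the cited paper: the length range [1,10] is read here as a number of tiles, `BITS_PER_TILE` is a hypothetical encoding width, and the crossover cuts both parents at the same tile boundary so that offspring lengths stay legal.

```python
import random

P_CROSSOVER, P_MUTATION = 0.7, 0.05   # values listed on the slide
MIN_TILES, MAX_TILES = 1, 10          # assumed meaning of "length = [1,10]"
BITS_PER_TILE = 8                     # hypothetical encoding width

def random_individual(rng):
    """Create a random bit string encoding between 1 and 10 tiles."""
    n_tiles = rng.randint(MIN_TILES, MAX_TILES)
    return [rng.randint(0, 1) for _ in range(n_tiles * BITS_PER_TILE)]

def one_point_crossover(a, b, rng):
    """Swap tails at a shared tile boundary; offspring keep legal lengths."""
    if rng.random() >= P_CROSSOVER:
        return a[:], b[:]
    max_cut = min(len(a), len(b)) // BITS_PER_TILE
    cut = rng.randint(0, max_cut) * BITS_PER_TILE
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def bitwise_mutation(individual, rng):
    """Flip each bit independently with probability P_MUTATION."""
    return [1 - bit if rng.random() < P_MUTATION else bit for bit in individual]

rng = random.Random(42)
parent_a, parent_b = random_individual(rng), random_individual(rng)
child_a, child_b = one_point_crossover(parent_a, parent_b, rng)
child_a = bitwise_mutation(child_a, rng)
```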


Page 16: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Probabilistic Assembly +No Rotation

Probabilistic Assembly +Rotation

Deterministic Assembly +Rotation

Deterministic Assembly +No Rotation


Page 17: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

How Does Self-Assembly Get Programmed?

Two-tile self-assembly, three-tile self-assembly, four-tile self-assembly, five-tile self-assembly

We calculated the equivalence classes of binding pockets defined by “bp1 R bp2 iff NAFE(bp1) = NAFE(bp2)” for the best tile set.

We observed that equivalence classes with NAFE smaller than T are highly likely to participate in the self-assembly process, as these are more populous.

More “assemblable” binding pockets = Generalised Secondary Structures

Page 18: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

DNA Tiles are Too Big!

Triangular site lattice

Physical events to capture:

1. Adsorption: tiles are placed on the substrate at a given rate
2. Diffusion: tiles move or rotate from one position to another, allowing:
   • Separation from one or more tiles
   • Motion along a line of tiles
   • Motion without interaction
3. Diffusion across terraces on the substrate
4. Intramolecule strength: energy between two non-functionalised porphyrins
5. Molecule-substrate strength: energy of a porphyrin to the substrate
6. Rotational strength: molecule-substrate strength for spinning

Neighbourhood size 6

Neighbourhood size 8

Page 19: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

How Do You Image and Manipulate at This Scale?

D.L. Keeling et al. Phys. Rev. Lett 94, 146104 (2005)

Y. Sugimoto et al., Nature letters 446, 64 (2007).

Hla et al. Phys. Rev. Lett. 85, 2777–2780 (2000)

C60

D. M. Eigler & E. K. Schweizer, Nature 344, 524 - 526 (1990)


Page 20: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Even 3 Variable Problems are Difficult: Optimising a Scanning Probe Microscope

The tunnel current, i_t, is highly dependent on the tip-sample distance, d. This current can be maintained with a feedback loop, G, that actively controls the tip-sample distance.

$i_t \propto \exp(-2kd)$
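The exponential dependence of the tunnel current on distance, and the way a feedback loop can hold the current at a setpoint, can be illustrated with a few lines of code. This is a sketch under stated assumptions: the decay constant `k = 10` per nm, the prefactor `i0`, the gain value and the logarithmic error signal are illustrative choices, not the microscope's actual control law.

```python
import math

def tunnel_current(d, k=10.0, i0=1.0):
    """i_t = i0 * exp(-2 * k * d); d in nm, k in 1/nm (assumed values)."""
    return i0 * math.exp(-2.0 * k * d)

def feedback_step(d, i_set, gain=0.5, k=10.0, i0=1.0):
    """Move the tip so that the log of the current approaches the setpoint."""
    error = math.log(tunnel_current(d, k, i0) / i_set)
    return d + gain * error / (2.0 * k)   # too much current -> retract the tip

# Sensitivity: moving the tip 0.1 nm closer multiplies the current by e^2 (~7.4).
ratio = tunnel_current(0.4) / tunnel_current(0.5)

d = 0.30                       # starting tip-sample distance (nm)
i_set = tunnel_current(0.50)   # setpoint current
for _ in range(40):
    d = feedback_step(d, i_set)
```

With these assumed numbers the loop settles at d = 0.50 nm; each step halves the remaining distance error because the error signal is linear in d.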

[Figure: schematic of the microscope showing the scanning tip, the sample surface, the X, Y and Z axes under direct (piezo) control, the feedback loop G, the current i and the voltage V. Source: http://www2.fz-juelich.de/ibn/index.php?index=1021]

Page 21: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Understanding the image

J. H. A. Hagelaar et al. PRB 78, 161405R (2008)

L. Gross et al. Science 325, 1110 (2009)

Page 22: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

(Un)Stable and (Un)defined Tip States

Imaging problems, spontaneous tip changes


Page 23: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Two Stage Automation Process

Ex-situ / In-situ

Voltage pulsing (deliberate crash)

Fine tuning (changing scan parameters)

Cellular GA with Smart Initialisation

Automated probe microscopy via evolutionary optimisation at the atomic scale. R. Woolley, J. Sterling, A. Radocea, N. Krasnogor and P. Moriarty. Applied Physics Letters (to appear)

Page 24: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Stage 1: Smart Initialisation (coarsely) Conditions the Probe

Streaky Image. Executing cleaning pulse

Cloudy Image. Executing cleaning pulse

Flat Surface. Zooming in to 50nm

Flat Surface. Zooming in to 20nm

Constant Atomic resolution. Zooming in to 4nm

Poor Atomic resolution. Rescanning

Consistent fair atomic resolution. Stage 1 complete. Time elapsed: 1010.1902 (~17mins)

A deterministic approach


Page 25: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Stage 2: Fine adjustment with cGA

[Figure: starting image vs. machine-optimised image; cellular GA grid with cells labelled G, i, V]

Page 26: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Do I really need a cGA? Would a stochastic selection be just as good?

• Standard deviation is from the ‘noise’ of the GA

• RMI average 0.12

Insets: 1x1 nm² (a) before cGA, (b) optimised.

• Stochastic selection of parameters, average RMI 0.01

Page 27: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

How Does it Compare to an Expert Operator?

Microscopist   Ave. RMI   Change in RMI/min
i-SPM          0.20       7.1
Human          0.09       2.6


Page 29: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

The Spatial Scales Involved


Page 30: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Primary Sequence 3D Structure

Protein Folding & Structure Prediction

Protein Structure Prediction (PSP) aims to predict the 3D structure of a protein based on its primary sequence

(perhaps disregarding the folding process)

Anfinsen’s thermodynamic hypothesis [Anfinsen 1973, Dill and Chan 1997]


Page 31: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Defining and Predicting Useful Features

Contact

M. Stout, J. Bacardit, J. Hirst & N. Krasnogor, Bioinformatics 2008, 24(7):916-923.

M. Stout, J. Bacardit, J.D. Hirst, R.E Smith, and N. Krasnogor. Prediction of topological contacts in proteins using learning classifier systems. Journal Soft Computing - A Fusion of Foundations, Methodologies and Applications, 13(3):245-258, 2008.


Page 32: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Integrating Multiple Prediction Sources

1. Prediction of:
   • Secondary structure (using PSIPRED)
   • Solvent Accessibility
   • Recursive Convex Hull
   • Coordination Number

2. Integration of all these predictions plus other sources of information

3. Final CM prediction (using BioHEL)

Using BioHEL


Page 33: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

The BioHEL GBML System

• BIOinformatics-oriented Hierarchical Evolutionary Learning: BioHEL (Bacardit & Krasnogor, 2009)

• BioHEL is a rule-based evolutionary learning system that employs the Iterative Rule Learning (IRL) paradigm
  – First used in EC in Venturini’s SIA system (Venturini, 1993)
  – Widely used for both fuzzy and non-fuzzy evolutionary learning

J. Bacardit, M. Stout, J.D. Hirst, K. Sastry, X. Llora, and N. Krasnogor. Automated alphabet reduction method with evolutionary algorithms for protein structure prediction. Proceedings of the 2007 Genetic and Evolutionary Computation Conference, ACM Press, 2007.

J. Bacardit, M. Stout, J.D. Hirst, A. Valencia, R.E. Smith, and N. Krasnogor. Automated alphabet reduction for protein datasets. BMC Bioinformatics, 10(6), 2009.

Bronze Medal in the 2007 “HUMIES” Awards for Human-Competitive Results Produced by Genetic and Evolutionary Computation.

Page 34: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

How are these features predicted?

Many of these features are due to local interactions of an amino acid and its immediate neighbours.

We predict them from the closest neighbours in the chain.

[Figure: chain of residues Ri-5 … Ri+5, each paired with SSi-5 … SSi+5]

Ri-1 Ri Ri+1 → SSi

Ri Ri+1 Ri+2 → SSi+1

Ri+1 Ri+2 Ri+3 → SSi+2
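The sliding-window construction shown in the three rows above (one training instance per residue, built from the residue and its chain neighbours) can be sketched as follows. The function name, the padding symbol `X` for positions beyond the chain ends, and the toy sequence and labels are assumptions made for illustration.

```python
def window_instances(residues, labels, half_width=1, pad="X"):
    """Build one (window, label) pair per residue from its chain neighbours."""
    padded = [pad] * half_width + list(residues) + [pad] * half_width
    return [(tuple(padded[i:i + 2 * half_width + 1]), labels[i])
            for i in range(len(residues))]

# Toy example: a four-residue chain with one feature label per residue.
instances = window_instances("MKVL", "CHHC")
# [(('X','M','K'), 'C'), (('M','K','V'), 'H'), (('K','V','L'), 'H'), (('V','L','X'), 'C')]
```

Setting `half_width=5` would give the wider Ri-5 … Ri+5 neighbourhood.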

Page 35: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Contact Map dataset

• The set of 2811 proteins was randomly halved
• Moreover, all proteins with more than 350 amino acids were discarded
• Still, the resulting training set contained more than 15.2 million instances and 631 attributes
• Less than 2% of those are actual contacts
• 36GB of disk space

Page 36: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Samples and ensembles

50 samples of 300K examples are generated from the training set with a ratio of 2:1 non-contacts/contacts

BioHEL is run 25 times for each sample

Prediction is done by a consensus of 1250 rule sets

Confidence of prediction is computed based on the votes distribution in the ensemble.

Whole training process takes about 289 CPU days (~5.5h/rule set)
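The arithmetic of the ensemble and one plausible way to turn its votes into a prediction with a confidence value can be written down briefly. In this sketch the `consensus` function and the use of the winning class's vote share as the confidence are assumptions; the slide only states that confidence is derived from the vote distribution. The vote counts are made up.

```python
from collections import Counter

def consensus(votes):
    """Majority vote over an ensemble; confidence = share of the winning class."""
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)

N_SAMPLES, RUNS_PER_SAMPLE = 50, 25
n_rule_sets = N_SAMPLES * RUNS_PER_SAMPLE    # 1250 rule sets in the ensemble
cpu_days = n_rule_sets * 5.5 / 24            # about 286, close to the quoted 289

# Hypothetical vote distribution for one residue pair.
votes = ["contact"] * 900 + ["non-contact"] * 350
label, confidence = consensus(votes)         # 'contact' with confidence 0.72
```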

[Figure: Training set, Samples (x50), Rule sets (x25), Consensus, Predictions]

Page 37: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Critical Assessment of Techniques for Protein Structure Prediction

CASP facts
• biennial competition
• started in 1994
• parallel prediction and experimental verification
• model assessment by human experts

9th edition of CASP
• 150 human groups
• 140 server groups

Page 38: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Contact Map prediction in CASP 7

Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209

Accuracy for groups that predicted a common subset of targets

Page 39: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Xd results

Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209

Contact Map prediction in CASP 7

Page 40: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Remarkable Prediction

L/10 prediction for target T0443-D1

67% accuracy

Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209
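An "L/10" score is commonly computed by ranking the predicted contacts by confidence, keeping the top L/10 (L being the chain length) and measuring the fraction that are true contacts. The sketch below follows that reading. The function name and the toy data are assumptions, and the official assessment applies further filters (for example on sequence separation) that are omitted here.

```python
def top_fraction_accuracy(predictions, true_contacts, length, divisor=10):
    """predictions: list of ((i, j), confidence); true_contacts: set of (i, j)."""
    k = max(1, length // divisor)
    ranked = sorted(predictions, key=lambda p: p[1], reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(pair in true_contacts for pair, _ in ranked)
    return hits / len(ranked)

# Toy chain of length 30, so the top 3 predictions are evaluated.
predictions = [((1, 9), 0.95), ((2, 14), 0.90), ((3, 20), 0.40),
               ((5, 25), 0.85), ((7, 30), 0.10)]
true_contacts = {(1, 9), (5, 25), (7, 30)}
accuracy = top_fraction_accuracy(predictions, true_contacts, length=30)
```

Here two of the three top-ranked pairs are true contacts, so the accuracy is 2/3.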


Page 41: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Contact Map prediction in CASP 9

• A larger set of proteins was employed
  – Set of 3262 proteins for training all the 1D predictors
  – A subset of 2413 proteins used for CM prediction
    · All proteins with fewer than 250 AA
    · A randomly selected 20% for larger chains
• 50 samples of ~660000 instances were generated
• The representation remained unchanged
• 25K CPU hours were employed just to train the CM ensemble

Page 42: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

In terms of performance

These two groups derived contact predictions from 3D models


Page 43: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Improving the Energy Function for Full 3D PSP

Energy landscape
• all-atom force field
• statistical potential

Search method
• random walk
• structure optimisation

Folding@home
• 8.5 petaFLOPS
• 10 000 CPU days for 10 μs of folding

[Dill and Chan 1997]

P. Widera, J.M. Garibaldi, and N. Krasnogor. Evolutionary design of the energy function for protein structure prediction. Proceedings of the IEEE Congress on Evolutionary Computation 2009.

P. Widera, J. Garibaldi, and N. Krasnogor. GP challenge: evolving the energy function for protein structure prediction. Journal of Genetic Programming and Evolvable Machines, 11:61-88, 2010.

Gold Medal in the 2010 “HUMIES” Awards for Human-Competitive Results Produced by Genetic and Evolutionary Computation

Page 44: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

How to find good quality models?

Correlation between energy and distance to the native structure

Requirements
• energy reflects distance
• distance reflects similarity
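Whether "energy reflects distance" can be checked with a rank correlation between the energies of a set of candidate models and their distances to the native structure. The sketch below uses a plain Spearman correlation as a stand-in; the talk's own measure may differ, and the energies and distances are made-up numbers.

```python
def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(xs, ys):
    """Pearson correlation of the ranks of xs and ys."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

energies = [-120.0, -95.0, -80.0, -60.0]     # hypothetical model energies
distances = [1.2, 2.5, 4.0, 7.5]             # hypothetical distances to native
rho = spearman(energies, distances)          # lower energy <-> closer to native
```

A value near 1 means that ranking models by energy is as good as ranking them by their true distance to the native structure.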


Page 45: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

How do the best of CASP do it?

Energy of models vs. distance to a target structure

Similarity measure

Decoys generated by
• I-TASSER [Wu et al. 2007]
• Robetta [Rohl et al. 2004]


Page 47: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

How is the energy function designed?

Weighted sum vs. free combination of terms

Decision support
• local numerical approximation

GP input
• terminals: T1 … T8
• functions: add, sub, mul, div, sin, cos, exp, log
• random ephemerals in range [0,1]

Zhang et al. 2003
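The contrast between a weighted sum and a free combination of energy terms can be made concrete with a tiny expression-tree evaluator over the function set listed on this slide. This is a sketch, not the authors' GP system: the tuple-based tree encoding, the protected versions of `div`, `log` and `exp` (a common GP convention assumed here) and the example term values are all illustrative.

```python
import math

def protected_div(a, b):
    """Division that returns 1.0 instead of failing on a near-zero divisor."""
    return a / b if abs(b) > 1e-9 else 1.0

def protected_log(a):
    """Logarithm of |a|, returning 0.0 for arguments near zero."""
    return math.log(abs(a)) if abs(a) > 1e-9 else 0.0

FUNCTIONS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
    "div": (2, protected_div),
    "sin": (1, math.sin),
    "cos": (1, math.cos),
    "exp": (1, lambda a: math.exp(min(a, 50.0))),   # clamp to avoid overflow
    "log": (1, protected_log),
}

def evaluate(tree, terms):
    """tree: a terminal name 'T1'..'T8', a constant, or (function, child, ...)."""
    if isinstance(tree, str):
        return terms[tree]
    if isinstance(tree, (int, float)):
        return float(tree)
    name, *children = tree
    arity, fn = FUNCTIONS[name]
    if arity != len(children):
        raise ValueError(f"{name} expects {arity} argument(s)")
    return fn(*(evaluate(child, terms) for child in children))

def weighted_sum(weights, terms):
    """The classical alternative: a fixed linear combination of the terms."""
    return sum(w * terms[name] for name, w in weights.items())

terms = {"T1": 2.0, "T2": 3.0}                               # hypothetical values
linear = weighted_sum({"T1": 0.5, "T2": 0.25}, terms)        # 1.75
free = evaluate(("add", ("mul", 0.5, "T1"), ("log", "T2")), terms)
```

The weighted sum can only rescale terms, while the tree can combine them non-linearly; the constants play the role of the random ephemerals.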


Page 48: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Can GP improve over a weighted sum of terms?

Nelder-Mead downhill simplex optimisation

Computational cost of experiments
• 55 proteins, 1000-2000 structures for each
• 5 different ranking distance measures
• 20 different configurations of GP parameters
• total of 150 CPU days

           spearman-sigmoid      correlation
method     d-100     all         d-100     all
simplex    0.734     0.638       0.650     0.166
GP         0.835     0.714       *0.740    *0.200


Page 50: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

The Spatial Scales Involved


Page 51: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

The Cell as an Information Processing Device

LeDuc et al. Towards an in vivo biologically inspired nanofactory. Nature (2007)


Page 52: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Transcription Networks

[Figure: transcription network linking the Environment (Signal1, Signal2, Signal3, Signal4, Signal5, ... Signaln) through Transcription Factors to the Genome (Gene1, Gene2, Gene3, ... Genek)]

Page 53: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Network Motifs: Evolution’s Preferred Circuits

• Biological networks are complex and vast

• Moreover, these patterns are organised in non-trivial/non-random hierarchies

“Patterns that occur in the real network significantly more often than in randomized networks are called network motifs”

Shai S. Shen-Orr et al., Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genetics 31, 64-68 (2002)

Radu Dobrin et al., Aggregation of topological motifs in the Escherichia coli transcriptional regulatory network. BMC Bioinformatics. 2004; 5: 10.

Each network motif carries out a specific information-processing function

The C1-FFL is a ‘sign-sensitive delay’ element and a persistence detector.

The I1-FFL is a pulse generator and response accelerator
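The persistence-detector behaviour of the C1-FFL can be reproduced with a very small simulation. The model below is a textbook-style simplification and not taken from the talk: X activates Y, and Z is produced only while X is on AND Y is above a threshold. The Boolean AND gate, the unit production and decay rates, the threshold `k_yz` and the Euler time step are all assumptions.

```python
def simulate_c1_ffl(pulse_length, t_end=6.0, dt=0.01, k_yz=0.5):
    """Euler simulation of X -> Y, (X AND Y) -> Z; returns the peak level of Z."""
    y = z = 0.0
    z_max = 0.0
    for step in range(int(round(t_end / dt))):
        t = step * dt
        x = 1.0 if t < pulse_length else 0.0      # input signal: a single pulse
        dy = x - y                                # Y produced while X is on
        dz = (x if y > k_yz else 0.0) - z         # Z needs X on AND Y above k_yz
        y += dt * dy
        z += dt * dz
        z_max = max(z_max, z)
    return z_max

short_pulse = simulate_c1_ffl(pulse_length=0.5)   # Y never reaches the threshold
long_pulse = simulate_c1_ffl(pulse_length=3.0)    # Z switches on after a delay
```

A brief input pulse is filtered out entirely, while a persistent one drives Z after a delay; when X switches off, production of Z stops at once, which is the sign-sensitive part of the delay.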


Page 54: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Evolvable Executable Biology


Page 55: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Nested EA for Model Synthesis

F. Romero-Campero, H.Cao, M. Camara, and N. Krasnogor. Structure and parameter estimation for cell systems biology models. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2008), pages 331-338. ACM Publisher, 2008. Best Paper Award

H. Cao, F.J. Romero-Campero, S. Heeb, M. Camara, and N. Krasnogor. Evolving cell models for systems and synthetic biology. Systems and Synthetic Biology , 2009


Page 56: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

The Fitness Function

• Multiple time-series per target

• Different time series have very different profiles, e.g., response time or maxima occur at different times/places

• Transient states (sometimes) as important as steady states

• RMSE will mislead search

• Sometimes the time series is qualitative or microarray data

H. Cao, F.J. Romero-Campero, S. Heeb, M. Camara, and N. Krasnogor. Evolving cell models for systems and synthetic biology. Systems and Synthetic Biology , 2009


Page 57: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Problem Specification


Page 58: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Target


Page 59: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

A Signal Translator for Pattern Formation

act1Prep2

act2Prep1

rep1Pact1

rep2Pact2

rep3Prep1

rep4Prep2

I2Prep3

I1Prep4

FP2Pact2

FP1Pact1


Page 60: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Uniform Spatial Distribution of Signal Translators for Pattern Formation

E. coli DH5α ∆sdiA/∆lacI (2∆)

pACYC184, pBR322

Page 61: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Pattern Formation in synthetic bacterial colonies


Page 62: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

pAYCP (1-3), pBR322 (4-6)

2∆ DH5α

Starting OD = 10

Magnification: 100X

Page 63: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

pUC6S (1-6)

2∆ DH5α

Starting OD= 10

Magnification: 40X



Page 65: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Calculating Pi

Algorithms are Tiny

Factoring: Let n be the number to be factored.

1. Let Δ be a negative integer with Δ = -dn where d is a multiplier and Δ is the negative discriminant of some quadratic form.

2. Take the t first primes , for some .

3. Let fq be a random prime form of GΔ with .

4. Find a generating set X of GΔ

5. Collect a sequence of relations between set X and {fq : q ∈ PΔ} satisfying:

6. Construct an ambiguous form (a, b, c) which is an element f ∈ GΔ of order dividing 2 to obtain a coprime factorization of the largest odd divisor of Δ in which Δ = -4a.c or a(a - 4c) or (b - 2a).(b + 2a)

7. If the ambiguous form provides a factorization of n then stop, otherwise find another ambiguous form until the factorization of n is found. In order to prevent that useless ambiguous forms are generated, build up the 2-Sylow group S2(Δ) of G(Δ).


Page 66: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

What Evolutionary Algorithms are NOT?

They are NOT Algorithms!
➡ They do not stop, we stop them.
➡ They are not short pieces of code, but large systems
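The remark that evolutionary algorithms "do not stop, we stop them" can be sketched with a toy (1+1) evolutionary algorithm on the OneMax problem. The problem, the function names and the `budget` parameter are assumptions chosen for illustration; none of them come from the talk. The search loop has no halting condition of its own: it ends only because the caller imposes an evaluation budget.

```python
import random


def one_max(bits):
    """Toy fitness: number of ones in the bit string."""
    return sum(bits)


def one_plus_one_ea(n_bits, budget, seed=0):
    """(1+1) EA: mutate the single parent, keep the child if it is no worse."""
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n_bits)]
    best = one_max(parent)
    # The budget is imposed from outside; the search itself never "finishes".
    for _ in range(budget):
        # Flip each bit independently with probability 1/n_bits.
        child = [b ^ (rng.random() < 1.0 / n_bits) for b in parent]
        fit = one_max(child)
        if fit >= best:
            parent, best = child, fit
    return parent, best


if __name__ == "__main__":
    solution, fitness = one_plus_one_ea(n_bits=20, budget=500)
    print(fitness, solution)
```

A larger budget simply lets the same process run longer; the stopping point is a decision of the user, not a property of the method.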


Page 67: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

What are Evolutionary Algorithms?

N. Krasnogor and J.E. Smith. IEEE Transactions on Evolutionary Computation, 9(5):474-488, 2005.

Research Paradigms for Problem Solving

T.S. Kuhn. The Structure of Scientific Revolutions, 1962.

Design Patterns and Pattern Languages

C. Alexander, S. Ishikawa, M. Silverstein, M. Jacobson, I. Fiksdahl-King, S. Angel. A Pattern Language: Towns, Buildings, Construction. Oxford University Press, 1977.


Page 68: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

A Compact “Memetic” Algorithm by Merz (2003)

Invariants and Decorations


Page 69: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

A “Memetic” Particle Swarm Optimisation by Petalas et al. (2007)

Invariants and Decorations


Page 70: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

A “Memetic” Artificial Immune System by Yanga et al (2008)

Invariants and Decorations


Page 71: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

A “Memetic” Learning Classifier System by Bacardit & Krasnogor (2009)

Invariants and Decorations


Page 72: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Invariants and Decorations

• Many others based on Ant Colony Optimisation, NN, Tabu Search, SA, DE, etc.
• Key Invariants:
  – Global search mode
  – Local search mode
• Many Decorations, e.g.:
  – Crossover/Mutations (EA-based MAs)
  – Pheromone updates (ACO-based MAs)
  – Clonal selection/Hypermutations (AIS-based MAs)
  – etc
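The split between invariants and decorations can be sketched as a small skeleton in which the two invariant modes are fixed slots and the decorations are passed in as functions. This is a minimal illustration under assumed names (`memetic_search`, `crossover_mutation`, `bit_flip_hill_climb`) and an assumed bit-string encoding; it is not the pattern language from the talk or any published memetic algorithm.

```python
import random


def memetic_search(fitness, init, global_step, local_step,
                   generations, pop_size, seed=0):
    """Skeleton with two invariant modes; the step functions are the decorations."""
    rng = random.Random(seed)
    population = [init(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Invariant 1: global search mode (population-level exploration).
        offspring = global_step(population, rng)
        # Invariant 2: local search mode (individual-level refinement).
        refined = [local_step(x, fitness, rng) for x in offspring]
        # Keep the pop_size fittest individuals seen so far.
        population = sorted(population + refined, key=fitness, reverse=True)[:pop_size]
    return population[0]


def crossover_mutation(population, rng):
    """EA-style decoration: one-point crossover followed by a single bit flip."""
    children = []
    for _ in range(len(population)):
        a, b = rng.sample(population, 2)
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        i = rng.randrange(len(child))
        children.append(child[:i] + [1 - child[i]] + child[i + 1:])
    return children


def bit_flip_hill_climb(x, fitness, rng):
    """Local search decoration: one pass of first-improvement bit flips."""
    best = list(x)
    for i in rng.sample(range(len(best)), len(best)):
        trial = best[:i] + [1 - best[i]] + best[i + 1:]
        if fitness(trial) > fitness(best):
            best = trial
    return best


if __name__ == "__main__":
    result = memetic_search(
        fitness=sum,  # OneMax as a toy objective
        init=lambda rng: [rng.randint(0, 1) for _ in range(16)],
        global_step=crossover_mutation,
        local_step=bit_flip_hill_climb,
        generations=5,
        pop_size=6,
    )
    print(result)
```

Swapping `crossover_mutation` for, say, a pheromone-driven or clonal-selection step would change the decoration while leaving the global/local structure of `memetic_search` untouched.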


Page 73: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

A Pattern Language for Memetic Algorithms

Memetic Algorithms by N. Krasnogor. Handbook of Natural Computation (chapter) in Natural Computing. Springer Berlin / Heidelberg, 2009. www.cs.nott.ac.uk/~nxk/publications.html


Page 74: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

A General Trend: moving away from closed-loop optimisation towards open-ended and embodied optimisation

[Figure labels: Effort (e.g. Time, $$$, etc), Reuse, Feedback]

• Programming: solving 1 problem, single instances
• Programming: solving 1 problem, several instances; (self) adaptive
• Programming: solving a few problems, several classes of instances; (self) adaptive, self-generating
• Programming: solving multiple unrelated problems, several classes of instances; (self) adaptive, self-generating, self-engineering


Page 75: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

The Future of EAs

Software Nurseries

• Fundamental Change of Temporal Scales Rethink
• Software will be “seeded” and grown, very much like a plant or animal (including humans)
• Software will start in an “embryonic” state and develop when situated in a production environment
• What would a software “incubation” machine look like?
• What would a software “nursery” look like?


Page 76: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

[Figure: biological hierarchy (DNA/RNA, Cells, Tissue, Organs, Individual) annotated with “Potential To Develop into multiple different types of cells”, “Commitment”, “Specialised Function” and “Ultimate Solver”]


Page 77: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

[Figure labels: Software Cell (SC); Pluripotential Solver “DNA”; TSP Organ; Euclidean TSP Organ; Graphical TSP Organ; TSP Solver (Software Organism); Production Environment; Input]


Page 78: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

An Ecosystem of solvers

[Figure: each solver below is drawn as a Software Organism]
• Protein Structure Prediction Solver
• Vehicle Routing Solver
• Graph Isomorphism Solver
• SAT Solver
• Bin Packing Solver
• Graph Coloring Solver
• Network Interdiction Solver
• Quadratic Assignment Solver
• TSP Solver


Page 79: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

As, e.g., Biologists & Physicists have done through a ubiquitous, worldwide informatics infrastructure, we should be focusing on building an online, worldwide computational problem solving infrastructure


Page 80: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Outline
• Darwin’s Magic and Algorithmic Beauty
• Evolutionary Computation in the Natural Sciences
  – Self-Assembly and Scanning Probe Microscopy Optimisation
  – Structural Bioinformatics
• Systems Biology & Synthetic Biology
• On Invariants, Decorations and the Future
• Conclusions


Page 81: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Conclusions
• New types of executable structures
• In Nanotechnology
  – DNA tiles, DNA origami, etc
  – Non DNA based tiles
  – Some have very definite programmable features
  – Others require the program to be “distributed” and exploit noise and randomness
• In Synthetic Biology
  – How to orchestrate activities at multiple temporal-spatial-energetic scales?
  – How to cope with noise in the background that executes a program and in the program itself?!
  – How to cope with programs that will evolve?


Page 82: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Conclusions
• New types of benchmarks
• Structural Biology (PSP and GP4PSP)
  – Many of these problems can be modelled either as regression or as classification problems
  – Low/high number of classes
  – Balanced/unbalanced classes
  – Adjustable number of attributes
  – Ideal benchmarks !!
• Scanning Probe Microscopy:
  – Even a few dimensions are hard
  – “Chameleons” as it is sampled
• http://www.infobiotic.net/
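The benchmark properties listed for Structural Biology (regression or classification, an adjustable number of classes, balanced or unbalanced classes) can be illustrated with a short sketch that turns one continuous target into class labels in two ways. The data are synthetic and the function names are hypothetical; this is not the procedure used to build the benchmarks at the URL above.

```python
def uniform_width_labels(values, n_classes):
    """Cut the value range into equal-width bins (classes may be unbalanced)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes or 1.0  # guard against a constant target
    return [min(int((v - lo) / width), n_classes - 1) for v in values]


def equal_frequency_labels(values, n_classes):
    """Cut by rank so every class receives (almost) the same number of items."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_classes // len(values)
    return labels


if __name__ == "__main__":
    # Synthetic, skewed continuous target (the regression view of the problem).
    target = [0.1, 0.2, 0.25, 0.3, 0.35, 0.9]
    print(uniform_width_labels(target, 2))    # unbalanced two-class problem
    print(equal_frequency_labels(target, 2))  # balanced two-class problem
    print(equal_frequency_labels(target, 3))  # same data, three classes
```

Changing `n_classes` or the binning rule yields a family of related classification tasks from a single regression target, which is one way to read the "ideal benchmarks" remark.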


Page 83: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Conclusions
• The emerging trend is moving away from closed-loop optimisation towards open-ended and embodied optimisation
• Requires strong links with data mining, ALIFE and, of course, AI (beyond existing trends in constraint satisfaction) and search based software engineering (beyond current trends on testing/debugging)
• Requires on-line, computer friendly ontologies of code (e.g. the pattern language on the left), self-describing source code, protocols for autonomic code reuse, etc


Page 84: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Conclusions

Missing Components

• Learn from Physics, Chemistry & Biology the Invariants & Patterns; the Decorations are superfluous
• Evolution
• Self-Assembly & Self-Organisation
• Developmental systems
  – Depend on a core genome coding for essential functionality
  – Epigenomics canalises development
• Hierarchical control systems that modify programs, including susceptibility to horizontal gene (program libraries) transfer
• Infrastructure


Page 85: Darwin’s Magic: Evolutionary Computation in Nanoscience, Bioinformatics and Systems & Synthetic Biology

Acknowledgements

F. Romero-Campero
J. Twycross
G. Terrazas
J. Bacardit
J. Chaplin
P. Widera
A. Ali Shah
J. Blakes
E. Glaab
K. Righetti
M. Franco
D. Sannasy
L. T. Leong

CEC organisers: A.E. Smith, I. Parmee, G. Kendall, M. Schoenauer

• M. Camara, S. Heeb, P. Williams
• C. Alexander
• P. Moriarty, P. Beton
• N. Chapness, R. Wooley
• M. Holdsworth, G. Basel
• Colleagues at ASAP

