darwin’s magic: evolutionary computation in nanoscience, bioinformatics and systems &...
DESCRIPTION
In this talk I will overview ten years of research in the application of evolutionary computation ideas in the natural sciences. The talk will take us on a tour covering problems in nanoscience (e.g. controlling self-organizing systems and optimizing scanning probe microscopy), problems arising in bioinformatics (such as predicting protein structures and their features), and challenges emerging in systems and synthetic biology. Although the algorithmic solutions involved in these problems differ from each other, at their core they retain Darwin’s wonderful insights. I will conclude the talk by giving a personal view on why EC has been so successful and where, in my mind, the future lies.
TRANSCRIPT
Darwin’s Magic: Evolutionary Computation
in Nanoscience, Bioinformatics, Systems &
Synthetic Biology
Prof. Natalio Krasnogor
Automated Scheduling, Optimisation and Planning Research Group
School of Computer Science, University of Nottingham
www.cs.nott.ac.uk/~nxk
twitter.com/NKrasnogor
IEEE Congress on Evolutionary Computation - New Orleans, USA, 2011
Outline
• Darwin’s Magic and Algorithmic Beauty
• Evolutionary Computation in the Natural Sciences
– Self-Assembly and Scanning Probe Microscopy Optimisation
– Structural Bioinformatics
– Systems Biology & Synthetic Biology
• On Invariants, Decorations and the Future
• Conclusions
Darwin’s Magic
Thank you Youtube
Algorithmic Beauty
1. Inheritable Instruction Set
2. Limited Resources
3. Imperfect Replication
A Powerful Secondary Effect: Selection
An awe inspiring product:
Evolution by Natural Selection
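The three ingredients above can be sketched in a few lines of Python. This is only an illustrative toy, not code from the talk: the function name `evolve`, the bit-string genome, the parameter values and the truncation step used to model limited resources are all assumptions made for the example.

```python
import random

def evolve(fitness, genome_length=16, capacity=20, generations=40,
           copy_error_rate=0.05, seed=1):
    """Toy loop built only from the three ingredients on the slide."""
    rng = random.Random(seed)
    # 1. Inheritable instruction set: here simply a list of bits (assumption).
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(capacity)]
    best_history = []
    for _ in range(generations):
        # 3. Imperfect replication: each parent is copied with occasional bit flips.
        offspring = [[bit ^ (rng.random() < copy_error_rate) for bit in parent]
                     for parent in population]
        # 2. Limited resources: only `capacity` individuals survive.
        #    Selection follows as a side effect of who survives.
        population = sorted(population + offspring, key=fitness,
                            reverse=True)[:capacity]
        best_history.append(fitness(population[0]))
    return population, best_history

# Usage: maximise the number of ones in the genome.
population, best_history = evolve(sum)
```

Because parents compete with their own offspring for the same slots, the best fitness in this sketch can never decrease from one generation to the next.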
Evolutionary Computation in the Natural Sciences
• A Research Vision: programmable algorithmic entry to the vast world of nanoscale physical, chemical & biological systems and processes
Algorithmic and Artificial Living Matter (ALMA)
Computer Science
Embedded behavior
Information & Algorithms Complexity Robustness Tradeoffs
What does “The Logistics of Small Things” look like?
How (?) do you gain algorithmic entry into
The Spatial Scales Involved
ALMA & The Logistics of Small Things
How do you program complex nano/micro scale processes:
• through billions of tiny & simple distributed programs/processors?
• when there is no clear distinction between hardware and software?
• when the wetware is not simply a stochastic program:
• when wetware is poorly characterised and is likely to evolve, etc.
function f1(p1,p2,p3,p4){
  if (p1<p2) and (rand<0.5)
    print p3
  else
    print p4
}
function f1(p1,p2,p3,p4){
  if (p1<p2)
    RND print p3
  RND else
    RND print p4 RND
}
function f1(p1,p2,p3,p4){
  if (p1<p2)
    RND incr p3
  RND else
    RND decr p4 RND
}
Molecular Tiles & Programmable Self-Assembly
Algorithmic Self-Assembly of DNA Sierpinski Triangles. P.W.K. Rothemund, N. Papadakis, E. Winfree. PLoS Biology 2:12 (2004)
Tiles System
Supra-structure
Which is the correct input ?
• Finite size square lattice (300x300)
• Fixed T = 4
10 tiles
10 tiles
How can we automatically design a tile system that self-assembles into a target shape?
[Figure: example tiles with edges labelled by glue colours c0 … c9, and a 10 x 10 table of integer glue strengths between colours]
Glue strength matrix M
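A minimal sketch of how a glue strength matrix and a fixed threshold T could decide whether a tile sticks. The attachment rule used here (sum of glue strengths to occupied neighbours must reach T) is the usual abstract tile assembly convention and is an assumption, as are the tile encoding and the function names; the 5 x 5 matrix holds illustrative values taken from what appears to be the c0 to c4 block on the slide.

```python
# Illustrative glue strengths between colours c0..c4 (assumed block of M).
M = [
    [3, 7, 2, 8, 1],
    [7, 6, 7, 0, 1],
    [2, 7, 0, 5, 3],
    [8, 0, 5, 4, 1],
    [1, 1, 3, 1, 9],
]
T = 4  # fixed threshold from the slide

OPPOSITE = {"N": "S", "E": "W", "S": "N", "W": "E"}

def binding_strength(tile, neighbours):
    """Sum the glue strengths between `tile` and its occupied neighbours.

    tile:       dict mapping side ("N", "E", "S", "W") to a colour index
    neighbours: dict mapping side to a neighbouring tile, or None if empty
    """
    total = 0
    for side, other in neighbours.items():
        if other is not None:
            total += M[tile[side]][other[OPPOSITE[side]]]
    return total

def can_attach(tile, neighbours, threshold=T):
    """Assumed rule: the tile sticks if its total binding strength reaches T."""
    return binding_strength(tile, neighbours) >= threshold

# Usage: a candidate tile next to a single western neighbour.
candidate = {"N": 0, "E": 1, "S": 2, "W": 3}
west_tile = {"N": 4, "E": 0, "S": 4, "W": 4}
print(can_attach(candidate, {"N": None, "E": None, "S": None, "W": west_tile}))
```

An evolutionary design loop would then mutate the colours on each tile edge and score the structure that such a rule produces.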
Evolving tiles for automated self-assembly design. G. Terrazas, M. Gheorghe, G. Kendall, and N. Krasnogor. Proceedings for the 2007 IEEE Congress on Evolutionary Computation, 2007. Best paper award.
Toward minimum size self-assembled counters by P. Moisset de Espanes, A. Goel. Nat Comput (2008) 7:317–334
Tiles with deterministic assembly (Model 1)
Tiles with probabilistic assembly (Model 2)
Phenotype – Fitness Mapping
Minkowski functionals (A, P, X)
A = 12, P = 24, X = 0
A = 100, P = 40, X = 1
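The two value triples above are consistent with reading (A, P, X) as area, perimeter and Euler characteristic of a shape on a square lattice. The sketch below is an illustration under that reading, not the authors' implementation; the function name and the representation of a shape as a set of occupied cells are assumptions.

```python
def minkowski_functionals(cells):
    """Return (area, perimeter, euler) for a set of occupied unit cells (x, y)."""
    cells = set(cells)
    area = len(cells)
    perimeter = 0
    vertices = set()
    edges = set()
    for (x, y) in cells:
        # Every side that faces an empty cell contributes one unit of perimeter.
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if (x + dx, y + dy) not in cells:
                perimeter += 1
        # Collect the corners and sides of the cell for the Euler characteristic.
        corners = [(x, y), (x + 1, y), (x + 1, y + 1), (x, y + 1)]
        vertices.update(corners)
        for a, b in zip(corners, corners[1:] + corners[:1]):
            edges.add(frozenset((a, b)))
    # Euler characteristic of the closed-cell complex: V - E + F.
    euler = len(vertices) - len(edges) + area
    return area, perimeter, euler

# Usage: a 10 x 10 square, and a 4 x 4 square with a 2 x 2 hole.
square = {(x, y) for x in range(10) for y in range(10)}
ring = {(x, y) for x in range(4) for y in range(4)
        if not (x in (1, 2) and y in (1, 2))}
print(minkowski_functionals(square))
print(minkowski_functionals(ring))
```

A fitness function could then compare the functionals of an assembled structure with those of the target shape.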
Evolutionary Design Approach
Variable length individuals (Genotype)
Genotype-Phenotype Mapping
Phenotype
Randomly created Wang tiles
Bitwise mutation
Vs
One-point crossover
Population size = 100, Individuals length = [1,10], Generations = 300, Pcrossover = 0.7, PMutation = 0.1/0.05/0.01
Probabilistic Assembly +No Rotation
Probabilistic Assembly +Rotation
Deterministic Assembly +Rotation
Deterministic Assembly +No Rotation
Two-tile self-assembly Three-tile self-assembly Four-tile self-assembly Five-tile self-assembly
We calculated the equivalence classes of binding pockets defined by “bp1 R bp2 iff NAFE(bp1)=NAFE(bp2)” for the best tile set.
We observed that equivalence classes with NAFE smaller than T are highly likely to participate in the self-assembly process as these are more populous.
More “assembable” binding pockets = Generalised Secondary Structures
How Does Self-Assembly Get Programmed?
Triangular site lattice
Physical events to capture
1. Adsorption: tiles are placed on the substrate at a given rate
2. Diffusion: tiles move or rotate from one position to another allowing:
• Separation from one or more tiles
• Motion along a line of tiles
• Motion without interaction
3. Diffusion across terraces on the substrate
4. Intramolecule strength: energy between two non-functionalised porphyrins
5. Molecule-substrate strength: energy of a porphyrin to the substrate
6. Rotational strength: molecule-substrate strength for spinning
Neighbourhood size 6
Neighbourhood size 8
DNA Tiles are Too Big!
How Do You Image and Manipulate at This Scale?
D.L. Keeling et al. Phys. Rev. Lett 94, 146104 (2005)
Y. Sugimoto et al., Nature letters 446, 64 (2007).
Hla et al. Phys. Rev. Lett. 85, 2777–2780 (2000)
C60
D. M. Eigler & E. K. Schweizer, Nature 344, 524 - 526 (1990)
Even 3 Variable Problems are Difficult: Optimising a Scanning Probe Microscope
The tunnel current, $i_t$, is highly dependent on the tip-sample distance, d. This current can be maintained with a feedback loop, G, that actively controls the tip-sample distance.
$i_t \propto \exp(-2kd)$
[Figure: scanning tip above the sample surface, with the X, Y and Z axes under direct (piezo) control; parameters G, i and V]
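To make the role of the feedback loop concrete, here is a small sketch of a proportional controller that moves the tip until the tunnel current matches a setpoint. It assumes the exponential law i_t ∝ exp(−2kd) from the slide; the decay constant, the prefactor, the gain value and the control law itself are hypothetical choices for illustration, not the microscope's actual controller.

```python
import math

def tunnel_current(d, k=1.0, i0=1.0):
    """Tunnel current for tip-sample distance d (arbitrary units, assumed k and i0)."""
    return i0 * math.exp(-2.0 * k * d)

def feedback_step(d, i_set, gain, k=1.0):
    """One update of a proportional loop acting on the logarithm of the current."""
    error = math.log(tunnel_current(d, k) / i_set)
    # Too much current means the tip is too close, so the distance is increased.
    return d + gain * error / (2.0 * k)

def settle(d, i_set, gain, steps=50):
    """Iterate the loop; it converges for gains between 0 and 2 in this model."""
    for _ in range(steps):
        d = feedback_step(d, i_set, gain)
    return d

# Usage: start too close (d = 4) with a setpoint that corresponds to d = 5.
i_set = tunnel_current(5.0)
print(settle(4.0, i_set, gain=0.5))
```

Because the current depends exponentially on d, working with the logarithm of the current makes the loop linear in the distance error, which is why a single gain parameter is enough in this toy model.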
http://www2.fz-juelich.de/ibn/index.php?index=1021
Understanding the image
J. H. A. Hagelaar et al. PRB 78, 161405R 2008
L. Gross et al. Science 325 1110 (2009)
(Un)Stable and (Un)defined Tip States
Imaging problems, spontaneous tip changes
Two Stage Automation Process
Ex-situ
In-situ
Voltage pulsing (deliberate crash)
Fine tuning (changing scan parameters)
Automated probe microscopy via evolutionary optimisation at the atomic scale. R. Woolley, J. Sterling, A. Radocea, N. Krasnogor and P. Moriarty. Applied Physics Letters (to appear)
Cellular GA with Smart Initialisation
Stage 1: Smart Initialisation (coarsely) Conditions the Probe
Streaky Image. Executing cleaning pulse
Cloudy Image. Executing cleaning pulse
Flat Surface. Zooming in to 50nm
Flat Surface. Zooming in to 20nm
Constant Atomic resolution. Zooming in to 4nm
Poor Atomic resolution. Rescanning
Consistent fair atomic resolution. Stage 1 complete. Time elapsed: 1010.1902 (~17mins)
A deterministic approach
Stage 2: Fine adjustment with cGA
Starting image
Machine Optimised
[Figure: grid of cells, each labelled G, i, V]
Cellular GA
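A cellular GA places individuals on a grid and lets each one recombine only with its immediate neighbours. The sketch below illustrates that idea for three scan parameters; it is not the i-SPM code. The function `image_quality` is a hypothetical stand-in for a score measured on a real image, and the normalised parameter range, neighbourhood, operators and parameter values are all assumptions.

```python
import random

def image_quality(params):
    """Hypothetical stand-in for a measured image-quality score (higher is better)."""
    g, i, v = params
    return -((g - 0.3) ** 2 + (i - 0.6) ** 2 + (v - 0.5) ** 2)

def neighbours(r, c, size):
    """Von Neumann neighbourhood on a toroidal grid."""
    return [((r - 1) % size, c), ((r + 1) % size, c),
            (r, (c - 1) % size), (r, (c + 1) % size)]

def cga(size=5, generations=30, sigma=0.05, seed=0):
    rng = random.Random(seed)
    # Each cell holds one (G, i, V) triple, normalised to [0, 1] (assumption).
    grid = [[tuple(rng.random() for _ in range(3)) for _ in range(size)]
            for _ in range(size)]
    history = []
    for _ in range(generations):
        new_grid = [row[:] for row in grid]
        for r in range(size):
            for c in range(size):
                # Mate with the best neighbour only: selection is purely local.
                mate = max((grid[nr][nc] for nr, nc in neighbours(r, c, size)),
                           key=image_quality)
                # Uniform crossover followed by Gaussian mutation, clipped to [0, 1].
                child = tuple(
                    min(1.0, max(0.0, rng.choice(pair) + rng.gauss(0.0, sigma)))
                    for pair in zip(grid[r][c], mate))
                # Replace the resident only if the child is better.
                if image_quality(child) > image_quality(grid[r][c]):
                    new_grid[r][c] = child
        grid = new_grid
        history.append(max(image_quality(p) for row in grid for p in row))
    best = max((p for row in grid for p in row), key=image_quality)
    return best, history

# Usage
best, history = cga()
```

On a real instrument every call to the quality function would be an actual scan, so a small grid and few generations matter far more than in this toy setting.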
Do I really need a cGA? Would a stochastic selection be just as good?
• Standard deviation is from the ‘noise’ of the GA
• RMI average 0.12
Insets: 1 x 1 nm² (a) before cGA, (b) optimised.
• Stochastic selection of parameters, average RMI 0.01
How Does it Compare to an Expert Operator?
Microscopist   Ave. RMI   Change in RMI/min
i-SPM          0.20       7.1
Human          0.09       2.6
Primary Sequence 3D Structure
Protein Folding & Structure Prediction
Protein Structure Prediction (PSP) aims to predict the 3D structure of a protein based on its primary sequence
(perhaps disregarding the folding process)
Anfinsen’s thermodynamic hypothesis [Anfinsen 1973, Dill and Chan 1997]
Defining and Predicting Useful Features
Contact
M. Stout, J. Bacardit, J. Hirst & N. Krasnogor, Bioinformatics 2008 24(7):916-923.
M. Stout, J. Bacardit, J.D. Hirst, R.E Smith, and N. Krasnogor. Prediction of topological contacts in proteins using learning classifier systems. Journal Soft Computing - A Fusion of Foundations, Methodologies and Applications, 13(3):245-258, 2008.
Integrating Multiple Prediction Sources
1. Prediction of
– Secondary structure (using PSIPRED)
– Solvent Accessibility
– Recursive Convex Hull
– Coordination Number
2. Integration of all these predictions plus other sources of information
3. Final CM prediction (using BioHEL)
Using BioHEL
The BioHEL GBML System
• BIOinformatics-oriented Hierarchical Evolutionary Learning – BioHEL (Bacardit & Krasnogor, 2009)
• BioHEL is a rule-based evolutionary learning system that employs the Iterative Rule Learning (IRL) paradigm
– First used in EC in Venturini’s SIA system (Venturini, 1993)
– Widely used for both fuzzy and non-fuzzy evolutionary learning
J. Bacardit, M. Stout, J.D. Hirst, K. Sastry, X. Llora, and N. Krasnogor. Automated alphabet reduction method with evolutionary algorithms for protein structure prediction. Proceedings of the 2007 Genetic and Evolutionary Computation Conference, ACM Press, 2007.
J. Bacardit, M. Stout, J.D. Hirst, A. Valencia, R.E. Smith, and N. Krasnogor. Automated alphabet reduction for protein datasets. BMC Bioinformatics, 10(6), 2009.
Bronze Medal in the 2007 “HUMIES” Awards for Human-Competitive Results Produced by Genetic and Evolutionary Computation.
How are these features predicted?
Many of these features are due to local interactions of an amino acid and its immediate neighbours.
We predict them from the closest neighbours in the chain.
[Figure: protein chain of residues Ri-5 … Ri+5, each with its feature SSi-5 … SSi+5]
Ri-1 Ri Ri+1 → SSi
Ri Ri+1 Ri+2 → SSi+1
Ri+1 Ri+2 Ri+3 → SSi+2
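The window scheme above can be sketched as a function that turns a sequence and its per-residue labels into training instances. The function name, the padding symbol for chain ends and the toy sequence are assumptions made for illustration.

```python
def sliding_windows(residues, labels, half_width=1, pad="X"):
    """Pair each residue's window of neighbours with the label of the central residue.

    half_width=1 gives the (Ri-1, Ri, Ri+1) -> SSi scheme; `pad` fills the
    positions that fall off either end of the chain (an assumed convention).
    """
    padded = [pad] * half_width + list(residues) + [pad] * half_width
    width = 2 * half_width + 1
    return [(tuple(padded[i:i + width]), labels[i]) for i in range(len(residues))]

# Usage with a toy four-residue chain and made-up feature labels.
for window, label in sliding_windows("ACDE", "HHEC"):
    print(window, "->", label)
```

Widening the window only requires a larger `half_width`; each instance then carries more attributes, which is one reason the resulting datasets grow large.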
Contact Map dataset
• The set of 2811 proteins was randomly halved
• Moreover, all proteins with more than 350 amino acids were discarded
• Still, the resulting training set contained more than 15.2 million instances and 631 attributes
• Less than 2% of those are actual contacts
• 36GB of disk space
Samples and ensembles
50 samples of 300K examples are generated from the training set with a ratio of 2:1 non-contacts/contacts
BioHEL is run 25 times for each sample
Prediction is done by a consensus of 1250 rule sets
Confidence of prediction is computed based on the votes distribution in the ensemble.
Whole training process takes about 289 CPU days (~5.5h/rule set)
[Figure: Training set, x50 Samples, x25 Rule sets, Consensus, Predictions]
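The consensus step can be illustrated with a simple majority vote in which the share of agreeing rule sets serves as the confidence. This is a sketch of the general idea only; how BioHEL's ensemble actually weighs votes is not stated on the slide, and the function name and the vote encoding are assumptions.

```python
from collections import Counter

N_SAMPLES = 50          # samples drawn from the training set
RUNS_PER_SAMPLE = 25    # BioHEL runs per sample
N_RULE_SETS = N_SAMPLES * RUNS_PER_SAMPLE

def consensus(votes):
    """Return (majority label, fraction of the ensemble that voted for it).

    Ties are resolved in favour of the label seen first (Counter ordering).
    """
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)

# Usage: 900 rule sets predict "contact" (1), 350 predict "non-contact" (0).
votes = [1] * 900 + [0] * 350
print(N_RULE_SETS, consensus(votes))

# Sanity check of the quoted cost: 289 CPU days spread over 1250 rule sets.
hours_per_rule_set = 289 * 24 / N_RULE_SETS
```

The last line gives roughly 5.5 hours per rule set, in line with the figure quoted above.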
Critical Assessment of Techniques for Protein Structure Prediction
CASP facts
• biennial competition
• started in 1994
• parallel prediction and experimental verification
• model assessment by human experts
9th edition of CASP
• 150 human groups
• 140 server groups
Contact Map prediction in CASP 7
Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209
Accuracy for groups that predicted a common subset of targets
Contact Map prediction in CASP 7
Xd results
Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209
Remarkable Prediction
L/10 prediction for target T0443-D1
67% accuracy
Ezkurdia et al. Proteins 2009; 77(Suppl 9):196-209
Contact Map prediction in CASP 9
• A larger set of proteins was employed
– Set of 3262 proteins for training all the 1D predictors
– A subset of 2413 proteins used for CM prediction
– All proteins with fewer than 250 AA
– A randomly selected 20% for larger chains
• 50 Samples of ~660000 instances were generated
• The representation remained unchanged
• 25K CPU hours were employed just to train the CM ensemble
In terms of performance
These two groups derived contact predictions from 3D models
Improving the Energy Function for Full 3D PSP
Energy landscape
• all-atom force field
• statistical potential
Search method
• random walk
• structure optimisation
Folding@home
• 8.5 peta FLOPS
• 10 000 CPU days for 10μs of folding
[Dill and Chan 1997]
P. Widera, J.M. Garibaldi, and N. Krasnogor. Evolutionary design of the energy function for protein structure prediction, Proceedings of the IEEE Congress on Evolutionary Computation 2009.
P. Widera, J. Garibaldi, and N. Krasnogor. GP challenge: evolving the energy function for protein structure prediction. Journal of Genetic Programming and Evolvable Machines, 11:61-88, 1 2010.
Gold Medal in the 2010 “HUMIES” Awards for Human-Competitive Results Produced by Genetic and Evolutionary Computation
How to find good quality models?
Correlation between energy and distance to the native structure
Requirements
• energy reflects distance
• distance reflects similarity
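One common way to quantify whether "energy reflects distance" is a rank correlation between the energies of candidate models and their distances to the native structure. The sketch below computes Spearman's coefficient in plain Python; it is offered as an illustration and is not necessarily the exact measure used in the experiments. The energy and distance values are made up.

```python
def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2.0 + 1.0
        for j in range(pos, end + 1):
            result[order[j]] = average
        pos = end + 1
    return result

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Usage with made-up decoys: lower energy goes with smaller distance.
energies = [-120.0, -100.0, -90.0, -60.0]
distances = [1.0, 2.5, 3.0, 8.0]
print(spearman(energies, distances))
```

A value close to 1 means that ranking models by energy would also rank them by closeness to the native structure, which is what a search method guided by the energy needs.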
How do the best of CASP do it?
Energy of models vs. distance to a target structure
Similarity measure
Decoys generated by
• I-TASSER [Wu et al. 2007]
• Robetta [Rohl et al. 2004]
How is the energy function designed?
Weighted sum vs. free combination of terms
Decision support local numerical approximation
GP input
• terminals: T1 … T8
• functions: add sub mul div sin cos exp log
• random ephemerals in range [0,1]
Zhang et al. 2003
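The difference between a weighted sum and a free combination of terms can be shown with a small expression-tree evaluator over the function set listed above. This is a sketch, not the GP system from the papers: the nested-tuple encoding, the protected versions of div, log and exp, and the example term values are assumptions.

```python
import math

def protected_div(a, b):
    # Common GP convention (assumed here): division by ~0 returns 1.
    return a / b if abs(b) > 1e-9 else 1.0

def protected_log(a):
    return math.log(abs(a)) if abs(a) > 1e-9 else 0.0

def protected_exp(a):
    return math.exp(min(a, 50.0))  # clamp the argument to avoid overflow

FUNCTIONS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": protected_div,
    "sin": math.sin,
    "cos": math.cos,
    "exp": protected_exp,
    "log": protected_log,
}

def evaluate(node, terms):
    """Evaluate a tree: a terminal name, an ephemeral constant, or (op, *args)."""
    if isinstance(node, str):
        return terms[node]
    if isinstance(node, (int, float)):
        return float(node)
    op, *args = node
    return FUNCTIONS[op](*(evaluate(arg, terms) for arg in args))

# Usage: the same terminals combined as a weighted sum and as a free expression.
terms = {"T1": 2.0, "T2": 3.0}
weighted_sum = ("add", ("mul", 0.5, "T1"), ("mul", 0.25, "T2"))
free_form = ("mul", "T1", ("sin", ("div", "T2", 0.0)))
print(evaluate(weighted_sum, terms), evaluate(free_form, terms))
```

A weighted sum is just one shape such a tree can take, so the GP search space contains the simplex-optimised baseline as a special case.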
Can GP improve over a weighted sum of terms?
Nelder-Mead downhill simplex optimisation
Computational cost of experiments
• 55 proteins, 1000-2000 structures for each
• 5 different ranking distance measures
• 20 different configurations of GP parameters
• total of 150 CPU days
method    spearman-sigmoid (d-100)   spearman-sigmoid (all)   correlation (d-100)   correlation (all)
simplex   0.734                      0.638                    0.650                 0.166
GP        0.835                      0.714                    *0.740                *0.200
The Cell as an Information Processing Device
LeDuc et al. Towards an in vivo biologically inspired nanofactory. Nature (2007)
Transcription Networks
Genome: Gene1 Gene2 Gene3 Genek
Transcription Factors
Environment: Signal1 Signal2 Signal3 Signal4 Signal5 ... Signaln
Network Motifs: Evolution’s Preferred Circuits
• Biological networks are complex and vast
• Moreover, these patterns are organised in non-trivial/non-random hierarchies
“Patterns that occur in the real network significantly more often than in randomized networks are called network motifs”
Shai S. Shen-Orr et al., Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genetics 31, 64 - 68 (2002)
Radu Dobrin et al., Aggregation of topological motifs in the Escherichia coli transcriptional regulatory network. BMC Bioinformatics. 2004; 5: 10.
Each network motif carries out a specific information-processing function
The C1-FFL is a ‘sign-sensitive delay’ element and a persistence detector.
The I1-FFL is a pulse generator and response accelerator
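The persistence-detecting behaviour of a coherent feed-forward loop can be reproduced with a very small simulation: X activates Y, and Z is produced only while both X and Y are present (AND logic). The model below is a toy with hypothetical rate constants, threshold and time step, integrated with a simple Euler scheme; it is meant only to illustrate the qualitative claim.

```python
def simulate_c1_ffl(x_signal, dt=0.1, production=1.0, decay=1.0, y_threshold=0.5):
    """Return the trace of Z for a C1-FFL with AND logic driven by x_signal."""
    y = z = 0.0
    z_trace = []
    for x in x_signal:
        # Y accumulates while X is present and decays otherwise.
        y += dt * (production * x - decay * y)
        # Z is produced only if X is on AND Y has crossed its threshold.
        z_on = 1.0 if (x > 0.5 and y > y_threshold) else 0.0
        z += dt * (production * z_on - decay * z)
        z_trace.append(z)
    return z_trace

def pulse(on_steps, total_steps):
    """Input X that is on for `on_steps` steps and off afterwards."""
    return [1.0] * on_steps + [0.0] * (total_steps - on_steps)

# Usage: a brief pulse is filtered out, a persistent one switches Z on.
short_response = simulate_c1_ffl(pulse(5, 100))
long_response = simulate_c1_ffl(pulse(50, 100))
print(max(short_response), max(long_response))
```

Z starts rising only after Y has had time to build up (the delay at switch-on), whereas Z production stops as soon as X disappears, which is the sign-sensitive part of the behaviour.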
Evolvable Executable Biology
Nested EA for Model Synthesis
F. Romero-Campero, H.Cao, M. Camara, and N. Krasnogor. Structure and parameter estimation for cell systems biology models. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2008), pages 331-338. ACM Publisher, 2008. Best Paper Award
H. Cao, F.J. Romero-Campero, S. Heeb, M. Camara, and N. Krasnogor. Evolving cell models for systems and synthetic biology. Systems and Synthetic Biology , 2009
The Fitness Function
• Multiple time-series per target
• Different time series have very different profiles, e.g., response time or maxima occur at different times/places
• Transient states (sometimes) as important as steady states
• RMSE will mislead search
• Sometimes the time series is qualitative or microarray data
H. Cao, F.J. Romero-Campero, S. Heeb, M. Camara, and N. Krasnogor. Evolving cell models for systems and synthetic biology. Systems and Synthetic Biology , 2009
Problem Specification
Target
A Signal Translator for Pattern Formation
act1Prep2
act2Prep1
rep1Pact1
rep2Pact2
rep3Prep1
rep4Prep2
I2Prep3
I1Prep4
FP2Pact2
FP1Pact1
Uniform Spatial Distribution of Signal Translators for Pattern Formation
E. coli DH5α ∆sdiA/∆lacI (2∆)
pACYC184
pBR322
Pattern Formation in synthetic bacterial colonies
pAYCP (1-3)
pBR322 (4-6)
2∆ DH5α
Starting OD=10
Magnification: 100X
pUC6S (1-6)
2∆ DH5α
Starting OD= 10
Magnification: 40X
Calculating Pi
Algorithms are Tiny
Factoring: Let n be the number to be factored.
1. Let Δ be a negative integer with Δ = -dn where d is a multiplier and Δ is the negative discriminant of some quadratic form.
2. Take the t first primes , for some .
3. Let fq be a random prime form of GΔ with .
4. Find a generating set X of GΔ
5. Collect a sequence of relations between set X and {fq : q ∈ PΔ} satisfying:
6. Construct an ambiguous form (a, b, c) which is an element f ∈ GΔ of order dividing 2 to obtain a coprime factorization of the largest odd divisor of Δ in which Δ = -4a.c or a(a - 4c) or (b - 2a).(b + 2a)
7. If the ambiguous form provides a factorization of n then stop, otherwise find another ambiguous form until the factorization of n is found. In order to prevent that useless ambiguous forms are generated, build up the 2-Sylow group S2(Δ) of G(Δ).
What Evolutionary Algorithms are NOT?
They are NOT Algorithms!
➡ They do not stop, we stop them.
➡ They are not short pieces of code, but large systems
What are Evolutionary Algorithms?
N. Krasnogor and J.E. Smith. IEEE Transactions on Evolutionary Computation, 9(5):474- 488, 2005.
Research Paradigms for Problem Solving
T.S. Kuhn. The Structure of Scientific Revolutions, 1962.
Design Patterns and Pattern Languages
C. Alexander, S. Ishikawa, M. Silverstein, M. Jacobson, I. Fiksdahl-King, S. Angel: A Pattern Language - Towns, Buildings, Construction. Oxford University Press (1977)
A Compact “Memetic” Algorithm by Merz (2003)
Invariants and Decorations
A “Memetic” Particle Swarm Optimisation by Petalas et al (2007)
Invariants and Decorations
A “Memetic” Artificial Immune System by Yanga et al (2008)
Invariants and Decorations
A “Memetic” Learning Classifier System by Bacardit & Krasnogor (2009)
Invariants and Decorations
• Many others based on Ant Colony Optimisation, NN, Tabu Search, SA, DE, etc.
• Key Invariants:
– Global search mode
– Local search mode
• Many Decorations, e.g.:
– Crossover/Mutations (EAs based MAs)
– Pheromones updates (ACO based MAs)
– Clonal selection/Hypermutations (AIS based MAs)
– etc
Invariants and Decorations
A Pattern Language for Memetic Algorithms
Memetic Algorithms by N. Krasnogor. Handbook of Natural Computation (chapter) in Natural Computing. Springer Berlin / Heidelberg, 2009. www.cs.nott.ac.uk/~nxk/publications.html
A General Trend: moving away from closed-loop optimisation towards open-ended and embodied optimisation
[Figure: four stages, each with its own Effort (e.g. Time, $$$, etc) axis, linked by Reuse and Feedback arrows]
1. Solving 1 problem, single instances: Programming
2. Solving 1 problem, several instances: Programming, (self) adaptive
3. Solving a few problems, several classes of instances: Programming, (self) adaptive, Self-generating
4. Solving multiple unrelated problems, several classes of instances: Programming, (self) adaptive, Self-generating, Self-Engineering
The Future of EAs
Software Nurseries
• Fundamental Change of Temporal Scales Rethink
• Software will be “seeded” and grown, very much like a plant or animal (including humans)
• Software will start in an “embryonic” state and develop when situated in a production environment
• What would a software “incubation” machine look like?
• What would a software “nursery” look like?
Cells
Organs
Tissue
Individual
DNA/RNA
Potential To Develop into multiple
different types of cells
Commitment
Specialised Function
Ultimate Solver
Software Cell (SC)
Pluripotential Solver “DNA”
TSP Organ
Euclidean TSP Organ
Graphical TSP Organ
TSP Solver Software Organism
Production Environment Input
An Ecosystem of solvers
Solver Software Organisms: Protein Structure Prediction, Vehicle Routing, Graph Isomorphism, SAT, Bin Packing, Graph Coloring, Network Interdiction, Quadratic Assignment, TSP
As, e.g., Biologists & Physicists have done through a ubiquitous, worldwide spanning informatics infrastructure, we should be focusing on building an online worldwide computational problem solving infrastructure
Conclusions
• New types of executable structures
• In Nanotechnology
– DNA tiles, DNA origami, etc
– Non DNA based tiles
– Some have very definite programmable features
– Others require the program to be “distributed” and exploit noise and randomness
• In Synthetic Biology
– How to orchestrate activities at multiple temporal-spatial-energetic scales?
– How to cope with noise in the background that executes a program and in the program itself?!
– How to cope with programs that will evolve?
Conclusions
• New types of benchmarks
• Structural Biology (PSP and GP4PSP)
– Many of these problems can be modelled either as regression or as classification problems
– Low/high number of classes
– Balanced/unbalanced classes
– Adjustable number of attributes
– Ideal benchmarks !!
• Scanning Probe Microscopy:
– Even a few dimensions are hard
– “Chameleons” as it is sampled
• http://www.infobiotic.net/
Conclusions
• The emerging trend is moving away from closed-loop optimisation towards open-ended and embodied optimisation
• Requires strong links with data mining, ALIFE and, of course, AI (beyond existing trends in constraint satisfaction), search based software engineering (beyond current trends on testing/debugging)
• Requires on-line, computer friendly ontologies of code (e.g. the pattern language on the left), self-describing source code, protocols for autonomic code reuse, etc
Conclusions
Missing Components
• Learn from Physics, Chemistry & Biology the Invariants & Patterns; the Decorations are superfluous
• Evolution
• Self-Assembly & Self-Organisation
• Developmental systems
– Depend on a core genome coding for essential functionality
– Epigenomics canalises development
• Hierarchical control systems that modify programs including susceptibility to horizontal gene (program libraries) transfer
• Infrastructure
Acknowledgements
F. Romero-Campero
J. Twycross
G. Terrazas
J. Bacardit
J. Chaplin
P. Widera
A. Ali Shah
J. Blakes
E. Glaab
K. Righetti
M. Franco
D. Sannasy
L. T. Leong
CEC organisers: A.E. Smith, I. Parmee, G. Kendall, M. Schoenauer
• M. Camara, S. Heeb, P. Williams
• C. Alexander
• P. Moriarty, P. Beton
• N. Champness, R. Woolley
• M. Holdsworth, G. Basel
• Colleagues at ASAP