Different Varieties of Genetic Programming
Je-Gun Joung
Some of the Many Different Structures Used for GP
9.1 GP with Tree Genomes
Mutation Operators Applied in Tree-based GP
Point Mutation
[Figure: an example tree before and after point mutation; a single function node changes from * to +.]
Permutation
[Figure: an example tree before and after permutation; the arguments of one node are reordered.]
Hoist
[Figure: an example tree and the subtree that hoist mutation turns into the new individual.]
Expansion Mutation
[Figure: an example tree before and after expansion mutation; a terminal is replaced by a subtree.]
Collapse Subtree Mutation
[Figure: an example tree before and after collapse subtree mutation; a subtree is replaced by a terminal.]
Subtree Mutation
[Figure: an example tree before and after subtree mutation; a subtree is replaced by another subtree.]
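Three of the mutation operators listed above can be sketched in a few lines of Python. This is a minimal sketch, not the implementation of any particular GP system: the nested-tuple tree encoding, the function set `FUNCTIONS`, the terminal set `TERMINALS`, and all function names are assumptions made for illustration.

```python
import random

# Assumed encoding: a tree is a nested tuple (op, left, right);
# a leaf is the variable 'x' or a numeric constant.
FUNCTIONS = ['+', '-', '*']   # hypothetical binary function set
TERMINALS = ['x', 1]          # hypothetical terminal set

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node in preorder."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace_at(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new),) + tree[i + 1:]

def point_mutation(tree, rng):
    """Exchange one node label for a different label of the same arity."""
    path, node = rng.choice(list(subtrees(tree)))
    if isinstance(node, tuple):
        new_op = rng.choice([f for f in FUNCTIONS if f != node[0]])
        new = (new_op,) + node[1:]
    else:
        new = rng.choice([t for t in TERMINALS if t != node])
    return replace_at(tree, path, new)

def permutation(tree, rng):
    """Swap the two arguments of one randomly chosen function node."""
    inner = [(p, n) for p, n in subtrees(tree) if isinstance(n, tuple)]
    if not inner:
        return tree
    path, (op, left, right) = rng.choice(inner)
    return replace_at(tree, path, (op, right, left))

def hoist(tree, rng):
    """Return one randomly chosen subtree as the new individual."""
    return rng.choice([node for _, node in subtrees(tree)])

def evaluate(tree, x):
    """Evaluate a tree for a given value of x."""
    if tree == 'x':
        return x
    if not isinstance(tree, tuple):
        return tree
    op, a, b = tree
    a, b = evaluate(a, x), evaluate(b, x)
    return a + b if op == '+' else a - b if op == '-' else a * b
```

Point mutation and permutation keep the number of nodes fixed, while hoist can only shrink the individual; the expansion, collapse, and subtree mutations could be built from the same `replace_at` helper.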
Crossover Operators Applied in Tree-based GP
Subtree Exchange Crossover
Self-crossover
Module Crossover
Crossover Operators Applied within Tree-based GP
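Subtree exchange crossover can be sketched as follows. The nested-tuple tree encoding and the names `nodes`, `replace_at`, `subtree_exchange`, and `size` are assumptions chosen for illustration, and the uniform choice of crossover points is one simple option rather than a prescribed one.

```python
import random

def nodes(tree, path=()):
    """Yield (path, subtree) pairs in preorder; leaves are strings or numbers."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from nodes(child, path + (i,))

def replace_at(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new),) + tree[i + 1:]

def subtree_exchange(parent_a, parent_b, rng):
    """Pick one node in each parent and swap the subtrees rooted there."""
    path_a, sub_a = rng.choice(list(nodes(parent_a)))
    path_b, sub_b = rng.choice(list(nodes(parent_b)))
    child_a = replace_at(parent_a, path_a, sub_b)
    child_b = replace_at(parent_b, path_b, sub_a)
    return child_a, child_b

def size(tree):
    """Number of nodes in a tree."""
    return sum(1 for _ in nodes(tree))
```

Because the two subtrees only trade places, the total number of nodes across both children equals that of the parents. Calling `subtree_exchange(t, t, rng)` with the same tree twice gives a simple form of self-crossover.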
9.2 GP with Linear Genomes
Linear GP acts on linear genomes, like program code represented by bit strings or code for register machines.
The influence of change in a linear structure can be expected to follow the linear order in which the instructions are executed.
A characteristic of tree-based GP is that all operators select nodes uniformly from a tree.
In linear GP, all operators select nodes uniformly from a sequence.
9.2.1 Evolutionary Program Induction with Introns
Wineberg and Oppacher [1994] have formulated an evolutionary programming method they call EPI (evolutionary program induction).
They use fixed-length strings to code their individuals and a GA-like crossover.
The code is constructed to maintain a fixed structure within the chromosome that allows similar alleles to compete against each other at a locus during
9.2.2 Developmental Genetic Programming
Developmental genetic programming (DGP) is an extension of GP by a developmental step.
In tree-based GP, the space of genotypes (search space) is usually identical to the space of phenotypes (solution space).
DGP maps binary sequences (genotypes) through a developmental process into separate phenotypes.
The Genotype-phenotype Mapping
[Figure: Genotype -> Genotype-Phenotype Mapping (GPM) -> Phenotype. The genotype lives in the unconstrained search space, the GPM implements the constraints, and the phenotype lives in the constrained solution space.]
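A toy version of such a genotype-phenotype mapping is sketched below. It is far simpler than DGP itself and only illustrates the idea: the 3-bit code table `CODE`, the repair rule, and the names `decode`, `repair`, and `develop` are all hypothetical.

```python
# Hypothetical 3-bit code table: every codon decodes to some symbol,
# so any bit string is a legal genotype (unconstrained search space).
CODE = {
    '000': 'x', '001': '1', '010': '+', '011': '-',
    '100': '*', '101': 'x', '110': '1', '111': '+',
}
OPERANDS = {'x', '1'}

def decode(genotype):
    """Translate a bit string into a raw symbol sequence, one symbol per codon."""
    usable = len(genotype) - len(genotype) % 3   # ignore trailing bits
    return [CODE[genotype[i:i + 3]] for i in range(0, usable, 3)]

def repair(symbols):
    """Assumed repair rule: keep symbols that fit the pattern
    operand (operator operand)* and drop the rest."""
    out = []
    for s in symbols:
        want_operand = len(out) % 2 == 0
        if (s in OPERANDS) == want_operand:
            out.append(s)
    if len(out) % 2 == 0:        # empty, or ends with a dangling operator
        out.append('1')
    return out

def develop(genotype):
    """Genotype -> phenotype: an executable function of x."""
    expression = ' '.join(repair(decode(genotype)))
    # eval is acceptable here only because every symbol comes from CODE.
    return eval('lambda x: ' + expression)
```

For instance, `develop('000100000')` decodes to `x * x`, while `'010000010'` decodes to the illegal sequence `+ x +`, which the repair step turns into `x + 1`.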
9.2.3 An Example: Evolution in C
Symbolic function regression
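[Equation: the target function f of the regression problem, built from sin, cos, tan and e over the variables a, m, q and v; the formula itself is garbled in the source.]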
An Example Result
Runs lasted for 50 generations at most, with a population size of 500 individuals.
In one experimental run, the genotype is
1100 0010 1000 0111 1001 0010 1101 1001 0111 1100
0000 1011 1001 1110 1001 1010 1101 0011 1100 1111
0101 1010 0110 1110 0001
The raw symbol sequence is
T*(a)*R)aE+C)E)SRDT)vSqE*
Repairing transforms this illegal sequence into
{T((a)*R(a+m)+(S(D((v+q+D
This sequence is unfinished; repairing terminates by completing the sequence into
{T((a)*R(a+m))+(S(D((v+q+D(m)))))}
Finally, editing produces
double ind(double m, double v, double a)
{return T((a)*R(a+m))+(S(D((v+q+D(m))))); }
A C compiler takes over to generate an executable that is valid on the underlying hardware platform
This executable is the final phenotype encoded by the genotype
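[Equation: the evolved function in mathematical notation, involving tan and sin over the variables a, m, q and v; the formula itself is garbled in the source.]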
9.2.4 Machine Language
A register program computing $(x-1)^2 + (x-1)^3$ (Figure 9.13):
1: x=x-1
2: y=x*x
3: x=x*y
4: y=x+y
[Figure 9.13: dataflow graph of the four-instruction program, leading from x through -1, *, * and + to y.]
The representation of $(x-1)^2 + (x-1)^3$ in a tree-based genome
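The four-instruction register program x=x-1, y=x*x, x=x*y, y=x+y can be checked with a small interpreter. This is a sketch under stated assumptions: the tuple format `(dest, src1, op, src2)`, the initial value of `y`, the choice of `y` as output register, and the names `run` and `PROGRAM` are all illustrative.

```python
import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

# Each instruction is (destination, source1, operator, source2);
# a source is a register name or an integer constant.
PROGRAM = [
    ('x', 'x', '-', 1),    # 1: x = x - 1
    ('y', 'x', '*', 'x'),  # 2: y = x * x
    ('x', 'x', '*', 'y'),  # 3: x = x * y
    ('y', 'x', '+', 'y'),  # 4: y = x + y
]

def value(registers, operand):
    """Look up a register, or return a constant unchanged."""
    return registers[operand] if isinstance(operand, str) else operand

def run(program, x):
    """Execute the instructions in order and return register y."""
    registers = {'x': x, 'y': 0}   # y starts at 0 by assumption
    for dest, a, op, b in program:
        registers[dest] = OPS[op](value(registers, a), value(registers, b))
    return registers['y']
```

After instruction 3 the register `x` holds the cube of the decremented input and `y` its square, so the final addition reuses both intermediate results. A tree genome has to repeat the subexpression x-1 several times to express the same polynomial.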
Reasons for Using Machine Code in GP, as Opposed to Higher-level Languages
The most efficient optimization can be done at the machine code level.
High-level tools might simply not be available for a target processor.
It could be more convenient to let the computer evolve small pieces of machine code programs itself rather than learning to master machine code programming.
Reasons for Using Binary Machine Code
The GP algorithm can be made very fast by having the individual programs in the population in binary machine code.
The system is also much more memory efficient than a tree-based GP system.
An additional advantage is that memory consumption is stable during evolution with no need for garbage collection.
The JB Language
0 = BLOCK (group statements)
1 = LOOP
2 = SET
3 = ZERO (clear)
4 = INCREMENT
Individual genome:
0 0 1 | 3 1 9 | 1 2 1 | 4 1 7
0 0 1 : block, stat. 1, stat. 2
3 1 9 : register1 = 0
1 2 1 : repeat stat. 1, register2
4 1 7 : register1 = register1 + 1
The GEMS System
One of the most extensive systems for evolution of machine code is the GEMS system [Crepeau, 1995].
The system includes an almost complete interpreter for the Z-80 8-bit microprocessor.
The Z-80 has 691 different instructions, and GEMS implements 660 instructions.
It has so far been used to evolve a “hello world” program consisting of 58 instructions.
The Crossover of GEMS
9.2.5 An Example: Evolution in Machine Language
9.3 GP with Graph Genomes
9.3.1 PADO
The graph-based GP system PADO (Parallel Algorithm Discovery and Orchestration) [Teller and Veloso, 1995]
Each program has a stack and an indexed memory for its own use of intermediate values and for communication.
There are also the following special nodes in a program:
Start node
Stop node
Subprogram calling nodes
Library subprogram calling nodes
The Representation of a Program and Subprogram in PADO
[Fig 9.19: a main program and a subprogram (private or public), each drawn as a graph with a START and a STOP node, together with a stack and an indexed memory.]
9.3.2 Cellular Encoding
9.4 Other Genomes
9.4.1 STROGANOFF
Iba, Sato, and deGaris [1995] have introduced a more complicated structure into the nodes of a tree that could represent a program.
They base their approach on the well-known Group Method of Data Handling (GMDH).
Understanding STructured Representation On Genetic Algorithms for Nonlinear Function Fitting (STROGANOFF) therefore starts with GMDH.
The STROGANOFF method applies GP crossover and mutation to a population of polynomial nodes.
Group Method of Data Handling (GMDH)
[Figure: a GMDH tree with polynomial nodes P1 to P4 and input variables X1 to X5 at the leaves.]
$P_j(x_1, x_2) = z_j = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1 x_2 + a_4 x_1^2 + a_5 x_2^2$
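Evaluating a tree of such polynomial nodes can be sketched as follows, assuming each node computes the quadratic form z = a0 + a1*x1 + a2*x2 + a3*x1*x2 + a4*x1^2 + a5*x2^2 of its two children. The tuple encoding `(coeffs, left, right)` and the function names are hypothetical, and the fitting of the coefficients is left out.

```python
def gmdh_node(coeffs, x1, x2):
    """Quadratic polynomial of two inputs with six coefficients a0..a5."""
    a0, a1, a2, a3, a4, a5 = coeffs
    return a0 + a1 * x1 + a2 * x2 + a3 * x1 * x2 + a4 * x1 ** 2 + a5 * x2 ** 2

def evaluate(tree, inputs):
    """Evaluate a GMDH tree bottom-up.

    A leaf is the name of an input variable; an inner node is a tuple
    (coeffs, left, right) whose output feeds the node above it.
    """
    if isinstance(tree, str):
        return inputs[tree]
    coeffs, left, right = tree
    return gmdh_node(coeffs, evaluate(left, inputs), evaluate(right, inputs))
```

Crossover and mutation then act on the tree structure, while each node keeps its own coefficient vector.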
Crossover of trees of GMDH
[Figure: two parent GMDH trees exchange subtrees to produce two offspring trees.]
Different Mutation of trees of GMDH
[Figure: panels (a) to (d) showing different mutations of a GMDH tree with nodes P1 to P4 and inputs X1 to X5.]
9.4.2 GP Using Context-Free Grammars
By the use of a context-free grammar, typing and syntax are automatically assured throughout the evolutionary process
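A context-free grammar can be considered a four-tuple $(S, \Sigma, N, P)$: a start symbol $S \in N$, a set of terminals $\Sigma$, a set of non-terminals $N$, and a set of production rules $P$.
Definition 9.2 A terminal of a context-free grammar is a symbol for which no production rule exists in the grammar.
Definition 9.3 A production rule is a substitution of the kind $X \rightarrow Y$ where $X \in N$ and $Y \in (N \cup \Sigma)^*$.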
A Grammatical Structure
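[Figure: derivation tree of an example expression, from the start symbol S through binary expressions B (with the operators +, * and -) down to terminals T, which resolve to x and 1.]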
S -> B
B -> + B B | - B B | * B B | % B B | T
T -> x | 1
S : the start symbol
B : a binary expression
T : a terminal
x and 1 : a variable and a constant
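A grammar of this kind can drive the creation of individuals directly. The sketch below reads the production rules as S -> B, B -> +BB | -BB | *BB | %BB | T and T -> x | 1, and derives random prefix expressions from them. The dictionary layout, the depth limit, and the names `derive` and `is_valid_prefix` are assumptions made for illustration.

```python
import random

# Non-terminals map to their alternative right-hand sides.
GRAMMAR = {
    'S': [['B']],
    'B': [['+', 'B', 'B'], ['-', 'B', 'B'], ['*', 'B', 'B'],
          ['%', 'B', 'B'], ['T']],
    'T': [['x'], ['1']],
}

def derive(symbol, rng, depth=0, max_depth=4):
    """Expand symbol into a list of terminals (a prefix expression)."""
    if symbol not in GRAMMAR:          # no production rule: a terminal
        return [symbol]
    rules = GRAMMAR[symbol]
    if depth >= max_depth:             # force termination: shortest rules only
        shortest = min(len(r) for r in rules)
        rules = [r for r in rules if len(r) == shortest]
    tokens = []
    for part in rng.choice(rules):
        tokens += derive(part, rng, depth + 1, max_depth)
    return tokens

def is_valid_prefix(tokens):
    """Check that tokens form exactly one complete prefix expression."""
    needed = 1
    for t in tokens:
        if needed == 0:
            return False
        needed += 1 if t in ('+', '-', '*', '%') else -1
    return needed == 0
```

Every derivation is syntactically correct by construction, which is the point of using the grammar: operators that only replace a non-terminal's expansion by another expansion of the same non-terminal cannot produce an illegal individual.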
9.4.3 Genetic Programming of L-Systems
Lindenmayer systems (also known as L-systems) [Lindenmayer, 1968][Prusinkiewicz and Lindenmayer, 1990] have been introduced independently into the area of genetic programming by different researchers [Koza, 1993][Jacob, 1994][Hemmi et al., 1994].
L-systems were invented for the purpose of modeling biological structure formation.
Rewriting all non-terminals in parallel is important in this respect.
L-systems in their simplest form (0L-systems) are context-free grammars whose production rules are applied not sequentially but simultaneously to the growing tree of non-terminals.
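Parallel rewriting is easy to state in code. The sketch below applies every production rule of a 0L-system to all symbols of the current word in a single step; the rule set `ALGAE` is Lindenmayer's classic two-symbol algae example, used here purely as an illustration, and the function names are assumptions.

```python
def rewrite(word, rules):
    """One derivation step: all symbols are replaced simultaneously.

    Symbols without a production rule are copied unchanged.
    """
    return ''.join(rules.get(symbol, symbol) for symbol in word)

def grow(axiom, rules, steps):
    """Apply the parallel rewriting step a fixed number of times."""
    word = axiom
    for _ in range(steps):
        word = rewrite(word, rules)
    return word

# Illustrative 0L-system: A -> AB, B -> A
ALGAE = {'A': 'AB', 'B': 'A'}
```

Starting from the axiom `A`, four steps give `A`, `AB`, `ABA`, `ABAAB`, `ABAABABA`. A sequential grammar would rewrite only one non-terminal per step, so the same rules would produce a different growth pattern.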
Context-free L-system Individual Encoding a Production Rule System of Lindenmayer Type
[Figure: a 0L-System individual consists of an axiom (AxiomA) and a list of rules (LRule); each LRule has a predecessor (pred) and a successor (succ).]