biologically inspired computing: evolutionary algorithms: hillclimbing, landscapes, neighbourhoods...

Post on 21-Dec-2015


Biologically Inspired Computing: Evolutionary Algorithms: Hillclimbing, Landscapes, Neighbourhoods

This is lecture 4 of ‘Biologically Inspired Computing’.

Contents: hillclimbing, local search, landscapes

Back to Basics

With your thirst for seeing example EAs temporarily quenched, the story now skips to simpler optimization algorithms.

1. Hillclimbing

2. Local Search

These have much in common with EAs, but with no population – they just use a single solution and keep trying to improve it.

HC and LS can be very effective algorithms when appropriately engineered. But by looking at them, we will discover certain limitations, and this leads us directly to algorithm design strategies that look like EAs. That is, we can ‘arrive’ at EAs from an algorithm design route, not just a ‘bio-inspiration’ route.

The Travelling Salesperson Problem (also see http://francisshanahan.com/tsa/tsaGAworkers.htm)

An example (hard) problem, for illustration

Find the shortest tour through the cities.

[Figure: map of the five cities A, B, C, D, E]

Distances between cities:

     A    B    C    D    E
A    -    5    7    4   15
B    5    -    3    4   10
C    7    3    -    2    7
D    4    4    2    -    9
E   15   10    7    9    -

The one below is length: 33

Simplest possible EA: Hillclimbing

0. Initialise: Generate a random solution c; evaluate its fitness, f(c). Call c the current solution.

1. Mutate a copy of the current solution – call the mutant m. Evaluate the fitness of m, f(m).

2. If f(m) is no worse than f(c), then replace c with m, otherwise do nothing (effectively discarding m).

3. If a termination condition has been reached, stop. Otherwise, go to 1.

Note. No population (well, population of 1). This is a very simple version of an EA, although it has been around for much longer.
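The steps above can be sketched in Python as follows. This is a minimal illustration rather than the lecture's own code: it assumes higher fitness is better, uses a fixed iteration budget as the termination condition, and takes the initial solution as an argument instead of generating it randomly. The toy landscape (`toy_fitness`, `toy_mutate`) is hypothetical.

```python
import random


def hillclimb(initial, fitness, mutate, iterations=1000, seed=0):
    """Hillclimbing: keep one current solution; accept mutants that are no worse."""
    rng = random.Random(seed)
    current = initial
    current_fitness = fitness(current)
    for _ in range(iterations):            # termination: fixed budget (an assumption)
        mutant = mutate(current, rng)       # step 1: mutate a copy of current
        mutant_fitness = fitness(mutant)
        if mutant_fitness >= current_fitness:   # step 2: no worse, so replace current
            current, current_fitness = mutant, mutant_fitness
    return current


# Hypothetical toy landscape: integers 0..17 with a single peak at x = 12.
def toy_fitness(x):
    return -(x - 12) ** 2


def toy_mutate(x, rng):
    # Move one step left or right, staying inside 0..17.
    return min(17, max(0, x + rng.choice([-1, 1])))


if __name__ == "__main__":
    print(hillclimb(0, toy_fitness, toy_mutate, iterations=2000))
```

Because a mutant is only ever accepted when it is no worse, the fitness of the current solution never decreases.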

Why “Hillclimbing”?

Suppose that solutions are lined up along the x axis, and that mutation always gives you a nearby solution. Fitness is on the y axis; this is a landscape.

[Figure: a 1D fitness landscape with points numbered 1 to 10 marking successive steps of the hillclimber]

1. Initial solution; 2. rejected mutant; 3. new current solution; 4. new current solution; 5. new current solution; 6. new current solution; 7. rejected mutant; 8. rejected mutant; 9. new current solution; 10. rejected mutant; …

Example: HC on the TSP

We can encode a candidate solution to the TSP as a permutation


Here is our initial random solution, ACEDB, with fitness 32. (Fitness here is the length of the closed tour, so lower is better.)
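As a check on the arithmetic, the tour length of a permutation can be computed directly from the distance table. The sketch below is illustrative only; the names `CITIES`, `ROWS`, `DIST` and `tour_length` are not from the slides.

```python
CITIES = "ABCDE"
# Symmetric distance table from the slides, one row per city in the order A..E.
ROWS = [
    [0, 5, 7, 4, 15],
    [5, 0, 3, 4, 10],
    [7, 3, 0, 2, 7],
    [4, 4, 2, 0, 9],
    [15, 10, 7, 9, 0],
]
DIST = {(a, b): ROWS[i][j] for i, a in enumerate(CITIES) for j, b in enumerate(CITIES)}


def tour_length(tour):
    """Length of the closed tour: consecutive legs plus the leg back to the start."""
    return sum(DIST[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))


if __name__ == "__main__":
    print(tour_length("ACEDB"))  # 7 + 7 + 9 + 4 + 5 = 32
```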

We are going to mutate it – first make Mutant a copy of Current.

We randomly mutate it (swap randomly chosen adjacent nodes) from ACEDB to ACDEB, which has fitness 33, so Current stays the same because we reject this mutant.

We now try another mutation of Current (swap randomly chosen adjacent nodes), from ACEDB to CAEDB. Fitness is 38, so we reject that too.

Our next mutant of Current is from ACEDB to AECDB. Fitness is 33, so we reject this too.

Our next mutant of Current is from ACEDB to ACDEB. Fitness is 33, so we reject this too.

Our next mutant of Current is from ACEDB to ACEBD. Fitness is 32. This is equal to Current, so it becomes the new Current.


ACEBD is our Current solution, with fitness 32. We mutate it to DCEBA (note that the first and last nodes are adjacent); fitness is 28, so this becomes our new Current solution.


Our new Current is DCEBA, with fitness 28. We mutate it, this time getting DCEAB, with fitness 33, so we reject that and DCEBA is still our Current solution.

And so on …
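The walkthrough above can be reproduced with a short sketch. It assumes the same adjacent-swap mutation (with the first and last cities counted as adjacent), treats shorter tours as better, and stops after a fixed number of iterations; the function names and the iteration budget are illustrative choices, not part of the slides.

```python
import random

CITIES = "ABCDE"
# Symmetric distance table from the slides, one row per city in the order A..E.
ROWS = [
    [0, 5, 7, 4, 15],
    [5, 0, 3, 4, 10],
    [7, 3, 0, 2, 7],
    [4, 4, 2, 0, 9],
    [15, 10, 7, 9, 0],
]
DIST = {(a, b): ROWS[i][j] for i, a in enumerate(CITIES) for j, b in enumerate(CITIES)}


def tour_length(tour):
    """Length of the closed tour, including the leg back to the start."""
    return sum(DIST[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def swap_adjacent(tour, rng):
    """Swap a randomly chosen adjacent pair; the last and first cities count as adjacent."""
    i = rng.randrange(len(tour))
    j = (i + 1) % len(tour)
    cities = list(tour)
    cities[i], cities[j] = cities[j], cities[i]
    return "".join(cities)


def hillclimb_tsp(start, iterations=200, seed=0):
    rng = random.Random(seed)
    current = start
    for _ in range(iterations):
        mutant = swap_adjacent(current, rng)
        if tour_length(mutant) <= tour_length(current):  # shorter or equal: accept
            current = mutant
    return current


if __name__ == "__main__":
    best = hillclimb_tsp("ACEDB")
    print(best, tour_length(best))
```

Which tour the run ends on depends on the random seed, but it can never be longer than the starting tour.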

HC again, on a 1D ‘landscape’

[Figure sequence: a 1D landscape with FITNESS on the vertical axis and x, the ‘genotype’, on the horizontal axis. Successive frames show: initial random point; mutant; current solution (unchanged); next mutant; new current solution; next mutant; new current solution; etc.]

How will HC do on this landscape?

Some other landscapes

Landscapes

Recall S, the search space, and f(s), the fitness of a candidate in S.

[Figure: f(s) on the vertical axis, with the members of S lined up along the horizontal axis]

The structure we get by imposing f(s) on S is called a landscape.

Neighbourhoods

Recall S, the search space, and f(s), the fitness of a candidate in S.

[Figures: f(s) plotted over candidate solutions 0 to 17, showing neighbourhoods under four different mutation operators]

• The mutation operator adds a random number between −1 and 1 (the neighbourhoods of two candidate solutions are shown).
• The mutation operator adds a random integer between −2 and 2.
• The mutation operator simply changes the solution to a new random number between 0 and 20.
• The mutation operator adds a Gaussian(ish) random number with mean zero.

Neighbourhoods

Let s be an individual in S, f(s) our fitness function, and M our mutation operator, so that M(s1) → s2, where s2 is a mutant of s1.

Given M, we can usually work out the neighbourhood of an individual point s – the neighbourhood of s is the set of all possible mutants of s.

E.g. Encoding: permutations of k objects (e.g. for k-city TSP). Mutation: swap any adjacent pair of objects (the first and last count as adjacent). Neighbourhood: each individual has k neighbours. E.g. the neighbours of EABDC are: {AEBDC, EBADC, EADBC, EABCD, CABDE}

E.g. Encoding: binary strings of length L (e.g. L-item 2-bin-packing). Mutation: choose a bit randomly and flip it. Neighbourhood: each individual has L neighbours. E.g. the neighbours of 00110 are: {10110, 01110, 00010, 00100, 00111}
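Both neighbourhoods can be enumerated mechanically. The following sketch (function names are illustrative) lists every mutant reachable in one step under each operator and reproduces the two example sets.

```python
def swap_neighbours(perm):
    """All permutations reachable by swapping one adjacent pair (wrapping around)."""
    result = []
    for i in range(len(perm)):
        j = (i + 1) % len(perm)
        items = list(perm)
        items[i], items[j] = items[j], items[i]
        result.append("".join(items))
    return result


def bitflip_neighbours(bits):
    """All binary strings reachable by flipping exactly one bit."""
    return [
        bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]
        for i in range(len(bits))
    ]


if __name__ == "__main__":
    print(swap_neighbours("EABDC"))
    print(bitflip_neighbours("00110"))
```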

Landscape Topology

Mutation operators lead to slight changes in the solution, which tend to lead to slight changes in fitness.

Why are “big mutations” generally a bad idea to have in a search algorithm?

Typical Landscapes

[Figure: f(s) on the vertical axis, with the members of S lined up along the horizontal axis]

Typically, with large (realistic) problems, the huge majority of the landscape has very poor fitness – there are tiny areas where the decent solutions lurk. So, big random changes are very likely to take us outside the nice areas.

Typical Classes of Landscapes

As we home in on the good areas, we can identify broad types of landscape feature. Most landscapes of interest are predominantly multimodal. Despite being locally smooth, they are globally rugged.

[Figure panels: Unimodal; Plateau; Multimodal; Deceptive]

Beyond Hillclimbing

HC clearly has problems with typical landscapes:

There are two broad ways to improve HC, from the algorithm viewpoint:

1. Allow downhill moves – a family of methods called Local Search does this in various ways.

2. Have a population – so that different regions can be explored inherently in parallel – i.e. we keep ‘poor’ solutions around and give them a chance to ‘develop’.

Local Search

Initialise: Generate a random solution c; evaluate its fitness, f(c) = b; call c the current solution, and call b the best so far.

Repeat until a termination condition is reached:
1. Search the neighbourhood of c, and choose one neighbour, m. Evaluate the fitness of m; call that x.
2. According to some policy, maybe replace c with m, and update c and b as appropriate.

E.g. Monte Carlo search:
1. Same as hillclimbing.
2. If x is better, accept m as the new current solution; if x is worse, accept it with some probability (e.g. 0.1).

E.g. tabu search:
1. Evaluate all immediate neighbours of c.
2. Choose the best from (1) to be the next current solution, unless it is ‘tabu’ (recently visited), in which case choose the next best, etc.
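The two policies can be sketched on a small example. Everything specific here is an assumption made for illustration: the 1D `LANDSCAPE` (higher is better, with local peaks at x = 2 and x = 7 and the global peak at x = 14), the ±1 neighbourhood, the tabu list length of 3, and the choice to accept equal-fitness moves in the Monte Carlo variant.

```python
import random
from collections import deque

# Hypothetical 1D landscape: local peaks at x = 2 and x = 7, global peak at x = 14.
LANDSCAPE = [1, 2, 3, 2, 1, 2, 4, 6, 5, 3, 2, 3, 5, 8, 9, 7, 4, 2]


def fitness(x):
    return LANDSCAPE[x]


def neighbours(x):
    return [n for n in (x - 1, x + 1) if 0 <= n < len(LANDSCAPE)]


def monte_carlo_search(start, steps=500, p_accept_worse=0.1, seed=0):
    rng = random.Random(seed)
    current = best = start
    for _ in range(steps):
        m = rng.choice(neighbours(current))
        # Accept if no worse; otherwise accept anyway with a small probability.
        if fitness(m) >= fitness(current) or rng.random() < p_accept_worse:
            current = m
        if fitness(current) > fitness(best):
            best = current          # remember the best so far
    return best


def tabu_search(start, steps=20, tabu_size=3):
    current = best = start
    tabu = deque([start], maxlen=tabu_size)   # recently visited solutions
    for _ in range(steps):
        candidates = [n for n in neighbours(current) if n not in tabu]
        if not candidates:
            break                    # every neighbour is tabu
        current = max(candidates, key=fitness)  # best non-tabu neighbour, even if worse
        tabu.append(current)
        if fitness(current) > fitness(best):
            best = current
    return best


if __name__ == "__main__":
    print(monte_carlo_search(0), tabu_search(0))
```

On this landscape a plain hillclimber started at x = 0 would stop at the local peak x = 2, whereas the tabu rule forces the search to keep moving and it passes through the global peak.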

Population-Based Search

• Local search is fine, but tends to get stuck in local optima; less so than HC, but it still gets stuck.
• In PBS, we no longer have a single ‘current solution’; we now have a population of them. This leads directly to the two main algorithmic differences between PBS and LS:
  – Which of the set of current solutions do we mutate? We need a selection method.
  – With more than one solution available, we needn’t just mutate; we can [recombine, crossover, etc …] two or more current solutions.
• So this is an alternative route towards motivating our nature-inspired EAs – and also starts to explain why they turn out to be so good.
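A minimal sketch of the two differences, selection and recombination, is given below. It is one possible design among many and not the lecture's own algorithm: it assumes binary-string solutions, a toy "count the ones" fitness (`onemax`), binary tournament selection, one-point crossover, single bit-flip mutation, and replacement of the worst population member.

```python
import random


def onemax(bits):
    """Toy fitness (an assumption): the number of 1s in the string."""
    return sum(bits)


def tournament(pop, fitness, rng, k=2):
    """Selection: pick k members at random and return the fittest of them."""
    return max(rng.sample(pop, k), key=fitness)


def crossover(a, b, rng):
    """One-point crossover: the head of parent a joined to the tail of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(bits, rng):
    """Flip one randomly chosen bit in a copy of the parent."""
    child = list(bits)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]
    return child


def evolve(fitness, length=20, pop_size=10, generations=200, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        parent_a = tournament(pop, fitness, rng)
        parent_b = tournament(pop, fitness, rng)
        child = mutate(crossover(parent_a, parent_b, rng), rng)
        worst = min(range(pop_size), key=lambda i: fitness(pop[i]))
        if fitness(child) >= fitness(pop[worst]):   # child replaces the worst member
            pop[worst] = child
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = evolve(onemax)
    print(best, onemax(best))
```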

An extra bit about Encodings

Direct vs Indirect

An aside …

Encoding a human

acgtctcacgcttc

acgtctctcactcgagctactactccattctcctaggccgcgagcgcgagcgcggttatctagctccttagccaatctctagtagtctcagcgctttactactcacgcaagagctcagcggcattataaattctatctcatctcattagagccgaacgagccggatactcgatcgatcgacttcgaccgacgcgggaggcttcatagcatcgatcg

The nucleus contains our DNA; this is a very big molecule resembling a long chain; actually it is cut into bits called chromosomes, but for our purposes we can usually think of it as one big molecule, and in fact we can think of it as a long string of letters from the set {A, C, G, T}; in humans, about 3,000,000,000 letters.

How the encoding works: Genes

Certain sections of DNA are genes. In most organisms, they are few and far between (e.g. 3% of the genome), so the distance above is misleading, although they often come in clusters close together.

CATCGGCTTATCTAGCTAATCGAGCTCTCTGAAGAGAAATATCATCTACGACTACTACGACACACATCGACGAGGCATC

You can think of a cell as a protein factory. Each gene is a blueprint for a protein, which gets manufactured in the cell, and then goes and does some job elsewhere in the body, or maybe in the same cell. A gene specifies how to make a specific protein, using the materials typically found inside the cell (amino acids).

From Genes to Proteins

This is what seems to happen:

ACTCGCGATCGAGCTACGAGACTCATGCAGCTATGAC

First, owing to something about the shape and chemical properties of the DNA molecule near the start of the gene, a specific collection of proteins in the cell are attracted to this region.

In turn, this collection attracts a big thing (actually a collection of proteins itself) called RNA polymerase.

The RNA polymerase then travels along the gene, and on the way (via complex and elegant biochemical gymnastics) it makes an RNA transcript of the gene. This is a kind of ‘negative’ of the gene, carrying all of the information in it. The RNA polymerase falls off at the end of the gene, owing to something about the pattern of nucleotides it finds there. RNA is similar to DNA, but uses ‘U’ in place of T.

RNA: CUAGCUCGA

The RNA transcript then finds its way out of the nucleus, and is attracted to a ribosome (which is, you guessed it, made of a collection of proteins, together with some RNA).

RNA transcript at the ribosome: CUAGCUCGAUGCUCUGAGUACGUC

The ribosome then manufactures a protein by, put most simply and rather misleadingly, attaching amino acids to the RNA transcript. There are 20 different amino acids floating about in the cell, and which are attached where depends on 3-letter sequences of RNA called codons.

Note: If your diet lacks the essential amino acids (the ones your body cannot make for itself), they aren’t around enough in your cells to make certain key proteins.

RNA transcript: CUAGCUCGAUGCUCUGAGUACGUCUAG

The ribosome ‘translates’ each 3-letter codon into a specific amino acid; this mapping is called the genetic code.

Amino acid sequence: L A R C S E Y V [stop]
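The two steps of this indirect encoding, transcription and translation, can be sketched for the slide's example. This is a deliberate simplification: it ignores strand direction and all the biochemical machinery, and `CODON_TABLE` contains only the nine codons that occur in the example, not the full 64-entry genetic code.

```python
# Complement used when copying DNA into RNA (RNA uses U where DNA would use T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# Only the codons needed for the slide's example; the full genetic code has 64 entries.
CODON_TABLE = {
    "CUA": "L", "GCU": "A", "CGA": "R", "UGC": "C",
    "UCU": "S", "GAG": "E", "UAC": "Y", "GUC": "V",
    "UAG": None,  # stop codon
}


def transcribe(dna):
    """Make the RNA 'negative' of a DNA string, base by base."""
    return "".join(DNA_TO_RNA[base] for base in dna)


def translate(rna):
    """Read the RNA three letters at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino = CODON_TABLE[rna[i:i + 3]]
        if amino is None:
            break
        protein.append(amino)
    return "".join(protein)


if __name__ == "__main__":
    print(transcribe("GATCGAGCTACGAGACTCATGCAG"))
    print(translate("CUAGCUCGAUGCUCUGAGUACGUCUAG"))
```

The DNA fragment passed to `transcribe` is the stretch of the slide's gene whose complement gives the transcript shown on the slide.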

Protein Structure

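A protein has a backbone of carbon and nitrogen atoms, with an amino acid ‘residue’ attached to every second carbon along it.

The sequence of amino acids determines the structure of the protein.

[Figure: the amino acid sequence L A R C S E Y V and the start of the corresponding chain, a backbone of C and N atoms (with attached H and O atoms) carrying the residues L, A, R, C]

Protein Structure II

[Figure: the backbone, with the sequence of amino acid residues strung along it, etc…]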

This long chain (e.g. around 300-2000 atoms), with amino acids strung along it like beads, rapidly folds into a 3D structure, governed by the chemical and electrostatic environment. The resulting structure is a protein.

Notice how the information contained in the gene is now reflected in the sequence of amino acids, which in turn determines the 3D structure and properties of the protein.

These proteins, and a few other things, then interact in complex ways, and build you.

That was an example of an indirect encoding

Some common terminology is: genotype – the data structure (e.g. a vector of real numbers), and phenotype – the solution interpreted from the genotype (e.g. a design for a two-phase jet nozzle).

The genotype → phenotype mapping is an important part of the encoding. Sometimes it is direct (i.e. simple), sometimes it is indirect (involving many complex steps).

Encoding / Representation

Maybe the main issue in (applying) EC. Note that:
• Given an optimisation problem to solve, we need to find a way of encoding candidate solutions.
• There can be many very different encodings for the same problem.
• Each way affects the shape of the landscape and the choice of best strategy for climbing that landscape.

E.g. encoding a timetable I

Genotype: 4, 5, 13, 1, 1, 7, 13, 2

• Generate any string of 8 numbers between 1 and 16, and we have a timetable!
• Fitness may be <clashes> + <consecs> + etc …
• Figure out an encoding, and a fitness function, and you can try to evolve solutions.

         mon       tue       wed       thur
9:00     E4, E5    E2                  E3, E7
11:00    E8
2:00               E6
4:00     E1

Exam1 is in the 4th slot, Exam2 in the 5th slot, etc … (slots are numbered down each day's column, so slot 4 is Monday 4:00 and slot 5 is Tuesday 9:00).

Mutating a Timetable with Encoding 1

Using straightforward single-gene mutation: choose a random gene (here the third, which places E3) and change its value.

Before: 4, 5, 13, 1, 1, 7, 13, 2
After:  4, 5, 6, 1, 1, 7, 13, 2

         mon       tue       wed       thur
9:00     E4, E5    E2                  E7
11:00    E8        E3
2:00               E6
4:00     E1

One mutation changes the position of one exam.
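A sketch of this direct encoding follows. The slot numbering (down each day's column, four times per day) is inferred from the timetable shown; the function names and the dictionary layout are illustrative choices.

```python
import random

DAYS = ["mon", "tue", "wed", "thur"]
TIMES = ["9:00", "11:00", "2:00", "4:00"]


def decode(genotype):
    """Gene i holds the slot (1..16) of exam i; return a (day, time) -> exams table."""
    timetable = {}
    for exam, slot in enumerate(genotype, start=1):
        day = DAYS[(slot - 1) // len(TIMES)]
        time = TIMES[(slot - 1) % len(TIMES)]
        timetable.setdefault((day, time), []).append(f"E{exam}")
    return timetable


def mutate(genotype, rng):
    """Single-gene mutation: give one randomly chosen exam a new random slot."""
    child = list(genotype)
    child[rng.randrange(len(child))] = rng.randint(1, 16)
    return child


if __name__ == "__main__":
    print(decode([4, 5, 13, 1, 1, 7, 13, 2]))
    print(mutate([4, 5, 13, 1, 1, 7, 13, 2], random.Random(0)))
```

Note that the randomly drawn slot may equal the old one, in which case the mutant is identical to its parent.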

Alternative ways to do it

This is called a ‘direct’ encoding. Note that:

• A random timetable is likely to have lots of clashes.

• The EA is likely (?) to spend most of its time crawling through clash-ridden areas of the search space.

• Is there a better way?

E.g. encoding a timetable II

Genotype: 4, 5, 13, 1, 1, 7, 13, 2

         mon       tue       wed       thur
9:00     E4, E5                        E3, E7
11:00    E8
2:00               E6, E2
4:00     E1

• Use the 4th clash-free slot for exam1.
• Use the 5th clash-free slot for exam2 (clashes with E4, E8).
• Use the 13th clash-free slot for exam3.
• Etc …

So, a common approach is to build an encoding around an algorithm that builds a solution:

• Don’t encode a candidate solution directly

• … instead encode parameters/features for a constructive algorithm that builds a candidate solution
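A constructive decoder of this kind might look like the sketch below. The clash data in `CLASHES` is hypothetical (the slides do not list which exams share students), exams are placed in the order exam1, exam2, …, and wrapping around when a gene exceeds the number of clash-free slots is an assumption, so the resulting timetable will not match the one pictured.

```python
# Hypothetical clash data: pairs of exams that share students (not from the slides).
CLASHES = {frozenset(p) for p in [(1, 2), (2, 4), (2, 8), (3, 6)]}
NUM_SLOTS = 16


def decode_indirect(genotype, clashes=CLASHES, num_slots=NUM_SLOTS):
    """Place exams in order; gene k means 'use the k-th clash-free slot'."""
    placed = {}  # exam number -> slot number
    for exam, k in enumerate(genotype, start=1):
        # A slot is clash-free if no already-placed clashing exam occupies it.
        free = [
            slot for slot in range(1, num_slots + 1)
            if not any(
                other_slot == slot and frozenset((exam, other)) in clashes
                for other, other_slot in placed.items()
            )
        ]
        # Wrap around if k exceeds the number of clash-free slots (an assumption).
        # With 8 exams and 16 slots, the list of free slots is never empty.
        placed[exam] = free[(k - 1) % len(free)]
    return placed


if __name__ == "__main__":
    print(decode_indirect([4, 5, 13, 1, 1, 7, 13, 2]))
```

Every genotype now decodes to a clash-free timetable, so the search no longer has to crawl through clash-ridden regions of the space.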
