
CAP6938 Neuroevolution and Artificial Embryogeny

Competitive Coevolution

Dr. Kenneth Stanley

February 20, 2006

Example: I Want to Evolve a Go Player

• Go is one of the hardest games for computers
• I am terrible at it
• There are no good Go programs either (hypothetically)
• I have no idea how to measure the fitness of a Go player
• How can I make evolution solve this problem?

Generally: Fitness May Be Difficult to Formalize

• Optimal policy in competitive domains is unknown
• Only winner and loser can be easily determined
• What can be done?

Competitive Coevolution

• Coevolution: No absolute fitness function

• Fitness depends on direct comparisons with other evolving agents

• Hope to discover solutions beyond the ability of fitness to describe

• Competition should lead to an escalating arms race
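Fitness by direct comparison can be sketched as follows. This is a minimal illustration, not code from the course: `play` is a hypothetical game function, and its toy rule (the larger number wins) is an assumption chosen only to make the example runnable.

```python
import random

def play(a, b):
    """Hypothetical game: return 1 if a beats b, else 0 (toy rule: larger value wins)."""
    return 1 if a > b else 0

def relative_fitness(population, num_opponents, rng):
    """Score each individual by its wins against a sample of the current population."""
    scores = []
    for i, candidate in enumerate(population):
        others = population[:i] + population[i + 1:]
        opponents = rng.sample(others, min(num_opponents, len(others)))
        scores.append(sum(play(candidate, opp) for opp in opponents))
    return scores

rng = random.Random(0)
population = [0.1, 0.5, 0.9]
scores = relative_fitness(population, num_opponents=2, rng=rng)
```

Note that the scores mean something only relative to this population: the same individual would score differently against a stronger or weaker set of opponents.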

The Arms Race

The Arms Race is an AI Dream

• Computer plays itself and becomes champion

• No need for human knowledge whatsoever

• In practice, progress eventually stagnates (Darwen 1996; Floreano and Nolfi 1997; Rosin and Belew 1997)

So Who Plays Against Whom?

• If evaluation is expensive, not everyone can play everyone

• Even if they could, a lot of candidates might be very poor

• If not everyone, who then is chosen as competition for each candidate?

• Need some kind of intelligent sampling

Challenges with Choosing the Right Opponents

• Red Queen Effect: Running in Circles
– A dominates B
– B dominates C
– C dominates A

• Overspecialization
– Optimizing a single skill to the neglect of all others
– Likely to happen without diverse opponents in sample

• Several other failure dynamics

Heuristic in NEAT: Utilize Species Champions

Each individual plays all the species champions and keeps a score
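A rough sketch of this heuristic is shown below. The data layout (a dict of species, each a list of genome/score pairs) and the toy `play` function are assumptions made for illustration; they are not NEAT's actual data structures.

```python
def species_champions(species):
    """species: dict of species id -> list of (genome, previous_score) pairs.
    Returns the best-scoring genome of each species."""
    return [max(members, key=lambda m: m[1])[0] for members in species.values()]

def score_against_champions(individual, champions, play):
    """Play the individual against every species champion and total the wins."""
    return sum(play(individual, champ) for champ in champions)

def play(a, b):
    """Hypothetical game: 1 if a beats b, else 0 (toy rule: larger value wins)."""
    return 1 if a > b else 0

# Toy genomes are plain numbers; scores are from a previous generation.
species = {0: [(0.2, 3), (0.4, 5)], 1: [(0.7, 1), (0.6, 4)]}
champs = species_champions(species)
score = score_against_champions(0.5, champs, play)
```

Because each species tends to hold a different kind of strategy, its champions give a small but diverse opponent sample.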

Hall of Fame (HOF) (Rosin and Belew 1997)

• Keep around a list of past champions

• Add them to the mix of opponents

• If HOF gets too big, sample from it
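The three bullets above can be sketched as a small container class. The class name, the `max_sample` parameter, and the uniform random sampling are assumptions for illustration, not details from Rosin and Belew's paper.

```python
import random

class HallOfFame:
    """Keeps past champions and hands out a bounded sample of them as opponents."""

    def __init__(self, max_sample, rng=None):
        self.members = []
        self.max_sample = max_sample
        self.rng = rng or random.Random()

    def add(self, champion):
        """Store a generation champion."""
        self.members.append(champion)

    def opponents(self):
        """Return all members, or a random subset once the hall grows too big."""
        if len(self.members) <= self.max_sample:
            return list(self.members)
        return self.rng.sample(self.members, self.max_sample)

hof = HallOfFame(max_sample=3, rng=random.Random(1))
for gen_champion in ["g0", "g1", "g2", "g3", "g4"]:
    hof.add(gen_champion)
sampled = hof.opponents()
```

The sampled champions would be added to the current-generation opponents, so a candidate that forgets how to beat an old strategy is penalized.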

More Recently: Pareto Coevolution

• Separate learners and tests
• The tests are rewarded for distinguishing learners from each other
• The learners are ranked in Pareto layers
– Each test is an objective
– If X wins against a superset of the tests that Y wins against, then X Pareto-dominates Y
– The first layer is a nondominated front
– Think of tests as objectives in a multiobjective optimization problem
• Potentially costly: All learners play all tests

De Jong, E.D. and J.B. Pollack (2004). Ideal Evaluation from Coevolution. Evolutionary Computation, Vol. 12, Issue 2, pp. 159-192. The MIT Press.

http://www.cs.uu.nl/~dejong/publications/coevec.ps



http://mitpress.mit.edu/EVCO
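The Pareto layering of learners described on this slide can be sketched as below. One assumption is made explicit: dominance is taken to mean a proper superset of tests won, so that two learners with identical results do not dominate each other. The function names and toy data are illustrative and not taken from De Jong and Pollack's paper.

```python
def dominates(wins_x, wins_y):
    """X dominates Y if X beats every test Y beats, plus at least one more."""
    return wins_x > wins_y  # proper-superset comparison on sets

def pareto_layers(wins):
    """wins: dict of learner -> set of tests it wins against.
    Returns a list of layers; the first layer is the nondominated front."""
    remaining = dict(wins)
    layers = []
    while remaining:
        front = [x for x in remaining
                 if not any(dominates(remaining[y], remaining[x])
                            for y in remaining if y != x)]
        layers.append(sorted(front))
        for x in front:
            del remaining[x]
    return layers

wins = {
    "A": {"t1", "t2", "t3"},
    "B": {"t1", "t2"},
    "C": {"t4"},
    "D": {"t1"},
}
layers = pareto_layers(wins)
```

Learner C wins only one test yet sits in the first layer, because no other learner beats `t4`. This is how the layering preserves specialists that a simple win count would discard.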

Choosing Opponents Isn’t Everything

• How can new solutions be continually created that maintain existing capabilities?

• Mutations that lead to innovations could simultaneously lead to losses

• What kind of process ensures elaboration over alteration?

Alteration vs. Elaboration

Answer: Complexification

• Fixed-length genomes limit progress

• Dominant strategies that utilize the entire genome must alter and thereby sacrifice prior functionality

• If new genes can be added, dominant strategies can be elaborated, maintaining existing capabilities
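The contrast between alteration and elaboration can be made concrete with a toy genome. This is only a sketch: the genome here is a flat list of numbers, whereas NEAT complexifies by adding nodes and connections to a network. Both mutation functions are hypothetical.

```python
import random

def alter(genome, rng):
    """Fixed-length mutation: overwrite one existing gene, changing prior function."""
    child = list(genome)
    child[rng.randrange(len(child))] = rng.uniform(-1, 1)
    return child

def elaborate(genome, rng):
    """Complexifying mutation: append a new gene and leave existing genes intact."""
    return list(genome) + [rng.uniform(-1, 1)]

rng = random.Random(0)
parent = [0.3, -0.7, 0.5]
altered = alter(parent, rng)         # same length, one gene replaced
elaborated = elaborate(parent, rng)  # one gene longer, parent genes preserved
```

In the elaborated child every parent gene survives unchanged, so whatever the parent could do is still encoded; the altered child has given up one of them.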

Test Domain: Robot Duel

• Robot with higher energy wins by colliding with opponent
• Moving costs energy
• Collecting food replenishes energy
• Complex task: When to forage/save energy, avoid/pursue?

Robot Neural Networks

Experimental Setup

• 13 complexifying runs, 15 fixed-topology runs

• 500 generations per run

• 2-population coevolution with hall of fame (Rosin & Belew 1997)

Performance is Difficult to Evaluate in Coevolution

• How can you tell if things are improving when everything is relative?
– Number of wins is relative to each generation

• No absolute measure is available

• No benchmark is comprehensive

Expensive Method: Master Tournament

(Cliff and Miller 1995; Floreano and Nolfi 1997)

• Compare all generation champions to each other
• Requires n^2 evaluations
– An accurate evaluation may involve e.g. 288 games

• Defeating more champions does not establish superiority
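A sketch of the master tournament and its cost is given below. The `compare` callback and the toy rule are assumptions. The sketch compares each unordered pair once, which is n(n-1)/2 comparisons and therefore still on the order of n^2.

```python
from itertools import combinations

def master_tournament(champions, compare):
    """compare(a, b) is a hypothetical callback: True if a is judged superior to b.
    Returns the win count per champion and the number of pairwise comparisons."""
    wins = [0] * len(champions)
    comparisons = 0
    for i, j in combinations(range(len(champions)), 2):
        comparisons += 1
        if compare(champions[i], champions[j]):
            wins[i] += 1
        elif compare(champions[j], champions[i]):
            wins[j] += 1
    return wins, comparisons

def total_games(num_champions, games_per_comparison):
    """Total games if every pair of champions is compared once."""
    return num_champions * (num_champions - 1) // 2 * games_per_comparison

def toy_compare(a, b):
    """Toy rule: the larger value is superior."""
    return a > b

wins, comparisons = master_tournament([0.2, 0.9, 0.5, 0.7], toy_compare)
cost = total_games(500, 288)  # 500 generation champions, 288 games per comparison
```

With the figures from these slides (500 generations, 288 games per comparison), a single run would need tens of millions of games, which is why the method is called expensive.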

Strict and Efficient Performance Measure: Dominance Tournament

(Stanley & Miikkulainen 2002)
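The slide names the method without spelling it out. As best recalled here from the cited paper, a generation champion becomes a new dominant strategy only if it is superior to every previous dominant strategy. The sketch below follows that reading; `is_superior` is a hypothetical callback, and the rock-paper-scissors toy is an assumption chosen to show how cycles are filtered out.

```python
def dominance_tournament(generation_champions, is_superior):
    """Return the dominant strategies in the order they were found.
    A champion is added only if it is superior to all earlier dominant strategies."""
    dominant = []
    for champ in generation_champions:
        # all() stops at the first failed comparison, so most champions are cheap to reject
        if all(is_superior(champ, d) for d in dominant):
            dominant.append(champ)
    return dominant

# Toy intransitive game: each pair is (winner, loser).
BEATS = {("paper", "rock"), ("scissors", "paper"), ("rock", "scissors")}

def is_superior(a, b):
    return (a, b) in BEATS

dominant = dominance_tournament(["rock", "paper", "scissors"], is_superior)
```

Here `scissors` beats the most recent dominant strategy, `paper`, but loses to `rock`, so it is not counted as progress. A champion is compared only against the list of dominant strategies, which is typically far shorter than the list of all generation champions.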

Result: Evolution of Complexity

• As dominance increases so does complexity on average
• Networks with strictly superior strategies are more complex

Comparing Performance

Summary of Performance Comparisons

The Superchamp

Cooperative Coevolution

• Groups attempt to work with each other instead of against each other

• But sometimes it’s not clear what’s cooperation and what’s competition

• Maybe competitive/cooperative is not the best distinction?
– Newer idea: Compositional vs. test-based

Summary

• Picking best opponents

• Maintaining and elaborating on strategies

• Measuring performance

• Different types of coevolution

• Advanced papers on coevolution:
– Ideal Evaluation from Coevolution by De Jong, E.D. and J.B. Pollack (2004)
http://www.cs.uu.nl/~dejong/publications/coevec.pdf
– Monotonic Solution Concepts in Coevolution by Ficici, Sevan G. (2005)
http://demo.cs.brandeis.edu/papers/long.html#ficici_gecco_05

Next Topic: Real-time NEAT (rtNEAT)

• Simultaneous and asynchronous evaluation

• Non-generational

• Useful in video games and simulations

• NERO: Video game with rtNEAT

Homework due 2/27/06: Working genotype to phenotype mapping. Genetic representation completed. Saving and loading of genome file I/O functions completed. Turn in summary, code, and examples demonstrating that it works.

- Shorter symposium paper: Evolving Neural Network Agents in the NERO Video Game by Kenneth O. Stanley and Risto Miikkulainen (2005)
http://nn.cs.utexas.edu/keyword?stanley:cig05
- Optional journal (longer, more detailed) paper: Real-time Neuroevolution in the NERO Video Game by Kenneth O. Stanley and Risto Miikkulainen (2005)
http://nn.cs.utexas.edu/keyword?stanley:ieeetec05
- http://nerogame.org/
- Extra coevolution papers

Project Milestones (25% of grade)

• 2/6: Initial proposal and project description
• 2/15: Domain and phenotype code and examples
• 2/27: Genes and Genotype to Phenotype mapping
• 3/8: Genetic operators all working
• 3/27: Population level and main loop working
• 4/10: Final project and presentation due (75% of grade)
