cap6938 neuroevolution and artificial embryogeny competitive coevolution dr. kenneth stanley...
TRANSCRIPT
CAP6938Neuroevolution and Artificial Embryogeny
Competitive Coevolution
Dr. Kenneth Stanley
February 20, 2006
Example:I Want to Evolve a Go Player
• Go is one of the hardest games for computers• I am terrible at it• There are no good Go programs either
(hypothetically)• I have no idea how to measure the fitness of a
Go player• How can I make evolution solve this problem?
Generally: Fitness May Be Difficult to Formalize
• Optimal policy in competitive domains unknown• Only winner and loser can be easily determined• What can be done?
Competitive Coevolution
• Coevolution: No absolute fitness function
• Fitness depends on direct comparisons with other evolving agents
• Hope to discover solutions beyond the ability of fitness to describe
• Competition should lead to an escalating arms race
The Arms Race
The Arms Race is an AI Dream
• Computer plays itself and becomes champion
• No need for human knowledge whatsoever
• In practice, progress eventually stagnates (Darwen 1996; Floreano and Nolfi 1997; Rosin and Belew 1997)
So Who Plays Against Whom?
• If evaluation is expensive, everyone can’t play everyone
• Even if they could, a lot of candidates might be very poor
• If not everyone, who then is chosen as competition for each candidate?
• Need some kind of intelligent sampling
Challenges with Choosing the Right Opponents
• Red Queen Effect: Running in Circles– A dominates B– C dominates B– A dominates B
• Overspecialization– Optimizing a single skill to the neglect of all others– Likely to happen without diverse opponents in sample
• Several other failure dynamics
Heuristic in NEAT:Utilize Species Champions
Each individual plays all the species champions and keeps a score
Hall of Fame (HOF)(Rosin and Belew 1997)
• Keep around a list of past champions
• Add them to the mix of opponents
• If HOF gets too big, sample from it
More Recently:Pareto Coevolution
• Separate learners and tests• The tests are rewarded for distinguishing
learners from each other• The learners are ranked in Pareto layers
– Each test is an objective– If X wins against a superset of tests that Y wins again,
then X Pareto-dominates Y– The first layer is a nondominated front– Think of tests as objectives in a multiobjective
optimization problem• Potentially costly: All learners play all tests
De Jong, E.D. and J.B. Pollack (2004). Ideal Evaluation from Coevolution Evolutionary Computation, Vol. 12, Issue 2, pp. 159-192, published by The MIT Press.
Choosing Opponents Isn’t Everything
• How can new solutions be continually created that maintain existing capabilities?
• Mutations that lead to innovations could simultaneously lead to losses
• What kind of process ensures elaboration over alteration?
Alteration vs. Elaboration
Answer: Complexification
• Fixed-length genomes limit progress
• Dominant strategies that utilize the entire genome must alter and thereby sacrifice prior functionality
• If new genes can be added, dominant strategies can be elaborated, maintaining existing capabilities
Test Domain: Robot Duel
• Robot with higher energy wins by colliding with opponent • Moving costs energy • Collecting food replenishes energy • Complex task: When to forage/save energy, avoid/pursue?
Robot Neural Networks
Experimental Setup
• 13 complexifying runs, 15 fixed-topology runs
• 500 generations per run
• 2-population coevolution with hall of fame (Rosin & Belew 1997)
Performance is Difficult to Evaluate in Coevolution
• How can you tell if things are improving when everything is relative?– Number of wins is relative to each generation
• No absolute measure is available
• No benchmark is comprehensive
Expensive Method: Master Tournament
(Cliff and Miller 1995; Floreano and Nolfi 1997)
• Compare all generation champions to each other• Requires n^2 evaluations
– An accurate evaluation may involve e.g. 288 games
• Defeating more champions does not establish superiority
Strict and Efficient Performance Measure: Dominance Tournament
(Stanley & Miikkulainen 2002)
Result: Evolution of Complexity
• As dominance increases so does complexity on average • Networks with strictly superior strategies are more
complex
Comparing Performance
Summary of Performance Comparisons
The Superchamp
Cooperative Coevolution
• Groups attempt to work with each other instead of against each other
• But sometimes it’s not clear what’s cooperation and what’s competition
• Maybe competitive/cooperative is not the best distinction?– Newer idea: Compositional vs. test-based
Summary
• Picking best opponents
• Maintaining and elaborating on strategies
• Measuring performance
• Different types of coevolution
• Advanced papers on coevolution:Ideal Evaluation from Coevolution by De Jong, E.D. and J.B. Pollack (2004)Monotonic Solution Concepts in Coevolution by Ficici, Sevan G. (2005)
Next Topic: Real-time NEAT (rtNEAT)
• Simultaneous and asynchronous evaluation
• Non-generational
• Useful in video games and simulations
• NERO: Video game with rtNEAT
Homework due 2/27/06: Working genotype to phenotype mapping. Genetic representation completed. Saving and loading of genome file I/O functions completed. Turn in summary, code, and examples demonstrating that it works.
-Shorter symposium paper: Evolving Neural Network Agents in the NERO Video Game by Kenneth O. Stanley and Risto Miikkulainen (2005)-Optional journal (longer, more detailed) paper: Real-time Neuroevolution in the NERO Video Game by Kenneth O. Stanley and Risto Miikkulainen (2005) -http://Nerogame.org-Extra coevolution papers
Project Milestones (25% of grade)
• 2/6: Initial proposal and project description• 2/15: Domain and phenotype code and examples• 2/27: Genes and Genotype to Phenotype mapping • 3/8: Genetic operators all working• 3/27: Population level and main loop working• 4/10: Final project and presentation due (75% of grade)