
Artificial Intelligence for Games

Online and local search

Patrick Olivier

p.l.olivier@ncl.ac.uk

Local search

• In many optimisation problems we are interested in the solution rather than the path to it.
• Previously the state space consisted of paths to the goal.
• Now the state space consists of "complete" configurations.
• The approach is to find one or more configurations satisfying the constraints, then use a local search algorithm to try to improve them.
• Local search algorithms therefore need only keep a single "current" state rather than a current path, making them very memory efficient even on very large problems.

Example problems

• Scheduling

• Layout

• Evolution

• Travelling salesman

• N-queens

Example – N-queens

• Given an n×n chess board, place n queens so that no queen can take another in a single chess move.

• i.e. only a single queen in any row, column, or diagonal.

• (Figure on the original slide: a 4-queens problem, n = 4.)

Objective function

• Local search problems need an objective function.

• This may be the nearness to goal or simply the negative of the “cost” of a given state.

• A high objective function value denotes a good solution.

Example objective functions

• Scheduling: negative time to complete, time spent working vs. wasted time.

• Layout: number of items/amount of space

• Evolution: reproductive success of a species

• Travelling salesman: negative number of cities visited twice or more

• N-queens: number of non-attacking pairs of queens (computed in the sketch below)
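
As a concrete illustration of the last objective, here is a minimal sketch (not from the slides) of the N-queens objective as the number of non-attacking pairs of queens. The representation and the function name n_queens_objective are assumptions for illustration: a state is a tuple rows where rows[col] is the row of the queen in column col.

    # Minimal sketch (not from the slides): N-queens objective as the number of
    # non-attacking pairs of queens. A state is a tuple `rows`, where rows[col]
    # is the row of the queen placed in column col (one queen per column).
    from itertools import combinations

    def n_queens_objective(rows):
        """Count pairs of queens that do NOT attack each other (higher is better)."""
        non_attacking = 0
        for (c1, r1), (c2, r2) in combinations(enumerate(rows), 2):
            same_row = r1 == r2
            same_diagonal = abs(r1 - r2) == abs(c1 - c2)
            if not (same_row or same_diagonal):
                non_attacking += 1
        return non_attacking

    # For n = 8 the maximum is C(8, 2) = 28, reached only by a solution.
    print(n_queens_objective((0, 4, 7, 5, 2, 6, 1, 3)))  # prints 28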

State space landscape


Hill-climbing search

• "Climbing Everest in thick fog with amnesia."
• Repeatedly "climb" to the neighbouring state with the highest objective function value until no neighbour has a higher value (a minimal sketch follows below).
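
A minimal hill-climbing sketch, assuming the problem supplies neighbours(state) and objective(state) (hypothetical helper names, not defined on the slides):

    # Minimal hill-climbing sketch. `neighbours` and `objective` are assumed to
    # be supplied by the problem (hypothetical names, not from the slides).
    def hill_climb(start, neighbours, objective):
        current = start
        while True:
            candidates = neighbours(current)
            if not candidates:
                return current
            best = max(candidates, key=objective)
            # Stop when no neighbour improves on the current state.
            if objective(best) <= objective(current):
                return current
            current = best

Note that only the single current state is kept, which is the memory advantage pointed out earlier.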

Local maxima/minima

• Problem: depending on the initial state, hill climbing can get stuck in local maxima/minima.
• 8-queens figures on the original slide: one state with objective 1/(1 + H(n)) = 1/17, and a local minimum with 1/(1 + H(n)) = 1/2.

Local beam search

• Keep track of k states rather than just one.
• Start with k randomly generated states.
• At each iteration, all the successors of all k states are generated.
• If any one is a goal state, stop; else select the k best successors from the complete list and repeat (sketched below).
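
A minimal local beam search sketch, assuming problem-supplied helpers random_state(), successors(state), is_goal(state) and objective(state) (hypothetical names, not from the slides):

    # Minimal local beam search sketch with hypothetical problem helpers.
    def local_beam_search(k, random_state, successors, is_goal, objective,
                          max_iterations=1000):
        # Start with k randomly generated states.
        states = [random_state() for _ in range(k)]
        for _ in range(max_iterations):
            # Generate all successors of all k states.
            pool = [s for state in states for s in successors(state)]
            if not pool:
                break
            # If any successor is a goal state, stop.
            for s in pool:
                if is_goal(s):
                    return s
            # Otherwise keep the k best successors and repeat.
            states = sorted(pool, key=objective, reverse=True)[:k]
        return max(states, key=objective)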

Simulated annealing search

• Idea: escape local maxima by allowing some "bad" moves but gradually decreasing their frequency and range (applications: VLSI layout, scheduling). A minimal sketch follows below.
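
A minimal simulated annealing sketch, assuming problem-supplied helpers random_neighbour(state) and objective(state) and an illustrative geometric cooling schedule (all hypothetical, not from the slides):

    # Minimal simulated annealing sketch with hypothetical problem helpers.
    import math
    import random

    def simulated_annealing(start, random_neighbour, objective,
                            t_start=1.0, t_end=0.001, cooling=0.99):
        current = start
        t = t_start
        while t > t_end:
            candidate = random_neighbour(current)
            delta = objective(candidate) - objective(current)
            # Always accept improving moves; accept "bad" moves with a
            # probability exp(delta / t) that shrinks as the temperature drops.
            if delta > 0 or random.random() < math.exp(delta / t):
                current = candidate
            t *= cooling
        return current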


Simulated annealing example

• Point-feature labelling

Genetic algorithm search

• Population of k randomly generated states.
• A state is represented as a string over a finite alphabet (often a string of 0s and 1s).
• Evaluation function (fitness function): higher values for better states.
• Produce the next generation of k states by selection, crossover, and mutation.
• Rates for each of these configure the search: elitism / crossover rate / mutation rate (a minimal sketch follows below).
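
A minimal genetic algorithm sketch over bit-strings. The fitness(individual) function, the top-half selection scheme, and all parameter values are illustrative assumptions, not taken from the slides:

    # Minimal genetic algorithm sketch over bit-strings (hypothetical fitness).
    import random

    def genetic_algorithm(fitness, length, k=20, generations=100,
                          elitism=2, mutation_rate=0.01):
        # Population of k randomly generated bit-strings.
        population = [[random.randint(0, 1) for _ in range(length)]
                      for _ in range(k)]

        def crossover(a, b):
            # Single-point crossover of two parents.
            point = random.randrange(1, length)
            return a[:point] + b[point:]

        def mutate(individual):
            # Flip each bit independently with probability mutation_rate.
            return [bit ^ 1 if random.random() < mutation_rate else bit
                    for bit in individual]

        for _ in range(generations):
            ranked = sorted(population, key=fitness, reverse=True)
            # Elitism: carry the best individuals over unchanged.
            next_gen = ranked[:elitism]
            while len(next_gen) < k:
                # Selection biased towards fitter individuals (top half here).
                parent_a, parent_b = random.sample(ranked[:k // 2], 2)
                next_gen.append(mutate(crossover(parent_a, parent_b)))
            population = next_gen
        return max(population, key=fitness)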

Genetic algorithms in games

• Computationally expensive, so primarily used as an offline form of learning.
• Cloak, Dagger & DNA (Oidian Systems)
  – 4 DNA strands define opponent behaviour
  – between battles, opponents play each other
• Creatures (Millennium Interactive)
  – genetic algorithms learn the weights in a neural network that defines behaviour

Online search

• All previous techniques have focused on offline reasoning (think first then act).

• Now we will briefly look at online search (think, act, think, act, ...)

• Advantageous in dynamic situations or those with only partial information.

“Real-time” search concepts

• In A* the whole path is computed off-line, before the agent walks through the path.
• This solution is only valid for static worlds.
• If the world changes in the meantime, the initial path is no longer valid:
  – new obstacles appear
  – the position of the goal changes (e.g. a moving target)

“Real-time” definitions

• off-line (non real-time): the solution is computed in a given amount of time before being executed

• real-time: one move computed at a time, and that move executed before computing the next

• anytime: the algorithm constantly improves its solution over time and can provide the “current best” solution at any point

Agent-based (online) search

• For example:
  – a mobile robot
  – an NPC without perfect knowledge
  – an agent that must act now with limited information
• Planning and execution are interleaved.
• We could apply standard search techniques:
  – best-first (but we know it is poor)
  – depth-first (has to physically back-track)
  – A* (but nodes in the fringe are not accessible)

LRTA*: Learning Real-time A*

• Augment hill-climbing with memory.
• Store a “current best estimate” for each visited state.
• Follow the path based on neighbours’ estimates.
• Update estimates based on experience (learning from experience).
• Flattens out local maxima… (example and sketch below)

LRTA*: example

• Six states in a line, with unit step costs (the 1s) between neighbours; each row shows the heuristic estimates after the agent updates the current state’s estimate and moves on:

  8   9   2   2   4   1      (step costs: 1 1 1 1 1)
  8   9   3   2   4   1
  8   9   3   4   4   1
  8   9   5   4   4   1
  8   9   5   5   4   1

Learning real-time A*
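
A minimal LRTA*-style sketch (not the original slide’s pseudocode). It assumes problem-supplied helpers neighbours(state), yielding (next_state, step_cost) pairs, and h0(state), the initial heuristic estimate; both names are hypothetical:

    # Minimal LRTA*-style sketch with hypothetical problem helpers.
    def lrta_star(start, goal, neighbours, h0, max_steps=10_000):
        h = {}                      # learned "current best estimates"
        current = start
        path = [current]
        for _ in range(max_steps):
            if current == goal:
                return path, h
            # f = step cost + neighbour's current estimate, for every neighbour.
            options = [(cost + h.setdefault(nxt, h0(nxt)), nxt)
                       for nxt, cost in neighbours(current)]
            best_f, best_next = min(options, key=lambda option: option[0])
            # Learning step: the current state can be no better than the best
            # neighbour's estimate plus the step cost; raising its estimate
            # gradually flattens out the local optimum, as in the table above.
            h[current] = best_f
            current = best_next
            path.append(current)
        return path, h

The update rule reproduces the rows of the example above: the estimates in the dip rise step by step (2 → 3 → 5 and 2 → 4 → 5).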
