Decision Making - Santa Clara University, tschwarz/coen129/ppt/decision making.pdf
Decision Making
Artificial Intelligence for Gaming
Basics
• Avatar has a set of information
• Avatar has a goal
• Needs to generate a sequence of actions in order to reach the goal
[Figure: AI architecture diagram with the labels Execution Management, Group AI (Strategy), Character AI (Decision Making, Movement), Pathfinding, World Interface, Animation, Physics]
Decision Making
[Figure: Internal Knowledge and External Knowledge feed the Decision Maker, which produces an Action Request]
DECISION TREE
Decision Tree
• Tree:
Nodes:
• Interior nodes represent checking a single variable
• End nodes correspond to actions
Edges:
• Two if there is a yes/no decision
• More if the evaluation gives an enumeration type
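A minimal sketch of such a tree, assuming the world state is a plain dictionary. The class names, the keys "enemy_visible" and "range", and the action names are hypothetical and only illustrate interior nodes with two edges (yes/no) or several edges (enumeration).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass
class Action:
    """End node: names the action to carry out."""
    name: str


@dataclass
class Decision:
    """Interior node: checks a single variable and follows one edge."""
    test: Callable[[dict], object]   # returns a bool or an enumeration value
    branches: Dict[object, "Node"]   # one edge per possible test result


Node = Union[Action, Decision]


def decide(node: Node, world: dict) -> str:
    """Walk from the root to an end node and return its action name."""
    while isinstance(node, Decision):
        node = node.branches[node.test(world)]
    return node.name


# Hypothetical tree: a yes/no decision at the root, an enumeration below it.
tree = Decision(
    test=lambda w: w["enemy_visible"],
    branches={
        False: Action("patrol"),
        True: Decision(
            test=lambda w: w["range"],   # "near", "mid" or "far"
            branches={
                "near": Action("attack"),
                "mid": Action("creep"),
                "far": Action("hide"),
            },
        ),
    },
)
```

Calling decide(tree, world) once per decision cycle returns the name of the selected action.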
Decision Tree
• Transform the tree in order to achieve better performance
Use dynamic programming for an optimal solution if statistics are known
Decision Tree
• Randomness
Want avatar behavior to be unpredictable
• But within reason
Store old random decision
• Need to remove the old decision when the situation has changed
• Store frame number with decision taken
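One way to store a random decision together with its frame number is sketched below. The rule used to detect that "the situation has changed" is an assumption: the old decision is discarded whenever the node was not visited in the immediately preceding frame. The class name RandomDecision is hypothetical.

```python
import random


class RandomDecision:
    """Random choice that is kept for as long as the node is visited every frame."""

    def __init__(self, options, rng=None):
        self.options = list(options)
        self.rng = rng or random.Random()
        self.last_frame = None   # frame number stored with the decision
        self.choice = None       # the old random decision

    def decide(self, frame: int):
        # Assumption: a gap in visited frames means the situation changed,
        # so the old decision is removed and a new one is drawn.
        stale = self.last_frame is None or frame > self.last_frame + 1
        if stale:
            self.choice = self.rng.choice(self.options)
        self.last_frame = frame
        return self.choice
```

Without the stored decision, the avatar would flip between options on every frame, which looks erratic rather than unpredictable.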
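Decision Tree
• Example
[Figure: decision tree with the node "Under Attack?" leading to Defend or to a Random node that chooses between Patrol and Stand still; the Random node stores its old decision and the frame number]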
BEHAVIOR TREES
Behavior Trees
• Have become important since Halo 2 (2004)
• Synthesis of a number of techniques
• Middleware (GUI based)
Behavior Trees
• Task
From simple
• looking up a value
to complex
• running actions
to composite
• groups of tasks
Behavior Trees
• Simple tasks
Condition tests
• Test some property of the game
• Proximity, line of sight, state of character, …
• Usually implemented in a parameterized task
Actions
• Alter the state of the game
• Animation, character movement, change of internal state, audio, …
Composites
• Interior nodes of the tree
• Selector: returns immediately with success if one of its children runs successfully
• Sequence: returns immediately with failure if one of its children fails
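The task types above can be sketched as follows, assuming every task exposes a run method that returns True on success and False on failure. All class and function names are hypothetical; the example tree is the "enter a room, opening the door if necessary" behavior.

```python
from typing import Callable


class Task:
    def run(self, world: dict) -> bool:
        raise NotImplementedError


class Condition(Task):
    """Simple task: tests some property of the game."""
    def __init__(self, test: Callable[[dict], bool]):
        self.test = test

    def run(self, world):
        return self.test(world)


class Action(Task):
    """Simple task: alters the state of the game."""
    def __init__(self, effect: Callable[[dict], bool]):
        self.effect = effect

    def run(self, world):
        return self.effect(world)


class Composite(Task):
    def __init__(self, *children: Task):
        self.children = children


class Selector(Composite):
    def run(self, world):
        # any() stops at the first child that succeeds.
        return any(child.run(world) for child in self.children)


class Sequence(Composite):
    def run(self, world):
        # all() stops at the first child that fails.
        return all(child.run(world) for child in self.children)


def log(name):
    """Build an action effect that records its name and succeeds."""
    def effect(world):
        world["log"].append(name)
        return True
    return effect


def open_door(world):
    world["door_open"] = True
    world["log"].append("open door")
    return True


enter_room = Selector(
    Sequence(Condition(lambda w: w["door_open"]), Action(log("move into room"))),
    Sequence(Action(log("move to door")), Action(open_door), Action(log("move into room"))),
)
```

Because the condition is the first child of its sequence, a closed door makes that sequence fail at once and the selector falls through to the second branch.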
Behavior Trees
[Figure: Selector node (?) with the children Attack, Taunt, Stare]
Behavior Trees
[Figure: Sequence node (→) with the children Enemy visible, Turn away, Run away]
Behavior Trees
• Example: Entering a room, opening a door if necessary
[Figure: selector over two sequences: (Door open?, Move (into room)) and (Move (to door), Open door, Move (into room))]
Behavior Trees
• A condition task in a Sequence acts like an if condition:
If the condition fails, the remaining actions are not carried out
Behavior Trees
[Figure: the room-entering tree (Door open?, Move (into room), Move (to door), Open door, Move (into room)) extended with a nested selector that uses the tasks Door not locked?, Door locked? and Force door open]
Behavior Trees
• Possible to arrange selector and sequence nodes in alternating levels
• Implements reactive planning
Avatars check conditions and base actions on those checks
Behavior Trees
• Randomization:
Do not always evaluate from left to right in a sequence or selector
Behavior Trees
• Decorators
Class that wraps another class, modifying its behavior
Has the same interface as the original class
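A decorator can be sketched as a task that holds exactly one child and exposes the same run interface. The Limit class below caps how often its child may run; the names Task, ForceDoor and Limit are hypothetical, and reporting failure once the limit is reached is an assumption.

```python
class Task:
    def run(self, world: dict) -> bool:
        raise NotImplementedError


class ForceDoor(Task):
    """Hypothetical action that always fails and counts its attempts."""
    def __init__(self):
        self.attempts = 0

    def run(self, world):
        self.attempts += 1
        return False


class Limit(Task):
    """Decorator: same interface as Task, runs its child at most max_runs times."""
    def __init__(self, child: Task, max_runs: int):
        self.child = child
        self.max_runs = max_runs
        self.runs = 0

    def run(self, world):
        if self.runs >= self.max_runs:
            return False          # assumption: a used-up limit counts as failure
        self.runs += 1
        return self.child.run(world)
```

Since Limit is itself a Task, it can be placed anywhere in a tree where the wrapped task could appear.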
Behavior Trees
• Decorator examples:
Limit the number of tries
• E.g.: Do not try to force open a door too often
Running a task until it fails
Guarding resources
• Limited resources
• Character can have only one animation
[Figure: guarding the animation engine. Condition version: a sequence of "Animation Engine available?" and "Play animation". Decorator version: an "Animation available" decorator wrapping "Play animation"]
Behavior Trees
• Concurrency
Behavior trees need to run concurrently
• Implementing with threads
• Cooperative multi-tasking
• Use a new composite task: Parallel
• Need to define a policy for Parallel
• E.g.: Return failure as soon as a child returns with failure (Sequence policy)
Behavior Trees
• Concurrency
[Figure: a Sequence version with the tasks "Player in Position?" and "Open door automatically", next to a Parallel version in which an Until Fail decorator wraps "Player in Position?"]
Behavior Trees
[Figure: behavior tree combining Until Fail and Inverter decorators with the tasks "Trash visible?", "Tidy Trash" and "Recharge"]
Behavior Trees
• Need another decorator
This construct will never return
• Need to override the “until fail” decorator
[Figure: Parallel node with Until Fail wrapping "Player in Position?" alongside "Open door automatically"]
Behavior Trees
• Use two decorators
An interrupter decorator
• Passes on success
An interrupt generator
[Figure: the same Parallel construct with Interrupter decorators added around "Player in Position?" (Until Fail) and "Open door automatically"]
Behavior Trees
• Tasks need to have access to global data
• Passing it as parameters generates a difficult API
• Use a blackboard data structure
Write data and messages into a common, global storage structure
Behavior Trees
• Instantiation
Probably need several copies of a behavior tree for different characters
Possibilities:
• Prototype-based object orientation
• Cloning operation to get new instances
• Separate specification and instantiation
• Use behavior trees without local state and use separate state objects for characters
FUZZY LOGIC
Fuzzy Logic
• Fuzzy sets:
Normal set
• Membership is a Boolean value
Fuzzy set
• Membership is given by a value in [0,1]
• NOT the same as the probability of being in the set
• Fuzzification
Enumerated values
• Color = “red” → Fuzzy(dark) = 0.5
Numeric fuzzification
• Characters A, B are 0.4 and 0.85 healthy
Fuzzy Logic
• Defuzzification:
Needs to translate to a Boolean or floating point value
One element can be in various fuzzy sets
• 0.7 for running, 0.4 for walking, 0.2 for creeping
Impossible to give one value
Fuzzy Logic
• Highest membership
In this example “run”
Choose speed as
• Smallest speed that gives membership 1 to run
• Highest speed that gives membership 1
• Average of the two
• Bisector
� Defuzzification:� Blending
� Speed calculation: 0.7 for running, 0.4 for walking, 0.2 for
creeping
� (0.7/1.3)*(average speed for running) + (0.4/1.3)*(average � (0.7/1.3)*(average speed for running) + (0.4/1.3)*(average
speed for walking) + (0.2/1.3)*(average speed for
creeping)
� Smallest of maximum:
� Blending of minimum values
� Largest of maximum:
� Blending of maximum values
� Mean of maximum:
� Blending of average values
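The blending calculation can be sketched as a membership-weighted average. The memberships are the ones from the slide; the average speeds per fuzzy set are assumed values chosen only for illustration, and the function name blend is hypothetical.

```python
def blend(memberships: dict, representative: dict) -> float:
    """Weight each set's representative value by its normalized membership."""
    total = sum(memberships.values())
    return sum(m / total * representative[name] for name, m in memberships.items())


memberships = {"running": 0.7, "walking": 0.4, "creeping": 0.2}

# Assumed average speeds per fuzzy set (not taken from the slides).
average_speed = {"running": 6.0, "walking": 1.5, "creeping": 0.5}

speed = blend(memberships, average_speed)

# Highest-membership defuzzification picks a single set instead.
strongest = max(memberships, key=memberships.get)
```

Passing the minimum or maximum speed of each set instead of the average gives the smallest-of-maximum and largest-of-maximum variants.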
Fuzzy Logic
• Defuzzification
• Center of Gravity
Fuzzy Logic
• Defuzzification
To a Boolean value: Use cut-off point
To enumeration value
• If the values form a series:
• Determine cut-offs
• If not:
• Choose the one corresponding to the fuzzy set with strongest membership
Fuzzy Logic
Center of Gravity
• Example
Cutoffs for range values
Fuzzy Logic
• Operators
m(A and B: x) = min(m(A:x), m(B:x))
m(A or B: x) = max(m(A:x), m(B:x))
m(not A: x) = 1 − m(A:x)
Fuzzy Logic
• Fuzzy rules:
A → B: m(B:x) := m(A:x)
Fuzzy Logic
• Use a system of Boolean rules
Premises are Boolean combinations
Similar to rule-based systems
• System where premises are conjunctions
Crisp inputs → fuzzy conditions → fuzzy conclusions
Rules: i1, i2, i3, …, in → output
Fuzzy Logic
• Rules:
corner-entry and going-fast → brake
corner-exit and going-fast → accelerate
corner-entry and going-slow → accelerate
corner-exit and going-slow → accelerate
• Input states
corner-entry 0.1
corner-exit 0.9
going-fast 0.4
going-slow 0.6
• Results
brake = min(0.1, 0.4) = 0.1
accelerate = max(min(0.9, 0.4), min(0.1, 0.6), min(0.9, 0.6)) = 0.6
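The rule evaluation above can be reproduced with min for AND and max for OR. This is a sketch; the function names and the data layout (a rule as a tuple of premises plus a conclusion) are assumptions.

```python
def f_and(*memberships):
    return min(memberships)


def f_or(*memberships):
    return max(memberships)


inputs = {
    "corner-entry": 0.1,
    "corner-exit": 0.9,
    "going-fast": 0.4,
    "going-slow": 0.6,
}

# Each rule: (conjunction of premises, conclusion).
rules = [
    (("corner-entry", "going-fast"), "brake"),
    (("corner-exit", "going-fast"), "accelerate"),
    (("corner-entry", "going-slow"), "accelerate"),
    (("corner-exit", "going-slow"), "accelerate"),
]


def evaluate(rules, inputs):
    """AND the premises of each rule, OR together rules with the same conclusion."""
    out = {}
    for premises, conclusion in rules:
        strength = f_and(*(inputs[p] for p in premises))
        out[conclusion] = f_or(out.get(conclusion, 0.0), strength)
    return out


result = evaluate(rules, inputs)
```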
Fuzzy Logic
• Works well for small sets of rules
• Alternative for large sets of rules
Restrict rules to singletons as premises
Replace “i1, i2, i3, …, in → output” with “i1 → output”, “i2 → output”, “i3 → output”, …, “in → output”
Fuzzy Logic
• Can lead to contradictions
Applied to previous set of rules
• corner-entry → brake
• going-fast → brake
• corner-exit → accelerate
• going-fast → accelerate
• corner-entry → accelerate
• going-slow → accelerate
• corner-exit → accelerate
• going-fast → accelerate
Fuzzy Logic
• Combs method example
Need to reformulate rules
• going-fast → brake
• going-slow → accelerate
• corner-entry → brake
• corner-exit → accelerate
Input states
• corner-entry 0.1
• corner-exit 0.9
• going-fast 0.4
• going-slow 0.6
Combs method
• brake → 0.4
• accelerate → 0.9
Boolean rules gave
• brake → 0.1
• accelerate → 0.6
FUZZY STATE MACHINES
Fuzzy State Machines
• State machines
System is in one state at any moment
Decisions to transition to another state are taken based on input and/or randomly
Behavior in a state is well defined
Fuzzy State Machines
• Fuzzy State Machines
Membership in a state can be fuzzy
• System is in one of several active states
• At each iteration, consider transitions from all active states
Example:
• System is in State A with degree of membership (DOM) 0.6 and in State B with DOM 0.3
• A transition fires from State A to State B with DOM 0.4
• New DOM for B is max(min(0.4, 0.6), 0.3) = 0.4
• This is the OR of being in State B with the AND of being in State A and the transition firing
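The update rule from the example can be sketched as follows. Leaving the DOM of the source state unchanged is an assumption, since the slides do not say what happens to State A; the function name step is hypothetical.

```python
def step(doms: dict, transitions: list) -> dict:
    """Apply fuzzy transitions given as (source, target, firing DOM) triples."""
    new = dict(doms)   # assumption: source states keep their DOM
    for source, target, fire in transitions:
        through = min(doms.get(source, 0.0), fire)          # AND
        new[target] = max(new.get(target, 0.0), through)    # OR
    return new


doms = {"A": 0.6, "B": 0.3}
new_doms = step(doms, [("A", "B", 0.4)])
```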
Fuzzy State Machines
• Actions:
Usually, can only have one action
Therefore:
• Generate a fuzzy list of actions
• Defuzzify the list to select one action
Fuzzy State Machines
• Implementation is similar to that of a finite state machine
But:
• Need to consider all active states
• At each iteration:
• Go through all active states in decreasing order of DOM
• Go through all possible transitions from active states and decide whether they can fire
• Create a new list of active states with DOMs
• Can do this easily because OR and AND are associative
MARKOV SYSTEMS
Markov Systems
• Fuzzy state machine can be in different states with associated degrees of membership
• Can work directly with numeric values
Markov Systems
[Figure: States and Transitions, with states A, B, C, D]
Markov System
• State of system is given as an array of (non-zero) values.
• Values change by multiplying the state vector with a transition matrix
Markov System
• Vector represents safety of 5 sniping positions
• Taking a shot from the first position alerts enemy to
Presence of a sniper
Approximate location
• Safety points change
Markov System
[Equation: transition matrix (5 × 5) · safety vector = updated safety vector for the 5 sniping positions]
Notice: Total weight of vector goes down
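A sketch of the state-vector update, using plain lists. The safety values and the matrix entries below are assumed for illustration and are not the numbers from the slide; the matrix models a shot from the first position that makes that position much less safe and its two neighbors somewhat less safe.

```python
def apply_transition(matrix, state):
    """Multiply a transition matrix (list of rows) with a state vector."""
    return [sum(m * s for m, s in zip(row, state)) for row in matrix]


# Assumed safety values for 5 sniping positions.
safety = [1.0, 1.0, 1.0, 1.0, 1.0]

# Assumed transition for "shot taken from the first position".
fire_from_first = [
    [0.1, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.8, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]

updated = apply_transition(fire_from_first, safety)
```

Unlike a stochastic matrix in a true Markov chain, such a matrix does not have to preserve the total weight of the vector.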
Markov System
• Example (continued)
Avatar takes shot from a position
• Safety vector is updated according to position chosen
Avatar remains inactive
• Safety vector is updated every minute of inaction
Markov System
• Implementation
Transitions
• belong to the whole state machine
• triggered by conditions
• apply transition matrix to state vector
Default Transition
• Triggered by inaction for a certain time
• Run a timer whenever no transition fires
Actions
• Not controlled by state, but possibly by transitions
• Indirect through interpretation of state vector
GOAL ORIENTED BEHAVIOR
Goal oriented behavior
• So far: reactive behavior
• Goal oriented behavior
Avatar behavior can be made to appear to be goal oriented with previous methods
• Can be used as a paradigm
Example: The Sims
Goal oriented behavior
• In general
Avatars have a wide range of actions
Display their emotional and physical state by the actions chosen
Goal oriented behavior
• Goals (motives)
Enumerated with a description
• “eat”, “regenerate health”, “kill enemies”, “stay alive”, ...
Can be transitory:
• “kill enemy 234”
Can be permanent:
• “not hungry”
• Goal has urgency, expressed by a number
• E.g.: Character eats: “not hungry” becomes a less urgent goal
Goal oriented behavior
• Actions
Available to avatar
• Can be given in a central list
• Can be generated by objects in the world
• Depend on the state of the world
Example:
• Sims: Kettle object generates action “boil kettle”
• Sims: Empty oven object adds “insert raw food into oven” action
Goal oriented behavior
• Interaction between actions and goals
Possible actions are evaluated with respect to goals
• Atomic actions:
Actions are evaluated directly by goals
• Indirect Goal Fulfillment:
Need action sequence:
• E.g.: “Not hungry”: “Go to supermarket”, “Buy raw food”, “Insert raw food into oven”, “Cook”, “Take food out of oven”, “Eat food”
Goal oriented behavior
• Simple Selection
Set of possible actions and set of goals
Actions have different impact on goals
Determine most pressing goal
Evaluate and select action based on fulfilling the goal
Goal oriented behavior
• Example:
Goal: Eat = 4, Bathroom = 3
• On a scale of 0 – 5
Actions:
• Drink soda: eat -= 2, bathroom += 3
• Visit bathroom: eat -= 0, bathroom -= 4
Leads to unreasonable behavior!
Goal oriented behavior
• Global Utility
Assign utility to all states of goal fulfillment
Evaluate actions based on maximum change of utilities
Goal oriented behavior
• Example:
Goal: Eat = 4, Bathroom = 3
Actions:
• Drink soda: eat -= 2, bathroom += 3
• Visit bathroom: eat -= 0, bathroom -= 4
Discontentment = eat^2 + bathroom^2
Drink soda → discontentment to 29 (eat = 2, bathroom capped at 5)
Visit bathroom → discontentment to 16 (eat = 4, bathroom = 0)
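Both selection schemes can be compared on the example. The sketch assumes that goal values are clamped to the 0 to 5 scale, which is what yields the discontentment values 29 and 16; all function names are hypothetical.

```python
GOALS = {"eat": 4, "bathroom": 3}

ACTIONS = {
    "drink soda": {"eat": -2, "bathroom": +3},
    "visit bathroom": {"eat": 0, "bathroom": -4},
}


def apply(goals, effects, lo=0, hi=5):
    """Goal values after an action, clamped to the scale (assumed 0 to 5)."""
    return {g: min(hi, max(lo, v + effects.get(g, 0))) for g, v in goals.items()}


def discontentment(goals):
    return sum(v * v for v in goals.values())


def simple_selection(goals, actions):
    """Pick the action that best reduces the single most pressing goal."""
    top = max(goals, key=goals.get)
    return min(actions, key=lambda a: actions[a].get(top, 0))


def utility_selection(goals, actions):
    """Pick the action that leads to the lowest overall discontentment."""
    return min(actions, key=lambda a: discontentment(apply(goals, actions[a])))
```

Simple selection looks only at "eat" and therefore chooses the soda, ignoring its side effect on the bathroom goal; the global utility takes the side effect into account.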
Goal oriented behavior
• Introducing timing:
Might need to measure time spent for sub-actions
Timing costs can be estimated by heuristics
• Instead of doing costly pathfinding
Goal oriented behavior
• Introducing timing
Utility can involve time
OR: Prefer short over long actions
Goal oriented behavior
• Example:
Goals:
• Eat = 4 (+4 each hour)
• Bathroom = 3 (+2 each hour)
Actions:
• Eat snack: eat -= 2 (15 minutes)
• Afterwards: eat = 2, bathroom = 3.5, Discontentment = 16.25
• Eat main meal: eat -= 4 (1 hour)
• Afterwards: eat = 0, bathroom = 5, Discontentment = 25
• Visit bathroom: eat -= 0, bathroom -= 4 (15 minutes)
• Afterwards: eat = 5, bathroom = 0, Discontentment = 25
Discontentment = eat^2 + bathroom^2
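The timed example can be reproduced under one assumption that matches the numbers given: a goal that the action changes directly takes only that change, while every other goal grows at its hourly rate for the duration of the action. Values are again clamped to the 0 to 5 scale. The names below are hypothetical.

```python
GOALS = {"eat": 4, "bathroom": 3}
RATES = {"eat": 4.0, "bathroom": 2.0}   # growth per hour

# Each action: (non-zero direct effects, duration in hours).
TIMED_ACTIONS = {
    "eat snack": ({"eat": -2}, 0.25),
    "eat main meal": ({"eat": -4}, 1.0),
    "visit bathroom": ({"bathroom": -4}, 0.25),
}


def after(goals, effects, hours, lo=0.0, hi=5.0):
    """Goal values once the action has finished (assumed update rule)."""
    result = {}
    for goal, value in goals.items():
        if goal in effects:
            value = value + effects[goal]          # changed by the action
        else:
            value = value + RATES[goal] * hours    # keeps growing meanwhile
        result[goal] = min(hi, max(lo, value))
    return result


def discontentment(goals):
    return sum(v * v for v in goals.values())


scores = {
    name: discontentment(after(GOALS, effects, hours))
    for name, (effects, hours) in TIMED_ACTIONS.items()
}
best = min(scores, key=scores.get)
```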
Goal oriented behavior
• Goals change over time
Sims: Basic rate at which motives change
Shooter: “Hurt” motive depends on number of hits taken, goal is not predictable
• Simplest method
Update goal status periodically
Goal oriented behavior
• Planning
Needed for more complicated situations
• Actions can enable / disable other actions
Before a warrior enters combat, needs to decide on armament and munition
Goal oriented behavior
• Overall Utility Goal Oriented Action Planning
Standard AI tree search:
• Nodes describe states of the avatar
• Can generate actions in each node
• Each action generates a different node
• Creates a tree of action sequences
Can use depth-first strategy
• Usually too involved looking at all possibilities
Goal oriented behavior
• Overall Utility Goal Oriented Action Planning
IDA* (iterative deepening A*)
• Need optimistic heuristic to approximate utility in a node
Goal oriented behavior
• Smells
Each possibility to satisfy a goal is given a smell
• Smell spreads through world
• Slowly around corners
• Does not pass through walls
Avatar uses smell to find the place where the goal can be satisfied
• To avoid pathfinding
• Avatar goes after highest concentration of smell
The Sims
RULE BASED SYSTEMS
Rule-Based Systems
• Classical AI technology (1970 – 1985)
• Used for expert systems
• Rare for gaming
Considered too heavy
Rule-Based Systems
• Common structure:
Database of
• Rules
• In form Conditions → Action
• State of the world
Arbiter
• To select among triggered rules
At each iteration:
• Update world state, collect triggered rules
Rule-Based Systems
RULES:
If Whisker.health < 15 AND radio held by Whisker → Sale take radio
If ANYONE’s health < 15 and MY health > 45 → Bring bandages
...
Arbiter
World:
Captain.health = 51
Johnson.health = 38
Sale.health = 42
Whisker.health = 12
Radio = Whisker
...
Rule-Based Systems
• Database matching
Need to match conditions of the rules
• Unification:
Use wildcards to make more general rules
Rule-Based Systems
• Actions change database entries indirectly
If the rule tells Sale to pick up the radio, then he will try, but not necessarily succeed
• On occasion, actions change entries directly
Change status of soldier to “alert”
• Forward chaining
Start with data, apply rules, if rules change database, continue applying rules
Rule-Based Systems
• Rule Arbitration
Several rules may trigger at the same time
But only one can fire
Use heuristics to decide on order
• First applicable
• Possible that the same rule then applies and fires all the time
• Need to suspend rule after firing
• Least Recently Used
• Maintain all rules in an ordered queue
• Pull and enqueue fired rule
• Most Specific Condition
• Rationale: Rules that have a complicated condition fire rarely, but are most specific to situation
• Dynamic Priority Allocation
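A sketch of least-recently-used arbitration, using the world state and the two rules from the earlier example. Treating "MY health" as the Captain's health is an assumption, as are the class names and the dictionary layout of the world.

```python
from collections import deque


class Rule:
    def __init__(self, name, condition):
        self.name = name
        self.condition = condition   # callable: world -> bool


class LeastRecentlyUsedArbiter:
    def __init__(self, rules):
        self.queue = deque(rules)    # front = least recently fired

    def select(self, world):
        """Fire the first triggered rule in the queue and move it to the back."""
        for rule in list(self.queue):
            if rule.condition(world):
                self.queue.remove(rule)
                self.queue.append(rule)
                return rule
        return None


world = {
    "health": {"Captain": 51, "Johnson": 38, "Sale": 42, "Whisker": 12},
    "radio": "Whisker",
    "me": "Captain",   # assumption: the rules are evaluated for the Captain
}

rules = [
    Rule("Sale take radio",
         lambda w: w["health"]["Whisker"] < 15 and w["radio"] == "Whisker"),
    Rule("Bring bandages",
         lambda w: min(w["health"].values()) < 15 and w["health"][w["me"]] > 45),
]

arbiter = LeastRecentlyUsedArbiter(rules)
fired = [arbiter.select(world).name for _ in range(3)]
```

With an unchanged world state both rules stay triggered, and the queue makes them alternate instead of letting the first one fire every time.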
Rule-Based Systems
• Unification
Use wildcards:
• “(?person-1 (health 0-15)) AND (Radio held-by ?person-2)”
Replaces many concrete rules with a generic one
• Matching becomes more difficult as all potential assignments need to be matched
Rule-Based Systems
• Rete
AI industry standard for rule-based systems
Patterns for rules are in a single data structure, the Rete
Rule-Based Systems
[Figure: Rete with the pattern nodes (radio (held-by ?person-1)), (?person-1 (health < 15)), (?person-2 (health > 45)) and (?person-2 (is-covering ?person-1)), feeding join nodes that lead to the Swap Radio rule and the Change Backup rule]
Rule-Based Systems
• Rete
Top: Nodes that represent individual clauses in a rule
• Pattern Nodes
Nodes representing conjunctions
• Join Nodes
Rules that can be fired
Rule-Based Systems
• Rete:
World state is fed into top nodes
A match is passed down to join nodes
• With wild-cards, need to give bindings
If world state changes:
• Removing / adding a fact
• Pattern nodes see whether they have the fact among their conditions
• If yes, update output
• Join nodes react
BLACKBOARD ARCHITECTURE
Blackboard Architecture
• Mechanism for coordinating actions of several decision makers
Decision makers can use different techniques
Decision makers have different areas of expertise
• E.g.: Tank Simulation
Target selection
Firing solution finder
Movement planner
...
Blackboard Architecture
[Figure: an Arbiter and a panel of experts (Maneuver Expert, Firearms Expert, Trap Detection, Stealth, Power-ups, Level Tactics) around a blackboard containing: ammo = 4, healthy, enemy-sighted, hide]
Blackboard Architecture
• Algorithm in iterations
1. Experts look at the board and indicate interest
2. Arbiter selects an expert to be in control
3. Expert does some work, usually changes the blackboard
4. Expert relinquishes control
• Decision is taken when no expert indicates interest
• Actions are taken if all experts judge it feasible
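The iteration can be sketched with a dictionary as the blackboard. The arbiter policy (choose the expert with the highest numeric interest), the two experts, and the entries they write are assumptions made for illustration.

```python
class Expert:
    name = "expert"

    def interest(self, board: dict) -> float:
        """Step 1: how strongly this expert wants control (0 = not at all)."""
        return 0.0

    def run(self, board: dict) -> None:
        """Step 3: do some work, usually by changing the blackboard."""


class TargetSelection(Expert):
    name = "target selection"

    def interest(self, board):
        return 1.0 if board.get("enemy-sighted") and "target" not in board else 0.0

    def run(self, board):
        board["target"] = "enemy-1"   # hypothetical target id


class FiringSolution(Expert):
    name = "firing solution"

    def interest(self, board):
        ready = "target" in board and "firing-solution" not in board
        return 0.8 if ready and board.get("ammo", 0) > 0 else 0.0

    def run(self, board):
        board["firing-solution"] = ("aim at", board["target"])


def iterate(board, experts, max_iterations=100):
    """Run experts until none indicates interest; return the order of control."""
    order = []
    for _ in range(max_iterations):
        best = max(experts, key=lambda e: e.interest(board))   # step 2: arbiter
        if best.interest(board) <= 0:
            break                     # no expert is interested: decision taken
        best.run(board)               # step 3; returning is step 4
        order.append(best.name)
    return order


board = {"ammo": 4, "enemy-sighted": True}
order = iterate(board, [FiringSolution(), TargetSelection()])
```

The experts never call each other; the firing solution expert only becomes interested once the target selection expert has written a target to the board.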
SCRIPTING
Scripting
• Behavior can be hard-coded
Arising need to separate engine from content
Use scripting for agility during development
• Scripting allows users to update behavior
Modding
• Extends full-price shelf life of games beyond eight weeks
ACTION EXECUTION
Action Execution
• Actions
State change actions
Animations
• Make movement request
• Use movement engine to execute movement
AI-Requests
• Higher level decisions are passed on to lower level entities
Action Execution
[Figure: state machine with the states sleep, resting and On guard and the transitions go-to-sleep, wake-up, tired and animation complete]
Do not represent transitions as additional states
Otherwise, you blow up the number of states
Action Execution
• Scripted actions
Set of pre-programmed actions
Carried out in sequence
• Low-cost way of generating complex behavior:
E.g.: Shooter: Avatar uses correct cover tactics
• Fire, Roll, Run
Decision making does not involve modeling the steps