AI in Game Programming, IT University of Copenhagen: Learning From Observations, Marco Loog
Post on 19-Dec-2015
![Page 1: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/1.jpg)
ai in game programming it university of copenhagen
Learning From Observations
Marco Loog
![Page 2: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/2.jpg)
Learning from Observations
The idea is that percepts should be used to improve the agent’s ability to act in the future, not only for acting per se
![Page 3: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/3.jpg)
Outline
Learning agents
Inductive learning
Decision tree learning
![Page 4: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/4.jpg)
Learning
Learning is essential for unknown environments, i.e., when designer lacks omniscience
Learning is useful as a system construction method, i.e., expose the agent to reality rather than trying to write it down
Learning modifies the agent’s decision mechanisms to improve performance
![Page 5: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/5.jpg)
Learning Agent [Revisited]
Four conceptual components
Learning element : responsible for making improvements
Performance element : takes percepts and decides on actions
Critic : provides feedback on how agent is doing and determines how performance element should be modified
Problem generator : responsible for suggesting actions leading to new and informative experience
![Page 6: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/6.jpg)
Figure 2.15 [Revisited]
![Page 7: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/7.jpg)
Learning Element
Design of learning element is affected by
Which components of the performance element are to be learned
What feedback is available to learn these components
What representation is used for the components
![Page 8: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/8.jpg)
Agent’s Components
Direct mapping from conditions on current state to actions [instructor : brake!]
Means to infer relevant properties about world from percept sequence [learning from images]
Info about evolution of the world and results of possible actions [braking on wet road]
Utility indicating desirability of world state [no tip / component of utility function]
...
Each component can be learned from appropriate feedback
![Page 9: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/9.jpg)
Types of Feedback
Supervised learning : correct answers for each example
Unsupervised learning : correct answers not given
Reinforcement learning : occasional rewards
![Page 10: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/10.jpg)
Inductive Learning
Simplest form : learn a function from examples
I.e. learn the target function f
Examples : input / output pairs (x, f(x))
![Page 11: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/11.jpg)
Inductive Learning
Problem
Find a hypothesis h, such that h ≈ f, based on given training set of examples
= highly simplified model of real learning
Ignores prior knowledge
Assumes examples are given
![Page 12: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/12.jpg)
Hypothesis
A good hypothesis will generalize well, i.e., it is able to predict correctly on unseen examples
![Page 13: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/13.jpg)
Inductive Learning Method
E.g. function fitting
Goal is to estimate real underlying functional relationship from example observations
![Page 14: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/14.jpg)
Inductive Learning Method
Construct h to agree with f on training set
![Page 15: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/15.jpg)
Inductive Learning Method
Construct h to agree with f on training set
![Page 16: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/16.jpg)
Inductive Learning Method
Construct h to agree with f on training set
![Page 17: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/17.jpg)
Inductive Learning Method
Construct h to agree with f on training set
h is consistent if it agrees with f on all examples
![Page 18: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/18.jpg)
Inductive Learning Method
Construct h to agree with f on training set
h is consistent if it agrees with f on all examples
![Page 19: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/19.jpg)
So, which ‘Fit’ is Best?
![Page 20: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/20.jpg)
So, which ‘Fit’ is Best?
Ockham’s razor : prefer simplest hypothesis consistent with the data
![Page 21: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/21.jpg)
So, which ‘Fit’ is Best?
Ockham’s razor : prefer simplest hypothesis consistent with the data
What’s consistent? What’s simple?
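One way to make 'consistent' and 'simple' concrete is sketched below. It is only an illustration under loud assumptions: the training pairs, the four candidate hypotheses, and the use of polynomial degree as the complexity measure are all made up for this example and do not come from the slides.

```python
# Hypothetical training set of (x, f(x)) pairs; the target f(x) = x^2 is an assumption.
examples = [(0, 0), (1, 1), (2, 4), (3, 9)]

# Hypothetical candidates as (name, complexity, hypothesis); complexity = polynomial degree.
candidates = [
    ("constant", 0, lambda x: 0),
    ("linear", 1, lambda x: x),
    ("quadratic", 2, lambda x: x * x),
    # Also passes through all four points, but is more complex than needed.
    ("quartic", 4, lambda x: x * x + x * (x - 1) * (x - 2) * (x - 3)),
]


def is_consistent(h, examples):
    """h is consistent if it agrees with f on all training examples."""
    return all(h(x) == fx for x, fx in examples)


def simplest_consistent(candidates, examples):
    """Ockham's razor: among the consistent candidates, return the least complex one."""
    consistent = [(complexity, name)
                  for name, complexity, h in candidates
                  if is_consistent(h, examples)]
    return min(consistent)[1] if consistent else None


print(simplest_consistent(candidates, examples))  # quadratic
```

Both the quadratic and the quartic agree with every example, so consistency alone cannot separate them; the preference for the lower degree is what the razor adds.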
![Page 22: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/22.jpg)
Hypothesis
A good hypothesis will generalize well, i.e., it is able to predict correctly on unseen examples
A not-exactly-consistent hypothesis may be preferable to an exactly consistent one
Nondeterministic behavior
Consistency is not even always possible
Nondeterministic functions : trade-off complexity of hypothesis / degree of fit
![Page 23: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/23.jpg)
Decision Trees
‘Decision tree induction is one of the simplest, and yet most successful forms of learning algorithm’
Good intro to the area of inductive learning
![Page 24: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/24.jpg)
Decision Tree
Input : object or situation described by set of attributes / features
Output [discrete or continuous] : decision / prediction
Continuous -> regression
Discrete -> classification
Boolean classification : output is binary / ‘true’ or ‘false’
![Page 25: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/25.jpg)
Decision Tree
Performs a sequence of tests in order to reach a decision
Tree [as in : graph without closed loops]
Internal node : test of the value of single property
Branches labeled with possible test outcomes
Leaf node : specifies output value
Resembles a ‘how to’ manual
![Page 26: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/26.jpg)
Decide whether to wait for a Table at a Restaurant
Based on the following attributes
Alternate : is there an alternative restaurant nearby?
Bar : is there a comfortable bar area to wait in?
Fri/Sat : is today Friday or Saturday?
Hungry : are we hungry?
Patrons : number of people in the restaurant [None, Some, Full]
Price : price range [$, $$, $$$]
Raining : is it raining outside?
Reservation : have we made a reservation?
Type : kind of restaurant [French, Italian, Thai, Burger]
WaitEstimate : estimated waiting time [0-10, 10-30, 30-60, >60]
![Page 27: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/27.jpg)
Attribute-Based Representations
Examples of decisions
![Page 28: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/28.jpg)
Decision Tree
Possible representation for hypotheses
Below is the ‘true’ tree [note Type? plays no role]
![Page 29: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/29.jpg)
Expressiveness
Decision trees can express any function of the input attributes
E.g., for Boolean functions, truth table row → path to leaf
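The correspondence between truth table rows and paths can be sketched as follows. The nested-dict encoding and the helper names `tree_from_truth_table` and `evaluate` are assumptions for this example; XOR is used only because it is a small Boolean function.

```python
from itertools import product


def tree_from_truth_table(f, n_inputs, prefix=()):
    """Build a full tree with exactly one root-to-leaf path per truth table row."""
    if len(prefix) == n_inputs:
        return f(*prefix)  # leaf: the value of f for this row
    # Internal node: test the next input, one branch per outcome.
    return {bit: tree_from_truth_table(f, n_inputs, prefix + (bit,))
            for bit in (False, True)}


def evaluate(tree, inputs):
    """Follow the path selected by the input values down to the leaf."""
    for bit in inputs:
        tree = tree[bit]
    return tree


def xor(a, b):
    return a != b


tree = tree_from_truth_table(xor, 2)
for row in product((False, True), repeat=2):
    print(row, evaluate(tree, row))
```

A tree built this way has 2^n leaves for n Boolean inputs, which is why such a table-like tree says nothing about rows it has not seen.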
![Page 30: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/30.jpg)
Expressiveness
There is a consistent decision tree for any training set, with one path to a leaf for each example [unless f is nondeterministic in x], but it probably won’t generalize to new examples
Prefer to find more compact decision trees [This Ockham again...]
![Page 31: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/31.jpg)
Attribute-Based Representations
Is simply a lookup table
Cannot generalize to unseen examples
![Page 32: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/32.jpg)
Decision Tree
Applying Ockham’s razor : smallest tree consistent with examples
![Page 33: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/33.jpg)
Decision Tree
Applying Ockham’s razor : smallest tree consistent with examples
Able to generalize to unseen examples
No need to program everything out / specify everything in detail
‘true’ tree = smallest tree?
![Page 34: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/34.jpg)
Decision Tree Learning
Unfortunately, finding the ‘smallest’ tree is intractable in general
New aim : find a ‘smallish’ tree consistent with the training examples
Idea : [recursively] choose ‘most significant’ attribute as root of [sub]tree
‘Most significant’ : making the most difference to the classification
![Page 35: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/35.jpg)
Choosing Attribute Tests
Idea : a good attribute splits the examples into subsets that are [ideally] ‘all positive’ or ‘all negative’
Patrons? is a better choice
![Page 36: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/36.jpg)
Using Information Theory
Information content [entropy] : $I(P(v_1), \ldots, P(v_n)) = \sum_{i=1}^{n} -P(v_i) \log_2 P(v_i)$
For a training set containing p positive examples and n negative examples
$$I\left(\frac{p}{p+n}, \frac{n}{p+n}\right) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$$
Specifies the minimum number of bits of information needed to encode the classification of an arbitrary member
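A minimal sketch of the entropy computation, assuming the usual convention that a term with probability zero contributes nothing. The function names `entropy` and `boolean_entropy` are made up for this example.

```python
from math import log2


def entropy(probabilities):
    """I(P(v1), ..., P(vn)) = sum of -P(vi) * log2 P(vi), in bits.

    Terms with P(vi) = 0 are skipped (assumed convention: 0 * log2 0 = 0).
    """
    return sum(-p * log2(p) for p in probabilities if p > 0)


def boolean_entropy(p, n):
    """Entropy of a training set with p positive and n negative examples."""
    total = p + n
    return entropy([p / total, n / total])


print(boolean_entropy(6, 6))  # 1.0
```

An evenly split set needs one full bit per example, while a set that is all positive or all negative needs none.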
![Page 37: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/37.jpg)
Information Gain
Chosen attribute A divides training set E into subsets E1, … , Ev according to their values for A, where A has v distinct values
Information gain [IG] : expected reduction in entropy caused by partitioning the examples
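$$\mathrm{remainder}(A) = \sum_{i=1}^{v} \frac{p_i+n_i}{p+n}\, I\left(\frac{p_i}{p_i+n_i}, \frac{n_i}{p_i+n_i}\right)$$
$$IG(A) = I\left(\frac{p}{p+n}, \frac{n}{p+n}\right) - \mathrm{remainder}(A)$$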
![Page 38: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/38.jpg)
Information Gain
Information gain [IG] : expected reduction in entropy caused by partitioning the examples
Choose the attribute with the largest IG
[Wanna know more : Google it...]
$$IG(A) = I\left(\frac{p}{p+n}, \frac{n}{p+n}\right) - \mathrm{remainder}(A)$$
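The recursive idea of putting the attribute with the largest IG at the root of each [sub]tree can be sketched as below. This is an ID3-style illustration under assumptions, not necessarily the exact algorithm behind the slides: the function names, the tuple encoding of the tree, and the six training examples are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return sum(-c / total * log2(c / total) for c in Counter(labels).values())


def information_gain(examples, attribute):
    """Expected reduction in entropy from splitting the examples on attribute."""
    remainder = 0.0
    for value in {x[attribute] for x, _ in examples}:
        subset = [label for x, label in examples if x[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy([label for _, label in examples]) - remainder


def learn_tree(examples, attributes):
    """Recursively choose the attribute with the largest IG as root of the [sub]tree."""
    labels = [label for _, label in examples]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]  # leaf: the majority label
    best = max(attributes, key=lambda a: information_gain(examples, a))
    rest = [a for a in attributes if a != best]
    branches = {}
    for value in {x[best] for x, _ in examples}:
        subset = [(x, label) for x, label in examples if x[best] == value]
        branches[value] = learn_tree(subset, rest)
    return (best, branches)


# Hypothetical toy data, not the 12 restaurant examples from the slides.
examples = [
    ({"Patrons": "None", "Hungry": "Yes"}, False),
    ({"Patrons": "None", "Hungry": "No"}, False),
    ({"Patrons": "Some", "Hungry": "Yes"}, True),
    ({"Patrons": "Some", "Hungry": "No"}, True),
    ({"Patrons": "Full", "Hungry": "Yes"}, True),
    ({"Patrons": "Full", "Hungry": "No"}, False),
]
print(learn_tree(examples, ["Patrons", "Hungry"]))
```

On this toy data Patrons is chosen as the root, and Hungry is only tested under the Full branch, where the examples are still mixed.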
![Page 39: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/39.jpg)
Information Gain [E.g.]
For the training set : p = n = 6, I(6/12, 6/12) = 1 bit
Consider Patrons? and Type? [and others]
Patrons has the highest IG of all attributes and so is chosen as the root
Why is IG of Type? equal to zero?
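$$IG(Patrons) = 1 - \left[\frac{2}{12} I(0,1) + \frac{4}{12} I(1,0) + \frac{6}{12} I\left(\frac{2}{6}, \frac{4}{6}\right)\right] = 0.541 \text{ bits}$$
$$IG(Type) = 1 - \left[\frac{2}{12} I\left(\frac{1}{2}, \frac{1}{2}\right) + \frac{2}{12} I\left(\frac{1}{2}, \frac{1}{2}\right) + \frac{4}{12} I\left(\frac{2}{4}, \frac{2}{4}\right) + \frac{4}{12} I\left(\frac{2}{4}, \frac{2}{4}\right)\right] = 0 \text{ bits}$$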
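The two values can be checked numerically. In the sketch below the per-value counts (p_i, n_i) are read off the fractions in the formulas above; the function names are assumptions made for this example.

```python
from math import log2


def I(*probs):
    """Entropy in bits; zero-probability terms are skipped."""
    return sum(-q * log2(q) for q in probs if q > 0)


def information_gain(p, n, subsets):
    """subsets holds one (p_i, n_i) pair of counts per value of the attribute."""
    remainder = sum((pi + ni) / (p + n) * I(pi / (pi + ni), ni / (pi + ni))
                    for pi, ni in subsets)
    return I(p / (p + n), n / (p + n)) - remainder


patrons = [(0, 2), (4, 0), (2, 4)]                  # counts taken from I(0,1), I(1,0), I(2/6,4/6)
restaurant_type = [(1, 1), (1, 1), (2, 2), (2, 2)]  # counts taken from the four Type terms

print(round(information_gain(6, 6, patrons), 3))          # 0.541
print(round(information_gain(6, 6, restaurant_type), 3))  # 0.0
```

Every Type subset is split exactly half positive and half negative, just like the full set, so splitting on Type leaves the entropy at 1 bit and the gain at zero.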
![Page 40: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/40.jpg)
Decision Tree Learning
Plenty of other measures for ‘best’ attributes possible...
![Page 41: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/41.jpg)
Back to The Example...
‘Training data’
![Page 42: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/42.jpg)
Decision Tree Learned
Based on the 12 examples; substantially simpler solution than ‘true’ tree
More complex hypothesis isn’t justified by small amount of data
![Page 43: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/43.jpg)
Performance Measurement
How do we know that h ≈ f?
Or : how the h*ll do we know that our decision tree performs well?
Most often we don’t know... for sure
![Page 44: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/44.jpg)
Performance Measurement
However, prediction quality can be estimated using theory from computational / statistical learning theory / PAC-learning
Or we could, for example, simply try h on a new test set of examples
The crux being of course that there should actually be a new test set...
If no test set is available several possibilities exist for creating ‘training’ and ‘test’ sets from the available data
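One of those possibilities is a random hold-out split, sketched below. The 25% test fraction, the fixed seed, and the toy even/odd data are assumptions chosen for illustration; cross-validation would be another option.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and hold out a fraction as the test set."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (training set, test set)


def accuracy(h, test_set):
    """Fraction of test examples on which hypothesis h gives the correct answer."""
    return sum(h(x) == y for x, y in test_set) / len(test_set)


# Hypothetical data: label each integer by whether it is even.
examples = [(x, x % 2 == 0) for x in range(20)]
train, test = train_test_split(examples)
print(len(train), len(test))  # 15 5
print(accuracy(lambda x: x % 2 == 0, test))  # 1.0
```

The hypothesis should be constructed from `train` only; `test` is touched once, at the end, to estimate how well it predicts.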
![Page 45: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/45.jpg)
Performance Measurement
Learning curve : ‘%’ correct on test set as function of training set size
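A learning curve can be sketched by training on growing prefixes of the data and scoring each hypothesis on a fixed, separate test set. Everything concrete below is an assumption for illustration: the threshold target function, the one-dimensional nearest-neighbour learner (used instead of a decision tree to keep the sketch short), and the set sizes.

```python
import random


def target(x):
    """Hypothetical target function f."""
    return x > 0.5


def nearest_neighbour_learner(training_set):
    """Return a hypothesis that copies the label of the closest training example."""
    def h(x):
        _, label = min(training_set, key=lambda ex: abs(ex[0] - x))
        return label
    return h


def learning_curve(learner, examples, test_set, sizes):
    """For each training set size m, return (m, % correct on the test set)."""
    curve = []
    for m in sizes:
        h = learner(examples[:m])
        correct = sum(h(x) == y for x, y in test_set)
        curve.append((m, 100.0 * correct / len(test_set)))
    return curve


rng = random.Random(0)
examples = [(x, target(x)) for x in (rng.random() for _ in range(100))]
test_set = [(x, target(x)) for x in (rng.random() for _ in range(200))]

for m, pct in learning_curve(nearest_neighbour_learner, examples, test_set, [1, 5, 20, 100]):
    print(f"{m:3d} training examples: {pct:.1f}% correct")
```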
![Page 46: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/46.jpg)
Bad Conduct in AI
Training on the test set!
May happen before you know it
Often very hard to justify... if at all possible
All I can say is : try to avoid it
![Page 47: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/47.jpg)
Ensemble-Learning-in-1-Slide
Idea : collection [ensemble] of hypotheses is used / predictions are combined
Motivation : hope that it is much less likely to misclassify [obviously!]
E.g. independence can be exploited
Examples : majority voting / boosting
Ensemble learning simply creates new, more expressive hypothesis space
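Majority voting and the benefit of independence can be sketched as follows. The three threshold hypotheses and the 10% individual error rate are made-up numbers, and full independence between the hypotheses' errors is an idealized assumption that rarely holds exactly in practice.

```python
from collections import Counter
from math import comb


def majority_vote(hypotheses, x):
    """Combine the predictions of an ensemble by taking the most common answer."""
    votes = Counter(h(x) for h in hypotheses)
    return votes.most_common(1)[0][0]


def majority_error(k, error):
    """P(majority of k hypotheses is wrong), for odd k, when each errs
    independently with probability `error`."""
    return sum(comb(k, i) * error ** i * (1 - error) ** (k - i)
               for i in range(k // 2 + 1, k + 1))


# Hypothetical ensemble of three threshold classifiers.
ensemble = [lambda x: x > 0.4, lambda x: x > 0.5, lambda x: x > 0.6]
print(majority_vote(ensemble, 0.55))      # True
print(round(majority_error(5, 0.1), 5))   # 0.00856
```

With five independent hypotheses that are each wrong 10% of the time, the majority is wrong in under 1% of cases, which is the hope expressed in the motivation.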
![Page 48: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/48.jpg)
Summary
In general : learning needed for unknown environments or lazy designers
Learning agent = performance element + learning element [Chapter 2]
Supervised learning : the aim is to find simple hypothesis [approximately] consistent with training examples
Decision tree learning using IG
Difficult to measure learning performance
Learning curve
![Page 49: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/49.jpg)
Next Week
More...
![Page 50: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/50.jpg)
![Page 51: Ai in game programming it university of copenhagen Learning From Observations Marco Loog](https://reader033.vdocuments.mx/reader033/viewer/2022051516/56649d405503460f94a19e46/html5/thumbnails/51.jpg)