Artificial Intelligence COMP-424: lecture notes by Alexandre Tomberg (Prof. Joelle Pineau, McGill University, Winter 2009), 2009-04-16

Artificial Intelligence COMP-424

Lecture notes by Alexandre Tomberg

Prof. Joelle Pineau McGill University

Winter 2009

Table of Contents
December-03-08, 12:16 PM

I. History of AI

II. Search
   1. Uninformed Search Methods
   2. Informed Search
   3. Search for Optimization Problems
   4. Game Playing
   5. Constraint Satisfaction

III. Logic
   1. Knowledge Representation: Logic
   2. First Order Logic
   3. Planning
   4. Spatial Planning

IV. Probability
   1. Reasoning under Uncertainty
   2. Bayesian Networks

V. Machine Learning
   1. Machine Learning: Parameter Estimation
   2. Learning with Missing Values
   3. Supervised Learning
   4. Neural Nets
   5. Decision Trees

VI. Decision Theory
   1. Utility Theory
   2. Markov Decision Processes (MDPs)
   3. Reinforcement Learning

History of AI
January-06-09, 10:03 AM

Uninformed Search Methods
January-08-09, 10:06 AM

Generic Search Algorithm:

Algorithm 1: BFS
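
The listing for this algorithm is not legible in these notes, so the following is only a minimal Python sketch of breadth-first search, assuming the state space is given as an adjacency dictionary. The names `bfs`, `graph`, and the example states are illustrative and do not come from the lecture.

```python
from collections import deque

def bfs(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    frontier = deque([[start]])   # FIFO queue of partial paths
    visited = {start}             # states already placed on the queue
    while frontier:
        path = frontier.popleft()             # expand the shallowest path first
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

# Hypothetical example graph (assumption, for illustration only).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
path = bfs(graph, "A", "E")
```

Using a FIFO queue is what makes the expansion order breadth-first; swapping it for a stack would turn the same loop into depth-first search.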

Algorithm 2: DFS

Algorithm 3: Depth limited search

Algorithm 4: Iterative Deepening
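
As a rough illustration of how Algorithms 3 and 4 fit together, here is a small Python sketch in which iterative deepening simply reruns depth-limited search with a growing limit. The function names, the adjacency-dictionary representation, and the `max_depth` cap are assumptions made for this sketch, not details taken from the lecture.

```python
def depth_limited(graph, node, goal, limit, path=None):
    """Depth-first search that refuses to go deeper than `limit` edges."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return None
    for nxt in graph.get(node, []):
        if nxt not in path:                   # avoid cycles along the current path
            found = depth_limited(graph, nxt, goal, limit - 1, path)
            if found is not None:
                return found
    return None

def iterative_deepening(graph, start, goal, max_depth=50):
    """Run depth-limited search with limits 0, 1, 2, ... until a path is found."""
    for limit in range(max_depth + 1):
        found = depth_limited(graph, start, goal, limit)
        if found is not None:
            return found
    return None

# Hypothetical example graph (assumption, for illustration only).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["E"], "E": []}
path = iterative_deepening(graph, "A", "E")
```

The repeated shallow searches cost extra work, but memory use stays that of a depth-first search while the first solution found is a shallowest one.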

Informed Search
January-13-09, 10:02 AM

Algorithms
January-13-09, 10:34 AM

Algorithm #1: Best-First Search

Algorithm #2: Heuristic Search

Algorithm #3: A* search
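
The body of this algorithm is missing from the notes. The sketch below shows one common way to write A* in Python, ordering the frontier by f = g + h with a binary heap. The graph format, the heuristic table `h`, and the example costs are all assumptions chosen for illustration.

```python
import heapq

def a_star(graph, h, start, goal):
    """graph[n] lists (neighbor, step_cost); h[n] estimates the cost from n to goal."""
    frontier = [(h[start], 0, start, [start])]   # entries are (f = g + h, g, node, path)
    best_g = {start: 0}                          # cheapest known cost to each node
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue                             # stale entry, a cheaper route was found
        for nxt, cost in graph.get(node, []):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h[nxt], g2, nxt, path + [nxt]))
    return None

# Hypothetical weighted graph and admissible heuristic (assumptions).
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 5)], "B": [("G", 1)], "G": []}
h = {"S": 3, "A": 3, "B": 1, "G": 0}
result = a_star(graph, h, "S", "G")
```

With an admissible heuristic such as the one in this example, the first time the goal is popped its cost is optimal; here the cheapest route goes through both A and B rather than taking either direct edge.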

Search for Optimization Problems
January-15-09, 10:05 AM

Iterative Improvement Algorithms
January-15-09, 10:05 AM

Algorithm #1: Hill Climbing

Algorithm #2: Simulated Annealing
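
To make the contrast between the two iterative improvement algorithms concrete, here is a hedged Python sketch of both on a toy maximization problem. The objective `f`, the `neighbors` function, the geometric cooling schedule, and all parameter values are assumptions of this sketch, not the lecture's definitions.

```python
import math
import random

def hill_climb(f, x, neighbors):
    """Move to the best neighbor until no neighbor improves f."""
    while True:
        best = max(neighbors(x), key=f)
        if f(best) <= f(x):
            return x                      # local maximum reached
        x = best

def simulated_annealing(f, x, neighbors, t0=10.0, cooling=0.95, steps=500, rng=None):
    """Accept a worse move with probability exp(delta / T); T decays geometrically."""
    rng = rng or random.Random(0)         # fixed seed so the sketch is repeatable
    best, t = x, t0
    for _ in range(steps):
        cand = rng.choice(neighbors(x))
        delta = f(cand) - f(x)
        if delta > 0 or rng.random() < math.exp(delta / t):
            x = cand
        if f(x) > f(best):
            best = x                      # remember the best state seen so far
        t = max(t * cooling, 1e-9)        # keep the temperature strictly positive
    return best

# Toy objective with a single peak at x = 7 (assumption).
f = lambda x: -(x - 7) ** 2
neighbors = lambda x: [x - 1, x + 1]
peak = hill_climb(f, 0, neighbors)
annealed = simulated_annealing(f, 0, neighbors)
```

Hill climbing stops at the first local maximum, whereas simulated annealing can still take downhill moves while the temperature is high, which is what lets it escape local maxima on harder landscapes.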

Genetic Algorithms
January-15-09, 11:06 AM

Game Playing
January-20-09, 10:03 AM

Minimax Search
January-20-09, 10:07 AM

α-β Pruning
January-20-09, 10:44 AM

Constraint Satisfaction
January-22-09, 10:10 AM

Knowledge Representation: Logic
January-27-09, 10:10 AM

First Order Logic
February-18-09, 7:50 PM

Planning
February-03-09, 10:11 AM

Partial Order Planning Algorithm
February-18-09, 8:55 PM

Least Commitment

Analysis

Spatial Planning
February-03-09, 10:32 AM

Reasoning under Uncertainty
February-18-09, 9:13 PM

If we know probabilities, what actions should we choose?

Bayesian Networks
March-19-09, 3:26 PM

Machine Learning: Parameter Estimation
March-03-09, 10:09 AM

Statistical Parameter Fitting
March-03-09, 10:34 AM

Maximum Likelihood Estimate (MLE)
March-03-09, 10:53 AM

Learning with Missing Values
March-10-09, 10:14 AM

Basic EM algorithm:
1. Start with an initial parameter setting.
2. Repeat:
   a. Expectation Step: Complete the data by assigning values to missing items.
   b. Maximization Step: Compute the maximum log-likelihood and new parameters on the complete data.
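
As one possible concrete instance of the basic EM loop, the Python sketch below treats the cluster label of each point as the missing item and the two cluster means as the parameters. Choosing equal-variance 1-D Gaussians (which makes the "most likely label" simply the nearest mean) is an assumption of this sketch, as are the names `hard_em`, `xs`, and the data.

```python
def hard_em(xs, means, iters=20):
    """Hard EM for equal-variance 1-D Gaussians; the missing values are cluster labels."""
    for _ in range(iters):
        # E-step: complete the data with the most likely label for each point.
        labels = [min(range(len(means)), key=lambda k: (x - means[k]) ** 2) for x in xs]
        # M-step: maximum-likelihood means computed on the completed data.
        new_means = []
        for k in range(len(means)):
            members = [x for x, z in zip(xs, labels) if z == k]
            new_means.append(sum(members) / len(members) if members else means[k])
        if new_means == means:
            break                          # parameters stopped changing
        means = new_means
    return means, labels

# Hypothetical data and initial parameter setting (assumptions).
xs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
means, labels = hard_em(xs, [0.0, 6.0])
```

This "hard" variant assigns a single value to each missing item; a soft variant would instead weight every possible value by its posterior probability.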

Soft EM for a general Bayes net:

Machine Learning: Clustering
March-19-09, 4:21 PM

Supervised Learning
March-10-09, 10:55 AM

Overfitting
April-14-09, 8:35 PM

Finding Parameters in General
April-14-09, 9:05 PM

Gradient Descent:
Given w0, for i = 0, 1, 2, ... do:

Repeat as long as necessary.
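
The update performed inside the gradient descent loop is not legible in these notes. A minimal Python sketch, assuming the usual step w ← w − α ∇J(w) with a fixed step size α and a stopping test on the step length, could look like this. The names `gradient_descent`, `alpha`, `tol`, and the example objective are assumptions.

```python
def gradient_descent(grad, w0, alpha=0.1, tol=1e-8, max_iters=10_000):
    """Repeat w <- w - alpha * grad(w) until the step becomes negligible."""
    w = w0
    for _ in range(max_iters):
        step = alpha * grad(w)
        w = w - step
        if abs(step) < tol:               # assumed stopping rule: tiny update
            break
    return w

# Toy objective J(w) = (w - 3)^2 with gradient 2 (w - 3) (assumption).
w_star = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

On this quadratic the distance to the minimum shrinks by a constant factor each iteration; a step size that is too large would instead make the iterates diverge.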

Batch vs. Online Optimization
April-14-09, 9:38 PM

What we should know:

Neural Nets
March-19-09, 4:48 PM

Feed Forward Neural Networks
April-15-09, 10:48 AM

Forward pass:
for layer k = 1 ... K do:
   1. Compute the output of all units in layer k.
   2. Copy this output as the input to the next layer.
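
The forward pass can be sketched in a few lines of Python. The sketch assumes sigmoid units and represents each layer as a list of (weights, bias) pairs; both choices, along with the example weights, are illustrative assumptions rather than the lecture's notation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(layers, x):
    """layers[k] is a list of units; each unit is a (weights, bias) pair."""
    activation = x
    for layer in layers:                  # for layer k = 1 ... K
        # Compute the output of all units in this layer; the result
        # becomes the input to the next layer on the following iteration.
        activation = [
            sigmoid(sum(w * a for w, a in zip(weights, activation)) + bias)
            for weights, bias in layer
        ]
    return activation

# Hypothetical 2-2-1 network (assumption): zero hidden weights give 0.5 outputs.
layers = [
    [([0.0, 0.0], 0.0), ([0.0, 0.0], 0.0)],   # hidden layer, two units
    [([1.0, 1.0], -1.0)],                     # output layer, one unit
]
output = forward(layers, [1.0, 2.0])
```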

Backpropagation algorithm:
1. Forward pass: compute the output of the network going from input layer to output layer.
2. Backward pass: compute the gradient of the error for every weight inside the network going from output layer towards the input layer.
3. Update: update the weights using the standard rule:
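
The three steps can be illustrated with a small Python sketch for a network with one hidden layer. It assumes sigmoid units, squared error E = ½(o − y)², and the gradient descent update w ← w − α ∂E/∂w as the "standard rule"; the update formula itself is not legible in these notes, so that choice and all names and weights below are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(W1, W2, x, y, alpha=0.5):
    """One backpropagation update. W1[j] holds hidden unit j's weights with the
    bias last; W2 holds the output unit's weights with the bias last."""
    xb = x + [1.0]                                    # append constant bias input
    # 1. Forward pass: input layer -> hidden layer -> output.
    h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in W1]
    hb = h + [1.0]
    o = sigmoid(sum(w * v for w, v in zip(W2, hb)))
    # 2. Backward pass: error gradients, output layer first.
    delta_o = (o - y) * o * (1 - o)
    delta_h = [delta_o * W2[j] * h[j] * (1 - h[j]) for j in range(len(h))]
    # 3. Update: w <- w - alpha * dE/dw (assumed rule).
    W2 = [w - alpha * delta_o * v for w, v in zip(W2, hb)]
    W1 = [[w - alpha * delta_h[j] * v for w, v in zip(W1[j], xb)]
          for j in range(len(W1))]
    return W1, W2, 0.5 * (o - y) ** 2

# Hypothetical initial weights and a single training example (assumptions).
W1 = [[0.1, -0.2, 0.0], [0.3, 0.1, 0.0]]
W2 = [0.2, -0.1, 0.0]
errors = []
for _ in range(200):
    W1, W2, err = train_step(W1, W2, [1.0, 0.0], 1.0)
    errors.append(err)
```

The hidden-layer deltas are computed from the output weights before those weights are changed, so every gradient in a step refers to the same set of parameters.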

Overfitting in Neural Net
April-15-09, 12:56 PM

Decision Trees
April-15-09, 1:04 PM

Utility Theory
April-15-09, 1:54 PM

Utility Models:

Maximizing Expected Utility (MEU) Principle
April-15-09, 2:21 PM

What we should know:

Markov Decision Processes (MDPs)
April-15-09, 2:50 PM

Policies
April-15-09, 2:50 PM

Iterative Policy Evaluation Algorithm:
1. Start with some initial guess.
2. During iteration k, update the function for all states as follows:
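
The update formula is not legible in these notes. The Python sketch below assumes the usual Bellman backup for a fixed policy, V(s) ← Σ p · (r + γ V(s')), applied to every state from the previous estimate. The MDP encoding (`P[s][a]` as a list of `(prob, next_state, reward)` triples), the tolerance, and the two-state example are assumptions.

```python
def policy_evaluation(P, policy, gamma=0.9, tol=1e-10):
    """P[s][a] lists (prob, next_state, reward); policy[s] is the action taken in s."""
    V = {s: 0.0 for s in P}                          # 1. initial guess
    while True:
        # 2. Back up every state using the previous estimate V.
        V_new = {
            s: sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][policy[s]])
            for s in P
        }
        if max(abs(V_new[s] - V[s]) for s in P) < tol:
            return V_new
        V = V_new

# Hypothetical two-state MDP (assumption): staying in B pays 2 per step.
P = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 1.0)]},
    "B": {"stay": [(1.0, "B", 2.0)]},
}
V = policy_evaluation(P, {"A": "go", "B": "stay"})
```

For this example the fixed point can be checked by hand: V(B) = 2 / (1 − 0.9) = 20 and V(A) = 1 + 0.9 · 20 = 19.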

Searching for a Good Policy
April-15-09, 4:47 PM

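Policy Iteration Algorithm:
1. Start with an initial policy.
2. Repeat until convergence:
   a. Compute the value function of the current policy using the policy evaluation algorithm.
   b. Compute a new policy using the greedy policy update rule on that value function.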
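
A compact Python sketch of policy iteration follows. Because the symbols and the stopping condition are not legible in these notes, the sketch assumes the loop ends when the greedy update leaves the policy unchanged; the MDP encoding, helper names, and example are likewise assumptions.

```python
def evaluate(P, policy, gamma, tol=1e-10):
    """Iterative policy evaluation; P[s][a] lists (prob, next_state, reward)."""
    V = {s: 0.0 for s in P}
    while True:
        V_new = {s: sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][policy[s]])
                 for s in P}
        if max(abs(V_new[s] - V[s]) for s in P) < tol:
            return V_new
        V = V_new

def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])

def policy_iteration(P, gamma=0.9):
    policy = {s: next(iter(P[s])) for s in P}        # arbitrary initial policy
    while True:
        V = evaluate(P, policy, gamma)               # policy evaluation
        improved = {s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma))
                    for s in P}                      # greedy policy update
        if improved == policy:                       # assumed stopping test
            return policy, V
        policy = improved

# Hypothetical two-state MDP (assumption).
P = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 1.0)]},
    "B": {"stay": [(1.0, "B", 2.0)]},
}
policy, V = policy_iteration(P)
```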

Value Iteration Algorithm:
1. Start with an initial value.
2. Repeat until convergence: update the value function estimate using:
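
The update rule and stopping condition are not legible here. As a hedged illustration, the Python sketch below assumes the Bellman optimality backup V(s) ← max over a of Σ p · (r + γ V(s')) and stops once the largest change falls below a tolerance; the MDP encoding, names, and example are assumptions.

```python
def value_iteration(P, gamma=0.9, tol=1e-10):
    """P[s][a] lists (prob, next_state, reward). Returns (values, greedy policy)."""
    def backup(V, s, a):
        return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])

    V = {s: 0.0 for s in P}                          # initial value estimate
    while True:
        V_new = {s: max(backup(V, s, a) for a in P[s]) for s in P}
        converged = max(abs(V_new[s] - V[s]) for s in P) < tol
        V = V_new
        if converged:
            break
    # Read off a policy that is greedy with respect to the final estimate.
    policy = {s: max(P[s], key=lambda a: backup(V, s, a)) for s in P}
    return V, policy

# Hypothetical two-state MDP (assumption).
P = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 1.0)]},
    "B": {"stay": [(1.0, "B", 2.0)]},
}
V, policy = value_iteration(P)
```

Unlike policy iteration, no explicit policy is kept during the loop; it is extracted once from the converged value function.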

Reinforcement Learning
April-15-09, 5:38 PM

TD (order 0) Learning Algorithm:
1. Initialize the value function.
2. Repeat until feeling sick of it:
   a. Pick a start state.
   b. Repeat for every time step t:
      i. Choose an action a based on current policy π and current state s.
      ii. Take action a, observe reward r and new state s'.
      iii. Compute TD error: δ = r + γ V(s') - V(s)
      iv. Update the value function: V(s) = V(s) + α_s δ
      v. Update current state: s = s'
      vi. If s' is a terminal state, go to 2.
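
The TD(0) steps translate almost line by line into Python. In the sketch below the environment is passed in as plain functions, the value function starts at zero, and a single constant learning rate `alpha` stands in for the per-state rate; these choices, the fixed episode count, and the chain example are assumptions made for illustration.

```python
def td0(step, policy, start_state, is_terminal, episodes=500, alpha=0.1, gamma=0.9):
    """Tabular TD(0) evaluation of a fixed policy."""
    V = {}                                            # value function, 0 by default
    for _ in range(episodes):                         # outer repeat loop
        s = start_state()                             # pick a start state
        while not is_terminal(s):                     # one pass per time step
            a = policy(s)                             # action from current policy
            r, s2 = step(s, a)                        # observe reward and new state
            delta = r + gamma * V.get(s2, 0.0) - V.get(s, 0.0)   # TD error
            V[s] = V.get(s, 0.0) + alpha * delta      # update the value function
            s = s2                                    # update current state
    return V

# Hypothetical deterministic chain 0 -> 1 -> 2 -> 3 (assumption):
# state 3 is terminal and entering it pays reward 1.
V = td0(
    step=lambda s, a: (1.0 if s + a == 3 else 0.0, s + a),
    policy=lambda s: 1,
    start_state=lambda: 0,
    is_terminal=lambda s: s == 3,
)
```

In this chain the true values are 1, 0.9, and 0.81 for states 2, 1, and 0, so the learned table can be checked directly against them.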

Reinforcement Learning for Control
April-15-09, 6:35 PM
