Artificial Intelligence (COMP-424)
Lecture notes by Alexandre Tomberg
Prof. Joelle Pineau, McGill University
Winter 2009
Lecture notes Page 1
Table of Contents (December-03-08, 12:16 PM)

I. History of AI
II. Search
    1. Uninformed Search Methods
    2. Informed Search
    3. Search for Optimization Problems
    4. Game Playing
    5. Constraint Satisfaction
III. Logic
    1. Knowledge Representation: Logic
    2. First Order Logic
    3. Planning
    4. Spatial Planning
IV. Probability
    1. Reasoning under Uncertainty
    2. Bayesian Networks
V. Machine Learning
    1. Machine Learning: Parameter Estimation
    2. Learning with Missing Values
    3. Supervised Learning
    4. Neural Nets
    5. Decision Trees
VI. Decision Theory
    1. Utility Theory
    2. Markov Decision Processes (MDPs)
    3. Reinforcement Learning
History of AI (January-06-09, 10:03 AM)
Uninformed Search Methods (January-08-09, 10:06 AM)
Generic Search Algorithm:
Algorithm 1: BFS
Algorithm 2: DFS
Algorithm 3: Depth limited search
Algorithm 4: Iterative Deepening
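The four algorithms above differ mainly in how the frontier is managed; a minimal sketch (graphs stored as adjacency dicts is my assumption for illustration, not the notes' representation):

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search: the frontier is a FIFO queue."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

def depth_limited(graph, node, goal, limit, path=None):
    """DFS that gives up below a fixed depth limit."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return None
    for nxt in graph.get(node, []):
        if nxt not in path:  # avoid cycles along the current path
            found = depth_limited(graph, nxt, goal, limit - 1, path)
            if found:
                return found
    return None

def iterative_deepening(graph, start, goal, max_depth=20):
    """Run depth-limited search with limits 0, 1, 2, ..."""
    for limit in range(max_depth + 1):
        found = depth_limited(graph, start, goal, limit)
        if found:
            return found
    return None
```

Plain DFS is `depth_limited` with an unbounded limit; iterative deepening keeps DFS's memory footprint while recovering BFS's shallowest-solution guarantee.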
Informed Search (January-13-09, 10:02 AM)
Algorithm #1: Best-First Search
Algorithm #2: Heuristic Search
Algorithms (January-13-09, 10:34 AM)
Algorithm #3: A* Search
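A* always expands the frontier node with the smallest f(n) = g(n) + h(n); a sketch, assuming (my representation, for illustration) a weighted adjacency dict and a heuristic table:

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search: expand the node with the smallest f(n) = g(n) + h(n).

    graph: dict mapping node -> list of (neighbor, step_cost) pairs;
    h: heuristic estimates (assumed admissible) as a dict."""
    frontier = [(h[start], 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nxt, cost in graph.get(node, []):
            g2 = g + cost
            if g2 < best_g.get(nxt, float('inf')):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h[nxt], g2, nxt, path + [nxt]))
    return None, float('inf')
```

With h admissible, the first time the goal is popped its path cost is optimal.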
Search for Optimization Problems (January-15-09, 10:05 AM)
Algorithm #1: Hill Climbing
Algorithm #2: Simulated Annealing
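Simulated annealing generalizes hill climbing: uphill moves are always taken, while downhill moves are accepted with probability exp(dE / T) under a shrinking temperature. A sketch (the geometric cooling schedule and parameter values are my assumptions for illustration):

```python
import math
import random

def simulated_annealing(f, neighbor, x0, T0=1.0, cooling=0.95, steps=500, rng=None):
    """Maximize f. Accept every uphill move; accept a downhill move
    with probability exp(dE / T), where T shrinks geometrically."""
    rng = rng or random.Random(0)
    x, T = x0, T0
    best = x
    for _ in range(steps):
        x2 = neighbor(x, rng)
        dE = f(x2) - f(x)
        if dE > 0 or rng.random() < math.exp(dE / T):
            x = x2
        if f(x) > f(best):
            best = x
        T *= cooling  # cool down: fewer downhill moves over time
    return best
```

With T fixed at (near) zero this degenerates into plain hill climbing, which is why it can escape local maxima early on but still converges late in the run.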
Iterative Improvement Algorithms (January-15-09, 10:05 AM)
Genetic Algorithms (January-15-09, 11:06 AM)
Game Playing (January-20-09, 10:03 AM)
Minimax Search (January-20-09, 10:07 AM)
α-β Pruning (January-20-09, 10:44 AM)
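Minimax with α-β pruning can be sketched as follows (the `children`/`evaluate` callback interface is my assumption for illustration; the algorithm itself is the standard one):

```python
def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax with alpha-beta pruning. `children(node)` returns the
    successor states; `evaluate(node)` scores a leaf from MAX's view."""
    succ = children(node)
    if depth == 0 or not succ:
        return evaluate(node)
    if maximizing:
        value = float('-inf')
        for child in succ:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: MIN will never let play reach here
        return value
    else:
        value = float('inf')
        for child in succ:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, children, evaluate))
            beta = min(beta, value)
            if alpha >= beta:
                break  # alpha cutoff: MAX already has a better option
        return value
```

Pruning never changes the minimax value; with good move ordering it roughly doubles the searchable depth.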
Constraint Satisfaction (January-22-09, 10:10 AM)
Knowledge Representation: Logic (January-27-09, 10:10 AM)
First Order Logic (February-18-09, 7:50 PM)
Planning (February-03-09, 10:11 AM)
Partial Order Planning Algorithm (February-18-09, 8:55 PM)
Least Commitment
Analysis
Spatial Planning (February-03-09, 10:32 AM)
If we know probabilities, what actions should we choose?
Reasoning under Uncertainty (February-18-09, 9:13 PM)
Bayesian Networks (March-19-09, 3:26 PM)
Machine Learning: Parameter Estimation (March-03-09, 10:09 AM)
Statistical Parameter Fitting (March-03-09, 10:34 AM)
Maximum Likelihood Estimate (MLE) (March-03-09, 10:53 AM)
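As a concrete instance (my illustrative example, not one worked in the notes): for a Bernoulli variable, the MLE is just the empirical frequency of successes, since that value maximizes L(θ) = θ^h (1 − θ)^(n−h).

```python
def bernoulli_mle(samples):
    """MLE for a Bernoulli parameter: the sample frequency of 1s,
    which maximizes L(theta) = theta^h * (1 - theta)^(n - h)."""
    return sum(samples) / len(samples)
```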
Learning with Missing Values (March-10-09, 10:14 AM)
Basic EM algorithm:
1. Start with an initial parameter setting.
2. Repeat until convergence:
   a. Expectation step: complete the data by assigning values to the missing items.
   b. Maximization step: compute the maximum log-likelihood and new parameters on the completed data.
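The two steps can be sketched on a toy problem: a 1-D mixture of two equal-weight Gaussians with known variance (this simplified model, and all parameter values below, are my assumptions for illustration):

```python
import math

def em_two_gaussians(data, mu, steps=50, sigma=1.0):
    """Soft EM for a 1-D mixture of two equal-weight Gaussians with
    known variance: the E-step computes each point's responsibility,
    the M-step re-estimates the two means from the weighted data."""
    mu1, mu2 = mu
    for _ in range(steps):
        # E-step: responsibility of component 1 for each point
        r = []
        for x in data:
            p1 = math.exp(-(x - mu1) ** 2 / (2 * sigma ** 2))
            p2 = math.exp(-(x - mu2) ** 2 / (2 * sigma ** 2))
            r.append(p1 / (p1 + p2))
        # M-step: weighted means maximize the expected log-likelihood
        mu1 = sum(ri * x for ri, x in zip(r, data)) / sum(r)
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / sum(1 - ri for ri in r)
    return mu1, mu2
```

Each iteration is guaranteed not to decrease the data's log-likelihood, though EM may still stop at a local optimum depending on the initial guess.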
Soft EM for a general Bayes net:
Machine Learning: Clustering (March-19-09, 4:21 PM)
Supervised Learning (March-10-09, 10:55 AM)
Overfitting (April-14-09, 8:35 PM)
Gradient Descent: Given w₀, for i = 0, 1, 2, ... apply the update step; repeat as long as necessary.
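The loop above can be sketched as follows, assuming the standard update wᵢ₊₁ = wᵢ − α ∇E(wᵢ) and a "step became tiny" stopping rule (both are my assumptions; the notes leave the rule unspecified here):

```python
def gradient_descent(grad, w0, alpha=0.1, tol=1e-8, max_iter=10000):
    """Given w0, repeatedly step against the gradient:
    w_{i+1} = w_i - alpha * grad(w_i), until the step is negligible."""
    w = w0
    for _ in range(max_iter):
        step = alpha * grad(w)
        w -= step
        if abs(step) < tol:
            break
    return w

# example: minimize E(w) = (w - 2)^2, whose gradient is 2(w - 2)
w_star = gradient_descent(lambda w: 2 * (w - 2), w0=0.0)
```

Too large an α overshoots and diverges; too small an α converges slowly, which motivates the line-search and adaptive-step variants.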
Finding Parameters in General (April-14-09, 9:05 PM)
Batch vs. Online Optimization (April-14-09, 9:38 PM)
What we should know:
Neural Nets (March-19-09, 4:48 PM)
Forward pass:
for layer k = 1 ... K do:
  - Compute the output of all units in layer k.
  - Copy this output as the input to the next layer.
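The loop above can be sketched as follows (sigmoid units and the weights-as-nested-lists layout are my assumptions for illustration):

```python
import math

def forward_pass(weights, x):
    """Forward pass: for layer k = 1..K, compute each unit's output
    (sigmoid of a weighted sum, bias stored last) and feed the layer's
    outputs forward as the next layer's inputs."""
    activations = x
    for layer in weights:
        activations = [
            1.0 / (1.0 + math.exp(-(sum(w * a for w, a in zip(unit[:-1], activations))
                                    + unit[-1])))
            for unit in layer
        ]
    return activations
```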
Feed Forward Neural Networks (April-15-09, 10:48 AM)
Backpropagation algorithm:
1. Forward pass: compute the output of the network, going from the input layer to the output layer.
2. Backward pass: compute the gradient of the error for every weight in the network, going from the output layer towards the input layer.
3. Update: update the weights using the standard rule.
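One full step of the three phases, for a tiny 2-2-1 sigmoid network with squared error E = (y − t)² / 2 (the network size, error function, and learning rate are my assumptions for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backprop_step(W1, W2, x, t, lr=1.0):
    """One backpropagation step for a 2-2-1 sigmoid network.
    W1[j] = [w_j1, w_j2, bias] for hidden unit j; W2 = [v1, v2, bias]."""
    # 1. forward pass: input -> hidden -> output
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in W1]
    y = sigmoid(W2[0] * h[0] + W2[1] * h[1] + W2[2])
    # 2. backward pass: error signal at the output, then at each hidden unit
    d_out = (y - t) * y * (1 - y)
    d_hid = [d_out * W2[j] * h[j] * (1 - h[j]) for j in (0, 1)]
    # 3. update every weight: w <- w - lr * dE/dw
    W2 = [W2[0] - lr * d_out * h[0], W2[1] - lr * d_out * h[1], W2[2] - lr * d_out]
    W1 = [[W1[j][0] - lr * d_hid[j] * x[0],
           W1[j][1] - lr * d_hid[j] * x[1],
           W1[j][2] - lr * d_hid[j]] for j in (0, 1)]
    return W1, W2, y
```

Each hidden unit's error signal is the output's error signal pushed back through the connecting weight and scaled by the local sigmoid derivative h(1 − h); deeper networks just repeat that step layer by layer.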
Overfitting in Neural Nets (April-15-09, 12:56 PM)
Decision Trees (April-15-09, 1:04 PM)
Utility Theory (April-15-09, 1:54 PM)
Utility Models:
Maximizing Expected Utility (MEU) Principle (April-15-09, 2:21 PM)
What we should know:
Markov Decision Processes (MDPs) (April-15-09, 2:50 PM)
Policies (April-15-09, 2:50 PM)
Iterative Policy Evaluation Algorithm:
1. Start with some initial guess V₀.
2. During iteration k, update the value function for all states using the Bellman backup for the fixed policy.
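The iteration can be sketched as follows, assuming (my representation, for illustration) a small tabular MDP stored as nested dicts, with the synchronous backup V(s) ← R(s, π(s)) + γ Σ_{s'} P(s' | s, π(s)) V(s'):

```python
def policy_evaluation(P, R, policy, gamma=0.9, iters=200):
    """Iterative policy evaluation: start from V = 0 and repeatedly apply
    V(s) <- R(s, pi(s)) + gamma * sum_s' P(s' | s, pi(s)) V(s').
    P[s][a] is a dict {next_state: prob}; R[s][a] is a scalar reward."""
    states = list(P)
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        # synchronous update: every state reads the previous iteration's V
        V = {s: R[s][policy[s]]
                + gamma * sum(p * V[s2] for s2, p in P[s][policy[s]].items())
             for s in states}
    return V
```

Because the backup is a γ-contraction, the error shrinks by a factor of γ per sweep, so the iterates converge to V^π from any initial guess.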
Searching for a Good Policy (April-15-09, 4:47 PM)
Policy Iteration Algorithm:
1. Start with an initial policy π₀.
2. Compute V^π using the policy evaluation algorithm.
3. Compute π' using the greedy policy update rule on V^π.
4. Repeat until the policy no longer changes.
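A self-contained sketch of the loop, again assuming a small tabular MDP stored as nested dicts (my representation, not from the notes):

```python
def policy_iteration(P, R, gamma=0.9, eval_iters=200):
    """Policy iteration: alternate (1) policy evaluation and (2) greedy
    policy improvement until the policy stops changing.
    P[s][a] = {next_state: prob}; R[s][a] = immediate reward."""
    states = list(P)
    policy = {s: next(iter(P[s])) for s in states}  # arbitrary initial policy
    while True:
        # policy evaluation: V^pi by repeated Bellman backups
        V = {s: 0.0 for s in states}
        for _ in range(eval_iters):
            V = {s: R[s][policy[s]]
                    + gamma * sum(p * V[s2] for s2, p in P[s][policy[s]].items())
                 for s in states}
        # greedy improvement: pi'(s) = argmax_a Q(s, a)
        new_policy = {
            s: max(P[s], key=lambda a: R[s][a]
                   + gamma * sum(p * V[s2] for s2, p in P[s][a].items()))
            for s in states}
        if new_policy == policy:
            return policy, V
        policy = new_policy
```

Each improvement step is monotone, and with finitely many policies the loop must terminate at an optimal policy.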
Value Iteration Algorithm:
1. Start with an initial value function V₀.
2. Update the value function estimate using the Bellman optimality backup.
3. Repeat until the value estimates converge.
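The steps above can be sketched with the backup V(s) ← max_a [R(s, a) + γ Σ_{s'} P(s' | s, a) V(s')], using the same nested-dict MDP representation assumed earlier (my choice, for illustration):

```python
def value_iteration(P, R, gamma=0.9, tol=1e-6):
    """Value iteration: apply the Bellman optimality backup
    V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
    until the largest per-state change drops below tol."""
    V = {s: 0.0 for s in P}
    while True:
        V2 = {s: max(R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())
                     for a in P[s])
              for s in P}
        if max(abs(V2[s] - V[s]) for s in P) < tol:
            return V2
        V = V2
```

Unlike policy iteration, no explicit policy is kept during the loop; a greedy policy is read off from the final V.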
Reinforcement Learning (April-15-09, 5:38 PM)
TD(0) Learning Algorithm:
1. Initialize the value function V.
2. Repeat (until you get sick of it):
   a. Pick a start state s.
   b. Repeat for every time step t:
      i. Choose an action a based on the current policy π and current state s.
      ii. Take action a; observe reward r and new state s'.
      iii. Compute the TD error: δ = r + γ V(s') − V(s).
      iv. Update the value function: V(s) = V(s) + α_s δ.
      v. Update the current state: s = s'.
      vi. If s' is a terminal state, go to step 2.
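The inner update can be sketched as a batch TD(0) pass over logged transitions (the episode format, a fixed step size α, and terminal states keeping V = 0 are my assumptions for illustration):

```python
def td0(episodes, alpha=0.1, gamma=1.0):
    """TD(0) value estimation from logged transitions: for each step
    (s, r, s'), apply V(s) <- V(s) + alpha * (r + gamma*V(s') - V(s)).
    States never seen as a source (e.g. terminals) keep V = 0."""
    V = {}
    for episode in episodes:
        for s, r, s2 in episode:
            delta = r + gamma * V.get(s2, 0.0) - V.get(s, 0.0)  # TD error
            V[s] = V.get(s, 0.0) + alpha * delta
    return V
```

Unlike Monte Carlo estimation, each update bootstraps from the current estimate V(s') instead of waiting for the episode's full return, so learning happens at every time step.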
Reinforcement Learning for Control (April-15-09, 6:35 PM)