
15-381: Artificial Intelligence

Probabilistic Reasoning and Inference

Advantages of probabilistic reasoning

• Appropriate for complex, uncertain environments - Will it rain tomorrow?

• Applies naturally to many domains - a robot predicting the direction of the road, biology, the Microsoft Word paper clip assistant

• Allows us to generalize acquired knowledge and incorporate prior beliefs

- Medical diagnosis

• Easy to integrate different information sources - a robot's sensors

Examples: Unmanned vehicles

Examples: Speech processing

Example: Biological data

ATGAAGCTACTGTCTTCTATCGAACAAGCATGCGATATTTGCCGACTTAAAAAGCTCAAGTGCTCCAAAGAAAAACCGAAGTGCGCCAAGTGTCTGAAGAACAACTGGGAGTGTCGCTACTCTCCCAAAACCAAAAGGTCTCCGCTGACTAGGGCACATCTGACAGAAGTGGAATCAAGGCTAGAAAGACTGGAACAGCTATTTCTACTGATTTTTCCTCGAGAAGACCTTGACATGATT

Basic notations

• Random variable - refers to an element / event whose status is unknown: A = "it will rain tomorrow"

• Domain (usually denoted by Ω) - the set of values a random variable can take (illustrated in the sketch after this list):

- "A = The stock market will go up this year": Binary

- "A = Number of Steelers wins in 2007": Discrete

- "A = % change in Google stock in 2007": Continuous
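To make the three domain types concrete, here is a toy sketch, not from the original slides; the RandomVariable class and its fields are invented for illustration:

```python
# A toy representation of random variables with the three domain types
# named above (binary, discrete, continuous); names are made up.
from dataclasses import dataclass

@dataclass
class RandomVariable:
    name: str
    domain: object  # a set of values, or a description of an interval

binary     = RandomVariable("the stock market will go up this year", {True, False})
discrete   = RandomVariable("number of Steelers wins in 2007", set(range(17)))
continuous = RandomVariable("% change in Google stock in 2007", "any real number")

for rv in (binary, discrete, continuous):
    print(rv.name, "->", rv.domain)
```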

Axioms of probability (Kolmogorov's axioms)

• A variety of useful facts can be derived from just three axioms:

1. 0 ≤ P(A) ≤ 1
2. P(true) = 1, P(false) = 0
3. P(A ∨ B) = P(A) + P(B) – P(A ∧ B)


Example: P(Steelers win the 05-06 season) = 1


There have been several other attempts to provide a foundation for probability theory. Kolmogorov's axioms are the most widely used.
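As a quick sanity check, the three axioms can be verified numerically on a small sample space; a minimal sketch, assuming a fair six-sided die:

```python
# Checking Kolmogorov's axioms on a small discrete sample space
# (a fair six-sided die); an editorial sketch, not from the slides.
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
p = {outcome: Fraction(1, 6) for outcome in omega}

def P(event):
    """Probability of an event, i.e. a subset of the sample space."""
    return sum(p[o] for o in event & omega)

A = {1, 2, 3}   # "roll is at most 3"
B = {2, 4, 6}   # "roll is even"

assert 0 <= P(A) <= 1                       # axiom 1
assert P(omega) == 1 and P(set()) == 0      # axiom 2: P(true)=1, P(false)=0
assert P(A | B) == P(A) + P(B) - P(A & B)   # axiom 3 (inclusion-exclusion)
print("all three axioms hold on this example")
```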

Example of using the axioms

• Assume a probability for dice as follows:
  P(1) = P(2) = P(3) = 0.2
  P(4) = P(5) = 0.1
  P(6) = ?
• P({heads, tails}) = ?
• P(heads) + (1 – P(tails)) = ?

Using the axioms

• How can we use the axioms to prove that P(¬A) = 1 – P(A)?
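One standard derivation from the three axioms, sketched here in LaTeX:

```latex
% P(not A) = 1 - P(A), derived from Kolmogorov's axioms
\begin{align*}
P(A \lor \lnot A)  &= P(\mathrm{true}) = 1                           && \text{(axiom 2)} \\
P(A \lor \lnot A)  &= P(A) + P(\lnot A) - P(A \land \lnot A)         && \text{(axiom 3)} \\
P(A \land \lnot A) &= P(\mathrm{false}) = 0                          && \text{(axiom 2)} \\
\Rightarrow\quad 1 &= P(A) + P(\lnot A)
  \quad\Longrightarrow\quad P(\lnot A) = 1 - P(A)
\end{align*}
```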

Priors

P(rain tomorrow) = 0.2

P(no rain tomorrow) = 0.8

[Pie chart: Rain vs. No rain]

Degree of belief in an event in the absence of any other information.

Conditional probability

• What is the probability of an event given knowledge of another event?
• Example:
  - P(raining | sunny)
  - P(raining | cloudy)
  - P(raining | cloudy, cold)

Conditional probability

• P(A = 1 | B = 1): the fraction of cases where A is true given that B is true

[Diagram: Venn-style illustration with P(A) = 0.2 and P(A | B) = 0.5]

Conditional probability

• In some cases, given knowledge of one or more random variables, we can improve upon our prior belief of another random variable
• For example:
  P(slept in movie) = 0.5
  P(slept in movie | liked movie) = 1/3
  P(didn't sleep in movie | liked movie) = 2/3

Liked movie | Slept |  P
     0      |   1   | 0.3
     0      |   0   | 0.1
     1      |   0   | 0.4
     1      |   1   | 0.2

Joint distributions

• The probability that a set of random variables will take a specific value is their joint distribution.
• Notation: P(A ∧ B) or P(A, B)
• Example: P(liked movie, slept)

Liked movie | Slept |  P
     0      |   1   | 0.3
     0      |   0   | 0.1
     1      |   0   | 0.4
     1      |   1   | 0.2

Joint distribution (cont)

Evaluation of classes:

Class size | Time (regular = 2, summer = 1) | Evaluation (1-3)
    43     |               2                |        1
    13     |               1                |        3
    51     |               2                |        2
    15     |               2                |        3
    65     |               2                |        1
    12     |               1                |        2
    34     |               2                |        3
    10     |               1                |        2

P(class size > 20) = 0.5

P(summer) = 1/3

P(class size > 20, summer) = ?


P(class size > 20, summer) = 0


P(eval = 1) = 2/9

P(class size > 20, eval = 1) = 2/9
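These queries can be checked mechanically over the table rows, counting each class as equally likely; a sketch using the eight rows recoverable above (the slide's own P(summer) = 1/3 and P(eval = 1) = 2/9 suggest the original table may have had one more row than survives here):

```python
# Joint-table queries as fractions of rows in the class table.
from fractions import Fraction

# (class_size, time, evaluation); time: 2 = regular, 1 = summer
classes = [(43, 2, 1), (13, 1, 3), (51, 2, 2), (15, 2, 3),
           (65, 2, 1), (12, 1, 2), (34, 2, 3), (10, 1, 2)]

def P(pred):
    return Fraction(sum(1 for c in classes if pred(c)), len(classes))

print(P(lambda c: c[0] > 20))                # P(class size > 20) = 1/2
print(P(lambda c: c[0] > 20 and c[1] == 1))  # P(class size > 20, summer) = 0
```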

Chain rule

• The joint distribution can be specified in terms of conditional probability: P(A, B) = P(A | B) · P(B)
• Together with Bayes rule (which is actually derived from it), this is one of the most powerful rules in probabilistic reasoning

Bayes rule

• One of the most important rules for AI usage.
• Derived from the chain rule: P(A, B) = P(A | B) P(B) = P(B | A) P(A)
• Thus,

(Thomas Bayes was an English clergyman who set out his theory of probability in 1764.)

P(A | B) = P(B | A) P(A) / P(B)

Bayes rule (cont)

Often it is useful to derive the rule a bit further:

P(A | B) = P(B | A) P(A) / P(B) = P(B | A) P(A) / ∑A P(B | A) P(A)

This results from: P(B) = ∑A P(B, A)

[Diagram: the event B split into the two parts P(B, A = 1) and P(B, A = 0)]

Using Bayes rule

• Cards game: Place your bet on the location of the King!
• After one card is revealed: do you want to change your bet?

Using Bayes rule

Computing the (posterior) probability: P(C = k | selB)

A   B   C

P(C = k | selB) = P(selB | C = k) P(C = k) / P(selB)          (Bayes rule)

                = P(selB | C = k) P(C = k) / [P(selB | C = k) P(C = k) + P(selB | C ≠ k) P(C ≠ k)]

Using Bayes rule

A   B   C

With priors P(A = k) = P(B = k) = P(C = k) = 1/3, and the dealer revealing B with probability 1/2 when the King is under A and probability 1 when it is under C:

P(C = k | selB) = (1 × 1/3) / (1 × 1/3 + 1/2 × 1/3) = (1/3) / (1/2) = 2/3

Important points

• Random variables
• Chain rule
• Bayes rule
• Joint distribution, independence, conditional independence

Joint distributions

• The probability that a set of random variables will take a specific value is their joint distribution.
• Requires a joint probability table to specify the possible assignments
• The table can grow very rapidly…

Liked movie | Slept |  P
     0      |   1   | 0.3
     0      |   0   | 0.1
     1      |   0   | 0.4
     1      |   1   | 0.2

How can we decrease the number of rows in the table?
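To see how fast the full joint grows, note that n binary variables need 2^n table entries (2^n − 1 free parameters); a one-line sketch:

```python
# Growth of the full joint probability table with n binary variables.
for n in (2, 5, 10, 20, 30):
    print(n, 2**n)   # 20 variables already need about a million entries
```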
