Artificial Intelligence
Chapter 19: Reasoning with Uncertain Information

Biointelligence Lab, School of Computer Sci. & Eng.
Seoul National University
Outline

- Review of Probability Theory
- Probabilistic Inference
- Bayes Networks
- Patterns of Inference in Bayes Networks
- Uncertain Evidence
- D-Separation
- Probabilistic Inference in Polytrees

(C) 1999-2009 SNU CSE Biointelligence Lab
19.1 Review of Probability Theory (1/4)

- Random variables: V1, V2, ..., Vk
- Joint probability: p(V1 = v1, V2 = v2, ..., Vk = vk)

Ex. (B (BAT_OK), M (MOVES), L (LIFTABLE), G (GAUGE))

  (B, M, L, G)                 Joint Probability
  (True, True, True, True)     0.5686
  (True, True, True, False)    0.0299
  (True, True, False, True)    0.0135
  (True, True, False, False)   0.0007
  ...                          ...
19.1 Review of Probability Theory (2/4)

- Marginal probability

  Ex. p(B = b) = Σ_{M,L,G} p(B, M, L, G)
      p(B = b, M = m) = Σ_{L,G} p(B, M, L, G)
- Conditional probability

  p(Vi | Vj) = p(Vi, Vj) / p(Vj)

  Ex. The probability that the battery is charged given that the arm does not move:

  p(B = True | M = False) = p(B = True, M = False) / p(M = False)
19.1 Review of Probability Theory (3/4)
Figure 19.1 A Venn Diagram
19.1 Review of Probability Theory (4/4)

- Chain rule

  p(B, L, G, M) = p(B | L, G, M) p(L | G, M) p(G | M) p(M)

  In general: p(V1, V2, ..., Vk) = Π_{i=1}^{k} p(Vi | V_{i-1}, ..., V1)

- Bayes' rule

  p(Vi | Vj) = p(Vj | Vi) p(Vi) / p(Vj)

- Set notation
  - V = {V1, V2, ..., Vk}
  - p(V) is an abbreviation for p(V1, V2, ..., Vk)
19.2 Probabilistic Inference

- We wish to calculate the probability that some variable Vi has value vi, given the evidence E = e:

  p(Vi = True | E = e) = p(Vi = True, E = e) / p(E = e)

  p(P, Q, R)      0.3
  p(P, Q, ¬R)     0.2
  p(P, ¬Q, R)     0.2
  p(P, ¬Q, ¬R)    0.1
  p(¬P, Q, R)     0.05
  p(¬P, Q, ¬R)    0.1
  p(¬P, ¬Q, R)    0.05
  p(¬P, ¬Q, ¬R)   0.0
Example

  p(Q | ¬R) = p(Q, ¬R) / p(¬R)
            = [p(P, Q, ¬R) + p(¬P, Q, ¬R)] / p(¬R)
            = (0.2 + 0.1) / p(¬R) = 0.3 / p(¬R)

  p(¬Q | ¬R) = p(¬Q, ¬R) / p(¬R)
             = [p(P, ¬Q, ¬R) + p(¬P, ¬Q, ¬R)] / p(¬R)
             = (0.1 + 0.0) / p(¬R) = 0.1 / p(¬R)

  Since p(Q | ¬R) + p(¬Q | ¬R) = 1, we get p(¬R) = 0.4, and so

  p(Q | ¬R) = 0.75
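The table lookup above is easy to reproduce in a few lines of Python; a minimal sketch using exactly the eight joint entries listed on this slide:

```python
# Joint distribution over (P, Q, R), exactly as tabulated above.
joint = {
    (True, True, True): 0.3,    (True, True, False): 0.2,
    (True, False, True): 0.2,   (True, False, False): 0.1,
    (False, True, True): 0.05,  (False, True, False): 0.1,
    (False, False, True): 0.05, (False, False, False): 0.0,
}

def prob(pred):
    """Sum the joint over all assignments (p, q, r) satisfying `pred`."""
    return sum(v for k, v in joint.items() if pred(*k))

# p(Q | not R) = p(Q, not R) / p(not R)
p_q_and_not_r = prob(lambda p, q, r: q and not r)  # 0.2 + 0.1 = 0.3
p_not_r = prob(lambda p, q, r: not r)              # 0.2 + 0.1 + 0.1 + 0.0 = 0.4
print(round(p_q_and_not_r / p_not_r, 2))           # 0.75
```

Marginalization and conditioning on a full joint table are always this simple; the catch, as the rest of the chapter shows, is that the table grows exponentially in the number of variables.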
Statistical Independence

- Conditional independence

  p(Vi, Vj | V) = p(Vi | V) p(Vj | V),   V: a set of variables

  Intuition: given V, Vj tells us nothing more about Vi than we already knew by knowing V.

- Mutually conditional independence

  p(V1, V2, ..., Vk | V) = Π_{i=1}^{k} p(Vi | V_{i-1}, V_{i-2}, ..., V1, V) = Π_{i=1}^{k} p(Vi | V)

- Unconditional independence (when V is empty)

  p(V1, V2, ..., Vk) = p(V1) p(V2) ... p(Vk)
19.3 Bayes Networks (1/2)

- A directed, acyclic graph (DAG) whose nodes are labeled by random variables
- Characteristics of Bayesian networks
  - Node Vi is conditionally independent of any subset of the nodes that are not descendants of Vi, given its parents
- Prior probability
- Conditional probability table (CPT)

  p(V1, V2, ..., Vk) = Π_{i=1}^{k} p(Vi | Pa(Vi))
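The factorization above can be checked against the joint table from slide 19.1 (1/4). A minimal sketch: p(B) = 0.95, p(L) = 0.7, p(G|B) = 0.95, and p(M|B,L) = 0.9 are taken from these slides; p(M|B,¬L) = 0.05 is inferred from p(¬M|¬L) = 0.9525 on a later slide, and p(M|¬B,·) = 0 and p(G|¬B) = 0.1 are assumed values consistent with the tabulated numbers.

```python
from itertools import product

# CPTs for the block-lifting network (arcs B -> G, B -> M, L -> M).
p_B = 0.95
p_L = 0.7
p_G = {True: 0.95, False: 0.1}                    # p(G=True | B)
p_M = {(True, True): 0.9, (True, False): 0.05,    # p(M=True | B, L)
       (False, True): 0.0, (False, False): 0.0}   # dead battery: arm never moves

def bern(p, x):
    """p if x is True, else 1 - p."""
    return p if x else 1.0 - p

# Joint via the factorization p(B, M, L, G) = p(B) p(L) p(G|B) p(M|B,L).
joint = {(b, m, l, g): bern(p_B, b) * bern(p_L, l)
                       * bern(p_G[b], g) * bern(p_M[(b, l)], m)
         for b, m, l, g in product([True, False], repeat=4)}

print(round(joint[(True, True, True, True)], 4))   # 0.5686, as tabulated
print(round(joint[(True, True, True, False)], 4))  # 0.0299
print(round(sum(joint.values()), 4))               # 1.0
```

Four CPT parameters (plus the priors) determine all sixteen joint entries, which is the point of the factorization.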
19.3 Bayes Networks (2/2)

Bayes network for the block-lifting example
19.4 Patterns of Inference in Bayes Networks (1/3)

- Causal or top-down inference
  - Ex. The probability that the arm moves given that the block is liftable

  p(M | L) = p(M, B | L) + p(M, ¬B | L)
           = p(M | B, L) p(B | L) + p(M | ¬B, L) p(¬B | L)   (chain rule)
           = p(M | B, L) p(B) + p(M | ¬B, L) p(¬B)           (from the structure)
           = 0.9 × 0.95 + 0 × 0.05 = 0.855
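The two-term sum above is just a marginalization over B; a minimal sketch (p(M | ¬B, L) = 0 is the value implied by the slide's total 0.9 × 0.95 = 0.855):

```python
# Causal (top-down) inference: p(M | L), summing out the battery variable B.
p_B = 0.95
p_M_given_B_L = 0.9
p_M_given_notB_L = 0.0   # inferred: a dead battery means the arm cannot move

p_M_given_L = p_M_given_B_L * p_B + p_M_given_notB_L * (1.0 - p_B)
print(round(p_M_given_L, 3))   # 0.855
```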
19.4 Patterns of Inference in Bayes Networks (2/3)

- Diagnostic or bottom-up inference
  - Using an effect (or symptom) to infer a cause
  - Ex. The probability that the block is not liftable given that the arm does not move

  p(¬M | ¬L) = 0.9525                                (using causal reasoning)

  p(¬L | ¬M) = p(¬M | ¬L) p(¬L) / p(¬M)              (Bayes' rule)
             = 0.9525 × 0.3 / p(¬M) = 0.28575 / p(¬M)

  p(L | ¬M) = p(¬M | L) p(L) / p(¬M)                 (Bayes' rule)
            = 0.145 × 0.7 / p(¬M) = 0.1015 / p(¬M)

  Since p(¬L | ¬M) + p(L | ¬M) = 1, we get p(¬M) = 0.38725, and

  p(¬L | ¬M) = 0.7379
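The normalization trick used above (compute both unnormalized posteriors, then divide by their sum instead of computing p(¬M) directly) can be sketched as:

```python
# Diagnostic (bottom-up) inference: p(not L | not M) via Bayes' rule,
# normalizing over the two values of L.
p_notM_given_notL = 0.9525   # from causal reasoning
p_notM_given_L = 0.145
p_notL = 0.3

num = p_notM_given_notL * p_notL              # 0.28575, unnormalized
p_notM = num + p_notM_given_L * (1 - p_notL)  # 0.38725, the normalizer
print(round(num / p_notM, 4))                 # 0.7379
```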
19.4 Patterns of Inference in Bayes Networks (3/3)

- Explaining away
  - One piece of evidence: ¬M (the arm does not move)
  - Additional evidence: ¬B (the battery is not charged)

  p(¬L | ¬B, ¬M) = p(¬M | ¬L, ¬B) p(¬L, ¬B) / p(¬B, ¬M)         (Bayes' rule)
                 = p(¬M | ¬L, ¬B) p(¬B | ¬L) p(¬L) / p(¬B, ¬M)  (def. of conditional prob.)
                 = p(¬M | ¬L, ¬B) p(¬B) p(¬L) / p(¬B, ¬M)       (structure of the Bayes network)
                 = 0.30

  - ¬B explains ¬M, making ¬L less certain (0.30 < 0.7379)
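The explaining-away effect can be checked by brute-force enumeration of the joint. A minimal sketch; the CPT values follow the earlier slides, with p(M | ¬B, ·) = 0 inferred from the slide numbers (G is omitted since it does not affect these queries):

```python
from itertools import product

p_B, p_L = 0.95, 0.7
p_M = {(True, True): 0.9, (True, False): 0.05,   # p(M=True | B, L)
       (False, True): 0.0, (False, False): 0.0}

def bern(p, x):
    return p if x else 1.0 - p

# Joint over (B, L, M) from the network factorization.
joint = {(b, l, m): bern(p_B, b) * bern(p_L, l) * bern(p_M[(b, l)], m)
         for b, l, m in product([True, False], repeat=3)}

def cond(pred_num, pred_den):
    """p(pred_num | pred_den), both predicates over (b, l, m)."""
    num = sum(v for k, v in joint.items() if pred_num(*k) and pred_den(*k))
    den = sum(v for k, v in joint.items() if pred_den(*k))
    return num / den

# With only the evidence not-M:
print(round(cond(lambda b, l, m: not l, lambda b, l, m: not m), 4))  # 0.7379
# Adding the evidence not-B drives the posterior back down:
print(round(cond(lambda b, l, m: not l,
                 lambda b, l, m: not b and not m), 2))               # 0.3
```

The dead battery fully accounts for the motionless arm, so belief in ¬L falls back to its prior p(¬L) = 0.3.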
19.5 Uncertain Evidence

- To instantiate evidence nodes, we must be certain about the truth or falsity of the propositions they represent.
  - Each uncertain evidence node should have a child node, about which we can be certain.
  - Ex. Suppose the robot is not certain that its arm did not move.
    - Introduce M': "The arm sensor says that the arm moved"
      (we can be certain that this proposition is either true or false)
    - Use p(¬L | ¬B, ¬M') instead of p(¬L | ¬B, ¬M)
  - Ex. Suppose we are uncertain about whether or not the battery is charged.
    - Introduce G: "The battery gauge"
    - Use p(¬L | ¬G, ¬M') instead of p(¬L | ¬B, ¬M')
19.6 D-Separation (1/3)

- D-separation: direction-dependent separation
- Two nodes Vi and Vj are conditionally independent given a set of nodes E if, for every undirected path in the Bayes network between Vi and Vj, there is some node Vb on the path having one of the following three properties:
  1. Vb is in E, and both arcs on the path lead out of Vb.
  2. Vb is in E, and one arc on the path leads in to Vb and one arc leads out.
  3. Neither Vb nor any descendant of Vb is in E, and both arcs on the path lead in to Vb.
- Vb blocks the path given E when any one of these conditions holds for a path.
- If all paths between Vi and Vj are blocked, we say that E d-separates Vi and Vj.
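The three blocking rules translate directly into code. A minimal sketch that checks a single path (a full d-separation test would have to examine every path); the node and arc names follow the block-lifting network:

```python
def descendants(node, children):
    """All nodes reachable from `node` along directed arcs."""
    seen, stack = set(), [node]
    while stack:
        for c in children.get(stack.pop(), []):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def path_blocked(path, children, evidence):
    """True if some interior node Vb blocks `path` given `evidence`."""
    for i in range(1, len(path) - 1):
        prev, vb, nxt = path[i - 1], path[i], path[i + 1]
        in_from_prev = vb in children.get(prev, [])   # arc prev -> vb ?
        in_from_next = vb in children.get(nxt, [])    # arc nxt  -> vb ?
        both_out = not in_from_prev and not in_from_next
        one_in_one_out = in_from_prev != in_from_next
        both_in = in_from_prev and in_from_next
        if vb in evidence and (both_out or one_in_one_out):
            return True                               # rules 1 and 2
        if both_in and not ({vb} | descendants(vb, children)) & evidence:
            return True                               # rule 3
    return False

children = {"B": ["G", "M"], "L": ["M"]}   # arcs B -> G, B -> M, L -> M
path_g_l = ["G", "B", "M", "L"]            # the only path between G and L

print(path_blocked(path_g_l, children, {"B"}))   # True: B blocks it (rule 1)
print(path_blocked(path_g_l, children, set()))   # True: M blocks it (rule 3)
print(path_blocked(path_g_l, children, {"M"}))   # False: conditioning on M unblocks
```

The third call shows why rule 3 is direction-dependent: evidence at the collider M (or a descendant of M) opens the path rather than blocking it.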
19.6 D-Separation (2/3)

Figure 19.3 Conditional Independence via Blocking Nodes
19.6 D-Separation (3/3)

- Ex.
  - I(G, L | B), by rules 1 and 3
    - By rule 1, B blocks the (only) path between G and L, given B.
    - By rule 3, M also blocks this path given B.
  - I(G, L)
    - By rule 3, M blocks the path between G and L.
  - I(B, L)
    - By rule 3, M blocks the path between B and L.
- Even using d-separation, probabilistic inference in Bayes networks is, in general, NP-hard.
19.7 Probabilistic Inference in Polytrees (1/2)

- Polytree
  - A DAG for which there is just one path, along arcs in either direction, between any two nodes in the DAG
- A node is "above" Q
  - The node is connected to Q only through Q's parents
- A node is "below" Q
  - The node is connected to Q only through Q's immediate successors

19.7 Probabilistic Inference in Polytrees (2/2)

- Three types of evidence
  - All evidence nodes are above Q.
  - All evidence nodes are below Q.
  - There are evidence nodes both above and below Q.
Evidence Above (1/2)

- Bottom-up recursive algorithm
- Ex. p(Q | P5, P4)

  p(Q | P5, P4) = Σ_{P6,P7} p(Q, P6, P7 | P5, P4)
                = Σ_{P6,P7} p(Q | P6, P7, P5, P4) p(P6, P7 | P5, P4)
                = Σ_{P6,P7} p(Q | P6, P7) p(P6, P7 | P5, P4)             (structure of the Bayes network)
                = Σ_{P6,P7} p(Q | P6, P7) p(P6 | P5, P4) p(P7 | P5, P4)  (d-separation)
                = Σ_{P6,P7} p(Q | P6, P7) p(P6 | P5) p(P7 | P4)          (d-separation)
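The final line of the derivation above is cheap to evaluate once p(P6 | P5) and p(P7 | P4) are known. A minimal sketch; the slides give no CPT values here, so every number below is hypothetical, and the check is only that the result is a valid distribution over Q:

```python
from itertools import product

# Hypothetical CPTs for the polytree fragment above Q.
p_Q = {(True, True): 0.9, (True, False): 0.6,    # p(Q=True | P6, P7), assumed
       (False, True): 0.3, (False, False): 0.05}
p_P6_given_P5 = 0.7   # p(P6=True | P5=True), assumed
p_P7_given_P4 = 0.4   # p(P7=True | P4=True), assumed

def bern(p, x):
    return p if x else 1.0 - p

def p_q_given_p5_p4(q):
    # Sum over P6, P7 of p(Q | P6, P7) p(P6 | P5) p(P7 | P4)
    return sum(bern(p_Q[(p6, p7)], q)
               * bern(p_P6_given_P5, p6) * bern(p_P7_given_P4, p7)
               for p6, p7 in product([True, False], repeat=2))

pt, pf = p_q_given_p5_p4(True), p_q_given_p5_p4(False)
print(round(pt, 3), round(pf, 3))   # the two posteriors
print(round(pt + pf, 6))            # 1.0 -- a proper distribution over Q
```

This is the base step of the bottom-up recursion: each factor p(P6 | P5) and p(P7 | P4) is itself computed by the same pattern one level higher, as the next slide shows.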
Evidence Above (2/2)

- Calculating p(P7 | P4) and p(P6 | P5)

  p(P7 | P4) = Σ_{P3} p(P7 | P3, P4) p(P3 | P4) = Σ_{P3} p(P7 | P3, P4) p(P3)
  p(P6 | P5) = Σ_{P1,P2} p(P6 | P1, P2) p(P1 | P5) p(P2)

- Calculating p(P1 | P5)
  - The evidence is "below"
  - Here, we use Bayes' rule:

  p(P1 | P5) = p(P5 | P1) p(P1) / p(P5)
Evidence Below (1/2)

  p(Q | P12, P13, P14, P11)
    = p(P12, P13, P14, P11 | Q) p(Q) / p(P12, P13, P14, P11)   (Bayes' rule)
    = k p(P12, P13, P14, P11 | Q) p(Q)
    = k p(P12, P13 | Q) p(P14, P11 | Q) p(Q)                   (d-separation)

- Using a top-down recursive algorithm:

  p(P12, P13 | Q) = Σ_{P9} p(P12, P13 | P9, Q) p(P9 | Q)
                  = Σ_{P9} p(P12, P13 | P9) p(P9 | Q)          (d-separation)

  p(P9 | Q) = Σ_{P8} p(P9 | P8, Q) p(P8)
  p(P12, P13 | P9) = p(P12 | P9) p(P13 | P9)
Evidence Below (2/2)

  p(P14, P11 | Q) = Σ_{P10} p(P14, P11 | P10) p(P10 | Q)
                  = Σ_{P10} p(P14 | P10) p(P11 | P10) p(P10 | Q)   (d-separation)

  p(P11 | P10) = Σ_{P15} p(P11 | P15, P10) p(P15 | P10)
               = Σ_{P15} p(P11 | P15, P10) p(P15)
Evidence Above and Below

- Ex. p(Q | E), with evidence above E+ = {P5, P4} and evidence below E- = {P12, P13, P14, P11}

  p(Q | E) = p(Q | E-, E+)
           = p(E- | Q, E+) p(Q | E+) / p(E- | E+)
           = k p(E- | Q, E+) p(Q | E+)
           = k p(E- | Q) p(Q | E+)                (d-separation)
           = k2 p(Q | E-) p(Q | E+) / p(Q)

  (We have already calculated the two conditional probabilities in the last line.)
A Numerical Example (1/2)

- We want to calculate p(Q | U)

  p(Q | U) = k p(U | Q) p(Q)                          (Bayes' rule)
  p(U | Q) = Σ_P p(U | P) p(P | Q)

  p(P | Q) = Σ_R p(P | R, Q) p(R)
           = p(P | R, Q) p(R) + p(P | ¬R, Q) p(¬R)
           = 0.95 × 0.01 + 0.80 × 0.99 = 0.80
  p(¬P | Q) = 0.20

  p(U | Q) = p(U | P) × 0.8 + p(U | ¬P) × 0.2
           = 0.7 × 0.8 + 0.2 × 0.2 = 0.60

  p(Q | U) = k × 0.6 × 0.05 = k × 0.03

  To determine k, we need to calculate p(¬Q | U).
A Numerical Example (2/2)

  p(¬Q | U) = k p(U | ¬Q) p(¬Q)                         (Bayes' rule)
  p(U | ¬Q) = Σ_P p(U | P) p(P | ¬Q)

  p(P | ¬Q) = Σ_R p(P | R, ¬Q) p(R)
            = p(P | R, ¬Q) p(R) + p(P | ¬R, ¬Q) p(¬R)
            = 0.90 × 0.01 + 0.01 × 0.99 = 0.019
  p(¬P | ¬Q) = 0.98

  p(U | ¬Q) = p(U | P) × 0.019 + p(U | ¬P) × 0.98
            = 0.7 × 0.019 + 0.2 × 0.98 = 0.21

  p(¬Q | U) = k × 0.21 × 0.95 = k × 0.20

Finally

  p(Q | U) = k × 0.6 × 0.05 = k × 0.03
  p(¬Q | U) = k × 0.21 × 0.95 = k × 0.20
  ∴ k = 1 / (0.03 + 0.20) = 4.35, and p(Q | U) = 4.35 × 0.03 = 0.13
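The whole numerical example can be run end to end with the CPT values given on these two slides; a minimal sketch for the network R → P ← Q, P → U (intermediate quantities here are kept at full precision, so the result differs from the slide's 0.13 only by rounding):

```python
# CPTs from the slides.
p_Q, p_R = 0.05, 0.01
p_P = {(True, True): 0.95, (False, True): 0.80,   # p(P=True | R, Q)
       (True, False): 0.90, (False, False): 0.01}
p_U = {True: 0.7, False: 0.2}                     # p(U=True | P)

def bern(p, x):
    return p if x else 1.0 - p

def p_U_given_q(q):
    # p(U | Q=q) = sum over R, P of p(U | P) p(P | R, q) p(R)
    total = 0.0
    for r in (True, False):
        for p in (True, False):
            total += p_U[p] * bern(p_P[(r, q)], p) * bern(p_R, r)
    return total

num = p_U_given_q(True) * p_Q                     # unnormalized p(Q | U)
den = num + p_U_given_q(False) * (1.0 - p_Q)      # 1/k, the normalizer
print(round(num / den, 2))                        # 0.13
```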
Other Methods for Probabilistic Inference in Bayes Networks

- Bucket elimination
- Monte Carlo methods (when the network is not a polytree)
- Clustering
Additional Readings (1/5)

- [Feller 1968]: Probability theory
- [Goldszmidt, Morris & Pearl 1990]: Non-monotonic inference through probabilistic methods
- [Pearl 1982a, Kim & Pearl 1983]: Message-passing algorithm
- [Russell & Norvig 1995, pp. 447ff]: Polytree methods

Additional Readings (2/5)

- [Shachter & Kenley 1989]: Bayesian networks for continuous random variables
- [Wellman 1990]: Qualitative networks
- [Neapolitan 1990]: Probabilistic methods in expert systems
- [Henrion 1990]: Probabilistic inference in Bayesian networks

Additional Readings (3/5)

- [Jensen 1996]: Bayesian networks; the HUGIN system
- [Neal 1991]: Relationships between Bayesian networks and neural networks
- [Heckerman 1991, Heckerman & Nathwani 1992]: PATHFINDER
- [Pradhan, et al. 1994]: CPCSBN

Additional Readings (4/5)

- [Shortliffe 1976, Buchanan & Shortliffe 1984]: MYCIN, which uses certainty factors
- [Duda, Hart & Nilsson 1987]: PROSPECTOR, which uses sufficiency and necessity indices
- [Zadeh 1975, Zadeh 1978, Elkan 1993]: Fuzzy logic and possibility theory
- [Dempster 1968, Shafer 1979]: Dempster-Shafer combination rules

Additional Readings (5/5)

- [Nilsson 1986]: Probabilistic logic
- [Tversky & Kahneman 1982]: Humans generally lose consistency when facing uncertainty
- [Shafer & Pearl 1990]: Collected papers on uncertain inference
- Proceedings & Journals
  - Uncertainty in Artificial Intelligence (UAI)
  - International Journal of Approximate Reasoning