TRANSCRIPT
Final Exam Review
Final Exam: May 10 Thursday
If event E occurs, then the probability that event H will occur is p(H|E).

IF E (evidence) is true THEN H (hypothesis) is true with probability p.
Bayesian reasoning

p(H|E) = p(E|H) p(H) / [ p(E|H) p(H) + p(E|¬H) p(¬H) ]
Bayesian reasoning Example: Cancer and Test

P(C) = 0.01, P(¬C) = 0.99
P(+|C) = 0.9, P(−|C) = 0.1
P(+|¬C) = 0.2, P(−|¬C) = 0.8

P(C|+) = ?

P(C|+) = P(+|C) P(C) / [ P(+|C) P(C) + P(+|¬C) P(¬C) ]
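As a worked check, the posterior for the cancer/test example can be computed directly from the rule above (a minimal sketch; the variable names are illustrative):

```python
# Bayes' rule for the cancer/test example:
# P(C|+) = P(+|C) P(C) / (P(+|C) P(C) + P(+|notC) P(notC))
p_c = 0.01         # prior P(C)
p_not_c = 0.99     # prior P(notC)
p_pos_c = 0.9      # P(+|C)
p_pos_not_c = 0.2  # P(+|notC)

posterior = (p_pos_c * p_c) / (p_pos_c * p_c + p_pos_not_c * p_not_c)
print(round(posterior, 3))  # a positive test raises P(C) from 0.01 to about 0.043
```

Note how small the posterior stays: the 0.2 false-positive rate on the large ¬C population dominates the numerator.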
Bayesian reasoning with multiple hypotheses and evidences

Expand the Bayesian rule to work with multiple hypotheses (H1...Hm) and evidences (E1...En), assuming conditional independence among evidences E1...En:

p(Hi | E1 E2 ... En) = p(E1|Hi) p(E2|Hi) ... p(En|Hi) p(Hi) / Σ(k=1..m) p(E1|Hk) p(E2|Hk) ... p(En|Hk) p(Hk)
Bayesian reasoning Example

Expert data:

| Probability | i = 1 | i = 2 | i = 3 |
|---|---|---|---|
| p(Hi) | 0.40 | 0.35 | 0.25 |
| p(E1\|Hi) | 0.3 | 0.8 | 0.5 |
| p(E2\|Hi) | 0.9 | 0.0 | 0.7 |
| p(E3\|Hi) | 0.6 | 0.7 | 0.9 |
The user observes evidences E3, E1, and E2.
p(Hi | E1 E2 E3) = p(E1|Hi) p(E2|Hi) p(E3|Hi) p(Hi) / Σ(k=1..3) p(E1|Hk) p(E2|Hk) p(E3|Hk) p(Hk), for i = 1, 2, 3

p(H1 | E1 E2 E3) = (0.3 × 0.9 × 0.6 × 0.40) / (0.3 × 0.9 × 0.6 × 0.40 + 0.8 × 0.0 × 0.7 × 0.35 + 0.5 × 0.7 × 0.9 × 0.25) = 0.45

p(H2 | E1 E2 E3) = (0.8 × 0.0 × 0.7 × 0.35) / (0.3 × 0.9 × 0.6 × 0.40 + 0.8 × 0.0 × 0.7 × 0.35 + 0.5 × 0.7 × 0.9 × 0.25) = 0

p(H3 | E1 E2 E3) = (0.5 × 0.7 × 0.9 × 0.25) / (0.3 × 0.9 × 0.6 × 0.40 + 0.8 × 0.0 × 0.7 × 0.35 + 0.5 × 0.7 × 0.9 × 0.25) = 0.55
Bayesian reasoning Example

The expert system computes the posterior probabilities; the user observes E2.
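The posterior computation above can be sketched in a few lines of Python, using the expert-data table from this example (variable names are illustrative):

```python
# Expert data: priors p(Hi) and likelihoods p(Ej|Hi) for i = 1..3
priors = [0.40, 0.35, 0.25]
likelihoods = [
    [0.3, 0.8, 0.5],  # p(E1|Hi) for i = 1, 2, 3
    [0.9, 0.0, 0.7],  # p(E2|Hi)
    [0.6, 0.7, 0.9],  # p(E3|Hi)
]

# Numerator for each hypothesis: p(E1|Hi) p(E2|Hi) p(E3|Hi) p(Hi)
numerators = []
for i, prior in enumerate(priors):
    num = prior
    for row in likelihoods:
        num *= row[i]
    numerators.append(num)

total = sum(numerators)                   # normalizing constant (the denominator)
posteriors = [n / total for n in numerators]
print([round(p, 2) for p in posteriors])  # [0.45, 0.0, 0.55]
```

Observing E2 is decisive for H2: since p(E2|H2) = 0, its numerator (and hence its posterior) collapses to zero.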
Propagation of CFs

For a single antecedent rule:

cf(H, E) = cf(E) × cf(R)

cf(E) is the certainty factor of the evidence; cf(R) is the certainty factor of the rule.
Single antecedent rule example

IF patient has toothache THEN problem is cavity {cf 0.3}
Patient has toothache {cf 0.9}
What is cf(cavity, toothache)?
Propagation of CFs (multiple antecedents)

For conjunctive rules:

IF <evidence E1> AND <evidence E2> ... AND <evidence En> THEN <Hypothesis H> {cf}

For two evidences E1 and E2:
cf(E1 AND E2) = min(cf(E1), cf(E2))
Propagation of CFs (multiple antecedents)

For disjunctive rules:

IF <evidence E1> OR <evidence E2> ... OR <evidence En> THEN <Hypothesis H> {cf}

For two evidences E1 and E2:
cf(E1 OR E2) = max(cf(E1), cf(E2))
Exercise

IF (P1 AND P2) OR P3 THEN C1 (0.7) AND C2 (0.3)
Assume cf(P1) = 0.6, cf(P2) = 0.4, cf(P3) = 0.2.
What is cf(C1) and cf(C2)?
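The propagation rules above translate directly into code (a sketch; the function names are illustrative). Applied to the exercise, the antecedent gets cf = max(min(0.6, 0.4), 0.2) = 0.4:

```python
def cf_and(*cfs):
    """Conjunctive antecedents: take the minimum certainty factor."""
    return min(cfs)

def cf_or(*cfs):
    """Disjunctive antecedents: take the maximum certainty factor."""
    return max(cfs)

def cf_rule(cf_evidence, cf_of_rule):
    """Single antecedent propagation: cf(H, E) = cf(E) * cf(R)."""
    return cf_evidence * cf_of_rule

# Exercise: IF (P1 AND P2) OR P3 THEN C1 (0.7) AND C2 (0.3)
cf_p1, cf_p2, cf_p3 = 0.6, 0.4, 0.2
cf_antecedent = cf_or(cf_and(cf_p1, cf_p2), cf_p3)  # max(min(0.6, 0.4), 0.2) = 0.4
print(round(cf_rule(cf_antecedent, 0.7), 2))  # cf(C1) = 0.28
print(round(cf_rule(cf_antecedent, 0.3), 2))  # cf(C2) = 0.12
```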
Defining fuzzy sets with fit-vectors

A fuzzy set A can be defined as a fit-vector: a list of membership-degree/element pairs. So, for example:

Tall men = (0/180, 1/190)
Short men = (1/160, 0/170)
Average men = (0/165, 1/175, 0/185)
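One simple way to hold a fit-vector in code is a dict mapping each domain value to its membership degree (a sketch; linear interpolation between the listed points is an assumption, not something the slides specify):

```python
# Fit-vector for "tall men": membership degree at each listed height (cm)
tall_men = {180: 0.0, 190: 1.0}

def membership(fit_vector, x):
    """Membership of x, linearly interpolated between the fit-vector points."""
    points = sorted(fit_vector.items())
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, m0), (x1, m1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return m0 + (m1 - m0) * (x - x0) / (x1 - x0)

print(membership(tall_men, 185))  # halfway between 180 and 190 -> 0.5
```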
Qualifiers & Hedges

What about linguistic values with qualifiers? e.g. very tall, extremely short, etc.

Hedges are qualifying terms that modify the shape of fuzzy sets, e.g. very, somewhat, quite, slightly, extremely, etc.
Representing Hedges

| Hedge | Mathematical Expression |
|---|---|
| A little | [μA(x)]^1.3 |
| Slightly | [μA(x)]^1.7 |
| Very | [μA(x)]^2 |
| Extremely | [μA(x)]^3 |

[Figure: graphical representation of each hedge]
Representing Hedges

| Hedge | Mathematical Expression |
|---|---|
| Very very | [μA(x)]^4 |
| More or less | √μA(x) |
| Somewhat | √μA(x) |
| Indeed | 2[μA(x)]^2 if 0 ≤ μA ≤ 0.5; 1 − 2[1 − μA(x)]^2 if 0.5 < μA ≤ 1 |

[Figure: graphical representation of each hedge]
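The hedge expressions in the two tables above translate directly into code (a sketch; μA values are assumed to lie in [0, 1]):

```python
import math

def very(mu):
    return mu ** 2            # [muA(x)]^2

def extremely(mu):
    return mu ** 3            # [muA(x)]^3

def very_very(mu):
    return mu ** 4            # [muA(x)]^4

def more_or_less(mu):
    return math.sqrt(mu)      # sqrt(muA(x))

def indeed(mu):
    """Intensification: push memberships away from 0.5 toward 0 or 1."""
    if mu <= 0.5:
        return 2 * mu ** 2
    return 1 - 2 * (1 - mu) ** 2

print(very(0.5))    # 0.25
print(indeed(0.5))  # 0.5 (the crossover point is unchanged)
print(indeed(0.7))  # pushed up toward 1
```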
Crisp Set Operations

[Figure: crisp set operations — complement (Not A), containment (A ⊂ B), intersection (A ∩ B), union (A ∪ B)]
Fuzzy Set Operations

Complement: to what degree do elements not belong to this set?

μ¬A(x) = 1 − μA(x)

tall men = {0/180, 0.25/182, 0.5/185, 0.75/187, 1/190};
not tall men = {1/180, 0.75/182, 0.5/185, 0.25/187, 0/190};

[Figure: membership functions for complement (Not A), containment (A ⊆ B), intersection (A ∩ B), and union (A ∪ B)]
Containment: which sets belong to other sets?

tall men = {0/180, 0.25/182, 0.5/185, 0.75/187, 1/190};
very tall men = {0/180, 0.06/182, 0.25/185, 0.56/187, 1/190};

Each element of the fuzzy subset has a smaller membership than in the containing set.
Intersection: to what degree is the element in both sets?

μA∩B(x) = min[μA(x), μB(x)]
tall men = {0/165, 0/175, 0/180, 0.25/182, 0.5/185, 1/190};
average men = {0/165, 1/175, 0.5/180, 0.25/182, 0/185, 0/190};

tall men ∩ average men = {0/165, 0/175, 0/180, 0.25/182, 0/185, 0/190};
or
tall men ∩ average men = {0/180, 0.25/182, 0/185};

μA∩B(x) = min[μA(x), μB(x)]
Union: to what degree is the element in either or both sets?

μA∪B(x) = max[μA(x), μB(x)]
tall men = {0/165, 0/175, 0/180, 0.25/182, 0.5/185, 1/190};
average men = {0/165, 1/175, 0.5/180, 0.25/182, 0/185, 0/190};

tall men ∪ average men = {0/165, 1/175, 0.5/180, 0.25/182, 0.5/185, 1/190};

μA∪B(x) = max[μA(x), μB(x)]
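The complement, intersection, and union definitions can be sketched over fit-vectors stored as dicts (the dict representation is an illustrative assumption):

```python
tall_men    = {165: 0.0, 175: 0.0, 180: 0.0, 182: 0.25, 185: 0.5, 190: 1.0}
average_men = {165: 0.0, 175: 1.0, 180: 0.5, 182: 0.25, 185: 0.0, 190: 0.0}

def complement(a):
    """mu_notA(x) = 1 - mu_A(x)"""
    return {x: 1 - m for x, m in a.items()}

def intersection(a, b):
    """mu_(A and B)(x) = min(mu_A(x), mu_B(x))"""
    return {x: min(a[x], b[x]) for x in a}

def union(a, b):
    """mu_(A or B)(x) = max(mu_A(x), mu_B(x))"""
    return {x: max(a[x], b[x]) for x in a}

print(intersection(tall_men, average_men))  # 0.25 at 182, 0 elsewhere
print(union(tall_men, average_men)[185])    # max(0.5, 0.0) -> 0.5
```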
Choosing the Best Attribute: Binary Classification

We want a formal measure that returns a maximum value when an attribute makes a perfect split and a minimum when it makes no distinction.

Information theory (Shannon and Weaver, 1949).
Entropy: a measure of the uncertainty of a random variable.
- A coin that always comes up heads → 0 bits
- A flip of a fair coin (heads or tails) → 1 bit
- The roll of a fair four-sided die → 2 bits

Information gain: the expected reduction in entropy caused by partitioning the examples according to this attribute.
Formula for Entropy

H(p1, ..., pn) = − Σi pi log2 pi

Examples:
- A collection of 10 examples, 5 positive and 5 negative:
  H(1/2, 1/2) = −(1/2) log2(1/2) − (1/2) log2(1/2) = 1 bit
- A collection of 100 examples, 1 positive and 99 negative:
  H(1/100, 99/100) = −0.01 log2 0.01 − 0.99 log2 0.99 ≈ 0.08 bits
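The entropy formula is a one-liner; reproducing the two examples above (a minimal sketch):

```python
import math

def entropy(probs):
    """H(p1, ..., pn) = -sum(pi * log2(pi)), skipping zero-probability terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))              # 1.0 bit (fair coin)
print(round(entropy([0.01, 0.99]), 2))  # 0.08 bits (very skewed split)
print(entropy([0.25] * 4))              # 2.0 bits (fair four-sided die)
```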
Information gain

Information gain (from an attribute test) = the difference between the original information requirement and the new requirement.

Information Gain (IG), the reduction in entropy from the attribute test (p positive and n negative examples overall; branch k of the test receives pk positives and nk negatives):

Gain(A) = I(p/(p+n), n/(p+n)) − Σk (pk + nk)/(p + n) · I(pk/(pk+nk), nk/(pk+nk))

Choose the attribute with the largest IG.
Information gain

For the training set, p = n = 6, so I(6/12, 6/12) = 1 bit.

Consider the attributes Patrons and Type (and others too):
Patrons has the highest IG of all attributes and so is chosen by the DTL algorithm as the root.
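A sketch of the comparison, using positive/negative branch counts for Patrons and Type as in the standard 12-example restaurant data (the counts below are stated as an assumption, since the table itself is not reproduced in these slides):

```python
import math

def entropy(p, n):
    """Entropy I(p/(p+n), n/(p+n)) of a p-positive / n-negative split."""
    total = p + n
    return -sum(q * math.log2(q) for q in (p / total, n / total) if q > 0)

def gain(branches, p, n):
    """Gain(A) = I(p, n) - sum over branches of weighted branch entropy."""
    remainder = sum((pk + nk) / (p + n) * entropy(pk, nk) for pk, nk in branches)
    return entropy(p, n) - remainder

# Branch counts (positives, negatives) for the 12-example training set
patrons = [(0, 2), (4, 0), (2, 4)]          # None, Some, Full
type_   = [(1, 1), (1, 1), (2, 2), (2, 2)]  # French, Italian, Thai, Burger

print(round(gain(patrons, 6, 6), 3))  # ~0.541 bits
print(round(gain(type_, 6, 6), 3))    # 0.0 bits: Type tells us nothing
```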
Example contd.

Decision tree learned from the 12 examples: substantially simpler than the “true” tree.
Perceptrons

[Figure: a perceptron — inputs x1 and x2 with weights w1 and w2 feed a linear combiner X = x1w1 + x2w2; a hard limiter with a threshold produces the output Y = Ystep(X)]
Perceptrons

How does a perceptron learn?
- A perceptron has initial (often random) weights, typically in the range [−0.5, 0.5]
- Apply an established training dataset
- Calculate the error as expected output minus actual output:
  error e = Yexpected − Yactual
- Adjust the weights to reduce the error
Perceptrons

How do we adjust a perceptron’s weights to produce Yexpected?
If e is positive, we need to increase Yactual (and vice versa).
Use this formula:

wi(p + 1) = wi(p) + Δwi, where Δwi = α × xi × e

α is the learning rate (between 0 and 1); e is the calculated error.
Perceptron Example – AND

Train a perceptron to recognize logical AND.
Use threshold Θ = 0.2 and learning rate α = 0.1.
Perceptron Example – AND

Repeat until convergence, i.e. the final weights do not change and there is no error.
Use threshold Θ = 0.2 and learning rate α = 0.1.
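Putting the pieces together, the AND training loop can be sketched as follows. The slides only fix Θ = 0.2 and α = 0.1; the initial weights 0.3 and −0.1 below are an illustrative assumption:

```python
THETA, ALPHA = 0.2, 0.1  # threshold and learning rate from the example
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # logical AND

w = [0.3, -0.1]  # illustrative initial weights (not specified in the slides)

def predict(x):
    """Hard limiter: fire if the weighted sum reaches the threshold."""
    total = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if total >= THETA else 0

# Repeat until an entire epoch passes with no error (convergence)
for epoch in range(100):
    errors = 0
    for x, expected in data:
        e = expected - predict(x)         # error e = Yexpected - Yactual
        if e != 0:
            errors += 1
            for i in range(2):
                w[i] += ALPHA * x[i] * e  # delta rule: dwi = alpha * xi * e
    if errors == 0:
        break

print([predict(x) for x, _ in data])  # [0, 0, 0, 1] once converged
```

Because AND is linearly separable (e.g. w1 = w2 = 0.15 works with Θ = 0.2), the loop reaches a zero-error epoch after a handful of passes.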