CS B551: DECISION TREES
AGENDA
- Decision trees
- Complexity
- Learning curves
- Combatting overfitting
- Boosting
RECAP
Still in the supervised setting, with logical attributes.

Find a representation of CONCEPT in the form:

CONCEPT(x) ⇔ S(A,B,…)

where S(A,B,…) is a sentence built with the observable attributes, e.g.:

CONCEPT(x) ⇔ A(x) ∧ (¬B(x) ∨ C(x))
PREDICATE AS A DECISION TREE
The predicate CONCEPT(x) ⇔ A(x) ∧ (¬B(x) ∨ C(x)) can be represented by the following decision tree:

A?
├─ True → B?
│   ├─ True → C?
│   │   ├─ True → True
│   │   └─ False → False
│   └─ False → True
└─ False → False

Example: a mushroom is poisonous iff it is yellow and small, or yellow, big and spotted.
- x is a mushroom
- CONCEPT = POISONOUS
- A = YELLOW
- B = BIG
- C = SPOTTED
The same predicate and tree, with two additional observable attributes that do not appear in CONCEPT:
- D = FUNNEL-CAP
- E = BULKY
TRAINING SET
| Ex. # | A | B | C | D | E | CONCEPT |
|---|---|---|---|---|---|---|
| 1 | False | False | True | False | True | False |
| 2 | False | True | False | False | False | False |
| 3 | False | True | True | True | True | False |
| 4 | False | False | True | False | False | False |
| 5 | False | False | False | True | True | False |
| 6 | True | False | True | False | False | True |
| 7 | True | False | False | True | False | True |
| 8 | True | False | True | False | True | True |
| 9 | True | True | True | False | True | True |
| 10 | True | True | True | True | True | True |
| 11 | True | True | False | False | False | False |
| 12 | True | True | False | False | True | False |
| 13 | True | False | True | True | True | True |
POSSIBLE DECISION TREE
A decision tree consistent with all 13 examples:

D?
├─ True → E?
│   ├─ True → A? (True → True, False → False)
│   └─ False → True
└─ False → C?
    ├─ True → B?
    │   ├─ True → True
    │   └─ False → E?
    │       ├─ True → A? (True → True, False → False)
    │       └─ False → A? (True → True, False → False)
    └─ False → False
This tree corresponds to:

CONCEPT ⇔ (D ∧ (¬E ∨ A)) ∨ (¬D ∧ C ∧ (B ∨ (¬B ∧ ((E ∧ A) ∨ (¬E ∧ A)))))
Compare with the much smaller tree seen earlier, which also agrees with the whole training set:

CONCEPT ⇔ A ∧ (¬B ∨ C)
KIS bias: build the smallest decision tree. Since finding the smallest tree is a computationally intractable problem, use a greedy algorithm.
TOP-DOWN INDUCTION OF A DT
DTL(D, Predicates)
1. If all examples in D are positive, then return True
2. If all examples in D are negative, then return False
3. If Predicates is empty, then return the majority rule
4. A ← an error-minimizing predicate in Predicates
5. Return the tree whose root is A, whose left branch is DTL(D+A, Predicates − A), and whose right branch is DTL(D−A, Predicates − A), where D+A / D−A are the examples in D on which A is True / False

(A runnable sketch of this procedure follows the example tree below.)
The tree obtained on the training set:

A?
├─ True → C?
│   ├─ True → True
│   └─ False → B?
│       ├─ True → False
│       └─ False → True
└─ False → False
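As an illustration, here is a minimal Python sketch of DTL (mine, not the slides'). It assumes each example is an (attributes, label) pair of booleans, and reads "error-minimizing predicate" as the attribute whose majority-rule split misclassifies the fewest examples; helper names like `dtl` and `split_errors` are hypothetical.

```python
def majority(examples):
    """Majority label over the examples (ties broken as True)."""
    pos = sum(1 for _, label in examples if label)
    return pos >= len(examples) - pos

def split_errors(examples, attr):
    """# of errors if we split on attr and predict the majority on each branch."""
    errors = 0
    for value in (True, False):
        branch = [ex for ex in examples if ex[0][attr] == value]
        if branch:
            m = majority(branch)
            errors += sum(1 for _, label in branch if label != m)
    return errors

def dtl(examples, predicates, default=False):
    """Returns True, False, or an (attr, true_branch, false_branch) tuple."""
    if not examples:
        return default                                 # empty branch
    if all(label for _, label in examples):
        return True                                    # 1. all positive
    if not any(label for _, label in examples):
        return False                                   # 2. all negative
    if not predicates:
        return majority(examples)                      # 3. majority rule
    a = min(predicates, key=lambda p: split_errors(examples, p))  # 4.
    rest = [p for p in predicates if p != a]
    d = majority(examples)
    return (a,                                         # 5. root = A
            dtl([ex for ex in examples if ex[0][a]], rest, d),      # D+A
            dtl([ex for ex in examples if not ex[0][a]], rest, d))  # D-A

# On the 13-example training set above, with predicates ordered
# [A, B, C, D, E] (ties broken by list order), this reproduces the
# greedy tree: A at the root, then C, then B.
```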
LEARNABLE CONCEPTS
Some simple concepts cannot be represented compactly in DTs:
- Parity(x) = X1 xor X2 xor … xor Xn
- Majority(x) = 1 if most of the Xi's are 1, 0 otherwise

These require trees of exponential size in the # of attributes, and an exponential # of examples to learn exactly. The ease of learning depends on shrewdly (or luckily) chosen attributes that correlate with CONCEPT.
PERFORMANCE ISSUES
Assessing performance:
- Training set and test set
- Learning curve

[Figure: typical learning curve, plotting % correct on the test set against the size of the training set]
Some concepts are unrealizable within a machine’s capacity
Overfitting: the risk of using irrelevant observable predicates to generate a hypothesis that agrees with all examples in the training set.
Tree pruning: terminate the recursion when the # of errors / the information gain is small.
The resulting decision tree + majority rule may then fail to classify correctly all examples in the training set.
Other issues:
- Incorrect examples
- Missing data
- Multi-valued and continuous attributes
USING INFORMATION THEORY

Rather than minimizing the probability of error, minimize the expected number of questions needed to decide whether an object x satisfies CONCEPT:
- Use the information-theoretic quantity known as information gain
- Split on the variable with the highest information gain
ENTROPY / INFORMATION GAIN

Entropy encodes the quantity of uncertainty in a random variable:

H(X) = −Σ_{x ∈ Val(X)} P(x) log P(x)

Properties:
- H(X) = 0 if X is known, i.e., P(x) = 1 for some value x
- H(X) > 0 if X is not known with certainty
- H(X) is maximal if P(X) is the uniform distribution

Information gain measures the reduction in uncertainty in X given knowledge of Y:

I(X,Y) = E_y[H(X) − H(X|Y)] = Σ_y P(y) Σ_x [P(x|y) log P(x|y) − P(x) log P(x)]

Properties:
- Always nonnegative
- Equals 0 if X and Y are independent
- If Y is a choice, maximizing IG ⇔ minimizing E_y[H(X|Y)]
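As a quick illustration (mine, not the slides'), both quantities can be computed from a joint distribution; base-2 logs measure uncertainty in bits:

```python
import math

def entropy(dist):
    """H(X) = -sum_x P(x) log2 P(x), for a dict of probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def information_gain(joint):
    """I(X,Y) = H(X) - E_y[H(X|Y)], from a dict {(x, y): P(x, y)}."""
    px, py = {}, {}
    for (x, y), p in joint.items():          # marginalize
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    cond = 0.0
    for y, p_y in py.items():                # E_y[H(X|Y=y)]
        given_y = {x: joint.get((x, y), 0) / p_y for x in px}
        cond += p_y * entropy(given_y)
    return entropy(px) - cond

# Fair coin X copied by Y: knowing Y removes the full 1 bit of uncertainty.
joint = {(0, 0): 0.5, (1, 1): 0.5}
print(information_gain(joint))  # -> 1.0
```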
MAXIMIZING IG / MINIMIZING CONDITIONAL ENTROPY IN DECISION TREES

E_y[H(X|Y)] = −Σ_y P(y) Σ_x P(x|y) log P(x|y)

- Let n be the # of examples
- Let n+, n− be the # of examples on the True/False branches of Y
- Let p+, p− be the accuracy on the True/False branches of Y
- Then P(correct) = (p+·n+ + p−·n−)/n, with P(correct|Y) = p+ and P(correct|¬Y) = p−

E_y[H(X|Y)] ∝ −n+·[p+ log p+ + (1−p+) log(1−p+)] − n−·[p− log p− + (1−p−) log(1−p−)]
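The last line translates directly into code. A sketch, assuming n+, p+, n−, p− are given as in the slide (the function and argument names are mine):

```python
import math

def binary_entropy(p):
    """-[p log2 p + (1-p) log2 (1-p)]: entropy of a branch with accuracy p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def split_score(n_pos, p_pos, n_neg, p_neg):
    """n * E_y[H(X|Y)] for a split Y with n_pos/n_neg examples and
    accuracies p_pos/p_neg on its True/False branches; lower is better."""
    return n_pos * binary_entropy(p_pos) + n_neg * binary_entropy(p_neg)

# Pure branches are free; 50/50 branches cost one bit per example.
print(split_score(5, 1.0, 8, 1.0))   # -> 0.0
print(split_score(5, 0.5, 8, 0.5))   # -> 13.0
```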
CONTINUOUS ATTRIBUTES

Continuous attributes can be converted into logical ones via thresholds: X becomes (X < a).

When considering a split on X, pick the threshold a that minimizes the # of errors / the entropy.

[Figure: examples sorted by their X values, annotated with the error counts of the successive candidate thresholds: 7 7 6 5 6 5 4 5 4 3 4 5 4 5 6 7]
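A sketch of picking the threshold by error count (assumed representation: parallel lists of values and boolean labels; candidate thresholds are midpoints between distinct consecutive sorted values):

```python
def best_threshold(values, labels):
    """Pick a for the test (X < a) minimizing training errors,
    predicting one label below the threshold and the other above."""
    pairs = sorted(zip(values, labels))
    xs = [x for x, _ in pairs]
    candidates = [(xs[i] + xs[i + 1]) / 2
                  for i in range(len(xs) - 1) if xs[i] != xs[i + 1]]
    best = None
    for a in candidates:
        for below_pred in (True, False):   # label predicted when X < a
            errors = sum((x < a and lab != below_pred) or
                         (x >= a and lab == below_pred)
                         for x, lab in pairs)
            if best is None or errors < best[0]:
                best = (errors, a)
    return best  # (min errors, threshold)

print(best_threshold([1, 2, 3, 4, 5, 6],
                     [False, False, False, True, True, True]))
# -> (0, 3.5)
```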
MULTI-VALUED ATTRIBUTES

Simple change: consider splits on all values A can take on.

Caveat: the more values A can take on, the more important it may appear to be, even if it is irrelevant:
- more values ⇒ the dataset is split into smaller example sets when picking attributes
- smaller example sets ⇒ more likely to fit spurious noise well
STATISTICAL METHODS FOR ADDRESSING OVERFITTING / NOISE

There may be few training examples that match the path leading to a deep node in the decision tree, so the learner is more susceptible to choosing irrelevant/incorrect attributes when the sample is small.

Idea:
- Make a statistical estimate of predictive power (which increases with larger samples)
- Prune branches with low predictive power
- Chi-squared pruning
TOP-DOWN DT PRUNING

Consider an inner node X that by itself (majority rule) predicts p examples correctly and n examples incorrectly.

At its k leaf nodes, the numbers of correctly/incorrectly classified examples are p1/n1, …, pk/nk.

Chi-squared statistical significance test:
- Null hypothesis: example labels are randomly chosen with distribution p/(p+n) (X is irrelevant)
- Alternative hypothesis: examples are not randomly chosen (X is relevant)

Prune X if testing on X is not statistically significant.
CHI-SQUARED TEST

Let Z = Σ_i [(p_i − p_i′)² / p_i′ + (n_i − n_i′)² / n_i′]

where p_i′ = p·(p_i+n_i)/(p+n) and n_i′ = n·(p_i+n_i)/(p+n) are the expected numbers of correctly/incorrectly classified examples at leaf node i if the null hypothesis holds.

Z is a statistic that is approximately drawn from the chi-squared distribution with k − 1 degrees of freedom.

Look up the p-value of Z in a table, and prune if the p-value > α for some α (usually ≈ 0.05).
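A sketch of the test using scipy's chi-squared survival function; `leaves` holds the (p_i, n_i) counts per leaf, and the prune rule follows the slide (a p-value above α means the split looks like noise):

```python
from scipy.stats import chi2

def prune_test(leaves, alpha=0.05):
    """Chi-squared pruning test. leaves = [(p_i, n_i), ...]: counts of
    correctly/incorrectly classified examples at each leaf under node X."""
    p = sum(pi for pi, _ in leaves)
    n = sum(ni for _, ni in leaves)
    z = 0.0
    for pi, ni in leaves:
        size = pi + ni
        pi_exp = p * size / (p + n)   # expected correct under the null
        ni_exp = n * size / (p + n)   # expected incorrect under the null
        z += (pi - pi_exp) ** 2 / pi_exp + (ni - ni_exp) ** 2 / ni_exp
    p_value = chi2.sf(z, df=len(leaves) - 1)
    return p_value > alpha            # True -> prune (X looks irrelevant)

# A split whose leaves mirror the parent's correct/incorrect ratio is prunable:
print(prune_test([(6, 2), (3, 1)]))   # -> True (Z = 0, p-value = 1)
```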
ENSEMBLE LEARNING (BOOSTING)
IDEA
It may be difficult to search for a single hypothesis that explains the data
Construct multiple hypotheses (ensemble), and combine their predictions
“Can a set of weak learners construct a single strong learner?” – Michael Kearns, 1988
MOTIVATION
Take 5 classifiers, each with 60% accuracy. On a new example, run them all and pick the prediction by majority vote.

If the errors are independent, the majority vote is correct ≈68% of the time, and the accuracy keeps climbing as more such classifiers are combined (past 94% with about 60 of them). (In reality the errors will not be independent, but we hope they will be mostly uncorrelated.)
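These numbers follow from a binomial computation, sketched here:

```python
from math import comb

def majority_accuracy(n, acc):
    """P(majority of n independent classifiers, each correct with
    probability acc, is correct); n assumed odd."""
    k = n // 2 + 1   # votes needed for a strict majority
    return sum(comb(n, i) * acc**i * (1 - acc)**(n - i)
               for i in range(k, n + 1))

print(round(majority_accuracy(5, 0.6), 3))    # -> 0.683
print(round(majority_accuracy(61, 0.6), 3))   # -> ~0.94, grows with n
```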
BOOSTING
Main idea:
- If learner 1 fails to learn an example correctly, this example is more important for learner 2
- If learners 1 and 2 fail to learn an example correctly, this example is more important for learner 3
- …

Implementation: a weighted training set, where weights encode importance.
Weighted training set
| Ex. # | Weight | A | B | C | D | E | CONCEPT |
|---|---|---|---|---|---|---|---|
| 1 | w1 | False | False | True | False | True | False |
| 2 | w2 | False | True | False | False | False | False |
| 3 | w3 | False | True | True | True | True | False |
| 4 | w4 | False | False | True | False | False | False |
| 5 | w5 | False | False | False | True | True | False |
| 6 | w6 | True | False | True | False | False | True |
| 7 | w7 | True | False | False | True | False | True |
| 8 | w8 | True | False | True | False | True | True |
| 9 | w9 | True | True | True | False | True | True |
| 10 | w10 | True | True | True | True | True | True |
| 11 | w11 | True | True | False | False | False | False |
| 12 | w12 | True | True | False | False | True | False |
| 13 | w13 | True | False | True | True | True | True |
The boosting procedure:
- Start with uniform weights wi = 1/N
- Use learner 1 to generate hypothesis h1
- Adjust the weights to give higher importance to misclassified examples
- Use learner 2 to generate hypothesis h2
- …
- Weight the hypotheses according to their performance, and return the weighted majority
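A compact sketch of this loop (essentially AdaBoost restricted to stumps of the form CONCEPT = attribute, as in the mushroom example below; the reweighting rule, which scales correctly classified examples by error/(1−error) and renormalizes, follows R&N):

```python
import math

def adaboost_stumps(examples, attrs, rounds):
    """AdaBoost sketch. examples: (attribute dict, bool label) pairs;
    each weak hypothesis is the stump 'CONCEPT = attribute'.
    Returns a list of (attribute, vote_weight) pairs."""
    n = len(examples)
    w = [1.0 / n] * n                           # start with uniform weights
    ensemble = []
    for _ in range(rounds):
        # Weak learner: pick the stump with the smallest weighted error.
        def werr(a):
            return sum(wi for wi, (x, y) in zip(w, examples) if x[a] != y)
        a = min(attrs, key=werr)
        err = werr(a)
        if err == 0 or err >= 0.5:              # perfect or useless stump
            break
        # Misclassified examples become more important for the next learner:
        # scale correctly classified weights by err/(1-err), then renormalize.
        w = [wi * err / (1 - err) if x[a] == y else wi
             for wi, (x, y) in zip(w, examples)]
        total = sum(w)
        w = [wi / total for wi in w]
        ensemble.append((a, math.log((1 - err) / err)))  # hypothesis weight
    return ensemble

def weighted_majority(ensemble, x):
    """Positive weighted vote -> predict True."""
    return sum(v if x[a] else -v for a, v in ensemble) > 0
```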
MUSHROOM EXAMPLE
"Decision stumps": single-attribute decision trees.
| Ex. # | Weight | A | B | C | D | E | CONCEPT |
|---|---|---|---|---|---|---|---|
| 1 | 1/13 | False | False | True | False | True | False |
| 2 | 1/13 | False | True | False | False | False | False |
| 3 | 1/13 | False | True | True | True | True | False |
| 4 | 1/13 | False | False | True | False | False | False |
| 5 | 1/13 | False | False | False | True | True | False |
| 6 | 1/13 | True | False | True | False | False | True |
| 7 | 1/13 | True | False | False | True | False | True |
| 8 | 1/13 | True | False | True | False | True | True |
| 9 | 1/13 | True | True | True | False | True | True |
| 10 | 1/13 | True | True | True | True | True | True |
| 11 | 1/13 | True | True | False | False | False | False |
| 12 | 1/13 | True | True | False | False | True | False |
| 13 | 1/13 | True | False | True | True | True | True |
Pick C first: learner 1 learns CONCEPT = C (it misclassifies examples 1, 3, 4, and 7).
Update the weights (the precise formula is given in R&N):
| Ex. # | Weight | A | B | C | D | E | CONCEPT |
|---|---|---|---|---|---|---|---|
| 1 | .125 | False | False | True | False | True | False |
| 2 | .056 | False | True | False | False | False | False |
| 3 | .125 | False | True | True | True | True | False |
| 4 | .125 | False | False | True | False | False | False |
| 5 | .056 | False | False | False | True | True | False |
| 6 | .056 | True | False | True | False | False | True |
| 7 | .125 | True | False | False | True | False | True |
| 8 | .056 | True | False | True | False | True | True |
| 9 | .056 | True | True | True | False | True | True |
| 10 | .056 | True | True | True | True | True | True |
| 11 | .056 | True | True | False | False | False | False |
| 12 | .056 | True | True | False | False | True | False |
| 13 | .056 | True | False | True | True | True | True |
Next, try A: learner 2 learns CONCEPT = A (it misclassifies examples 11 and 12).
Update the weights again:
| Ex. # | Weight | A | B | C | D | E | CONCEPT |
|---|---|---|---|---|---|---|---|
| 1 | 0.07 | False | False | True | False | True | False |
| 2 | 0.03 | False | True | False | False | False | False |
| 3 | 0.07 | False | True | True | True | True | False |
| 4 | 0.07 | False | False | True | False | False | False |
| 5 | 0.03 | False | False | False | True | True | False |
| 6 | 0.03 | True | False | True | False | False | True |
| 7 | 0.07 | True | False | False | True | False | True |
| 8 | 0.03 | True | False | True | False | True | True |
| 9 | 0.03 | True | True | True | False | True | True |
| 10 | 0.03 | True | True | True | True | True | True |
| 11 | 0.25 | True | True | False | False | False | False |
| 12 | 0.25 | True | True | False | False | True | False |
| 13 | 0.03 | True | False | True | True | True | True |
Next, try E: learner 3 learns CONCEPT = E.
Update the weights, and continue…
Final classifier, with stumps tried in the order C, A, E, D, B:
- The weight of each hypothesis is determined by its overall error
- Weighted-majority weights: A = 2.1, B = 0.9, C = 0.8, D = 1.4, E = 0.09
- 100% accuracy on the training set
BOOSTING STRATEGIES

The preceding weighting strategy is the popular AdaBoost algorithm (see R&N p. 667). Many other strategies exist.

Typically, as the number of hypotheses increases, accuracy increases as well. Does this conflict with Occam's razor?
ANNOUNCEMENTS
Next class: neural networks & function learning (R&N 18.6-7)