1 formal evaluation techniques chapter 7. 2 test set error rates, confusion matrices, lift charts...
Formal Evaluation Techniques
Chapter 7
• test set error rates, confusion matrices, lift charts
• Focusing on formal evaluation methods for supervised learning and unsupervised clustering
7.1 What Should Be Evaluated?
1. Supervised Model
2. Training Data
3. Attributes
4. Model Builder
5. Parameters
6. Test Set Evaluation
[Figure: diagram relating Data, Instances, Attributes, Parameters, Training Data, Test Data, Model Builder, Supervised Model, and Evaluation.]
7.2 Tools for Evaluation
Single-Valued Summary Statistics
• Mean
• Variance
• Standard deviation
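The three summary statistics named on this slide can be computed directly. The following is a minimal sketch, assuming the sample (n - 1) form of the variance; the function name `summarize` and the `ages` values are hypothetical and not taken from the chapter.

```python
import math

def summarize(values):
    """Return (mean, sample variance, sample standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance: squared deviations divided by n - 1 (an assumption;
    # divide by n instead for the population form).
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return mean, variance, math.sqrt(variance)

# Hypothetical attribute values, for illustration only.
ages = [45, 52, 58, 61, 49]
mean, variance, sd = summarize(ages)
print(f"mean={mean:.2f} variance={variance:.2f} sd={sd:.2f}")
```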
[Figure: The Normal Distribution. f(x) plotted against x, with x in standard deviations from the mean. Area under the curve by interval: below -3, 0.13%; -3 to -2, 2.14%; -2 to -1, 13.54%; -1 to 0, 34.13%; 0 to 1, 34.13%; 1 to 2, 13.54%; 2 to 3, 2.14%; above 3, 0.13%.]
Normal Distributions and Sample Means
• The means of random, independent samples of equal size drawn from a population are approximately normally distributed.
• A sample mean will fall within two standard errors of the population mean about 95% of the time.
Computing the Standard Error
• The variance of the sample mean is estimated by dividing the sample variance by the sample size.
• The standard error is the square root of this estimated variance.
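The two steps for computing the standard error can be sketched as follows. This is an illustrative sketch only: the function name `standard_error` and the sample values are hypothetical, and `statistics.variance` is assumed to be the intended sample variance.

```python
import math
import statistics

def standard_error(sample):
    """Square root of (sample variance / sample size)."""
    variance = statistics.variance(sample)  # sample variance (n - 1 divisor)
    return math.sqrt(variance / len(sample))

# Hypothetical sample, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error(sample)
sample_mean = statistics.mean(sample)

# Roughly 95% of sample means fall within two standard errors.
interval = (sample_mean - 2 * se, sample_mean + 2 * se)
print(f"SE={se:.4f} interval=({interval[0]:.3f}, {interval[1]:.3f})")
```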
[Figure: a population of instances X1 through X10 with three samples (Sample 1, Sample 2, Sample 3) drawn from it.]
A Classical Model for Hypothesis Testing
• Hypothesis: educated guess about the outcome of some event
• Experimental group, Control group
• Null hypothesis: There is no significant difference in the mean increase or decrease of total allergic reactions per day between patients in the group receiving treatment X and patients in the group receiving the placebo.
A Classical Model for Hypothesis Testing
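$$P = \frac{|\bar{X}_1 - \bar{X}_2|}{\sqrt{v_1/n_1 + v_2/n_2}}$$

where
$P$ is the significance score;
$\bar{X}_1$ and $\bar{X}_2$ are sample means for the independent samples;
$v_1$ and $v_2$ are variance scores for the respective means; and
$n_1$ and $n_2$ are the corresponding sample sizes.

To be 95% confident that the difference is significant, P must be at least 2.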
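As a rough illustration of the significance score, the sketch below computes P from summary statistics for two independent samples. The function name `significance_score` and all numeric inputs are hypothetical, and the absolute value of the difference is assumed so that the score is never negative.

```python
import math

def significance_score(mean1, var1, n1, mean2, var2, n2):
    """P = |mean1 - mean2| / sqrt(var1/n1 + var2/n2)."""
    return abs(mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)

# Hypothetical summary statistics for a treatment group and a placebo group.
p = significance_score(mean1=4.0, var1=4.0, n1=100,
                       mean2=3.0, var2=5.0, n2=100)

# Reject the null hypothesis at the 95% level when P is at least 2.
significant = p >= 2
print(f"P={p:.3f} significant={significant}")
```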
Table 7.1 • A Confusion Matrix for the Null Hypothesis

| | Computed Accept | Computed Reject |
|---|---|---|
| Accept Null Hypothesis | True Accept | Type 1 Error |
| Reject Null Hypothesis | Type 2 Error | True Reject |
7.3 Computing Test Set Confidence Intervals
$$\text{Classifier Error Rate } (E) = \frac{\#\text{ of test set errors}}{\#\text{ of test set instances}}$$
Computing 95% Confidence Intervals
1. Given a test set sample S of size n and error rate E
2. Compute sample variance as V= E(1-E)
3. Compute the standard error (SE) as the square root of (V / n).
4. Calculate an upper bound error as E + 2(SE)
5. Calculate a lower bound error as E - 2(SE)
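The five steps above can be sketched as a short function. The function name `error_rate_interval` and the counts used in the example are hypothetical; the arithmetic follows the listed steps.

```python
import math

def error_rate_interval(num_errors, num_instances):
    """Return (E, lower bound, upper bound) for a 95% confidence interval."""
    e = num_errors / num_instances      # step 1: test set error rate E
    v = e * (1 - e)                     # step 2: sample variance V
    se = math.sqrt(v / num_instances)   # step 3: standard error SE
    return e, e - 2 * se, e + 2 * se    # steps 4 and 5: bounds

# Hypothetical test set: 10 errors on 100 instances.
e, lower, upper = error_rate_interval(10, 100)
print(f"E={e:.2f} 95% interval=({lower:.2f}, {upper:.2f})")
```

With 10 errors on 100 test instances, E is 0.10, SE is 0.03, and the interval runs from 0.04 to 0.16.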
Three general comments
• The test data must have been randomly chosen from the pool of all possible test set instances
• Test, training, and validation data must represent disjoint sets
• The instances in each class should be distributed in the training, validation, and test data as they are seen in the entire dataset
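One way to satisfy all three comments at once is a stratified random split. The sketch below is only an illustration under stated assumptions: the function name `stratified_split`, the 60/20/20 fractions, the fixed seed, and the class labels are all hypothetical choices, not part of the chapter.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return disjoint index lists (train, validation, test) that keep
    each class in roughly the same proportion as in the full dataset."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, validation, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random choice within each class
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        validation += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, validation, test

# Hypothetical dataset: 40 "sick" and 60 "healthy" instances.
labels = ["sick"] * 40 + ["healthy"] * 60
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because every index is assigned to exactly one list, the three sets are disjoint by construction.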
7.4 Comparing Supervised Learner Models
Comparing Models with Independent Test Data
$$P = \frac{|E_1 - E_2|}{\sqrt{q(1-q)(1/n_1 + 1/n_2)}}$$

where
E1 = the error rate for model M1
E2 = the error rate for model M2
q = (E1 + E2)/2
n1 = the number of instances in test set A
n2 = the number of instances in test set B
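The comparison of two models on independent test sets can be sketched as follows. The function name `compare_error_rates` and the error rates and test set sizes are hypothetical; the absolute value of the difference is assumed, and the threshold of 2 is the 95% rule used earlier for hypothesis testing.

```python
import math

def compare_error_rates(e1, n1, e2, n2):
    """P = |E1 - E2| / sqrt(q(1 - q)(1/n1 + 1/n2)), with q = (E1 + E2)/2."""
    q = (e1 + e2) / 2
    return abs(e1 - e2) / math.sqrt(q * (1 - q) * (1 / n1 + 1 / n2))

# Hypothetical results: model M1 errs on 20% of test set A (100 instances),
# model M2 errs on 30% of test set B (100 instances).
p = compare_error_rates(0.20, 100, 0.30, 100)
print(f"P={p:.3f} significant={p >= 2}")
```

In this made-up case P is about 1.63, which is below 2, so the difference between the two error rates would not be judged significant at the 95% level.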
7.5 Attribute Evaluation
Locating Redundant Attributes with Excel
• Correlation Coefficient
• Positive Correlation
• Negative Correlation
• Curvilinear Relationship (curved line)
– Two attributes having a low r value may still have a curvilinear relationship.
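The slides use Excel for this step; as an alternative illustration, the sketch below computes the Pearson correlation coefficient r in Python for three hypothetical attribute pairs: a perfect positive relationship, a perfect negative one, and a curvilinear one where r is 0 even though B is fully determined by A. The function name `pearson_r` and the data are assumptions made for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical values of Attribute A and three candidate Attribute B columns.
a = list(range(11))
r_pos = pearson_r(a, [2 * x + 1 for x in a])       # straight line, rising
r_neg = pearson_r(a, [10 - x for x in a])          # straight line, falling
r_curve = pearson_r(a, [(x - 5) ** 2 for x in a])  # symmetric parabola
print(r_pos, r_neg, r_curve)
```

A high absolute value of r suggests one attribute of the pair may be redundant, while the parabola case shows why a scatterplot is still worth checking when r is low.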
[Figure: Positive Correlation, r = 1. Scatterplot of Attribute B against Attribute A.]
[Figure: Negative Correlation, r = -1. Scatterplot of Attribute B against Attribute A.]
[Figure: Curvilinear Relationship, r = 0. Scatterplot of Attribute B against Attribute A.]
Creating a Scatterplot Diagram with MS Excel
[Figure: Blood Pressure vs. Cholesterol. Scatterplot of Cholesterol (0 to 450) against Blood Pressure (0 to 200).]
Hypothesis Testing for Numerical Attribute Significance
$$P_{ij} = \frac{|\bar{X}_i - \bar{X}_j|}{\sqrt{v_i/n_i + v_j/n_j}}$$

where
$\bar{X}_i$ is the class i mean and $\bar{X}_j$ is the class j mean for attribute A;
$v_i$ is the class i variance and $v_j$ is the class j variance for attribute A; and
$n_i$ is the number of instances in $C_i$ and $n_j$ is the number of instances in $C_j$.
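To show how the attribute significance score might be applied to raw data, the sketch below computes it for each numeric attribute from per-class value lists. Everything here is hypothetical: the function name `attribute_significance`, the attribute names, and the tiny made-up samples. The sample (n - 1) variance and the absolute value of the mean difference are assumptions.

```python
import math
import statistics

def attribute_significance(values_i, values_j):
    """P_ij for one attribute, from the attribute's values in two classes."""
    mean_i, mean_j = statistics.mean(values_i), statistics.mean(values_j)
    var_i, var_j = statistics.variance(values_i), statistics.variance(values_j)
    return abs(mean_i - mean_j) / math.sqrt(
        var_i / len(values_i) + var_j / len(values_j)
    )

# Hypothetical attribute values for two classes, for illustration only.
class_i = {"age": [60, 62, 58, 61, 59], "chol": [240, 260, 250, 230, 270]}
class_j = {"age": [50, 52, 48, 51, 49], "chol": [245, 255, 250, 235, 265]}

scores = {name: attribute_significance(class_i[name], class_j[name])
          for name in class_i}
# Attributes scoring at least 2 differ significantly between the classes.
significant = [name for name, p in scores.items() if p >= 2]
print(scores, significant)
```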
Table 7.2 • Cardiology Patient Data: Numerical Attribute Significance

| Attribute | Statistic | Class Sick | Class Healthy | ESX Attribute Significance | Hypothesis Test for Significance |
|---|---|---|---|---|---|
| Age | Mean | 56.50 | 52.50 | 0.45 | 4.076 |
| | Sd | 7.96 | 9.55 | | |
| BP | Mean | 134.40 | 129.30 | 0.29 | 2.511 |
| | Sd | 18.73 | 16.17 | | |
| Chol | Mean | 251.09 | 242.23 | 0.17 | 1.495 |
| | Sd | 49.46 | 53.55 | | |
| MHR | Mean | 139.10 | 158.47 | 0.85 | 7.955 |
| | Sd | 22.60 | 19.1 | | |
| Peak | Mean | 1.59 | 0.58 | 0.86 | 8.001 |
| | Sd | 1.30 | 0.78 | | |
7.6 Unsupervised Evaluation Techniques
• Unsupervised Clustering for Supervised Evaluation
– If the instances cluster into the predefined classes contained in the training data, a supervised learner model built with the training data is likely to perform well.
• Supervised Evaluation for Unsupervised Clustering
– Designate each formed cluster as a class
– Build a supervised learner model by choosing a random sampling of instances from each class
– Test the supervised learner model with the remaining instances
• Additional Methods
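The three steps of supervised evaluation for unsupervised clustering can be sketched as below. The slides do not name a particular supervised learner, so a nearest-centroid classifier is swapped in here purely for illustration; the function names, the 50% training fraction, the seed, and the data points are all hypothetical.

```python
import random

def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    return [sum(coords) / len(points) for coords in zip(*points)]

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def evaluate_clustering(instances, cluster_ids, train_fraction=0.5, seed=0):
    """Accuracy of a nearest-centroid model trained on part of each cluster.
    Assumes every cluster has at least two instances."""
    rng = random.Random(seed)
    by_cluster = {}  # step 1: each formed cluster becomes a class
    for inst, cid in zip(instances, cluster_ids):
        by_cluster.setdefault(cid, []).append(inst)

    centroids, held_out = {}, []
    for cid, members in by_cluster.items():
        members = members[:]
        rng.shuffle(members)  # step 2: random sample from each class
        n_train = max(1, int(train_fraction * len(members)))
        centroids[cid] = centroid(members[:n_train])
        held_out += [(inst, cid) for inst in members[n_train:]]

    # Step 3: test the model with the remaining instances.
    correct = sum(
        min(centroids, key=lambda c: squared_distance(inst, centroids[c])) == cid
        for inst, cid in held_out
    )
    return correct / len(held_out)

# Hypothetical clustering result: two well-separated clusters.
instances = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.0),
             (8.0, 8.0), (8.2, 7.9), (7.9, 8.1), (8.1, 8.0)]
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
accuracy = evaluate_clustering(instances, cluster_ids)
print(f"test set accuracy: {accuracy:.2f}")
```

A high test set accuracy suggests the clusters are well defined; a low one suggests the clustering did not find clearly separable groups.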
Additional Methods
• Designate all instances as training data
• Apply an alternative technique’s measure of cluster quality
• Create your own measure of cluster quality
• Perform a between-cluster attribute-value comparison.
7.7 Evaluating Supervised Models with Numeric Output
Mean Squared Error
$$mse = \frac{(a_1 - c_1)^2 + (a_2 - c_2)^2 + \cdots + (a_i - c_i)^2 + \cdots + (a_n - c_n)^2}{n}$$

where for the ith instance,
$a_i$ = actual output value
$c_i$ = computed output value
Mean Absolute Error
$$mae = \frac{|a_1 - c_1| + |a_2 - c_2| + \cdots + |a_n - c_n|}{n}$$

where for the ith instance,
$a_i$ = actual output value
$c_i$ = computed output value
Table 7.3 • Absolute and Squared Error

| Instance Number | Life Ins. Promo. Actual Output | Computed Output | Absolute Error | Squared Error |
|---|---|---|---|---|
| 1 | 0.0 | 0.024 | 0.024 | 0.0005 |
| 2 | 1.0 | 0.998 | 0.002 | 0.0000 |
| 3 | 0.0 | 0.023 | 0.023 | 0.0005 |
| 4 | 1.0 | 0.986 | 0.014 | 0.0002 |
| 5 | 1.0 | 0.999 | 0.001 | 0.0000 |
| 6 | 0.0 | 0.050 | 0.050 | 0.0025 |
| 7 | 1.0 | 0.999 | 0.001 | 0.0000 |
| 8 | 0.0 | 0.262 | 0.262 | 0.0686 |
| 9 | 0.0 | 0.060 | 0.060 | 0.0036 |
| 10 | 1.0 | 0.997 | 0.003 | 0.0000 |
| 11 | 1.0 | 0.999 | 0.001 | 0.0000 |
| 12 | 1.0 | 0.776 | 0.224 | 0.0502 |
| 13 | 1.0 | 0.999 | 0.001 | 0.0000 |
| 14 | 0.0 | 0.023 | 0.023 | 0.0005 |
| 15 | 1.0 | 0.999 | 0.001 | 0.0000 |
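The mean absolute error and mean squared error for the 15 instances of Table 7.3 can be computed with a short sketch. The actual and computed output values are copied from the table; the function names `mean_absolute_error` and `mean_squared_error` are hypothetical.

```python
def mean_absolute_error(actual, computed):
    """Average of |a_i - c_i| over all instances."""
    return sum(abs(a - c) for a, c in zip(actual, computed)) / len(actual)

def mean_squared_error(actual, computed):
    """Average of (a_i - c_i)^2 over all instances."""
    return sum((a - c) ** 2 for a, c in zip(actual, computed)) / len(actual)

# Actual and computed outputs for instances 1 to 15 of Table 7.3.
actual = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0,
          0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0]
computed = [0.024, 0.998, 0.023, 0.986, 0.999, 0.050, 0.999, 0.262,
            0.060, 0.997, 0.999, 0.776, 0.999, 0.023, 0.999]

mae = mean_absolute_error(actual, computed)
mse = mean_squared_error(actual, computed)
print(f"mae={mae:.4f} mse={mse:.4f}")
```

For these instances the mean absolute error is about 0.046 and the mean squared error is about 0.0085. Squaring gives the two largest misses (instances 8 and 12) most of the weight in the mse.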