overfitting, cross-validation2005/02/02 · overfitting, cross-validation recommended reading: •...

Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning 10-701 Tom M. Mitchell Carnegie Mellon University

Upload: others

Post on 04-Aug-2021

8 views

Category:

Documents

0 download

Report

Download

Embed Size (px):

TRANSCRIPT

Overfitting, Cross-Validation

Recommended reading:

• Neural nets: Mitchell Chapter 4

• Decision trees: Mitchell Chapter 3

Machine Learning 10-701

Tom M. MitchellCarnegie Mellon University

Page 2: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Overview

• Followup on neural networks– Example: Face classification

• Cross validation– Training error– Test error– True error

• Decision trees– ID3, C4.5– Trees and rules

Page 3: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Page 4: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Page 5: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Page 6: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

# of gradient descent steps

Page 7: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

# of gradient descent steps

Page 8: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

# of gradient descent steps

Page 9: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Page 10: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Page 11: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Cognitive Neuroscience Models Based on ANN’s[McClelland & Rogers, Nature 2003]

Page 12: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Page 13: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Page 14: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

How should we choose the number of weight updates?

Page 15: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Page 16: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

How should we choose the number of weight updates?

How should we allocate N examples to training, validation sets?

How will curves change if we double training set size?

How will curves change if we double validation set size?

What is our best unbiased estimate of true network error?

Page 17: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Overfitting and Cross ValidationOverfitting: a learning algorithm overfits the

training data if it outputs a hypothesis, h 2 H, when there exists h’ 2 H such that:

where

Page 18: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Three types of errorTrue error:

Train set error:

Test set error:

Page 19: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Bias in estimates

Gives a biased (optimistically) estimate for

Gives an unbiased estimate for

Page 20: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Leave one out cross validationMethod for estimating true error of h’• e=0• For each training example z

– Training on {data – z}– Test on single example z; if error, then e e+1

Final error estimate (for training on sample of size |data|-1) is: e / |data|

Page 21: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Leave one out cross validationThe leave-one-out error, e / |data|, gives an almost

unbiased estimate for

Page 22: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

Leave one out cross validationIn fact, the e / |data| estimate of leave-one-out cross

validation is a slightly pessimistic estimate of

Page 23: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

How should we choose the number of weight updates?

How should we allocate N examples to training, validation sets?

How will curves change if we double training set size?

How will curves change if we double validation set size?

What is our best unbiased estimate of true network error?

Page 24: Overfitting, Cross-Validation2005/02/02 · Overfitting, Cross-Validation Recommended reading: • Neural nets: Mitchell Chapter 4 • Decision trees: Mitchell Chapter 3 Machine Learning

What you should know:

• Neural networks– Hidden layer representations

• Cross validation– Training error, test error, true error– Cross validation as low-bias estimator

Cross-validation for detecting and preventing overfitting./awm/tutorials/overfit10.pdf · 2008. 7. 7. · k-fold Cross Validation x y Randomly break the dataset into k partitions

Backtest OverFitting

The problem of overfitting

The Impact of Overfitting and Overgeneralization on the

M -L N P OVERFITTING REDUCTION

Overfitting, Cross Validation, MDL, Structural Risk Minimization, … · 2003. 10. 16. · Overfitting, Cross Validation, MDL, Structural Risk Minimization, Using unlabeled data Tom

What’s learning, revisited Overfitting Bayes optimal ...guestrin/Class/10701-S07/Slides/overfitting-naivebayes.pdf · 1 ©Carlos Guestrin 2005-2007 What’s learning, revisited

Dropout: A Simple Way to Prevent Neural Networks from Overfitting

Justice Mitchell: Psychotic Cross Content Training #FLBlogCon13

if name == ' mainwccclab.cs.nchu.edu.tw/www/images/105-2Python/python 8.pdf · 2017. 5. 31. · Overfitting • Overfitting occurs when a model is excessively complex, such as having

Construction Manual of a Cross-Flow/Banki TurbineThe Cross-Flow Turbine The Cross-Flow turbine, also called the Banki Turbine, was designed by Anthony Mitchell in 1903. “The turbine

Overfitting and K-Nearest Neighbour · 2019. 10. 16. · Overfitting and K-Nearest Neighbour Dr. Xiaowei Huang xiaowei

Bias-Variance Tradeoff › ~guestrin › Class › 15781 › slides › overfitting-logis… · What’s learning, revisited Overfitting Generative versus Discriminative Logistic

THE PROBABILITY OF BACKTEST OVERFITTING

Overfitting and-tbl

Overfitting Overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship. Overfitting generally occurs

In search of relevant predictors for marine species ... · Cross-validation (CV) is a widespread strategy used to perform model selection while avoiding under- and overfitting models

Overfitting in Neural Nets: Backpropagation, Conjugate ... · Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping Rich Caruana CALD,CMU 5000 Forbes

From Mating Pool Distributions to Model Overfitting

In the United States Court of Appeals MATTHEW MITCHELL ...Mar 09, 2020 · MATTHEW MITCHELL, Plaintiff – Appellant Cross – Appellee v. ORICO BAILEY, Defendant – Appellee HOOPA

Data Mining Model Overfitting Introduction to Data Mining ...kumar001/dmbook/slides/chap3_overfitting.pdf · 03/26/2018 Introduction to Data Mining, 2ndEdition 9 Model Overfitting

Machine learning methodology: Overfitting, regularization, and …russell/classes/cs194... · 2011-09-12 · Machine learning methodology: Overfitting, regularization, and all that

Introduction to Machine Learning - Brown Universitycs.brown.edu/courses/csci1950-f/spring2011/lectures/2011-01-27... · 27/01/2011 · •!Model selection, cross-validation, overfitting

Generalization and Overfitting

Regularization The problem of overfitting Machine Learning

Data Classification Preprocessing Overfitting in Decision ...rjohns15/cse40647.sp14/www/content/lectures/… · Data Preprocessing Classification & Regression Overfitting in Decision

A modeller’s dilemma: overfitting or underperforming

Overfitting - courses.cs.washington.edu

Learning with ensembles: How overfitting can be useful

What’s learning, revisited Overfitting Bayes optimal classifier ...Classification: Mitchell’s Machine Learning, Chapter 6 What’s learning, revisited Overfitting Bayes optimal

Cross-Flow Hydroturbine ExplorationThe turbine in question is a Cross Flow or Banki-Mitchell Turbine. The Cross Flow Turbine was originally designed by an Australian engineer named

HOW TO SPOT BACKTEST OVERFITTING

Chapter 4 Overfitting Avoidance in Regression Treesltorgo/PhD/th4.pdf · Chapter 4 Overfitting Avoidance in Regression Trees This chapter describes several approaches that try to