Polynomial Curve Fitting BITS C464/BITS F464 Navneet Goyal Department of Computer Science, BITS-Pilani, Pilani Campus, India

Upload: bruno-burnley

Post on 14-Dec-2015

Polynomial Curve Fitting

• Seems a very trivial concept!
• Why are we discussing it in a Machine Learning course?
• A simple regression problem!
• It motivates a number of key concepts of ML!
• Let’s discover…

Polynomial Curve Fitting

• Observe a real-valued input variable x
• Use x to predict the value of the target variable t
• Synthetic data generated from sin(2πx)
• Random noise in the target values

[Figure: training data. Horizontal axis: Input Variable; vertical axis: Target Variable]

Reference: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006

Polynomial Curve Fitting

• N observations of x: x = (x1, …, xN)^T, with targets t = (t1, …, tN)^T
• Goal is to exploit the training set to predict the value of t from x
• Inherently a difficult problem

Data Generation:

• N = 10
• Inputs spaced uniformly in the range [0, 1]
• Targets generated from sin(2πx) by adding small Gaussian noise
• Noise is typically due to unobserved variables
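The data generation described above can be sketched in a few lines of Python. This is only an illustration: the function name `make_data`, the noise standard deviation of 0.3 and the random seed are assumptions, since the slide only says "small Gaussian noise".

```python
import numpy as np

def make_data(n=10, noise_std=0.3, seed=0):
    """Return n inputs spaced uniformly in [0, 1] and noisy sin(2*pi*x) targets.

    noise_std and seed are illustrative assumptions, not values from the slides.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)  # uniformly spaced inputs, endpoints included
    t = np.sin(2 * np.pi * x) + rng.normal(0.0, noise_std, size=n)
    return x, t

x, t = make_data()
print(x.shape, t.shape)
```

Fixing the seed keeps the training set reproducible, which makes it easier to compare fits for different polynomial orders later on.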

Polynomial Curve Fitting

y(x, w) = w0 + w1 x + w2 x^2 + … + wM x^M

• Where M is the order of the polynomial
• Is a higher value of M better? We’ll see shortly!
• Coefficients w0, …, wM are denoted by the vector w
• Nonlinear function of x, linear function of the coefficients w
• Such functions are called linear models
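The "nonlinear in x, linear in w" point can be checked numerically. The sketch below is an illustration only; the helper names `design_matrix` and `predict` and the example coefficient vectors are made up for this purpose.

```python
import numpy as np

def design_matrix(x, M):
    """Columns are x**0, x**1, ..., x**M, so the result has shape (N, M + 1)."""
    return np.vander(np.asarray(x, dtype=float), M + 1, increasing=True)

def predict(x, w):
    """Evaluate y(x, w) = sum_j w[j] * x**j for every entry of x."""
    w = np.asarray(w, dtype=float)
    return design_matrix(x, len(w) - 1) @ w

x = np.array([0.0, 0.5, 1.0])
w1 = np.array([1.0, -2.0, 0.5])   # arbitrary example coefficients
w2 = np.array([0.0, 1.0, 3.0])

# Linearity in w: y(x, a*w1 + b*w2) equals a*y(x, w1) + b*y(x, w2).
lhs = predict(x, 2 * w1 + 3 * w2)
rhs = 2 * predict(x, w1) + 3 * predict(x, w2)
print(np.allclose(lhs, rhs))
```

Because the prediction is a matrix product of the design matrix with w, fitting w reduces to a linear least-squares problem even though the curve itself is not a straight line.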

Sum-of-Squares Error Function
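The formula on this slide did not survive in the transcript. Assuming it is the standard form from the cited Bishop text, E(w) = ½ Σ_n (y(x_n, w) − t_n)², a minimal sketch of the error and of its minimiser w* looks as follows. The function names are hypothetical, and `np.linalg.lstsq` is one of several ways to solve the least-squares problem.

```python
import numpy as np

def sum_of_squares_error(w, x, t):
    """E(w) = 0.5 * sum_n (y(x_n, w) - t_n)**2, assuming Bishop's definition."""
    y = np.polynomial.polynomial.polyval(x, w)  # w[0] + w[1]*x + w[2]*x**2 + ...
    return 0.5 * np.sum((y - t) ** 2)

def fit_least_squares(x, t, M):
    """Return the coefficient vector w* that minimises E(w) for order M."""
    Phi = np.vander(x, M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

# Sanity check on noise-free cubic data: the fit should recover the coefficients.
x = np.linspace(0.0, 1.0, 10)
t = 1.0 + 2.0 * x - 3.0 * x ** 3
w_star = fit_least_squares(x, t, M=3)
print(w_star, sum_of_squares_error(w_star, x, t))
```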

Polynomial curve fitting

Polynomial curve fitting

• Choice of M?
• Called model selection or model comparison

0th Order Polynomial

Poor representations of sin(2πx)

1st Order Polynomial

Poor representations of sin(2πx)

3rd Order Polynomial

Best Fit to sin(2πx)

9th Order Polynomial

Over Fit: Poor representation of sin(2πx)

Polynomial Curve Fitting

• Good generalization is the objective
• Dependence of generalization performance on M?
• Consider a data set of 100 points
• Calculate E(w*) for both training data and test data
• Choose the M that minimizes E(w*)
• Root Mean Square (RMS) error: E_RMS = sqrt(2E(w*)/N)

E_RMS is sometimes more convenient to use, as division by N allows us to compare data sets of different sizes on an equal footing.

The square root ensures that E_RMS is measured on the same scale (and in the same units) as the target variable t.
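A rough sketch of this experiment is given below, assuming the RMS error is defined as E_RMS = sqrt(2E(w*)/N) as in the cited Bishop text. The training set of 10 points, the test set of 100 points, the noise level of 0.3 and the helper names are all illustrative assumptions.

```python
import numpy as np

def make_data(n, rng, noise_std=0.3):
    """Noisy sin(2*pi*x) samples on n uniformly spaced inputs (assumed noise level)."""
    x = np.linspace(0.0, 1.0, n)
    t = np.sin(2 * np.pi * x) + rng.normal(0.0, noise_std, size=n)
    return x, t

def fit(x, t, M):
    """Least-squares coefficients w* of an order-M polynomial."""
    Phi = np.vander(x, M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

def rms_error(w, x, t):
    """E_RMS = sqrt(2 * E(w) / N) with E(w) the sum-of-squares error."""
    y = np.vander(x, len(w), increasing=True) @ w
    E = 0.5 * np.sum((y - t) ** 2)
    return np.sqrt(2 * E / len(x))

rng = np.random.default_rng(0)
x_train, t_train = make_data(10, rng)    # training set
x_test, t_test = make_data(100, rng)     # independent test set

train_rms, test_rms = [], []
for M in range(10):
    w = fit(x_train, t_train, M)
    train_rms.append(rms_error(w, x_train, t_train))
    test_rms.append(rms_error(w, x_test, t_test))
    print(f"M={M}  train E_RMS={train_rms[-1]:.3f}  test E_RMS={test_rms[-1]:.3f}")
```

With 10 training points, the M = 9 polynomial has as many coefficients as there are data points, so its training error drops to essentially zero while its test error does not.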

Flexibility & Model Complexity

• M = 0, very rigid! Only 1 parameter to play with!

Flexibility & Model Complexity

• M = 1, not so rigid! 2 parameters to play with!

Flexibility & Model Complexity

• So what value of M is most suitable?

Any answers?

Over-fitting

• For small M (0, 1, 2): too inflexible to handle the oscillations of sin(2πx)
• For M from 3 to 8: flexible enough to handle the oscillations of sin(2πx)
• For M = 9: too flexible! Training error (TE) = 0, generalization error (GE) = high

Why is it happening?

Polynomial Coefficients
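The coefficient table from this slide is missing from the transcript. As a stand-in, the sketch below prints the largest coefficient magnitude of the least-squares fit for several orders M, on synthetic data generated under assumed settings (10 points, noise standard deviation 0.3, fixed seed). The numbers it prints are not the values from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, size=10)  # assumed noise level

max_abs = {}
for M in (0, 1, 3, 9):
    Phi = np.vander(x, M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    max_abs[M] = float(np.max(np.abs(w)))
    print(f"M={M}: largest |w_j| = {max_abs[M]:.3g}")
```

For M = 9 the polynomial passes through every noisy training point, which typically requires very large coefficients of alternating sign.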

Data Set Size (M = 9)

• The larger the data set, the more complex the model we can afford to fit to the data
• Rule of thumb: the number of data points should be no less than 5 to 10 times the number of adaptive parameters in the model
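The effect of data set size on the M = 9 fit can be sketched as follows. The training sizes 15 and 100, the noise level, the averaging over 20 seeds and the function name `generalization_rms` are assumptions chosen for illustration. The error is measured against the noise-free curve sin(2πx) on a dense grid.

```python
import numpy as np

def generalization_rms(n_train, M=9, n_seeds=20, noise_std=0.3):
    """Average RMS distance between the fitted polynomial and sin(2*pi*x)."""
    x_grid = np.linspace(0.0, 1.0, 200)
    y_true = np.sin(2 * np.pi * x_grid)
    errors = []
    for seed in range(n_seeds):
        rng = np.random.default_rng(seed)
        x = np.linspace(0.0, 1.0, n_train)
        t = np.sin(2 * np.pi * x) + rng.normal(0.0, noise_std, size=n_train)
        Phi = np.vander(x, M + 1, increasing=True)
        w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
        y_fit = np.vander(x_grid, M + 1, increasing=True) @ w
        errors.append(np.sqrt(np.mean((y_fit - y_true) ** 2)))
    return float(np.mean(errors))

small = generalization_rms(15)    # few points for 10 adaptive parameters
large = generalization_rms(100)   # ten times as many points as parameters
print(f"N=15: {small:.3f}   N=100: {large:.3f}")
```

With N = 100 the same order-9 model has ten data points per adaptive parameter, in line with the rule of thumb above.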

Over-fitting Problem

Should we limit the no. of parameters according to the available training set?

Complexity of the model should depend only on the complexity of the problem!

LSE represents a specific case of Maximum Likelihood

Over-fitting is a general property of maximum likelihood

Over-fitting Problem can be avoided using the Bayesian Approach!

Over-fitting Problem

In Bayesian Approach, the effective number of parameters adapts automatically to the size of the data set

In Bayesian Approach, models can have more parameters than the number of data points

Regularization

Penalize large coefficient values
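The penalised error function itself is missing from the transcript. Assuming the quadratic penalty used in the cited Bishop text, Ẽ(w) = ½ Σ_n (y(x_n, w) − t_n)² + (λ/2)‖w‖², the minimiser has a closed form, sketched below. The function name `fit_ridge`, the data settings and the two λ values (ln λ = −18 and ln λ = 0) are assumptions for illustration, and for simplicity w0 is penalised along with the other coefficients.

```python
import numpy as np

def fit_ridge(x, t, M, lam):
    """Minimise 0.5*||Phi w - t||^2 + 0.5*lam*||w||^2 via the normal equations."""
    Phi = np.vander(x, M + 1, increasing=True)
    A = Phi.T @ Phi + lam * np.eye(M + 1)
    return np.linalg.solve(A, Phi.T @ t)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, size=10)  # assumed noise level

# Unregularised M = 9 fit for comparison.
w_ls, *_ = np.linalg.lstsq(np.vander(x, 10, increasing=True), t, rcond=None)
w_weak = fit_ridge(x, t, M=9, lam=np.exp(-18))   # weak penalty
w_strong = fit_ridge(x, t, M=9, lam=1.0)         # strong penalty

for name, w in [("none", w_ls), ("ln(lam)=-18", w_weak), ("ln(lam)=0", w_strong)]:
    print(f"penalty {name:>12}: ||w|| = {np.linalg.norm(w):.3g}")
```

Increasing λ shrinks the coefficient vector: a very small λ already tames the wild M = 9 coefficients, while a large λ flattens the curve and can under-fit.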

Regularization:

Regularization:

Regularization: vs.

Polynomial Coefficients

Take Away from Polynomial Curve Fitting

• Concept of over-fitting
• Model complexity & flexibility

Will keep revisiting it from time to time…