1
Curve-Fitting
Regression
2
Some Applications of Curve Fitting
• To fit curves to a collection of discrete points in order to obtain intermediate estimates or to provide trend analysis
3
Some Applications of Curve Fitting
• Function approximation
  – e.g.: In the applications of numerical integration:

        f(x) ≈ pn(x)  ⇒  ∫[a,b] f(x) dx ≈ ∫[a,b] pn(x) dx

    where pn(x) is a polynomial of order n
• Hypothesis testing
  – Compare a theoretical data model to empirical data collected through experiments to test whether they agree with each other.
4
Two Approaches
• Regression – Find the "best" curve to fit the points. The curve does not have to pass through the points. (Fig (a))
• Interpolation – Fit a curve or series of curves that pass through every point. (Figs (b) & (c))
5
Curve Fitting
• Regression
  – Linear Regression
  – Polynomial Regression
  – Multiple Linear Regression
  – Non-linear Regression
• Interpolation
  – Newton's Divided-Difference Interpolation
  – Lagrange Interpolating Polynomials
  – Spline Interpolation
6
Linear Regression – Introduction
• Some data exhibit a linear relationship but contain noise.
• A curve that interpolates all points (that contain errors) would make a poor representation of the behavior of the data set.
• A straight line captures the linear relationship better.
7
Linear Regression
Objective: Want to fit the "best" line to the data points (that exhibit a linear relation).
– How do we define "best"?
  • Pass through as many points as possible
  • Minimize the maximum residual of each point
  • Each point carries the same weight
8
Linear Regression
Objective
• Given a set of points

    (x1, y1), (x2, y2), …, (xn, yn)

• Want to find a straight line

    y = a0 + a1x

  that best fits the points.
• The error or residual at each given point can be expressed as

    ei = yi – a0 – a1xi
9
Residual (Error) Measurement
10
Criteria for a "Best" Fit
• Minimize the sum of residuals

    Σ(i=1..n) ei = Σ(i=1..n) (yi – a0 – a1xi)

  – Inadequate
  – e.g.: Any line passing through the mid-point of the data would satisfy this criterion.
• Minimize the sum of absolute values of residuals (L1-norm)

    Σ(i=1..n) |ei| = Σ(i=1..n) |yi – a0 – a1xi|

  – The "best" line may not be unique.
  – e.g.: Any line lying within the upper and lower points would satisfy this criterion.
11
Criteria for a "Best" Fit
• Minimax method: Minimize the largest residual of all the points (L∞-norm)

    min(a0,a1) max(1≤i≤n) |ei| = min(a0,a1) max(1≤i≤n) |yi – a0 – a1xi|

  – Not easy to compute
  – Biased toward outliers
  – e.g.: For a data set with an outlier, the line is affected strongly by the outlier.

Note: The minimax method is sometimes well suited for fitting a simple function to a complicated function. (Why?)
12
Least-Square Fit
• Minimize the sum of squares of the residuals (L2-norm)

    Σ(i=1..n) ei² = Σ(i=1..n) (yi – a0 – a1xi)²

• Unique solution
• Easy to compute
• Closely related to statistics

How to find a0 and a1 that minimize Σ(i=1..n) (yi – a0 – a1xi)²?
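The competing criteria can be compared on a concrete data set. A minimal sketch (the sample points and the candidate line are made-up illustrations, not data from the slides):

```python
# Compare the error criteria for a candidate line y = a0 + a1*x.
# The data points and coefficients below are illustrative only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.2, 1.9, 3.2, 3.8]
a0, a1 = 0.3, 0.9

residuals = [y - a0 - a1 * x for x, y in zip(xs, ys)]

sum_e   = sum(residuals)                  # signed sum: cancels, inadequate
sum_abs = sum(abs(e) for e in residuals)  # L1-norm: minimizer may not be unique
max_abs = max(abs(e) for e in residuals)  # L-infinity norm (minimax criterion)
sum_sq  = sum(e * e for e in residuals)   # L2-norm: the least-squares criterion
```

Note how positive and negative residuals cancel in `sum_e`, which is exactly why the signed sum is an inadequate criterion.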
13
Least-Squares Fit of a Straight Line
Let Sr(a0, a1) = Σ(i=1..n) (yi – a0 – a1xi)².
To minimize Sr(a0, a1), we can find a0, a1 that satisfy

    ∂Sr/∂a0 = –2 Σ(i=1..n) (yi – a0 – a1xi) = 0
      ⇒ Σ(i=1..n) (yi – a0 – a1xi) = 0

    ∂Sr/∂a1 = –2 Σ(i=1..n) xi (yi – a0 – a1xi) = 0
      ⇒ Σ(i=1..n) (xiyi – a0xi – a1xi²) = 0
14
Least-Squares Fit of a Straight Line

    Σ(i=1..n) (yi – a0 – a1xi) = 0
      ⇒ Σ(i=1..n) yi – n·a0 – a1 Σ(i=1..n) xi = 0
      ⇒ n·a0 + (Σ(i=1..n) xi) a1 = Σ(i=1..n) yi

    Σ(i=1..n) (xiyi – a0xi – a1xi²) = 0
      ⇒ Σ(i=1..n) xiyi – a0 Σ(i=1..n) xi – a1 Σ(i=1..n) xi² = 0
      ⇒ (Σ(i=1..n) xi) a0 + (Σ(i=1..n) xi²) a1 = Σ(i=1..n) xiyi
These are called the normal equations.
How do you find a0 and a1?
15
Least-Squares Fit of a Straight Line
Solving the system of equations (all sums run over i = 1..n)

    n·a0 + (Σxi) a1 = Σyi
    (Σxi) a0 + (Σxi²) a1 = Σxiyi

or, in matrix form,

    [ n    Σxi  ] [ a0 ]   [ Σyi   ]
    [ Σxi  Σxi² ] [ a1 ] = [ Σxiyi ]

yields

    a1 = (n Σxiyi – Σxi Σyi) / (n Σxi² – (Σxi)²)
    a0 = ȳ – a1x̄,   where x̄ = (Σxi)/n and ȳ = (Σyi)/n
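The closed-form solution above translates directly into code. A minimal sketch (the sample points are illustrative, chosen to lie exactly on y = 1 + 2x so the fit recovers the coefficients):

```python
# Least-squares straight line y = a0 + a1*x via the closed-form solution
# of the normal equations.
def linear_fit(xs, ys):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a0 = sy / n - a1 * sx / n          # a0 = y_bar - a1 * x_bar
    return a0, a1

# Illustrative data on the line y = 1 + 2x:
a0, a1 = linear_fit([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])   # a0 = 1.0, a1 = 2.0
```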
16
Statistics Review
• Mean – The "best point" that minimizes the sum of squares of the residuals.

    ȳ = (1/n) Σ(i=1..n) yi                       (Mean)

• Standard deviation – Measures how the sample (data) spreads about the mean.
  – The smaller the standard deviation, the better the mean describes the sample.

    St = Σ(i=1..n) (yi – ȳ)²                     (Sum of squares of the residuals)
    s = √( St / (n – 1) )                        (Standard deviation)
17
Quantification of Error of Linear Regression

    Sr = Σ(i=1..n) ei² = Σ(i=1..n) (yi – a0 – a1xi)²

    Sy/x = √( Sr / (n – 2) )

Sy/x is called the standard error of the estimate.
Similar to the standard deviation, Sy/x quantifies the spread of the data points around the regression line.
The notation "y/x" designates that the error is for a predicted value of y corresponding to a particular value of x.
18
(a) Spread of the data around the mean of the dependent variable.
(b) Spread of the data around the best-fit line.
Linear regression with (a) small and (b) large residual errors.
19
"Goodness" of our fit• Let St be the sum of the squares around the
mean for the dependent variable, y
• Let Sr be the sum of the squares of residuals around the regression line
• St - Sr quantifies the improvement or error reduction due to describing data in terms of a straight line rather than as an average value.
n
iit yyS
1
2)(
20
"Goodness" of our fit
• For a perfect fit
Sr=0 and r=r2=1, signifying that the line explains 100 percent of the variability of the data.
• For r=r2=0, Sr=St, the fit represents no improvement.
• e.g.: r2=0.868 means 86.8% of the original uncertainty has been "explained" by the linear model.
t
rt
S
SSr
2
tcoefficienncorrelatio:
iondeterminatoftcoefficien:2
r
r
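These quantities can be computed for the three-point example used later in these slides (points (3,4), (5,1), (6,4), best line y = 4 – (3/14)x):

```python
# Goodness-of-fit quantities St, Sr, Sy/x, and r^2, following the
# definitions on the slides.
xs, ys = [3.0, 5.0, 6.0], [4.0, 1.0, 4.0]
a0, a1 = 4.0, -3.0 / 14.0             # best-fit line for these points

n = len(xs)
y_bar = sum(ys) / n
S_t = sum((y - y_bar) ** 2 for y in ys)                     # spread about the mean
S_r = sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))   # spread about the line
s_yx = (S_r / (n - 2)) ** 0.5          # standard error of the estimate
r2 = (S_t - S_r) / S_t                 # coefficient of determination
```

For this data set r² comes out to only about 0.036, reflecting that these three points are barely linear at all.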
21
Polynomial Regression
Objective
• Given n points

    (x1, y1), (x2, y2), …, (xn, yn)

• Want to find a polynomial of degree m

    y = a0 + a1x + a2x² + … + amx^m

  that best fits the points.
• The error or residual at each given point can be expressed as

    ei = yi – a0 – a1xi – a2xi² – … – amxi^m
22
Least-Squares Fit of a Polynomial

    Sr = Σ(i=1..n) ei² = Σ(i=1..n) (yi – a0 – a1xi – a2xi² – … – amxi^m)²

The procedure for finding a0, a1, …, am that minimizes the sum of squares of the residuals is the same as that used in linear least-squares regression.

Setting ∂Sr/∂aj = 0 for j = 0, 1, …, m yields

    Σ(i=1..n) xi^j (yi – a0 – a1xi – a2xi² – … – amxi^m) = 0
      ⇒ Σ(i=1..n) xi^j (a0 + a1xi + a2xi² + … + amxi^m) = Σ(i=1..n) xi^j yi
23
Least-Squares Fit of a Polynomial
To find a0, a1, …, am that minimize Sr, we can solve this system of linear equations:

    [ n       Σxi       Σxi²      ⋯  Σxi^m      ] [ a0 ]   [ Σyi      ]
    [ Σxi     Σxi²      Σxi³      ⋯  Σxi^(m+1)  ] [ a1 ]   [ Σxiyi    ]
    [ Σxi²    Σxi³      Σxi⁴      ⋯  Σxi^(m+2)  ] [ a2 ] = [ Σxi²yi   ]
    [  ⋮        ⋮         ⋮            ⋮        ] [ ⋮  ]   [  ⋮       ]
    [ Σxi^m   Σxi^(m+1) Σxi^(m+2) ⋯  Σxi^(2m)   ] [ am ]   [ Σxi^m yi ]

The standard error of the estimate becomes

    Sy/x = √( Sr / (n – (m+1)) )
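The normal-equation system above can be formed and solved directly. A minimal NumPy sketch (the sample data are illustrative, chosen to lie exactly on y = 1 + x² so a degree-2 fit recovers the coefficients):

```python
import numpy as np

# Least-squares polynomial of degree m by forming and solving the
# normal equations (Z^T Z) a = Z^T y, where Z is the Vandermonde matrix.
def poly_fit(xs, ys, m):
    xs = np.asarray(xs, float)
    ys = np.asarray(ys, float)
    Z = np.vander(xs, m + 1, increasing=True)   # columns 1, x, x^2, ..., x^m
    A = Z.T @ Z                                 # (m+1) x (m+1) normal matrix
    b = Z.T @ ys
    return np.linalg.solve(A, b)                # coefficients a0, ..., am

# Illustrative data on y = 1 + x^2:
coeffs = poly_fit([0, 1, 2, 3], [1, 2, 5, 10], 2)   # ~ [1, 0, 1]
```

Forming ZᵀZ explicitly squares the condition number of the problem, which is why the later slides also list QR and SVD as alternatives.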
24
Multiple Linear Regression
• In linear regression, y is a function of one variable.
• In multiple linear regression, y is a linear function of multiple variables.
• Want to find the best-fitting linear equation

    y = a0 + a1x1 + a2x2 + … + amxm

• Same procedure to find a0, a1, a2, …, am that minimize the sum of squared residuals.
• The standard error of the estimate is

    Sy/x = √( Sr / (n – (m+1)) )
25
General Linear Least Square
• All of simple linear, polynomial, and multiple linear regressions belong to the following general linear least-squares model:

    y = a0z0 + a1z1 + a2z2 + … + amzm + e

  where the zi's are different functions of the x's (they can be any kind of functions).
• It is called "linear" because the dependent variable, y, is a linear function of the ai's.
26
How Other Regressions Fit Into Linear Least Square Model
• Polynomial:

    y = a0(1) + a1(x) + a2(x²) + … + am(x^m) + e
    i.e., z0 = 1, z1 = x, z2 = x², …, zm = x^m

• Multiple linear:

    y = a0(1) + a1(x1) + a2(x2) + … + am(xm) + e
    i.e., z0 = 1, z1 = x1, z2 = x2, …, zm = xm
• Others:

    y = a0 sin(x1) + a1 ln(x1) + a2(x2x3) + a3 cos(x1x2) + e
    i.e., z0 = sin x1, z1 = ln x1, z2 = x2x3, z3 = cos(x1x2)
27
General Linear Least Square
• Given n points, we have

    yj = a0z0j + a1z1j + a2z2j + … + amzmj + ej,   j = 1, …, n

  where zij represents the value of the i-th function zi at the j-th point.
• We can express the above equations in matrix form as

    y = Za + e

  where

        [ z01  z11  ⋯  zm1 ]        [ y1 ]        [ a0 ]        [ e1 ]
    Z = [ z02  z12  ⋯  zm2 ]    y = [ y2 ]    a = [ a1 ]    e = [ e2 ]
        [  ⋮    ⋮        ⋮  ]        [ ⋮  ]        [ ⋮  ]        [ ⋮  ]
        [ z0n  z1n  ⋯  zmn ]        [ yn ]        [ am ]        [ en ]
28
General Linear Least Square
The sum of the squares of the residuals can be calculated as

    Sr = Σ(i=1..n) ( yi – Σ(j=0..m) aj zji )²

To minimize Sr, we can set the partial derivatives of Sr to zeroes and solve the resulting normal equations.
The normal equations can be expressed concisely as

    ZᵀZ a = Zᵀy

How should we solve this system?
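The general model can be sketched in code: build Z column by column from arbitrary basis functions, then solve the normal equations. The basis {1, x, sin x} and the data below are illustrative choices, generated from known coefficients so the fit recovers them:

```python
import numpy as np

# General linear least squares: y ~ Z a, with Z built from arbitrary
# basis functions z_j(x). Solves the normal equations Z^T Z a = Z^T y.
def general_lls(basis, xs, ys):
    Z = np.column_stack([[f(x) for x in xs] for f in basis])
    return np.linalg.solve(Z.T @ Z, Z.T @ np.asarray(ys, float))

# Illustrative basis and noise-free data from y = 2 + 0.5 x - sin(x):
basis = [lambda x: 1.0, lambda x: x, np.sin]
xs = np.linspace(0.0, 3.0, 20)
ys = 2.0 + 0.5 * xs - np.sin(xs)
a = general_lls(basis, xs, ys)           # ~ [2.0, 0.5, -1.0]
```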
29
Example
• Find the straight line that best fits the data in the least-square sense:

    X 3 5 6
    Y 4 1 4

• A straight line can be expressed in the form y = a0 + a1x. That is, z0 = 1, z1 = x.
• Thus we can construct Z as

        [ 1  3 ]
    Z = [ 1  5 ]
        [ 1  6 ]
30
Example
Our objective is to solve ZᵀZa = Zᵀy, or a = (ZᵀZ)⁻¹Zᵀy:

    [ 1 1 1 ] [ 1  3 ] [ a0 ]   [ 1 1 1 ] [ 4 ]
    [ 3 5 6 ] [ 1  5 ] [ a1 ] = [ 3 5 6 ] [ 1 ]
              [ 1  6 ]                    [ 4 ]

    [  3  14 ] [ a0 ]   [  9 ]
    [ 14  70 ] [ a1 ] = [ 41 ]

    [ a0 ]   [   4   ]      [ a0 ]   [    4    ]
    [ a1 ] = [ –3/14 ]  or  [ a1 ] = [ –0.2143 ]

The best line is y = 4 – 0.2143x.
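The hand computation can be cross-checked with NumPy's built-in least-squares solver, which minimizes ‖Za – y‖₂ directly (via SVD) instead of forming ZᵀZ:

```python
import numpy as np

# Checking the worked example: fit y = a0 + a1*x to (3,4), (5,1), (6,4).
Z = np.array([[1.0, 3.0],
              [1.0, 5.0],
              [1.0, 6.0]])
y = np.array([4.0, 1.0, 4.0])

a, res, rank, sv = np.linalg.lstsq(Z, y, rcond=None)
# a ~ [4.0, -0.2143], i.e. the line y = 4 - (3/14) x
```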
31
Solving ZᵀZa = Zᵀy
Note: Z is an n by (m+1) matrix.
• Gaussian elimination or LU decomposition
  – Less efficient
• Cholesky decomposition
  – Decompose ZᵀZ into RᵀR, where R is an upper triangular matrix.
  – Solve ZᵀZa = Zᵀy as RᵀRa = Zᵀy
• QR decomposition
• Singular value decomposition
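The Cholesky route can be sketched with NumPy, reusing the data from the example slide. (For brevity, the two triangular systems are solved with the general `np.linalg.solve`; a dedicated triangular solver would be more efficient.)

```python
import numpy as np

# Solving the normal equations Z^T Z a = Z^T y by Cholesky decomposition:
# Z^T Z = R^T R with R upper triangular, then two triangular solves.
Z = np.array([[1.0, 3.0],
              [1.0, 5.0],
              [1.0, 6.0]])
y = np.array([4.0, 1.0, 4.0])

R = np.linalg.cholesky(Z.T @ Z).T    # numpy returns lower L; R = L^T is upper
w = np.linalg.solve(R.T, Z.T @ y)    # forward solve:  R^T w = Z^T y
a = np.linalg.solve(R, w)            # backward solve: R a = w
# a ~ [4.0, -0.2143], matching the worked example
```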
32
Solving ZᵀZa = Zᵀy (Cholesky decomposition) **
• Given an n×m matrix Z, suppose we have computed the m×m factor R from ZᵀZ using Cholesky decomposition.
• If we add an additional column to Z, then the new R will be in the form

            [            r1,m+1   ]
    Rnew =  [    R         ⋮      ]
            [ 0  ⋯  0   rm+1,m+1  ]

  i.e., we only need to compute the (m+1)-th column of R.
• Suitable for testing how much improvement in terms of least-square fit a polynomial of one degree higher can provide.
33
Linearization of Nonlinear Relationships
• Some non-linear relationships can be transformed so that, in the transformed space, the data exhibit a linear relationship.
• For example:

    Exponential equation:             y = a1·e^(b1x)     ⇒  ln y = ln a1 + b1x
    Power equation:                   y = a2·x^(b2)      ⇒  log y = log a2 + b2·log x
    Saturation-growth-rate equation:  y = a3·x/(b3 + x)  ⇒  1/y = 1/a3 + (b3/a3)·(1/x)
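The exponential case can be sketched in code: take logs, fit a straight line, then transform the intercept back. The data are illustrative, generated from a1 = 2, b1 = 0.5 so the fit recovers them exactly:

```python
import math

# Linearizing the exponential model y = a1 * exp(b1 * x):
# taking logs gives ln y = ln a1 + b1 * x, a straight line in (x, ln y).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]   # illustrative noise-free data

ly = [math.log(y) for y in ys]               # transformed ordinates

# Ordinary linear least squares in the transformed space:
n = len(xs)
sx, sy = sum(xs), sum(ly)
sxx = sum(x * x for x in xs)
sxy = sum(x * t for x, t in zip(xs, ly))
b1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope is b1
a1 = math.exp(sy / n - b1 * sx / n)              # intercept is ln a1
# recovers a1 ~ 2.0, b1 ~ 0.5
```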
34
Fig 17.9
35
Example
Find the saturation-growth-rate equation

    y = a1·x / (b1 + x)

that best fits the data in the least-square sense:

    X 1 2 3
    Y 4 1 4

Solution:
Step 1: Linearize the curve as

    1/y = (b1/a1)(1/x) + 1/a1

i.e., y' = c1x' + c2, where y' = 1/y, x' = 1/x, c1 = b1/a1, c2 = 1/a1.
11
36
Example
sense). squareleast in optimalnot is(It data thefitsthat
curve good"accetably " an is )4866.0/(4055.1 Thus xxy
X 1 2 3
Y 4 1 4
X' = 1/X 1 1/2 1/3
Y' = 1/Y 1/4 1 1/4
Step 2: Transform data from original space to "linearized space".
Step 3: Perform linear least square fit for y' = c1x' + c2
4866.0/,4055.1/1
7115.0
3462.0 yields Solving
4/1
1
4/1
,
13/1
12/1
11
have data we theFrom
1111112
2
1
2
1T
babcaac
c
c
c
c T yZZZ
yZ
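Steps 2 and 3 can be reproduced with a short script, including the back-transformation of c1, c2 into a1, b1:

```python
# Saturation-growth-rate fit via linearization:
# fit y' = c1*x' + c2 with x' = 1/x, y' = 1/y, then a1 = 1/c2, b1 = c1/c2.
xs, ys = [1.0, 2.0, 3.0], [4.0, 1.0, 4.0]
xp = [1.0 / x for x in xs]               # x' = 1/x
yp = [1.0 / y for y in ys]               # y' = 1/y

n = len(xp)
sx, sy = sum(xp), sum(yp)
sxx = sum(x * x for x in xp)
sxy = sum(x * y for x, y in zip(xp, yp))
c1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # ~ -0.3462
c2 = sy / n - c1 * sx / n                        # ~  0.7115
a1, b1 = 1.0 / c2, c1 / c2                       # ~ 1.4055, -0.4866
```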
37
Linearization of Nonlinear Relationships
• The best least-square fit in the transformed space ≠ the best least-square fit in the original space.
  – For many applications, however, the parameters obtained from performing a least-square fit in the transformed space are acceptable.
• Linearization of nonlinear relationships:
  – Sub-optimal result
  – Easy to compute
38
Non-Linear Regression **
• The relationship among the parameters, the ai's, is non-linear and cannot be linearized using a direct method.
• For example,

    y = a0 (1 – e^(–a1x))

• Objective: Find a0 and a1 that minimize

    Σ(i=1..n) ei² = Σ(i=1..n) ( yi – a0 (1 – e^(–a1xi)) )²

• Possible approaches to find the solution:
  – Apply minimization of a non-linear function
  – Set the partial derivatives to zero and solve the non-linear equations
  – Gauss-Newton method
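The Gauss-Newton approach listed above can be sketched for this model. The data are synthetic, generated from known parameters a0 = 2, a1 = 0.8 (and the starting guess is an assumption), so the iteration should recover them:

```python
import numpy as np

# A minimal Gauss-Newton sketch for the model y = a0 * (1 - exp(-a1 * x)).
# Each step linearizes the model and solves the normal equations
# (J^T J) da = J^T r for the parameter update.
x = np.linspace(0.5, 5.0, 10)
y = 2.0 * (1.0 - np.exp(-0.8 * x))       # synthetic noise-free data

a = np.array([1.0, 1.0])                 # initial guess [a0, a1]
for _ in range(50):
    f = a[0] * (1.0 - np.exp(-a[1] * x))
    r = y - f                            # residuals
    # Jacobian of f with respect to [a0, a1]:
    J = np.column_stack([1.0 - np.exp(-a[1] * x),
                         a[0] * x * np.exp(-a[1] * x)])
    a = a + np.linalg.solve(J.T @ J, J.T @ r)
# a converges to ~ [2.0, 0.8]
```

No damping or line search is used here, so this plain iteration can diverge on harder problems; production code would add safeguards (e.g. Levenberg-Marquardt).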
39
Other Notes
• When performing a least-square fit:
  – The order of the data in the table is not important.
  – The order in which you arrange the basis functions is not important.
  – e.g., a least-square fit of y = a0 + a1x or y = b0x + b1 to

        X 3 5 6        X 6 3 5        X 5 6 3
        Y 4 1 4   or   Y 4 4 1   or   Y 1 4 4

    would yield the same straight line.