General Linear Squares and Nonlinear Regression
TRANSCRIPT
y = -20.5717 + 3.6005x
Error Sr = 4201.3
Correlation r = 0.4434
x = [-2.5 3.0 1.7 -4.9 0.6 -0.5 4.0 -2.2 -4.3 -0.2];
y = [-20.1 -21.8 -6.0 -65.4 0.2 0.6 -41.3 -15.4 -56.1 0.5];
Large error, poor correlation
Preferable to fit a parabola
Polynomial Regression
Quadratic (second-order) least squares: y = f(x) = a0 + a1x + a2x^2
Minimize total square error
S_r(a_0, a_1, a_2) = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2

Setting each partial derivative to zero:

\frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0

\frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} x_i \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0

\frac{\partial S_r}{\partial a_2} = -2 \sum_{i=1}^{n} x_i^2 \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right) = 0
Quadratic Least Squares
Use Cholesky decomposition to solve the symmetric system, or use the MATLAB backslash operator: z = A\r
The normal equations:

\begin{bmatrix}
n & \sum x_i & \sum x_i^2 \\
\sum x_i & \sum x_i^2 & \sum x_i^3 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}
\qquad \text{(all sums over } i = 1, \ldots, n\text{)}
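The routine Quadratic_LS used below is not listed in the transcript; a minimal sketch consistent with its reported output, assuming this interface, is:

function z = Quadratic_LS(x, y)
% Least-squares fit of y = a0 + a1*x + a2*x^2 (sketch, not the course listing)
x = x(:); y = y(:); n = length(x);
A = [n          sum(x)     sum(x.^2)
     sum(x)     sum(x.^2)  sum(x.^3)
     sum(x.^2)  sum(x.^3)  sum(x.^4)];   % symmetric normal-equation matrix
r = [sum(y); sum(x.*y); sum(x.^2.*y)];
z = A \ r;                               % z = [a0; a1; a2]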
Standard Error for 2nd Polynomial Regression
n observations, 2nd-order polynomial (3 coefficients)
Start off with n degrees of freedom; an mth-order polynomial uses up m+1 of them
s_{y/x} = \sqrt{\frac{S_r}{n-3}}
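For the quadratic fit below (n = 10), this reproduces the reported standard error:

» Sr = 25.6043; n = 10;     % values from the Quadratic_LS session below
» Syx = sqrt(Sr/(n-3))      % returns 1.9125, matching the output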
Exercise: Example 15.1 in the Chinese textbook
Fit a second-order polynomial to the data in Table 15.1 and find the fitted function.
» [x,y]=example2;
» z=Quadratic_LS(x,y)
      x          y      (a0+a1*x+a2*x^2)  (y-a0-a1*x-a2*x^2)
  -2.5000   -20.1000   -18.5529   -1.5471
   3.0000   -21.8000   -22.0814    0.2814
   1.7000    -6.0000    -6.3791    0.3791
  -4.9000   -65.4000   -68.6439    3.2439
   0.6000     0.2000    -0.2816    0.4816
  -0.5000     0.6000    -0.7740    1.3740
   4.0000   -41.3000   -40.4233   -0.8767
  -2.2000   -15.4000   -14.4973   -0.9027
  -4.3000   -56.1000   -53.1802   -2.9198
  -0.2000     0.5000     0.0138    0.4862
err = 25.6043
Syx = 1.9125
r = 0.9975
z = 0.2668 0.7200 -2.7231
y = 0.2668 + 0.7200x - 2.7231x^2
Correlation coefficient r
Standard error of the estimate
function [x,y] = example2
x = [ -2.5 3.0 1.7 -4.9 0.6 -0.5 4.0 -2.2 -4.3 -0.2];
y = [-20.1 -21.8 -6.0 -65.4 0.2 0.6 -41.3 -15.4 -56.1 0.5];
Quadratic least squares:
y = 0.2668 + 0.7200x - 2.7231x^2
Error Sr = 25.6043
Correlation r = 0.9975
Cubic Least Squares
f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3

S_r = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 - a_3 x_i^3 \right)^2

The normal equations:

\begin{bmatrix}
n & \sum x_i & \sum x_i^2 & \sum x_i^3 \\
\sum x_i & \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4 & \sum x_i^5 \\
\sum x_i^3 & \sum x_i^4 & \sum x_i^5 & \sum x_i^6
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \\ \sum x_i^3 y_i \end{bmatrix}
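As a quick cross-check (not in the original slides), MATLAB's built-in polyfit solves the same problem, returning the coefficients in descending order of power:

» [x,y]=example2;
» p = polyfit(x,y,3)   % p = [a3 a2 a1 a0] = [-0.0608 -2.8078 1.5946 0.6513]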
» [x,y]=example2;
» z=Cubic_LS(x,y)
x y p(x)=a0+a1*x+a2*x^2+a3*x^3 y-p(x)
-2.5000 -20.1000 -19.9347 -0.1653
3.0000 -21.8000 -21.4751 -0.3249
1.7000 -6.0000 -5.0508 -0.9492
-4.9000 -65.4000 -67.4300 2.0300
0.6000 0.2000 0.5842 -0.3842
-0.5000 0.6000 -0.8404 1.4404
4.0000 -41.3000 -41.7828 0.4828
-2.2000 -15.4000 -15.7997 0.3997
-4.3000 -56.1000 -53.2914 -2.8086
-0.2000 0.5000 0.2206 0.2794
err =
15.7361
Syx =
1.6195
r =
0.9985
z =
0.6513 1.5946 -2.8078 -0.0608
y = 0.6513 + 1.5946x - 2.8078x^2 - 0.0608x^3
Correlation coefficient r = 0.9985
» [x,y]=example2;
» z1=Linear_LS(x,y); z1
z1 =
-20.5717 3.6005
» z2=Quadratic_LS(x,y); z2
z2 =
0.2668 0.7200 -2.7231
» z3=Cubic_LS(x,y); z3
z3 =
0.6513 1.5946 -2.8078 -0.0608
» x1=min(x); x2=max(x); xx=x1:(x2-x1)/100:x2;
» yy1=z1(1)+z1(2)*xx;
» yy2=z2(1)+z2(2)*xx+z2(3)*xx.^2;
» yy3=z3(1)+z3(2)*xx+z3(3)*xx.^2+z3(4)*xx.^3;
» H=plot(x,y,'r*',xx,yy1,'g',xx,yy2,'b',xx,yy3,'m');
» xlabel('x'); ylabel('y');
» set(H,'LineWidth',3,'MarkerSize',12);
» print -djpeg075 regres4.jpg
Figure regres4.jpg: the data points with the linear, quadratic, and cubic least-squares fits.
Linear least squares: y = -20.5717 + 3.6005x
Quadratic: y = 0.2668 + 0.7200x - 2.7231x^2
Cubic: y = 0.6513 + 1.5946x - 2.8078x^2 - 0.0608x^3
Standard Error for Polynomial Regression
n observations, mth-order polynomial
Start off with n degrees of freedom; an mth-order polynomial uses up m+1 of them
s_{y/x} = \sqrt{\frac{S_r}{n - (m+1)}}
Multiple Linear Regression
Dependence on more than one variable
Example: dependence of runoff volume on soil type and land cover
Example: dependence of aerodynamic drag on automobile shape and speed
y = a_0 + a_1 x_1 + a_2 x_2 + e

e_i = y_i - (a_0 + a_1 x_{1,i} + a_2 x_{2,i})
Multiple Linear Regression
With two independent variables, we get a surface: find the best-fit "plane" through the data.
Multiple Linear Regression
Much like polynomial regression: minimize the sum of squared residuals

S_r(a_0, a_1, a_2) = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} \right)^2

\frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} \right) = 0

\frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} x_{1,i} \left( y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} \right) = 0

\frac{\partial S_r}{\partial a_2} = -2 \sum_{i=1}^{n} x_{2,i} \left( y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} \right) = 0
Multiple Linear Regression
Rearrange the equations
Very similar to polynomial regression
For a quadratic polynomial (shown for comparison):

\begin{bmatrix}
n & \sum x_i & \sum x_i^2 \\
\sum x_i & \sum x_i^2 & \sum x_i^3 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}

For two independent variables:

\begin{bmatrix}
n & \sum x_{1,i} & \sum x_{2,i} \\
\sum x_{1,i} & \sum x_{1,i}^2 & \sum x_{1,i} x_{2,i} \\
\sum x_{2,i} & \sum x_{1,i} x_{2,i} & \sum x_{2,i}^2
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_{1,i} y_i \\ \sum x_{2,i} y_i \end{bmatrix}
Multiple Linear Regression
Solve by any matrix method; Cholesky decomposition is appropriate since the matrix is symmetric and positive definite.
Very useful for fitting the power equation:
y = a_0 x_1^{a_1} x_2^{a_2} \cdots x_m^{a_m}

\log y = \log a_0 + a_1 \log x_1 + a_2 \log x_2 + \cdots + a_m \log x_m
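A minimal sketch of this log-transform fit for two variables (the variable names are illustrative, not from the course listings; it assumes all data values are positive):

% Fit y = a0 * x1.^a1 .* x2.^a2 by linear least squares in log space
X = [ones(length(x1),1) log(x1(:)) log(x2(:))];  % design matrix
c = X \ log(y(:));                               % c = [log(a0); a1; a2]
a0 = exp(c(1)); a1 = c(2); a2 = c(3);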
Exercise: Example 15.2 in the Chinese textbook
Given the data f(0,0) = 5, f(2,1) = 10, f(2.5,2) = 9, f(1,3) = 0, f(4,6) = 3, f(7,2) = 27, what is the form of the function f()?
Multiple Linear Regression
Example: the strength of concrete depends on cure time and the water/cement ratio (water content W/C)
Cure time (days)   W/C    Strength (psi)
 2                 0.42   2770
 4                 0.55   2639
 5                 0.70   2519
16                 0.53   3450
 3                 0.61   2315
 7                 0.67   2545
 8                 0.55   2613
27                 0.66   3694
14                 0.42   3414
20                 0.58   3634
» x1=[2 4 5 16 3 7 8 27 14 20];
» x2=[0.42 0.55 0.7 0.53 0.61 0.67 0.55 0.66 0.42 0.58];
» y=[2770 2639 2519 3450 2315 2545 2613 3694 3414 3634];
» H=plot3(x1,x2,y,'ro'); grid on; set(H,'LineWidth',5);
» H1=xlabel('Cure Time (days)'); set(H1,'FontSize',12)
» H2=ylabel('Water Content'); set(H2,'FontSize',12)
» H3=zlabel('Strength (psi)'); set(H3,'FontSize',12)
Hand Calculations
x1 (days)  x2 (W/C)  y (psi)   x1*x2   x1^2   x2^2      x1*y       x2*y
 2         0.42      2770       0.84      4   0.1764    5540.35    1163.47
 4         0.55      2639       2.2      16   0.3025   10557.82    1451.7
 5         0.70      2519       3.5      25   0.49     12592.77    1762.98
16         0.53      3450       8.48    256   0.2809   55195.7     1828.35
 3         0.61      2315       1.83      9   0.3721    6944.629   1412.075
 7         0.67      2545       4.69     49   0.4489   17815.74    1705.22
 8         0.55      2613       4.4      64   0.3025   20900.17    1436.886
27         0.66      3694      17.82    729   0.4356   99729.22    2437.825
14         0.42      3414       5.88    196   0.1764   47793.4     1433.80
20         0.58      3634      11.6     400   0.3364   72683.15    2107.811
Sum:
106        5.69     29592      61.24   1748   3.3217  349752.9    16740.14
Substituting the sums into the normal equations:

\begin{bmatrix}
10 & 106 & 5.69 \\
106 & 1748 & 61.24 \\
5.69 & 61.24 & 3.3217
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} 29592 \\ 349753 \\ 16740 \end{bmatrix}
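These sums can be assembled and solved directly in MATLAB (a sketch using the vectors defined above):

» n=10;
» A=[n sum(x1) sum(x2); sum(x1) sum(x1.^2) sum(x1.*x2); sum(x2) sum(x1.*x2) sum(x2.^2)];
» b=[sum(y); sum(x1.*y); sum(x2.*y)];
» a=A\b     % a = [a0; a1; a2], approximately [3358; 60.5; -1827.8]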
Multiple Linear Regression: concrete example
Solve by Cholesky decomposition
Forward and back substitutions
\begin{bmatrix}
10 & 106 & 5.69 \\
106 & 1748 & 61.24 \\
5.69 & 61.24 & 3.3217
\end{bmatrix}
=
\begin{bmatrix}
3.16 & 0 & 0 \\
33.52 & 24.99 & 0 \\
1.80 & 0.04 & 0.29
\end{bmatrix}
\begin{bmatrix}
3.16 & 33.52 & 1.80 \\
0 & 24.99 & 0.04 \\
0 & 0 & 0.29
\end{bmatrix}

Forward and back substitution give

\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} 3358 \\ 60 \\ -1827 \end{bmatrix}

strength (psi) = 3358 + 60 (cure time, days) - 1827 (W/C)
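The same factorization in MATLAB (chol returns the upper-triangular factor R with A = R'*R):

» A=[10 106 5.69; 106 1748 61.24; 5.69 61.24 3.3217];
» b=[29592; 349752.9; 16740.14];
» R=chol(A);    % A = R'*R
» z=R'\b;       % forward substitution
» a=R\z         % back substitution gives [a0; a1; a2]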
» [x1,x2,y]=concrete;
» z=Multi_Linear(x1,x2,y)
   x1     x2      y    (a0+a1*x1+a2*x2)  (y-a0-a1*x1-a2*x2)
    2    0.42   2770    2711.3     58.652
    4    0.55   2639    2594.7     44.267
    5    0.7    2519    2381.1    137.94
   16    0.53   3450    3357.3     92.72
    3    0.61   2315    2424.6   -109.57
    7    0.67   2545    2556.9    -11.895
    8    0.55   2613    2836.7   -223.73
   27    0.66   3694    3785.2    -91.158
   14    0.42   3414    3437.3    -23.339
   20    0.58   3634    3507.9    126.11
Syx = 130.92
r = 0.97553
z = 3358 60.499 -1827.8
function [x1,x2,y] = concrete
x1=[2 4 5 16 3 7 8 27 14 20];
x2=[0.42 0.55 0.7 0.53 0.61 0.67 0.55 0.66 0.42 0.58];
y=[2770 2639 2519 3450 2315 2545 2613 3694 3414 3634];
strength (psi) = 3358 + 60.499 (cure time, days) - 1827.8 (W/C)
Correlation coefficient r = 0.97553
z = (a0, a1, a2)
Multiple Linear Regression
strength (psi) = 3358 + 60.499 (cure time, days) - 1827.8 (W/C)

» xx=0:0.02:1; yy=0:0.02:1; [x,y]=meshgrid(xx,yy);
» z=2*x+3*y+2;
» surfc(x,y,z); grid on
» axis([0 1 0 1 0 7])
» xlabel('x1'); ylabel('x2'); zlabel('y')
General Linear Least Squares
Simple linear, polynomial, and multiple linear regressions are special cases of the general linear least squares model
The general model:

y = a_0 z_0 + a_1 z_1 + a_2 z_2 + \cdots + a_m z_m + e

Examples:

y = a_0 x^2 + a_1 \sin x

y = a_0 + a_1 \cos(t) + a_2 \sin(t)

Linear in the a_i, but the z_i may be highly nonlinear functions.
General Linear Least Squares
General equation in matrix form
\{y\} = [Z]\{a\} + \{e\}

where the dependent variables are \{y\}^T = [y_1\; y_2\; \cdots\; y_n], the regression coefficients are \{a\}^T = [a_0\; a_1\; \cdots\; a_m], the residuals are \{e\}^T = [e_1\; e_2\; \cdots\; e_n], and

[Z] = \begin{bmatrix}
z_{01} & z_{11} & \cdots & z_{m1} \\
z_{02} & z_{12} & \cdots & z_{m2} \\
\vdots & \vdots & & \vdots \\
z_{0n} & z_{1n} & \cdots & z_{mn}
\end{bmatrix}
General Linear Least Squares
Take partial derivatives to minimize the squared error

S_r = \sum_{i=1}^{n} \left( y_i - \sum_{j=0}^{m} a_j z_{ji} \right)^2

This leads to the normal equations:

[Z]^T [Z] \{a\} = [Z]^T \{y\}

Solve this for {a} using Cholesky decomposition, LU decomposition, or the matrix inverse.
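For instance, the model y = a0 + a1 cos(t) + a2 sin(t) above can be fit this way (a sketch; t and y are assumed to be data vectors):

» Z = [ones(size(t(:))) cos(t(:)) sin(t(:))];   % columns: z0 = 1, z1 = cos t, z2 = sin t
» a = (Z'*Z) \ (Z'*y(:))                        % normal equations give a = [a0; a1; a2]
» % equivalently, a = Z\y(:) solves the same least-squares problem more stably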
Exercise: Example 15.3 in the Chinese textbook
Repeat Example 15.1, but use the matrix formulation introduced above. What is the Z matrix?
Nonlinear Regression
The defining equation depends nonlinearly on its own coefficients; i.e., it cannot be written in a "linear" form. Example:

y = a_0 (1 - e^{-a_1 x}) + e

Use a Taylor series expansion to linearize the original equation (the Gauss-Newton method). Here f is a nonlinear function of the coefficients a_0, a_1, \ldots, a_m and of x, and (x_i, y_i) is one of a set of n observations:

y_i = f(x_i; a_0, a_1, \ldots, a_m) + e_i

For the example above, minimize

f(a_0, a_1) = \sum_{i=1}^{n} \left[ y_i - a_0 (1 - e^{-a_1 x_i}) \right]^2
Exercise: Example 15.5 in the Chinese textbook
Use the fminsearch command to perform nonlinear regression on the data at right, minimizing

f(a_0, a_1) = \sum_{i=1}^{n} \left[ y_i - a_0 x_i^{a_1} \right]^2
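A minimal fminsearch sketch for an objective of this form (the data vectors here are placeholders, since the slide's table is not reproduced in the transcript):

» x = [1 2 3 4 5]; y = [0.5 1.7 3.4 5.7 8.4];   % placeholder data
» SSE = @(a) sum((y - a(1)*x.^a(2)).^2);        % objective for the model y = a0*x^a1
» a = fminsearch(SSE, [1 1])                    % start from a0 = 1, a1 = 1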
Nonlinear Regression
Use a Taylor series for f and truncate the higher-order terms:

y_i = f(x_i; a_0, a_1, \ldots, a_m) + e_i

f(x_i)_{j+1} \approx f(x_i)_j + \frac{\partial f(x_i)}{\partial a_0} \Delta a_0 + \frac{\partial f(x_i)}{\partial a_1} \Delta a_1 + \cdots + \frac{\partial f(x_i)}{\partial a_m} \Delta a_m

where j is the current guess, j+1 is the prediction (improved guess), and \Delta a_k = a_{k,j+1} - a_{k,j}.

Note: this is the Taylor series expansion of a function of several variables.
Nonlinear Regression
Plug the Taylor series into the original equation:

y_i = f(x_i)_j + \left. \frac{\partial f(x_i)}{\partial a_0} \right|_j \Delta a_0 + \left. \frac{\partial f(x_i)}{\partial a_1} \right|_j \Delta a_1 + \cdots + \left. \frac{\partial f(x_i)}{\partial a_m} \right|_j \Delta a_m + e_i

or

y_i - f(x_i)_j = \left. \frac{\partial f(x_i)}{\partial a_0} \right|_j \Delta a_0 + \cdots + \left. \frac{\partial f(x_i)}{\partial a_m} \right|_j \Delta a_m + e_i
Gauss-Newton Method
Given all n equations:

y_1 - f(x_1)_j = \left. \frac{\partial f(x_1)}{\partial a_0} \right|_j \Delta a_0 + \cdots + \left. \frac{\partial f(x_1)}{\partial a_m} \right|_j \Delta a_m + e_1

y_2 - f(x_2)_j = \left. \frac{\partial f(x_2)}{\partial a_0} \right|_j \Delta a_0 + \cdots + \left. \frac{\partial f(x_2)}{\partial a_m} \right|_j \Delta a_m + e_2

\vdots

y_n - f(x_n)_j = \left. \frac{\partial f(x_n)}{\partial a_0} \right|_j \Delta a_0 + \cdots + \left. \frac{\partial f(x_n)}{\partial a_m} \right|_j \Delta a_m + e_n

Set up the matrix equation:

\{D\} = [Z_j]\{\Delta A\} + \{E\}
Gauss-Newton Method
\{D\} = [Z_j]\{\Delta A\} + \{E\}

where

\{D\} = \begin{bmatrix} y_1 - f(x_1)_j \\ y_2 - f(x_2)_j \\ \vdots \\ y_n - f(x_n)_j \end{bmatrix}, \quad
\{\Delta A\} = \begin{bmatrix} \Delta a_0 \\ \Delta a_1 \\ \vdots \\ \Delta a_m \end{bmatrix}, \quad
\{E\} = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}

[Z_j] = \begin{bmatrix}
\partial f_1/\partial a_0 & \partial f_1/\partial a_1 & \cdots & \partial f_1/\partial a_m \\
\partial f_2/\partial a_0 & \partial f_2/\partial a_1 & \cdots & \partial f_2/\partial a_m \\
\vdots & \vdots & & \vdots \\
\partial f_n/\partial a_0 & \partial f_n/\partial a_1 & \cdots & \partial f_n/\partial a_m
\end{bmatrix}
\quad \text{(partials evaluated at iteration } j\text{)}
Gauss-Newton Method
Using the same least-squares approach, minimizing the sum of squares of the residuals {E}:

[Z_j]^T [Z_j] \{\Delta A\} = [Z_j]^T \{D\}

Get {ΔA} from

\{\Delta A\} = \left( [Z_j]^T [Z_j] \right)^{-1} [Z_j]^T \{D\}

Now update a_0, a_1, \ldots, a_m with {ΔA} and repeat the procedure until convergence is reached.
function [x,y] = mass_spring
x = [0.00 0.11 0.18 0.25 0.32 0.44 0.55 0.61 0.68 0.80 ...
0.92 1.01 1.12 1.22 1.35 1.45 1.60 1.67 1.76 1.83 2.00];
y = [1.03 0.78 0.62 0.22 0.05 -0.20 -0.45 -0.50 -0.45 -0.31 ...
-0.21 -0.11 0.04 0.12 0.22 0.23 0.18 0.10 0.07 -0.02 -0.10];
Model it with y = e^{-a_0 x} \cos(a_1 x)
Example: damped sinusoid
Gauss-Newton Method
For this model:

f(x) = e^{-a_0 x} \cos(a_1 x)

\frac{\partial f}{\partial a_0} = -x e^{-a_0 x} \cos(a_1 x)

\frac{\partial f}{\partial a_1} = -x e^{-a_0 x} \sin(a_1 x)

[Z_j] = \begin{bmatrix}
-x_1 e^{-a_0 x_1} \cos(a_1 x_1) & -x_1 e^{-a_0 x_1} \sin(a_1 x_1) \\
-x_2 e^{-a_0 x_2} \cos(a_1 x_2) & -x_2 e^{-a_0 x_2} \sin(a_1 x_2) \\
\vdots & \vdots \\
-x_n e^{-a_0 x_n} \cos(a_1 x_n) & -x_n e^{-a_0 x_n} \sin(a_1 x_n)
\end{bmatrix}

\{D\} = \begin{bmatrix}
y_1 - e^{-a_0 x_1} \cos(a_1 x_1) \\
y_2 - e^{-a_0 x_2} \cos(a_1 x_2) \\
\vdots \\
y_n - e^{-a_0 x_n} \cos(a_1 x_n)
\end{bmatrix}
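The routine gauss_newton called below is not listed in the transcript; a minimal sketch for this particular model, assuming the prompts shown in the session, is:

function a = gauss_newton(x, y)
% Gauss-Newton iteration for the model y = exp(-a0*x).*cos(a1*x)
% (a minimal sketch consistent with the session below, not the course's listing)
a = input('Enter the initial guesses [a0,a1] = ');
tol = input('Enter the tolerance tol = ');
itmax = input('Enter the maximum iteration number itmax = ');
n = length(x)
x = x(:); y = y(:);
for iter = 1:itmax
    f  = exp(-a(1)*x).*cos(a(2)*x);            % current model values
    Z  = [-x.*exp(-a(1)*x).*cos(a(2)*x), ...
          -x.*exp(-a(1)*x).*sin(a(2)*x)];      % Jacobian columns [df/da0 df/da1]
    da = (Z'*Z) \ (Z'*(y - f));                % normal equations for {dA}
    a  = a + da';                              % update the coefficients
    if norm(da) < tol
        disp('Gauss-Newton method has converged'); break
    end
end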
» [x,y]=mass_spring;
» a=gauss_newton(x,y)
Enter the initial guesses [a0,a1] = [2,3]
Enter the tolerance tol = 0.0001
Enter the maximum iteration number itmax = 50
n =
21
iter a0 a1 da0 da1
1.0000 2.1977 5.0646 0.1977 2.0646
2.0000 1.0264 3.9349 -1.1713 -1.1296
3.0000 1.1757 4.3656 0.1494 0.4307
4.0000 1.1009 4.4054 -0.0748 0.0398
5.0000 1.1035 4.3969 0.0026 -0.0085
6.0000 1.1030 4.3973 -0.0005 0.0003
7.0000 1.1030 4.3972 0.0000 0.0000
Gauss-Newton method has converged
a =
1.1030 4.3972
f(x) = e^{-1.1030x} \cos(4.3972x)
Choose initial a0 = 2, a1 = 3
21 data points
f(x) = e^{-1.1030x} \cos(4.3972x)
a0 = 1.1030, a1 = 4.3972