transformations

Post on 04-Jan-2016


DESCRIPTION

Transformations. Transformations to Linearity. Many non-linear curves can be put into a linear form by appropriate transformations of either the dependent variable Y or some (or all) of the independent variables X1, X2, ..., Xp. This leads to the wide utility of the linear model. - PowerPoint PPT Presentation

TRANSCRIPT

Transformations

Transformations to Linearity

• Many non-linear curves can be put into a linear form by appropriate transformations of either
– the dependent variable Y, or
– some (or all) of the independent variables X1, X2, ..., Xp.

• This leads to the wide utility of the linear model.

• We have seen that, through the use of dummy variables, categorical independent variables can be incorporated into a linear model.

• We will now see that, through the technique of variable transformation, many examples of non-linear behaviour can also be converted to linear behaviour.

Intrinsically Linear (Linearizable) Curves

1. Hyperbolas

y = x/(ax - b)

Linear form: 1/y = a - b(1/x) or Y = β0 + β1 X

Transformations: Y = 1/y, X = 1/x, β0 = a, β1 = -b

[Figure: two plots of y = x/(ax - b), one with positive curvature (b > 0) and one with negative curvature (b < 0); the values b/a and 1/a are marked on each plot.]
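The hyperbola transformation above can be checked numerically. The following is a minimal sketch in plain Python, assuming hypothetical noise-free data generated with a = 2 and b = 1; the helper `fit_line` is an illustrative name, not something from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for Y = b0 + b1*X; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

# Hypothetical data from y = x/(a*x - b) with a = 2, b = 1 (no noise).
a_true, b_true = 2.0, 1.0
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x / (a_true * x - b_true) for x in xs]

# Transformations from the slide: Y = 1/y, X = 1/x.
X = [1.0 / x for x in xs]
Y = [1.0 / y for y in ys]

b0, b1 = fit_line(X, Y)
a_hat, b_hat = b0, -b1  # beta0 = a, beta1 = -b
print(a_hat, b_hat)
```

With real, noisy data the recovered values would only approximate a and b, and the least-squares criterion applies on the 1/y scale rather than the original y scale.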

2. Exponential

y = α e^(βx) = α B^x

Linear form: ln y = ln α + β x = ln α + (ln B) x or Y = β0 + β1 X

Transformations: Y = ln y, X = x, β0 = ln α, β1 = β = ln B

[Figure: two plots of the exponential curve, one for B > 1 and one for B < 1, with the values a and aB marked on each.]
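As a sketch of the exponential case, the snippet below regresses ln y on x and back-transforms the coefficients. The data are hypothetical and noise-free (α = 3, β = 0.5), and `fit_line` is an illustrative helper rather than anything defined in the slides.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for Y = b0 + b1*X; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

# Hypothetical data from y = alpha * exp(beta * x).
alpha_true, beta_true = 3.0, 0.5
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [alpha_true * math.exp(beta_true * x) for x in xs]

# Transformations: Y = ln y, X = x.
b0, b1 = fit_line(xs, [math.log(y) for y in ys])

alpha_hat = math.exp(b0)  # beta0 = ln(alpha)
beta_hat = b1             # beta1 = beta
B_hat = math.exp(b1)      # beta1 = ln(B)
print(alpha_hat, beta_hat, B_hat)
```

The same fit gives either parameterisation: the slope is β directly, and exponentiating it gives the base B.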

3. Power Functions

y = a x^b

Linear form: ln y = ln a + b ln x or Y = β0 + β1 X

Transformations: Y = ln y, X = ln x, β0 = ln a, β1 = b

[Figure: power functions with b > 0 (curves for b > 1, b = 1 and 0 < b < 1) and power functions with b < 0 (curves for b < -1, b = -1 and -1 < b < 0).]

Logarithmic Functions

y = a + b ln x

Linear form: y = a + b ln x or Y = β0 + β1 X

Transformations: Y = y, X = ln x, β0 = a, β1 = b

[Figure: logarithmic curves for b > 0 and for b < 0.]

Other special functions

y = a e^(b/x)

Linear form: ln y = ln a + b(1/x) or Y = β0 + β1 X

Transformations: Y = ln y, X = 1/x, β0 = ln a, β1 = b

[Figure: curves of y = a e^(b/x) for b > 0 and for b < 0.]
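The power, logarithmic and a e^(b/x) curves all follow the same recipe: transform, fit a straight line, back-transform. A table-driven sketch of that recipe is given here; the names `CURVES`, `MODELS`, `fit_curve` and `fit_line` and the test values a = 2, b = -0.5 are assumptions made for illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for Y = b0 + b1*X; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

# For each curve: (transform of y, transform of x, map (b0, b1) -> (a, b)).
CURVES = {
    "power": (math.log, math.log, lambda b0, b1: (math.exp(b0), b1)),
    "logarithmic": (lambda y: y, math.log, lambda b0, b1: (b0, b1)),
    "exp_reciprocal": (math.log, lambda x: 1.0 / x,
                       lambda b0, b1: (math.exp(b0), b1)),
}

# The original (untransformed) curves, used only to generate test data.
MODELS = {
    "power": lambda x, a, b: a * x ** b,
    "logarithmic": lambda x, a, b: a + b * math.log(x),
    "exp_reciprocal": lambda x, a, b: a * math.exp(b / x),
}

def fit_curve(kind, xs, ys):
    """Fit one of the linearizable curves by regressing Y on X."""
    transform_y, transform_x, back = CURVES[kind]
    b0, b1 = fit_line([transform_x(x) for x in xs],
                      [transform_y(y) for y in ys])
    return back(b0, b1)

# Hypothetical noise-free data with a = 2, b = -0.5 for every curve.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
a_true, b_true = 2.0, -0.5
estimates = {}
for kind, model in MODELS.items():
    ys = [model(x, a_true, b_true) for x in xs]
    estimates[kind] = fit_curve(kind, xs, ys)
print(estimates)
```

Keeping the transformations in a table makes it easy to see that only the choice of Y, X and the back-transformation differs between curves.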

Polynomial Models

y = β0 + β1 x + β2 x^2 + β3 x^3

Linear form: Y = β0 + β1 X1 + β2 X2 + β3 X3

Variables: Y = y, X1 = x, X2 = x^2, X3 = x^3

[Figure: plot of an example polynomial model for x between 0 and 3.]
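Because the polynomial model is linear in β0, ..., β3, it can be fitted as a multiple regression on X1 = x, X2 = x^2, X3 = x^3. The sketch below does this with the normal equations in plain Python; the coefficients used to generate the data are hypothetical, and `solve` and `fit_linear_model` are illustrative helpers.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_linear_model(rows, ys):
    """Least squares via the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical cubic: y = 1 - 2x + 0.5x^2 + 0.25x^3 (no noise).
beta_true = [1.0, -2.0, 0.5, 0.25]
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [sum(b * x ** k for k, b in enumerate(beta_true)) for x in xs]

# Design matrix rows: [1, X1, X2, X3] = [1, x, x^2, x^3].
rows = [[1.0, x, x ** 2, x ** 3] for x in xs]
beta_hat = fit_linear_model(rows, ys)
print(beta_hat)
```

The normal equations are used here only to keep the sketch dependency-free; a numerical library's least-squares routine would usually be the more stable choice for higher-degree polynomials.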

Exponential Models with a polynomial exponent

y = e^(β0 + β1 x + … + β4 x^4)

Linear form: ln y = β0 + β1 X1 + β2 X2 + β3 X3 + β4 X4

Variables: Y = ln y, X1 = x, X2 = x^2, X3 = x^3, X4 = x^4

[Figure: two example plots of exponential models with a polynomial exponent.]

Trigonometric Polynomials

Y = β0 + γ1 sin(2πλ1 x) + δ1 cos(2πλ1 x) + … + γk sin(2πλk x) + δk cos(2πλk x)

• β0, γ1, δ1, …, γk, δk are parameters that have to be estimated,

• λ1, λ2, λ3, …, λk are known constants (the frequencies in the trig polynomial).

Note:

Y = β0 + γ1 sin(2πλ1 x) + δ1 cos(2πλ1 x) + … + γk sin(2πλk x) + δk cos(2πλk x)
  = β0 + γ1 S1 + δ1 C1 + … + γk Sk + δk Ck

where Sk = sin(2πλk x) and Ck = cos(2πλk x)

Trigonometric Polynomial Models

y = β0 + δ1 cos(2πλ1 x) + γ1 sin(2πλ1 x) + … + δk cos(2πλk x) + γk sin(2πλk x)

Linear form: Y = β0 + δ1 C1 + γ1 S1 + … + δk Ck + γk Sk

Variables: Y = y, C1 = cos(2πλ1 x), S1 = sin(2πλ1 x), …, Ck = cos(2πλk x), Sk = sin(2πλk x)

[Figure: plot of an example trigonometric polynomial for x between 0 and 1.]
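Since the frequencies are known constants, the sine and cosine terms are ordinary predictor columns. A small sketch of how one design-matrix row could be built follows; the function names, the frequencies 1 and 2, and the parameter values are all hypothetical.

```python
import math

def trig_row(x, freqs):
    """Return [1, S1, C1, ..., Sk, Ck] for one value of x."""
    row = [1.0]
    for lam in freqs:
        row.append(math.sin(2 * math.pi * lam * x))  # S_j
        row.append(math.cos(2 * math.pi * lam * x))  # C_j
    return row

def predict(params, x, freqs):
    """Evaluate the model as a plain dot product of parameters and row."""
    return sum(p * v for p, v in zip(params, trig_row(x, freqs)))

freqs = [1.0, 2.0]                    # known frequencies (assumed values)
params = [10.0, 3.0, -2.0, 0.5, 1.5]  # intercept, then (sin, cos) pairs

row_at_zero = trig_row(0.0, freqs)
y_quarter = predict(params, 0.25, freqs)
print(row_at_zero, y_quarter)
```

The model is non-linear in x but linear in the parameters, which is all that least squares requires.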

Response Surface Models

Dependent variable Y and two independent variables x1 and x2. (These ideas are easily extended to more than two independent variables.)

The Model (a cubic response surface model)

Y = β0 + β1 x1 + β2 x2 + β3 x1^2 + β4 x1 x2 + β5 x2^2 + β6 x1^3 + β7 x1^2 x2 + β8 x1 x2^2 + β9 x2^3 + ε

or

Y = β0 + β1 X1 + β2 X2 + β3 X3 + β4 X4 + β5 X5 + β6 X6 + β7 X7 + β8 X8 + β9 X9 + ε

where

X1 = x1, X2 = x2, X3 = x1^2, X4 = x1 x2, X5 = x2^2, X6 = x1^3, X7 = x1^2 x2, X8 = x1 x2^2 and X9 = x2^3

[Figure: plots of an example response surface.]
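The nine derived predictors of the cubic response surface can be generated mechanically from x1 and x2. A minimal sketch, where `cubic_surface_terms` is an illustrative name and the inputs are arbitrary:

```python
def cubic_surface_terms(x1, x2):
    """Return [X1, ..., X9] for the cubic response surface model."""
    return [
        x1,            # X1
        x2,            # X2
        x1 ** 2,       # X3
        x1 * x2,       # X4
        x2 ** 2,       # X5
        x1 ** 3,       # X6
        x1 ** 2 * x2,  # X7
        x1 * x2 ** 2,  # X8
        x2 ** 3,       # X9
    ]

terms = cubic_surface_terms(2, 3)
print(terms)
```

Each observation (x1, x2) contributes one such row, after which the model is fitted like any other multiple regression.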

The Box-Cox Family of Transformations

x(λ) = transformed x =
    (x^λ - 1)/λ   if λ ≠ 0
    ln(x)         if λ = 0
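A direct transcription of the Box-Cox family into code, as a sketch: the function name `box_cox` and the restriction to x > 0 are assumptions stated here, not taken from the slide.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("the transform is applied here to x > 0 only")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

# A few points on the family: lam = 1 shifts x, lam = 0.5 is a scaled
# square root, lam = 0 is the log, lam = -1 is a scaled reciprocal.
examples = {lam: box_cox(4.0, lam) for lam in (1, 0.5, 0, -1)}
print(examples)
```

Subtracting 1 and dividing by λ is what makes the family continuous in λ: as λ approaches 0, (x^λ - 1)/λ approaches ln(x).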

The Transformation Staircase

[Figure: the transformation staircase.]

The Bulging Rule

[Figure: the bulging rule diagram, with directions labelled "x up", "x down", "y up" and "y down".]

Non-Linear Models

Nonlinearizable models

Non-Linear Growth Models

• Many models cannot be transformed into a linear model.

The Mechanistic Growth Model

Equation: Y = α(1 - β e^(-kx)) + ε

or (ignoring ε) “rate of increase in Y” = dY/dx = k(α - Y)

The Logistic Growth Model

Equation: Y = α/(1 + β e^(-kx)) + ε

or (ignoring ε) “rate of increase in Y” = dY/dx = kY(α - Y)/α

[Figure: Logistic Growth Model, y (0.0 to 1.0) against x (0 to 10), with curves for k = 1/4, k = 1/2, k = 1, k = 2 and k = 4.]

The Gompertz Growth Model:

Equation: Y = α e^(-β e^(-kx)) + ε

or (ignoring ε) “rate of increase in Y” = dY/dx = kY ln(α/Y)

[Figure: Gompertz Growth Model, y (0.0 to 1.0) against x (0 to 10), with k = 1.]
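The link between each growth curve and its rate equation can be checked numerically. The sketch below compares a finite-difference slope with the stated rate for the mechanistic, logistic and Gompertz curves (error term ignored); the parameter values and the evaluation point x = 3 are hypothetical.

```python
import math

ALPHA, BETA, K = 1.0, 0.8, 0.5  # hypothetical parameter values

def mechanistic(x):
    return ALPHA * (1.0 - BETA * math.exp(-K * x))

def logistic(x):
    return ALPHA / (1.0 + BETA * math.exp(-K * x))

def gompertz(x):
    return ALPHA * math.exp(-BETA * math.exp(-K * x))

def slope(f, x, h=1e-5):
    """Central finite-difference approximation to dY/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0 = 3.0
# Each entry pairs the numerical slope with the rate equation's value.
checks = {
    "mechanistic": (slope(mechanistic, x0),
                    K * (ALPHA - mechanistic(x0))),
    "logistic": (slope(logistic, x0),
                 K * logistic(x0) * (ALPHA - logistic(x0)) / ALPHA),
    "gompertz": (slope(gompertz, x0),
                 K * gompertz(x0) * math.log(ALPHA / gompertz(x0))),
}
for name, (numeric, formula) in checks.items():
    print(name, numeric, formula)
```

None of these curves can be rewritten as a linear function of unknown parameters after transforming Y or x alone, which is why they are treated as genuinely non-linear models.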

Example: daily auto accidents in Saskatchewan, 1984 to 1992

Data collected:

1. Date

2. Number of Accidents

Factors we want to consider:

1. Trend

2. Yearly Cyclical Effect

3. Day of the week effect

4. Holiday effects

Trend

This will be modeled by a linear function:

Y = β0 + β1 X

(more generally a polynomial)

Y = β0 + β1 X + β2 X^2 + β3 X^3 + …

Yearly Cyclical Trend

This will be modeled by a trig polynomial: sin and cos functions with differing frequencies (periods):

Y = γ1 sin(2πf1 X) + δ1 cos(2πf1 X) + γ2 sin(2πf2 X) + δ2 cos(2πf2 X) + …

Day of the week effect

This will be modeled using “dummy” variables:

α1 D1 + α2 D2 + α3 D3 + α4 D4 + α5 D5 + α6 D6

Di = (1 if day of week = i, 0 otherwise)

Holiday Effects

These will also be modeled using “dummy” variables.

Independent variables

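X = day, D1, D2, D3, D4, D5, D6, S1, S2, S3, S4, S5, S6, C1, C2, C3, C4, C5, C6, NYE, HW, V1, V2, cd, T1, T2.

Si = sin(0.017202423838959*i*day), Ci = cos(0.017202423838959*i*day). The constant 0.017202423838959 is 2π/365.25, so S1 and C1 have a period of one year.

Dependent variable

Y = daily accident frequency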
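As a sketch of how the day-of-week and cyclical predictors listed above could be built for one observation: the function name `predictors` and the weekday coding (1 to 7, with 7 as the baseline that gets no dummy) are assumptions, since the slides do not say which day corresponds to which index. The holiday variables are omitted because their definitions are not given.

```python
import math

OMEGA = 0.017202423838959  # constant from the slide (2*pi/365.25)

def predictors(day, weekday):
    """Build day, D1..D6, S1..S6, C1..C6 for one observation.

    day     -- day number used for the trend and cyclical terms
    weekday -- 1..7; 7 is treated as the baseline (assumed coding)
    """
    row = {"day": day}
    for i in range(1, 7):
        row[f"D{i}"] = 1 if weekday == i else 0
    for i in range(1, 7):
        row[f"S{i}"] = math.sin(OMEGA * i * day)
        row[f"C{i}"] = math.cos(OMEGA * i * day)
    return row

example = predictors(0, 3)
print(example)
```

Only six dummies are needed for seven days: the seventh day is absorbed into the intercept, and each Di coefficient measures the difference from that baseline day.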

ANALYSIS OF VARIANCE

              SUM OF SQUARES     DF   MEAN SQUARE   F RATIO
REGRESSION         976292.38     18      54238.46    114.60
RESIDUAL           1547102.1   3269      473.2646

VARIABLES IN EQUATION FOR PACC

(Y-INTERCEPT 60.48909)

VARIABLE     COEFFICIENT   STD. ERROR OF COEFF   STD REG COEFF   TOLERANCE   F TO REMOVE   LEVEL
day    1     0.11107E-02            0.4017E-03           0.038     0.99005          7.64       1
D1     9         4.99945                1.4272           0.063     0.57785         12.27       1
D2    10         9.86107                1.4200           0.124     0.58367         48.22       1
D3    11         9.43565                1.4195           0.119     0.58311         44.19       1
D4    12        13.84377                1.4195           0.175     0.58304         95.11       1
D5    13        28.69194                1.4185           0.363     0.58284        409.11       1
D6    14        21.63193                1.4202           0.273     0.58352        232.00       1
S1    15        -7.89293                0.5413          -0.201     0.98285        212.65       1
S2    16        -3.41996                0.5385          -0.087     0.99306         40.34       1
S4    18        -3.56763                0.5386          -0.091     0.99276         43.88       1
C1    21        15.40978                0.5384           0.393     0.99279        819.12       1
C2    22         7.53336                0.5397           0.192     0.98816        194.85       1
C3    23        -3.67034                0.5399          -0.094     0.98722         46.21       1
C4    24        -1.40299                0.5392          -0.036     0.98999          6.77       1
C5    25        -1.36866                0.5393          -0.035     0.98955          6.44       1
NYE   27        32.46759                7.3664           0.061     0.97171         19.43       1
HW    28        35.95494                7.3516           0.068     0.97565         23.92       1
T2    33       -18.38942                7.4039          -0.035     0.96191          6.17       1

VARIABLES NOT IN EQUATION

VARIABLE     PARTIAL CORR.   TOLERANCE   F TO ENTER   LEVEL
IACC   7           0.49837     0.78647      1079.91       0
Dths   8           0.04788     0.93491         7.51       0
S3    17          -0.02761     0.99511         2.49       1
S5    19          -0.01625     0.99348         0.86       1
S6    20          -0.00489     0.99539         0.08       1
C6    26          -0.02856     0.98788         2.67       1
V1    29          -0.01331     0.96168         0.58       1
V2    30          -0.02555     0.96088         2.13       1
cd    31           0.00555     0.97172         0.10       1
T1    32           0.00000     0.00000         0.00       1

***** F LEVELS( 4.000, 3.900) OR TOLERANCE INSUFFICIENT FOR FURTHER STEPPING
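The entries of the analysis of variance table are related by simple arithmetic, which the following sketch reproduces from the printed sums of squares and degrees of freedom. The R-squared value and the observation count are derived quantities that do not appear in the output itself.

```python
# Sums of squares and degrees of freedom as printed in the output.
ss_regression, df_regression = 976292.38, 18
ss_residual, df_residual = 1547102.1, 3269

ms_regression = ss_regression / df_regression  # mean square = SS / DF
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual

# Derived, not printed: proportion of variation explained, and the
# number of observations implied by the degrees of freedom.
r_squared = ss_regression / (ss_regression + ss_residual)
n_observations = df_regression + df_residual + 1

print(ms_regression, ms_residual, f_ratio, r_squared, n_observations)
```

The implied count of 3288 observations matches the number of days in the nine years 1984 to 1992 (9 × 365 plus three leap days), and the 18 regression degrees of freedom match the 18 variables in the equation.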

Day of the week effects

D1    4.99945
D2    9.86107
D3    9.43565
D4   13.84377
D5   28.69194
D6   21.63193

[Figure: Day of Week Effect, chart by day (Mon, Tue, Wed, Thu, Fri, Sat, Sun) on a scale from 0.0 to 100.0.]

Holiday Effects

NYE   32.46759
HW    35.95494
T2   -18.38942

Cyclical Effects

S1   -7.89293
S2   -3.41996
S4   -3.56763
C1   15.40978
C2    7.53336
C3   -3.67034
C4   -1.40299
C5   -1.36866

[Figure: Cyclical Effects, plotted over a horizontal axis from 0 to 360 (ticks every 30) with a vertical axis from -30 to 40.]
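The fitted cyclical effect for any day can be evaluated by plugging the estimated coefficients into the sine and cosine terms. A sketch using the coefficients reported above; the function name `cyclical_effect` is illustrative, and `day` is assumed to be measured on the same scale as in the fitted model, whose origin the slides do not state.

```python
import math

OMEGA = 0.017202423838959  # 2*pi/365.25, as used for Si and Ci

# Estimated coefficients of the terms retained in the equation.
SIN_COEF = {1: -7.89293, 2: -3.41996, 4: -3.56763}
COS_COEF = {1: 15.40978, 2: 7.53336, 3: -3.67034,
            4: -1.40299, 5: -1.36866}

def cyclical_effect(day):
    """Sum of the fitted sine and cosine terms at the given day."""
    sin_part = sum(coef * math.sin(OMEGA * i * day)
                   for i, coef in SIN_COEF.items())
    cos_part = sum(coef * math.cos(OMEGA * i * day)
                   for i, coef in COS_COEF.items())
    return sin_part + cos_part

# Tabulate the effect every 30 days across one year.
curve = [(day, cyclical_effect(day)) for day in range(0, 361, 30)]
for day, effect in curve:
    print(day, round(effect, 2))
```

At day 0 every sine term vanishes, so the effect there is just the sum of the cosine coefficients; the whole curve repeats every 365.25 days.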
