
Quadratic Regression

©2005 Dr. B. C. Paul

Fitting Second Order Effects

We can also use a least squares error formulation to fit an equation of the form

$$Y(X) = B_0 + B_1 X + B_2 X^2$$

The math is more difficult, but since you don't have to do it, you may not care.
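For anyone who does care about the math, here is a minimal sketch of what a least squares quadratic fit involves. It is not what SPSS runs internally; it simply builds the normal equations for the three constants and solves them by elimination. The function name `fit_quadratic` and the data are made up for illustration.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2; returns [b0, b1, b2]."""
    # Sums of powers of x fill the 3x3 normal-equation matrix.
    s = [sum(x ** k for x in xs) for k in range(5)]
    # Right-hand side: sums of y * x**k for k = 0, 1, 2.
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix: normal-equation matrix with the right-hand side appended.
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


# Made-up, noise-free data generated from y = 1 + 2x + 0.5x^2 (illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1 + 2 * x + 0.5 * x ** 2 for x in xs]
b0, b1, b2 = fit_quadratic(xs, ys)
print(b0, b1, b2)
```

Because the made-up data has no noise, the fit should recover the three constants used to generate it, up to rounding.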

Fitting the Model With SPSS

We will re-use our data set where we saw a clear quadratic effect in the trend.

Click on Analyze to pull down the menu.

Highlight Regression to bring up the side menu.

Highlight and click Curve Estimation.

Setting for a Quadratic Model

Set your Dependent and Independent variables as before.

Check off that you want a Quadratic model.

Note that you have options to fit Logarithmic, Inverse, Cubic, Power of your choice, Exponential, and a number of other models. The computer will fit any of the models by least squares.

Other Options to Check

I can have the model include a constant or not.

I want it to plot my model.

I also want to see an ANOVA on my model and the constants in the regression equation.

Click Ok when all is set.

Here Come Results

Tells me it fit a quadratic model for Dependent, using Independent as the controlling variable, and that I had 29 data cases.

Analyzing the Fit of the Quadratic Equation.

R squared value is 1, which pretty much means that the quadratic model is a perfect fit.

The regression mean square is 6 orders of magnitude greater than the mean square error, and the F test blows the null hypothesis off the map.
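To see where the R squared and F numbers in an ANOVA table come from, here is a small sketch under stated assumptions. The data is made up (a quadratic plus a small alternating wiggle), not the 29 cases from the slides, and the helper names `fit_quadratic` and `anova` are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares coefficients [b0, b1, b2] for y = b0 + b1*x + b2*x**2."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def anova(xs, ys):
    """Return (r_squared, f_statistic) for the quadratic fit."""
    b0, b1, b2 = fit_quadratic(xs, ys)
    fitted = [b0 + b1 * x + b2 * x * x for x in xs]
    mean_y = sum(ys) / len(ys)
    sst = sum((y - mean_y) ** 2 for y in ys)              # total variability
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # unexplained part
    ssr = sst - sse                                       # explained part
    msr = ssr / 2                # 2 regression df: the X and X^2 terms
    mse = sse / (len(ys) - 3)    # n - 3 error df: three fitted constants
    return 1 - sse / sst, msr / mse


# Made-up data: y = 1 + 2x + 0.5x^2 plus an alternating +/-0.1 wiggle.
xs = [float(x) for x in range(10)]
ys = [1 + 2 * x + 0.5 * x * x + (0.1 if i % 2 == 0 else -0.1)
      for i, x in enumerate(xs)]
r_squared, f_stat = anova(xs, ys)
print(f"R^2 = {r_squared:.5f}, F = {f_stat:.0f}")
```

When the wiggle is tiny next to the trend, the regression mean square dwarfs the mean square error, so R squared sits near 1 and F is huge, which is the pattern described above.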

Checking the Significance and Value of the Coefficients

B0 = 1.163

B1 = 4.061, i.e. 4.061*X

B2 = 0.068, i.e. 0.068*X^2

$$Y(X) = 1.163 + 4.061\,X + 0.068\,X^2$$

How Significant are the Values

T tests are used to measure the certainty that our coefficient values are not 0. As can be seen, none of them has any noteworthy chance of being a fluke.
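Here is a rough sketch of how a t value for each coefficient can be computed: the coefficient divided by its standard error, where the standard error comes from the mean square error and the diagonal of the inverted normal-equation matrix. The data is made up, the helper `invert` is hypothetical, and turning a t value into a significance level needs the t distribution with n - 3 degrees of freedom, which this sketch leaves to the stats package.

```python
import math


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(m)
    a = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]


# Made-up data: y = 1 + 2x + 0.5x^2 plus an alternating +/-0.1 wiggle.
xs = [float(x) for x in range(10)]
ys = [1 + 2 * x + 0.5 * x * x + (0.1 if i % 2 == 0 else -0.1)
      for i, x in enumerate(xs)]

rows = [[1.0, x, x * x] for x in xs]            # one row per case: 1, X, X^2
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
c = invert(xtx)
b = [sum(c[i][j] * xty[j] for j in range(3)) for i in range(3)]  # B0, B1, B2

resid = [y - sum(bi * ri for bi, ri in zip(b, r)) for r, y in zip(rows, ys)]
mse = sum(e * e for e in resid) / (len(xs) - 3)   # mean square error
se = [math.sqrt(mse * c[i][i]) for i in range(3)]  # standard errors
t = [bi / si for bi, si in zip(b, se)]             # t value per coefficient
print([round(v, 1) for v in t])
```

A large t value (in absolute terms) means the coefficient is many standard errors away from 0, so it is unlikely to be a fluke.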

Here is the Fit of the Model to the Data

As can be seen, the model fits the points, including the slight curvature that reflects the quadratic effect.

Let's See if There Is a Quadratic Effect of Distance on Our MPG

We remember that there is some definite scatter in our MPG data. The linear regression on distance only explained about 37% of the total variability.

Of course, unlike our last data set, where we could see the curve effect in the residuals, the residuals were fairly scattered for our MPG plot.

Looking at Results

The R^2 value is up to 40% of variability from about 37%. That's improvement, but not a lot.

The regression itself is significant at the 99.9% level.

We have something going on down here.

Significance of Coefficients

The constant is significant.

The significance of our distance and distance squared terms is somewhat lacking.

At an alpha level of 9.9%, some may not be sure the distance coefficient is not zero.

At an alpha level of 48.7%, most people would have a lot of doubt about the quadratic term, distance squared.

What Happened?

We already ran a linear regression and know we have a significant linear effect.

Now we run a quadratic regression and it's telling us it's not sure about the linear effect.

Significance is measured by how much variation is explained by a term relative to the mean square error.

As new terms enter the equation, the amount of variability explained by a single term normally drops.

The prediction accuracy is now being shared.

It does make a difference what else is in the model.
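One way to picture the sharing is to look at how closely X and X squared track each other. This is an added illustration, not something from the slides: when X is always positive, the two terms carry nearly the same information, so neither gets clear credit. The X values below are made up (1 through 29), and the variance inflation factor 1 / (1 - r^2) is the usual two-predictor formula.

```python
import math


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


# Made-up positive X values and their squares (illustration only).
xs = list(range(1, 30))
x_sq = [x * x for x in xs]

r = pearson(xs, x_sq)
# With two predictors, the variance of each coefficient estimate is inflated
# by 1 / (1 - r^2) compared with the case where the predictors are uncorrelated.
vif = 1 / (1 - r ** 2)
print(f"correlation = {r:.3f}, variance inflation = {vif:.1f}")
```

A correlation this close to 1 means the linear and squared terms compete to explain the same variability, which is one reason a term that was significant on its own can look doubtful once the other enters the model.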

Checking Out the Plot

The curved regression line does not appear to be a bad fit to the data. In fact, the data seems to have a bend.

But the significance of the square term is just over 50%, which is not mathematically convincing.

Why are We Being Told Something that Looks Wrong?

Whether a term is significant depends on what else is in the equation.

In this case the linear effect would seem stronger than the quadratic effect so we might expect more weight to go to the other variable.

What else is not explained?

60% of the variability in this data is not explained by the regression of distance only.

Sometimes the clarity with which we can see a trend depends on the amount of confusion coming from other sources.

Everything else other than distance that might influence gas mileage is being called random.

We Know More Than We Are Telling

We would logically guess that gas mileage is influenced by more than distance driven.

When we leave other sources of prediction unaccounted for, we expand what we are saying is random.

Not using what we know can cause us to lose a lot of power in our models.

Problem is our model can only handle one independent variable.

Maybe we need a “Bigger Box”.
