stepwise regression with press and rank regression


SAND79-1472

Unlimited Release

Stepwise Regression With PRESS and Rank Regression (Program User's Guide)

Ronald L. Iman, James M. Davenport, Elizabeth L. Frost, Michael J. Shortencarier

When printing a copy of any digitized SAND Report, you are required to update the

markings to current standards.


Issued by Sandia Laboratories, operated for the United States Department of Energy by Sandia Corporation.

NOTICE

This report was prepared as an account of work sponsored by the United States Government. Neither the United States nor the Department of Energy, nor any of their employees, nor any of their contractors, subcontractors, or their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness or usefulness of any information, apparatus, product or process disclosed, or represents that its use would not infringe privately owned rights.


SAND79-1472 Unlimited Release

Printed January 1980

Stepwise Regression with PRESS and Rank Regression (Program User's Guide)

Ronald L. Iman Statistics and Computing Division 1223

Sandia Laboratories Albuquerque, New Mexico 87185

James M. Davenport Department of Mathematics

Texas Tech University Lubbock, Texas 79409

Elizabeth L. Frost Statistics and Computing Division 1223

Sandia Laboratories Albuquerque, New Mexico 87185

Michael J. Shortencarier Statistics and Computing Division 1223

Sandia Laboratories Albuquerque, New Mexico 87185

ABSTRACT

This report contains a description of a stepwise multiple regression

program. This program provides for either a forward stepwise or backward

elimination solution to multiple regression problems. The program also

provides PRESS values that can be used for subset selection as well as

providing options for regression analysis on the ranks of the data and

for a weighted regression analysis on either raw or rank transformed

data. This document has been written and designed for users of this

STEPWISE program.


Table of Contents

I. Introduction

II. Output

III. Input - Parameter Cards

A. TITLE

B. DATA

C. LABEL

D. BACKWARD

E. FORCE

F. OUTPUT

G. PLOT RESIDUALS

H. STEPWISE

I. MODEL

J. PRESS

K. RANK REGRESSION

L. WEIGHT

M. DROP

N. TRANSFORMATION

O. END OF PARAMETERS

P. FORMAT

Q. END OF DATA

IV. Additional Data Transformations

V. Output of the Regression Coefficients and the Observations with their Predicted Values When Using Rank Regression

VI. Deck Setup for STEPWISE

VII. Deck Setup for Standardizing Data

VIII. Bibliography

IX. Example


values from rank regression analysis can also be saved on disk for

additional investigation outside the STEPWISE program, (10) weighted

regression analysis on the raw data or rank transformed data.

The reader who is familiar with SAND76-0364 (Iman, 1976) and has used

STEPWISE on previous occasions is advised to read this report to note the

extensive additions.

II. OUTPUT

The program will output means, variances, standard deviations, standard errors and coefficients of variation for each variable read in or generated via transformations. As options, the user may request the table of correlation coefficients, the sum of squares and cross-products matrix, the inverse correlation matrix, the residuals with the y-values and $\hat{y}$'s, the PRESS values and a plot of these PRESS values. The program will also print an analysis of variance table for a regression model and a table of statistics regarding the regression coefficients. Residual plots may be requested and, if they are, will be printed on the line printer.

III. INPUT - PARAMETER CARDS

All parameter cards must start in column 1 and may be placed in any

order. An explanation and illustration of each of the parameter cards

follows.

A. TITLE card (optional).

The "TITLE" card may contain any alphameric data that may be meaningful to the user. The word "TITLE" identifies the card, and its contents are printed at the top of each page of output. Only one "TITLE" card may be used.

Example:

THIS IS A SAMPLE TITLE CARD

B. DATA card (required).

The "DATA" card has 3 arguments specified as follows:

DATA, NV, NT, DATDIS.

where

NV is the number of user supplied variables to be read into the program;

NT is the number of additional variables created as transformations of other variables;

DATDIS is the data disposition parameter with the following codes:

= 0 the data are to be read from cards and not saved for subsequent use (this is the usual case and if not specified DATDIS will default to 0).

= 1 the data are to be read from cards and saved on disk (file 10 as binary records) for subsequent analysis of the data. (DATDIS would be set equal to 2 for any subsequent analyses following DATDIS = 1.)

= 2 the data are to be read from disk (file 10 - binary records).

In most cases the user need specify only the first argument (NV) and

the other two will default to zero. The example below shows how this is

done:

DATA, 10.

The above card indicates that 10 variables should be input from cards and not saved on disk. Both NT and DATDIS default to zero because they are not specified on the card. Note that the "DATA" card must be terminated by a period.
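The defaulting rule for the "DATA" card can be illustrated with a short Python sketch. This is not part of STEPWISE; the function name `parse_data_card` and the returned dictionary are hypothetical, and the sketch assumes the arguments are separated by commas exactly as in the example card.

```python
def parse_data_card(card):
    """Parse a card such as 'DATA, 10.' into NV, NT and DATDIS.

    Hypothetical helper for illustration only; NT and DATDIS default
    to zero when they are omitted, as described in the text.
    """
    text = card.strip()
    if not text.startswith("DATA") or not text.endswith("."):
        raise ValueError("a DATA card starts with DATA and ends with a period")
    # Drop the keyword and the terminating period, then split on commas.
    args = [a.strip() for a in text[4:-1].split(",") if a.strip()]
    if not 1 <= len(args) <= 3:
        raise ValueError("expected between one and three arguments")
    nv, nt, datdis = ([int(a) for a in args] + [0, 0])[:3]
    if datdis not in (0, 1, 2):
        raise ValueError("DATDIS must be 0, 1 or 2")
    return {"NV": nv, "NT": nt, "DATDIS": datdis}
```

Calling `parse_data_card("DATA, 10.")` yields NV = 10 with both NT and DATDIS set to zero.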

C. LABEL card (optional).

The third type of parameter card is the "LABEL" card. It is used to assign variable names to the variables in the analysis. Each name may be up to 8 characters long. The card is identified by the keyword "LABEL", which must start in card column 1. The number in parentheses is the number of the variable that is to be labeled with the first name. It is assumed that all succeeding labels are sequential. If more than one card is needed for the labels, the succeeding label cards should have the same format as the first except that the number in parentheses should be the number of the variable that is to be labeled next. (Variable number is the same as the subscript of the variable as it is read into the input array X.) A comma or period should not be placed after the last label on the card. There can be any number of blank spaces on a card after the last label on that card. However, column 80 cannot be used (i.e., column 80 is not read). When making multiple runs on a set of data, the labels from the first run carry over to subsequent runs.

Example:

LABEL(1) FIRST,CAPACIT,...

LABEL(12) TWELFTH,THIRTEEN

The above cards would assign X(1) the name FIRST, X(2) the name CAPACIT and so on until X(13) would be named THIRTEEN. There is no limit to the number of label cards nor the number of names per card. A variable name may not be continued from one card to the next. Any spaces between the delimiting commas are taken as part of the label and are counted as characters in the name. All, none, or any portion of the variables may be labeled.

D. BACKWARD card, (required if the STEPWISE card is not used).

If the user desires a backward elimination solution for "best" subset

selection, a "BACKWARD" card must be included among the parameters. The

form is

BACKWARD,SIG=alphahat

where alphahat is replaced by the significance level the user wishes to use for deleting variables from the model. Variables will be dropped from the model until only variables that have alphahat values (observed significance levels) less than or equal to alphahat are left in the model.

If you have only one independent variable, you must use the BACKWARD card instead of the STEPWISE card.

E. FORCE card (optional).

If the user wishes to keep certain variables in the regression model, regardless of their contribution to the model, he may do so by using the "FORCE" statement. The "FORCE" statement will keep specified variables in either the backward elimination or stepwise solution. The following is an example of the "FORCE" card:

FORCE, 3,6,8.

The above card indicates that variables 3, 6, and 8 are to be kept in the model regardless of their contribution to the model. The "FORCE" card must end with a period. The maximum number of variables that may be forced into the model is 10. Do not force all of the independent variables into the model. If the user desires to have all of the independent variables in the model, then he should use the BACKWARD option (with SIG=1.0).

F. OUTPUT card (required).

The "OUTPUT" card may have any of the following specifications separated by commas.

Keyword          Action

1. CORR          Print simple correlations among all variables specified in the model.

2. SSXP          Print the corrected sum of squares and cross-products matrix for all variables specified in the model.

3. INVERSE       Print the inverse correlation matrix at each step in the analysis for all independent variables used at that step.

4. STEPS         Print an analysis of variance table and the regression coefficient estimates for each step of the backward elimination or stepwise variable selection procedure. If not specified, only the results for the final model will be printed.

5. RESIDUALS     Print the y, $\hat{y}$ and residual for each observation using the final regression model. Caution: Do not request RESIDUALS if the number of observations is large. This could produce excessive, if not unwanted, output.

6. ALL           Options 1 through 5 will all be in effect.

Note: Any subset of these options may be requested but should follow the order of appearance given above. The following example shows how the "OUTPUT" card may be used.

OUTPUT,CORR,INVERSE,STEPS,RESIDUALS

G. PLOT RESIDUALS card (optional).

The user may request that residuals be plotted on the line printer by inserting the "PLOT RESIDUALS" card. The plots produced are: each independent variable in the final model versus the dependent variable; residuals versus time; residuals versus $\hat{y}$'s; and residuals versus each of the independent variables in the model. Time is assumed to be the same as order of data input, i.e., observation 1 is assumed to be recorded first in time, observation 2 as second, etc. The "PLOT RESIDUALS" card should be as follows:

PLOT RESIDUALS

H. STEPWISE card, (required if the BACKWARD card is not used).

The "STEPWISE" card is used to specify that the program should find the "best" subset of the full model using the stepwise procedure. The user may specify a significance level for entering a variable into the model (SIGIN) and a significance level for deleting a variable from the model (SIGOUT). The program will continue to add variables to the model as long as it can find a variable whose regression coefficient is significantly different from zero at the specified significance level for entering variables into the model. At each step, t-tests are computed on each of the regression coefficients of the variables in the model and their alpha hats computed. If any of the t statistics are not significant at the level specified for deleting a variable, the least significant variable is dropped. This process continues until all variables in the model are significant at the specified deletion level. The program then searches for a new variable to be added to the model and this cycle is repeated until no new variables can be found which are significant at the specified significance level.

The program can be used for computing a forward solution, i.e., adding variables in order of their contribution but not deleting any, by setting the deletion significance level to 1.0.

An example of the "STEPWISE" specification is given below:

STEPWISE,SIGIN=0.05,SIGOUT=0.10

The above card indicates variables are to be added to the model as long as variables that are significant at the $\alpha$ = .05 level can be found. Variables that are already in the model whose significance level rises above .10, i.e., $\hat{\alpha}$ > .10, will be deleted from the model. If the values for SIGIN and SIGOUT are not specified, they will each default to .05. Note: SIGOUT must always be at least as large as SIGIN to avoid an indefinite loop where a variable may be introduced and then immediately deleted. Recall that if there is only one independent variable, the BACKWARD elimination procedure should be used rather than the STEPWISE procedure.
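The add/delete cycle described in this section can be sketched in Python as a control-flow outline. This is not the STEPWISE source code: the function `stepwise_select`, the caller-supplied `alpha_hat(var, model)` function (which stands in for the t-test computation and returns the observed significance level of `var` in a model containing the variables in `model`) and the `max_steps` guard are all assumptions made for illustration.

```python
def stepwise_select(candidates, alpha_hat, sigin=0.05, sigout=0.05, max_steps=100):
    """Return the variables kept by the add/delete cycle.

    alpha_hat(var, model) is a hypothetical caller-supplied function giving
    the observed significance level of `var` when the model holds `model`.
    """
    if sigout < sigin:
        raise ValueError("SIGOUT must be at least as large as SIGIN")
    model = []
    for _ in range(max_steps):  # guard added for this sketch only
        # Deletion: drop the least significant variable while it exceeds SIGOUT.
        while model:
            worst = max(model, key=lambda v: alpha_hat(v, model))
            if alpha_hat(worst, model) <= sigout:
                break
            model.remove(worst)
        # Entry: add the most significant outside variable if it meets SIGIN.
        outside = [v for v in candidates if v not in model]
        if not outside:
            break
        best = min(outside, key=lambda v: alpha_hat(v, model + [v]))
        if alpha_hat(best, model + [best]) > sigin:
            break
        model.append(best)
    return model
```

Setting `sigout=1.0` in this sketch gives the forward solution, since no variable can then exceed the deletion level.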

I. MODEL card, (required).

The model card indicates which variables are in the model and whether they are dependent or independent variables. Up to 15 dependent variables may be specified for any single run. A separate analysis is done for each dependent variable. The model may have up to 179 independent variables if there is only one dependent variable. The total number of variables both read in and generated using transformations may not exceed 180. Therefore, if there are 15 dependent variables specified on the model card, the maximum number of independent variables is 165. The model may be continued for as many cards as necessary; just be sure that the continuation occurs before or after a + sign, i.e., do not allow the continuation to interrupt a variable subscript. Do not start model continuation cards in column 1.

The writing of the model card is done using variable subscript numbers. The first variable read in is variable 1 and thus has subscript number 1, etc. The writing of the model may be in any of the 3 forms that follow, where the dependent variables are immediately after the word "MODEL", separated by commas, and the independent variables are to the right of the equal sign and separated by plus signs.

1. MODEL,1,3,2=4+5+12.

This method simply uses the variable number to indicate which variables are in the model and whether they are dependent or independent variables. For this particular example, variables 1, 3 and 2 are the dependent variables and 4, 5 and 12 are the independent variables. A separate analysis would be performed for each of the dependent variables.

2. MODEL,Y1,Y3,Y2=X4+X5+X12.

This specification results in the same model as the method described above. It allows the user to use more specific notation which may be helpful in clarifying the model in some instances.

3. MODEL,Y(1),Y(3),Y(2)=B(4)X(4)+B(5)X(5)+B(12)X(12).

This specification is equivalent to the two above methods but allows

more specificity in describing the model by allowing a dummy indicator,

B(i), for the regression coefficients.

4. MODEL,1,Y3,Y(2)=4+B(5)X(5)+X12.

The above methods may be mixed. Note that the above models are all

terminated by periods. The period is not necessary on the "MODEL" card but

results in slightly faster parameter processing by eliminating the need for

the program to determine whether the model card has been continued or not.

The program always fits the model with the intercept assumed to already be

included in the model. There may be any number of blank spaces between the

independent variables as long as each variable specification is separated

by a "+" sign. However, there can be no blank spaces to the left of the

equal sign.
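The equivalence of the model forms can be checked with a small Python sketch that reduces each form to its variable numbers. The helper `parse_model` is hypothetical and handles a single card only (no continuation cards); it takes the last number in each term as the variable subscript, which is an assumption that fits the forms shown above.

```python
import re

def parse_model(card):
    """Return (dependent, independent) variable numbers from a MODEL card.

    Hypothetical illustration; continuation cards are not handled.
    """
    text = card.strip()
    if text.endswith("."):          # the terminating period is optional
        text = text[:-1]
    if not text.startswith("MODEL,"):
        raise ValueError("a MODEL card starts with MODEL,")
    left, equals, right = text[len("MODEL,"):].partition("=")
    if not equals:
        raise ValueError("a MODEL card needs an equal sign")

    def subscript(term):
        # 'B(5)X(5)' holds two numbers; the last one is the variable number.
        numbers = re.findall(r"\d+", term)
        if not numbers:
            raise ValueError("no variable number in term: " + term)
        return int(numbers[-1])

    dependent = [subscript(t) for t in left.split(",")]
    independent = [subscript(t) for t in right.split("+")]
    return dependent, independent
```

All four example cards above reduce to dependent variables 1, 3, 2 and independent variables 4, 5, 12 under this sketch.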


J. PRESS card (optional).

When regression techniques are used to build a response surface, the possibility exists of producing two competing models. This situation could easily arise in stepwise regression as a new surface is developed each time a new significant variable is added. Although "statistical significance" is a necessary condition for adding a new variable in stepwise regression, it is not an end in itself as there exists the possibility of overfitting the data. For example, it is possible to obtain a good fit on a set of points by using a polynomial of high degree. However, in doing so, one can overfit the data and produce a spurious model which makes poor predictions.

To protect against overfit, the Predicted Error Sum of Squares (PRESS) criterion as given in Allen (1971) can be used to determine the adequacy of a prediction model. For a regression model containing k variables and constructed from n observations, PRESS is computed in the following manner. For i = 1, 2, ..., n, the ith observation is deleted from the original set of n observations and then a regression model containing the original k variables is constructed from the remaining n-1 observations. With this new regression model, the value $\hat{Y}_k(i)$ is the estimate for the deleted observation $Y_i$. Then, PRESS is defined from the preceding predictions and the n original observations by

$$\mathrm{PRESS} = \sum_{i=1}^{n} \left( Y_i - \hat{Y}_k(i) \right)^2$$

The regression model having the smallest PRESS value is preferred when

choosing between two competing models, as this is an indication of how well

the basic pattern of the data has been fit versus an overfit or an underfit.

To obtain the PRESS values at each step of the analysis, merely add the PRESS card to the deck of parameter cards. Also, if the analysis produces 2 or more steps, a plot of PRESS values versus step number is automatically produced under the PRESS option to make relative comparisons easier.

Example:

PRESS
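The leave-one-out definition of PRESS given in this section can be written directly as a Python sketch. It refits the model n times by solving the normal equations, which is the literal definition and not necessarily how STEPWISE computes the value. The function names `solve`, `fit`, `predict` and `press` are hypothetical, and an intercept is always included, as the program assumes.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit(xs, ys):
    """Least squares coefficients (intercept first) from the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)

def predict(beta, x):
    """Predicted value for one observation vector x."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def press(xs, ys):
    """Sum of squared errors when each observation is predicted without itself."""
    total = 0.0
    for i in range(len(ys)):
        beta = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - predict(beta, xs[i])) ** 2
    return total
```

Here `xs` is a list of observation vectors (lists of the k independent variables) and `ys` the matching list of responses; of two competing variable subsets, the one giving the smaller `press` value would be preferred.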

K. RANK REGRESSION card (optional).

To obtain a regression analysis on the ranks of the data instead of the raw data, merely add the RANK REGRESSION card to the deck of parameter cards. If the user requests RESIDUALS on the OUTPUT card, then the residuals are given as a rank residual (i.e., the difference between the rank of Y and the predicted rank of Y). In addition, the predicted ranks of the Y's are used to interpolate in the original raw Y's to obtain the raw $\hat{Y}$. This is done at each step in the analysis and these data are used to compute a "normalized" $R^2$. This $R^2$ value is associated with the raw data instead of the ranked data and is computed by

$$R^2 = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}$$

This "normalized" value of $R^2$ varies between zero and one and will be close to one if the model being analyzed predicts the observed values adequately.

Note that the adjusted total sum of squares can be factored into three components: (1) sum of squares of regression, (2) sum of squares of error and (3) sum of cross-products. The formulas are as follows:

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 + 2\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i).$$

If the $\hat{Y}_i$ are the usual least squares predictions then this last term (the sum of cross-products) is easily shown to be zero. However, since the $\hat{Y}_i$ computed here by the rank regression technique are definitely not the least squares predictions, this term is non-zero (it can be positive or negative). Hence, if the "usual" $R^2$ = (sum of squares of regression)/(total sum of squares) were computed, it would be either increased or decreased over the "normalized" $R^2$ described above depending on whether the sum of cross-products is negative or positive. Therefore, it is felt that the normalized $R^2$ is a more appropriate measure of fit than the usual $R^2$ in this situation.
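The relationship between the two $R^2$ values can be illustrated with a small Python sketch. The function name `rank_fit_summary` and its dictionary keys are hypothetical; the raw predictions `y_hat` are taken as given, since the interpolation scheme that produces them is not reproduced here, and the cross-product term is written with the factor of 2 that the algebraic expansion of the total sum of squares requires.

```python
def rank_fit_summary(y, y_hat):
    """Sums of squares and both R-squared values for raw Y and raw predictions.

    Hypothetical helper for illustration; y_hat is supplied by the caller.
    """
    n = len(y)
    y_bar = sum(y) / n
    ss_regression = sum((yh - y_bar) ** 2 for yh in y_hat)
    ss_error = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    cross = 2.0 * sum((yh - y_bar) * (yi - yh) for yi, yh in zip(y, y_hat))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "ss_regression": ss_regression,
        "ss_error": ss_error,
        "cross_products": cross,
        "ss_total": ss_total,
        # Always between zero and one.
        "normalized_r2": ss_regression / (ss_regression + ss_error),
        # Can exceed one when the cross-product term is negative.
        "usual_r2": ss_regression / ss_total,
    }
```

For the made-up values `y = [1, 2, 3, 4]` and `y_hat = [1, 2, 3, 5]`, the cross-product term is negative, the usual $R^2$ is 1.8 and the normalized $R^2$ is 0.9, which shows why the normalized form is the safer measure when the predictions are not least squares predictions.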

The sum of cross-products can be quite large in absolute value, depending upon the structure of the observations. If the raw data have unusual spacings or exhibit a non-monotone relationship, then the interpolation scheme used to obtain the $\hat{Y}_i$'s will not be adequate and the resulting sum of cross-products will be large. Hence as a measure of the "ability (or inability) to interpolate in the data" a Coefficient of Interpolation is computed and given for each model analyzed. This coefficient is computed by

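[Equation not legible in the source; the terms that can be made out are $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$ and $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i)$.]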

Note that if the sum of cross-products is zero then this coefficient will be zero and that as the sum of cross-products departs from zero, the coefficient approaches one. Therefore as a matter of interpretation, a value of this coefficient that is "near zero" indicates that the interpolation is adequate. If the coefficient is "near one," then the user should examine the original observations for unusual spacings and/or non-monotone relations in the data (such as cyclic).

After the final model has been selected, the raw residuals are given (when RESIDUALS are requested) as the difference between the raw $Y$ and the raw $\hat{Y}$ (see Section V for how to save these values for further analysis). The reader is referred to Iman and Conover (1979) for a thorough discussion of the rank regression analysis.

When using RANK REGRESSION, there is a limit on the number of observations that can be handled. Let NV = the number of variables input to STEPWISE (see the DATA card, Section III.B) and N = the number of observations. Then the following limits are in effect:

1. If NV = 2 (i.e., one independent and one dependent variable) then N ≤ 64980.

L. WEIGHT card (optional).

The "WEIGHT" card allows the user to perform a weighted regression

analysis of the data. The card has one argument as shown below:

WEIGHT=IWT

where IWT is the variable number where the weights will appear. (Variable number is the same as the subscript of the variable as it is read into the input array X.) If IWT is zero then weighted regression will not be done. Since the default value of IWT is zero, the "WEIGHT" card need not be included if the user does not want to use weighted regression. Weighted regression may be done on either raw data or rank transformed data. The weights are normalized by the program so that the sum of the weights equals the number of observations. When the WEIGHT option is requested, the equations given above for PRESS and the $R^2$ on ranks are automatically adjusted to reflect these weights, as is the case for the normal equations used in the regression analysis.

The card is identified by the keyword "WEIGHT" which must begin in

column one. There must be no blanks before or after the equal sign. A

period to terminate the card is optional.

Example:

WEIGHT=3

In the example, weighted regression has been requested. The weights will

appear in the data as variable number three.
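The normalization of the weights mentioned above amounts to rescaling them so that they sum to the number of observations. A minimal Python sketch follows; the function name `normalize_weights` is hypothetical, and the requirement that the weights have a positive sum is an assumption of the sketch.

```python
def normalize_weights(weights):
    """Rescale weights so that their sum equals the number of observations."""
    n = len(weights)
    total = sum(weights)
    if total <= 0:
        raise ValueError("the weights must have a positive sum")
    return [w * n / total for w in weights]
```

For example, the weights 1, 2, 3, 2 on four observations become 0.5, 1.0, 1.5, 1.0, which sum to 4.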

M. DROP card (optional).

The "DROP" card allows the user to unconditionally drop observations from the regression analysis. (For information on how to conditionally drop observations, see the "Additional Data Transformations" section.) If the user knows in advance that some observations will need to be discarded, he may use the "DROP" card, indicating the number(s) of the observation(s) to be dropped. The card is identified by the keyword "DROP" which must begin in column one. The only blank necessary is the one after the word "DROP." Commas are used to separate the observation numbers and a period is required after the last observation number. The observations being dropped need not appear in any special order. There must be no blanks preceding the period and there must be no blanks before or after the commas unless the card is to be continued. To continue the "DROP" card, at least one blank must follow the last comma and the continuation card must begin in column two. There is no limit to the number of continuation cards or to the number of observation drops per card.
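The punctuation and continuation rules for the "DROP" card can be summarized in a short Python sketch. The function `parse_drop` is hypothetical and is not the program's parser; it treats each card as a string whose first character is column one and checks only the rules stated above.

```python
def parse_drop(cards):
    """Return the sorted observation numbers named on a DROP card.

    `cards` is a list of card images: the DROP card followed by any
    continuation cards. Hypothetical helper for illustration only.
    """
    first = cards[0]
    if not first.startswith("DROP "):
        raise ValueError("the card must begin with DROP and one blank in column one")
    body = first[5:].rstrip()
    for card in cards[1:]:
        # A continuation card leaves column one blank and starts in column two.
        if len(card) < 2 or card[0] != " " or card[1] == " ":
            raise ValueError("a continuation card must begin in column two")
        body += card[1:].rstrip()
    if not body.endswith("."):
        raise ValueError("a period is required after the last observation number")
    return sorted({int(token) for token in body[:-1].split(",")})
```

A blank inside the list of numbers, other than the trailing blanks that mark a continuation, makes `int` raise a `ValueError` in this sketch, which mirrors the rule that no blanks may appear around the commas.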

Example:

DROP 9,6,1,5.

In this example, the first, fifth, sixth, and ninth observations will be dropped. If the user wishes to drop a large number of observations, it may be advantageous to use the TRANS subroutine discussed in the "Additional Data Transformations" section.

N. TRANSFORMATION card (optional).

The "TRANSFORMATION" card allows the user to define new variables or redefine existing variables via simple transformations of existing variables. (For more complicated variable transformations, see the "Additional Data Transformations" section.) The writing of the "TRANSFORMATION" card is done using variable subscript numbers. The card is identified by the keyword "TRANSFORMATION" which must begin in column one. The only blank necessary is the one after the word "TRANSFORMATION" and there must be no blanks within the transformation definitions. The variable being defined or redefined appears on the left hand side of an equal sign and the variables used to define or redefine it appear on the right hand side. There must be no blanks on either side of the equal sign and the variables need not appear in any special order. Stars are used to separate the variables on the right hand side and there must be no blanks on either side of any star in a transformation definition. There is no limit to the number of variables used in a transformation definition as long as it fits on one card. A minus sign in front of a variable on the right hand side indicates that the reciprocal of that variable is wanted. There must be no blanks on either side of the minus sign. Commas are used to separate transformation definitions. There must be no blanks before or after a comma unless the card is to be continued. In such cases, the comma must be followed by at least one blank and the continuation card must begin in column two. Do not break up a transformation definition by continuing it on to the next card. There is no limit to the number of continuation cards or to the number of transformation definitions per card. A period is required after the last transformation definition and there must be no blanks preceding it.

Labels for the variables created by the "TRANSFORMATION" card are automatically generated using the characters of the transformation definition. When using the "TRANSFORMATION" card and/or the TRANS subroutine, the new variables must be numbered from NV+1 to NV+NT, where NV is the number of input variables and NT is the number of new variables created via transformations. (For a complete description of the TRANS subroutine, see the "Additional Data Transformations" section.)

Example:

TRANSFORMATION 4=1*3,2=-2,8=6*-7.

In this example, variable number four has been defined to be the product of variable number one and variable number three, variable number two has been redefined to be the reciprocal of itself, and variable number eight has been defined as variable number six times the reciprocal of variable number seven, that is, variable eight is variable six divided by variable seven. Note that a minus sign on this card denotes a reciprocal, not subtraction.
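The star and minus-sign conventions can be illustrated with a Python sketch that applies the three definitions described in the example above. The card string is written from that description using the stated syntax rules, so its exact spelling is an assumption; the functions `parse_transformations` and `apply_transformations` are hypothetical, continuation cards are not handled, and applying the definitions in card order is also an assumption.

```python
def parse_transformations(card):
    """Parse a single TRANSFORMATION card into (target, factors) pairs.

    Each factor is (variable number, is_reciprocal). Hypothetical helper.
    """
    keyword, _, body = card.partition(" ")
    if keyword != "TRANSFORMATION" or not body.endswith("."):
        raise ValueError("expected 'TRANSFORMATION definitions.'")
    definitions = []
    for definition in body[:-1].split(","):
        target, _, right_side = definition.partition("=")
        factors = []
        for token in right_side.split("*"):
            reciprocal = token.startswith("-")  # minus sign means 1/variable
            factors.append((int(token[1:] if reciprocal else token), reciprocal))
        definitions.append((int(target), factors))
    return definitions

def apply_transformations(definitions, x):
    """Apply the definitions, in card order, to one observation.

    `x` maps variable numbers to values and is updated in place.
    """
    for target, factors in definitions:
        value = 1.0
        for variable, reciprocal in factors:
            value = value / x[variable] if reciprocal else value * x[variable]
        x[target] = value
    return x
```

With the made-up observation `{1: 2.0, 2: 4.0, 3: 5.0, 6: 9.0, 7: 3.0}` and the card `"TRANSFORMATION 4=1*3,2=-2,8=6*-7."`, variable 4 becomes 10.0, variable 2 becomes 0.25 and variable 8 becomes 3.0.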


2. If NV ≥ 3, then NV*N ≤ 175931.

When raw data are being analyzed, there is essentially no limit to the number of observations that can be processed. If the user has a data set falling outside of the above limits but would still like to do a rank regression, then it would be necessary for the user to rank the data outside of the STEPWISE program and then not use the RANK REGRESSION card. However, using this approach will not provide raw $\hat{Y}$'s within the STEPWISE program.

Example:

RANK REGRESSION
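The two size limits for RANK REGRESSION (N ≤ 64980 when NV = 2, and NV*N ≤ 175931 when NV ≥ 3) can be checked before a run with a few lines of Python. The function name `rank_regression_fits` is hypothetical.

```python
def rank_regression_fits(nv, n):
    """Return True if N observations on NV input variables are within the limits."""
    if nv < 2:
        raise ValueError("at least one dependent and one independent variable are needed")
    if nv == 2:
        return n <= 64980
    return nv * n <= 175931
```

For instance, with NV = 3 the largest admissible data set under these limits has 58643 observations, since 3 * 58643 = 175929.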

O. END OF PARAMETERS card (required).

The "END OF PARAMETERS" card is used to indicate the end of parameter

card input.

Example:

END OF PARAMETERS

P. FORMAT card (required, if data are read from cards).

A format card is required if data are to be read from cards. The format card may contain D, E, and F and (in some cases) I numeric specifications and T and X positional specifications. The format must be enclosed in parentheses and conform to the syntax of the CDC FORTRAN IV variable format. The format may be continued on up to 10 cards.

Example:

(F2.1,F2.2,F4.0,1X,F1.0,3X,F2.0)


In most cases, the user will use the F and X format specifications.

Listed below is a sample data card with data punched in columns 1 through

15.

The preceding format would read in the data card assigning the values

listed below for the respective variables.

X(t) 1.

&tl 0.")7

X(3) 9b"64.O

x(4) 1.0

X(5 ) '6i":O

In the general case of Fw.d, w is the field width of the variable on the data card, i.e., the number of columns used to specify the value of the variable, and d is the number of decimal places, i.e., the number of columns from the right end of the field at which the decimal point should be placed. The variables are read in sequentially from 1 to NV, where NV is the number of variables to be read from the cards. The nX specification is used to skip columns on the data card, where n is the number of columns to be skipped. If more than one data card is used to input the data from one observation, a slash in the format is used to specify that reading should continue on the next card.

Example:

(F2.1,6F2.2,8F5.3,7F4.1/10F5.2)

The above format card would cause the first 22 variables to be read from

the first data card defining the first 22 values in the observation vector

and the next 10 (23 to 32) to be read from the second data card completing

that observation vector. There may be any number of cards per observation

vector.

If more than one card is necessary for specifying the format, it should be continued for however many cards are necessary without any indication in the format itself that it is being continued. In other words, it should be written just as it would be written on one long card.
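How the F, X and slash specifications carve up a data card can be illustrated with a Python sketch. It covers only repeat counts, Fw.d, nX and the slash; D, E, I and T are left out. The functions `parse_format` and `read_observation` are hypothetical, the data card in the usage note is made up, and treating a blank field as zero is an assumption of the sketch.

```python
import re

def parse_format(fmt):
    """Expand a format such as '(F2.1,6F2.2,1X/10F5.2)' into a flat list."""
    inner = fmt.strip()[1:-1]
    items = []
    for card_number, part in enumerate(inner.split("/")):
        if card_number > 0:
            items.append(("/",))              # continue on the next data card
        for token in part.split(","):
            token = token.strip()
            match = re.fullmatch(r"(\d*)F(\d+)\.(\d+)", token)
            if match:
                repeat = int(match.group(1) or 1)
                items += [("F", int(match.group(2)), int(match.group(3)))] * repeat
                continue
            match = re.fullmatch(r"(\d+)X", token)
            if match:
                items.append(("X", int(match.group(1))))
                continue
            raise ValueError("unsupported specification: " + token)
    return items

def read_observation(fmt, cards):
    """Read one observation vector from a list of data card strings."""
    values, card, column = [], 0, 0
    for item in parse_format(fmt):
        if item[0] == "/":
            card, column = card + 1, 0
        elif item[0] == "X":
            column += item[1]                 # skip n columns
        else:
            _, width, decimals = item
            field = cards[card][column:column + width].strip()
            column += width
            if not field:
                values.append(0.0)            # assumption: blank field reads as zero
            elif "." in field:
                values.append(float(field))   # a punched decimal point is used as is
            else:
                values.append(int(field) / 10 ** decimals)
    return values
```

With the format `(F2.1,F2.2,F4.0,1X,F1.0,3X,F2.0)` and the made-up 15-column card `"12379864 1   61"`, the sketch returns 1.2, 0.37, 9864.0, 1.0 and 61.0 for X(1) through X(5).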

Q. END OF DATA trailer card (required when reading data from cards).

In order that the user does not have to specify the number of observations he has in his data, the "END OF DATA" trailer card should follow the data. The trailer is necessary only when multiple sets of data are being processed in one computer run, where it separates the data of one data set from the parameters of the next. If only one set of data is being processed, there is no need for a trailer card, although it may be used. Do not use an END OF DATA card when DATDIS = 2.

Example:

END OF DATA

IV. ADDITIONAL DATA TRANSFORMATIONS

It is possible, in addition to the simple transformations specified by the "TRANSFORMATION" card, to create complicated transformations by use of a user supplied subroutine called TRANS. The user may redefine user supplied variables or create new variables as transformations or combinations of the variables read in. When using the TRANS subroutine and/or the "TRANSFORMATION" card, the new variables must be numbered from NV+1 to NV+NT, where NV is the number of input variables and NT is the number of new variables created via transformations. The simple variable transformations defined by the "TRANSFORMATION" card are always done first. The subroutine TRANS is then used to perform the more complicated transformations or to conditionally drop observations through use of IDROP as explained below. This subroutine may contain any legal FORTRAN statements. There are five variables made available to the subroutine via common block IMAN which the user may use either to make decisions while processing the input data or to modify the input status of an observation. These variables are NRAW, NTRANS, IDROP, IDUM, and IRANK and are discussed below. The general form of the subroutine is as follows:

          SUBROUTINE TRANS(X)
          DIMENSION X(180)
          COMMON/IMAN/NRAW,NTRANS,IDROP,IDUM,IRANK
              .
              .   any transformations
              .
          RETURN
          END

The above cards are the minimum requirements for the subroutine. The subroutine
requires a single argument, X, which is the array of input variables
for a single observation, i.e., the observation vector X. The variables in
common block IMAN are:

NRAW - is the current count of the raw observation being read in. It
may be used to make decisions when making variable transformations
based on the observation count, or it may be used as a label if values
are to be printed out. The value of NRAW must not be changed by the
programmer.

NTRANS - is the current count of the transformed observation being
processed. It will always be the same as NRAW except in the case of
dropping observations. In this case, NTRANS will be the count of
those observations which have not been dropped. The value of NTRANS
must not be changed by the programmer.

IDROP - is used to indicate that the user wishes to drop an observation
from the analysis. The value of IDROP is always zero upon entering
the subroutine. When IDROP is set to any non-zero value, the
observation vector being transformed at the time IDROP is set non-zero
will be dropped from the analysis.

IDUM - has no function within subroutine TRANS and the programmer
should not change the value of IDUM within TRANS.

IRANK - is set within the main program and indicates when the user has
requested RANK REGRESSION. When IRANK = 0, the usual regression analysis
on the raw data has been requested. When IRANK = 1, RANK REGRESSION
has been requested. The value of IRANK must not be changed by
the programmer.

Example of using the TRANS subroutine:

          SUBROUTINE TRANS(X)
          DIMENSION X(180)
          COMMON/IMAN/NRAW,NTRANS,IDROP,IDUM,IRANK
          IF(NRAW.EQ.1)FAT=0.0
          IF(X(9).NE.0.0)GO TO 4
          IDROP=1
          RETURN
        4 X(13)=(X(8)-X(7))/X(5)
          X(16)=X(11)/X(9)*100
          FAT=FAT+X(10)
          X(4)=ALOG(X(4))
          WRITE(3,101)(X(J),J=13,16),FAT,NTRANS
      101 FORMAT(20X,5F8.3,3X,I5)
          RETURN
          END

In the above example, assume 12 variables are read in from cards
(NV=12) according to the format specifications given by the user on the
variable format card. The first time through, i.e., when NRAW=1, the variable
FAT is set to zero. Each time through, X(9) is checked for being
zero, and in those cases when it is zero the observation is dropped for all
variables 1 through 16. If X(9) is non-zero, then X(13) and X(16) are
created as transformations of existing variables. Also, X(4) is transformed
to its log (base e) and X(10) is accumulated in the location named FAT.
Note that variables X(14) and X(15) are not defined in this subroutine.
They may have been previously defined by use of a TRANSFORMATION card;
thus the user is allowed complete freedom in creating transformed
variables. Also, when X(9) is non-zero, the values of the new variables
are printed out along with the cumulative sum of X(10) and the count of
transformed observations (NTRANS).
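For readers more familiar with current languages, the logic of the example can be mirrored as follows. This is only a sketch in Python, not code that STEPWISE can use. The class name `Trans`, the dictionary keyed by variable number, and the made-up observation are assumptions introduced for illustration; the printing step of the FORTRAN example is omitted.

```python
import math

class Trans:
    """Illustrative Python analogue of the example TRANS subroutine."""

    def __init__(self):
        self.fat = 0.0                # plays the role of the FORTRAN variable FAT

    def __call__(self, x, nraw):
        """x maps 1-based variable numbers to values; return True to drop (IDROP=1)."""
        if nraw == 1:
            self.fat = 0.0            # reset the accumulator on the first observation
        if x[9] == 0.0:
            return True               # drop the observation; nothing else is done
        x[13] = (x[8] - x[7]) / x[5]
        x[16] = x[11] / x[9] * 100
        self.fat += x[10]
        x[4] = math.log(x[4])         # natural log, as ALOG
        return False

trans = Trans()
obs = {i: float(i) for i in range(1, 13)}   # made-up observation with NV = 12
dropped = trans(obs, nraw=1)
print(dropped, obs[13], trans.fat)          # False 0.2 10.0
```

Keeping the running sum on the object is the sketch's stand-in for FAT retaining its value between calls of the subroutine.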

An example of the use of IRANK is as follows. Suppose the user is making
multiple passes of the same data set, some of the passes involve RANK
REGRESSION while the remaining passes involve the usual regression on the
raw data, and furthermore some of the data is to be dropped from the raw
analysis but not dropped from the rank analysis. The programmer can then
make use of the passed value of IRANK to determine whether an observation
should be dropped or whether it should remain in the analysis for that particular
run. That is, if IRANK = 1, then you would jump over the statements
that set IDROP = 1. This prevents the deletion of the observation when
RANK REGRESSION is requested.
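The branching described here amounts to testing IRANK before any statement that sets IDROP. A minimal Python sketch of that decision follows; the function name `idrop_flag` and the rule "drop when X(9) is zero" are hypothetical, borrowed from the earlier example purely for illustration.

```python
def idrop_flag(x9, irank):
    """Return 1 to drop the observation, 0 to keep it."""
    if irank == 1:            # RANK REGRESSION requested: skip the drop logic
        return 0
    return 1 if x9 == 0.0 else 0

print(idrop_flag(0.0, irank=0), idrop_flag(0.0, irank=1))   # 1 0
```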

V. OUTPUT OF THE REGRESSION COEFFICIENTS AND THE OBSERVATIONS WITH THEIR PREDICTED VALUES WHEN USING RANK REGRESSION

The user often has need of the regression coefficients from a particular
fitted model. To accommodate these users, STEPWISE automatically
writes the coefficients onto TAPE 19, and the user can access
them as described below.

For each distinct ANOVA table that STEPWISE produces in a run, a unique
sequence number is attached to that ANOVA table and is printed at the
bottom of the table. This sequence number begins with "101" and increments
in unit steps. Also, the "number" of the variable that is referred to
below is the number given to that variable when it is read in (or generated
via transformations) and corresponds to the number of the variable that is
used in the MODEL card.

The information written on TAPE 19 is arranged as follows. For each
unique sequence number, a group of card images is written on TAPE 19, followed
by two blank card images. Within this group of card images the first
card contains the following information:

    Columns  1 - 22    "Unique Sequence No. = "
    Columns 23 - 27    Contains the unique sequence number for this group
                       of cards in an I5 format.
    Columns 28 - 66    "ANALYSIS FOR DEPENDENT VARIABLE"
    Columns 67 - 69    Contains the number of the dependent variable used
                       in this analysis in an I3 format.
    Columns 70 - 72    "   "
    Columns 73 - 80    Contains the label of the dependent variable (given
                       in columns 67 - 69) in an A8 format.

The second card contains the TITLE (see Section II-A) in an 80 column alphameric
format. All remaining cards within this group of cards are
written in the following format, with the explanation following:

    FORMAT(1X,I3,1X,I2,1X,4(I3,E15.8))

    Columns  2 - 4     The number of independent variables included in the
                       fitted model, excluding the constant term.
    Columns  6 - 7     A card number that is unique within this group of
                       cards.
    Columns  9 - 80    Contains 4 sets of values, each set containing the
                       number of the independent variable included in the
                       fitted model followed by the estimate of the coefficient.

Note: The first coefficient on the first card within a group always
contains the constant term and its "number" is zero.
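Once the card images have been copied to a text file, a coefficient card can be split by column position. The Python sketch below is an illustration under stated assumptions: the function name `parse_coefficient_card` is hypothetical, unused sets on the last card of a group are assumed to be blank, and the sample card image was constructed by hand from the layout above (using the final equation of the Section IX example), not copied from an actual TAPE 19.

```python
def parse_coefficient_card(card):
    """Split one coefficient card laid out as (1X,I3,1X,I2,1X,4(I3,E15.8))."""
    card = card.ljust(80)
    n_vars = int(card[1:4])           # columns 2 - 4
    card_no = int(card[5:7])          # columns 6 - 7
    pairs = []
    for k in range(4):                # four (I3,E15.8) sets in columns 9 - 80
        start = 8 + 18 * k
        var_field = card[start:start + 3]
        coef_field = card[start + 3:start + 18]
        if not var_field.strip() or not coef_field.strip():
            continue                  # assumed: unused sets are left blank
        pairs.append((int(var_field), float(coef_field)))
    return n_vars, card_no, pairs

# Hand-built card image: constant term (number 0), then variables 1 and 4.
card = ("   2  1 "
        + "  0" + " 0.10309738E+03"
        + "  1" + " 0.14399583E+01"
        + "  4" + "-0.61395363E+00")
print(parse_coefficient_card(card))
```

The variable number 0 on the first card of a group identifies the constant term, as noted above.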

In order to access this information the user may use one of the following
procedures. The user could catalog TAPE 19 as a permanent file and
then read whatever information he desires from the permanent file. For
example, the following sequence would catalog TAPE 19.

    LDSET,PRESET=INF.
    STEP,PL=40000.
    CATALOG,TAPE19,THIS-FILE,CY=1,CN=DEFAULTPW.
    EXIT.
    EXIT.

If the user does not wish to catalog TAPE 19 as a permanent file but
desires punched cards, then the following sequence would be appropriate.

    LDSET,PRESET=INF.
    STEP,PL=40000.
    REWIND,TAPE19.
    COPYCF,TAPE19,PUNCH.
    EXIT.
    EXIT.

Or, if the user wishes to have a print-out of the card images along
with the punched cards, then the following would suffice.

    LDSET,PRESET=INF.
    STEP,PL=40000.
    REWIND,TAPE19.
    COPYCF,TAPE19,PUNCH.
    REWIND,TAPE19.
    COPYSBF,TAPE19,OUTPUT.
    EXIT.
    EXIT.

As mentioned in Section II-K, the user can obtain the original observations
and the predicted raw Ŷ's when using RANK REGRESSION by retrieving
them from TAPE 20.

When requesting RANK REGRESSION and PLOT RESIDUALS, the plots given
will be of the ranks of the data and the predicted ranks or rank residuals
(whichever is appropriate). That is, none of the plots involve any of the
raw data. Therefore, if the user wishes to see plots of the original observations
versus the raw Ŷ's, residuals, etc., then this must be done separately
from the STEPWISE program.

To facilitate this type of further analysis, the STEPWISE program is
set up to automatically write the original observations and the raw Ŷ's on
TAPE 20 as a binary file. If the user wants to save these for further
analysis, then TAPE 20 should be cataloged as a permanent file as explained
in the previous part of this section.

The data file on TAPE 20 is created in the following manner. All records
are written in unformatted binary code. The first record is the TITLE
card information, which is 80 columns of alphameric characters. The second
record is an integer variable that gives the number of observations. All
of the remaining records consist of two floating point numbers. The first
is the original observed Y-value and the second is the predicted raw value
of Y (i.e., the raw Ŷ). The number of records of pairs of data will be
equal to the number of observations.

VI. DECK SETUP FOR STEPWISE

On the following page is an example of how to set up the cards for use
in running STEPWISE. Cards 1 - 15, 22 - 32, and 46 - 64 all begin in card
column one, while the Fortran statements in cards 16 - 21 start in column
7. Cards 33 - 45 are data cards in 5F6.0 format. Card number 1 is the
JOB card and needs to be changed only to show the user's name and the box
number. Note also that, due to the large core memory requirements of STEPWISE,
the extended core parameter must be set at 1200. Card number 2 must
contain the user's social security number, division number and charge number.
Cards 3 - 14 are control cards and will remain the same for all runs,
the only exception being the use of the TAPE 19 and TAPE 20 control cards that
are explained in Section V. Cards 16 - 21 provide a dummy subroutine for
transformations, with any desired transformations (see page 17) following
card number 19. Cards 23 - 32 are parameter cards (see pages 2-17). If
the data is on cards, it will appear starting on card 33 with one vector
of observations per card(s), followed by the END OF DATA card. Cards
47 - 53 reprocess the data with a backward solution. Note the absence of
the LABEL card. The labels used in the first pass of the data will be
used for this second analysis. Cards 54 - 64 reprocess the data with a
stepwise solution and RANK regression.

Card No.            DECK SETUP FOR STEPWISE

    1    STE,T10,EC1200.
    2    ACCOUNT,S509423684,D1223,G13,A0188000,RP,KUNC.
    3    FTN,R=2,B=TRANZ.
    4    REWIND,TRANZ.
    5    ATTACH,OBJECT,STEPWISE-RLI.
    6    REWIND,OBJECT.
    7    COPYL,OBJECT,TRANZ,STEP.
    8    ATTACH,SUBS,D1223-7600-LIBRARY.
    9    ATTACH,FXDPLIB,FXDPLIB.
   10    LIBRARY,SUBS,FXDPLIB.
   11    LDSET,PRESET=INF.
   12    STEP,PL=40000.
   13    EXIT.
   14    EXIT.
   15    (END OF RECORD -- MULTI-PUNCH 7 8 9 IN COL 1)
   16          SUBROUTINE TRANS(X)
   17          COMMON/IMAN/NRAW,NTRANS,IDROP,IDUM,IRANK
   18          DIMENSION X(180)
   19    C     ANY TRANSFORMATIONS GO HERE
   20          RETURN
   21          END
   22    (END OF RECORD -- MULTI-PUNCH 7 8 9 IN COL 1)
   23    TITLE,EXAMPLE OF STEPWISE OPTION (DATA FROM DRAPER AND SMITH, P. 365 - 402)
   24    DATA,5,0,1.
   25    LABEL(1)=X1,X2,X3,X4,Y
   26    MODEL,5=1+2+3+4.
   27    STEPWISE,SIGIN=.05,SIGOUT=.10
   28    PRESS
   29    OUTPUT,ALL
   30    PLOT RESIDUALS
   31    END OF PARAMETERS
   32    (5F6.0)
   33        7.   26.    6.   60.  78.5
   34        1.   29.   15.   52.  74.3
   35       11.   56.    8.   20. 104.3
   36       11.   31.    8.   47.  87.6
   37        7.   52.    6.   33.  95.9
   38       11.   55.    9.   22. 109.2
   39        3.   71.   17.    6. 102.7
   40        1.   31.   22.   44.  72.5
   41        2.   54.   18.   22.  93.1
   42       21.   47.    4.   26. 115.9
   43        1.   40.   23.   34.  83.8
   44       11.   66.    9.   12. 113.3
   45       10.   68.    8.   12. 109.4
   46    END OF DATA
   47    TITLE,EXAMPLE OF BACKWARD OPTION (DATA FROM DRAPER & SMITH, P. 364 - 402)
   48    DATA,5,0,2.
   49    MODEL,5=1+2+3+4.
   50    BACKWARD,SIG=.05
   51    OUTPUT,CORR,STEPS,RESIDUALS
   52    PRESS
   53    END OF PARAMETERS
   54    TITLE,EXAMPLE OF RANK REGRESSION WITH STEPWISE OPTION (DATA FROM DRAPER/SMITH)
   55    DATA,5,0,2.
   56    LABEL(1)=RANK(X1),RANK(X2),RANK(X3),RANK(X4),RANK(Y)
   57    MODEL,5=1+2+3+4.
   58    STEPWISE,SIGIN=.05,SIGOUT=.10
   59    PRESS
   60    RANK REGRESSION
   61    OUTPUT,CORR,STEPS,RESIDUALS
   62    END OF PARAMETERS
   63    (END OF INFORMATION -- MULTI-PUNCH 6 7 8 9 IN COL 1)
   64    (END OF INFORMATION -- MULTI-PUNCH 6 7 8 9 IN COL 1)

VII. DECK SETUP FOR STANDARDIZING DATA

Any of the variables may be standardized by using the Deck Setup on
the following 2 pages. Remarks on page 22 explain the control cards 1 - 15.
Cards 18 - 30 explain how to use the standardizing program. The user
should note that card 43 must be made to match the data, and that the DATA
parameter card must have a 2 as the third parameter (see page 2).

Card decks of this standardizing program are available from the authors

of this report.

The standardizing program can be replaced by any program that the

user would desire in terms of manipulating the input data set.

Card No.      DECK SETUP FOR STANDARDIZING DATA BEFORE RUNNING STEPWISE

    1    STE,T10,EC1200.
    2    ACCOUNT,S509423684,D1223,G13,A0188000,RP,KUNC.
    3    FTN,R=2,B=STAN.
    4    LDSET,PRESET=ZERO.
    5    STAN.
    6    FTN,R=2,B=TRANZ.
    7    REWIND,TRANZ.
    8    ATTACH,OBJECT,STEPWISE-RLI.
    9    REWIND,OBJECT.
   10    COPYL,OBJECT,TRANZ,STEP.
   11    ATTACH,SUBS,D1223-7600-LIBRARY.
   12    ATTACH,FXDPLIB,FXDPLIB.
   13    LIBRARY,SUBS,FXDPLIB.
   14    LDSET,PRESET=INF.
   15    STEP,PL=40000.
   16    (END OF RECORD -- MULTI-PUNCH 7 8 9 IN COL 1)
   17    C
   18    C     THIS PROGRAM WILL STANDARDIZE ANY OF THE INPUT VARIABLES. FOR
   19    C     STANDARDIZING A CARD MUST BE PLACED IN FRONT OF THE DATA WITH THE
   20    C     NUMBER OF VARIABLES (NV) RIGHT JUSTIFIED IN COLUMNS 1 - 3. THE
   21    C     REST OF THIS CARD (AND POSSIBLY CONTINUING ONTO A SECOND CARD DE-
   22    C     PENDING ON THE NUMBER OF VARIABLES) WILL BE FILLED WITH EITHER ZEROS
   23    C     OR ONES. A 1 PUNCHED IN COLUMN 4 WILL INDICATE THAT VARIABLE NUM-
   24    C     BER ONE IS TO BE STANDARDIZED, WHILE A 0 PUNCHED IN THIS COLUMN WILL
   25    C     INDICATE THAT VARIABLE NUMBER 1 IS NOT TO BE STANDARDIZED. VAR-
   26    C     IABLE NUMBER 2 IS LIKEWISE FLAGGED IN COLUMN 5, VARIABLE NUMBER 3 IN
   27    C     COLUMN 6, ETC.
   28    C
   29    C     THE FIRST DIMENSION OF THE DATA MATRIX SHOULD BE AT LEAST ONE
   30    C     GREATER THAN THE NUMBER OF OBSERVATIONS, WHILE THE SECOND DIMEN-
   31    C     SION MUST BE AT LEAST EQUAL TO NV.
   32    C
   33          PROGRAM STAND(INPUT,OUTPUT,TAPE10,TAPE5=INPUT)
   34          DIMENSION IFLAG(101),DATA(10000,9)
   35          READ(5,105)NV,(IFLAG(J),J=1,NV)
   36      105 FORMAT(I3,77I1/24I1)
   37          NV1 = NV - 1
   38    C     READ IN DATA. FORMAT 100 MUST CONFORM TO USERS DATA.
   39    C
   40          I=1
   41        1 READ(5,100)(DATA(I,J),J=1,NV)
   42          IF(EOF(5))2,3
   43      100 FORMAT(2X,7F10.5,4XF4.0)
   44        3 CONTINUE
   45          I = I + 1
   46          GO TO 1
   47        2 I = I - 1
   48          PRINT 102
   49      102 FORMAT(*1MEANS AND ST. DEVS. OF VARIABLES THAT WERE STANDARD
   50         1IZED*,//,*VARNO. MEAN ST.DEV.*)
   51          DO 4 K = 1,NV
   52          IF(IFLAG(K).EQ.0) GO TO 8
   53          XB = 0.
   54          SD = 0.
   55          DO 5 J = 1,I
   56          XB = XB + DATA(J,K)
   57        5 SD = SD + DATA(J,K)**2
   58          XB = XB/I
   59          SD = SQRT((SD - I*XB**2)/(I-1))
   60          DO 6 J = 1,I
   61        6 DATA(J,K) = (DATA(J,K) - XB)/SD
   62          PRINT 101,K,XB,SD
   63      101 FORMAT(*0*,I4,2F12.5)
   64          GO TO 4
   65        8 PRINT 106,K
   66      106 FORMAT(*0*,I4,* THIS VARIABLE WAS NOT STANDARDIZED*)
   67        4 CONTINUE
   68          DO 7 J = 1,I
   69        7 WRITE (10)(DATA(J,K),K=1,NV)
   70          ENDFILE 10
   71          REWIND 10
   72          CALL EXIT
   73          END
   74    (END OF RECORD -- MULTI-PUNCH 7 8 9 IN COL 1)
   75
    .    PARAMETER CARD(S) FOR STANDARDIZING VARIABLES GOES HERE
    .    FOLLOWED BY DATA CARDS
   82
   83    (END OF RECORD -- MULTI-PUNCH 7 8 9 IN COL 1)
   84          SUBROUTINE TRANS(X)
   85          COMMON/IMAN/NRAW,NTRANS,IDROP,IDUM,IRANK
   86          DIMENSION X(180)
   87    C     ANY TRANSFORMATIONS GO HERE
   88          RETURN
   89          END
   90    (END OF RECORD -- MULTI-PUNCH 7 8 9 IN COL 1)
   91
   92    PARAMETER CARDS GO HERE
   93
   94    DATA CARD MUST HAVE A 2 AS THE THIRD PARAMETER --
   95
   96    THIS INDICATES DATA IS ON TAPE 10 (DISK)
   97
   98    (END OF INFORMATION -- MULTI-PUNCH 6 7 8 9 IN COL 1)
   99    (END OF INFORMATION -- MULTI-PUNCH 6 7 8 9 IN COL 1)
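The arithmetic performed by the standardizing program can be summarized in a few lines. The Python sketch below is a hedged restatement for illustration only; the function name `standardize` and the list-of-rows layout are assumptions. Like the FORTRAN above, it uses the one-pass sum of squares with an n - 1 divisor and leaves unflagged variables untouched. The two columns of sample data are X1 and Y from the deck setup in Section VI.

```python
import math

def standardize(rows, flags):
    """Standardize the flagged columns of rows in place; return {variable number: (mean, st. dev.)}."""
    n = len(rows)
    summary = {}
    for k, flag in enumerate(flags):
        if not flag:
            continue                                  # this variable was not standardized
        mean = sum(r[k] for r in rows) / n
        ss = sum(r[k] ** 2 for r in rows)
        sd = math.sqrt((ss - n * mean ** 2) / (n - 1))
        for r in rows:
            r[k] = (r[k] - mean) / sd
        summary[k + 1] = (mean, sd)                   # variables are numbered from 1
    return summary

x1 = [7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10]
y = [78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9, 83.8, 113.3, 109.4]
rows = [[float(a), b] for a, b in zip(x1, y)]
print(standardize(rows, flags=[1, 0]))                # only variable 1 is standardized
```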


VIII. BIBLIOGRAPHY

Allen, D. M. (1971). The Prediction Sum of Squares as a Criterion for Selecting Predictor Variables. Technical Report 23, University of Kentucky, Department of Statistics.

Iman, R. L. (1976). STEPWISE Regression. Technical Report, SAND76-0364, Sandia Laboratories, Albuquerque, NM.

Iman, R. L. and Conover, W. J. (1979). The Use of the Rank Transform in Regression. Technometrics, 21, 499-509.


IX. EXAMPLE


SANDIA LABORATORIES <> STEPWISE REGRESSION PROGRAM <> COURTESY OF DEPT. OF STATISTICS - KANSAS STATE UNIVERSITY

TITLE,EXAMPLE OF STEPWISE OPTION (DATA FROM DRAPER AND SMITH, P. 365-402)
    (handwritten: TITLE APPEARS ON EACH PAGE)
DATA,5,0,1.
    (handwritten: 5 INPUT VARIABLES. NO. TRANSFORMED VARIABLES. DATA WILL BE READ FROM CARDS AND SAVED ON DISK.)

    INPUT CHECK OF PARAMETERS
    NUMBER OF VARIABLES READ IN        5
    NO. OF TRANSFORMED VARIABLES       0
    DATA DISPOSITION IS                1

LABEL(1)=X1,X2,X3,X4,Y
MODEL,5=1+2+3+4.
    (handwritten: 5 IS THE DEPENDENT VARIABLE; 1, 2, 3, 4 ARE INDEPENDENT VARIABLES)
STEPWISE,SIGIN=.05,SIGOUT=.10
    (handwritten: SIGNIFICANCE LEVEL FOR ENTERING VARIABLES)
PRESS
    (handwritten: CALCULATE AND PLOT PRESS VALUES)
OUTPUT,ALL
    (handwritten: OUTPUT: SUMS OF SQUARES MATRIX, CORRELATION MATRIX, INVERSE
     CORRELATION MATRIX, STEPS IN REGRESSION, AND RESIDUALS)
PLOT RESIDUALS
    (handwritten: OUTPUT LINE PRINTER PLOTS)
END OF PARAMETERS
FORMAT (5F6.0)

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

INPUT CHECK ON DATA

FIRST OBSERVATION
         X1           X2           X3           X4            Y
     7.000000    26.000000     6.000000    60.000000    78.500000

NO. RAW DATA INPUT =               13
NO. TRANSFORMED OBSERVATIONS       13
NO. OF OBSERVATIONS DROPPED         0

(STAT CONTROL CARD)
(STAT CONTROL CARD)
(STAT CONTROL CARD)
(STAT CONTROL CARD)
(STAT CONTROL CARD)
(STAT CONTROL CARD)
(STAT CONTROL CARD)

VARIABLE
NUMBER   NAME       MEAN      VARIANCE    STD. DEV.   STD. ERR.     C.V.
   1      X1      7.46154     34.6026      5.88239     1.63148     78.84
   2      X2      48.1538     242.141      15.5609     4.31581     32.31
   3      X3      11.7692     41.0256      6.40513     1.77646     54.42
   4      X4      30.0000     280.167      16.7382     4.64234     55.79
   5      Y       95.4231     226.314      15.0437     4.17238     15.77

13 OBSERVATIONS

    (handwritten: S.E. = ST. DEV./SQRT(13);  C.V. = (ST. DEV./MEAN) x 100)

SUM OF SQUARES MATRIX
   1    4.152E+02
   2    2.511E+02   2.906E+03
   3   -3.726E+02  -1.665E+02   4.923E+02
   4   -2.900E+02  -3.041E+03   3.800E+01   3.362E+03
   5    7.760E+02   2.293E+03  -6.182E+02  -2.482E+03   2.716E+03

CORRELATION MATRIX
    (handwritten: CORRELATIONS FOR ALL THE INPUT DATA)
NO.  NAME       X1        X2        X3        X4         Y
 1    X1      1.0000
 2    X2       .2286    1.0000
 3    X3      -.8241    -.1392    1.0000
 4    X4      -.2454    -.9730     .0295    1.0000
 5    Y        .7307     .8163    -.5347    -.8213    1.0000

    (handwritten: HIGHEST CORR. => X4 IS FIRST INDEPENDENT VARIABLE ADDED TO THE MODEL)

    (handwritten: PARTIAL CORRELATIONS AFTER X4 IS ADDED TO MODEL)

    $r_{25.4} = \frac{.8163 - (-.9730)(-.8213)}{\sqrt{(1-(-.9730)^2)(1-(-.8213)^2)}} = .1304$

    $r_{35.4} = \frac{-.5347 - (.0295)(-.8213)}{\sqrt{(1-(.0295)^2)(1-(-.8213)^2)}} = -.8952$

    $r_{15.4} = \frac{.7307 - (-.2454)(-.8213)}{\sqrt{(1-(-.2454)^2)(1-(-.8213)^2)}} = .9568$

    (handwritten: HIGHEST PARTIAL CORR. => X1 WILL BE THE SECOND VARIABLE ADDED AFTER X4)
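The handwritten partial correlations on this page of the example follow the usual first-order formula and can be checked numerically. The Python sketch below is illustrative only; the function name `partial_corr` is an assumption, and the correlations are the four-decimal values from the correlation matrix of the example (variable 5 is Y).

```python
import math

def partial_corr(r_iy, r_ik, r_ky):
    """Correlation of variable i with Y after removing the effect of variable k."""
    return (r_iy - r_ik * r_ky) / math.sqrt((1 - r_ik ** 2) * (1 - r_ky ** 2))

r_with_y = {1: 0.7307, 2: 0.8163, 3: -0.5347}     # correlations of X1, X2, X3 with Y
r_with_x4 = {1: -0.2454, 2: -0.9730, 3: 0.0295}   # correlations of X1, X2, X3 with X4
r_x4_y = -0.8213

partials = {i: partial_corr(r_with_y[i], r_with_x4[i], r_x4_y) for i in (1, 2, 3)}
best = max(partials, key=lambda i: abs(partials[i]))
print({i: round(v, 4) for i, v in partials.items()}, "next variable:", best)
```

Under these inputs X1 has the largest partial correlation in absolute value, which is why it is the second variable to enter after X4.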

INVERSE OF CORR MATRIX
    (handwritten: FOR INDEPENDENT VARIABLE SELECTED IN STEP 1)
NO.  NAME       X4
 4    X4     1.000E+00

AOV TABLE    ANALYSIS OF REGRESSION FOR VARIABLE 5---Y    (TABLE 1)
    (handwritten: THE DEPENDENT VARIABLE IS INPUT VARIABLE 5)

SOURCE        D.F.        SS             MS             F         SIGNIFICANCE
REGRESSION      1      1831.8962      1831.8962      22.79852        .0006
RESIDUAL       11      883.86692      80.351538
TOTAL          12      2715.7631

R**2 IS .67454     INTERCEPT IS 117.56793     STANDARD ERROR OF INTERCEPT IS 5.26221

VARIABLE  VARIABLE   REGRESSION     STANDARDIZED REGRESSION   T-TEST VALUES   R**2      ALPHA
NUMBER    NAME       COEFFICIENTS   COEFFICIENTS              11 D.F.         DELETES   HATS
   4       X4        -.73816181     -.8213                    -4.7748         0.0000    .0006

    (handwritten: T-TEST VALUE IS USED FOR TESTING $H_0: \beta_4 = 0$)
    (handwritten: STANDARDIZED COEFFICIENT IS THE UNIT FREE BETA, I.E.,
     $\beta^* = \hat{\beta}\sqrt{\sum (X_i - \bar{X})^2 / \sum (Y_i - \bar{Y})^2}$)
    (handwritten: THE $\beta^*$'S CAN BE USED FOR ORDERING THE RELATIVE IMPORTANCE,
     WHEREAS THE USUAL $\hat{\beta}$'S CANNOT BE USED FOR ORDERING)

UNIQUE SEQUENCE NUMBER FOR THIS ANOVA IS 101
    (handwritten: THIS IS THE 1ST ANOVA TABLE WITHIN THIS RUN OF STEPWISE.)
PRESS IS 1194.

    (handwritten: REGRESSION EQUATION: Ŷ = 117.56793 - .73816181 * X4)

INVERSE OF CORR MATRIX
    (handwritten: FOR INDEPENDENT VARIABLES SELECTED IN STEP 2)
NO.  NAME       X1           X4
 1    X1     1.064E+00
 4    X4     2.612E-01    1.064E+00

AOV TABLE    ANALYSIS OF REGRESSION FOR VARIABLE 5---Y    (TABLE 1)

SOURCE        D.F.        SS             MS             F         SIGNIFICANCE
REGRESSION      2      2641.0010      1320.5005      176.627         .0000
RESIDUAL       10      74.762112      7.4762112
TOTAL          12      2715.7631

R**2 IS .97247     INTERCEPT IS 103.09738     STANDARD ERROR OF INTERCEPT IS 2.12398

VARIABLE  VARIABLE   REGRESSION     STANDARDIZED REGRESSION   T-TEST VALUES   R**2      ALPHA
NUMBER    NAME       COEFFICIENTS   COEFFICIENTS              10 D.F.         DELETES   HATS
   1       X1        1.4399583       .5631                    10.4031         .6745     .0000
   4       X4        -.61395363     -.6831                    -12.62          .5339     .0000

    (handwritten: R**2 DELETES IS THE VALUE THAT R**2 WILL DECREASE TO IF THE
     VARIABLE IS DELETED FROM THE MODEL)
    (handwritten: SIGNIFICANCE LEVEL HAD TO BE < SIGIN = .05 FOR THE VARIABLE TO ENTER THE MODEL)

UNIQUE SEQUENCE NUMBER FOR THIS ANOVA IS 102
PRESS IS 121.

    (handwritten: FINAL REGRESSION EQUATION: Ŷ = 103.09738 + 1.4399583 * X1 - .61395363 * X4)

[Line-printer plot: PRESS (vertical axis, .1212E+03 to .1194E+04) versus STEP NO. (horizontal axis).]

    (handwritten: PRESS VALUES FOR THE TWO STEPS. PRESS = 1194. FOR STEP 1. PRESS = 121. FOR STEP 2.)
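The two PRESS values can be reproduced directly from the data in the deck setup by leaving out one observation at a time, refitting, and summing the squared prediction errors for the omitted observations. The Python sketch below does this by brute force. It is illustrative only and makes no claim about how STEPWISE computes PRESS internally; the function names `ols` and `press` are assumptions.

```python
X1 = [7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10]
X4 = [60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12]
Y = [78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9, 83.8, 113.3, 109.4]

def ols(rows, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Augmented system [X'X | X'y]
    A = [[sum(xi[a] * xi[b] for xi in X) for b in range(p)]
         + [sum(xi[a] * yi for xi, yi in zip(X, y))] for a in range(p)]
    for c in range(p):                    # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(p):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[a][p] for a in range(p)]

def press(rows, y):
    """Sum of squared errors when each observation is predicted from a fit that omits it."""
    total = 0.0
    for i in range(len(y)):
        beta = ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        pred = beta[0] + sum(b * x for b, x in zip(beta[1:], rows[i]))
        total += (y[i] - pred) ** 2
    return total

step1 = press([[d] for d in X4], Y)                    # model with X4 only
step2 = press([[a, d] for a, d in zip(X1, X4)], Y)     # model with X1 and X4
print(round(step1), round(step2))                      # 1194 121
```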

[Two pages of line-printer plots; not legible in this copy.]

[Line-printer plot: RESIDUAL (vertical axis, -.5023E+01 to .3770E+01) versus observation number, 1 to 13.]

    (handwritten: ORDER IN WHICH DATA WAS READ IN)

[Line-printer plot: RESIDUAL versus predicted value.]

    (handwritten: PREDICTIONS USING FINAL REGRESSION EQUATION)

[Line-printer plot: RESIDUAL versus X1.]

[Line-printer plot; not legible in this copy.]

TABLE OF RESIDUALS FOR VARIABLE 5---Y

TIME    (OBSERVED VALUE)    (PREDICTED)    (RESIDUAL)
              Y                  Ŷ           (Y - Ŷ)
  1        78.5000            76.3399        2.16013
  2        74.3000            72.6117        1.68825
  3        104.300            106.658       -2.35785
  4        87.6000            90.0811       -2.48110
  5        95.9000            92.9166        2.98338
  6        109.200            105.430        3.77006
  7        102.700            103.734       -1.03353
  8        72.5000            77.5234       -5.02338
  9        93.1000            92.4703        .629682
 10        115.900            117.374       -1.47371
 11        83.8000            83.6629        .137083
 12        113.300            111.569        1.73052
 13        109.400            110.130       -.729521

    (handwritten: END OF ANALYSIS FOR 1ST PASS ON THE DATA.)
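The entries in the table of residuals follow from the final regression equation of the first pass. The short Python sketch below is illustrative only; the function name `predict` is an assumption, and because it uses the printed eight-digit coefficients its results can differ from the listing in the last printed digit.

```python
X1 = [7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10]
X4 = [60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12]
Y = [78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9, 83.8, 113.3, 109.4]

def predict(x1, x4):
    """Final regression equation from the stepwise pass of the example."""
    return 103.09738 + 1.4399583 * x1 - 0.61395363 * x4

residuals = [y - predict(a, d) for a, d, y in zip(X1, X4, Y)]
for time, (y, e) in enumerate(zip(Y, residuals), start=1):
    print(f"{time:3d} {y:10.4f} {y - e:10.4f} {e:10.5f}")
```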

TITLE,EXAMPLE OF BACKWARD OPTION (DATA FROM DRAPER AND SMITH, P. 365-402)
    (handwritten: NOTE THE LABELS CARD IS MISSING: LABELS WILL CARRY OVER FROM 1ST RUN.)
DATA,5,0,2.
    (handwritten: DATA WAS SAVED ON DISK ON THE FIRST PASS.)

    INPUT CHECK OF PARAMETERS
    NUMBER OF VARIABLES READ IN        5
    NO. OF TRANSFORMED VARIABLES       0
    DATA DISPOSITION IS                2

MODEL,5=1+2+3+4.
BACKWARD,SIG=.05
    (handwritten: SIG. LEVEL FOR DELETING VARIABLES)
OUTPUT,CORR,STEPS,RESIDUALS
    (handwritten: NO INVERSE, NO SUMS OF SQUARES MATRIX)
PRESS
END OF PARAMETERS

(STAT CONTROL CARD)
(STAT CONTROL CARD)
(STAT CONTROL CARD)
(STAT CONTROL CARD)
(STAT CONTROL CARD)
(STAT CONTROL CARD)

Page 45: Stepwise Regression With PRESS and Rank Regression

... ....

Xi

7.000£000

YU!A8L£ NAME

Xl ~2 .3· •• y

'Z

'T!TLE,EX.al'lPL£ 01= ::!A,Ct(WA~O OllTION eona f'~O" :lP:l:J::~ ANO SMITI-4. p~ 3f..5-'+021

SANOIA LAAO'::ATOFtfS c:»-c,. STf.PWI3iE ~~:;R~iSloN <:.ou, F~Ot.ll KAN'SA3 Sf ATC UN.I"E~SITV

"

I~IDUT ~~ECK ON 081A

FI~ST O~S~w'ATION '. • 5

;:>6.C~()COC 6. ,)tjOu OGD M.lIJDODO ,'.;;'BDDDa

NO. RAW !JATA I.,.PJT l-:!

NO. TRAt'SFOR"'IEl llJ'SERIt'ATI')N5 13

~n. Of' rBSEeyATIJN$ OROPPE)

TITLE,:XAHPLE OF QlCKMARO O'TION (DATA ~~~ ~P'~=_ AND S~IT~. p. J&~-_G2'

SA~OIA lIB~AT~QI~~ <~<. STEPwtiE ~EG~ESSIO~ ~.<. F~OM KANSAS STATE UNlvl~SITV

VAPIABlF IIIUI49F.;;;> "'FAN VA~UN:E 511'). 'lEV. STD. EPf.

t 7 ."f;1r::;:~ 3"-.f.O'Z6 5.e"z3Q 1-631"6 2 ItS.tS] ~ Z1o.2.1_1 15. r;.61l q It.!l')"!! 3 It.16Q? 41.Jl256 fJ .... U51J 1.776ft€:

• 31J.1H3 C 211;0.1&7 "1;.13S2 4,£,1.234

• ~t;.42,H 22~.314 15.0457 .It.t723R

13 OBSEltflTIONS

TITLE,EXAMPLE OF BACKWARD OPTION (DATA FROM DRAPER AND SMITH, P. 365-402)

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

CORRELATION MATRIX     (ROUNDING ERROR IN ROW)

NAME   NO.    X1        X2        X3        X4        Y
X1     1      1.0000    .2286    -.8241    -.2454     .7307
X2     2                1.0000   -.1392    -.9730     .8163
X3     3                          1.0000    .0295    -.5347
X4     4                                    1.0000   -.8213
Y      5                                              1.0000

• FU\6- lOIDIC/tTlol6- -V -..~ INPUT .... ~5. at:: MATllI. 15 ~ A HESISMoE WlLL." ,...III"ft=.'D AIIO f'Itoc.aSSINfr $~'P •

"l:;E"

C.V.

78.84 32.31 54.42 55.79 15.77


TITLE,EXAMPLE OF BACKWARD OPTION (DATA FROM DRAPER AND SMITH, P. 365-402)

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

AOV TABLE ANALYSIS OF REGRESSION FOR VARIABLE 5---Y

(TABLE 1)

SOURCE       D.F.   SS          MS          F
REGRESSION   4      2667.8994   666.97486   111.47917
RESIDUAL     8      47.863639   5.9829549
TOTAL        12     2715.7631

R**2 IS .98238     INTERCEPT IS 62.405369     STANDARD ERROR OF INTERCEPT IS 70.0710

VARIABLE   VARIABLE   REGRESSION
NUMBER     NAME       COEFFICIENTS
1          X1         1.5511026
2          X2         .51016758
3          X3         .10190940
4          X4         -.14406103

UNIQUE SEQUENCE NUMBER FOR THIS ANOVA

PRESS IS 110.

Sl'A~O"I;!(;IlI=;O C'EGR.FSSION

t';JEFFICIENTS .6G6512' .5~770o

.0431<=10 -.1F-O?f'.7

103

PAR-TIAl sse

Z5.~50) 2.g7?5

.10Q1

.2'-'+1'0

\ i-TEST VALUES B'D.F.

2.01327 .7(l4g .135C

".2032

SIGNIFICANCE

.0000

1=:··2 ALlI-tA DELETE ~HS

• 97 2~ .J7!S .9"l13 .'J 13 .9623 Gr~6:D .9~ 23 • 1t42

(NOTE: SINCE NO VARIABLES WERE FORCED TO STAY IN THE MODEL, THE VARIABLE WITH THE LARGEST ALPHA HAT > SIG = .05 WILL BE DELETED.)
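The deletion rule described in the note above can be sketched in a few lines of Python. This is only an illustration of the rule as the annotation states it, not the program's actual code; the function name `next_deletion` and the alpha-hat values in the example are hypothetical.

```python
def next_deletion(alpha_hats, sig=0.05):
    """Pick the variable to delete at one backward step.

    alpha_hats maps a variable name to its alpha hat (observed
    significance level). The variable with the largest alpha hat is
    deleted if that value exceeds sig; otherwise None is returned,
    meaning no more deletions.
    """
    name, worst = max(alpha_hats.items(), key=lambda item: item[1])
    return name if worst > sig else None


# Hypothetical alpha hats for a four-variable model.
step_one = {"X1": 0.07, "X2": 0.50, "X3": 0.90, "X4": 0.84}
print(next_deletion(step_one))

# Hypothetical final step: both alpha hats are below sig.
last_step = {"X1": 0.0001, "X2": 0.0001}
print(next_deletion(last_step))
```

With these made-up inputs the first call selects X3 and the second returns None, which mirrors the "no more deletions" stopping condition noted later in this example.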


TITLE,EXAMPLE OF BACKWARD OPTION (DATA FROM DRAPER AND SMITH, P. 365-402)

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

SOU~CF O~ F ~

AOV TBL:: A~ALYSIS O~ REGRESSION ~O~ VAQIA1L:: ~---~

(TADlf 11

55 "' F

REGRESSION 3 2667.7qO~ 8~q.Z'lr.1j 1o:;.~31&" RESnU~L ~ 47.q1?72q TOTAL 12"--- ("115.7631

R**2 IS .98234     INTERCEPT IS 71.648307     STANDARD ERROR OF INTERCEPT IS 14.1424

VARIABLE   VARIABLE   REGRESSION
NUMBER     NAME       COEFFICIENTS
1          X1         1.4519380
2          X2         .41610976
4          X4         -.23654022

5.3303033

STANDA~OI!E(!

F.Er;RE~'SIaN r::)~FFICIC"NTS

.'30&1137

.430414 -O'''61tlP3

UNIQUE SEQUENCE NUMBER FOR THIS ANOVA 104

P~ESS IS ~5.~

PiI,'(TIA.L SSQ

8Z(!.9[74 2&.71Ql,.

9.9318

~ T-TfST VALUES} ~1).F.

12.4tH 2.2418

-1.36SG

SIGNlFICAIiC[

.0 CO 0

R-" 2 OElfTES

• 6e 01 .9725 .q787

TITLE,EXAMPLE OF BACKWARD OPTION (DATA FROM DRAPER AND SMITH, P. 365-402)

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

SOU~CE ·0. F.

lOV TAll:: ANAl~SIS OF ~EGPESSION :O~ VA~tA~L; ;--·f

C"!'Aqtc:' l'

SS HS F

"'EGD.ES'SION ? 2&57. ~'5AF.. 1328.,'~2~1 ZZIJ.S037G ~ESIDU~L ® ~1. 904483 TOTAL 12"-- "7115.11;31

R**2 IS .97868     INTERCEPT IS 52.577349     STANDARD ERROR OF INTERCEPT IS 2.2862

VARIABLE   VARIABLE   REGRESSION
NUMBER     NAME       COEFFICIENTS
1          X1         1.4683057
2          X2         .66225049

1j.1qO"''''~3

'STINDAIH 17EO P.~G=ESSHIN

r.OE'FFICIE"TS • '574137 .6~5017

UNIQUE SEQUENCE NUMBER FOR THIS ANOVA 105

PARTIA.L sse

848.431~

lZQ1.1~23

\ T-TEST} vALUES IO'D..F.

1?.t047 14.4 .. 2"

SIGNIFICANCE

• lie::. 0

P··2 OElfTES

.f..&63

.15331

~~ GE

Al;;l-tA Ha.T S

.0001

~ ~

(NOTE: NEXT VARIABLE TO BE DELETED.)

:»4GE

4L!Jo-tA I-'U S

(NOTE: BOTH ALPHA HATS < SIG = .05, SO NO MORE DELETIONS.)


TITLE,EXAMPLE OF BACKWARD OPTION (DATA FROM DRAPER AND SMITH, P. 365-402)

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

[PLOT OF PRESS (VERTICAL AXIS, .8535E+02 TO .1103E+03) AGAINST STEP NO. (HORIZONTAL AXIS) FOR THE BACKWARD EXAMPLE. HANDWRITTEN LABELS MARK STEP 1 OF BACKWARD (HIGHEST PRESS), STEP 3, AND STEP 2 (LOWEST PRESS).]


TITLE,EXAMPLE OF BACKWARD OPTION (DATA FROM DRAPER AND SMITH, P. 365-402)

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

TABLE OF RESIDUALS FOR VARIABLE 5---Y

TIME   OBSERVED VALUE   PREDICTED VALUE   RESIDUAL
1      78.5000          80.0740           -1.57400
2      74.3000          73.2509           1.04908
3      104.300          105.815           -1.51474
4      87.6000          89.2585           -1.65848
5      95.9000          97.2925           -1.39251
6      109.200          105.152           4.04751
7      102.700          104.002           -1.30205
8      72.5000          74.5754           -2.07542
9      93.1000          91.2755           1.82451
10     115.900          114.538           1.36246
11     83.8000          80.5357           3.26433
12     113.300          112.437           .862756
13     109.400          112.293           -2.89344

(NOTE: END OF ANALYSIS FOR 2ND PASS ON THE DATA.)

SANDIA LABORATORIES <> STEPWISE REGRESSION PROGRAM <> COURTESY OF DEPT. OF STATISTICS - KANSAS STATE UNIVERSITY

TITLE,EXAMPLE OF RANK REGRESSION WITH THE STEPWISE OPTION(DATA FROM DRAPER/SMITH)

DATA,5,0,2

INPUT CHECK OF PARAMETERS

NUMBER OF VARIABLES READ IN

NO. OF TRANSFORMED VARIABLES

DATA DISPOSITION IS

LABEL(1)=RANK(X1),RANK(X2),RANK(X3),RANK(X4),RANK(Y)
(NOTE: THESE LABELS WILL REPLACE THE LABELS USED ON THE PREVIOUS 2 PASSES.)

MODEL,5=1+2+3+4

STEPWISE,SIGIN=.05,SIGOUT=.10

PRESS

RANK REGRESSION
(NOTE: A REGRESSION ANALYSIS WILL BE PERFORMED ON THE RANKS OF THE DATA.)

OUTPUT,CORR,STEPS,RESIDUALS
(NOTE: SAME OUTPUT AS 2ND PASS.)

END OF PARAMETERS

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)
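The RANK REGRESSION card above asks for the regression to be run on the ranks of the data, with tied observations sharing an average rank. A minimal Python sketch of that idea follows. It is an illustration under stated assumptions, not the program's implementation: the helper names `average_ranks` and `simple_fit` are hypothetical, the data are made up, and only a single predictor is fitted.

```python
def average_ranks(values):
    """Rank values 1..n; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def simple_fit(x, y):
    """Least squares intercept and slope for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up data containing ties in x.
x = [7, 1, 11, 11, 7]
y = [3, 1, 9, 8, 4]
rank_x = average_ranks(x)
rank_y = average_ranks(y)
intercept, slope = simple_fit(rank_x, rank_y)
print(rank_x, rank_y, intercept, slope)
```

Because ranks always average (n + 1) / 2, every rank variable has the same mean, which is why the means table for this example shows 7.00000 for all five variables with 13 observations.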


TITLE,EXAMPLE OF RANK REGRESSION WITH THE STEPWISE OPTION(DATA FROM DRAPER/SMITH

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

INPUT CHECK ON DATA

(NOTE: THESE ARE THE RANKS ASSIGNED TO THE FIRST OBSERVATION. RANKS ARE ASSIGNED BY VARIABLE NUMBER.)

FIRST OBSERVATION

VARIABLE NAME     RANK(X1)    RANK(X2)    RANK(X3)    RANK(X4)    RANK(Y)
VARIABLE NUMBER   1           2           3           4           5
                  6.5000000   1.0000000   2.5000000   13.000000   3.0000000

(NOTE: THE FRACTIONAL RANKS RESULT FROM TIED INPUT OBSERVATIONS.)

VARIABLE NAME     RANK(X1)  RANK(X2)  RANK(X3)  RANK(X4)  RANK(Y)

VARIABLE NUMBER   1  2  3  4  5

13 OBSERVATIONS

~ANI«(XlI 1 1.cooe R:AI\IK(X?' 2 .3:')01 ~ANI((x3' 3 -.71AG ~a"4I«(X'+1 , -.:u 20 ~ANI«(V' < .1%2'

"0. NA"4E ~ANI(f)(l )

NO. RAW DATA INPUT 13

NO. TRANSFORMED OBSERVATIONS 13

NO. OF OBSERVATIONS DROPPED

TITLE,EXAMPLE OF RANK REGRESSION WITH THE STEPWISE OPTION(DATA FROM DRAPER/SMITH

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

VARIABLE   MEAN      VARIANCE   STD. DEV.   STD. ERR.
RANK(X1)   7.00000   14.5417    3.81335     1.05763
RANK(X2)   7.00000   15.1250    3.88909     1.07864
RANK(X3)   7.00000   14.9167    3.86221     1.07118
RANK(X4)   7.00000   15.0833    3.88373     1.07715
RANK(Y)    7.00000   15.1667    3.89444     1.08012

(NOTE: 7 IS THE AVERAGE RANK ASSIGNED FOR 13 OBSERVATIONS.)

(NOTE: WITHOUT TIED OBSERVATIONS THE VALUES IN THE VARIANCE, STD. DEV., AND STD. ERR. COLUMNS WILL ALL BE THE SAME WITHIN A COLUMN.)

TITLE,EXAMPLE OF RANK REGRESSION WITH THE STEPWISE OPTION(DATA FROM DRAPER/SMITH

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

COR~ElATIO~ ~ATPIX

} t.couo flANk. CDR1t.E:L-~TIONS .O5?7 1.01] 00

-.QgC3 -.OAoC6 1.00ro .737~ -. 44 R~ -.71721 1.Hon

4 5

F-ANI( O(;?I ~llfo.{I(O:.!) I': ,IHH< O( '+, ~I\!IIi(n.

C.V.

54.48 55.56 55.17 55.48 55.63


TITLE,EXAMPLE OF RANK REGRESSION WITH THE STEPWISE OPTION(DATA FROM DRAPER/SMITH

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

AOV TABLE ANALYSIS OF REGRESSION FOR VARIABLE 5---RANK(Y)

(TABLE 1)

SOURCE       D.F.   SS          MS          F           SIGNIFICANCE
REGRESSION   1      113.93123   113.93123   18.411433   .0013
RESIDUAL     11     68.068768   6.1880698
TOTAL        12     182.00000

R**2 IS .62600     INTERCEPT IS 1.3438395     STANDARD ERROR OF INTERCEPT IS 1.48783

VARIABLE NUMBER

VARIABLE NAME

REGRESSION COEFFICIENTS

STANDARDIZED REGRESSION

COEFFICIENTS .79119<;1

PARTIAL SSQ

T-TEST VALUES

R**2 DELETES

ALPHA HATS

RANK(X1) .80802292 113.9312 4.2909 0.0000 .0020

UNIQUE SEQUENCE NUMBER FOR THIS ANOVA 106

RANK FIT GIVES A RAW DATA NORMALIZED R**2 .64292846

COEFFICIENT OF INTERPOLATION

PRESS 1'; 'iD.l~l

.68'+O~359E-Ol

NOTE: THE USER SHOULD NOTE THAT THIS R**2 VALUE IS CALCULATED AS

$$R^2 = \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (\hat{y}_i - \bar{y})^2 + \sum_i (y_i - \hat{y}_i)^2}$$

AND SINCE THE $\hat{y}_i$ VALUES HAVE BEEN OBTAINED THROUGH INTERPOLATION THIS CALCULATION WILL NOT NECESSARILY RESULT IN THE SAME R**2 VALUE AS THE USUAL CALCULATION

$$R^2 = \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (y_i - \bar{y})^2}$$

SINCE THE $\hat{y}_i$ ARE NOT THE LEAST SQUARES ESTIMATES, THE QUANTITIES $\sum_i (\hat{y}_i - \bar{y})^2$ AND $\sum_i (y_i - \hat{y}_i)^2$ ARE NOT ORTHOGONAL. THAT IS, THE SUM OF CROSS-PRODUCTS $\sum_i (y_i - \hat{y}_i)(\hat{y}_i - \bar{y})$ IS NON-ZERO. IF THE SUM OF CROSS-PRODUCTS IS NEAR ZERO, THEN THE COEFFICIENT OF INTERPOLATION WILL BE NEAR ZERO. IF IT IS LARGE, THEN THE COEFFICIENT WILL BE NEAR ONE. SEE THE TEXT FOR FURTHER DISCUSSION.
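The difference between the two R**2 calculations in the note can be checked numerically. The Python sketch below computes the normalized R**2, the usual R**2, and the sum of cross-products for any set of observed and predicted values. The function name `r2_pieces` and the data are hypothetical, and the sketch does not compute the coefficient of interpolation itself, whose definition is given in the text rather than on this page.

```python
def r2_pieces(y, yhat):
    """Return both R**2 forms and the cross-product sum for predictions yhat."""
    ybar = sum(y) / len(y)
    ss_model = sum((f - ybar) ** 2 for f in yhat)
    ss_resid = sum((a - f) ** 2 for a, f in zip(y, yhat))
    ss_total = sum((a - ybar) ** 2 for a in y)
    cross = sum((a - f) * (f - ybar) for a, f in zip(y, yhat))
    return {
        "normalized_r2": ss_model / (ss_model + ss_resid),
        "usual_r2": ss_model / ss_total,
        "cross_products": cross,
    }


y = [1.0, 3.0, 2.0, 4.0]

# Least squares fit of y on x = 1, 2, 3, 4: the two forms agree.
least_squares = r2_pieces(y, [1.3, 2.1, 2.9, 3.7])

# Predictions that are not least squares estimates: the forms differ.
other = r2_pieces(y, [1.0, 2.0, 3.0, 5.0])

print(least_squares)
print(other)
```

For the least squares predictions the cross-product sum is zero (up to rounding) and both forms give the same value. For the second set the total sum of squares no longer splits into model plus residual, so the usual form can even exceed one while the normalized form stays between zero and one.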


TITLE,EXAMPLE OF RANK REGRESSION WITH THE STEPWISE OPTION(DATA FROM DRAPER/SMITH

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

AOV TABLE ANALYSIS OF REGRESSION FOR VARIABLE 5---RANK(Y)

(TABLE .. )

;iOURe€.. D.F .. ~s MS

REGRESSION RESIDUAL TOTAL

2 10 12

H2.92283 1~.07117'j

182.o000a

81.4EU13 1 .. 9::1111 15

·-2 IS .. d9;;)18 HERC.Et'T IS 6.5099 jtl~ :ANOARD ERROR CF INTERCEPT I~ 1.31211t

\tARIABlE NUfl:SErI.

VARIABLE NAME

REGRESSION COEFFICIENTS

RANK(X1) .62154180

RANK(X4) -.55154162

STANDARDIZED REGRESSION

COEFFICIENTS .EiOf!~Ol

-.550.2'

UNIQUE SEQUENCE NUMBER FOR THIS ANOVA 107

RANK FIT GIVES A RAW DATA NORMALIZED R**2 .52844314

COEFFICIENT OF I~TE~POLATIO~ .l'+5'sanOOE-Ol

FRE-S8 I.::; 35.'i~O

PARTIAL SSO

59.98.22 48 .. 9916

F

42.100384

T-TEST IJALU£S

5.601.:5 -5.06113

SIGNIFICANCE

.0000

R**Z DELETES

.5aSa

.6260

PAGE S

ALPHA HATS

.0007

.0011


TITLE,EXAMPLE OF RANK REGRESSION WITH THE STEPWISE OPTION(DATA FROM DRAPER/SMITH

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

[PLOT OF PRESS (VERTICAL AXIS, .3585E+02 TO .9035E+02) AGAINST STEP NO. (HORIZONTAL AXIS) FOR THE RANK REGRESSION EXAMPLE.]


TIHF 1 ? 3 4 5 6 7

" 9 10 11 12 U

~IlNI( OF .. 3.e 2.r q.O 5.0 T.C

1 D.C '.C 1.0 0.0

13.0 4.0

12.( 11. [

TITLE,EXAMPLE OF RANK REGRESSION WITH THE STEPWISE OPTION(DATA FROM DRAPER/SMITH

TABLE OF RESIDUALS FOR VARIABLE 5---RANK(Y)

PREDICTED RANK OF Y     RANK RESIDUAL     RAW Y     RAW YHAT     RAW RESIDUAL

~. ~7qq~ -.3rqg7g r 5 • 5 DOC "'n&rSE. TWO 80.S13Q -2.013B 1.1'345'3 • "65ft1? 7 .... HI00 r,Dl.I.A."'NS c.Atl 72.7422 1.5577:;; 11)."300 -1.~3'D02 104.300 ali; St\VE.'D oN 109.366 -5.066D~

6.QflQ23 -1.')&923 ~7.&DaQ 1)1St( 10 ~;:~~~~ -S.213H

£o,.1:-:71)9 .. 111;2;'1] Q5.9.00 2.411t.~

10.0027 -.270,,1r.E-t12 109.200 1[Q.2Cl -.5417H='-Co! <:I.a6~17 -1.:16617 102 .. 700 104.624 -1.~Z4?1

2. 2~167 -1.'n:767 72.5000 75.2982 -2.79"2l ">.Q6269 .37313I)E-,:.1 <n.l00G 92.89 .. 8 .. 2052~1 10.72'133 2.27015 115.900 10q-3~6 6.'55415 2.76q21 1.Z1J n t!3.80DO 71~&llt7 o.1~S3J

1 t. 6513 .34266060 113.31l1J 111.Q6't 1.1'3f)!tj 1~.1035 • e~6:; 21 11lQ.400 109.221 .t7930tt

~~5I)JAl SU" OF $QU~~ES O~ ~AW DOTA = 200.Ui2 EQ.HIIIIATICN ON FND OF FILE - U"IT '--- -..........- ~

5U ,"""til ~ t,ONO"Ii-?. ll'!l"l'3l) i'"OIt.. MI E?Jt'PL-ItNATION o4=­

~., T1t£:SE -a.eSIDlAAL5 AlUi: c.ALColA.I-AT£"D *


SANDIA LABORATORIES <> STEPWISE REGRESSION PROGRAM <> COURTESY OF DEPT. OF STATISTICS - KANSAS STATE UNIVERSITY

(NOTE: THERE ARE NO LABELS SPECIFIED FOR VARIABLES 7, 8, AND 9 AS THESE LABELS WILL BE ENCODED FROM THE TRANSFORMATION CARD.)

(NOTE: THE WEIGHT CARD SPECIFIES WHICH OF THE INPUT VARIABLES IS TO BE USED AS THE WEIGHTS.)

TITLE-WEIGHT,TRANSFORMATION, AND DROP OPTIONS EXAMPLE - (DRAPER AND SMITH DATA)

NOTE: SAME 13 POINT DATA AS USED ON PREVIOUS

INPUT CHECK OF PARAMETERS

NUMBER OF VARIABLES READ IN 6

NO. OF TRANSFORMED VARIABLES = 3

DATA DISPOSITION IS

.&." ......... ,.6.~~ ... "' ..... ~ ....... ,>"'WEIGHT ---. WEIG-ltT IS :ruST A CONVENIENT l.A-e.EL

STEPWtSE,SIGtN=.O~,SIGOUT=.10 TO OSlO tI&RE 1><5 TKE "'''IGtlTS ARI:

£'NTIo.REI:> I>.S -rttE IDTI-I VAR.IAl!.LI:. ~EL'5=1.2.3+~+~.B~, __________ ~

'-.".. NOTE: VARIABI..., NVMS!:R I> '00&5 OUTPUT,STEPS c-ARn ~ ,",OT A1'"P&1'<1\ oN Till' MODEL

~------~======::::====~ TR .. SFORM.TIONE.l.@= ••••• =l~ ol'>S .. 1t~A'ION N""'~ElI. S "'ILL l!oE 1>1'.OPP&D

€IGHT=9 X TltlS cARt> CIU'A'-&S ~ NE .... VAlUA1'>I..E5 END OF FA~UETERS 7 & AND .. ""Hle-I\ I\RE RESl'IOCT1'iELY

z{~L TO "1-,"-, y:~1-, ANt> )(,)( •• FORflAT(6FE.OJ

NOTE: THIS EXAMPLE USES WEIGHTED REGRESSION. THE WEIGHTS MUST BE USER SUPPLIED FOR EACH OBSERVATION AND APPEAR AS ONE OF THE INPUT VARIABLES; IN THIS CASE THE 6TH VARIABLE POSITION CONTAINS THE WEIGHTS. FOR PURPOSES OF ILLUSTRATION THE RESIDUALS FROM THE PREVIOUS STEPWISE ANALYSIS ON THESE 13 DATA POINTS WERE SCALED SO THAT THE LARGEST RESIDUAL IN ABSOLUTE VALUE (5.02338) WAS REPRESENTED AS ZERO AND THE SMALLEST ABSOLUTE RESIDUAL (.137083) WAS REPRESENTED BY ONE. LINEAR INTERPOLATION WAS USED TO SCALE THE OTHER RESIDUALS. THESE SCALED VALUES WERE THEN USED AS WEIGHTS. PLEASE NOTE THAT THIS IS AN ENTIRELY ARBITRARY WAY TO ASSIGN WEIGHTS.
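The scaling described in the note maps the largest absolute residual to a weight of zero and the smallest to a weight of one, with linear interpolation in between. A minimal Python sketch of that mapping is shown below. The function name `residual_weights` and the sample residuals are hypothetical, and this is only one reading of the note, which itself calls the scheme entirely arbitrary.

```python
def residual_weights(residuals):
    """Scale absolute residuals linearly: largest -> 0, smallest -> 1.

    Assumes the absolute residuals are not all equal (otherwise the
    denominator below would be zero).
    """
    magnitudes = [abs(r) for r in residuals]
    smallest, largest = min(magnitudes), max(magnitudes)
    return [(largest - m) / (largest - smallest) for m in magnitudes]


# Made-up residuals, not the values from the example printout.
print(residual_weights([-5.0, 0.5, 2.75]))
```

Under this reading, the observation with the largest absolute residual receives weight zero and so contributes nothing to the weighted fit.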

STEPWISE EXAMPLE.

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)

(STAT CONTROL CARD)


TITLE-WEIGHT,TRANSFORMATION, AND DROP OPTIONS EXAMPLE - (DRAPER AND SMITH DATA)

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

INPUT CHECK ON DATA TH~5E LA&ELS ARE CR~ATEb

Kl FIRST CBSERVATIC~

X2 Xl X, y WEIGHT

i BY TH-E. ?RoG-RAM

~ 8 2 3 • 6 1 8

1.0000000 26.000000 6.0000000 50.000000 18.500000 49.000000 3600.0000

e •

_ ( 2.1100/3 - .137083 )

J 5.02.338 - .137083

1\2D.00000

NO. RAW DATA INPUT 13

NO. TRANSFORMED OBSERVATIONS 12

NO. OF OBSERVATIONS DROPPED = 1
(NOTE: OBSERVATION NUMBER 8 WAS DROPPED.)

WEIGHTED REGRESSION REQUESTED. WEIGHTS APPEAR AS VARIABLE NUMBER 6


TITLE-WEIGHT,TRANSFORMATION, AND DROP OPTIONS EXAMPLE - (DRAPER AND SMITH DATA)

SANDIA LABORATORIES <><> STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY

VARIABLE NAME     VARIABLE NUMBER     MEAN     VARIANCE     STD. DEV.     STD. ERR.

Xl 1 1.~1"06 31.8115 6.1411jl1 1.7J5011j X2 2 50.2tUlO 2'f".02" 1S.G21l ".SOIlj~1

Xl 12.00f2 H.lle2 E.HGH 1.8S22G H 4 21.6908 286.1)38 16.S126 ~.88226 1-1 1 SS.031'f 16184.2 121.211 36.12H· ~-. • 100:8 .. <;8 .. 1254S3£+01 1l;!Q.19 323 .. 312 1" 9 182.168 3652 ... 8 191.115 55.1701 Y 5 ')6.8853 157 .. ~HO 1,,* .. 06'38 •• 0061

12 OBSERVATIONS

NOTE: THESE STATISTICS ARE ALL THE RESULT OF WEIGHTED CALCULATIONS, I.E.,

$$\text{WEIGHTED MEAN} = \frac{\sum_i w_i x_i}{\sum_i w_i}$$

COMPARISONS CAN BE MADE WITH THE PREVIOUS UNWEIGHTED EXAMPLE.
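As a small illustration of the weighted mean formula in the note, the Python sketch below evaluates it directly. The function name `weighted_mean` and the numbers are hypothetical; the other weighted statistics in the table (variance, standard deviation, standard error) are not reproduced here because their exact formulas are not shown on this page.

```python
def weighted_mean(x, w):
    """Weighted mean: sum of w_i * x_i divided by the sum of the w_i."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)


# Made-up values: the third observation counts twice as much as the others.
print(weighted_mean([1.0, 2.0, 3.0], [1.0, 1.0, 2.0]))

# A zero weight removes an observation from the calculation entirely.
print(weighted_mean([1.0, 2.0, 100.0], [1.0, 1.0, 0.0]))
```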

PAGE 2

C.V.

83.39 ll.0G SJ."~ 61.08

142.88 108 ... 86 101.H

14 .52


UNIQUE SEQUENCE NO. = 101 ANALYSIS FOR DEPENDENT VARIABLE 5---Y
TITLE,EXAMPLE OF STEPWISE OPTION (DATA FROM DRAPER AND SMITH, P. 365-402)

.11756793E+03 4 -.73816181E+00

(NOTE: EXAMPLE OF THE INFORMATION WRITTEN ONTO TAPE 19.)

UNIQUE SEaUE~C;:: NO. = tez ANALYSIS FO'" DEPEt..IDF.NT VUUA!lLE S---1 TITLE,EXA~Pl= OF STEPWISE OPTI~N (~ATA FRO~ DRAPE~ A~D S'1ITi, P. 3&~-4l2t

.tCnQ1'.38E-+03 .143QQSS3E"'Cl 4 -.6p:Q53ME.oa \ } TWO "&LiltNk cAkD 1~""G£S

UNIQUE SEQUEIriCE NO. = lCJ A~ALV'SI'S FDP DEFENDENT VII,RIAIJLE 5---\" TITLEtEXA~PLE OF aACKWAPD OPTION (DATA F~O~ DRAPE~ AND SMIT~. P. 3&5-~02 •

.. 1 r .~?40"i36qE+02 1 .155110?6E+(ii 2' .51016758£.00 3 .1IJlqOq~GE.DO

It ~ ,. -.1440HO.lE"'OO COItoa£Sf01'ID5 ""' ~lqu.E. SEqIJ.ENC.e. f'«l

IN ",NO"'''' T-..1!tLE:: TMtS 15 -nte. _c.olllD c.NQ) tDIIoITI\.""IIIIG-

u .... J.'~UI:, "'~ ... u::.' .. "':: NO. = ~ At.I/!,lVSIS FO", DEFENO[NT vARIA1LE ~_, --........ c.oE.f:PIC.I~TS TITLE.E'XA'1PL: OF ~ACKWARD OPTIflN (OATA FROM DRAPE!? ANO SHIT .... P. 365-4:121 ~ 'DE"NDEN"'- V~~i8L'= ,S

... _,., ........ .- ...

3 .71&463C7F.OZ .1451Q380E+Ol 2 .4161C91&E.OU ~ -.Z3&5,n22f.OO

UNIaUE SfQU~~Cf NO.; lC5 A~ALV'SIS FOR OE~ENDfNT V_RIAaLE 5---~ TITLE,fl(a,MlPl'!:: IlF ~ACI(WA·~.O OPTION (DUA FR.OM OR.APf Q AtiiO S"IT·i., p. 365-lt:'2'.

2' 1 .525773,.Qf .. 02 .146A3057E+Ol 2 .662?50ft.'JE.OO

UNIQUE SEQUE'tC:~ NO. = 106 A'4AlVSTS FOR OEp~NOENT VARIAfll=: 5---til~K('f) TITLE.EXA,,",PLE" OF RANI( "";::GF.E'SSION WITi.f TI-IJ: STfPWISE OPTION(DII,TA F{O'1 ]~AJoEt'SIUTJ.t

1 1 Q .L~43~~q5E.ID 1 .80~C??q2f.DO

vA .. '~a.(R ,,u, .. M"&'E;ll & •

~-------------------------------------------------------- ~E CON5T~"T ~~ AND ~

UNIQUE SE~UE~C~ NO. = lD1 A~ALvsts FOR. CEPE~DfNT V.RIASLE ;---~lNK('. N~M'Be.R "0" •

TITLf,f~A~9LE OF FANK QFGRfSSION WIT~ THf ~TEPWISE OPTION(OaTA F~l~ ]~l~E~fS~IT~

CD. 1 0 .G50QQQAftf+Ol ~b21S"18CE"CO 4 -.551~~1&~E.OD

(NOTE: TWO INDEPENDENT VARIABLES IN THE FITTED REGRESSION EQUATION (EXCLUDING THE CONSTANT).)


DISTRIBUTION: W. J. Conover College of Business Administration Texas Tech University Lubbock, Texas 79409

J. M. Davenport (5) Department of Mathematics Texas Tech University Lubbock, Texas 79409

K. E. Kemp Department of Statistics Kansas State University Manhattan, Kansas 66502

M. C. Cullingford (25) NRC-PAS Washington, D. C. 20555

1221 D. L. Wright
1223 R. R. Prairie
1223 H. E. Anderson
1223 C. R. Clark
1223 R. G. Easterling
1223 E. L. Frost (10)
1223 I. J. Hall
1223 R. L. Iman (25)
1223 D. D. Sheldon
1223 B. P. Roudabush
1223 M. Shortencarier (10)
1417 F. W. Muller
1417 E. E. Ard
1417 J. R. Ellefson
1417 F. W. Spencer
1763 M. D. Fimple
4413 J. E. Campbell
4413 R. M. Cranwell
4413 J. C. Helton
4723 K. A. Bergeron
4723 L. L. Lukens
4723 R. R. Peters
5641 G. P. Steck
8324 M. A. Griesel
8346 C. J. Decarli
9515 J. T. Hillman
9625 J. R. Yoder
3141 T. L. Werner (5)
3151 W. L. Garner (3) for DOE/TIC (Unlimited Release)
3154-3 R. P. Campbell (25) for DOE/TIC
