stepwise regression with press and rank regression
TRANSCRIPT
SAN 079-1472
Unlimited Release
Stepwise Regression With PRESS and Rank Regression (Program User's Guide)
Ronald L. Iman, James M. Davenport, Elizabeth L. Frost, Michael J. Shortencarier
When printing a copy of any digitized SAND Report, you are required to update the
markings to current standards.
Issued by Sandia Laboratories, operated for the United States D~partment of Energy by Sandia Corporation.
NOTICE
This report was prepared as an account of work sponsored by t~e United States Government. Neither the United States nor the Department of Energy, nor any of their empioyees, nor any of their contractors, subcontractors, or their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness or usefulness of any information, apparatus, product or process disclosed, or represents that its use would not infringe privately owned rights.
'1;'
SAND79-l472 Unlimited Release
Printed January 1980
Stepwise Regression with PRESS and Rank Regression (Program User's Guide)
Ronald L. lman Statistics and Computing Division 1223
Sandia Laboratories Albuquerque, New Mexico 87185
James M. Davenport Department of Mathematics
Texas Tech University Lubbock, Texas 79409
Elizabeth L. Frost Statistics and Computing Division 1223
Sandia Laboratories Albuquerque, New Mexico 87185
Michael J. Shortencarier Statistics and Computing Division 1223
Sandia Laboratories Albuquerque, New Mexico 87185
ABSTRACT
This report contains a description of a stepwise multiple regression
program. This program provides for either a forward stepwise or backward
elimination solution to multiple regression problems. The program also
provides PRESS values that can be used for subset selection as well as
providing options for regression analysis on the ranks of the data and
for a weighted regression analysis on either raw or rank transformed
data. This document has been written and designed for users of this
STEPWISE program.
i
•
THIS PAGE LEFT BLANK INTENTIONALLY.
11
Table of Contents
I. Introduction
II. Output
III. Input - Parameter Cards
A. TITLE
B. DATA
C. LABEL
D. BACKWARD
E. FORCE
F. OUTPUT
G. PLOT RESIDUALS
H. STEPWISE
I. MODEL
J. PRESS
K. RANK REGRESSION
L. WEIGHT
M. DROP
N. TRANSFORMATION
O. END OF PARAMETERS
P. FORMAT
Q,. END OF DATA
IV. Additional Data Transformations
V. Output of the Regression Coefficients and the Observations with their Predicted Values When Using Rank Regression
VI. Deck Setup for STEPWISE
VII. Deck Setup for Standardizing Data
VIII. Bibliography
IX. Example
1
2
2
2
2
3
4
5
5 6
6
7
9
10
12
12
13 15 15 16
17
19
22
24
27
28
iii
2
values from rank regression analysis can also be saved on disk for
additional investigation outside the STEPWISE program, (10) weighted
regression analysis on the raw data or rank transformed data.
The reader who is familiar with SAND76-0364 (rman, 1976) and has used
STEPWISE on previous occasions is advised to read this report to note the
extensive additions.
II. OUrPur
The program will output means, variances, standard deviations, stan
dard errors and coefficients of variation for each variable read in or
generated via transformations. As options, the user may request the
table of correlation coefficients, sum of squares and cross-products ma
trix, inverse correlation matrix and residuals y-values and y's, PRESS
values and a plot of these PRESS values. The program will also print an
analysis of variance table for a regression model and a table of statistics
regarding the regression coefficients. Residual plots may be requested,
and if they are, will be printed on the line printer.
III. INPUT - PARAMETER CARDS
All parameter cards must start in column 1 and may be placed in any
order. An explanation and illustration of each of the parameter cards
follows.
A. TITLE card (optional).
The "TITLE" card may contain any alphameric data that may be meaning
ful to the user. The word "TITLE" identifies the card, and its contents
are printed at the top of each page of output. Only one "TITLE" card may
be used.
Example:
THIS IS A SAMPLE TITLE CARD
B. DATA card (required).
The "DATA" card has 3 arguments specified as follows:
where
NV, Nl', DATDIS.
NV is the number of user supplied variables to be read into the pro
gramming;
NT is the number of additional variables created as transformations
of other variables.
DATDIS is the data disposition parameter with the following codes:
= 0 the data are to be read from cards and not saved for subse
quent use (this is the usual case and if not specified DATDIS
will default to 0).
= 1 the data are to be read from cards and saved on disk (file
10 as binary records) for subsequent analysis of the data.
(DATDIS would be set equal to 2 for any subsequent analyses
following DATDIS = 1.)
= 2 the data are to be read from disk (file 10 - binary records).
In most cases the user need specify only the first argument (NV) and
the other two will default to zero. The example below shows how this is
done:
/r DATA, 10.
!
The above card indicates that 10 variables should be input from cards
and not saved on disk. Both NT and DATDIS default to zero because they
are not specified on the card. Note that the "DATA" card must be termina
ted by a period.
C • LABEL card (opt ional) •
The third type of parameter card is the "LABEL" card. It is used to
assign variable names to the variables in the analysis. Each name may be
up to 8 characters long. The card is identified by the keyword name
"LABEL" which must start in card column 1. The number in parentheses is
the number of the variable that is to be labeled with the first name. It
3
4
is assumed that all succeeding labels are sequential. If more than one
card is needed for the labels, the succeeding label cards should have the
same format as the first except the number in parentheses should be the
number of the variable that is to be labeled next. (Variable number is the
same as the subscript of the variable as it is read into the input array
X.) A comma or period should not be placed after the last label on the
card. There can be any number of blank spaces on a card after the last la
bel on that card. However, column 80 cannot be used (i.e., column 80 is
not read). When making multiple runs on a set of data, the labels from the
first run carryover to subsequent runs.
Example:
( IAiiEL\iF FIRS. ;CAPACIT;riI¢DE,--:.: ,-TiiiP---- -- -
(LABEL(12) TWELrnI,THI!ITEEN, et,.
The above cards would assign X(l) the name FIRST, X(2) the name CAPA
CIT and so on til X(13) would be named THIRTEEN. There is no limit to the
number of label cards nor the number of names per card. A variable name
may not be continued from one card to the next. Any spaces between the de
limiting commas are taken as part of the label and are counted as charac
ters in the name. Allor none, or any portion of the variables may be
labeled.
D. BACKWARD card, (required if the STEPWISE card is not used).
If the user desires a backward elimination solution for "best" subset
selection, a "BACKWARD" card must be included among the parameters. The
form is
/ BACKWARD,SIG= a£.p/ta/taA:
where atphaha.t is replaced by 'the significance level the user wishes
to use
for deleting variables from the model. Variables will be dropped from the
model until only variables that have alphahat values less than or equal to
a..trha.ha.t are left in the model.
In you have only one independent variable, you must use the BACKWARD
card instead of the STEPWISE card.
E. FORCE card (optional).
If the user wishes to keep certain variables in the regression model,
regardless of their contribution to the model, he may do so by using the
"FORCE" statement. The "FORCE" statement will keep specified vari.ables in
either the backward elimination or stepwise solution. The following is an
example of the "FORCE" card:
F¢RCE, 3,6,8.
The above card indicates that variables 3, 6, and 8 are to be kept in
the model regardless of their contribution to the model. The "FORCE" card
must end with a period. The maximum number of variables that may be for
cedinto the model is 10. Do not force all of the independent variables
into the model. If the user desires to have all of the independent vari
ables in the model, then he should use the BACKWARD option (with SIG=l.O).
F. OUTPUT card (required)
The "OUTPUT" card may have any of the following specifications separa
ted by connnas •
. Keyword
1. CORR
2. SSXP
3. INVERSE
4. STEPS
Action
Print simple correlations among all variables
specified in the model.
Print the corrected sum of squares and cross
products matrix for all variables specified in
the model.
Print the inverse correlation matrix at each
step in the analysis for all independent vari
ables used at that step.
Print an analysis of variance table and the re
gression coefficient estimates for each step of
5
6
5. RESIDUALS
6. ALL
the backward elimination or stepwise variable
selection procedure. If not specified, only
the results for the final model will be printed.
Print the y, y and residual for each observation
using the final regression model.
Caution: Do not request RESIDUALS if the num
ber of observations is large. This could pro
duce excessive, if not unwanted, output.
Options 1 through 5 will all be in effect.
Note: Any subset of these options may be requested but should follow the
order of appearance given above. The following example shows how the "OUT_
PUT" card may be used.
f~UTPUT ,c¢Iffi, INVERSE--,-STEPS, RESiDUALS----- -----------/
G. ~ RESIDUALS card, (optional).
The user may request that residuals be plotted on the line printer by
inserting the "PLOT RESIDUALS" card. The plots produced are: each inde
pendent variable in the final model versus the dependent variable; resi
duals versus time; residuals versus y's; and residuals versus each of the
independent variables in the model. Time is assumed to be the same as or
der of data input, i.e., observation 1 is assumed to be recorded first in
time, observation 2 as second, etc. The "PLOT RESIDUALS" card should be
as follows:
PL¢T RESIDUALS
H. STEPWISE card, (required if the BACKWARD card is not used).
The "STEPWISE" card is used to specify that the program should find
the "best" subset of the full model using the stepwise procedure. The user
may specify a significance level for deleting a variable from the model.
The program will continue to add variables to the model as long as it can
find a variable whose regression coefficient is significantly different
from zero at the specified significance level for entering variables into
the model. At each step, t-tests are computed on each of the regression co
efficients of the variables in the model and their alpha hats computed. If
any of the t statistics are not significant at the level specified for de
leting a variable, the least significant variable is dropped. This process
continues until all variables in the model are significant at the specified
deletion level. The program then searches for a new variable to be added
to the model and this cycle is repeated until no new variables can be found
which are significant at the specified significance level.
The program can be used for computing a forward solution, i.e., add
ing variables in order of their contribution but not deleting any, by set
ting the deletion significance level to 1.0.
An example of the "STEPWISE" specification is given below:
~ STEPWISE,SIGIN=0.05,SIQ¢UT=0.10
The above card indicates variables are to be added to the model as
long as variables that are significant at the Q = .05 level can be found.
Variables that are already in the model whose significance level rises
above .10, i.e., a, > .10, will be deleted from the model. If the values
for SIGIN and SIGOUT are not specified, they will each default to .05.
Note: SIGOUT must always be at least as large as SIGIN to avoid an inde
finite loop where a variable may be introduced and then immediately dele
ted. Recall that if there is only one independent variable, BACKWARD elim
ination procedure should be used rather than the STEPWISE procedure.
I. MODEL card, (required).
The model card indicates which variables are in the model and whether
they are dependent or independent variables. Up to 15 dependent variables
may be specified for any single run. A separate analysis is done for each
dependent variable. The model may have up to 179 independent variables if
there is only one dependent variable. The total number of variables both
read in and generated using transformations may not exceed 180. There-
fore, if there are 15 dependent variables specified as the model card, the
7
8
maximum number of independent variables is 165. The model may be continued
for as many cards as necessary, just be sure that the continuation occurs
before or after a + sign, i.e., do not allow the continuation to interrupt
a variable subscript. Do not start model continuation cards in column 1.
The writing of the model card is done using variable subscript numbers.
The first variable read in is variable 1 and thus has subscript number 1,
etc. The writing of the model may be in any of the 3 forms that follow,
where the dependent variables are immediately after the word "MODEL", se
parated by commas and the independent variables are to the right of the
equal sign and separated by plus signs.
1. MODEL,1,3,2=4+5+12.
This method simply uses the variable number to indicate which variables
are in the model and whether they are dependent or independent variables.
For this particular example, variables 1, 3 and 2 are the dependent vari
ables and 4, 5 and 12 are the independent variables. A separate analysis
would be performed for each of the dependent variables.
2. MODEL,Yl,Y3,Y2=x4+X5+X12.
This specification results in the same model as the method described
above. It allows the user to use more specific notation which may be help
ful in clarifying the model in some instances.
3. MODEL,Y(1),Y(3),Y(2)=B(4)x(4)+B(5)X(5)+B(12)X(12).
This specification is equivalent to the two above methods but allows
more specificity in describing the model by allowing a dummy indicator,
B(i), for the regression coefficients.
4. MODEL,1,Y3,Y(2)=4+B(5)X(5)+X12.
The above methods may be mixed. Note that the above models are all
terminated by periods. The period is not necessary on the "MODEL" card but
results in slightly faster parameter processing by eliminating the need for
the program to determine whether the model card has been continued or not.
The program always fits the model with the intercept assumed to already be
included in the model. There may be any number of blank spaces between the
independent variables as long as each variable specification is separated
by a "+" sign. However, there can be no blank spaces to the left of the
equal sign.
J. PRESS card (optional).
When regression techniques are used to build a response surface, the possibility exists of producing two competing models. This situation could
easily arise in stepwise regression as a new surface is developed each time
a new significant variable is added. Although 11 statistical significance"
is a necessary condition for adding a new variable in stepwise regression,
it is not an end in itself as there exists the possibility of overfitting
the data. For example, it is possible to obtain a good fit on a set of
points by using a polynomial of high degree. However, in doing so, one
can overfit the data and produce a spurious model which makes poor predic
tions.
To protect against overfit, the Predicted Error Sum of Squares (PRESS)
criterion as given in Allen (1971) can be used to determine the adequacy
of a prediction model. For a regression model containing k variables and
constructed from n observations, PRESS is computed in the following manner.
For i=1,2, •.. ,n, the ith observation is deleted from the original set of n
observations and then a regression model containing the original k variables
is constructed from the remaining n-l observations. With this new regres-~
sion mOdel, the value Yk(i) is the estimate for the deleted observation
Y.. Then, PRESS is defined from the preceding predictions and the n ori-1-
ginal observations by
The regression model having the smallest PRESS value is preferred when
choosing between two competing models, as this is an indication of how well
the basic pattern of the data has been fit versus an overfit or an underfit.
To obtain the PRESS values at each step of the analysis, merely add
the PRESS card to the deck of parameter cards. Also, if the analysis pro
duces 2 or more steps, a plot of PRESS values versus step number is auto
matically produced under the PRESS option to make relative comparisons
easier.
9
10
Example:
( PRESS
K. RANK REGRESSION card (optional).
To obtain a regression analysis on the ranks of the data instead of
the raw data merely add the RANK REGRESSION card to the deck of parameter
cards. If the user requests RESIDUALS on the OUTPUT card, then the resi
duals are given as a rank residual (i.e., the difference between the rank
of Y and the predicted rank of Y). In addition, the predicted ranks of the
Y's are used to interpolate in the original raw Y's to obtain the raw Y.
This is done at each step in the analysis and these data are used to com
pute a "normalized" R2. This R2 value is associated with the raw data in
stead of the ranked data and is computed by
n
1: 2 1=1
R = -----=-=------h _ 2
(Y.-Y) + 1.
n
E i=l
n h 2 E(y·-y·) i=l 1. 1.
This "normalized" value of R2 varies between zero and one and will be close
to one if the model being analyzed predicts the observed values adequately.
Note that the adjusted total sum of squares can be factored into
three components: (1) sum of squares of regression, (2) sum of squares of
error and (3) sum of cross-products. The formulas are as follows:
n
E i=l
h _ 2 (y.-y) +
1.
n h 2 n E (Y.-Y.) + 1: i=l 1. 1. i=l
(Y.-Y)(Y.-Y.). 1. 1. 1.
If the Y. are the usual least squares predictions then this last term (the 1. A
sum of cross-products) is easily shown to be zero. However, since the Yi computed here by the rank regression technique are definitely not the
least squares predictions, then this term is non-zero (can be positive or
negative). Hence, if the "usual" R2 = (sum of squares of regression)!
(total sum of squares) were computed, it would be either increased or
decreased over the "normalized" R2 described above depending on whether the
sum of cross-products is negative or positive. Therefore, it is felt
that the normalized R2 is a more appropriate measure of fit than the usual
R2 in this situation.
The sum of cross-products can be quite large in absolute value, depend
ing upon the structure of the observations. If the raw data have unusual
spacings or exhibit a non-monotone relationship, then the interpolation ~
scheme used to obtain the Y. 's will not be adequate and the resulting sum ~
of cross-products will be large. Hence as a measure of the "ability (or
inabili ty) to interpolate in the data" a Coefficient of Interpolation is
computed and given for each model analyzed. This coefficient is computed
by
2*
2* n ~ 2
~ (Y.-Y.) . ~ ~ ~=
n ~ -1
+ E (Y.-Y)(Y -Y.) i=l ~ ~
Note that if the sum of cross-products is zero then this coefficient will
be zero and that as the sum of cross-products departs from zero, the co
efficient approaches one. Therefore as a matter of interpretation, a
value of this coefficient that is "near zero" indicates that the interpo
lation is adequate. If the coefficient in "near one," then the user should
examine the original observations for unusual spacings and/or non-monotone
relations in the data (such as cyclic).
After the final model has been selected, then the raw residuals are
given (when RESIDUALS are requested) as the difference between the raw Y A
and the raw Y (see Section V for how to save these values for further analy
sis). The reader is referred to Iman and Conover (1979) for a thorough
discussion of the rank regression analysis.
When using RANK REGRESSION, there is a limit on the number of obser
vations that can be handled. Let NV = the number of variables input to
STEPWISE (see DATA card, page 2) and N = the number of observations. Then
the following limits are in effect:
1. If NV = 2 (i.e., one independent and one dependent variable) then
N :S 64980.
11
12
L. WEIGHT card (optional).
The "WEIGHT" card allows the user to perform a weighted regression
analysis of the data. The card has one argument as shown below:
( WEI GIlT= TIll'
where IWT is the variable number where the weights will appear. (Variable
number is the same as the subscript of the variable as it is read into the
input array X.) If IWT is zero than weighted regression will not be done.
However, the default value of IWT is zero so if the user does not want
to use weighted regression then the "WEIGHT" card need not be included.
Weighted regression may be done on either raw data or rank transformed
data. The weights are normalized by the program so that the sum of the
weight equals the number of observations. When the WEIGHT option is
requested the equations on the previous pages with respect to PRESS and
the R2' on ranks are automatically adjusted to reflect these weights, as
is the case for the normal equations used in the regression analysis.
The card is identified by the keyword "WEIGHT" which must begin in
column one. There must be no blanks before or after the equal sign. A
period to terminate the card is optional.
Example:
WEI GHT= 3
In the example, weighted regression has been requested. The weights will
appear in the data as variable number three.
M. DROP card (option/lJ.).
The "DROP" card allows the user to unconditionally drop observations
from the regression analysis. (For information on how to conditionally drop
observations, see the "Additional Data Transformations" section.) If the
user knows in advance that some observations will need to be discarded, he
may use the "DROP" card, indicating the number(s) of the observation(s)
to be dropped. The card is identified by the keyword "DROP" which must
,
begin in column one. The only blank necessary is the one after the word
"DROP." Commas are used to separate the observation numbers and ~ period
is required after the last observation number. The observations being
dropped need not appear in any special order. There must be no blanks
preceding the period and there must be no blanks before or after the
commas unless the card is to be continued. To continue the "DROP" card,
at least one blank must follow the last comma and the continuation card
must begin in column two. There is no limit to the number of continuation
cards or to the number of observation drops per card.
Example:
DROP 9,6,1,5.
In this example, the first, fifth, sixth, and ninth observations will be
dropped. If the user wishes to drop a large number of observations, it
may be advantageous to use the trans subroutine discussed in the
"Additional Data Transfonna.tions" section.
N. TRANSFORMATION card (optional).
The "TRANSFORMATION" card allows the user to define new variables
or redefine existing variables via simple transformations of existing
variables. (For more complicated variable transfonna.tions, see the
"Additional Data Transfonna.tions" section.) The writing of the "TRANS
FORMATION" card is done using variable subscript numbers. The card is
identified by the keyword "TRANSFORMATION" which must begin in column
one. The only blank necessary is the one after the word "TRANSFORMATION"
and there must be no blanks within the transfonna.tion definitions. The
variable being defined or redefined appears on the left hand side of an
equal sign and the variables used to define or redefine it appear on the
right hand side. There must be no blanks on either side of the equal
sign and the variables need not appear in any special order. Stars are
used to separate the variables on the right hand side and there must be
no blanks on either side of any star in a transformation definition.
There is no limit to the number of variables used in a transformation
definition as long as it fits on one card. A minus sign in front of a
13
14
variable on the right hand side indicates that the reciprocal of that
variable is wanted. There must be no blanks on either side of the minus
sign. Commas are used to separate transformation definitions. There must
be no blanks before or after a comma unless the card is to be continued.
In such cases, the comma must be followed by at least one blank and the
continuation card must begin in column two. Do not break up a trans
formation definition by continuing it on to the next card. There is no
limit to the number of continuation cards or to the number of trans
formation definitions per card. A period is re~uired after the last
transformation definition and there must be no blanks preceding it.
Labels for the variables created by the "TRANSFORMATION" card are
automatically generated using the characters of the transformation definition.
When using the "TRANSFORMATION" card and/or the TRANS subroutine, the raw
variables must be numbered from NV+l to NV+N.r, where NV is the number of
input variables and NT is the number of new variables created via trans
formations. (For a complete description of the TRANS subroutine, see the
"Additional Data Transformations" section.)
Example:
TRANSFORMATION
In this example, variable number four has been defined to be the product of
variable 'number one and variable number three, variable number two has been
redefined to be the reciprocal of itself, and variable number eight has
been defined as variable number six times the reciprocal of variable
number seven, that is variable eight is variable six divided by variable
seven. Note that use of a minus sign on this card is used to denote a
reciprocal.
,
2. If NV ~ 3, then NV*N ~ 175931.
When raw data are being analyzed, there essentially is no limit to the
number of observations that can be processed. If the user has a data set
falling outside of the above limits but would still like to do a rank re
gression, then it would be necessary for the user to rank the data outside
of the STEPWISE program and then not use the RANK REGRESSION card. Row-"
ever, using this approach will not provide raw Y's within the STEPWISE
program.
Example:
RANK REGRESSI¢N
O. END OF PARAMETERS card (required).
The "END OF PARAMETERS" card is used to indicate the end of parameter
card input.
Example:
END ¢F PARAMETERS
P. FORMAT card (required, if data are read from cards).
A format card is required if data are to be read from cards. The for
mat card may contain D, E, and F and (in some cases) I-numeric specifica
tions and T and X positional specifications. The format must be enclosed
in parentheses and conform to the syntax of the CDC FORTRAN IV variable , format. The format may be continued on up to 10 cards.
Example:
/ I
.~-~ .... --~~~~-~~-. --.----~~~~------~----'-'-.
(F2.l,F2.2,F4.0,lX,Fl.0,3X,F2.0)
15
16
In most cases, the user will use the F and X format specifications.
Listed below is a sample data card with data punched in columns 1 through
15.
The preceding format would read in the data card assigning the values
listed below for the respective variables.
X(t) 1.
&tl 0.")7
X(3) 9b"64.O
x(4) 1.0
X(5 ) '6i":O
In the general case of Fw.d, w is the field width of the variable on the
data card (the variables are read in sequentially from 1 to NV, where NV
is the number variables to be read from the cards), i.e., the number of
columns used to specify the values of the variable and d is the number of
decimal places read in or the number of columns to the left of where the
decimal point should be placed. The nX specification is used to skip
columns on the data card where n is the number of columns to be skipped.
If more than one data card is used to input the data from one observation,
a slash in the format is used to specify that reading should continue on
the next card.
Example:
(F2.l,6F2.2,8F5.3,7F4.1!10F5.2)
The above format card would cause the first 22 variables to be read from
the first data card defining the first 22 values in the observation vector
and the next 10 (23 to 32) to be read from the second data card completing
that observation vector. There may be any number of cards per observation
vector.
If more than one card is necessary for specifying the format, it
should be continued for however many cards are necessary without any indi
cation in the format itself that it is being continued. In other words, it
should be written just as it would be written on one long card.
Q. END OF DATA trailer card (required when reading data from cards).
In order that the user does not have to specify the number of
observations he has in his data, the "END OF DATA" trailer card should
follow the data. The trailer is necessary only when multiple sets of
data are being processed in one computer run. The trailer card is used
to separate the data of one data set from the parameters of the next. If
only one set of data is being processed, there is no need for a trailer
card, although it may be used. Do not use an END OF DATA card when
DATDIS = 2.
Example:
DATA
IV. ADDITIONAL DATA TRANSFORMATIONS
It is possible, in addition to the simple transformations specified
by the "TRANSFORMATION" card, to create complicated transformations by use
of a user supplied subroutine called TRANS. The user may redefine user
supplied variables or create new variables as transformations or com
binations of the variables read in. When using the TRANS subroutine
and/or the "TRANSFORMATION" card, the new variables must be numbered from
NV+l to NV+:NT where NV is the number of input variables and NT is the
number of new variables created via transformations. The simple variable
transformations defined by the "TRANSFORMATION" card are always done
first. The subroutine TRANS is then used to perform the more complicated
transformations or to conditionally drop observations through use of IDROP
as explained below. This subroutine may contain any legal FORTRAN state
ments. There are five variables made available to the subroutine via
cornmon block IMAN which the user may use to either make decisions while
processing the input data or to modify the input status of an observation.
These variables are NRAW, Nl'RANS, IDROP, IDUM, and IRANK and are discussed
below. The general. form of the subroutine is as follows:
SUBR¢UTINE TRANS (X) DIMENSI¢N x(lBo) C¢MM¢N/IMAN/NRAW, Nl'RANS , IDR¢P, IDUM, IRANK
any transformations
REl'URN END
17
18
The above cards are the minimum reQuirements for the subroutine. The sub
routine reQuires a single argument, X, which is the array of input variables
for a single observation, i.e., the observation ector X. The variables in
common block TIMAN are:
NRAW - is the current count of the raw observation being read in. It
may be used to make decisions when making variable transformations
based on the observation count or it may be used as a label if values
are to be printed out. The value of NRAW must not be changed by the
programmer.
NTRANS - is the current count of the transformed observation being
processed. It will always be the same as NRAW except in the case of
dropping observations. In this case, NTRANS will be the count of
those observations which have not been dropped. The value of NTRANS
must not be changed by the programmer.
IDROP - is used to indicate that the user wishes to drop an observa
tion from the analysis. The value of IDROP is always zero upon en
tering the subroutine. When IDROP is set to any non-zero value, the
observation vector being transformed at the time IDROP is set non
zero, will be dropped from the analysis.
IDUM - has no function within subroutine TRANS and the programmer
should not change the value of IDUM wi thin TRANS.
lRANK - is set within the main program and indicates when the user has
reQuested RANK REGRESSION. When IRANK = 0, the usual regression analy
sis on the raw data has been reQuested. When IRANK = 1, RANK REGRES
SION has been reQuested. The value of IRANK must not be changed by
the programmer.
Example of using the TRANS subroutine:
SUBR¢UTINE TRANS(X) DIMENSI¢N X(180) C¢MM¢N/TIMAN/NRAW,NTRANS,IDR¢P,IDUM,IRAI~ IF(NRAW.EQ.l)FAT=O.O IF (X(9).NE.O.0)Q¢ TO 4 IDR¢P=l REruRN
4 X(13)=(x(8)-X(7»/X(5) X(16)=X(1l)/X(9)*100 FAT=FAT + X(10) x(4 )=AL¢G(x(4» WRITE (3,101)(X(J),J=13,16),FAT,NTRANS
101 FORMAT (20X,5F8.3,3X,15)
f
RETURN END
In the above example, assume 12 variables are read in from cards
(NV=12) according to the format specifications given by the user on the
variable format card. The first time through, i.e., when NRAW=l, the var
iable FAT is set to zero. Each time through, X(9) is checked for being
zero and in those cases when it is zero, the observation is dropped for all
variables 1 through 16. If X(9) is non-zero, then X(13) and x(16) are
created as transformations of existing variables. Also x(4) is transformed
to its log (base e) and X(lO) is accumulated in the location named FAT.
Note that variables x(14) and X(15) are not defined in this subroutine.
They may have been previously defined by use of a TRANSFORMATION card,
thus the user is allowed complete freedom in creating transformed
variables. Also, when X(9) is non-zero, the values of the new variables
are printed out along with the cumulative sum of X(lO) and the number of
transformed observation (NTRANS).
An example of the use of IRANK is as follows. If the user is making
multiple passes of the same data set and some of the passes involve RANK
REGRESSION while the remaining passes involve the usual regression on the
raw data, and furthermore some of the data is to be dropped from the raw
analysis but not dropped from the rank analysis: then the programmer can
make use of the passed value of IRANK to determine whether an observation
should be dropped or whether it should remain in the analysis for that par
ticular run. That is, if IRANK = 1, then you would jump over the statements
that set IDR¢P = 1. This prevents the deletion of the observation when
RANK REGRESSION is requested.
V. OUTPUT OF THE REGRESSION COEFFICIENTS AND THE OBSERVATIONS WITH THEIR PREDICTED VALUES WHEN USING RANK REGRESSION
The user often has need of the regression coefficients from a particu
lar fitted mOdel. In order to accomodate these users, we have incorporated
into STEPWISE the following option. These coefficients are automatically
written onto TAPE 19 and the user can access these coefficients as descri
bed below.
For each distinct ANOVA table that STEPWISE produces in a run, a uni
que sequence number is attached to that ANOVA table and is printed at the
bottom of the table. This sequence number begins with "101" and increments
19
20
in unit steps. Also the "number" of the variable that is referred to
below is the number given to that variable when it is read in (or generated
via transformations) and corresponds to the number of the variable that is
used in the MODEL card.
The information written on TAPE 19 is arranged as follows. For each
unique sequence number, a group of card images is written on TAPE 19 fol
lowed by two blank card images. Within this group of card images the first
card contains the following information:
Columns 1 - 22
Columns 23 - 27
Columns 28 - 66
Columns 67 - 69
Columns 70 - 72
Columns 73 - 80
"Unique Sequence No. = "
Contains the unique sequence number for this group
of cards in an 15 format.
"ANALYSIS FOR DEPENDENT VARIABLE"
Contains the number of the dependent variable used
in this analysis in an 13 format. II 11
Contains the label of the dependent variable (given
in columns 67 - 69) in an A8 format.
The second card contains the T1 T LE (see Section II-A) in an 80 column al
phameric format. All remaining cards within this group of cards are
written in the following format with the explanation following:
F¢RMAT(lX,13,lX,12,lX,4(13,E15.8 ).
Columns 2 - 4
Columns 6 - 7
Columns 9 - 80
The number of independent variable included in the
fitted model, exlcuding the constant term.
A card number that is unique wi thin this group of
cards.
Contains 4 sets of values, each set' containing the
number of the independent variable include.d in the
fitted model followed by the estimate of the co
efficicient.
Note: The first coefficient on the first card with a group always
contains the constant term and its "number" is zero.
In order to access this information the user may use one of the fol
lowing procedures. The user could catalog TAPE 19 as a permanent file and
then read whatever information he desires from the permanent file o For
eXalDP1e, the following sequence would catalog TAPE 19.
•
LDSET,PRESET=INF. STEP,PL=40ooo. CATAL¢G,TAPEl9,THIS-FILE,CY=l,CN=DEFAULTPW. EXIT. EXIT.
If the user does not wish to catalog TAPE 19 as a permanent file but
desires punched cards, then the following sequence would be appropriate.
LDSET,PRESET=INF. STEP,PL=40000. REWIND,TAPEl9. C¢PYCF, TAPEl9,PUNCH. EXIT. EXIT.
Or if the user wished to have a print-out of the card images along
with the punched cards, then the following would suffice.
LDSET,PRESET=INF. STEP,PL=40000. REWIND, TAPE19 C¢PYCF,TAPE19,PUNCH. REWIND, TAPEl9 • C¢PYSBF,TAPEl9,¢UTPUT. EXIT. EXIT.
As mentioned in Section II - K, the user can obtain the original ob" servations and the predicted raw Y's when using RANK REGRESSION by re-
trieving them from TAPE 20.
When requesting RANK REGRESSION and PLOT RESIDUALS, the plots given
will be the ranks of the data and the predicted ranks or rank residuals
(whichever is appropriate). That is, none of the plots involve any of the
raw data. Therefore, if the user wishes to see plots of the original ob-A
servations versus the raw Y's, residuals, etc., then this must be done se
parate from the STEPWISE program.
To facilitate this type of further analysis, the STEPWISE program is A
set up to automatically write the original observations and the raw Y's on
TAPE 20 as a binary file. If the user wants to save these for further
21
22
analysis, then TAPE 20 should be cataloged as a permanent file as explained
in the previous part of this section.
The data file on TAPE 20 is created in the following manner. All re
cords are written in unformated binary code. The first record is the TITLE
card information, which is 80 columns of alphameric characters. The second
record is an integer variable that gives the number of observations. All
of the remaining records consist of two floating point numbers. The first
is the original observed Y-value and the second is the predicted raw value "
of Y (i.e., the raw y). The number of records of pairs of data will be
equal to the number of observations.
VI. DECK SETUP FOR STEPWISE
On the following page is an example of how to setup the cards for use
in running STEIWISE. Cards 1 - 15, 22 - 32, and 46 - 64 all begin in card
column one, while the Fortran statements in cards 16 - 21 start in column
7. Cards 35 - 45 are data cards in 5F6. 0 format. Card number 1 is the
JOB card and needs to be changed only to show the user's name and the box
number. Note also that due to large core memory requirements of STEIWISE,
that the extended core parameter must be set at 1200. Card number 2 must
contain the user's social security number, division number and charge num
ber. Cards 3 - 14 are control cards and will remain the same for all runs.
The only exception being the use of TAPE 19 and TAPE 20 control cards that
are explained in Section V. Cards 16 - 21 provide a dummy subroutine for
transformations with any desired transformations (see page 17) following card number 20. Cards 23 - 32 are parameter cards (see pages 2-17). If
the data is on cards, it will appear starting on card 33 with one vector
of observations per card( s), followed by the END OF DATA card. Cards
47 - 53 reprocess the data with a backward solution. Note the absence of
the LABEL card. The labels used in the first pass of the data will be
used for this second analysis. Cards 54 - 64 reprocess the data with a
stepwise solution and RANK regression.
Card No. DECK SETUP FOR STEIWISE
1 STE,TlO,EC1200. 2 ACC¢UNT ,S509423684, D1223 ,G13 .A0188ooo ,RP , KUNC . 3 FTN ,R=2 ,B=TRANZ • 4 REWIND, TRANZ • 5 ATTACH,¢BJECT,STEPWISE-RLI.
6 REWIND ~BJECT • 7 C¢PYL, BJECT, TRANZ, STEP. 8 ATTACH,SUBS,D1223-760G-LIBRARY. 9 ATTACH,FXDPLIB,FXDPLIB.
10 LIBRARY,SUBS,FXDPLIB. II LDSET,PRESET=INF. 12 STEP,PL=400oo. 13 EXIT. 14 EXIT. 15 (END ¢F REC¢RD -- MULTI-PUNCH 7 8 9 in C¢L 1). 16 SUBR¢UTINE TRANS (X) 17 C¢MM¢N/IMAN/NRAW,NTRANS,IDR¢P,IDUM,IRANK 18 DIMENSI¢N X(180) 19 C ANY TRANSF¢RMATI¢NS G¢ HERE 20 RETURN 21 END 22 (END ¢F REC¢RD -- MULTI-PUNCH 7 8 9 in C¢L 1) . 23 TITLE, EXAMPLE ¢F STEPWISE ¢Pl'ION (DATA FR¢M DRAPER AND SMITH,
p. 365 - 402) 24 DATA, 5 ,O,l. 25 LABEL(1)=Xl,X2,X3,X4,Y 26 M¢DEL,5=1+2+3+4. 27 STEPWISE,SIGIN=.05,SIG¢UT=.10 28 PRESS 29 ¢UTPUT,ALL 30 PL¢T RESIDUALS 31 END ¢F PARAMETERS 32 (5F6.0) . 33 7. 26. 6. 60. 78.5 34 1. 29. 15. 52. 74.3 35 ll. 56. 8. 20. 104.3 36 ll. 3l. 8. 47. 87.6 37 7. 52. 6. 33. 95·9 38 11. 55. 9. 22. 109.2 39 3. 7l. 17. 6. 102.7 40 1. 31. 22. 44. 72.5 41 2. 54. 18. 22. 93.1 42 2l. 47. 4. 26. 115.9 43 1. 40. 23. 34. 83.8 44 llo 66. 9· 12. 113.3 45 10. 68. 8. 12. 109.4 46 END ¢F DATA 47 TITLE,EXAMPLE ¢F BACKWARD ¢Pl'I¢N (DATA FR¢M DRAPER & SMITH,
48 p. 364 - 402) DATA,5,0,2.
49 ~DEL,5=1+2+3+4. 50 BACKWARD,SIG=.05 51 ¢UTPUT,C~RR,STEPS,RESIDUALS 52 PRESS 53 END ¢F PARAMETERS 54 TITLE,EXAMPLE ¢F RANK REGRESSI¢N WITH STEPWISE ¢Pl'I¢N (DATA FR¢M
DRAPER/SMITH) 55 DATA,5,0,2. 56 LABEL(l)=RANK(Xl) ,RANK(X2) ,RANK(X3) ,RANK (x4) ,RANK(Y)
23
24
57 MODEL,5=1+2+3+4. 58 STEPWISE,SIGIN=.05,SIGOUT=.10 59 PRESS 60 RANK REGRESSION 61 ¢UTPUT,C¢RR,STEPS,RESlDUALS 62 END ¢F PARAMETERS 63 (END ¢.F INF¢.RMATI¢.N -- MULTI-PUNCH 6 7 8 9 IN C¢L 1) 64 (END PF INF¢RMATI¢N -- MULTI-PUNCH 6 7 8 9 IN C¢L 1)
VII. DECK SETUP FOR STANDARDIZING DATA
Any of the variables may be standardized by using the Deck Setup on
the following 2 pages. Remarks on page 22 explain the control cards 1 -
15 • Cards 18 - 30 explain how to use the standardizing program. The user
should note that card 43 must be made to match the data, and that the DATA
parameter card must have a 2 as the third parameter (see page 2).
Card decks of this standardizing program are available from the authors
of this report.
The standardizing program can be replaced by any program that the
user would desire in terms of manipulating the input data set.
Card No.
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
DECK SETUP FOR STANDARDIZING DATA BEFORE RUNNING STEPWISE
STE,TlO,EC1200. ACC¢UNT,S509423684,D1223,G1 3,A0188000,RP,KUNC. FTN,R=2,B=STAN. LDSET,PRESET=ZER¢. STAN. FTN,R=2,B=TRANZ. REWIND, TRANZ • ATTACH,¢BJECT,STEPWISE-RLI. REWIND,c,iBJECT. C¢PYL,~BJECT,TRANZ,STEP. ATTACH, SUBS, D1223-7600-LIBRARY. ATTACH ,FXDPLIB ,FXDPLIB. LIBRARY,SUBS,FXDPLIB. LDSET,PRESET=INF. STEP,PL=40000. (END rjF REC9BD -:- MULTI-PUNCH 7 8 9 IN c¢L 1) C C THIS PR¢GRAM WILL STANDARDIZE ANY ¢F THE INPUT VARIABLES. F¢R C STANDARDIZING A CARD MUST BE PLACED IN FR¢NT ¢F THE DATA WITH THE C NUMBER ¢F VARIABLES (NY) RIGHT JUSTIFIED IN C¢LUMNS 1 - 3. THE CREST ¢F THIS CARD AND P¢SSIBLY C¢NTlNUING ¢NT¢ A SEC¢ND CARD DE-C PENDING ON THE NUMBER OF VARIABLES) WILL BE FILLED WITH EITHER ZER¢S C ¢R .¢NES. A 1 PUNCHED IN C¢r.UMN 4 WILL INDICATE THAT VARIABLE NUMBER ¢NEIS TO BE STANDARDIZED, WHILE A 0 PUNCHED IN THIS c¢wwr WILL INDICATE THAT VARIABLE NUMBER 1 IS N¢T T¢ BE STANDARDIZED. VAR-IABLE NUMBER 2 IS LIKEWISE FLAGGED IN C¢LUMN 5, VARIABLE NUMBER 3 IN
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
Cr,tLUMN" 6, ET C • C C THE FIRST DD1ENSI¢N ¢F THE DATA MATRIX SH¢'ULD BE AT LEAST ¢NE C GREATER THAN THE NUMBER ¢F ¢BSERVATI¢NS. WHILE THE SEC¢ND DIMENC SIr,tN MUST BE AT LEAST EQUAL T¢ NV. C
C
PR¢GRAM STAND(INPO'T,¢,UTPUT,TAPE10, TAPE 5 = INPUT) DIMENSI¢N IFLAG(101),DATA(10000,9) READ(5,105)NV,(IFLAG(J),J=1,NV)
105 F¢RMAT{I3,77I1724Il) NVl = NV - 1
C READ IN DATA. F¢BMAT 100 MUST C¢NFORM T¢ USERS DATA. C
1=1 1 READ(5,100)(DATA(I,J),J=1,NV)
IF(EOF(5))2,3 100 F¢RMAT(2X,7Fl0,5,4XF4.0)
3 C¢NTlNUE I = I + 1 G¢ T¢ 1
2 I = I -1 PRINT 102
102 F¢RMAT(*1MEANS AND ST. DEVS. r,tF VARIABLES THAT WERE STANDARD lIZED*,//,*VARNrj. MEAN ST.DEV.*)
D¢ 4 K = 1,NV IF(IFLAG(K).EQ.O) GO TO 8 XB = o. SD = o. D¢ 5 J = 1,1 XB = XB + DATA(J ,K)
5 SD = SD + DATA(J,K)**2 XB = XB/I SD = SQRT«SD - I*XB**2)/(I-l)) n¢ 6 J = 1,1
6 DATA(J,K) = (DATA(J,K) - XB)/SD PRINT 101,K,XB,SD
101 F~RMAT(*0*,I4,2F12.5) G¢ T¢ 4
8 PRINT 106,K 106 F¢BMAT(*0*,I4,* THIS VARIABLE WAS N¢T STANDARDlZED*)
4 C¢l\lTINUE D¢ 7 J = 1,1
7 WRITE (10)(DATA(J,K),K=1,NV) ENDFlLE 10 REWIND 10 CALL EXIT END
(END rjF RECPRD -- MULTI-PUNCH 7 8 9 IN C¢L 1)
PARAMETER CARD(S) F})R STANDARDIZING VARIABLES G¢ES HERE
FPLL¢wED BY DATA CARDS
25
26
81 82 83 (END cjF REC¢RD -- MULTI-PUNCH 7 8 9 IN COL 1) 84 SUBR¢VTINE TRANS(X) 85 CPWPN!JJYlAN!NRAW, NTRANS ,IDR¢' , IDUM , IRANK 86 DIMENSIPN X(180) 87 C ANY TRANSFPRMATI¢NS G¢ RERE 88 RETURN 89 END 90 (END OF REC¢RD -- MULTI-PUNCH 7 8 9 IN cr/L 1) 91 92 PARAMETER CARDS Gr/J RERE 93 94 DATA CARD MUST HAVE A 2 AS THE THIRD PARAMETER --95 96 THIS INDICATES DATA IS ¢N TAPE lO(DISK) 97 98 (END ¢F INFORMATI¢N -- MULTI-PUNCH 6 7 8 9 IN CPL 1) 99 (END h INF0RMATI¢N -- MULTI-PUNCH 6 7 8 9 IN C¢L 1)
VIII. BIBLIOGRAPHY
Allen, D. M. (1971). The Prediction Sum of Squares as a Criterion for Selecting Predictor Variables. Technical Report 23, University of Kentucky, Department of Statistics.
Iman, R. L. (1976). STEPWISE Regression. Technical Report, SAND76-0364, Sandia Laboratories, Albuquerque, NM.
lman, R. L. and Conover, W. J. (1979). The Use of the Rank Transform in Regression. Technometrics, 21, 499-509.
27
IX. EXAMPLE
28
'" '"
3ANC'IA La~:)R/\T()r IFS ~:a. :Tr PW!S:: c::GP,!- SSrQ~' pe:)~(tlt.4 <:. :nJ?TfSv )F )::'l". '}I:" STII.T:i5TIr: - .(.Q,'I~A:: S7.'lTf u~IVF;·.~ITV
ifTlE APPE,t.R.S oN EACH PAG-E
'1
7. flJ reB:::r:
/<r TITlF .[XlI"'I"l": Of" ~PWI5~ r:PT:~ (C~T4 ;:'!)O'1 1~1P~~ A~) SMIjH. r-. 365- .. C2)
5 '''''''''" ~/IR'''81.£5 ........ .e-- No. TRAl'I$foR fi\eD
QATll.,5,G,1. V~Rrj\8Le:5 {SUT ::)'F~) ... C~~)J ~ If-'PIIT (;1 .. 1'::r.< 1F 0.!l~.!l ... q:~s
MT" "" l. L 10£ ke:A%,) FRo~ C.AkDSAND ~1.I"4q~P D= JA~Il\~L~S ~::!I) IN 5Jr..VfD 0"" DISI(.
~r.. 0~ T~tI~SFO~~=) i~;rAlLfS
JATO. r;S>J~ITIO~ IS
lAP;::-l (1 1=)(1,)(;', '(~.)(4.v
MO':>;;L,C=1"2+~"'t.. ........... 5 IS TKf. De.PENbENT VAR.I~aE) 1,1, 3., If ~1tE INDEf'E"t>ENT ""R' .... !.1..E-S
3TFD~IS~tSIGI~=.~~.Sl~nUT=.1(
-4. SI(;NIP'ICAf\lCE: t..E~l- f-oR ENTeRING V"kl~lSt..es p~ ~ 3 S -. tA.LLIA"~"'.E. AND ¥lOT 'PReSS vI\L.\l.ES
OUTpU~. loll. --- OIA"T'P\A,-: 5 .... ""5 OF 5qAARtS MATR.I)C, c,oRR&I-ATION M~iT'Rj"t, fN'f'ER.SE- ~
to2..Rf:-LATlorJ ~ATJi:.l~, ST&1»S IN ~~5510N. AND R.E:SIt:JI.A./tL.5 PL(.T 1:'1"51')UI'II..<> ~ OtA"T"'f'tAT UNE J'P;.INfE'R "fLO"r.5
F~[ CF F!PA~~TF~~
FN "'~T (51:'f',.{; I
TITlr.~xllMOL~ OF ~T=~WI~I" O'TION (O~TA F~O~ )~llP~~ ll~O ~Hlr~, p. ~&5-~(?J
St~rI~ LllQ0QATn~II"S <~<~ STI"P~ISf ~E;R~SSIO~ C~<~ ~PO~ KANSAS STATl U~IVE~SITY
~- CH~CK ON G~
'2
?~.r['r'CCC
X'
~.CJ'...r:rr:
~T I')'lS~Ru,TiW X4
F=- r. .:J C':! (; 0 C
tID. ",A" J/!.T/I, I:-,PJT = @
s 1,\. ;[,0 (1 J (i
~r. ~~A~~FO~~EJ JqSE~VAr!J~S 13
~r. o~ L3SrQVATIJN3 OR1pP~J
(~TA7 CONT()L C/!'~JI
ISTAT CQ~T~)L CQ~:I
(STI'lT ~n~T~Jl C_~JI
ISTAT CJNT~)L CA~).
(STlIj rO'.ji~)L Cll~ :;,
(STAT cr~Ti)l C~~JI
('::TI'IT r:O"47~)L C!i<.j)
:J a:;:
w '"
Vr\~UgLF
N.I\~f
x, X2
" x. v
V'C.;::p,D;L: ~I\ 1'-'1 r;t .;
" s
TITL:~':")(r"'Dl! r~ ~,T,rWI<;F o'rro~ (DUll ::-;l,a"! JP~C':I,:" IlkJ ;~Iv'I:4, 1". ;';S-I..~?I ::.,\(.'
St'lW:':Il\ ln01n.:o"l"'o~!rs ST~PWI3' ~fGf;1:::;SII)N 000> =::O;')M l(!'\'~StS ST~Tf II'~T-Vfr.:~!n
~F.H \In::; I !I. N::'=' ST')" 'J Ell" '>, n. !'";' ~ • -; ... ,/ 7" 4F,1 c; 4
"'3.1"; 'I: ~ 11.7~~"
"'r.C C~"~ ClS.42l:1
11+.1=- O?h 2t.? .1t.l It 1. [?51'. 2~C .1:'7 "~f,. !l4
5.?"~23q
15.S5C,'1 6.ltO?1'~
1&.73B2 1;:;" 0 It:! 7
S.E.= ST· :DEli. 1.61'148 t.V. -= ST:.PE.V. If./OO %7 13 .'1 .. ..fn" 4. ~tr::"'l
1.77"-4f 4.1'-.4214 4.1723~
]C. ~ 2 • 31 :> 4" 4? :;">,, 7~ 15" 77
1~ O'\S~~IJL\~In~;3
Xi V' X'
'" V
v, X?
" X4
NA'"'F
"JAM'
~.~":"f.G?
TITLF~~\(A~Pl~ OF ST:FWn~ DPTTON l)II,fA ;;"~O'1 .)j;,!l~::CI, AND S'''I T"'l , P. 3615-402)
~A~DIA LA~r~AiC~IES <>c> ~T~owI;F ~F~R~sSIa~ <><> FRO~ KA~~AS STATE UI~I~E~SITY
0~ nF snull.::;l::s ""AT~ ?~11~+r2 2.qO~f.(~
3 -~.7?F-~+C'? -i.(,oS'5"+ ... q2~E+(.' 4 -2'. Qr-;o:-.cz -3.l41E+ ~. ",l"'r[+Cl 3.3r,?~+J~
S 7.7~cr+~2 2.2i~E+ -~.1~2~+G2 _?4",?c+;3 ?71oE+J3
NO.
" •
'2 " '" v
iITL~t~.4~PL~ Q~ 5T~fWISE O'TIO~ lo~rA ~~Q~ ~R~P~~ AND S~ITH. P. 36~-4L2)
S6!~OIA LApn~ATO~IF~ c.c. STEPwIS~ ~EGR~SSIO~ c.c. F~~~ KANSAS STATE u~r~E~srTY
E.·;;':lIlT!.o.0§~ ~ FoR. AlJ.. TKE: INPUT ~ATA
,~ ~(
'A Gf
1 • C ~ [1 C .~"'~F
-.i-'?41 1.r1HC -.13q2 1. r. C! L Ii
~~ ~------
l-h6HEST CORR. ~ XIf 15 FIt.ST IfilP&PENf,)E'NT VAltI.4&L.G .wPEO TO THe. /11\0»61-
• -. :?4~4 -.'H3C • ... ~ ... 7 .. "11 F ,.
N0. 2
" X? "
• u2 Q~ -.S"4"
'" •
1.0 .. ()!l
•
- - r,~.~~/~=~~-~(~-.9T~==Jo)~(-=.="="~/3~)::-·ts.~ - J ~( 1-(-.97Sofl (H-.8%./3)')
.f30~
= -.\B52 PAkT/AL CoRRELATIONS AFTER "'( ts A1>.D£l) TO MoDU •.
r'3S.lf " -.53'f7- (.0%,$)(-·8"3)
~('-(.o%.95)·) (/-{-.8Z1!i)l)
"S.lt = '" .95~8 .7307- (-.1YSY)(-.8tl?)
-I( 1- (-.2.~5'fi') (I-(-.S1/31~5
HI6HeST I'AIO'I,,," c,oRR. * )(, WILL ~ THE
5£00"0 ",,,,,,"&1£. ""PE.P M'T£R ~ ~
w I-'
x. ~ l .. r['Gq·o~
~J 1. c.
~AMf X t.
SOU~r':~
FfGR'::':SSIO'" PI=:S!I'lUII,L' H'IT.IlL
r!TL~.~X~~PL~ J~ ST~~wI~F O~~rON (Q~ro :{J~ JFAo~o AND S~Ir~, ~. _~~S-~L?l
~d~nrA LA8~~~TC~IF~ C~<~ ~T~PWliE ~EG~E3SrJ~ c.c> F~O~ K~~~AS STATE UNIJ~~~Irv
~~~~ OF CJ~~ ~AT~ ~ ~R. Ifll'DEf'ENl)ENT VAR"lA'8Le SEt..!;OCoT£D IN .5Te'P ,.
TITL~.C:)(Afo!O,LE OF ST~FWISf OPTtOfll (DoHA :i{1'1 JRA])ER AND SHIT"i, P. 3f:.~-l.oG2)
S~·\lu~A LIl30?ATO':'.IFS 'ST~DWI5C::: ':l.E:;='.::~SI:)'" c>-<> F"~OM I{t!.NSAS S'7A7 1 1)1>+lvr;"~In
oJ. F.
1 @
AOV T<\~l:: ~~~lY~IS O~ ~~r~f~srON ~n~ IA~I~~L~
rrl'l'l~ 11
O;;;~ "3
lA~1.A:q~2 ~-~11;~.~~6q2 ~!:. 1 ,),,: FI : 2 '- 271r;. 7~"Tl
IS 5.~1:,221
~FGi.. ~~SIr."l
(';]t: FFI(":ENT-; STa"'I['1d=<:CI7~D
0:--. ~ "THE nePEN'DeNT \lAAIABLE- IS INPIA.T V'ARIA&E: 6
:!?7q~52!l
srr;'41FlCAN:::
• t; eli IS
~ ...... 2 :::~U:TFS
• .. 6 73 "161:0 r. (; (' C'J
U~IQUE S~~U::N~~ tJIJ~~f~ FO~
u.s.!:> fO'" TESTING- 110: ~ q = 0
'E (~ .. - ii~):L THIS 15 THE- JST ANaiA TABLE WITMIN THIS klAlII OF STEPW,se.
P:;'i='~~~
.. " ""'IT FREE p.eT~, ,.E. ~ = ~q E(Yi-y):L
--. THe: f!,~)5 CAN 'lS.E U.&e.D "OR oRDaRl/'IIG- THe. RE-UTIVe IMl'O~TAJIIC.E .. \t~~AS THE .. suAL ~)6 cANNOT'ltE u.se.;o FOR oPJ)E'RING-
lU:ct-RESSION IEq.U"TION~ 9 = 117 . .5(.1'!J~- .73&''''81 * )eJ. . --" ~
TEIltM
:J ~ r.:
;).a, Gf
.Ill...;)~~
... .a, T S
(§V k"
W tv
T!7Lr.~XA~Plr ~~ ~T~rWT~r O?T[ON (onTO ~~r~ ~~~p~~ A~D S~'lr~. p. ~~S-~L2j , tI. ,":, r
Xi X4
1 .. O')'tH-OC
S~~CIO LArO~ATOclf~ c~c~ 5TrpWISr ~FG~=SSI1~ <><~ ~~o~ ~a~5A~ STArr U'.IVt'~~ITY
0~:>~o:- nF C)t:o;, 'HT~0 _ ~ foR. IN'DE::l"E-NOEtJ"f"" "'ARt~:'&L.~S 5El-f:c.-reb IN STep 1.
? .. El?F-Gl 1.r'~t.f+-Cr.
NO.
NlI"'E n
so u.:(; "r
Rfr:;~FSSI:)N
FcFSFJUAL T(1T AL
p ...... ? IS
VARJ,I\~L~
t,tIJIo1RF'<:
4
X, X4
NIP"'!;"
X4
TITl=t~X~~~LF. C~ STFPWlSr OPTIO~ (OATA ~tD~ JRA~E~ AND ~MIT~. p. 36S-4C?
SA~OIA LA~n~n-oolfS <~<. 5TrpWI5~ ~EG~~SSIO~ c>c> FOOM KANSAS STATE U~Ivr,SITY
:i.F.
z I C 12
A~V TA~L:
a~'nLYSrS OF ~~~~ESS!O~ ~O~ 111=lnlL~ ~---y
(TA~LF l'
~';; MS F ~:Lr;'lTFICAt!r::::
i' k,," 1. oj" 1 G 74.7?"'1?
1':1;? .... 5Cj5 7 .. (.1'62112
trS.1)25Q6
crt">. n:Y I
IS ?1.23q~
STa~"A~[Il~D
~fS~FS~IO~
?tl.UHl sse
T-TF 5T VAL UE~_
","'" :' C[LrTE3
H.!t031 .f7 .. r, -12.&f.'
THIS IS THE VAL!.AE TtfAT R'2.
...... Lt- 'De.c.~ASE "'T't:I 1 F ,clf. IS UIIZI'Jur 5;;:1U:::"lC~ ~JlJ"'n£;: FCi= TJ.iIS ANO'I/~
DP'::--:::S IS 121.
,. ~INAL. REG-U5SI0N ~U..ttTION: y
How 'DELETE-D FROM THe-
J190. "2.~(,
27'5·7"3'
103.0.,738 + I.Ll3'3~593 * ~. - _"'3'353,,3 * }I..,
:I!'l ~£-
.\L:t; A "!I, T 3
r\'~'t1 J, t.JOJC
51&NIFICANce LEI/£L. HAD "J'O BE .(. SI6.N = .05 FOR THE: 'llAAIA"&LE- 11:) E.NT£:-R
Ttt~ ~ODt:L
w w
T!TL~',EX"~OL~ rF ~Trr~ISF O~T!ON (DATn F~01 )PAP~~ "~0 S~Ir~. p. 1~~-4~2)
SA~0IA LA~o~nTG~!~~ <><. ST~P~I3E ~E~~rssto~ <.<. ~DO~ KANsas 5Ta~~ Ut.i~F~S!iY +---------+---------+---------+---------+---------t--- ______ t _________ + _________ + _________ + _________ + .11q4~+C-r.+ 0 +
.Q7Q6E+C 3+
l'R>;SS vAClAES -fOk 'Tf'E- TwO -
STE.PS
.S5(14EII-(I3+
.3~5I1,F.r~t
.121?~H3+
I YRE:-S5 :. 11",,0. ~ STEP I
Vft.£-ss ~ 12..1. FOR ~e-"P 2.
! o + _________ + _________ + _________ + _________ + _________ + ___ ------t _________ + _________ + _________ + _________ +
C. .btCCC~+~(, .1:?~~IJF+]t .t,JCn+Ul • 240..([:+C1 .~I);:~'~;;-+J1
5TFD NO.
;J 4 J '.
" ;:
. , c'
M , v'
y
"' 0 ~
CO
T 0 ~
" ;', ;; 7 0
" 0 0
'" " ~
" ~ " " '-'
-, n
" ~
~
34
+ ,
> ~ ... "'
, , '. , > , , , = , , c- , ,- + ~ , ~ , c, , , '"
, ~ , , z , ~ , y + , , , 0> , n' , "
, , , , · , · + , , 7 , 0 , :;; , , ,n , I,; , '"
, ,,. " ~ ,.' ,n , n " ~ <A
· V · V
,n H u c , " <V C C ~
~
~
~
~
C'
, , , , , , , , , + , , , , , , , , , + , , ,
, + , , , , , , , , ,
, +
I , I I I I I I , ... , I I I I I I I I ... I I I I I I I I I ... I , , I I t I I I ... I I I I I I I I I ...
N u + U' W .
~ I I I I , I I I I ... I I I I I I I I
"' o +
'" o ~
• I I -oj. I I I I I I I I I o
+0 , " ,~ , ~ , , , , , , + , , , , , , 0 , . , 'w +0 , 0
'" , ~
, 'N , 0 , . , 'n • 0 '0 , .' , , , , , , , • , , , , , I. 'w ,w • , '0 , 0 , , , , , , · , , , 'w • u 'w I '~ , " , , , , , , · , , , , , , , +
, " ,~
• • 0
• 0 "< 0 w · " c'
" ·
®
• '" o =
T o ,y
"
z o
~ · H ·
.--+111111111+111111111+111111111+111111111+111 1 1 1 1 1 1 + ~,
> ~
H' vi 1 " , " , >, H' Z, -0> , , "- , ,. + ~,
~,
"" , .n • ~,
'" 7' ~,
~. , ~ , n' n , ~, , . , V • · V •
• · 7. ,.,. H' "" "" t" 1 ~, ". w. ~ .
• , w. .n. H' 3' "-, , . ~ . ~. , .
• · • .. • · ~+
~.
H' ,y • ~. ~ , < , n , c , ~ , < • ~ ,
· · , , , , , , , • · • 1 1 1 1 1 1 • I
N o . Ie ~
'" '" '"
I I 1 1 1 I I I • I I • I I 1 I I I • N .,
w . 0, ~
• <, • <, , <,
''''
· , , , + , , , , , 0 , , . :~ · " , ., '''' ,~ , , · , -I , . . • • , , .N I ,~ , . , 1 ':, · ~ I .• ·
o , o , . 'w ,~
+ C_'
'"' ,~
'N , o , , , · · , , ,
N . " o Q
+ 0 ,¢ 0'" ,~
o , o , , , · o , , o , o
o + , " '0 +0 I I I I I 1 1 I I. 0
• w .-
35
w
'"
T!TL~.~xa~PlE O~ ST=PWI~~ OPTION ID~TA =~O~ O~~PE~ AND S~IT~, p. ~6~-4~2l
Sft'lrIA LA~0~ATORIfS <.c. STE~WIS[ ~FGR~S5ION <><> FRO~ KA~SAS STATE ~'JIV[;SITY +---------+---------+---------+---------+---------+---------t----------+---------+---------+---------+
.~77~~+Ol •
• 2li1tEt-Clt-
.ZS?TF.+CC+
R
f S T o U ~
L -.tC;(!f:,E+CU
-.'3265E+Olt
-.50?3E+Cl+ +--- ______ + _________ t- _________ + _________ + _________ + ___ -----_+ _________ + _________ + _________ + _________ +
• HIt! OOr-tel .3Lt(Cr:E"~1 .C;~OQCE+01 .320[]Of+Ol .1C6C[]!:+e2 .130GtE+[j2
® OR,:DE:R IN wtHctt 'DATA WA5 U-AD IN
:JAGI: 1!
w .....
TITL~t~tA~PLE o~ ST~fWISE C'TIO~ (O~rA F~O~ )R~'E~ a~o SMITH, 0. 3b5-~C2)
s~~~cu LA!3Q~ATrR.IF:S no. STE"PWI3t: ~~GF:::SSla ... <,.<-,. FROM KANSfl3 ST~TE: UN!IJEi<.SITV
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+ .1:77!)E+tH+
. ?1J11~+01+
.2'S2TE+OO·
R E S I o U
• l
-.lS(16E"!.'1+
-.3?fSE"('1+
-.C;!]2'~E+C1+
•
..
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+ . 7?1!I12'f+ iJ7 • ~1 ~o4r .. 07 • C!O':i17r+1l 7 • 93 ... 69E. HH~ .1J ~4ZE +C 3 .1173F+C'!
~ 1"~'PICTION5 lo\S,,.,G- f-lNAt. ,r:et:r-~S510N eql,l,.ATloN
;) 1\ G r 13
w 00
Tlrl~.~XA~PLE OF STEPwISE O'!!ON (3AfA FiO~ ~F~~S~ AN~ ~~Ir~, P. 3SS-~~?1
<"A'lurA LA=lC::o,nCR.IFS <><> sT~pwnl'" ~f:;R:::;SIJ\I <><')0 F~C'" ,<ANSAS STAir U'!.IlIf'-<,ITY
+---------~---------+---------+---------+---------+---------+---------+---------+---------+---------+ ."77CE+C1+
.2C11E+01+
_ .
• 21527£+no+
• E S I o U
• L
-.lC;CIS;:..ol+
-. '3?f-5f.+O 1+
-.5(23E+(1·-
..
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+ .tDDC~F+r1 .50CG~E+01 .qa~'[~+nl .13anDf+02 .1700([+J2 .2100G~+CZ
X1
:>~ Gf: 1.:'
" ," n d u n
o
z c
.. ~ ,. H
· 'd + I I I 1 1 I I I 1+11'1"11'+1'1111,11+111",,11+111111111+ ,J
+ , , , , , , , ~,
~,
'H ~, , , ~,
H' , ::: , c· + ~, , , "', , '" , ~,
"', 7' ~,
". , " ~, .. , " , , . , V , . , V • , · z, n' H' "', '" , III I
"" ". "" ~, , , IL' • ," , H' '" ~, l,' + .-, V' , , , , , , V , , Ul" II~,
H' ~, W, 0-. , ~,
n' , c, a, ~ . -', , ~,
H' n' 7' ~,
'" , · , I I I , I , I I · ... I I 1 I I I I I I + I I I I I I I I I + I I I I I I I 1 ... , I I , , I I I I + f I I , I I I I ~ ~ ~ ~ ~
~ ~ . . !.oJ W ... .-N Q If'I uW(llHCl::>~...Jlf'I N ~
w • '" U' "N ,., I
I
•
• ~ L · '" ~ N
" ... I
.~
1= IV 1-<> I I I I I , · I , I I I I , . I~ Iv
• c. , '" I a
I ~ I I I , I I + I I I I , N I 0
I • I I ~ ." , s , ~. I ,., I , I , I I
• I I I I I ,
'; I I 4 I 0
'0 , '" , ...
I N I I I I , I · I , , , , N I , · I U I L .~
I oc , U I , , , I I I
• , I I I , ~ I I · I ~ Ie +~
Q
" "-
.. X
39
." o
TITL~.~XAM~Lr n~ STr~WI~f 03TTON (DATA =~01 J~~~~~ AND S~lr~. p. 36S-4G2) ).1.\ (" ~ 1.:'
S{\\lrJI .... l"RO~!',TCc:r~S <>0 STfTWI5E.. ~~GRESSION <><-.. FROM ~A"1SAS STil,1: 'Havf~,SITY
~"~LF nF r~SIOUA~S FO~ ~AilA3_~S C;;----y
TIt.<:::- (O~SI'":;:VE" If A LU::,.) (i"c;;-JI,';'-) t~' Jb /GFSI)" Ai:) 71\. ~,~ f: C / 7?'.P1Q 2.1nL1:: 74. ;:-CiJ fJ 72.611' LS!VI25 iOI.t.30C r- 13El. n'?~
('/- y) -2.1'S7~?
Y '17.f.;~:J= Y q:J~~"11 -2.ltP;11r gS • C!D 0 ~ Q?31?o ? q "I?" 3f1 1c.C!.20] 135.4.3) 3. 7 7"'. Of, 1 'J2 .7(;.J 103.7~4 -1.03353 7? 5{11,J a 77.~?lft -5.02~38
q~.100U g2 •• 7J'! ~S2gof\2
H 115.QC.j HT.37!o -1.47;>:71 11 fl3. FIr::! Q C R3.~6?g .137001)3 l' lU.3QD 111. ;;~g t.73G5'2 U H<:I .. 4.0C 110.113 -. ng5z1
E#D OF ANAI..Y5fS FeIR 1ST P/rSS oN TttE- vATA.
SANDt A LA30~""TO~I~S <> STfPWIS~ ~SGPFSS!ON ppnG~A~ <> :OJRTESY JF J~~T. OF STATISTICS - (A~!SAS STAT~. U~IVERSIT¥
TITLE,EXA"1!lLF OF (£A"CI(WA c>r:, CPT!O!)(OIloTA FROM )RA.:tS~ AN:) S"11T"I, F. 365-<+(2)
NOTe- THE t.,,"8e1.5
cAJU) '5 ~I~ING- : l-ABeLS WII-l- CliltRY ou'E'R FROM rs-r "RIJ."".
OATA.5.G.Z. .I
'D!rTA WAS 5J4.YED oN DISK ON TltE- r,,,"ST
'PASS.
"10:J r L.:.""t+;?+3-1-4 5
IN?UT CH~C< JF PA~A~~r~~s
~UMRFP ~= ~A~I~~lES ~~A] IN
NU. nF T~A~SFOR~~~ #A~IA.~lES
QtTI OIS:tOSITION IS
GACKWAr:'C.SIr;=.C5 -;JI> L5;"&'I- foOlit1>el-E:.TINt;. "'~;~1A'BL.~S
ClUF'tJj,r1~;,SH.PS,?F.S!CUnL~. -If. NO ,N'IellSe. NO Stu\5 0"- sq~5 MA~IJ{
Pk~35
fNO OF PtJ,j:"A"'=:T 17P S
(STAT CONT~:Il. C:fI,~J'
(STAT r.')~T:(Jl Cd ~ j.
'STAT r:n~~":' ~)L G~-(J)
(STH CON";c,JL ell:(::; )
(STAT r;!1'l-;-iJL ca.~J)
{STA';' S'1~n~)L Gil, -( ~)
... ....
Xi
7.000£000
YU!A8L£ NAME
Xl ~2 .3· •• y
'Z
'T!TLE,EX.al'lPL£ 01= ::!A,Ct(WA~O OllTION eona f'~O" :lP:l:J::~ ANO SMITI-4. p~ 3f..5-'+021
SANOIA LAAO'::ATOFtfS c:»-c,. STf.PWI3iE ~~:;R~iSloN <:.ou, F~Ot.ll KAN'SA3 Sf ATC UN.I"E~SITV
"
I~IDUT ~~ECK ON 081A
FI~ST O~S~w'ATION '. • 5
;:>6.C~()COC 6. ,)tjOu OGD M.lIJDODO ,'.;;'BDDDa
NO. RAW !JATA I.,.PJT l-:!
NO. TRAt'SFOR"'IEl llJ'SERIt'ATI')N5 13
~n. Of' rBSEeyATIJN$ OROPPE)
TITLE,:XAHPLE OF QlCKMARO O'TION (DATA ~~~ ~P'~=_ AND S~IT~. p. J&~-_G2'
SA~OIA lIB~AT~QI~~ <~<. STEPwtiE ~EG~ESSIO~ ~.<. F~OM KANSAS STATE UNlvl~SITV
VAPIABlF IIIUI49F.;;;> "'FAN VA~UN:E 511'). 'lEV. STD. EPf.
t 7 ."f;1r::;:~ 3"-.f.O'Z6 5.e"z3Q 1-631"6 2 ItS.tS] ~ Z1o.2.1_1 15. r;.61l q It.!l')"!! 3 It.16Q? 41.Jl256 fJ .... U51J 1.776ft€:
• 31J.1H3 C 211;0.1&7 "1;.13S2 4,£,1.234
• ~t;.42,H 22~.314 15.0457 .It.t723R
13 OBSEltflTIONS
K1 12 n ... ,
N/l"'E
1 2 1
• 5
NO.
TJTlE.EX'"~l£ OF RACKMA~O ~TION CO_1a Ct01 >lIPER ANO $MIT". p. 3~~-.uZ'
s.a"oIA lA~o~aTORtES <~<~ STEPWISE ~E:;R£SSION <.<~ FR:O,,", K_N$AS STIlTF uulv~RSlh
1.COOO .lZ..:£,
~."241 -.2lt5"
."n01
1.001:[ -.13~2 -. q-r:aO
.Mt:,3
2
tl:1 )(2 J( 11
1. aooo .O~~
-.5347
(Rnuw.:·ING f •• O~ IN ROW» .....
CORRELaTI~" ~ATRIX
'"
1.0000 -.A2'13
• 1. GOOD
5
• FU\6- lOIDIC/tTlol6- -V -..~ INPUT .... ~5. at:: MATllI. 15 ~ A HESISMoE WlLL." ,...III"ft=.'D AIIO f'Itoc.aSSINfr $~'P •
"l:;E"
;)flGE
~. V.
r 8 .. 8~ 32.31 i4.42 55. r3 15 .. 77
;)~GF
... '"
rITLE.~XAMPLF OF 8ACKWAFD O~T[ON (JATA ~~O~ JRA3~~ AND S~ITH, p. 3~5-4C21 '0. ';[
SANOIA LA~O~~TO~TES <~<~ STEPwISE ~fGRESSrO~ <.<~ FROM K~NSAS STAir Ult!VEqSITY
sou~r:E" D. F.
AOV TA"L~ 'NALYS!S or ~EG~ES~IO~ FOR VA~InlL~ 5---'
(TAr'lLF U
ss "S • P.FG~ESSION
RESIDU~L TnTAL
4 2en7. ~gg4 bIJ6.Q,,.Sr, 5.qI\2~t;r.q
111.47917 @ ~7. 86363q
12~ R··2 IS .9"2~f\ I"TF.~CE~T IS f2.-'+0536g STI~O.~D E~qaR Of I~TE~r.E~T IS 7C.0710
V4RIABL'S" "I !IHAqL'= RE GRfSSIO"l NU"aER NA"'IE tDEfFrrIENTS
1 u 1.5511 C26 Z <z .5101b75R 3 n .1f}1QOQ4C 4 •• -.14i+O(;'10~
UNIQUE S~OU~NCF NU~BFo FO~ 1415 ~~nvA
pp~ss IS 110.
Sl'A~O"I;!(;IlI=;O C'EGR.FSSION
t';JEFFICIENTS .6G6512' .5~770o
.0431<=10 -.1F-O?f'.7
103
PAR-TIAl sse
Z5.~50) 2.g7?5
.10Q1
.2'-'+1'0
\ i-TEST VALUES B'D.F.
2.01327 .7(l4g .135C
".2032
SIGNIFICANC~
• DC: 0
1=:··2 ALlI-tA DELETE ~HS
• 97 2~ .J7!S .9"l13 .'J 13 .9623 Gr~6:D .9~ 23 • 1t42
SINCE IIh:t VIiIItIML£S wa_ fORt.£-1) TO
ST,.y IN T",ec ~L, Ttte. VAR.IA'BL.-=
wITH ~ .LMi"&5T .Q > 51.· '=- .05 ."u.. 'II£:. 'DE1SiT&D.
.... w
T!TL.~t::)(AMPlE OF' R!I.''';1(WII;O:O O~TION (DATil. C"~O'1 [)R-AtJ::~ AND St"'ITfi'l P • .365-.. C2)
SA~OU lA.a.Oi;l,A'TO:;"IfS STEPWISE ~EGR~SSIO~ <.<> FRO~ KA.NSA.S STATE U~IvEq~IT~
SOU~CF O~ F ~
AOV TBL:: A~ALYSIS O~ REGRESSION ~O~ VAQIA1L:: ~---~
(TADlf 11
55 "' F
REGRESSION 3 2667.7qO~ 8~q.Z'lr.1j 1o:;.~31&" RESnU~L ~ 47.q1?72q TOTAL 12"--- ("115.7631
R;··Z IS .Q"234 I~ER:r.EPT IS 11.64A307 STA~DARO ER~O~ ~F T~TERCEPT IS 14.1'+24
VAttIABL~ ,aIHASLE QEGPfS'SH'I~
lrfUfIfeER NA""E COFFFlr:IENTS
1 U 1.c.51Q31\O Z '2 .. lt1F.l0Q16 ~ •• -.23fl'i4iJ22
5.3303033
STANDA~OI!E(!
F.Er;RE~'SIaN r::)~FFICIC"NTS
.'30&1137
.430414 -O'''61tlP3
UNIQUE SfJ~=~r.f NU~REP FoR THIS ANOVA 104
P~ESS IS ~5.~
PiI,'(TIA.L SSQ
8Z(!.9[74 2&.71Ql,.
9.9318
~ T-TfST VALUES} ~1).F.
12.4tH 2.2418
-1.36SG
SIGNlFICAIiC[
.0 CO 0
R-" 2 OElfTES
• 6e 01 .9725 .q787
TITL~t:XA"PLE OF ~ACKWA~O O'TION (DAT' ~~O~ ~~"E~ _NO S~Ir~. ~. ~65-4ij2J
SANDIA LA~O~BTO~IfS <><~ STEPwISE ~EGREiSIO~ <~<. FROM KANSAS STBTf UNIVE~SIiY
SOU~CE ·0. F.
lOV TAll:: ANAl~SIS OF ~EGPESSION :O~ VA~tA~L; ;--·f
C"!'Aqtc:' l'
SS HS F
"'EGD.ES'SION ? 2&57. ~'5AF.. 1328.,'~2~1 ZZIJ.S037G ~ESIDU~L ® ~1. 904483 TOTAL 12"-- "7115.11;31
R"'.Z IS .1J1"ft~ INTERCfPT IS 5Z.'577~~q STaNO.R~ E~~~~ O~ INTERCEPT IS ?Z~6t.,
VARIaBLE NUM8ER.
1 Z
fAf(IA1!IL": PEGRfSSIO~ NA"'~ COEFFICIENrs
(1 1.46~30~7
'Z .66225Q4q
1j.1qO"''''~3
'STINDAIH 17EO P.~G=ESSHIN
r.OE'FFICIE"TS • '574137 .6~5017
U~IQUE S~O~~~CF ~U~BE~ FO~ T~IS 4~nWA 10~
PARTIA.L sse
848.431~
lZQ1.1~23
\ T-TEST} vALUES IO'D..F.
1?.t047 14.4 .. 2"
SIGNIFICANCE
• lie::. 0
P··2 OElfTES
.f..&63
.15331
~~ GE
Al;;l-tA Ha.T S
.0001
~ ~
NEXT VAfUA'8LE~ 'Be. 'OCt.eTE'P
:»4GE
4L!Jo-tA I-'U S
~ ~
'&OT1+ (. 511i- ~ .05
~ NO ~ 'DEtETlONS
"" ...
TITLE,EXANPl~ O~ qftCK~a~O O~TION I~ATA FiO~ )FA~ER ANO S~IT~, p. 3f5-4G21
SANDIA LA~o~nTO~!FS <~<~ STEPWISE ~EG~E5SIO~ <><> FRO~ KANSAS ~TAT[ ~JNIV[R~IT.
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+ .1103E+OH (£>
.1DS3E+O:H
.1003E+03+
p
R E S S
.953SE+OZ+
,Q03;E+02+
STE"P I .PI "&lCK:WA1a'>
o Str'P 3
+
•
• ~5~SE+t2. e s-n:P 2. +
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+ D. .8COOOE*00 .1E1D[HJ~+:]1 .Z!tQIJO'E+01 .32/JOOf+lj! .ttCJOO;-+Ol
STfP NO.
:l&, G t
... '"
TITL=.~~nHPLr. O~ ~ACKWARD O~TION ()~TA F~O~ )P.A~~~ AN~ S~ITH. p. 3~5-4(2)
SA~OIA lABO~ATO~IES <~<~ ST~PWI5E ~fG~ESSI3~ c~c» F~OM KANSAS STATF U~IVE~SITY
TIMI: 1 2 , .. , 6 7
• • if 11 12
"
TA8L~ OF ~ESIOUALS FaR vA~IAa~~5
OgSEQV;:J VAI.UE' 7 ... SC IJ Q T~.30!)1]
1!Pla3!lO ~7.6C.JC
QS.qODO 1"Q.200 102.70n 72.C;000 q~.1!lI!O
11c; .9~e 8~.PIJOG
113.~eG
10Q.4GI}
PREC1I:EJ II\.:"JE M. ~ 7",1) Tl.~;;:n
t.]5. 'H5 ~q.~;~5
97.Z:}25 105.llij.~
10"'.1i02 74.575_ ql.?755 l1lt.'?'3-:l eIJ.:;'JC;7 H~.43r
H2.zcH
s----¥
END of' AfIIAL.Y5rs FO~ 2.Nz:, 'P"'~ oN itt£:- "DATA
PPSIJIJ AL -1.51'+OL
1.ij4q[JA -i.5llt7:' -1.&584A -1.39251
I., aOl.t751 -1.30205 -2.07542 1.82451 1.36246 3.26433 .1'1&2756
-2."9::';44
l'A:;E
SA~DIA LIBO~ATOF.lfS c» STfPWIS~ ~Er,RfSSION PROGfAM c» ;OJRTESY Jf l~PT. OF STATI~TICS - ~ANSAS STATE UNIVf~SITV
TITLE,E.)fA"1:tL:;: OF~II,NI( qE'G~fSSIOB>I'tIfH THE ST::PIIIIS!: OPTIIJNIDATA F?Of1 npAP~r::IS~J'71.1
OAT A'I'5.C.? J~PUT CH~C( JF D.~A~ET=~S
NU~~€R O~ '.~I.~lES ~~~~ [N
Ne. of T~A~SFO~~~J #A~I~BLfS
JATA CIs:tnSITION IS
LAP[LC11=Ii't\NI{1X1'.RANKfY2t,~A"'I(O(JJ,roA"Ii(C)(f"'.~ANl{nl -7 T'tfE:SE LASE:LS WIu..kePLAG~
'1"1te: ~L!i 1A5." oN THE:. MnD~L.~=1+2+~+4. ~V'IOU.S ~ l"A~SE:5_
STfPWISf,SIGIN=.aS,SIGOUT=.tO
PC['Ss
F-ANI( RFGQo~SSInt.l ---iI' A 'JU:Go'll:IiSSION ANAL1srS WIt..L. '!te: l"EJ:.~""&c1:> ON "f1t£ JI."N"-S Ot=- T1tE. "DATA..
OUTPUT,('D~Hi,STFPS,t;ESIOUALS ........ SH'\E ou.Tf'MoT A'5 2.HD 'P~5.
~~r OF PA~AM~T~PS
(STAT ~~~T~Jl C~~]I
(STAT CONT~Jl C4(J,
(STAT CONliJL c~tJ'
(STAT CO~T~Jl CAIJ,
(STAT C~~T<JL C4(Jt
(STAT ~1~T~JL C4~J)
(STAT CQ~T<)l CA~JI
(STAT ca~TIJl C~,JI
..,. '"
TTTlE.FXAMPlE OF ~ANK RfG~E~S!ON ~IT~ Ti~ STEPWISE OPTI~~(JATA F~C~ C~A~r~/SMIT4 ;)~ ~E
SANDIA lABOPATO~IfS <~<> STEPWISE ~fGRSSSIO~ <~<~ ~RO~ KANSAS STATf UNIVfF.SITY
INPUT C~ECK ON O~TA
liOTE: T1-I£.~e AfI.E:. Ttf£ RAfIIt::"S A55IGoNED T'O TtU: f",R:.r oM£'It.VATrON.
R:A""I«()(1) ~AtJK(X2' CFIill OP:S::PIATION)~ AA,."s All.&. A,5516"E1) 'B.)' VARIASLE- fllWII1!o£:-R.
PANK(X~' PA~~(X~j ~A~<'.'
2 5
e ODOD]) 1.C~(;iJ[jGO 2.5GOOOOQ l::ri.]OOOOD ~.OODOt)QO
'" ~1'AN1" ftJi,5I.LLTI"'. PllO'"
"£-'0 IHPIo\T 08-
~ .. 'llATIONS.
VARIA9lE' NAME
PANII{('(11 RANI{(xZl RANI(OC~"
R,4NI«(X,,"' ~ANI«(vt
yARIII ~Lf NU"l~£P
1 2 ~
• 5
G' oqsnwniO§)
~ANI«(XlI 1 1.cooe R:AI\IK(X?' 2 .3:')01 ~ANI((x3' 3 -.71AG ~a"4I«(X'+1 , -.:u 20 ~ANI«(V' < .1%2'
"0. NA"4E ~ANI(f)(l )
NO. p~w ~ATA I~PJT l'
N~. T~ANs~n~Mf~ JB~ERVArIJNS 13
~O. D~ OBSEPVArY)NS J~OPPSO
TITl=.~XA~~Lf "F ffA~K R~G~ESSIO~ wIr~ T~~ STEPWISE OPTION{~ATA FROM ORAPER'S~ITH
SANDIA LABO~ATORIES <.<~ STEPWI3E ~f;~EssrON c.c. FRO~ KANSAS STATF UNI~FRSITY
'1EAN
7.COCOO} 7. Ilt (I~ G 7.0liGGO ?flODao ) 7.00GCG
Iwel<A<>E ....... 11' /r!i6,&NED FOR 1'3
O&&&1t.VAT,ON6
157.
yAPIAN:;E
1'+.5"17 117.1250 14.qH7 15.C8J3 1".1657
STO. [lEV.
!. '113]5 3.88909 1.662~1 3.88373 3.e9ft.ltlt
.l-~ WJn+oUT TJE-D ~VA.",oNS '1"M-f;. \Ml.u..£s
ALL at:"1"HE ~Me. WITK,N /lr cm.u.",'"
STD. E~~.
1.~5763
1.1]7864 1.[J7116 1.e7715 1. (;~012
'" ,'" ""TKese COUlMfIIS WILL.
TITLEt=XAMQLf or RANK ~fG~ESS[ON MtT~ T~~ 5r~DNI5E QPTIryN()ATA F~OH O~AP~KIS~ITH
SA~DIA ~A~O~ATORlfS STEPWIiE ~E:;RESSION <.U' F~OM KANSAS STATE. mnVE=<1SITV
COR~ElATIO~ ~ATPIX
} t.couo flANk. CDR1t.E:L-~TIONS .O5?7 1.01] 00
-.QgC3 -.OAoC6 1.00ro .737~ -. 44 R~ -.71721 1.Hon
4 5
F-ANI( O(;?I ~llfo.{I(O:.!) I': ,IHH< O( '+, ~I\!IIi(n.
:JI1.GE
::.1/.
'+. It' 5.% 5.17 . 5.4' 5.63
J
:'G, Gf.
... ...,
TITLE. EXAMPLE OF RANK REGRESSION WITH THE STEPWISE OPTION(DATA FROM DRAPER/SMITH PAGE "
SANDIA l~BORATORIES <>(> ST[PWISE REGRESSICN <><> FROM KANSAS STATE UNIVERSITY
SOURCE
REGRESSION RESIDUAL TOTAL
R**2 IS .62600
D.F.
11 12
IWTERCEPT IS 1.3438J~5
STANDARD ERROR OF INTERCEPT IS
AOV TABLE ANALYSIS OF REGRESSION FOR VARIABLE 5---RANK(Y)
(TABLE 1)
SS
113.'1.3123 68.368768 Ul2.00000
1.-\8783
HS
113.-33123 6.1880698
F
18.411433
SIGNIFICANCE
.0013
VARIABLE NU"BER
VARIABLE NAME
REGRESSION COEFfICIENTS
STANDARDIZED REGRESSION
COEFFICIENTS .79119<;1
PARTIAL SSQ
T-TEST VALUES
R**2 DELETES
ALPHA HATS
RANKCX1) .80802292 113.9312 1\.2'90, 0.0000 .0020
UNIQUE SEQUENCE NUMBER FOR THIS ANOVA 106
RANK fIT GIVES A RAW DATA NOR"ALIZ£D R •• 2 .64292846
COEFFICIENT OF INTERPOLATION
PRESS 1'; 'iD.l~l
.68'+O~359E-Ol ~ THE U.5ER.. SHOU,LD NOTE THIlroT Tttt.5 R .. - 2. VALU.E IS c~u:.~ATED /16
1: (Yj- 1>"1. 2"(9,- n"+ ~C'Ii-Yi)'L
NCD S/NCii THE Vi VALUES IIAve; BEEN OJroO\II'£1) THRQ .... GtI INTERPOLATION T\iIS
CALCCALATION WI~L JIOT fG,C,e.SSAfULY ~T IN T-f+£ S-'ME R ..... 2, VALlA.6 AS iME
.. 5 ........ LAL.C.IA.L"TION
~(9i-y)1
LCY;- Y)7..
S/NC£ THE Yi _" ...- ...". 111£. J..E.A5I'" "" ..... ""'5 ESTIMATES, THE q ....... ,TIES E.(9i - 'if AN!> "2. 't';" ",,"-~('f;-~) ,,1\& 11m" _L. T~ IS, THE _ .... _-Vf\01> .... C.T5 <. ('Ii-)';) ('I,-y) IS
~-z.tERo.. IF TttE. su.M oF CR055-~TS l5 JEA.~ ~1IIP, T'1t£N T1fE (.OE:"ICUiNT OF IN1"ER":
1'OI..ImON "'ILL ilia fIEAR 'UiRO.. IF IT ts LA'R6-E, '11tEfil THE: ~H"iC.IENT WILL BE. M&A.'R ONE ~
ge. 'TME Te'IIT fICIR. ~ 'PISClA'SS,O,,", ..
.... '"
TITLE,EXAMPlE OF RAN~ REGRESSION .ITH THE STEP~ISE OPTION(OATA FROM DRAPER/SMITH
3ANDIA LAaORAiORItS <)<) STEPWISE REGR[$SION ()() FROM KANSAS STATE U~IVERSITY
AD" TABLE ANALTSIS Of REGRESSION FOR VARIABLE 3---RANK(Y'
(TABLE .. )
;iOURe€.. D.F .. ~s MS
REGRESSION RESIDUAL TOTAL
2 10 12
H2.92283 1~.07117'j
182.o000a
81.4EU13 1 .. 9::1111 15
·-2 IS .. d9;;)18 HERC.Et'T IS 6.5099 jtl~ :ANOARD ERROR CF INTERCEPT I~ 1.31211t
\tARIABlE NUfl:SErI.
VARIAt:\LE NAME
REGRESSION COt:FFI':IENTS
RANK(xlJ .621~4180
RANK(~4J-.~~1~4162
STANDARDIZED REGRESSION
COEFFICIENTS .EiOf!~Ol
-.550.2'
UNIQUE SEQU~NCE NUMUER fOR. THIS ANOVA 107
RA~K fIT uIVES A RA~ vATA NJ~HALll£D R*.2 .52844314
COEFFICIENT OF I~TE~POLATIO~ .l'+5'sanOOE-Ol
FRE-S8 I.::; 35.'i~O
PARTIAL SSO
59.98.22 48 .. 9916
F
42.100384
T-TEST IJALU£S
5.601.:5 -5.06113
SIGNIFICANCE
.0000
R**Z DELETES
.5aSa
.6260
PAGE S
ALPHA HATS
.0007
.0011
.. \D
TITL~.€XAMPLE OF R~NK REGREiSION ~IT1 TH~ Sr~FWIiE OPTIO~().T6 r~o~ DRAPfk/SMITH
SANDIA l~80qATOPI£S C~<~ STEPWliE ~E~R~SSIO~ <~<~ FROM KANSAS STATE UN!VE~SIT¥
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+ .CJ035E+OZ· +
.YQ45£+02+
.6P!.C;;5E+OZ+ p P E S S
.51f]5E+O~+
+
.4G7~E+D2+ •
. 35~&f+O~+ +
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+ (). .F..CCDDF+OIl .120Cf!f+U1 .l~(I!10E+Ol ·.24DIHlE+Ol .3001l0~+Ol
STEP NO ...
;Ja :it:
'" o
TIHF 1 ? 3 4 5 6 7
" 9 10 11 12 U
~IlNI( OF .. 3.e 2.r q.O 5.0 T.C
1 D.C '.C 1.0 0.0
13.0 4.0
12.( 11. [
TITl=,EXAMPlE OF ~ANK RfGRElSION ~IT~ r~~ ST~PWI5E OPTION(JlTA FROM LRAPE~/SHITH ~n rtE
TAALE OF FESIOUA~S F~~ VA~I4a~!s Ij----PANK tv)
PREDICT~o ~ANK OF Y f ANI( P,fSIJUAl iAW r ~ "..RAW YHAT RAW ~tSHlUl,.
~. ~7qq~ -.3rqg7g r 5 • 5 DOC "'n&rSE. TWO 80.S13Q -2.013B 1.1'345'3 • "65ft1? 7 .... HI00 r,Dl.I.A."'NS c.Atl 72.7422 1.5577:;; 11)."300 -1.~3'D02 104.300 ali; St\VE.'D oN 109.366 -5.066D~
6.QflQ23 -1.')&923 ~7.&DaQ 1)1St( 10 ~;:~~~~ -S.213H
£o,.1:-:71)9 .. 111;2;'1] Q5.9.00 2.411t.~
10.0027 -.270,,1r.E-t12 109.200 1[Q.2Cl -.5417H='-Co! <:I.a6~17 -1.:16617 102 .. 700 104.624 -1.~Z4?1
2. 2~167 -1.'n:767 72.5000 75.2982 -2.79"2l ">.Q6269 .37313I)E-,:.1 <n.l00G 92.89 .. 8 .. 2052~1 10.72'133 2.27015 115.900 10q-3~6 6.'55415 2.76q21 1.Z1J n t!3.80DO 71~&llt7 o.1~S3J
1 t. 6513 .34266060 113.31l1J 111.Q6't 1.1'3f)!tj 1~.1035 • e~6:; 21 11lQ.400 109.221 .t7930tt
~~5I)JAl SU" OF $QU~~ES O~ ~AW DOTA = 200.Ui2 EQ.HIIIIATICN ON FND OF FILE - U"IT '--- -..........- ~
5U ,"""til ~ t,ONO"Ii-?. ll'!l"l'3l) i'"OIt.. MI E?Jt'PL-ItNATION o4=
~., T1t£:SE -a.eSIDlAAL5 AlUi: c.ALColA.I-AT£"D *
U1 ,....
SAhDIA LAeDR_TDRIE~ <> STEPwIS£ REGRESSION PRCGRJ" () COURTESY OF CEPT. CF STATISTICS - K_NS_S STAlE UNIVERSITY
T*E.R.E- A.l\E. NO LII.BEL5 S1'"ECIFIEP
FoR V ... RI .... bLES
7) 8) ANt> 9 AS
,-HESiC Uoo1!,ELS
WiLL B& EHCOO1:0
1'1\01'11 THE TRI><NS
Fo1<.MAT10N C.I>.RD.
SPE.C1F/&S WHICH
Or THE INPUT
vI'<RII'<&LE5 IS TO
61: US.;. P AS Tit 1:
""EI<7I1T5 :
TITlE-.. EIEHT.TRAN5FOR"ATICN. AND DROP OPTlONS EXAPlPlE .. EPER AND SftIIH D~ Stb.ME: \3 'Po\NT 'DA.TA AS USED ON 'PR.E.ViOUS
INPUT CHECK OF FARA"ETERS
N~"BER OF VARIAElES READ IN 6
NO. OF TAANSFOR"ED VJRIABL£S = 3 ______________ ~
CA1A DISPOSITION IS
.&." ......... ,.6.~~ ... "' ..... ~ ....... ,>"'WEIGHT ---. WEIG-ltT IS :ruST A CONVENIENT l.A-e.EL
STEPWtSE,SIGtN=.O~,SIGOUT=.10 TO OSlO tI&RE 1><5 TKE "'''IGtlTS ARI:
£'NTIo.REI:> I>.S -rttE IDTI-I VAR.IAl!.LI:. ~EL'5=1.2.3+~+~.B~, __________ ~
'-.".. NOTE: VARIABI..., NVMS!:R I> '00&5 OUTPUT,STEPS c-ARn ~ ,",OT A1'"P&1'<1\ oN Till' MODEL
~------~======::::====~ TR .. SFORM.TIONE.l.@= ••••• =l~ ol'>S .. 1t~A'ION N""'~ElI. S "'ILL l!oE 1>1'.OPP&D
€IGHT=9 X TltlS cARt> CIU'A'-&S ~ NE .... VAlUA1'>I..E5 END OF FA~UETERS 7 & AND .. ""Hle-I\ I\RE RESl'IOCT1'iELY
z{~L TO "1-,"-, y:~1-, ANt> )(,)( •• FORflAT(6FE.OJ
THIS E)c'AMPL.E. USE;5 'tlEIG~TED R""'G-R~SS'ON. Tt+E: We:.1G-ttTS MUST 6S USER S,Uf'PI-I\:"D
FOR E~Cf.\ O1!.SERVA"TION AltO A"P'"Pi=.A.P. ,...S ON~ O'F Ttti=. I1'lPUT \/ARIASLi::"S; IN THIS
GAS€: Ttl!E: C.Tt4: V .... RI"'SL..£;; "POSITION CONTt\,NS Ttt~ WE.\G-+\TS. ro"R,. 1"'l1ft.POSi=S Of"
IL.L\J5TR.ATlot-l Ttte. K'E:.SI-ouA.LS ~RoM T~ "PRE.V'OUS ST£:."PWI5E. IIoNAL'ISIS ON Ttte,SE:
13 OATA ""P'OINTS ..,·ERE:. S~L.i;,;D 50 TI+A.T THE: L.AfI;.6-EST RESIDUAL INAlSoSOL-UT&
"ALOE. (5.02.33&) WAS REct'ReSENTE,t) ~5 "Z-E:RO .-.Nv 11\~ S~Al-L-t=-S,- f.I:B5ol-lin=
1U::SIDUA.L (.13010'&3.) wf,s REl"R£:.SENTf:D "SY ONi=:. 1.-1I"'i';,6.:R, INT£;~l'Ot...l\"'IO"N wAS
USE:P Tt> SC~L.~ I.U:. OTt\~fl "RE;S'"DlJA~S • .,.-tti::.SE 5CAJ.....I=:b VALV~S WE.'R€' TtTE.N
use:!> "'50 WE;IQ<t\TS. "Pl-~AS£: fliOTE Tt\A.T Tt-'rts l5 ~N 'ii:NTIRc.L.Y Af:,,~I"TRA:RY wAy
TO AsslG-N wel6-l-\T5
STE.f'V./ISi:: eAA.tJ\pu=;.
(SlAl CONTROL CARD)
ISTAT CONTROL CARD'
(STA' CONTROL CARt)
(STAT CONTROL CARD'
(STAT CONTROL CARD)
ISTAT CONTROL CARD'
(STAT CONTROL CARD)
CS'A' CONTROL CARD)
(STAT CONTROL CARD)
'" '"
TITLE-WEIGHT.TRANSfOR"ATDON, AND C~DP CPTJONS EXAMPLE - (DRAPER AND SMITH DAlA) PAGE
SANDtA L~8DRATORI[S <)<) STEPWISE REGRESSION <><> FROM KANSAS STATE UNIVERSITY
INPUT CHECK ON DATA TH~5E LA&ELS ARE CR~ATEb
Kl FIRST CBSERVATIC~
X2 Xl X, y WEIGHT
i BY TH-E. ?RoG-RAM
~ 8 2 3 • 6 1 8
1.0000000 26.000000 6.0000000 50.000000 18.500000 49.000000 3600.0000
e •
_ ( 2.1100/3 - .137083 )
J 5.02.338 - .137083
1\2D.00000
NO. U~ DATA INPUT 13
NO. TR.NSFORMED OBSERVATIQNS 12
NO. Of OBSERVATIONS DRCPPED = ~ OBSERVATioN NUMe,.", 8 wAs t>RQPf'i,D
WEIGHED REGRESSION RE'UESTED.@IGHTS APPEAR 'S VARIOElE kUMflER Y
'" w
TITlE-WEIGHT,TRANSFORMATCON. 'ND DPUP CFTJO~S EXAHPLE - (DRAPER AND SMITH DATA)
SANDIA LIBORATORIES ()() STEPWISE REGRESSION ()() FRO" KANSAS STATE UNIVERSITY
VARIABLE VARIABLE NAME ~l"BER JIIlEAN VARUNCE STD. DEV. 5":;. E~R.
Xl 1 1.~1"06 31.8115 6.1411jl1 1.7J5011j X2 2 50.2tUlO 2'f".02" 1S.G21l ".SOIlj~1
Xl 12.00f2 H.lle2 E.HGH 1.8S22G H 4 21.6908 286.1)38 16.S126 ~.88226 1-1 1 SS.031'f 16184.2 121.211 36.12H· ~-. • 100:8 .. <;8 .. 1254S3£+01 1l;!Q.19 323 .. 312 1" 9 182.168 3652 ... 8 191.115 55.1701 Y 5 ')6.8853 157 .. ~HO 1,,* .. 06'38 •• 0061
12 OBSERVATIONS
T,",-ES£ STATlSTICS ARE ALL TH..E. RESULT OF-' If'oiE\frl;'TED C.ALCULATIONS, I. E ~ )
WEIG-fl,"P MEAN ~ wi Xi
2. wi
COM'P'ARts'CNS c..I\N ~e. M""OE. WITtt TH-i:;. -pf(.~\lIOUS UNW~IG-R"i£'D E.)<AM'Pl....E •
PAGE 2
C.V.
83.39 ll.0G SJ."~ 61.08
142.88 108 ... 86 101.H
14 .52
'" '"
UNrOUf SEQUE~G~ Nn. ~ 101 ~~ALY5IS FOR OEPtNOF~T VftRIA~LE 5---V TITLE,ElCA'4PL;:: OF STEPW!5E OPT 10", IOATA FRO .... DRAPf'~ J\NO S~ITi, ~. ~&5-~]2)
.tl1S6193~+03 4 -.73~151BIE+OO EXAMPLE OF TKc INfORM,\TION
WRllT'EN OllTo TAPE 19 .
UNIQUE SEaUE~C;:: NO. = tez ANALYSIS FO'" DEPEt..IDF.NT VUUA!lLE S---1 TITLE,EXA~Pl= OF STEPWISE OPTI~N (~ATA FRO~ DRAPE~ A~D S'1ITi, P. 3&~-4l2t
.tCnQ1'.38E-+03 .143QQSS3E"'Cl 4 -.6p:Q53ME.oa \ } TWO "&LiltNk cAkD 1~""G£S
UNIQUE SEQUEIriCE NO. = lCJ A~ALV'SI'S FDP DEFENDENT VII,RIAIJLE 5---\" TITLEtEXA~PLE OF aACKWAPD OPTION (DATA F~O~ DRAPE~ AND SMIT~. P. 3&5-~02 •
.. 1 r .~?40"i36qE+02 1 .155110?6E+(ii 2' .51016758£.00 3 .1IJlqOq~GE.DO
It ~ ,. -.1440HO.lE"'OO COItoa£Sf01'ID5 ""' ~lqu.E. SEqIJ.ENC.e. f'«l
IN ",NO"'''' T-..1!tLE:: TMtS 15 -nte. _c.olllD c.NQ) tDIIoITI\.""IIIIG-
u .... J.'~UI:, "'~ ... u::.' .. "':: NO. = ~ At.I/!,lVSIS FO", DEFENO[NT vARIA1LE ~_, --........ c.oE.f:PIC.I~TS TITLE.E'XA'1PL: OF ~ACKWARD OPTIflN (OATA FROM DRAPE!? ANO SHIT .... P. 365-4:121 ~ 'DE"NDEN"'- V~~i8L'= ,S
... _,., ........ .- ...
3 .71&463C7F.OZ .1451Q380E+Ol 2 .4161C91&E.OU ~ -.Z3&5,n22f.OO
UNIaUE SfQU~~Cf NO.; lC5 A~ALV'SIS FOR OE~ENDfNT V_RIAaLE 5---~ TITLE,fl(a,MlPl'!:: IlF ~ACI(WA·~.O OPTION (DUA FR.OM OR.APf Q AtiiO S"IT·i., p. 365-lt:'2'.
2' 1 .525773,.Qf .. 02 .146A3057E+Ol 2 .662?50ft.'JE.OO
UNIQUE SEQUE'tC:~ NO. = 106 A'4AlVSTS FOR OEp~NOENT VARIAfll=: 5---til~K('f) TITLE.EXA,,",PLE" OF RANI( "";::GF.E'SSION WITi.f TI-IJ: STfPWISE OPTION(DII,TA F{O'1 ]~AJoEt'SIUTJ.t
1 1 Q .L~43~~q5E.ID 1 .80~C??q2f.DO
vA .. '~a.(R ,,u, .. M"&'E;ll & •
~-------------------------------------------------------- ~E CON5T~"T ~~ AND ~
UNIQUE SE~UE~C~ NO. = lD1 A~ALvsts FOR. CEPE~DfNT V.RIASLE ;---~lNK('. N~M'Be.R "0" •
TITLf,f~A~9LE OF FANK QFGRfSSION WIT~ THf ~TEPWISE OPTION(OaTA F~l~ ]~l~E~fS~IT~
CD. 1 0 .G50QQQAftf+Ol ~b21S"18CE"CO 4 -.551~~1&~E.OD
~------------------------------------------------------------------------------- TWO INDE'PE'NPENT 'IIA1UA15oLE$
-mE. FlltED 1U:6«ESSIOftll t:qU.ATlON
(E."c:.t.:,A:ntNG- ntE. CCfII5TAtIT) •
DISTRIBUTION: W. J. Conover College of Business Administration Texas Tech University Lubbock, Texas 79409
J. M. Davenport (5) Department of Mathematics Texas Tech University Lubbock, Texas 79409
K. E. Kemp Department of Statistics Kansas State University Manhattan, Kansas 66502
M. C. Cullingford (25) NRC-PAS Washington, D. C. 20555
1221 D. L. Wright 1223 R. R. Prairie 1223 H. E. Anderson 1223 C. R. Clark 1223 R. G. Easterling 1223 E. L. Frost (10) 1223 I. J. Hall 1223 R. L. !man (25) 1223 D. D. Sheldon 1223 B. P. Roudabush 1223 M. Shortencarier (10) 1417 F. W. Muller 1417 E. E. Ard 1417 J. R. Ellefson 1417 F. W. Spencer 1763 M. D. Fimple 4413 J. E. Campbell 4413 R. M. Cranwell 4413 J. C. Helton 4723 K. A. Bergeron 4723 L. L. Lukens 4723 R. R. Peters 5641 G. P. Steck 8324 M. A. Griesel 8346 C. J. Decarli 9515 J. T. Hillman 9625 J. R. Yoder 3141 T. L. Werner (5) 3151 W. L. Garner (3)
for DOE/TIC (Unlimited Release) 3154-3 R. P. Campbell (25)
for DOE/TIC
57