weighted least squares 2
8/11/2019 Weighted Least Squares 2
Weighted Least-Squares Regression
A technique for correcting the problem of heteroskedasticity by log-likelihood
estimation of a weight that adjusts the errors of prediction
Weighted Least-Squares Regression: Charles M. Friel Ph.D., Criminal Justice Center, Sam Houston State University
Key Concepts
*****
Weighted Least-Squares Regression
OLS
Parameter estimates as:
Unbiased
Efficient
BLUE
Theoretical Sampling distribution of b
Standard error of b
Relationship between the standard error of b and:
The variance of X
The residual sum of squares
The sample size
Gauss-Markov Theorem
Assumptions about the errors (e) in regression analysis and the
consequences of their violation:
e is uncorrelated with X
e has the same variance across all levels of X
The values of e are independent of each other
e is normally distributed
The concepts of homoskedasticity and heteroskedasticity of the error distributions
The concept of autocorrelation or serial correlation
Spurious relationships
Collinear relationships
Intervening relationships
Techniques for identifying heteroskedasticity
Graphic
Statistical
White's Test for heteroskedasticity
Residualizing a variable
Techniques for identifying WLS weights
Theory, the literature, or prior experience
Regression of $e^2$ on X and transformation
Log-likelihood estimation of wi
SPSS weight estimation procedure
SPSS WLS procedure
Overview
Theoretical sampling distribution of b
Assumptions about errors in regression
Identifying heteroskedasticity
The concept of weighted least-squares regression
Methods for estimating weights
Regressing $e_i^2$ on X
Log-likelihood estimation of weights
Using the WLS command in SPSS
References
White, Halbert (1980) A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48:817-838.
Graybill, Franklin A. and Iyer, Hariharan K. (1994) Regression Analysis: Concepts and Applications. Duxbury Press 571-592.
Freund, Rudolf J. and Wilson, William J. (1998) Regression Analysis: Statistical Modeling of a Response Variable. Academic Press 378-382.
McClendon, McKee J. (1994) Multiple Regression and Causal Analysis. F. E. Peacock Publishers, Inc. 138-146, 174-181, 189-197.
Violation of OLS Regression Assumptions
$Y = a + b_1X_1 + b_2X_2 + \dots + b_kX_k$
OLS regression makes various assumptions about
the errors that result from a regression model.
If these assumptions are met, one can assume that the estimates of the regression constant (a) and the regression coefficients ($b_k$) are:
Unbiased: Replications of the study will yield values of a and $b_k$ which will be distributed on either side of their respective parameters $\alpha$ and $\beta_k$.
Efficient: The standard errors of a and $b_k$ will neither over- nor underestimate their associated theoretical standard errors.
Violation of one or more of these assumptions may
lead to biased and/or inefficient estimates.
Theoretical Sampling Distribution of b
[Figure: samples 1, 2, 3, ..., m drawn from the population $Y = \alpha + \beta X$]
Theoretical sampling distribution of b: $\mu_b = \beta$, with 68.26% of the estimates within $\pm 1\sigma_b$
Theoretical standard error of b
$\sigma_b = \sigma_\varepsilon / (S_X \sqrt{n})$
$\sigma_\varepsilon = \sqrt{\sum (Y - \hat{Y})^2 / N}$
$S_X = \sqrt{\sum (X - \bar{X})^2 / N}$
The Theoretical Standard Error of b
$\sigma_b = \sigma_\varepsilon / (S_X \sqrt{n})$
The standard error ($\sigma_b$) is directly related to the standard deviation of the errors produced by the model ($\sigma_\varepsilon$)
The greater the errors produced by the model, the greater the standard error of b
The standard error ($\sigma_b$) is inversely related to the standard deviation of the predictor variable ($S_X$)
As the variability of X increases, the standard error of b decreases
The standard error ($\sigma_b$) is inversely related to the sample size (n)
As the sample size increases, the standard error of b decreases
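The three relationships above can be checked numerically. The sketch below evaluates $\sigma_b = \sigma_\varepsilon / (S_X \sqrt{n})$ for made-up inputs; the function name and all input values are illustrative assumptions, not figures from the slides.

```python
import math

def theoretical_se_b(sigma_e, s_x, n):
    """Theoretical standard error of b: sigma_e / (S_X * sqrt(n))."""
    return sigma_e / (s_x * math.sqrt(n))

# Hypothetical baseline: error SD 4, predictor SD 2, 100 cases.
base = theoretical_se_b(sigma_e=4.0, s_x=2.0, n=100)

# Larger model errors raise the standard error.
more_error = theoretical_se_b(sigma_e=8.0, s_x=2.0, n=100)

# More variability in X lowers it.
more_spread = theoretical_se_b(sigma_e=4.0, s_x=4.0, n=100)

# A larger sample lowers it (through sqrt(n), so 4x the cases halves it).
bigger_n = theoretical_se_b(sigma_e=4.0, s_x=2.0, n=400)

print(base, more_error, more_spread, bigger_n)
```

Note that the sample size enters through its square root, so quadrupling n is needed to halve the standard error.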
Estimation of the Theoretical Standard Error of b
The theoretical standard error of b ($\sigma_b$) is usually estimated from a single sample rather than from a sampling distribution of b.
$SE_b = S_e / \sqrt{TSS_X}$
$S_e = \sqrt{RSS / (n - k)}$
$TSS_X = \sum (X - \bar{X})^2$
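A minimal sketch of the single-sample estimate, assuming a simple regression in which k counts the two estimated parameters (a and b). The five data points are invented for illustration only.

```python
import math

def estimated_se_b(x, y, k=2):
    """Return (b, SE_b) with SE_b = S_e / sqrt(TSS_X), S_e = sqrt(RSS / (n - k))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    tss_x = sum((xi - mean_x) ** 2 for xi in x)
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / tss_x
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(rss / (n - k))
    return b, s_e / math.sqrt(tss_x)

# Hypothetical sample, not taken from the slides.
x_sample = [1, 2, 3, 4, 5]
y_sample = [2, 4, 5, 4, 5]
b_hat, se_b = estimated_se_b(x_sample, y_sample)
print(b_hat, se_b)
```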
Gauss-Markov Theorem
b is an unbiased estimate of $\beta$. On repeated estimates, the distribution of b will be centered around $\beta$.
The sampling distribution of b will be normal if the samples are large and a sufficient number of samples are taken.
OLS provides the best linear unbiased estimate of $\beta$ (BLUE)
Best means:
OLS provides the most efficient of the linear unbiased estimates of $\beta$.
Efficiency refers to the size of the standard error of b ($\sigma_b$): neither too large nor too small.
The Four Assumptions About Regression Error
$e = Y - \hat{Y}$
e = prediction error
e is uncorrelated with X, the independence assumption.
e has the same variance ($S_e^2$) across the different levels of X, i.e. the variance of e is homoskedastic rather than heteroskedastic.
The values of e are independent of each other, i.e. not autocorrelated or serially correlated.
e is normally distributed.
The Problem of the Correlation of e & X
Y = a + bX
Spurious relationship: e and X may be correlated because Z is a common cause of X and Y. In this case b is a biased estimate of $\beta$.
[Diagram: Z is a common cause of X and Y]
Collinear relationship: If $X_2$ is correlated with $X_1$ and Y but is not the cause of either, $b_1$ will be a biased estimate of $\beta_1$.
[Diagram: $X_2$ correlated with both $X_1$ and Y]
Correlation of e with X (cont.)
Intervening relationship: $X_2$ intervenes in the relationship between $X_1$ and Y. In this case $b_1$ will not be a biased estimate of $\beta_1$, but:
It will reflect both the direct and indirect effects of $X_1$ on Y.
[Diagram: $X_1$ leads to $X_2$, which leads to Y]
Homoskedasticity of Errors (e) Over Levels
of X
The dotted lines represent the pattern of the
dispersion of the residuals.
[Figure: residual dispersion panels: Homoskedastic; Heteroskedastic (+), $R_{X S_e^2} > 0.0$; Heteroskedastic (-), $R_{X S_e^2}$ ...; Heteroskedastic]
Consequences of Heteroskedasticity
b will be an unbiased estimate of $\beta$, but $SE_b$ will be inefficient: too large or too small.
$SE_b = \sqrt{\sum (Y - \hat{Y})^2 / (n - k)} \,/\, \sqrt{TSS_X}$
If SEbis overestimated, (RXSe20.0)
b will not be an efficient estimate of $\beta$ and a Type II error may occur, since an overestimated $SE_b$ shrinks
$t = b / SE_b$
If $SE_b$ is underestimated, t is inflated and a Type I error may occur.
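To see why a mis-estimated standard error changes the test decision, the sketch below recomputes $t = b / SE_b$ for the slope and standard error reported for the OLS sentence model later in these slides (b = 0.644, $SE_b$ = 0.212). The doubled and halved standard errors, and the rough critical value of 2.0, are illustrative assumptions.

```python
def t_ratio(b, se_b):
    """t statistic for the slope: b divided by its standard error."""
    return b / se_b

b = 0.644            # OLS slope reported for the sentence example
se_reported = 0.212  # its reported standard error
critical_t = 2.0     # assumed approximate two-tailed 5% cutoff for large df

t_reported = t_ratio(b, se_reported)
t_if_se_doubled = t_ratio(b, 2 * se_reported)   # SE overestimated
t_if_se_halved = t_ratio(b, se_reported / 2)    # SE underestimated

for label, t in [("reported", t_reported),
                 ("SE doubled", t_if_se_doubled),
                 ("SE halved", t_if_se_halved)]:
    print(label, round(t, 2), "significant" if abs(t) > critical_t else "not significant")
```

With the standard error doubled, the same slope no longer passes the cutoff, which is the Type II risk; with it halved, even a weak slope would look significant, which is the Type I risk.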
The Distribution of the Errors
OLS regression assumes that the errors of
prediction are normally distributed.
This can be tested by saving the errors and
Plotting
A histogram or
A normal probability plot
[Figure: histogram of errors (Std. Dev = 4.04, Mean = 22.9, N = 70)]
[Figure: errors as a function of predictions]
Distribution of Errors (cont.)
Normal probability plot of errors
If the errors are non-normally distributed
b may still be unbiased and efficient if
The homoskedasticity and independence
assumptions are met and the sample is large
[Figure: normal probability plot of errors; axis: Observed Cumulative Probability]
If the sample is small, the use of the t distribution
in determining the significance of b and its
confidence interval will be biased.
Summary of Assumptions and the Consequences of Their Violation
Assumption violation | Consequences
Errors correlated with X: spurious relationship | b is a biased estimate of $\beta$
Errors correlated with X: collinear relationship | b is a biased estimate of $\beta$
Errors correlated with X: intervening relationship | b is an unbiased estimate of $\beta$ but reflects both direct and indirect effects
Heteroskedasticity ($R_{X S_e^2} \neq 0.0$) | b unbiased but not efficient; $SE_b$ too small or too large; Type I or II error may result
Autocorrelated errors | b unbiased but not efficient; $SE_b$ too small or too large; Type I or II error may result
Errors non-normally distributed | b may be unbiased if the homoskedasticity and independence assumptions are met and N is large. If N is small, the t distribution may be biased.
Heteroskedastic Errors and
Weighted Least-Squares Regression
If the errors are heteroskedastically distributed
The $SE_b$ may be inefficient, i.e. either too small or too large, which may lead to a Type I or II error
Ways to detect heteroskedasticity
Scatterplot of X against Y (prior to analysis)
Scatterplot of predictions against residuals, either
unstandardized or standardized
Scatterplot of X against residuals
Scatterplot of X against the absolute value of the residuals ($|e|$)
Scatterplot of X against the squared residuals ($e^2$)
White's test for homoskedasticity
Example
Scatterplot of X Against Y
Sentence length (Y) as a function of
drug dependency (X)
Heteroskedasticity
As drug score increases, the variability in
sentence increases
[Figure: scatterplot of sentence length (Y) against DR_SCORE (X)]
Example
Scatterplot of Predictions Against Residuals
Sentence length (Y) as a function of
drug dependency (X)
Heteroskedasticity
As predicted sentence becomes longer, variability in residuals becomes greater.
[Figure: scatterplot of residuals against predicted value]
$H_0$: residuals are homoskedastic
n = number of cases
df = number of independent variables
White's Test for Heteroskedasticity (cont.)
Example
Auxiliary regression of the squared residuals (from sentence regressed on dr_score) on dr_score
$R^2 = 0.06517$
$\chi^2 = nR^2 = (70)(0.06517) = 4.56$
df = 1
p
Reject the $H_0$ that the residuals are homoskedastic
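The test statistic above can be reproduced in a few lines. This sketch handles only the single-predictor case (df = 1), where the chi-square tail probability has the closed form erfc(sqrt(x / 2)); the function name is an assumption made for illustration, and more predictors would need a general chi-square routine.

```python
import math

def white_test_df1(n, r_squared):
    """Return (chi_sq, p) for White's statistic n * R^2 with 1 degree of freedom.

    For df = 1 the chi-square upper-tail probability equals erfc(sqrt(x / 2)).
    """
    chi_sq = n * r_squared
    p = math.erfc(math.sqrt(chi_sq / 2.0))
    return chi_sq, p

# Values from the slide: 70 cases, R^2 = 0.06517 for the squared residuals on dr_score.
chi_sq, p_value = white_test_df1(70, 0.06517)
print(round(chi_sq, 2), round(p_value, 4))
```

The resulting probability falls below 0.05, in line with the decision to reject homoskedasticity.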
How Does One Correct the Problem of
Heteroskedastic Errors?
Solution: Weighted Least-Squares Regression (WLS Regression)
The logic of WLS Regression
Find a weight ($w_i$)
That can be used to modify the influence of large errors on the estimation of
The best fit values of
The regression constant (a)
The regression coefficients ($b_k$)
OLS is designed to minimize: $\sum (Y - \hat{Y})^2$
In WLS, values of a and $b_k$ are estimated which minimize $RSS = \sum w_i (Y - \hat{Y})^2$
This process has the effect of minimizing the influence of a case with a large error on the estimation of a and $b_k$
And maximizing the influence of a case with a small error on the estimation of a and $b_k$
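For a single predictor, the values of a and b that minimize $\sum w_i (Y - \hat{Y})^2$ have a closed form based on weighted means. The sketch below is one way to write it; the function name, the data, and the weights are hypothetical.

```python
def wls_simple(x, y, w):
    """Return (a, b) minimizing sum(w_i * (y_i - a - b * x_i) ** 2)."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of X
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of Y
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Points that lie exactly on y = 1 + 2x: any positive weights recover the line.
a_exact, b_exact = wls_simple([1, 2, 3, 4], [3, 5, 7, 9], [1.0, 0.5, 0.25, 0.1])

# Same x values with one aberrant last case: equal weights versus a small weight on it.
x_pts, y_pts = [1, 2, 3, 4], [3, 5, 7, 20]
a_equal, b_equal = wls_simple(x_pts, y_pts, [1, 1, 1, 1])
a_down, b_down = wls_simple(x_pts, y_pts, [1, 1, 1, 0.01])
print(b_equal, b_down)
```

Giving the aberrant case a small weight pulls the slope back toward the value implied by the other cases, which is the effect described above.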
Techniques for Estimating a Suitable
Value of a Weight wi
From theory, the literature, or experience gained in
prior research.
Rarely will this approach prove successful,
except by trial and error
Estimate $w_i$ by regressing $e^2$ on the offending independent variable X and
Transforming the values of X and Y.
This is called residualizing the variable X.
Use log-likelihood estimation to determine a suitable value of $w_i$
This can be done in SPSS using the regression weight estimation procedure coupled with the WLS procedure.
In the following case study both the residualizing and SPSS WLS procedures will be demonstrated.
An Example
*****
The Relationship Between Drug Dependency & Length of Sentence
The model
Sentence = a + b (drug_score)
The results
Sentence = 1.97 + 0.6438 (drug_score)
For this model to be BLUE, the residuals must be
homoskedastic.
Q: Are the residuals homoskedastic?
An Example (cont.)
Scatterplot of the residuals
Notice how the residuals become larger the
greater the degree of drug dependency. These
are heteroskedastic residuals.
[Figure: heteroskedastic residuals plotted against the standardized predicted value]
Solving the Problem of Heteroskedasticity
Solution
Residualize the offending variable X
Steps in the process
1. Plot X against Y to determine the presence of
heteroskedasticity
2. Estimate the following regression equation and save the residuals ($e = Y - \hat{Y}$). In SPSS the residuals appear as res_1
Y = a + bX
Solving the Problem of Heteroskedasticity (cont.)
3. Square the residuals
$e^2 = (\mathrm{res\_1})^2$ = residsq
4. Regress residsq on X and save the predicted residsq; in SPSS this is called pre_2
residsq = a + bX
5. Transform X and Y, and compute a weight $w_i$ called wtsqroot
$wtX = X / \sqrt{|\mathrm{pre\_2}|}$
$wtY = Y / \sqrt{|\mathrm{pre\_2}|}$
$wtsqroot = 1 / \sqrt{|\mathrm{pre\_2}|}$
Solving the Problem of Heteroskedasticity (cont.)
6. Estimate the following weighted regression equation through the origin, i.e. with a regression constant equal to 0.0
wtY = a(wtsqroot) + b(wtX)
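Steps 2 through 6 can be sketched end to end in plain Python. This is an illustration under stated assumptions, not the SPSS procedure itself: the helper names are invented, the demo data are synthetic (pairs placed symmetrically around the line y = 2 + 0.5x with spread growing in x), and the square root of the absolute predicted squared residual is used as the divisor.

```python
import math

def ols_simple(x, y):
    """Return (a, b) for y = a + b * x by ordinary least squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def fit_through_origin(u, v, y):
    """Return (a, b) for y = a * u + b * v with no intercept (2x2 normal equations)."""
    suu = sum(ui * ui for ui in u)
    svv = sum(vi * vi for vi in v)
    suv = sum(ui * vi for ui, vi in zip(u, v))
    suy = sum(ui * yi for ui, yi in zip(u, y))
    svy = sum(vi * yi for vi, yi in zip(v, y))
    det = suu * svv - suv * suv
    return (suy * svv - svy * suv) / det, (svy * suu - suy * suv) / det

def residualized_wls(x, y):
    a1, b1 = ols_simple(x, y)                                        # step 2
    residsq = [(yi - (a1 + b1 * xi)) ** 2 for xi, yi in zip(x, y)]   # step 3
    a2, b2 = ols_simple(x, residsq)                                  # step 4
    root = [math.sqrt(abs(a2 + b2 * xi)) for xi in x]                # sqrt(|pre_2|)
    wtsqroot = [1.0 / r for r in root]                               # step 5
    wt_x = [xi / r for xi, r in zip(x, root)]
    wt_y = [yi / r for yi, r in zip(y, root)]
    return fit_through_origin(wtsqroot, wt_x, wt_y)                  # step 6

# Synthetic heteroskedastic data: two cases per x, offset by +/- 0.2 * x.
x_demo = [v for v in range(1, 10) for _ in range(2)]
y_demo = [2 + 0.5 * v + s * 0.2 * v for v in range(1, 10) for s in (1, -1)]
a_wls, b_wls = residualized_wls(x_demo, y_demo)
print(a_wls, b_wls)
```

In the output of step 6, the coefficient on wtsqroot plays the role of the regression constant a and the coefficient on the transformed X is the slope b. A predicted squared residual of exactly zero would break the division, so real data may need a guard for that case.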
Step 1
Sentence length as a function of
drug dependency
SPSS scatterplot of sentence as a function of dr_score. This can only be done when there are 2 or fewer IVs.
[Figure: scatterplot of sentence as a function of drug dependency]
Heteroskedastic: The variability in sentence length increases as the degree of drug dependency increases.
Step 2
Regress sentence length on drug
dependency, save the residuals (res_1) and
the predictions (pre_1)
sentence = 1.975 + 0.644 dr_score
$R^2 = 0.12$ (F = 9.24, p = 0.003)
SPSS results for Step 2
Regression
Variables Entered/Removed (Dependent Variable: SENTENCE; all requested variables entered)
Model | Variables Entered | Variables Removed | Method
1 | DR_SCORE | . | Enter
Step 2 (cont.)
Model Summary (Predictors: (Constant), DR_SCORE; Dependent Variable: SENTENCE)
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate
1 | .346 | .120 | .107 | 4.6816
ANOVA (Predictors: (Constant), DR_SCORE; Dependent Variable: SENTENCE)
Source | Sum of Squares | df | Mean Square | F | Sig.
Regression | 202.516 | 1 | 202.516 | 9.240 | .003
Residual | 1490.356 | 68 | 21.917 | |
Total | 1692.871 | 69 | | |
Coefficients (Dependent Variable: SENTENCE)
Term | B | Std. Error | Beta | t | Sig.
(Constant) | 1.975 | 1.425 | | 1.386 | .170
DR_SCORE | .644 | .212 | .346 | 3.040 | .003
Casewise Diagnostics (Dependent Variable: SENTENCE)
Case Number | Std. Residual | SENTENCE
60 | 3.956 | 25.00
Step 2 (cont.)
Residuals Statistics (Dependent Variable: SENTENCE)
Statistic | Minimum | Maximum | Mean | Std. Deviation | N
Predicted Value | 2.6185 | 8.4128 | 5.9571 | 1.7132 | 70
Residual | -7.4128 | 18.5186 | -7.61E-17 | 4.6475 | 70
Std. Predicted Value | -1.949 | 1.433 | .000 | 1.000 | 70
Std. Residual | -1.583 | 3.956 | .000 | .993 | 70
N.B. The residuals are heteroskedastic. Compare this scatterplot with the scatterplot of sentence as a function of dr_score. Notice that the patterns are the same.
[Figure: heteroskedastic residuals plotted against the standardized predicted value]
Step 3
Calculate the squared residuals
In SPSS, the unstandardized residuals are saved as
res_1.
Step 3 involves squaring the residuals by use of the data transformation procedure in SPSS.
squared residual = $(\mathrm{res\_1})^2$ = residsq
The SPSS syntax for this transformation is as follows:
COMPUTE residsq = res_1**2.
EXECUTE.
The steps used in this transformation process are
described in the case study associated with this
module.
Step 3 (cont.)
SPSS results for Step 3
pre_1 res_1 residsq
7.76901 -6.76901 45.82
8.41282 -7.41282 54.95
8.41282 -7.41282 54.95
6.48139 -5.48139 30.05
7.76901 -5.76901 33.28
7.12520 -5.12520 26.27
7.76901 -5.76901 33.28
8.41282 -6.41282 41.12
5.83758 -2.83758 8.05
7.76901 -4.76901 22.74
2.61852 -.61852 .38
2.61852 .38148 .15
3.26233 1.73767 3.02
5.19377 1.80623 3.26
8.41282 -.41282 .17
7.76901 1.23099 1.52
7.12520 2.87480 8.26
6.48139 5.51861 30.46
7.12520 6.87480 47.26
7.12520 7.87480 62.01
Step 4
Regress the squared residuals on the independent variable dr_score and save the predictions as pre_2
residsq = -7.587 + 4.6685 dr_score
$R^2 = 0.065$ (F = 4.74, p = 0.0329)
This process is called residualizing a variable.
By OLS definition, the residuals (here squared and saved as residsq) represent the variance in Y that is unrelated to X.
Therefore, there should be no significant relationship
between X and residsq.
If there is, one or more OLS regression
assumptions have been violated.
In this case, the violated assumption is the
homoskedasticity of the residuals.
Step 4 (cont.)
SPSS results for Step 4
Regression
Variables Entered/Removed (Dependent Variable: RESIDSQ; all requested variables entered)
Model | Variables Entered | Variables Removed | Method
1 | DR_SCORE | . | Enter
Model Summary (Predictors: (Constant), DR_SCORE; Dependent Variable: RESIDSQ)
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate
1 | .255 | .065 | .051 | 47.3953
ANOVA (Predictors: (Constant), DR_SCORE; Dependent Variable: RESIDSQ)
Source | Sum of Squares | df | Mean Square | F | Sig.
Regression | 10648.802 | 1 | 10648.802 | 4.741 | .033
Residual | 152749.4 | 68 | 2246.315 | |
Total | 163398.2 | 69 | | |
Coefficients (Dependent Variable: RESIDSQ)
Term | B | Std. Error | Beta | t | Sig.
(Constant) | -7.587 | 14.422 | | -.526 | .601
DR_SCORE | 4.669 | 2.144 | .255 | 2.177 | .033
Step 5
Compute the absolute value of pre_2 (abspre_2) and three new variables: wtsent, wtdrug and the weight wtsqroot.
$wtsent = sentence / \sqrt{\mathrm{abspre\_2}}$
$wtdrug = \mathrm{dr\_score} / \sqrt{\mathrm{abspre\_2}}$
$wtsqroot = 1 / \sqrt{\mathrm{abspre\_2}}$
pre_2 from the previous step is the information in the squared residuals (residsq) that is related to the IV dr_score.
Dividing sentence and dr_score by the square root of abspre_2 reduces the influence of extreme values on the estimation of a and b. For example, abspre_2 = 34.43 gives wtsqroot = $1 / \sqrt{34.43}$ = .17.
Finally a third transformation is performed by creating the variable wtsqroot. This will serve as a weighting factor.
Step 5 (cont.)
SPSS results for Step 5
abspre_2 wtsent wtdr_sco wtsqroot
34.43 .17 1.53 .17
39.10 .16 1.60 .16
39.10 .16 1.60 .16
25.09 .20 1.40 .20
34.43 .34 1.53 .17
29.76 .37 1.47 .18
34.43 .34 1.53 .17
39.10 .32 1.60 .16
20.42 .66 1.33 .22
34.43 .51 1.53 .17
20.42 .66 1.33 .22
39.10 1.28 1.60 .16
34.43 1.53 1.53 .17
29.76 1.83 1.47 .18
25.09 2.40 1.40 .20
29.76 2.57 1.47 .18
29.76 2.75 1.47 .18
Step 6
Compute the WLS regression
wtsent = a(wtsqroot) + b (wtdrug)
Results:
wtsent = 1.159 (wtsqroot) + 0.7833 (wtdrug)
R2= 0.674 (F = 70.29, p
Step 6 (cont.)
Model Summary (Predictors: WTSQROOT, WTDR_SCO; Dependent Variable: WTSENT)
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate
1 | .821 | .674 | .664 | .9655
For regression through the origin (the no-intercept model), R Square measures the proportion of the variability in the dependent variable about the origin explained by regression. This CANNOT be compared to R Square for models which include an intercept.
Step 6 (cont.)
Casewise Diagnostics (Dependent Variable: WTSENT)
Case Number | Std. Residual | WTSENT
60 | 3.796 | 4.99
N.B. The heteroskedasticity has been reduced.
[Figure: residuals of wtsent regressed on wtdrug, plotted against wtdrug]
Comparison of the OLS vs. Residualized Regression Models
Comparing the scatterplots, notice the substantial reduction of heteroskedasticity
Statistical results
Method | a | b | SEa | SEb | R2 | p
OLS | 1.975 | 0.644 | 1.425 | 0.212 | 0.1196 | 0.0034
Residualized | 1.159 | 0.783 | 0.605 | 0.139 | 0.6740 | 0.0001
The residualized model is more efficient: its SEs are smaller.
Comparison of 95% confidence intervals
Method | 95% Confidence Interval | Width
OLS | 0.221 to 1.066 | 0.845
Residualized | 0.504 to 1.062 | 0.558
N.B. The width of the residualized 95% confidence interval
is less than that of the OLS interval.
An Alternative Procedure for Correcting
Heteroskedasticity
Log-Likelihood Estimation of wi
If it can be assumed that the variance in the DV
Is proportional to a power of the IV,
Log-likelihood estimation can be used to estimate $w_i$.
In this case it is assumed that
$S_y^2 \propto X^w$ or $S_y^2 \propto 1 / X^w$
($\propto$ is read "proportional to")
In log-likelihood estimation of $w_i$, the question is:
What power of X, i.e. $w_i$, is most likely to have produced the proportional relationship between $S_y^2$ and X?
SPSS Weight Estimation and WLS Regression Procedures
This procedure begins by using log-likelihood estimation to iteratively determine a weight $w_i$
To be used in estimating the values of the regression constant (a) and the regression coefficient (b)
Such that the RSS is minimized.
$RSS = \sum [(1 / X^{w_i})(Y - \hat{Y})^2]$
This may solve the heteroskedasticity problem if:
$S_y^2 \propto X^w$ or $S_y^2 \propto 1 / X^w$
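The search over powers can be sketched as a simple grid search. The slides do not show the likelihood SPSS evaluates, so this example assumes normally distributed errors whose variance is proportional to $X^{power}$ and uses the corresponding Gaussian log-likelihood; the function names and demo data are hypothetical, and X must be positive.

```python
import math

def wls_fit(x, y, w):
    """Return (a, b) minimizing sum(w_i * (y_i - a - b * x_i) ** 2)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = (sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
         / sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)))
    return my - b * mx, b

def log_likelihood(x, y, power):
    """Gaussian log-likelihood of the WLS fit when Var(e_i) is proportional to x_i ** power."""
    w = [xi ** -power for xi in x]                 # weight = 1 / X^power
    a, b = wls_fit(x, y, w)
    n = len(x)
    sigma2 = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y)) / n
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - 0.5 * power * sum(math.log(xi) for xi in x)
            - 0.5 * n)

def best_power(x, y, powers):
    """Pick the candidate power with the largest log-likelihood."""
    return max(powers, key=lambda p: log_likelihood(x, y, p))

# Candidate powers from -3.0 to 3.0 in steps of 0.2.
powers = [round(-3.0 + 0.2 * i, 1) for i in range(31)]

# Synthetic data whose error SD grows in proportion to x (variance proportional to x^2).
x_demo = [v for v in range(1, 10) for _ in range(2)]
y_demo = [2 + 0.5 * v + s * 0.2 * v for v in range(1, 10) for s in (1, -1)]
chosen = best_power(x_demo, y_demo, powers)
print(chosen)
```

Because the demo errors were built with a standard deviation proportional to x, the search should settle on a power of 2; real data would be scanned the same way.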
Step 1
Estimation of the weight $w_i$ using the SPSS weight estimation procedure
The result
The most likely weight = 1.8
The variance in sentence is estimated to be
$S_y^2 \propto (\mathrm{dr\_score})^{1.8}$
Regression equation
sentence = 0.94 + 0.83 (dr_score)
For a drug score of 6, the prediction would be
sentence = 0.94 + 0.83 (6) = 5.92 years
Step 1 (cont.)
Examination of weights for individual subjects
Subject | dr_score | Weight
Jones | 10 | $1/(10)^{1.8} = 0.01585$
Smith | 1 | $1/(1)^{1.8} = 1.00$
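The individual weights and the prediction for a drug score of 6 follow directly from the estimated power and the rounded regression equation. A small sketch, with function names chosen only for illustration:

```python
def wls_weight(dr_score, power=1.8):
    """Case weight 1 / dr_score^power, using the estimated power of 1.8."""
    return 1.0 / dr_score ** power

def predicted_sentence(dr_score, a=0.94, b=0.83):
    """Prediction from the rounded weighted equation sentence = 0.94 + 0.83 * dr_score."""
    return a + b * dr_score

weight_jones = wls_weight(10)   # high drug score, small weight
weight_smith = wls_weight(1)    # low drug score, full weight
sentence_at_6 = predicted_sentence(6)
print(round(weight_jones, 5), weight_smith, round(sentence_at_6, 2))
```

Cases with high drug scores, where sentences vary most, contribute far less to the fit than cases with low scores.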
Step 1 (cont.)
SPSS results
Weighted Least Squares, MODEL: MOD_1.
Source variable: DR_SCORE; Dependent variable: SENTENCE
[SPSS output: log-likelihood function evaluated at POWER values from -3.000 to 3.000 in steps of 0.200]
The Value of POWER Maximizing Log-likelihood Function = 1.800
log-likelihood estimated weight $w_i$ = 1.8
Step 1 (cont.)
Estimation of the weighted regression model
Source variable: DR_SCORE; POWER value = 1.800
Dependent variable: SENTENCE
Listwise Deletion of Missing Data
Multiple R .63648
R Square .40510
Adjusted R Square .39635
Standard Error .84375
Analysis of Variance:
Source | DF | Sum of Squares | Mean Square
Regression | 1 | 32.9665 | 32.9665
Residuals | 68 | 48.4099 | .7119
F = 46.3072, Signif F = .0000
Variables in the Equation
Variable | B | SE B | Beta | T | Sig T
DR_SCORE | .828470 | .121747 | .636478 | 6.805 | .0000
(Constant) | .939977 | .394688 | | 2.382 | .0200
The following new variables are being created:
Name: WGT_1; Label: Weight for SENTENCE from WLS, MOD_1 DR_SCORE** -1.800
Weighted equation
Sentence = 0.9399 + 0.8285 (dr_score)
Unweighted equation
Sentence = 1.97 + 0.6438 (dr_score)
Step 2
Plot the relationship between dr_score
and the weight wi
The heteroskedasticity problem
Recall the previous scatterplot: as the value of the drug score
increases, the variance in sentences increases as well.
The log-likelihood estimated weight is such that,
as the value of the drug score increases, the weight-adjusted
drug score (wgt_1) decreases.
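A short sketch of how the weight falls as the drug score rises, using the estimated power of 1.8. The range of drug scores (1 to 12) is an assumption chosen for illustration.

```python
import numpy as np

# Assumed range of drug scores, for illustration only.
dr_score = np.arange(1, 13, dtype=float)

# Weight implied by the estimated power: wgt_1 = 1 / dr_score**1.8
wgt = 1.0 / dr_score**1.8

for d, w in zip(dr_score, wgt):
    print(f"dr_score = {d:4.0f}   wgt_1 = {w:.5f}")
```

Cases with high drug scores, where the sentences vary most, therefore receive the smallest weights.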
[Figure: Scatterplot of adjusted dr_score and dr_score. Weight adjusted dr_score = 1/(dr_score)^1.8 (vertical axis, 0.0 to 1.2) against DR_SCORE (horizontal axis, 0 to 12).]
Weighting by 1/(dr_score)^1.8 reduces the effect of large
errors on the RSS, providing a more efficient
estimate of b and a smaller SEb.
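In symbols, and assuming the usual weighted least-squares criterion (the slides do not write it out), the quantity being minimized is the weighted residual sum of squares, where y_i is the sentence and x_i the drug score of case i:

```latex
\mathrm{RSS}_w \;=\; \sum_{i=1}^{n} w_i \, e_i^{2}
\;=\; \sum_{i=1}^{n} \frac{\left(y_i - a - b\,x_i\right)^{2}}{x_i^{1.8}},
\qquad w_i = \frac{1}{x_i^{1.8}}
```

A squared error at a drug score of 10, for example, counts only about 0.016 times as much as the same squared error at a drug score of 1.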
Step 3
The SPSS WLS>> option in linear regression
If an appropriate weight wi is already known by
another means,
the WLS>> option in SPSS linear
regression can be used instead of the SPSS
weight estimation procedure
The procedure
Simply specify the regression model
Enter the known weight variable under the
WLS>> option and estimate the model
In this case, the weight variable wgt_1 from
Step 2 will be used
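Outside SPSS, a fit with an already known weight variable can be sketched as below. This is an illustration under stated assumptions, not the SPSS procedure: the data are hypothetical and the function name `wls_with_known_weights` is invented for the example.

```python
import numpy as np

def wls_with_known_weights(x, y, w):
    """Fit y = a + b*x by weighted least squares with known weights w."""
    X = np.column_stack([np.ones(len(x)), x])
    sw = np.sqrt(w)
    # Multiplying each row by sqrt(w_i) makes ordinary least squares
    # minimize sum(w_i * e_i**2).
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef  # [a, b]

# Hypothetical data, for illustration only.
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])   # plays the role of dr_score
y = np.array([1.0, 3.0, 4.0, 9.0, 6.0, 14.0])    # plays the role of sentence
w = 1.0 / x**1.8                                  # known weight, as in wgt_1

a, b = wls_with_known_weights(x, y, w)
print(f"sentence = {a:.4f} + {b:.4f} (dr_score)")
```

The square-root scaling used here is the same transformation that Step 4 applies to the saved residuals and predictions.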
Step 3 (cont.)
The results of the WLS>> analysis using the weight
variable wgt_1, wi = 1/(dr_score)^1.8, with regression through the
origin
R^2 = 0.668, F = 139.14, p = 0.0001
sentence = 1.036 (dr_score)
SEb = 0.087
N.B. Since this model does not include a constant
(a), the R^2 and the other statistical results cannot be
compared with the associated values of a model that
does use a constant (a).
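The reason the two R^2 values are not comparable can be sketched numerically. The example assumes the conventional definitions, in which a model through the origin measures variation about zero rather than about the mean (the slides do not show the SPSS computation), and it uses hypothetical data.

```python
import numpy as np

# Hypothetical data, for illustration only.
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
y = np.array([1.0, 3.0, 4.0, 9.0, 6.0, 14.0])
w = 1.0 / x**1.8

# Weighted fit through the origin: y = b*x
b = np.sum(w * x * y) / np.sum(w * x**2)
rss = np.sum(w * (y - b * x) ** 2)

# Total sum of squares about the origin (used when there is no constant)
tss_uncentered = np.sum(w * y**2)

# Total sum of squares about the weighted mean (used when there is a constant)
ybar_w = np.sum(w * y) / np.sum(w)
tss_centered = np.sum(w * (y - ybar_w) ** 2)

r2_origin = 1 - rss / tss_uncentered
r2_if_centered = 1 - rss / tss_centered

print(f"R^2 about the origin: {r2_origin:.3f}")
print(f"Same RSS against the centered total: {r2_if_centered:.3f}")
```

Because the total sum of squares about the origin is never smaller than the total about the mean, the through-origin R^2 tends to look larger even when the fit is no better.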
Step 3 (cont.)
SPSS results
Regression
Variables Entered/Removed (b, c)
Model   Variables Entered   Variables Removed   Method
1       DR_SCORE (a)        .                   Enter
a. All requested variables entered.
b. Dependent Variable: SENTENCE
c. Weighted
Step 3 (cont.)
Coefficients (a, b)
                  Unstandardized Coefficients   Standardized Coefficients
Model             B        Std. Error           Beta                        t        Sig.
1  (Constant)     .940     .395                                             2.382    .020
   DR_SCORE       .828     .122                 .636                        6.805    .000
a. Dependent Variable: SENTENCE
b. Weighted
Step 3 (cont.)
Saved predicted & residual values, and the weighted
values of dr_score (i.e. wgt_1)
wgt_1 = 1 / (dr_score)^1.8
wgt_1     pre_1      res_1
.01916    8.39621    -7.39621
.01585    9.22468    -8.22468
.01585    9.22468    -8.22468
.03012    6.73927    -5.73927
.01916    8.39621    -6.39621
.02368    7.56774    -5.56774
.01916    8.39621    -6.39621
.01585    9.22468    -7.22468
.03975    5.91080    -2.91080
.02368    7.56774     2.43226
.03012    6.73927     5.26073
.02368    7.56774     6.43226
.02368    7.56774     7.43226
Step 4
Variable transformations and
plot of the residuals
Unfortunately, the weighted residuals and predictions
produced by the SPSS weight estimation and
WLS>> procedures
cannot be graphed directly from the saved
residuals and predictions
The residuals and the predictions must first be transformed as follows:
Transformed residual = (res_1) (wgt_1)^0.5
Transformed prediction = (pre_1) (wgt_1)^0.5
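The transformation can be sketched in a few lines. The three cases below are hypothetical; their weights and predictions are computed from wgt_1 = 1/(dr_score)^1.8 and the weighted equation Sentence = 0.9399 + 0.8285 (dr_score) given in Step 1.

```python
import numpy as np

# Hypothetical cases, for illustration only.
dr_score = np.array([9.0, 10.0, 8.0])
sentence = np.array([1.0, 2.0, 10.0])

wgt_1 = 1.0 / dr_score**1.8               # weight from the estimated power
pre_1 = 0.9399 + 0.8285 * dr_score        # prediction from the weighted equation
res_1 = sentence - pre_1                  # unweighted residual

# Step 4 transformation: multiply by the square root of the weight.
transres = res_1 * np.sqrt(wgt_1)
transpre = pre_1 * np.sqrt(wgt_1)

for r, p in zip(transres, transpre):
    print(f"transres = {r:6.2f}   transpre = {p:5.2f}")
```

The transformed residuals are the ones whose spread should be roughly constant if the weights have corrected the heteroskedasticity.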
Step 4 (cont.)
SPSS results
Compare the degree of heteroskedasticity in this
scatterplot with that in
the plot of the residuals from the unweighted regression model.
[Figure: Scatterplot of the Transformed Residuals (vertical axis, -2 to 4) against the Transformed Weighted Predictions (horizontal axis, 1.0 to 1.4).]
Notice the substantial change in the degree of
heteroskedasticity.
Step 4 (cont.)
Transformed variables transres and transpre
transres = res_1*sqrt(wgt_1)
transpre = pre_1*sqrt(wgt_1)
transres    transpre
-1.02       1.16
-1.04       1.16
-1.04       1.16
-1.00       1.17
 -.89       1.16
 -.86       1.16
 -.89       1.16
 -.91       1.16
  .37       1.16
  .91       1.17
  .99       1.16
 1.14       1.16
Comparison of Results:
OLS, Residualized, and Log-Likelihood
Models
Method            a        b        SEa      SEb
OLS               1.975    0.644    1.425    0.212
Residualized      1.159    0.783    0.605    0.139
Log-Likelihood    0.940    0.828    0.394    0.121
N.B. The standard errors of the residualized &
log-likelihood models are lower than those of the OLS model.
The log-likelihood model produces smaller standard
errors than the residualized model.
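The size of the improvement can be read off the SEb column of the comparison table. A small sketch of the arithmetic, using the three SEb values reported there (the variable names are illustrative):

```python
# SEb values as reported in the comparison table.
se_b = {"OLS": 0.212, "Residualized": 0.139, "Log-Likelihood": 0.121}

# Proportional reduction in SEb relative to OLS.
reduction = {name: 1 - value / se_b["OLS"] for name, value in se_b.items()}

for name, r in reduction.items():
    print(f"{name:15s} SEb = {se_b[name]:.3f}   reduction vs OLS = {r:.1%}")
```

Relative to OLS, the standard error of b is roughly a third smaller for the residualized model and a little over 40 percent smaller for the log-likelihood model.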