
Journal of the American Statistical Association, March 1970, Volume 65, Number 329

Applications Section

Some Effects of Errors of Measurement on Multiple Correlation

W. G. COCHRAN*

In many multiple correlation and regression studies the values of the variables are difficult to measure accurately. This paper discusses the effects of such errors of measurement on the squared multiple correlation coefficient. With independent errors and a multivariate normal model, the effect is very roughly that R² becomes reduced to about R'² = R²g_y g_w, where g_y is the coefficient of reliability of y and g_w is a weighted mean of the coefficients of reliability of the x's. If most intercorrelations among the true X's are detrimental to R², the value of R'² may exceed R²g_y g_w by 10 percent to 20 percent. The situation in which errors are correlated with the true values, and the effects of errors on the interpretation of regression coefficients, are briefly discussed.

1. INTRODUCTION

In recent years there has been an increase in multiple regression studies on problems in which some of the independent variables represent quantities that are obviously difficult to measure and are presumably measured with substantial errors. In the social sciences, for example, these variables may include measures of a person's skills at certain tasks or his attitudes and psychological characteristics, the data being obtained from a questionnaire plus perhaps some kind of examination. Such studies raise the question: to what extent do these errors of measurement weaken or vitiate the uses to which the multiple regression is put? In examining this question, my results fall a good deal short of what is desirable. The only tractable mathematical model is the multivariate normal, simpler than is needed for many applications. Even with this model, the effects of the errors are complex. I have, however, tried to discover approximately what happens in situations representative of at least a substantial number of applications.

This paper deals mainly with the relation between R², the squared multiple correlation coefficient between y and the X's when these are correctly measured, and R'², the corresponding value when errors of measurement are present. Multiple regression equations have several different uses. Among them are: (1) To predict the variable y. The relevant issue here is the relation between σ_y²(1 − R²) and σ'_y²(1 − R'²). (2) To estimate the proportion of the variance of y that is associated with variations in the X's. For this application we usually want to know the value of R², even if our measurements of the X's are subject to errors. Results given here on the relation between R'² and R², plus information on the precision of measurement from pilot or past studies, would enable R² to be estimated roughly from R'², but the problem requires further work and will not be pursued here. (3) To study and interpret the values of individual regression coefficients β_i. The nature of the effects of errors of measurement on the values of the β_i has been indicated previously [2] and is discussed briefly in Section 8.

* W. G. Cochran is professor, Department of Statistics, Harvard University. The author is grateful for suggestions received from the editor and a referee. This article was supported by the Office of Naval Research, Contract N00014-67-A-0298-017.

2. MATHEMATICAL MODEL

Using capital letters to denote correctly measured values, we suppose that in the population the variate Y_u has a linear regression α + Σ_i β_i X_iu + d on the k X's, where d is the random residual from the regression. Owing to errors of measurement, the variates actually recorded for Y_u and for the ith X-variate are

$$y_u = Y_u + a + e_u, \qquad x_{iu} = X_{iu} + a_i + e_{iu}, \tag{2.1}$$

where a and the a_i represent overall constant biases of measurement, while e_u and the e_iu are fluctuating components that follow frequency distributions with means zero.

For this type of model, Lindley [10] gave the necessary and sufficient relations that must hold between the joint frequency function of the X_iu and that of the e_iu in order that the regression of y_u on the x_iu remain linear. In particular, if y_u and the X_iu follow a multivariate normal distribution, the e_iu must also follow a multivariate normal. This case is assumed here. Clearly, if Y_u, e_u, the X_iu, and the e_iu jointly follow a multivariate normal, it follows from relations (2.1) that y_u and the x_iu also follow a multivariate normal and hence that the regression of y_u on the x_iu is linear.

For the present it is assumed further that e_u is independent of Y_u, and that any e_iu is independent of X_iu, of any X_ju (j ≠ i), and of any other e_ju. These last assumptions are not essential to ensure linearity of the regression of y_u on the x_iu, and some remarks about the nonindependent case will be made in Section 7. For many applications in which both the X_iu and the e_iu appear non-normal, it would be desirable to bypass the normality assumptions, but I have no results for this situation.

The bias terms a and a_i in (2.1) affect the constant term in the regression of y on the x_i, but do not affect the multiple correlation coefficient between y and the x_i, and hence do not enter into the following sections on R'².

3. EFFECT ON R² WHEN X'S ARE INDEPENDENT

With k X-variates, the following notation will be used for the relevant population parameters:

σ_i², σ_y² are the variances of the correct X_iu and Y_u;
ε_i², ε² are the variances of e_iu and e_u;
ρ_ij is the correlation coefficient between X_iu and X_ju;
δ_i is the correlation coefficient between X_iu and Y_u.

The symbol δ_i is used instead of the more natural ρ_iy because this helps to avoid confusion between different kinds of correlation in later discussion. The sign attached to each x_iu is assumed chosen so that δ_i ≥ 0.

The value of R², the population squared multiple correlation between Y and the X_i, is completely determined by the ρ_ij and the δ_i. Primes will denote the corresponding quantities R'², ρ'_ij, δ'_i between the observed y and the x_i. From the assumptions we have

$$\rho'_{ij} = \frac{\operatorname{Cov}[(X_i + e_i), (X_j + e_j)]}{\sqrt{(\sigma_i^2 + \epsilon_i^2)(\sigma_j^2 + \epsilon_j^2)}} = \frac{\rho_{ij}\,\sigma_i\sigma_j}{\sqrt{(\sigma_i^2 + \epsilon_i^2)(\sigma_j^2 + \epsilon_j^2)}} \tag{3.1}$$

$$\delta'_i = \frac{\operatorname{Cov}[(X_i + e_i), (Y + e)]}{\sqrt{(\sigma_i^2 + \epsilon_i^2)(\sigma_y^2 + \epsilon^2)}} = \frac{\delta_i\,\sigma_i\sigma_y}{\sqrt{(\sigma_i^2 + \epsilon_i^2)(\sigma_y^2 + \epsilon^2)}} \tag{3.2}$$

In psychometric writings the quantity g_i = σ_i²/(σ_i² + ε_i²) is often called the coefficient of reliability of the measurement x_i. Similarly we write g_y = σ_y²/(σ_y² + ε²) for the coefficient of reliability of y. Hence, from (3.1) and (3.2),

$$\rho'_{ij} = \rho_{ij}\sqrt{g_i g_j}, \qquad \delta'_i = \delta_i\sqrt{g_i g_y}. \tag{3.3}$$

If the X's are mutually independent, it is well known that

$$R^2 = \sum_{i=1}^{k} \delta_i^2. \tag{3.4}$$

Since our assumptions guarantee that the x's are also independent,

$$R'^2 = \sum_{i=1}^{k} \delta_i'^2 = g_y \sum_{i=1}^{k} \delta_i^2 g_i. \tag{3.5}$$

Hence,

$$R'^2 = R^2 g_y \sum_{i=1}^{k} \delta_i^2 g_i \Big/ \sum_{i=1}^{k} \delta_i^2 = R^2 g_y g_w, \tag{3.6}$$

where g_w is a weighted mean of the coefficients of reliability of the x_i.

Consider now the residual variance from the regression. With the correct measurements this is σ_y²(1 − R²). With the fallible measurements it becomes

$$\sigma_y'^2(1 - R'^2) = \sigma_y^2 + \epsilon^2 - \sigma_y^2 g_w R^2 = \sigma_y^2(1 - R^2 g_w) + \epsilon^2, \tag{3.7}$$

since σ'_y² g_y = σ_y². The effect of errors of measurement of y with variance ε² is simply to increase the residual variance by ε², as is well known.

Regarding errors of measurement of the x_i, two points are worth noting from (3.7) in relation to applications. For a given reliability of measurement, i.e., a given value of g_w, the deleterious effect on the residual variance increases as R² increases, being greater when the prediction formula is very good than when it is mediocre. For example, suppose that g_w = 0.5, representing a poor reliability in measurement of the x_i. If R² = 0.9, the residual variance is increased by errors of measurement of the x_i from 0.1σ_y² to 0.55σ_y², over a fivefold increase. With R² = 0.4, the increase is only from 0.6σ_y² to 0.8σ_y², a 33 percent jump.

Second, as would be expected, the quality of measurement of those X_i that are individually good predictors is much more important than that of the poorer predictors. With k = 2, δ1 = 0.9, δ2 = 0.3, and no errors of measurement, we have R² = 0.90, (1 − R²) = 0.10. If g1 = 0.5, g2 = 1, this gives (1 − R'²) = 0.505, but with g1 = 1, g2 = 0.5, (1 − R'²) = 0.145, a much smaller increase.
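These consequences of (3.6) and (3.7) are easy to verify numerically. The sketch below (Python with NumPy; the code and its variable names are illustrative additions, not part of the original paper) simulates the k = 2 example just described, with independent standard normal X's, and compares the sample R'² with the prediction R²g_y g_w:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
deltas = np.array([0.9, 0.3])          # correlations of X1, X2 with Y
g_x = np.array([0.5, 1.0])             # reliabilities of x1, x2
g_y = 1.0                              # y measured without error here

# True X's: independent standard normals; Y built so that Corr(Xi, Y) = delta_i.
X = rng.standard_normal((n, 2))
resid_sd = np.sqrt(1.0 - np.sum(deltas**2))
Y = X @ deltas + resid_sd * rng.standard_normal(n)

# Fallible measurements: add independent errors with variance (1 - g)/g,
# so that g = var(X) / (var(X) + var(e)) with var(X) = 1.
eps_sd = np.sqrt((1.0 - g_x) / g_x)
x = X + rng.standard_normal((n, 2)) * eps_sd

def r_squared(preds, resp):
    # Squared multiple correlation computed from the correlation matrix.
    c = np.corrcoef(np.column_stack([preds, resp]), rowvar=False)
    rxx, rxy = c[:-1, :-1], c[:-1, -1]
    return rxy @ np.linalg.solve(rxx, rxy)

R2 = r_squared(X, Y)                   # about 0.90 = 0.81 + 0.09
g_w = np.sum(deltas**2 * g_x) / np.sum(deltas**2)
print(R2, r_squared(x, Y), R2 * g_y * g_w)   # the last two agree closely
```

With n this large, the sample value of R'² and the prediction R²g_y g_w (here about 0.495, so 1 − R'² is about 0.505) agree to roughly two decimal places.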


4. EFFECT OF CORRELATION BETWEEN X'S: TWO VARIATES

After working several numerical examples, my approach was to try to construct an approximation of the form R'² = R²g_y g_w f, where f is a correction factor to allow for the effect of correlations among the X's, being equal to 1 when the X's are independent. With numerous X variables, all intercorrelated, it soon appeared that no simple correction factor was likely to be generally applicable. However, we will continue to study the relation of R'² to R²g_w. Further, since the effect of errors in y under the present model is always just to introduce the factor g_y, this factor will be omitted in what follows so as to concentrate attention on correlations among the X's.

With two X-variates having a correlation ρ, the values of R² and R'² work out as follows:

$$R^2 = (\delta_1^2 + \delta_2^2 - 2\rho\delta_1\delta_2)/(1 - \rho^2) \tag{4.1}$$

$$R'^2 = (g_1\delta_1^2 + g_2\delta_2^2 - 2g_1 g_2\rho\delta_1\delta_2)/(1 - g_1 g_2\rho^2). \tag{4.2}$$

For given δ1, δ2, the correlation ρ lies within the limits δ1δ2 ± √[(1 − δ1²)(1 − δ2²)]; otherwise R² would exceed 1. Within these limits,

$$\frac{R'^2}{R^2} = \frac{(g_1\delta_1^2 + g_2\delta_2^2 - 2g_1 g_2\rho\delta_1\delta_2)(1 - \rho^2)}{(\delta_1^2 + \delta_2^2 - 2\rho\delta_1\delta_2)(1 - g_1 g_2\rho^2)}. \tag{4.3}$$

Inserting g_w = (g1δ1² + g2δ2²)/(δ1² + δ2²) as a factor, we may write

$$R'^2 = R^2 g_w (A)(B), \tag{4.4}$$

where B is the term

$$B = (1 - \rho^2)/(1 - g_1 g_2\rho^2). \tag{4.5}$$

For g1g2 < 1 and ρ ≠ 0, this term is always less than 1. For fixed g1g2 it decreases monotonically towards 0 as ρ moves from 0 towards either +1 or −1.

The remaining factor A takes the form

$$A = \left(1 - \frac{2\rho\delta_1\delta_2}{\delta_1^2/g_2 + \delta_2^2/g_1}\right)\Big/\left(1 - \frac{2\rho\delta_1\delta_2}{\delta_1^2 + \delta_2^2}\right). \tag{4.6}$$

For 0 < g1g2 < 1, it follows that A > 1 if ρ is positive while A < 1 if ρ is negative, since we have postulated that δ1, δ2 are both positive.

Hence, if ρ is negative the factor f = AB is always less than 1, decreasing towards zero as ρ approaches −1. If ρ is positive, the situation is not so clear, since A > 1 and B < 1. However, when ρ is small the factor A, which contains only linear terms in ρ, tends to dominate B, which is quadratic in ρ. Thus when ρ is positive, f = AB increases and is greater than 1 when ρ is small, but then decreases as ρ increases further, becoming less than 1 if ρ is high enough. The only exception is the case δ1 = δ2, g1 = g2 = g: f then reduces to (1 + ρ)/(1 + gρ), which increases from f = 1 at ρ = 0 to f = 2/(1 + g) at ρ = 1. Incidentally, when δ1 = δ2, the range of ρ is from (−1 + 2δ1²) to +1.

The size of the product g1g2 is also relevant to f. For given ρ, both A and B tend to approach 1 as g1g2 increases towards 1. Thus the formula R'² = R²g_w is closer to the truth when g1 and g2 are high.

Table 1. VALUES OF f = R'²/R²g_w FOR SIX EXAMPLES

                δ1 = .6, δ2 = .4              δ1 = .7, δ2 = .2
           g1, g2 = .9,.7  .8,.6  .7,.5   g1, g2 = .9,.7  .8,.6  .7,.5
    ρ      R²      f      f      f        R²      f      f      f

   -.5      a      --     --     --      .893    0.84   0.78   0.74
   -.4    .848    0.87   0.82   0.78     .764    0.89   0.85   0.81
   -.3    .730    0.91   0.88   0.85     .675    0.93   0.90   0.88
   -.2    .642    0.95   0.92   0.90     .610    0.96   0.94   0.93
   -.1    .574    0.98   0.97   0.96     .564    0.98   0.98   0.97
    0     .520    1.00   1.00   1.00     .530    1.00   1.00   1.00
    .1    .477    1.02   1.03   1.04     .507    1.01   1.02   1.02
    .2    .442    1.04   1.06   1.07     .494    1.02   1.02   1.03
    .3    .413    1.06   1.08   1.10     .490    1.02   1.02   1.03
    .4    .390    1.07   1.10   1.12     .498    1.00   1.00   1.01
    .5    .373    1.08   1.11   1.14     .520    0.98   0.97   0.97
    .6    .362    1.08   1.11   1.14     .566    0.94   0.91   0.90
    .7    .361    1.07   1.09   1.12     .655    0.86   0.82   0.79
    .8    .378    1.03   1.03   1.06     .850    0.73   0.67   0.63
    .9    .463    0.86   0.85   0.85      a      --     --     --

   g_w =          .838   .738   .638              .885   .785   .685

   a Impossible because R² > 1.

As an illustration, Table 1 shows the values of R² and f for ρ = −0.5(0.1)0.9 for six examples. In the first three, δ1 = 0.6, δ2 = 0.4; in the second three, δ1 = 0.7, δ2 = 0.2. In both sets, R² is close to 0.5 when ρ = 0. The three pairs g1, g2 = (0.9, 0.7), (0.8, 0.6), and (0.7, 0.5) are given. The principal difference between the cases δ1 = 0.6, δ2 = 0.4 and δ1 = 0.7, δ2 = 0.2 is as follows. When δ1 and δ2 differ greatly and ρ is positive, R² begins to increase and f to decrease for quite moderate values of ρ (around 0.3 for δ1 = .7, δ2 = .2), while when δ1 and δ2 are more nearly equal, R² decreases and f increases until ρ is closer to 1. The turning value of f is a complicated expression, but is usually close to that of R².

The complementary sets g1, g2 = (0.7, 0.9), (0.6, 0.8), (0.5, 0.7), not shown here, exhibit the same behavior, with f lying a little nearer 1 except for high positive ρ, when f becomes less than 1.

The results that f is less than 1 when ρ is negative and that f is usually greater than 1 when ρ is positive and moderate can be rationalized as follows. Suppose we compare the value of R² from formula (4.1) with the value R²_ind = δ1² + δ2² that R² would have if ρ were 0. With ρ negative, R² increases steadily as ρ departs from 0. With ρ positive, R² decreases at first but has a minimum at ρ = δ2/δ1 (where δ2 ≤ δ1), and thereafter increases. It does not reach R²_ind = (δ1² + δ2²) until ρ = 2δ1δ2/(δ1² + δ2²), which is high if δ1 and δ2 are not too different. Thus in a bivariate correlation, negative correlation between x1 and x2 increases R², while positive correlation decreases R² unless ρ is high enough.

Now the errors of measurement in the x's reduce ρ to ρ√(g1g2). Thus the errors diminish both a helpful negative ρ, making f < 1 when ρ is negative, and a harmful positive ρ, making f > 1 when ρ is positive and moderate.

In these examples g_w lies between 0.638 and 0.885. Regarding the crude approximation R'² ≈ R²g_w: in these examples it is correct to within ±15 percent for ρ lying between −0.3 and +0.5, and is much closer than this throughout most of Table 1.

To summarize for two X-variates: when ρ is negative, f < 1 because the negative correlation ρ' = ρ√(g1g2) is less helpful to R'² than the negative correlation ρ is to R². When ρ is positive and small or modest, f exceeds 1, because the harmful positive correlation between X1 and X2 is decreased by the errors of measurement. If ρ becomes high enough, however, positive correlation becomes helpful and f drops below 1. For given ρ, f departs further from 1 as the product g1g2 decreases.
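Since (4.1)-(4.5) are closed-form, the two-variate factor f is trivial to compute directly. The helper below is a hypothetical sketch of mine, not from the paper; it reproduces a few entries of the first block of Table 1:

```python
import numpy as np

def f_two_variates(rho, d1, d2, g1, g2):
    """f = R'^2 / (R^2 * g_w) for two fallible x's, from (4.1)-(4.4)."""
    R2 = (d1**2 + d2**2 - 2 * rho * d1 * d2) / (1 - rho**2)
    R2p = (g1 * d1**2 + g2 * d2**2 - 2 * g1 * g2 * rho * d1 * d2) / (1 - g1 * g2 * rho**2)
    g_w = (g1 * d1**2 + g2 * d2**2) / (d1**2 + d2**2)
    return R2, R2p / (R2 * g_w)

# A few entries of Table 1: delta = (.6, .4), g = (.9, .7)
for rho in (-0.3, 0.0, 0.5):
    R2, f = f_two_variates(rho, 0.6, 0.4, 0.9, 0.7)
    print(f"rho={rho:+.1f}  R2={R2:.3f}  f={f:.2f}")
# rho=-0.3  R2=0.730  f=0.91
# rho=+0.0  R2=0.520  f=1.00
# rho=+0.5  R2=0.373  f=1.08
```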

With three X-variates denoted by the subscripts i, j, and k, the value of R² may be expressed as

$$R^2 = \frac{\sum_i \delta_i^2(1 - \rho_{jk}^2) - 2\sum_{i>j}(\rho_{ij} - \rho_{ik}\rho_{jk})\,\delta_i\delta_j}{1 - \sum_{i>j}\rho_{ij}^2 + 2\rho_{12}\rho_{13}\rho_{23}}, \tag{4.7}$$

where in each term (i, j, k) denotes a permutation of (1, 2, 3),

while R'² has the corresponding value found by substituting δ'_i = δ_i√g_i and ρ'_ij = ρ_ij√(g_ig_j). These expressions are discouraging to the prospect of finding an approximation for f that would be valid over a wide range of values of the g_i and the ρ_ij. With regard to R² itself, (4.7) suggests that with all δ_i > 0, negative values of ρ_ij are likely to be helpful, since the only linear term in the ρ_ij is −2ρ_ijδ_iδ_j in the numerator.
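As a check on (4.7), the sketch below (mine, not part of the paper) compares it with the equivalent matrix form R² = δ'P⁻¹δ, where P is the 3×3 correlation matrix of the X's and δ the vector of their correlations with Y; the two computations agree for any admissible inputs:

```python
import numpy as np

def R2_matrix(deltas, P):
    # Squared multiple correlation from correlations: R^2 = delta' P^{-1} delta.
    d = np.asarray(deltas)
    return d @ np.linalg.solve(P, d)

def R2_three(d, r12, r13, r23):
    # Formula (4.7) for three X-variates, written out term by term.
    num = (d[0]**2 * (1 - r23**2) + d[1]**2 * (1 - r13**2) + d[2]**2 * (1 - r12**2)
           - 2 * ((r12 - r13 * r23) * d[0] * d[1]
                  + (r13 - r12 * r23) * d[0] * d[2]
                  + (r23 - r12 * r13) * d[1] * d[2]))
    den = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    return num / den

P = np.array([[1.0, 0.2, 0.3], [0.2, 1.0, -0.1], [0.3, -0.1, 1.0]])
d = [0.6, 0.5, 0.4]
print(R2_three(d, 0.2, 0.3, -0.1), R2_matrix(d, P))   # identical values
```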

With 3 X-variables, a table like Table 1 involves choosing the values of 9 parameters: 3 δ_i, 3 g_i, and 3 ρ_ij. The situation rapidly becomes more complex with more than 3 X's. Before proceeding further, we digress to consider the values of the ρ_ij and the g_i likely to occur in practice, in the hope that a smaller range of ρ_ij and g_i might be representative of many practical situations.

5. SOME VALUES OF rij IN PRACTICAL APPLICATIONS

When the sign attached to each x_i is chosen so that δ_i ≥ 0, these decisions determine the sign attached to every ρ_ij. In studying the estimates r_ij of the ρ_ij found in 12 well-known examples of the discriminant function [1], I noted that most of the r_ij are positive and modest in size, while those that are negative are usually small. The same situation appears to hold in many applications of multiple regression. Table 2 shows the distributions of the r_ij in (a) the discriminant function examples, (b) the numerical examples of multiple regression given in 12 standard statistical texts, and (c) a single large example: the prediction of verbal ability scores of 12th grade white students in the north of the U.S. from 20 variables representing data on the student, the quality of the school, and the student's home environment [3].

Table 2. DISTRIBUTIONS OF ESTIMATED CORRELATIONS BETWEEN X'S

         r_ij negative                        r_ij positive
               D.F.  Texts  Verbal                  D.F.  Texts  Verbal
                            ability                              ability
  < -.5          1     1      0       0 to .1        15     5     58
  -.5 to -.4     2     0      0       .1 to .2       22     8     41
  -.4 to -.3     1     0      1       .2 to .3       25     7     25
  -.3 to -.2     4     2      2       .3 to .4       18     9      7
  -.2 to -.1     4     2     10       .4 to .5        6     7      3
  -.1 to 0       9     5     36       .5 to .6       10     6      6
                                      .6 to .7        4     5      1
                                      .7 to .8        1     4      0
                                      > .8            0     3      0
  Total         21    10     49       Total         101    54    141
  Average r  -0.19 -0.17  -0.09       Average r   +0.30 +0.41  +0.16

The percentages of r's that are positive are 83, 84, and 74 in the three sets. The negative r's average between −0.2 and 0, the averages of the positive r's being a bit higher (0.16 to 0.41). Though some allowance is needed for the sampling errors of the r_ij, since our interest is in the unknown ρ_ij, my impression is that most of the degrees of freedom are large enough that the effects of sampling errors on the average r's should be small. In the discriminant function and text examples there may have been some selection towards more interesting examples, but this would probably affect the sizes of the δ_i rather than the ρ_ij.

In calculations for three or more x's, these results led me to concentrate on ρ_ij less than 0.5, mainly in two cases: (1) all ρ_ij positive, and (2) only a minority negative.

6. ESTIMATING RELIABILITY

With variables that are hard to measure, the problem of estimating the reliability of measurement is itself formidable. There is an extensive literature dealing with the study of errors of measurement, particularly in sample surveys. However, much of this is not usable for my purpose, because it concentrates on the overall bias in a measuring process, or deals essentially with 0-1 variates, or, when handling continuous variates, arranges them in an ordered classified form and reports only the percentages of cases in which the fallible method was wrong by one or more classes.

Direct estimation of g is possible only when it is feasible to measure both the correct value X and the fallible value x for a sample of items. Data of this type are likely to be available only in the simpler problems of measurement in which a recognized correct method exists. Statements of age, for instance, might be compared with birth certificate records. To cite a few examples, Gray [5] reports g = 0.93 for verbal reports of the number of days of sick leave taken during the period July 1 through November 15, as compared with the firm's records, and g = 0.90 for the number of days of annual leave taken in the summer months. Data of Kish and Lansing [9] suggest g around 0.83 for appraisers' estimates of the selling prices of homes, as compared with actual prices for those homes that had been sold recently. Data collected for the National Center for Health Statistics [11] provide g = 0.86 for annual family expenditure on medical care as obtained from a short questionnaire.

Under the Consumer Savings Project, studies have been made of the accuracy of the reporting of financial assets. Ferber [4] presents results of urban and rural studies of time deposits in savings institutions, with validation from the records of these institutions. Sample averages gave underestimates of as much as 50 percent or more, because (a) nonrespondents had larger savings than respondents, and (b) among respondents, a substantial number erroneously reported no savings. Respondents who reported savings had generally high accuracy. Thus the distribution of errors of measurement consists of one set of e_i that are small and have mean almost zero, and another set in which e_i = −X_i since x_i = 0. Quite apart from the bias, these features make the distribution of errors far from normal and give values of g as low as 0.1 or 0.2.

A more widely used technique for estimating g, particularly when no method of making a correct measurement is known, is to attempt to take two independent measurements x_1 = X + e_1, x_2 = X + e_2, where Cov(e_1, e_2) = 0, on a sample of specimens by the fallible process. Assuming further that X is uncorrelated with e_1 and e_2, the covariance of x_1 and x_2 estimates σ_X², so that g is estimated by Cov(x_1, x_2)/s_x², where s_x² is the pooled within-sample mean square for x_1 and x_2.

With independence, this method works even if x_1 and x_2 are alternative forms of the fallible process that have different g's, since Cov(x_1, x_2)/s_x1² estimates g_1 for x_1 and Cov(x_1, x_2)/s_x2² estimates g_2 for x_2. This method is frequently quoted in the rating of examinations, where x_1 and x_2 may be split halves of an exam or alternative forms of the exam. Sulzmann et al. [13] report g values averaging 0.81 for test-retest values of standard visual acuity tests. An average g = 0.71 was computed from repetitions of the one-hour blood glucose level (after overnight fasting) in the test for the detection of diabetes [7]. In a review of applications of the Wechsler Intelligence Scale given to normal children aged 5-15, Sells [12] reports g values for 44 subtests in different studies by split-half or test-retest methods. These subtests gave an average g of 0.83, with a range from 0.65 to 0.96, while 16 subtests on retarded or guidance clinic children averaged g = 0.89. Many studies have reported correlations between the Wechsler and the Stanford-Binet scores as alternative measures of facets of intelligence. On normals, the average g was 0.75 (129 subtests), as against an average g = 0.69 for 23 subtests on mental defectives or retarded children [12].
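A minimal simulation of this estimator (my sketch; the sample size and the true reliability g = 0.8 are arbitrary choices) generates two independent repeat measurements and recovers g from Cov(x_1, x_2)/s_x²:

```python
import numpy as np

rng = np.random.default_rng(7)
n, g_true = 50_000, 0.8

# One true value per specimen, measured twice with independent errors.
X = rng.standard_normal(n)                         # var(X) = 1
eps_sd = np.sqrt((1 - g_true) / g_true)            # error variance giving g = 0.8
x1 = X + eps_sd * rng.standard_normal(n)
x2 = X + eps_sd * rng.standard_normal(n)

cov12 = np.cov(x1, x2)[0, 1]                       # estimates var(X)
s2 = 0.5 * (np.var(x1, ddof=1) + np.var(x2, ddof=1))  # pooled mean square
print(cov12 / s2)                                  # close to 0.8
```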

Numerous complications can arise with this method. The assumption of independence of the two repetitions is vital. With positive correlation ρ between e_1 and e_2 (e.g., memory of the same wrong answer on two different occasions), the quantity Cov(x_1, x_2)/s_x² overestimates g. In the simplest correlated case, in which X is uncorrelated with e_1 or e_2 and ε_1² = ε_2² = ε², the quantity Cov(x_1, x_2)/s_x² is a consistent estimate of {g + ρ(1 − g)}. The overestimation can easily be substantial. On the other hand, if the correct X varies with time and if the reliability of measurement as of a specific time is wanted, a test-retest correlation may underestimate this g, since the retest for a subject is measuring a different X from the test. This effect may account for the low g values of 0.22 for respiration period, 0.36 for white blood cell count, and 0.65 for blood sugar content reported by Guilford [6] from retests on subjects one day later.

In fact, when X varies through time, the relevant variable for inclusion in the multiple correlation may be a complicated function of the levels of X through a period of time, and the appropriate function may not be known. This is presumably the case, for instance, in attempts to relate the amount of cigarette smoking to the death rate. Estimates of the reliability of estimation of X for a specific day or short period of time may be only partially relevant. A similar comment applies to attempts to measure personal traits such as aggressiveness, shyness, or leadership, where the scores on alternative forms of a questionnaire may correlate highly, but neither form may measure the trait as named by the investigator.

For these reasons I am uncertain about the range of g values that can be used to describe practical applications in difficult measurement problems. My calculations have been done for the range g ≥ 0.5. With g = 0.5, the variance of the errors of measurement equals the variance of the true measurements, and this seems a rather low standard of measurement. For g < 0.5, such as occurs in reports of consumer savings, my approximation for R² becomes poor. Incidentally, for a given variance of the errors of measurement, the value of g of course depends on σ_X²; for the same variable, g can be much higher in a widespread population than in a narrowly restricted one. The low value g = 0.41 reported by Kinsey [8] for "age at first knowledge of venereal disease" may reflect the fact that the correct X for this variable has a low standard deviation.

7. EFFECT OF ERRORS WITH MORE THAN TWO X VARIATES

Returning to the relation between R'² and R² with k X-variates (k > 2), assume first that all ρ_ij > 0. The simplest situation is that in which ρ_ij = ρ > 0 and g_i = g. In this case R² and R'² have the values

$$R^2 = \frac{\sum \delta_i^2}{1 + (k-1)\rho}\left[1 + \frac{k\rho \sum(\delta_i - \bar\delta)^2}{(1-\rho)\sum \delta_i^2}\right] \tag{7.1}$$

$$R'^2 = \frac{g\sum \delta_i^2}{1 + (k-1)g\rho}\left[1 + \frac{kg\rho \sum(\delta_i - \bar\delta)^2}{(1-g\rho)\sum \delta_i^2}\right]. \tag{7.2}$$

If the δ_i are approximately equal, i.e., the X_i are individually about equally good, the first terms in (7.1) and (7.2) dominate. Then we have

$$f = \frac{R'^2}{gR^2} \approx \frac{1 + (k-1)\rho}{1 + (k-1)g\rho}. \tag{7.3}$$

For ρ positive, this f exceeds 1 and increases steadily as ρ goes from 0 to 1. For given g and ρ, the excess above 1 increases with k, the number of variables.

The second terms inside the brackets in (7.1) and (7.2) begin to assume importance when the δ_i vary substantially. The ratio of the second term in (7.2) to that in (7.1) is g(1 − ρ)/(1 − gρ), which is less than 1 and decreases as ρ increases. The effect of this term, in conjunction with the first term, is to make the amount by which f rises above 1 smaller for positive and moderate values of ρ.

Table 3 shows three examples with positive ρ_ij and 3, 5, and 10 x-variates, respectively. Values of f = R'²/gR² are given for g = 0.9(0.1)0.5 and ρ = .2, .3, .4.

Table 3. VALUES OF f = R'²/gR² FOR THREE EXAMPLES WITH POSITIVE ρ

          k = 3             k = 5                    k = 10
          δi = .6, .5, .4   δi = .5, (.4)², .3, .2   δi = .5, .4, (.3)², (.2)³, (.1)³
  g\ρ      .2    .3    .4    .2    .3    .4           .2    .3    .4

  0.9     1.03  1.03  1.04  1.04  1.04  1.03         1.02  1.01  0.98
  0.8     1.06  1.07  1.08  1.08  1.08  1.07         1.05  1.02  0.98
  0.7     1.09  1.11  1.13  1.12  1.13  1.13         1.08  1.04  0.98
  0.6     1.12  1.16  1.18  1.17  1.19  1.19         1.13  1.08  1.00
  0.5     1.15  1.21  1.25  1.23  1.26  1.26         1.18  1.12  1.03

The examples for three and five variables show rather similar values of f. Two offsetting influences are at work: k = 5 has more variables, associated with higher f's, but also has more variation among the δ_i, leading to lower f's. The k = 10 example, which has still more variation among the δ_i, gives values of f usually nearer 1. With all ρ's positive, the value of f also tends to increase steadily as g diminishes.
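Entries of Table 3 can be reproduced from (7.1) and (7.2) directly. The short sketch below (mine, not part of the paper) computes f = R'²/gR² for the k = 3 example:

```python
import numpy as np

def f_equicorrelated(deltas, rho, g):
    """f = R'^2/(g R^2) when all rho_ij = rho and all g_i = g, from (7.1)-(7.2)."""
    d = np.asarray(deltas)
    S, V = np.sum(d**2), np.sum((d - d.mean())**2)
    R2  = S / (1 + (len(d) - 1) * rho) * (1 + len(d) * rho * V / ((1 - rho) * S))
    R2p = g * S / (1 + (len(d) - 1) * g * rho) \
          * (1 + len(d) * g * rho * V / ((1 - g * rho) * S))
    return R2p / (g * R2)

print(round(f_equicorrelated([0.6, 0.5, 0.4], rho=0.3, g=0.7), 2))  # 1.11, as in Table 3
```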

Table 4 presents four examples in which a minority of the ρ_ij are negative. The values of k are 3, with one ρ_ij out of 3 negative; 5, with 4 ρ_ij out of 10 negative; and 10, with 9 ρ_ij out of 45 negative. As Table 4 shows, with these mixtures of positive and negative ρ_ij the values of f stay much closer to unity than when all ρ_ij are positive. The main reason is presumably that both the helpful negative ρ_ij and the harmful positive ρ_ij are decreased by the errors in the x's.

Table 4. VALUES OF f = R'²/gR² FOR EXAMPLES WITH SOME ρij NEGATIVE

          k = 3, δi = .6, .5, .4               k = 3, δi = .6, .5, .4
  ρij =   .2,.2,-.2  .3,.3,-.2  .4,.4,-.2      .3,.3,-.2  .3,.3,-.3  .3,.3,-.4

  g = 0.9   1.01       1.02       1.02           1.02       1.01       0.98
      0.8   1.02       1.04       1.05           1.04       1.02       0.98
      0.7   1.04       1.06       1.08           1.06       1.03       0.98
      0.6   1.05       1.09       1.11           1.09       1.05       0.99
      0.5   1.07       1.11       1.15           1.11       1.07       1.00

          k = 5, δi = .5, .4, .3, .2, .4       k = 10, δi = .5, .4, (.3)², (.2)², (.1)³, .2
  ρij =   .2 (j≠5)   .3 (j≠5)   .4 (j≠5)       .2 (j≠10)  .3 (j≠10)  .4 (j≠10)
         -.2 (j=5)  -.2 (j=5)  -.2 (j=5)      -.2 (j=10) -.2 (j=10) -.2 (j=10)

  g = 0.9   0.99       0.99       0.99           1.01       0.98       0.99
      0.8   0.97       0.99       0.99           1.02       1.02       0.99
      0.7   0.96       0.99       1.00           1.03       1.03       1.02
      0.6   0.96       0.99       1.00           1.06       1.06       1.02
      0.5   0.95       1.00       1.01           1.09       1.10       1.06

  In the k = 5 and k = 10 blocks, ρij = −.2 for pairs involving the last variate and takes the positive value shown for all other pairs.

When the g_i are very unequal and some ρ_ij are negative, f is more erratic, though its behavior still follows the lines indicated above. For instance, if the g_i are such that the harmful correlations remain about the same but the helpful ones are much reduced, f can be substantially less than 1. In the opposite case, the excess of f above 1 can be greater than indicated by Table 4.

So far as they go, the preceding results suggest that the relation R'² = R²g_y g_w may serve as a rough guide to the effect of errors of measurement on the squared multiple correlation coefficient in many applications. The value of R'² may be up to 10 percent higher than this if most correlations are positive and harmful and the g values exceed 0.7, and up to 25 percent higher if the g's are as low as 0.5. If we are lucky enough to have mainly helpful correlations among the true X's, the errors of measurement tend to reduce them, so that R'² may be slightly less than R²g_y g_w. These results assume that the errors of measurement e_i are independent of one another and of the correct X's.

In the situation in which e_i is correlated with X_i there is a choice of mathematical models, and it is not obvious which best describes practical conditions. To specify fully the correlations among the four variables X_1, e_1, X_2, e_2 requires six correlation coefficients (with, of course, some restrictions among their values). If we specify the correlations between X_1 and e_1, X_2 and e_2, and X_1 and X_2, there remain three correlations to be specified. One set of assumptions that seems not unreasonable is that in which

$$e_i = \lambda_i(X_i - \mu_i) + e_i', \tag{7.4}$$

where λ_i is the regression coefficient of e_i on X_i and the residuals e_i' are uncorrelated with any X_i, X_j, or e_j' and have variance ε_i'². In this model the correlations between e_i and X_j and between e_i and e_j are only those that arise as reflections of the correlations of e_i with X_i and of e_j with X_j. Consequences of this model are

$$x_i = X_i + e_i = (1 + \lambda_i)X_i - \lambda_i\mu_i + e_i',$$

$$\delta_i' = \frac{(1 + \lambda_i)\sigma_i\delta_i}{\sqrt{(1 + \lambda_i)^2\sigma_i^2 + \epsilon_i'^2}},$$

$$\rho_{ij}' = \frac{(1 + \lambda_i)(1 + \lambda_j)\sigma_i\sigma_j\rho_{ij}}{\sqrt{\{(1 + \lambda_i)^2\sigma_i^2 + \epsilon_i'^2\}\{(1 + \lambda_j)^2\sigma_j^2 + \epsilon_j'^2\}}}.$$

In the original model with e_i independent of X_i we had δ_i' = √g_i δ_i and ρ_ij' = √(g_ig_j) ρ_ij. These relations continue to hold in this situation if g_i is redefined as

$$g_i = (1 + \lambda_i)^2\sigma_i^2 / \{(1 + \lambda_i)^2\sigma_i^2 + \epsilon_i'^2\}. \tag{7.5}$$

In this redefinition g_i no longer equals σ_Xi²/σ_xi² but (1 + λ_i)²σ_Xi²/σ_xi². However, if x_i1, x_i2 are two repeat estimates of X_i in this model and if e'_i1 and e'_i2 are independent, the quantity Cov(x_i1, x_i2)/s_xi² still equals this redefined g_i value, so that the usual method of estimating reliability by repeat measurements still estimates the relevant g_i.

For given σ_xi², it follows from (7.4) that ε_i'² = ε_i² − λ_i²σ_i², so that the relevant g_i in (7.5) becomes

$$g_i = (1 + \lambda_i)^2\sigma_i^2 / \{(1 + 2\lambda_i)\sigma_i^2 + \epsilon_i^2\}.$$

This is an increasing function of λ_i. Thus, under this model, positive correlation between e_i and X_i mitigates the effect of errors of measurement, which is not surprising, since part of the error e_i is doing the same work as X_i. Negative correlation accentuates the effects.
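Under the assumptions of (7.4), a quick simulation (my sketch; the parameter values λ = 0.4, σ² = 1, ε'² = 0.5 are arbitrary) confirms that the repeat-measurement estimator Cov(x_i1, x_i2)/s_xi² recovers the redefined g_i of (7.5):

```python
import numpy as np

rng = np.random.default_rng(3)
n, lam, sigma2, eps_prime2 = 200_000, 0.4, 1.0, 0.5

X = rng.standard_normal(n) * np.sqrt(sigma2)       # true values, mu = 0
# Two repeat measurements whose errors share the systematic component lam * X:
x1 = X + lam * X + np.sqrt(eps_prime2) * rng.standard_normal(n)
x2 = X + lam * X + np.sqrt(eps_prime2) * rng.standard_normal(n)

g_redef = (1 + lam)**2 * sigma2 / ((1 + lam)**2 * sigma2 + eps_prime2)  # (7.5)
s2 = 0.5 * (np.var(x1, ddof=1) + np.var(x2, ddof=1))
print(np.cov(x1, x2)[0, 1] / s2, g_redef)          # both about 0.797
```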

8. DISCUSSION

As indicated in the introduction, my interest in this problem was stimulated primarily by two kinds of incidents. Occasionally I see a multiple regression study based on perhaps 50-80 independent variables involving the measurement of many complex aspects of human behavior, motives, and attitudes, the data obtained from questionnaires filled out in a hurry by apparently disinterested graduate students. The proposal to consign this material at once to the circular file (except that my current wastebasket is rectangular) has some appeal. Second, many multiple regression studies seem to give disappointing results, the discussion in the paper being mainly a search for reasons why R'² as computed from the data turned out to be so low. Where difficult measurement problems are involved, I have sometimes wondered whether errors of measurement in y or the x's may not supply a substantial part of the explanation.

If it can be trusted to apply to a reasonable number of practical applications, the result that R'² equals or may somewhat exceed R²g_y g_w throws some light on both of the preceding examples. The result reinforces the need to learn more about the sizes of coefficients of reliability.

Errors of measurement in the x_i also affect other common uses of multiple regressions. For instance, if x_1 and x_2 are two variables, or linear combinations of two sets of variables, each set measuring a distinct aspect of behavior, the investigator sometimes reports (a) the percentage of the variance of y uniquely associated with x_1, i.e., the percentage uncorrelated with x_2, (b) the percentage uniquely associated with x_2, and (c) the percentage that is "common" to x_1 and x_2. With δ1 = δ2 = 0.5, ρ = 0.3, and no errors of measurement, these percentages are (a) 13.5 percent, (b) 13.5 percent, and (c) 11.5 percent, adding to R² = 38.5 percent. If g1 = 0.8, g2 = 1, the percentages become (a) 10.6 percent, (b) 15.6 percent, and (c) 9.4 percent, and with g1 = 0.6, g2 = 1 they become (a) 7.8 percent, (b) 17.8 percent, and (c) 7.2 percent. In this last case the variable x_2 would be reported as having a more important association with y than x_1, although the two variables are of equal importance if both are measured correctly.
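The percentages quoted above follow from (4.2), applied with the attenuated correlations of (3.3), the unique share of each variable being R'² minus the squared simple correlation of y with the other variable. A sketch of mine (not from the paper) that reproduces them:

```python
import numpy as np

def partition(d1, d2, rho, g1=1.0, g2=1.0):
    """Unique and common percentages of var(y) for two fallible x's."""
    d1p, d2p, rhop = d1 * np.sqrt(g1), d2 * np.sqrt(g2), rho * np.sqrt(g1 * g2)
    R2 = (d1p**2 + d2p**2 - 2 * rhop * d1p * d2p) / (1 - rhop**2)
    u1, u2 = R2 - d2p**2, R2 - d1p**2        # unique to x1, unique to x2
    return np.round(100 * np.array([u1, u2, R2 - u1 - u2]), 1)

print(partition(0.5, 0.5, 0.3))              # [13.5 13.5 11.5]
print(partition(0.5, 0.5, 0.3, g1=0.6))      # [ 7.8 17.8  7.2]
```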

As discussed elsewhere [2], errors of measurement also affect the interpretation of the regression coefficients. In brief, if σ_ij is the covariance of X_i and X_j, and if σ'_ij = σ_ij when i ≠ j while σ'_ii = σ_i² + ε_i², then in the simplest case, when e_i and X_i are independent, we have

$$\beta_i' = \beta_i - \sum_{j=1}^{k} \sigma^{ij\prime}\,\epsilon_j^2\,\beta_j,$$

where β_i' is the regression coefficient of y on x_i, while σ^{ij}′ is the (i, j) element of the inverse of the matrix (σ'_ij). Separating out the coefficient of β_i on the right, we get

$$\beta_i' = \beta_i(1 - \sigma^{ii\prime}\epsilon_i^2) - \sum_{j \neq i} \sigma^{ij\prime}\,\epsilon_j^2\,\beta_j.$$


Now σ^{ii}′ ≥ 1/(σ_i² + ε_i²), so that the coefficient of β_i in β_i' is at most σ_i²/(σ_i² + ε_i²) = g_i. Thus the direct effect of an error in X_i on β_i is to decrease its absolute value to g_iβ_i or something less, but β_i' also receives contributions from errors of measurement in any other X_j that is correlated with X_i. Even if such errors occur only in X_i, they can affect the values of all the β_j'. By working a few examples with varying g_i and β_i, it becomes evident that interpretation of the β_i' as if they were the β_i can become quite misleading unless all the g_i are high.
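A small numerical example of these formulas (my sketch; the covariance values are invented for illustration) shows how an error of measurement in X_1 alone, with g_1 = 0.8, biases both coefficients:

```python
import numpy as np

# True covariance matrix of (X1, X2), error variances, and true betas.
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])
eps2  = np.array([0.25, 0.0])            # only X1 measured with error (g1 = 0.8)
beta  = np.array([1.0, 1.0])

# beta' = beta - inv(Sigma + D_eps) D_eps beta, from the formulas above.
Sigma_obs = Sigma + np.diag(eps2)
beta_obs = beta - np.linalg.solve(Sigma_obs, np.diag(eps2) @ beta)
print(beta_obs)                          # about [0.78, 1.06]: beta2' is biased too
```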

REFERENCES

[1] Cochran, W. G., "On the Performance of the Linear Discriminant Function," Bulletin de l'Institut International de Statistique, 39, 2 (1961), 435-47.
[2] Cochran, W. G., "Errors of Measurement in Statistics," Technometrics, 10 (1968), 637-66.
[3] Coleman, J. S., et al., Equality of Educational Opportunity, Washington: U.S. Government Printing Office, 1966.
[4] Ferber, R., "The Reliability of Consumer Surveys of Financial Holdings: Time Deposits," Journal of the American Statistical Association, 60 (1965), 148-63.
[5] Gray, P. G., "The Memory Factor in Social Surveys," Journal of the American Statistical Association, 50 (1955), 344-63.
[6] Guilford, J. P., Personality, New York: McGraw-Hill, 1959.
[7] Hayner, N. S., et al., "The One-Hour Glucose Tolerance Test," National Center for Health Statistics, Series 2, No. 3, 1963.
[8] Kinsey, A. C., Pomeroy, W. B., and Martin, C. E., Sexual Behavior in the Human Male, Philadelphia: W. B. Saunders, 1948.
[9] Kish, L., and Lansing, J. B., "Response Errors in Estimating the Value of Homes," Journal of the American Statistical Association, 49 (1954), 520-38.
[10] Lindley, D. V., "Regression Lines and the Linear Functional Relationship," Journal of the Royal Statistical Society, B, 9 (1947), 218-24.
[11] National Center for Health Statistics, "Measurement of Personal Health Expenditures," Series 2, No. 2, 1963.
[12] Sells, S. B., "Evaluation of Psychological Measures Used in the Health Examination Survey," National Center for Health Statistics, Series 2, No. 15, 1966.
[13] Sulzmann, J. H., et al., data quoted in "Comparison of Two Vision-Testing Devices," National Center for Health Statistics, Series 2, No. 1, 1966.
