Galton's Problem as Spatial Autocorrelation: Comments on Ember's Empirical Test
Author(s): Colin Loftin
Source: Ethnology, Vol. 11, No. 4 (Oct., 1972), pp. 425-435
Published by: University of Pittsburgh - Of the Commonwealth System of Higher Education
Stable URL: http://www.jstor.org/stable/3773073


Galton's Problem as Spatial Autocorrelation: Comments on Ember's Empirical Test¹

Colin Loftin Brown University

In a recent paper in this journal Melvin Ember (1971) dismisses Galton's problem (the problem of interdependent observations in cross-cultural research) as a serious methodological problem and suggests (Ember 1971: 106) ". . . that cross-cultural researchers can safely ignore Galton's problem so long as their samples are selected in some random fashion." His argument rests primarily on empirical research which shows that historically interdependent samples do not produce consistently higher estimates of the strength of relationship between variables than do samples that are relatively independent. Since most of the literature on Galton's problem leads one to assume that the very essence of the problem is the "artificial inflation of correlations," Ember's discovery that interdependent samples do not necessarily lead to spuriously high correlations is an important contribution to the literature. Unfortunately, however, his conclusion that Galton's problem can safely be ignored is quite misleading and should not be allowed to stand without correction.

Ember's own data clearly show that Galton's problem is a very serious one that cannot be solved simply by random sampling. If the problem is ignored, and standard procedures are used, serious errors are likely to result. The basic problem is simple enough. The purpose of sampling is to provide a basis for making inferences about the nature of some universe or population. We usually assume (in fact, all the statistical tests of significance that are commonly used in cross-cultural research are based on the assumption) that the larger the size of the sample, the more confidence we can have in the validity of our inferences about the nature of the population. This is true, however, only to the extent that the sample approximates an independent random sample. If the units in the sample are interdependent (that is, if knowledge of the properties of one case allows one to predict the properties of other cases in the sample), then examination of these additional cases tells us relatively less about the nature of the population from which the sample was drawn than would be the case if the units were independent. In the extreme case in which the sample is completely interdependent, every case in the sample will be exactly like every other case, and after we examine the first case, we learn nothing new about the population; we are in effect looking at the same case over and over again. However large the actual size of a sample may be, interdependence means that its effective size is smaller, the difference between the actual and the effective samples depending on the degree of interdependence.
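The notion of an effective sample size can be given a rough numerical form. The sketch below is an illustration only: it assumes, beyond anything stated in the text, that every pair of cases in the sample shares one common correlation `rho`, in which case the familiar design-effect expression n / (1 + (n - 1) rho) applies. The function name and the values of `rho` are hypothetical.

```python
def effective_sample_size(n: int, rho: float) -> float:
    """Effective number of independent cases among n equicorrelated cases.

    Assumption (not from the paper): every pair of cases shares the same
    correlation rho, with 0 <= rho <= 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1")
    return n / (1.0 + (n - 1) * rho)


if __name__ == "__main__":
    # A nominal sample of twenty cases under increasing interdependence.
    for rho in (0.0, 0.1, 0.5, 1.0):
        print(f"n=20, rho={rho:.1f} -> effective n = {effective_sample_size(20, rho):.2f}")
```

Under this assumption a fully interdependent sample (rho = 1) is worth a single case whatever its nominal size, and for any positive rho the effective size cannot exceed 1/rho however many cases are added.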

EMBER'S RESEARCH

To understand more clearly why Ember's conclusions are misleading it will be helpful to look at his research design and the results of his study.

The general procedure that Ember used was to compare correlations computed for three types of samples that were designed to vary systematically with respect to the amount of historical relatedness (i.e., interdependence). According to his reasoning, if the interdependence of cases artificially inflates correlations, the mean size of the correlations should vary systematically with the type of sample. The greater the historical relatedness, the higher the correlation. The three types of samples were defined as follows:

1. Language Family Samples: random selection from the population of all language families listed in the Ethnographic Atlas (Murdock 1967) which contain a minimum of twenty societies. Once the language family was selected, a sample of twenty societies was selected from that language family with the aid of a table of random numbers.

2. Simple Random Samples: random selection of societies listed in the Ethnographic Atlas exclusive of societies already selected in the Language Family Samples.

3. Purified Random Samples: random selection except that any case which either belonged to the same language family as a previously selected case or was located less than ten degrees (latitude or longitude) from a previously selected case was excluded. Cases previously selected for Language Family and Simple Random Samples were also excluded.
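The third procedure can be sketched in code. This is not Ember's actual procedure, only a minimal illustration of the exclusion rule as described above. The `Society` record, the `POOL` data, and the function names are invented, and the proximity rule is read, as an assumption, to mean that a candidate is too close when both its latitude gap and its longitude gap from a selected case are under ten degrees.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Society:
    name: str
    family: str   # language family
    lat: float    # degrees
    lon: float    # degrees


def too_close(a: Society, b: Society, min_degrees: float = 10.0) -> bool:
    """Assumed reading of the rule: both coordinate gaps are under the limit."""
    dlat = abs(a.lat - b.lat)
    dlon = abs(a.lon - b.lon)
    dlon = min(dlon, 360.0 - dlon)  # allow for wrap-around at 180 degrees
    return dlat < min_degrees and dlon < min_degrees


def purified_sample(pool, k, rng, excluded=frozenset()):
    """Draw up to k societies in random order, skipping related or nearby cases."""
    candidates = [s for s in pool if s.name not in excluded]
    rng.shuffle(candidates)
    chosen = []
    for cand in candidates:
        if len(chosen) == k:
            break
        related = any(cand.family == c.family or too_close(cand, c) for c in chosen)
        if not related:
            chosen.append(cand)
    return chosen


# Invented records for illustration only; not data from the Ethnographic Atlas.
POOL = [
    Society("A1", "FamA", 5.0, 10.0),
    Society("A2", "FamA", 40.0, 100.0),
    Society("B1", "FamB", 8.0, 12.0),
    Society("B2", "FamB", -30.0, -60.0),
    Society("C1", "FamC", 60.0, 30.0),
    Society("D1", "FamD", -10.0, 140.0),
    Society("E1", "FamE", 62.0, 35.0),
]

if __name__ == "__main__":
    for s in purified_sample(POOL, k=4, rng=random.Random(0)):
        print(s.name, s.family, s.lat, s.lon)
```

The `excluded` argument stands in for the cases already used in the Language Family and Simple Random Samples. Note that the rule screens only on language family and distance, so it reduces historical relatedness without guaranteeing independence.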

Six samples of approximately twenty cases were selected using each of the three procedures, but the usable samples ranged in size from sixteen to twenty because cases with missing data were excluded from the analysis. Five different relationships were examined, including four substantive hypotheses and replication for one of the hypotheses. Thus the result was 90 correlations computed from 90 different samples (3 types of samples × 6 samples of each type × 5 relationships = 90). The four hypotheses were: (1) patrilocal residence with patrilineal descent (this one was replicated), (2) male genital mutilation around puberty with polygyny, (3) social stratification with complex political organization, and (4) bride-price with herding of bovine animals. The measure of correlation was the phi coefficient.


TABLE 1
Data from Ember's Study

Phi Coefficients Between Patrilocal Residence and Patrilineal Descent

          Language Family   Simple Random   Purified Random
          Samples           Samples         Samples
           .134              .903            .500
           .229              .406            .664
           .312              .734            .577
           .681              .367            .522
           .638              .704            .738
           .577              .471            .899
Mean       .428              .597            .649
S.D.       .232              .214            .151

Phi Coefficients Between Patrilocal Residence and Patrilineal Descent (Replication)

          Language Family   Simple Random   Purified Random
          Samples           Samples         Samples
           .000              .538            .408
           .185              .664            .471
           .302              .667            .577
           .373              .704            .599
           .553              .811            .698
           .553              .904            .738
Mean       .327              .714            .581
S.D.       .215              .127            .127

Phi Coefficients Between Male Genital Mutilation Around Puberty and Polygyny

          Language Family   Simple Random   Purified Random
          Samples           Samples         Samples
           .000              .000            .145
           .000              .267           -.171
           .000              .100            .000
           .000              .343            .000
           .130              .392            .183
           .000              .533            .184
Mean       .021              .272            .056
S.D.       .053              .195            .140

Phi Coefficients Between Social Stratification and Complex Political Organization

          Language Family   Simple Random   Purified Random
          Samples           Samples         Samples
          -.365              .444           -.088
          -.111              .459            .236
           .000              .484            .258
           .298              .544            .606
           .456              .600            .618
          1.000              .777            .676
Mean       .212              .551            .384
S.D.       .483              .124            .299

Phi Coefficients Between Bride Price and Herding of Bovine Animals

          Language Family   Simple Random   Purified Random
          Samples           Samples         Samples
          -.098             -.171            .179
           .000              .000            .288
           .000              .206            .341
           .000              .242            .344
           .000              .375            .567
           .126              .454            .630
Mean       .004              .184            .391
S.D.       .071              .233            .172

Analysis of the data showed that the inclusion of historically related cases did not inflate correlations. In only two of the five sets of comparisons was there ". . . a significant difference (with respect to the amount of correlation) among [the] three types of samples...." (Ember 1971: 104), and in those two comparisons, the difference was not in the direction predicted. Table 1 reproduces the phi coefficients for the sets of six samples from Ember 1971: Tables 1-5, adding for each set the mean and standard deviation (S.D.). Ember concluded from these phi coefficients that cross-cultural researchers need not worry about Galton's problem. However, the highly variable correlations in Table 1 provide little comfort to one who is doing cross-cultural research. For example, if one were studying the relationship between stratification and political organization with the use of a language family sample, the best guess as to the true relationship would be somewhere between -.36 (a weak negative relationship) and +1.0 (a perfect positive relationship). This is the most extreme example of a general tendency of the coefficients in Table 1 to vary considerably between replications of the same relationship. To a large extent this wide range of variation is a result of the unfortunately small size of the samples that Ember selected for his analysis (N = 20 or less).
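The summary rows of Table 1 can be recomputed from the listed coefficients. The sketch below does so for the replication of the residence and descent relationship, using Python's standard `statistics` module. It assumes that the tabled S.D. is the sample standard deviation with an n - 1 divisor, which appears consistent with the printed figures up to the last digit; the dictionary and function names are illustrative.

```python
from statistics import mean, stdev

# Phi coefficients for the replication of patrilocal residence with
# patrilineal descent, as listed in Table 1.
REPLICATION = {
    "Language Family": [0.000, 0.185, 0.302, 0.373, 0.553, 0.553],
    "Simple Random": [0.538, 0.664, 0.667, 0.704, 0.811, 0.904],
    "Purified Random": [0.408, 0.471, 0.577, 0.599, 0.698, 0.738],
}


def summarize(values):
    """Return (mean, sample standard deviation, range) of a list of coefficients."""
    return mean(values), stdev(values), max(values) - min(values)


if __name__ == "__main__":
    for label, values in REPLICATION.items():
        m, sd, spread = summarize(values)
        print(f"{label:16s} mean={m:.3f}  S.D.={sd:.3f}  range={spread:.3f}")
```

The range is included because, with only six coefficients per column, it conveys directly how far apart two replications of the same relationship can fall.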

While this is certainly much too small a sample to yield very reliable estimates of the true correlation, the size of the sample is not the only problem. More important for the present purpose is the fact that the correlations based on the language family samples, those with the lowest independence, are generally more variable than are the correlations based on the two types of random samples. This difference is indicated by the generally higher standard deviations of the phi coefficients for language family samples than for random samples. The exceptions to this tendency are in the third and fifth relationships shown in Table 1, where the language family samples produce a large number of zero correlations, but this only provides additional evidence for our contention that interdependence is a serious problem in cross-cultural research. Ember reports only phi coefficients and not the original data which they are intended to summarize, so one cannot be certain of the reason for the zero coefficients, but he suggests (Ember 1971: 105) that it is because all of the societies in the sample lack a particular custom, providing no variance to be explained by the other trait (i.e., the row or column in the 2 × 2 tables contains no cases).² In other words, the societies in the sample are homogeneous with respect to at least one of the traits being studied, and we learn very little about the population of societies that vary with respect to the traits being studied. They are limiting cases, prototypes of Galton's problem, in that one may know nothing more about the population after he has examined twenty of these societies than he did after he examined one of them. The situation might be compared to examining a photostatic copy of a source as a check on the validity of that source.
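The behaviour of phi in such homogeneous samples can be made concrete. The following sketch uses the standard formula for phi in a 2 × 2 table with cells a, b, c, d. It returns `None` when a row or column is empty, which is a design choice of this illustration (Ember's tables print such cases as .000), and it also computes the largest phi that a given set of marginal totals permits. The example tables are invented, since Ember's original cell counts are not reported.

```python
import math


def phi(a: int, b: int, c: int, d: int):
    """Phi coefficient for the 2 x 2 table [[a, b], [c, d]].

    Returns None when any marginal total is zero, because the coefficient
    is then undefined (0/0) rather than a measured absence of association.
    """
    r1, r2 = a + b, c + d          # row totals
    c1, c2 = a + c, b + d          # column totals
    denom = r1 * r2 * c1 * c2
    if denom == 0:
        return None
    return (a * d - b * c) / math.sqrt(denom)


def max_phi(r1: int, r2: int, c1: int, c2: int):
    """Largest phi attainable with the given row and column totals."""
    if r1 + r2 != c1 + c2:
        raise ValueError("row and column totals must sum to the same n")
    a = min(r1, c1)                # pack as many cases as possible into cell a
    b = r1 - a
    c = c1 - a
    d = r2 - c
    return phi(a, b, c, d)


if __name__ == "__main__":
    # Hypothetical sample of twenty societies that all lack the row trait.
    print("empty row:", phi(0, 0, 12, 8))
    # Equal marginals allow a perfect association.
    print("max phi, marginals 10/10 and 10/10:", max_phi(10, 10, 10, 10))
    # Uneven marginals cap the coefficient well below 1.
    print("max phi, marginals 10/10 and 18/2:", max_phi(10, 10, 18, 2))
```

With row totals of 10 and 10 and column totals of 18 and 2, no arrangement of the twenty cases can push phi above one third, which illustrates why nearly homogeneous samples tend to yield low coefficients even when the traits go together.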

The two types of random sample are not systematically different in the variability of the correlations, but these types of sample also are not likely to vary significantly in the degree to which they are historically independent. Contrary to Ember's reasoning, a simple random sample of approximately twenty cases from the Ethnographic Atlas is not likely to contain very many cases that are historically related.

Thus while Ember's data do demonstrate that historical interdependence does not consistently produce spuriously high correlations, the same data also demonstrate that Galton's problem tends to produce unreliable or highly dispersed estimates of correlation. The previous literature on the problem has been misleading in suggesting that interdependence of units will consistently increase correlations, whereas, in fact, it leads to underestimating the average variability (standard error) of the estimate of population characteristics.

On first consideration there may appear to be an inconsistency between the contention that Galton's problem produces highly dispersed estimates and that it leads to underestimates of average variability, i.e., that estimates are both more variable and less variable than they would be were the samples independent. The difference, of course, is between looking at a single interdependent sample and a distribution of interdependent samples. Any particular interdependent sample will be more homogeneous than an independent sample of comparable size and since a researcher will usually have only one sample, he will underestimate the average variability or standard error of his sample statistic. On the other hand, if a number of different interdependent samples were examined (as was the case in Ember's study) they would differ among themselves considerably more than would be the case with a comparable number of independent samples. It is in this sense that estimates would be more dispersed than estimates based on truly independent samples.
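The two senses of variability can be separated in a small simulation. The sketch below is hypothetical throughout: it invents a population in which each "language family" sample shares one common shift (standard deviation `between_sd`) plus small case-level noise (`within_sd`), draws many such samples, and compares the average textbook standard error computed within a single sample against the actual spread of the sample means across samples. The parameter values are arbitrary and carry no empirical weight.

```python
import random
from statistics import mean, stdev


def draw_interdependent_sample(rng, n=20, between_sd=1.0, within_sd=0.3):
    """One hypothetical 'language family' sample: n cases sharing a common shift."""
    family_effect = rng.gauss(0.0, between_sd)   # shared by every case in the sample
    return [family_effect + rng.gauss(0.0, within_sd) for _ in range(n)]


def compare_errors(replications=2000, n=20, seed=0):
    """Return (average naive standard error, actual dispersion of sample means)."""
    rng = random.Random(seed)
    naive_errors, sample_means = [], []
    for _ in range(replications):
        sample = draw_interdependent_sample(rng, n=n)
        naive_errors.append(stdev(sample) / n ** 0.5)   # textbook s / sqrt(n)
        sample_means.append(mean(sample))
    return mean(naive_errors), stdev(sample_means)


if __name__ == "__main__":
    naive, actual = compare_errors()
    print(f"average within-sample standard error: {naive:.3f}")
    print(f"actual spread of means across samples: {actual:.3f}")
```

Under these assumed settings each sample looks internally homogeneous, so its own standard error is small, while the means of different samples are scattered far more widely. A researcher holding a single such sample sees only the first figure.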

To illustrate the kind of mistakes that are likely to result from the problem, consider the simple problem of estimating the mean value of some variable listed in the Ethnographic Atlas on the basis of a language family sample. The usual formula for standard error of the mean

$$\sigma_{\bar{x}} = \frac{s}{\sqrt{n}}$$

s = standard deviation of the sample
n = size of the sample

would seriously underestimate the standard error, because the estimate of the variance which is based on the standard deviation of the interdependent sample values would be consistently smaller than the true population variance. An alternative statement is that the number of independently selected cases would be smaller than the value of n.
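The size of the underestimate can be worked out in a simple special case. The derivation below is a sketch under an added assumption that is not made in the text: all n cases have a common variance \sigma^2 and a common pairwise correlation \rho with 0 \le \rho < 1, and s^2 is computed with the n - 1 divisor.

```latex
\begin{aligned}
\operatorname{Var}(\bar{x}) &= \frac{\sigma^{2}}{n}\,\bigl[1 + (n-1)\rho\bigr], \\
E\!\left[s^{2}\right] &= \sigma^{2}(1-\rho), \\
\frac{\operatorname{Var}(\bar{x})}{E\!\left[s^{2}\right]/n} &= \frac{1 + (n-1)\rho}{1-\rho} \;\ge\; 1 .
\end{aligned}
```

The first line shows that the true sampling variance of the mean is inflated by interdependence, the second that the sample variance shrinks, so the two errors compound. With n = 20 and \rho = 0.5, for example, the ratio is 10.5 / 0.5 = 21: the true variance of the mean is about twenty-one times what the usual formula would lead one to expect.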

This is, of course, not limited to estimates of the mean. All sample statistics (phi coefficients, product-moment correlation coefficients, regression coefficients, proportions, etc.) will be influenced in the same way and tests of their statistical significance will be misleading.

A similar situation occurs in surveys which use cluster sampling procedures (Kish 1965: 161) and in the analysis of time-series data (Blalock 1968: 172; Tintner 1968: 48). The case of time-series is particularly interesting because it is so closely analogous to Galton's problem and because a considerable amount of research has been done on its effects. Most variables change slowly and regularly through time, so that successive observations of the same variable do not form independent samples of observations unless the variable changes quickly and erratically. Compared to independent samples, observations drawn from time-series will usually be more homogeneous, and the shorter the time intervals the more homogeneous (interdependent) the observations will be. It is unlikely that the highest value and the lowest value in a time series (for example, the stock market) will occur on successive days. Yet, under conditions of independence, this occurrence would be as likely as any other. The correlation between successive observations of this type is referred to as autocorrelation or serial correlation (Tintner 1968: 52; Blalock 1968: 172).
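Serial correlation of this kind is easy to compute. The sketch below implements one common form of the lag-one autocorrelation coefficient (the sum of cross-products of successive deviations from the series mean, divided by the total sum of squared deviations). The function name and the two example series are illustrative and are not taken from the sources cited.

```python
def lag1_autocorrelation(series):
    """Serial correlation between each observation and the one that follows it."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    m = sum(series) / n
    denom = sum((x - m) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series has no variance")
    numer = sum((series[t] - m) * (series[t + 1] - m) for t in range(n - 1))
    return numer / denom


if __name__ == "__main__":
    slow = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]      # changes slowly and regularly
    erratic = [1, -1, 1, -1, 1, -1, 1, -1]      # reverses at every step
    print("slowly changing series:", lag1_autocorrelation(slow))
    print("erratic series:", lag1_autocorrelation(erratic))
```

A series that drifts slowly gives a strongly positive coefficient, meaning that each observation largely repeats the information in its predecessor.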

The situation is the same in the case of cross-cultural observations. Societies close to each other in space will tend, because of diffusion, common history, or similar ecological characteristics, to be more similar than those that are separated by greater distances. The major difference between time-series autocorrelation and spatial autocorrelation is the fact that time-series produces dependence in only one dimension (time), while dependence in space series may extend in all directions. It is interesting to note that Naroll's (1964) Linked Pair Test for Galton's problem is a form of spatial coefficient of autocorrelation which is an analogue of the time-series coefficient of autocorrelation (see also Barry 1969).
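A spatial coefficient of this general kind can be sketched as follows. The code computes Moran's I, a standard index of spatial autocorrelation that is substituted here purely for illustration; it is not Naroll's Linked Pair Test. The neighbour weights, the arrangement of six societies along a line, and the trait codes are all invented.

```python
def morans_i(values, weights):
    """Moran's I for values at n sites, given an n x n matrix of neighbour weights."""
    n = len(values)
    m = sum(values) / n
    dev = [v - m for v in values]
    total_weight = sum(sum(row) for row in weights)
    spread = sum(d * d for d in dev)
    if total_weight == 0 or spread == 0:
        raise ValueError("need nonzero weights and some variation in the values")
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / total_weight) * (cross / spread)


def chain_weights(n):
    """Weights for n sites in a row: 1 for adjacent sites, 0 otherwise."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    w = chain_weights(6)
    clustered = [1, 1, 1, 0, 0, 0]     # trait present in one contiguous region
    alternating = [1, 0, 1, 0, 1, 0]   # neighbours always differ
    print("clustered trait:", morans_i(clustered, w))
    print("alternating trait:", morans_i(alternating, w))
```

A trait confined to one contiguous region yields a positive index, as diffusion or common history would produce, whereas a trait whose presence alternates between neighbours yields a negative one.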

For time-series data it is well known that for recursive models,³ no systematic biases are introduced by autocorrelation, but estimates of population parameters will be very unreliable and highly dispersed around the true parameter (Wallis and Roberts 1956: 562; Johnson 1963: 179; Wonnacott and Wonnacott 1970: 136-140). This is generally the pattern that we find in Ember's data. There is some tendency for his estimates of relationship to be biased in the direction of weak relationships, but this is, as he notes (Ember 1971: 105), because of the relatively small amount of variance within the Language Family samples.⁴

In conclusion, therefore, Galton's problem can be an important problem in cross-cultural research because it leads to unreliable estimates of population characteristics. If it is ignored as Ember suggests it should be, serious errors of inference will result.


RANDOM SAMPLING AS A WAY OF ACHIEVING INDEPENDENCE

One objection to the foregoing line of reasoning, implicit in Ember's discussion, is to argue that historically interdependent samples will not produce errors of inference provided the samples that are drawn are random. According to this view, while it is true that many of the cases in the sample will not be historically independent, this will not be an error since interdependence is indeed a characteristic of the population. Accordingly, the usual procedures for computing standard errors with simple random samples can be used to establish confidence intervals or to test hypotheses, and one can accurately use probability theory to estimate the likelihood of one's error. If, for some reason, the closer similarities of certain cases, characteristic of the population, were not reflected in a sample drawn from the population, the sample would not be representative of the population and would likely lead one to an incorrect inference. Thus, Ember argues that random sampling is the only precaution necessary to avoid Galton's problem.

This argument is valid only if one's goal is limited to describing the population from which the sample was drawn. Then simple random sampling is adequate, and Galton's problem would not exist. If many of the societies in the sample were historically related, the sample would be likely to tell us that many of the cases were similar, and this would be an accurate description of the population.

Clearly, however, we must make a distinction between random sampling from a fixed population, where we assure independent selection of elements by some mechanical procedure (e.g., a table of random numbers), and the more complex situation where we assume that the events observed in active human populations are independent replications of those events rather than instances of one event influencing the probability of the second. In more concrete terms, if our goal is to describe the societies in the Ethnographic Atlas with respect to the relationship between variables X and Y, we can draw a random sample of societies, compute the correlation between X and Y, compute confidence intervals, and estimate precisely how accurate (in the long run) our estimate of the true correlation between the X and Y variables will be. On the other hand, if our purpose is to evaluate the hypothesis that changes in variable X produce changes in variable Y, and we wish to use the Ethnographic Atlas as a source of data to test and quantify this hypothesis, the situation is much more complicated and random sampling from the Ethnographic Atlas does not begin to deal with the problems of the extent to which the data represent mutually independent replications of the effects of variable X on variable Y. Ember's (1971: 106) assertion ". . . that cross-cultural researchers can safely ignore Galton's problem so long as their samples are selected in some random fashion," is applicable only to descriptive studies. Since practically all cross-cultural research aspires to go beyond description and attempts to explain the relationship between variables, Ember's conclusions are again found to be misleading.

The distinction made here between the use of statistics to describe a population and the use of statistics to make causal inferences is an important point that has resulted in a considerable amount of confusion in other fields (Morrison and Henkel 1970; Blalock 1968: 196-198) and should be emphasized. Wold (1956) has pointed out that the statistical techniques found in almost all statistics textbooks have been developed for problems that are not completely comparable to the problems found in the nonexperimental social sciences. The vast majority of available techniques apply either to the description of a population (e.g., measures of central tendency, measures of dispersion, and sampling techniques) or to making causal inferences from experiments where variables can be controlled, replications can be made independent, and uncontrolled variables can be subjected to randomization. Many of the same statistical procedures can be generalized and adapted for use in the causal analysis of nonexperimental data, but there are no routine methods for avoiding errors in interpretation because the researcher lacks technical control over causal factors. As a result one must use a variety of ad hoc procedures to make his work interpretable (Wold 1956: 41).

Since most cross-cultural research attempts to test causal hypotheses on the basis of nonexperimental data, the basic hypothesis usually takes the form of the statement that, other things being equal, a change in variable X will be followed by a change in variable Y. This is very different from the descriptive question: What does population A look like? True, the cross-cultural researcher is usually interested in this question, but only as a means for testing and quantifying his explanatory hypothesis. The important question is: How did population A get to look as it does? That is, what mechanisms could and could not have produced a population that looks like A? This is a very much more complex and difficult question than the descriptive one.

Galton's problem, of course, only arises if we ask how the population came to look as it does. Is it a result of diffusion, or is it a result of a functional relationship between variables? Ember's random sampling allows him to estimate the frequency of certain traits or the correlation between certain traits in the population, but random sampling alone cannot enable him to estimate reliably whether or not there is a functional relationship between the variables. To do this (and this is the general procedure for making causal inferences from nonexperimental data) one must be able to show that alternative hypotheses cannot reasonably account for the relationship between the variables that one is studying (Blalock 1964). There may be many competing hypotheses, and one's data may be so meager that one will not be able to eliminate all of them, but if one is skillful enough and lucky enough one can significantly narrow the range of competing hypotheses and specify the circumstances under which one's hypothesis will apply. Galton's problem is just one of the many competing hypotheses that must be eliminated in successful cross-cultural research.⁵

GALTON'S PROBLEM AS A COMPETING INTERPRETATION

We have argued that the effect of Galton's problem is to produce unreliable estimates of population characteristics. It should now be clear that the "population" referred to here is not the population of all societies listed in the Ethnographic Atlas (or some such list of societies), but rather is the theoretical population of societies in which the variables under study represent independent replications of the effects of one variable on another. We have also argued that Galton's problem arises because it provides an interpretation for correlations (or lack of correlations) between variables which may be equally plausible as the functional hypothesis that changes in variable X produce changes in variable Y (or that X does not produce Y).

There is a paradox here which appears to be at the center of current confusion over the nature of Galton's problem. How is it that the effect of interdependent samples can be to produce unreliable but not consistently biased estimates (i.e., not "multiply instances of correlation"), and at the same time provide an alternative explanation for the relationship between variables? If there is no consistent bias, will we not usually reach the right conclusion? This line of reasoning is represented in Ember's argument that if Galton's problem does not consistently produce spuriously high correlations, there is nothing to worry about.

The resolution of the paradox lies in the fact that, since the researcher ordinarily has only one sample of societies and frequently not a very large one, and since interdependence produces unreliable estimates, his results are likely to be a poor description of the population. He will not know a priori whether it is likely to be too high, or too low. It may even be near the correct figure, but unless he can either estimate the degree to which his societies are interdependent or assume that they are independent, he will not be able to use statistical sampling techniques to compute the standard error of his estimate. That his sample is not a good description of the true population therefore becomes an alternative explanation for any finding. There is no consolation in the fact that one is no more likely to overestimate the true relationship than to underestimate it, especially since the over-all probability of error in any direction will be higher as a consequence of the lack of independence.

The direction of error will depend completely on what kind of relationship is most diffused throughout the sample. The sample will overestimate the true relationship only if this kind of relationship happens to be diffused extensively within the sample. This presumably was the case in the language family sample that produced a correlation of +1.0 between social stratification and complex political organization in Ember's data (see Table 1). On the other hand, the sample will underestimate the correlation when instances of low association are diffused extensively within the sample, or the sample may produce correlations in the reverse direction, as in the two language family samples that produced negative correlations between social stratification and complex political organization.

Increasing the sample size will help by reducing the variability of estimates only if by increasing the sample size we also increase the number of truly independent replications of the events under study. In most studies this will be an adequate procedure for coping with the problem, especially if it is coupled with a sampling procedure designed to reduce historical interdependence (Murdock 1966; Murdock and White 1969; Naroll 1970). However, this remains a relatively inexact procedure, and there are no available procedures for accurately estimating standard errors. Another problem is that the extent to which this is a helpful strategy will vary with the variables that are being studied and the data that are available. For most preindustrial societies, the evidence supports the assumption that the geographical area through which most variables may diffuse will be limited. Therefore, with a worldwide sample, no particular pattern will dominate, and we should get reasonably good estimates of true relationships. Naroll and D'Andrade (1963: 1059) have shown, however, that some variables, particularly those "strongly associated with sociocultural evolution," are distributed throughout very large diffusion patches in some parts of the world so that increasing sample size will be relatively less helpful in such cases. The problem is probably extreme with modern industrial societies; the diffusion of certain traits of Western culture throughout the rest of the world makes inferences about the causal effects of other variables extremely unreliable. Yet almost no attention has been given to this problem in comparative studies in sociology and political science.

SUMMARY AND CONCLUSIONS

An examination of Ember's empirical test of Galton's problem reveals that, in spite of the fact that interdependent observations do not artificially inflate correlations, Galton's problem remains a serious threat to the validity of cross-cultural studies. We have argued that the effect of interdependent observations on cross-cultural data is to produce inconsistent or unreliable estimates of the characteristics of the true population. To the extent that the data are interdependent, they are redundant (repetitions of the same information). If the usual tests of significance are computed on the basis of these data, serious overestimates of their reliability will result. The problem is analogous to autocorrelation in time-series analysis, and can be thought of as spatial autocorrelation. In the case of time-series autocorrelation it is well established that the effect is to produce unreliable estimates, and we believe that just such an effect can be seen in Ember's data testing the effects of Galton's problem.

We have similarly taken issue with Ember's suggestion that random sampling is a sufficient safeguard against the effects of Galton's problem. This would be the case only if the goal of cross-cultural research were to describe rather than to explain. As long as our goal is to test and quantify hypotheses about the effects of one variable on another, Galton's problem will remain a serious threat to the validity of our inferences.
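
The claim that interdependence leaves correlations uninflated on average but makes them unreliable can be sketched with a small simulation. Everything in it is an assumption made for illustration: two traits with no causal connection, ten hypothetical diffusion patches of ten societies each, normally distributed scores, and a parameter `rho` giving the share of each trait's variance that is common to a patch. It is not a reanalysis of Ember's data.

```python
import math
import random
import statistics


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_r(rng, n_patches, patch_size, rho):
    """One sample of two causally unrelated traits, each shared within patches.

    rho is the share of each trait's variance common to a patch
    (rho = 0 gives fully independent observations).
    """
    xs, ys = [], []
    for _ in range(n_patches):
        shared_x, shared_y = rng.gauss(0, 1), rng.gauss(0, 1)
        for _ in range(patch_size):
            xs.append(math.sqrt(rho) * shared_x + math.sqrt(1 - rho) * rng.gauss(0, 1))
            ys.append(math.sqrt(rho) * shared_y + math.sqrt(1 - rho) * rng.gauss(0, 1))
    return pearson_r(xs, ys)


def summarize(rho, reps=400, seed=0):
    """Mean and standard deviation of r over repeated samples of 100 units."""
    rng = random.Random(seed)
    rs = [simulate_r(rng, n_patches=10, patch_size=10, rho=rho) for _ in range(reps)]
    return statistics.fmean(rs), statistics.stdev(rs)


mean_indep, sd_indep = summarize(rho=0.0)
mean_patch, sd_patch = summarize(rho=0.8)
print(f"independent units: mean r = {mean_indep:+.3f}, sd of r = {sd_indep:.3f}")
print(f"diffusion patches: mean r = {mean_patch:+.3f}, sd of r = {sd_patch:.3f}")
```

In this sketch the mean of r should stay near zero in both conditions, while its spread from sample to sample should be markedly larger when observations are shared within patches. A significance test that treats the 100 units as independent assumes the smaller spread and therefore overstates the reliability of any single coefficient.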

NOTES

1. I am indebted to H. M. Blalock, Jr., Robert Hill, Gerhard Lenski and Julie Loftin for their comments on the draft of this paper.

2. Herbert Barry III, in commenting on a draft of this paper, pointed out that the meaning of a zero phi coefficient is ambiguous. It can result from a genuine lack of association between the two variables, but it may also occur when the data are so uniform that an association cannot be determined. On the basis of Ember's brief comment we assume that the latter was the case with most of the zero associations in the Language Family Samples.

It should also be noted that the maximum correlation between two dichotomous variables (phi) is restricted by the extent to which the column and row marginals differ. A perfect association cannot be obtained unless the marginals are the same, and the greater the discrepancies between the row and column marginals, the lower the upper limit of the relationship (Blalock 1960: 231; Nunnally 1967: 130-131). This undoubtedly explains the tendency, which is apparent in Ember's data, for the nonzero coefficients in the Language Family Samples to show lower correlations than in the Random Samples. Presumably the greater homogeneity of the Language Family Samples produced more uneven marginals in these samples.

3. In the context of cross-cultural research recursive models are those in which we can rule out two-way causation (Blalock 1964: 54).

4. See note 2 above.

5. We do not mean to imply that the process of making causal inferences from nonexperimental data is a simple task or, even, that it will always be possible to interpret one's data unambiguously in terms of causal relationships. Certainly we do not mean to imply that Galton's problem is the only problem that must be overcome in making causal inferences in cross-cultural studies. Our only point is that if the goal is to make causal inferences, one must proceed differently than if one wishes simply to describe a population. For a discussion of some of the other problems in making causal inferences from nonexperimental data and for a general introduction to this growing methodological point of view the interested reader should examine Blalock (1964, 1971).
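
The two properties of phi mentioned in note 2 can be checked with a short sketch. The helper names and the tables are hypothetical. The ceiling is obtained by placing as many cases as the marginals allow in the concordant cells, and the function assumes both proportions lie strictly between 0 and 1.

```python
import math


def phi(a, b, c, d):
    """Phi for a 2x2 table [[a, b], [c, d]]; None if any marginal is zero."""
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return None  # one variable is constant: association cannot be determined
    return (a * d - b * c) / math.sqrt(denom)


def phi_max(p_row, p_col):
    """Largest attainable phi when the traits occur in proportions p_row and p_col."""
    lo, hi = min(p_row, p_col), max(p_row, p_col)
    return math.sqrt(lo * (1 - hi) / (hi * (1 - lo)))


# Equal marginals allow a perfect association; unequal marginals lower the ceiling.
print(phi_max(0.5, 0.5))
print(phi_max(0.5, 0.9))

# A hypothetical table that reaches the ceiling for marginals of 0.5 and 0.9.
print(phi(50, 0, 40, 10))

# A completely uniform sample: every society shows both traits.
print(phi(20, 0, 0, 0))
```

With marginals of 0.5 and 0.9 the ceiling is one third, so even the strongest association those marginals permit yields a modest coefficient. The uniform table returns `None` rather than zero, which keeps the two meanings of a zero phi distinguished in note 2 apart.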

BIBLIOGRAPHY

Barry, H., III. 1969. Cross-cultural Research With Matched Pairs of Societies. Journal of Social Psychology 79: 25-33.

Blalock, H. M., Jr. 1960. Social Statistics. New York.

Blalock, H. M., Jr. 1964. Causal Inferences in Nonexperimental Research. Chapel Hill.

Blalock, H. M., Jr. 1968. Theory Building and Causal Inferences. Methodology in Social Research, ed. H. M. Blalock, Jr., and A. B. Blalock, pp. 155-198. New York.

Blalock, H. M., Jr. 1971. Causal Models in the Social Sciences. Chicago.

Ember, M. 1971. An Empirical Test of Galton's Problem. Ethnology 10: 98-106.

Johnson, J. 1963. Econometric Method. New York.

Kish, L. 1965. Survey Sampling. New York.

Morrison, D. E., and R. E. Henkel. 1970. The Significance Test Controversy. Chicago.

Murdock, G. P. 1966. Cross-cultural Sampling. Ethnology 5: 97-114.

Murdock, G. P. 1967. Ethnographic Atlas: A Summary. Ethnology 6: 109-236.

Murdock, G. P., and D. R. White. 1969. Standard Cross-cultural Sample. Ethnology 8: 329-369.

Naroll, R. 1964. A Fifth Solution to Galton's Problem. American Anthropologist 66: 863-867.

Naroll, R. 1970. Cross-cultural Sampling. A Handbook of Method in Cultural Anthropology, ed. R. Naroll and R. Cohen, pp. 889-926. Garden City.

Naroll, R., and R. G. D'Andrade. 1963. Two Further Solutions to Galton's Problem. American Anthropologist 65: 1053-1067.

Nunnally, J. C. 1967. Psychometric Theory. New York.

Tintner, G. 1968. Time Series: I. General. International Encyclopedia of the Social Sciences, ed. D. L. Sills, pp. 47-59. New York.

Wallis, W. A., and H. V. Roberts. 1956. Statistics: A New Approach. New York.

Wold, H. 1956. Causal Inferences from Observational Data: A Review of Ends and Means. Journal of the Royal Statistical Society 119: 28-50.

Wonnocott, R. J., and T. H. Wonnocott. 1970. Econometrics. New York.
