Theory and General Principle in Statistics
Author: D. R. Cox
Source: Journal of the Royal Statistical Society. Series A (General), Vol. 144, No. 3 (1981), pp. 289-297
Published by: Wiley for the Royal Statistical Society
Stable URL: http://www.jstor.org/stable/2981796



" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~' ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ......

.. . . . .

This content downloaded from 163.1.41.46 on Tue, 28 Feb 2017 12:20:32 UTCAll use subject to http://about.jstor.org/terms

J. R. Statist. Soc. A (1981), 144, Part 3, pp. 289-297

Theory and General Principle in Statistics

By D. R. Cox

Department of Mathematics, Imperial College, London

[The Address of the President, delivered to the ROYAL STATISTICAL SOCIETY on Wednesday, March 18th, 1981]

SUMMARY

Some notions common to statistical investigations over a wide field are reviewed. Some technical issues where further development is desirable are sketched. Finally, some broad implications for statistics and statisticians are outlined.

1. INTRODUCTION

FIRST I record my deep appreciation of the invitation to serve as President.

Statistics claims to deal in a broad way with the collection and analysis of data. The general notions which give the subject unity, and which in particular link the very varied activities of our Society, are no doubt well known. The object of this paper is, nevertheless, to attempt a review of some such ideas and their implications.

Some, seeing the word "principle" in the title, may be expecting a discussion of such ideas as sufficiency principle, conditionality principle, likelihood principle, and so on. Fascinating though these are, they are not quite what is in mind here. After all, the formal theory of statistical inference is fairly remote from much statistical practice, although of course attitudes to issues of general theory may have important, if indirect, implications for practice. By and large, however, the superstructure in science supports the foundations, and the matters discussed in this paper are intended to be fairly close to the main body of application.

One looks to theory and general principle for three rather different things:

(i) a systematic framework in terms of which to think about new substantive investigations;
(ii) a starting point for developing new statistical techniques;
(iii) a basis for developing current knowledge in a systematic rather than a fragmentary fashion.

The last is of importance particularly to teachers.

Theory, while often mathematical, is not necessarily so and theory is certainly not synonymous with mathematical theory. Lord Rayleigh defined applied mathematics as being concerned with quantitative investigation of the real world "neither seeking nor evading mathematical difficulties". This describes rather precisely the delicate relation that ideally should hold between mathematics and statistics. Much fine work in statistics involves minimal mathematics; some bad work in statistics gets by because of its apparent mathematical content. Yet it would be harmful for the development of the subject for there to be a widespread antimathematical attitude, a fear of powerful mathematics appropriately deployed.

A central question is the extent to which theory as currently developed meets the needs (i)-(iii), especially (i) and (ii).

2. SOME QUALITATIVE IDEAS

What then are the qualitative features common to many investigations and which it is valuable to consider when faced with a new investigation? Here are some points loosely organized in the time sequence in which they are likely to arise in practice.

The form of investigations
This involves a critical examination of the nature, respective advantages of and difficulties of interpretation of experiments, sample surveys, and relatively controlled observational studies, including the use of "historical" data. Special problems arise with large "multi-centre" studies and, more broadly, with the design of investigations in fields in which interventionist studies are hard to implement for ethical or political reasons.

The objectives of investigation

It is a truism that planning and analysis must depend on objectives, but consideration of when such objectives can and should be tightly formulated, e.g. in terms of some immediate decision-taking task, raises difficult issues. When there is a fairly specific purpose such as forecasting, the role of preliminary analysis to "understand" the system under study needs particular consideration. A distinction should certainly also be drawn between the careful analysis of more or less unique sets of data and the monitoring and analysis of "routine" data arriving in a steady stream.

The nature and choice of observations

Observations may be classified by their objective, e.g. as response, intermediate response or explanatory variables and indeed the last can be subclassified in various ways. There is also a classification by structure of the values possible (binary, ..., continuous). The need to record those things crucial for interpretation, and as few unnecessary variables as practicable, is critical. In many fields much data are collected that are never analysed. The question of how to measure satisfactorily complex concepts frequently arises.

Variability
The analysis of variability is, of course, central to statistics under two different guises. Natural variability is to be studied as something of intrinsic interest, and the notion of frequency distributions rather than single quantities is a key one. The complementary aspect is the careful study of error, distinguishing where appropriate external and internal systematic errors and random errors of possibly complex structure. Associated with this is concern with data quality.

Role of external information
At all stages it may be important to bring in special subject-matter considerations, including information from related investigations or from theory. Such information may be used qualitatively, or to guide model choice, or quantitatively, and may or may not be absorbed into the main body of the analysis and conclusions. There may be some conflict with the desire to have individual investigations that are convincing largely in their own right. Development of special stochastic models is an increasingly important part of statistical work and closer integration of this with the analysis of data is certainly desirable.

Strategy of analysis

In planning the general approach to be taken in analysis, we have to consider the checking of data quality, the more exploratory stages of analysis and the final or specific analysis. Splitting of data into sections for analysis, followed by a second stage of pooling of the various sections, may be good in complex cases. The balance of effort between data analysis and data collection needs consideration, as does the depth of elaboration appropriate. Some data, e.g. data subject to unassessable systematic errors, may merit at most brief analysis. One suspects that in many fields too little effort is spent on data analysis relative to that devoted to collecting the data; one wonders also how often overelaborate methods of analysis are used. In all stages of analysis some balance between probabilistic and descriptive methods may be called for and some balance also between numerical and graphical techniques.



Details of analysis

The study of particular methods of analysis and their implementation is clearly a major part of the subject, but inappropriate for discussion here.

Nature of generalization
The generality of the conclusions from a particular set of data may need explicit consideration. The basis of generalization will be some mixture of random sampling, of demonstration of absence of interaction with important intrinsic properties of the material, of consistency with independent related studies and of consistency with theoretical knowledge of the topic. The relative importance of these varies greatly between different fields of study. Thus in physics there is usually a substantial theoretical background that provisionally can be regarded as given, whereas in, say, criminology this is not so much the case.

The uncertainty of conclusions

It is a distinctive feature of statistical arguments to recognize explicitly that conclusions are uncertain and to attempt to measure that uncertainty.

Presentation of conclusions
The setting out of conclusions in a way that is vivid, simple, accurate and integrated with subject-matter considerations is a very important part of statistical analysis. This is well recognized where the data are extensive and of complex structure, and tabular or graphical display is involved. It is perhaps less well recognized when relatively elaborate methods of analysis involving the fitting of complex models are involved. Simple and enlightening presentation of conclusions is especially important in such contexts. When the conclusions are to be the basis of immediate action special considerations apply; simple graphical representation may then be particularly good.

Current theory deals patchily with these topics. Some have been developed in great detail, whereas the more broadly strategic issues have, for fairly obvious reasons, been neglected. Thus, current theory gives rather little guidance on the strategy to follow in analysing complex data.

3. SOME MORE TECHNICAL ISSUES

In Section 2 some rather broad matters concerned with statistical analysis were discussed. In the present section a few implications for more technical statistical work are examined, highly specialized questions of technique being excluded. All the issues are widely known.

There is a clear need for systematic discussion of graphical and tabular arrangement, distinguishing sharply the roles in preliminary analysis and in the final presentation of conclusions, and taking account of developments in computer technology.

In design, observational studies need further discussion with particular emphasis on bias isolation and removal. Somewhat related is the use of historical controls instead of or in addition to concurrent controls in experimental design. The merits of balancing versus randomization when there are large numbers of concomitant variables are still to some extent controversial, and design for interaction detection is becoming increasingly important. There are many more technical issues connected with design.

While there has, of course, been very extensive discussion of the role of non-frequency notions of probability, their position and importance in immediate practice still seem unresolved. Three different levels at least are involved. First, what is the role of non-frequency probability in model formulation for phenomena, such as the works of Plato, in which no repetition is conceivable? Is it enough to say that a probability model is an assumption that certain features of the system correspond to what would have been observed had the data been generated from a probability model in the frequency sense? This assumption is capable of limited test, for example by examining the dispersion between subsections of data. In effect, the variation to be expected under a frequency probability model is used as a reference standard for assessing observed variation.
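This use of a frequency model as a reference standard can be sketched in miniature. In the hedged example below, the counts are invented and the Poisson reference is an assumption made purely for illustration, not a method taken from the Address: the observed variance-to-mean ratio of some counts is compared with its simulated distribution under a Poisson model of matched mean.

```python
import math
import random
import statistics

def poisson(rng, mu):
    """Knuth's method for sampling a Poisson(mu) variate."""
    limit = math.exp(-mu)
    k, p = 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def dispersion_index(xs):
    """Variance-to-mean ratio; approximately 1 for Poisson counts."""
    return statistics.variance(xs) / statistics.fmean(xs)

rng = random.Random(3)
counts = [rng.randint(0, 10) for _ in range(200)]  # stand-in for real counts
mu = statistics.fmean(counts)

obs = dispersion_index(counts)

# Reference distribution of the index under a Poisson model of matched mean:
# the variation expected under the frequency model is the yardstick.
sims = [dispersion_index([poisson(rng, mu) for _ in range(200)])
        for _ in range(300)]
p = sum(s >= obs for s in sims) / len(sims)
print(f"dispersion index {obs:.2f} (Poisson reference about 1), P = {p:.3f}")
```

Here the invented counts are much more dispersed than a Poisson mechanism of the same mean would produce, so the observed index sits far out in the simulated reference distribution.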

A second somewhat specialized but widely important possible role for non-frequency probability is in connection with very rare events, especially when there is little or no "hard" data. Here the objective is a frequency, but the evidence about it may be rather indirect. There is substantial recent work on so-called risk assessment.

Finally, there is the role of generalized notions of probability in assessing overall uncertainty in conclusions even when "hard" data are available; some notion of subjective probability can be advocated as a way of incorporating judgement and supplementary evidence into a comprehensive assessment of uncertainty.

Despite the great interest of recent work on this theme, it can be criticized for being wrongly focused. The primary requirement that the subjective probabilities are mutually consistent should be replaced by one that the probabilities agree with information from the real world. The case for taking subjective probability as the single unifying notion of formal statistical inference has been forcefully presented in recent years, but seems to me unconvincing. This is not because personal judgement does not arise in all fields of work. Rather, it is a belief that, at least in many applications, an object of statistical argument is to isolate those considerations that are largely impersonal and public in character, so far as feasible isolating matters of judgement to be dealt with mainly qualitatively. The introduction of specifically personal probabilities is a different issue from that of the role of Bayes's theorem. To refloat the Titanic, by providing a careful justification of "objective" priors, would be a cause for rejoicing.

Somewhat related to the issues of personal probability is the role of statistical decision theory. It is unreasonable to expect a mathematical theory to capture all the nuances of a complex issue. Nevertheless, we must ask whether decision theory as normally presented represents the essential elements of practical decision making. In industrial acceptance sampling, in control theory, and perhaps in some aspects of medical diagnosis, where the core of the decision problem is the uncertainty about the "true state of nature" arising from limited statistical information, the conventional formulation seems apposite. Where, however, the core is either the specification of a utility function or of prior information, not directly based on statistical information, the standard formulation seems largely circular.

A pressing issue in terms of general ideas on technique is to come to some agreement about the nature, purposes and limitations of significance tests, or their Bayesian analogues, and to adjust teaching at all levels. There is almost certainly no single topic so misunderstood in applications, and criticisms of overemphasis on significance tests have been widespread for many years. Discussion should include the distinction between the calculation of P values as indicators of consistency with some hypothesis, versus the use of tests as a form of decision rule. Also their role in exploratory analysis and as a guide in model choice is quite different from that in presenting the conclusions of "final" analyses.
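The distinction between a P value as an indicator of consistency and a test as a decision rule can be made concrete in a small sketch. The coin-tossing setting and all numbers below are invented for illustration and are not from the Address:

```python
import random

def p_value_fair_coin(heads, n, trials=20_000, seed=0):
    """Two-sided simulation P value for the hypothesis of a fair coin."""
    rng = random.Random(seed)
    observed = abs(heads - n / 2)
    extreme = sum(
        abs(sum(rng.random() < 0.5 for _ in range(n)) - n / 2) >= observed
        for _ in range(trials)
    )
    return extreme / trials

p = p_value_fair_coin(heads=60, n=100)
# As an indicator of consistency: report the P value itself and let the
# reader weigh it alongside subject-matter considerations.
print(f"P = {p:.3f}")
# As a decision rule: a rigid accept/reject verdict at a preset level.
print("reject at the 5% level" if p < 0.05 else "do not reject at the 5% level")
```

With 60 heads in 100 tosses the two readings can diverge: a P value a little above 0.05 signals appreciable tension with fairness, while the rigid rule compresses the same evidence into a bare "do not reject".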

Still at a fairly general level, there is the question of adjusting methods of analysis in the light of the data, something we often need to do and which causes some problems in all schools of formal inference. The following situations need to be distinguished:

(i) the primary question under study is unaffected by inspection of the data, but the secondary assumptions, e.g. about error structure, are changed;

(ii) the precise mathematical formulation of the primary problem is modified after inspection of the data, but the qualitative aspects remain unchanged;

(iii) the primary question at issue, e.g. the existence of such and such an effect, is formulated only after exploratory analysis of the data;

(iv) an initially complicated model, e.g. a regression model containing many explanatory variables, is simplified in the light of the data.

It is only when (iii) is involved and statistical significance is at issue that allowance for "selection" seems crucial. In the other cases it is arguable that no adjustment for adaptation in the light of the data is called for. In (iv), however, it is necessary and desirable that all sufficiently simple models consistent with the data are listed and not just one somewhat arbitrarily chosen from among equally well-fitting models. If this idea is accepted, the role for automatic model selection procedures such as AIC and its modifications and equivalents is restricted.

Links between descriptive and probabilistically based statistics are important. In particular, there is need for some formal theory, even if very rough, in fields like cluster analysis which at present largely lack internal devices for assessing precision. Here methods based partly or primarily on computer simulation seem likely to be fruitful.
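A rough sketch of what such simulation-based assessment might look like follows; the data, the crude one-dimensional 2-means procedure and the bootstrap are all illustrative assumptions, not methods endorsed in the Address:

```python
import random
import statistics

def two_means_split(data, iters=25):
    """Crude one-dimensional 2-means: return the boundary between clusters."""
    c1, c2 = min(data), max(data)
    for _ in range(iters):
        g1 = [x for x in data if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in data if abs(x - c1) > abs(x - c2)]
        if g1 and g2:
            c1, c2 = statistics.fmean(g1), statistics.fmean(g2)
    return (c1 + c2) / 2

rng = random.Random(2)
# Invented data: two reasonably separated groups on the line.
data = ([rng.gauss(0, 1) for _ in range(40)]
        + [rng.gauss(4, 1) for _ in range(40)])

split = two_means_split(data)
# Bootstrap: re-cluster resampled data and look at the spread of the answer,
# giving the clustering a rough internal measure of precision.
boots = [two_means_split(rng.choices(data, k=len(data))) for _ in range(200)]
print(f"estimated boundary {split:.2f}, "
      f"bootstrap s.e. {statistics.stdev(boots):.2f}")
```

The bootstrap standard error here plays the role of the missing "internal device for assessing precision": a large spread across resamples would warn that the clustering is not to be taken too seriously.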

More technical questions of theory concern asymptotic theory, in particular when there are large numbers of nuisance parameters, conditional inference, and the links between these two, for instance via some notion of approximate ancillarity.

At the edge of formal mathematical theory lies the choice of models for formal analysis and the study of sensitivity of conclusions to the assumptions made in analysis.

With probabilistically based methods the choice of model seems usually a more critical issue than differences between the results of various schools of formal inference. The model serves both to define the problem, i.e. to dictate the form in which the conclusions will be expressed, and to indicate the formal procedures for statistical analysis. The former is the more important role. Particularly in complex problems, it will be sensible to make simplifying assumptions for the second aspect, checking that there is unlikely to be a drastic effect on the final conclusions.

The computer makes possible large numbers of subsidiary plots and tests and there seems some danger of overelaboration and overinterpretation, especially where these subsidiary analyses are secondary to the main issues.

4. SOME FURTHER POINTS

Many of the general issues raised in Section 2 would be regarded by, say, physicists, engineers and economists as part of general research methods and not as specifically statistical. A few might go so far as to identify statistical analysis with the calculation of P values and rather more would see some fairly explicit use of probability as a necessary characteristic of statistical methods.

A recent trend, especially under the stimulus of J. W. Tukey, has been a strong emphasis on descriptive, i.e. non-probabilistic, and exploratory methods under the name of data analysis, in some sense at least to be contrasted with statistical methods. The explicit consideration of descriptive methods is very welcome; much statistical work has always been non-probabilistic. Nevertheless a strong separation of probabilistic and non-probabilistic techniques seems undesirable; successful application so often depends on a synthesis of the two. Further, the splits exploratory versus specific (so-called confirmatory) and descriptive versus probabilistic seem quite separate.

Much effective statistical work is done by those whose primary training and expertise is not statistical. Active interest in statistics by scientists, technologists and others is surely to be encouraged and it would be excellent if the Society were to make a more active contribution over this. To some extent, indeed, the definition of the domain of the subject is uninteresting. For the statistician as an individual, however, the position is different.

The role of the statistician as custodian of the implementation of special techniques (multiple regression, time series analysis, and so on) is being increasingly taken over by the computer. This implies two things for the individual statistician. One is an involvement in a generally increasing level of sophistication in technique. The other is implicit in the argument of Section 2. Concern with the details of methods of analysis has to be linked with the study of the broader aspects of the design and analysis of investigations. The word "consulting" has overtones of pretentious servitude. It is to be hoped that the emphasis will increasingly be on "collaboration", involving concern with many broader issues as well as with technical details of methods of analysis. Of course, many statisticians now and in the past have worked in this way; it would be good to see this as more explicitly part of the statistician's role.

Such developments would increase the difficulties of teaching the subject, already considerable. For developers of new methods a strong mathematical background seems very desirable although not absolutely essential. Yet the general issues mentioned above are not particularly mathematical and, indeed, lack the sharp focus that mathematicians, especially undergraduates, expect.

Diversity of educational approach is needed. While the spread of statistics as an undergraduate subject is to be welcomed as hopefully increasing the pool of workers with some expertise and interest in technique, postgraduate training seems desirable for most of those likely either to need to develop new technique or to apply with critical understanding the most recent developments.

Negative attitudes to statistical arguments, at least those of the crudest kind, are possibly less widespread than they used to be; public discussion of essentially statistical topics is common. Negative attitudes remain, however, and some merit serious attention. One comment heard from scientists and technologists is "my data are not in good enough form to allow statistical analysis", meaning not that the data are of poor quality but rather that they are not in the nicely balanced form found in textbooks. Another attitude among scientists is that statistics is "head counting", a useful but limited activity and less important than analysis and interpretation. A third comment to be heard from physicists and certain kinds of biologists is that "if I need statistics, I am doing a bad experiment". All these attitudes, and the answers to them which the reader can supply, serve to stress the need to take a broad view of the subject.

Perhaps the most critical issue facing statistical arguments in some fields is that they may be seen as primarily negative in tone. One of the roles in particular of significance tests is to restrain uncritical enthusiasm. More broadly, the ethos of the statistician is of cautious critical discussion, of explicit recognition of uncertainty and of the need for carefully balanced judgements. These are admirable notions surely in great need; one does not have to look far to see the dangers of fanaticism. Yet achievement in science or any other sphere requires enthusiasm. Overcaution can be harmful, in science and technology as much as in government and business.

In principle, decisive action can be combined with intellectual appreciation that there are uncertainties in the key evidence on which the decision making is based. In theoretical terms these are the dual themes of decision and inference that reappear throughout recent discussions of general principles in statistics. The possibility of explicit and quantitative resolution of this conflict is one of the most important intellectual contributions of our subject, with far-reaching and as yet undeveloped implications.

The two last Presidents of the Society responded to Professor Cox's address as follows.

Sir CLAUS MOSER: It is one of the privileges of the Society's Presidents that, after completing their term of office, they can surface again when proposing the vote of thanks to their successor; and emerge yet again, with a final presidential gasp, as seconder to their successor but one. So here I am on my first posthumous appearance. Unfortunately, one is not expected to exploit the occasion for a postscript to one's own Presidential Address. If one were, I would have the sad task of commenting on the reputed cuts in some important parts of government statistics which I covered in my own Address. I am dismayed by what I read in the Press and cannot hide my concern, or my belief that the present Government as well as its successors will come to regret serious cuts in government statistics.

However, this cannot be my theme today. I have a much more agreeable task, namely to propose the vote of thanks to the President for an Address which displays the hallmarks of Professor Cox's distinguished work. First, it is highly concise, and it is characteristic that, whereas it took me 91 words to express my gratitude to the Society for electing me, your new President does it all in 13. Secondly, his conciseness is combined, as always, with wisdom and profundity. I can say with feeling that his Address needs to be read slowly and repeatedly before one gets the full measure of it; it is not an Address for a quick skip.



But above all, it is invaluable for the way in which Professor Cox addresses himself to those aspects of theory and general principle which provide, or at least should provide, whatever unity our curious subject may possess. I have noted that hardly a President of recent years has refrained from expressing concern about the splintering both of our subject and our profession; yet, in spite of numerous discussions and reports, there has been little progress. I also regret that the Society itself has not so far contributed more deliberately to the building of bridges between applied and theoretical statisticians and between our various specialist interests. This remains a key concern and the Address before us is one of the most helpful and subtle contributions to its solution. It is particularly welcome to an applied statistician to find one of our most eminent theoretical statisticians showing himself so concerned with the applied, descriptive and non-probabilistic aspects of the subject; and so anxious to put mathematical theory in its vital, though not necessarily central, place. I do not think anyone could quarrel with the President's views on what is to be looked for in the contribution of theory and general principle to our subject and on the relation between statistics and mathematics. On this latter, as he says, much fine work in statistics involves minimum mathematics while some bad work gets by because of its apparent mathematical content. How true this is; but also how right he is to warn against the dangers of a swing towards an anti-mathematical attitude.

The second section of the Address takes us through the stages common to many statistical investigations. It is a familiar list, but what makes it special is the commentary by Professor Cox on each stage. Particularly important are his remarks on analytical strategy and on the common imbalance in the effort put into data collection on the one hand and analysis on the other; as well as his remarks on the uneasy balance between probabilistic and descriptive methods in analysis. We are all familiar with examples of data collected and never used and with over-complex approaches to analysis. Sometimes the computer is blamed and of course its existence encourages one to over-analyse. But we cannot shift the blame: when data are collected unnecessarily or analysed in an undisciplined fashion the fault lies squarely with us, the statisticians.

This is an aspect of a central issue in the President's Address, namely the unevenness in approach to the stages of a typical statistical investigation, something that has worried me all my statistical life and especially in teaching.

Let us take a typical survey and divide its process into three broad stages. In the first, where we are dealing with sample design and selection, we behave like scientists. We devote to it all our statistical ingenuity, basing ourselves on scientific theory and on principles enshrined in textbooks. But, as we well know, the sample is a means to an end, the end being the analysis of and inferences from the survey data. And if we turn to the second stage, choosing the questions, designing the questionnaires and so forth, we have to recognize that we are in a different world. However careful the statistician may be, however much he has experimented with different question forms, etc., his decisions on question design, in short on the measurements, are largely judgemental. Little science is involved. And when we turn to the third stage, of analysis and inference, we find ourselves halfway back to behaving like scientists, with some rigorous techniques in our tool kit. Yet even here, much of the approach is judgemental. In short, the process as a whole is a mixture of hard, very soft and semi-soft techniques; and yet we often end up using our results as if the entire process were hard.

Moreover, the mixture of hard and soft applies widely throughout statistical work, certainly in most economic and social statistics. Thus I can think of few areas in official statistics where one does not have to combine strong, well-based probabilistic data with weak, non-random information. Complex aggregates like the National Accounts contain a wide range of data-hardness, but we are not yet sophisticated enough to allow for this in estimating measurement error. Many other examples spring to mind.

How right the President is to emphasize the patchiness of current theory in dealing with different stages of a statistical investigation and in pointing to the need for advance. If I have a frustration, it is that I cannot gauge his confidence as to the chances of advance. Does he believe that the soft areas, in data collection, measurement and analysis, can gradually reach the more scientific levels of, say, sample or experimental design; in short, that it is just a matter of more study? Or does he believe that we will get cleverer in synthesizing the hard with the soft, the probabilistic with the non-probabilistic? I would rather put money on the second than on the first, though even there I do not yet see the way.

In this context there is another problem touched on by the President, namely the decision process. As he rightly says, conventional decision theory is not always helpful. This suggests the need for continuing research on the relations between data and decision-making. If an indicator is less accurate than is needed for whatever decisions are involved, one fails both to satisfy the user and to use resources effectively; equally, it is wasteful to go for more accuracy than is needed. Ideally, a statistical investigation should be geared to its final use, avoiding both excessive and inadequate precision. But the real world is more complex. Most statistics serve multiple purposes, and most purposes are hard to define, let alone quantify. We rarely know all the uses, or even the main uses, to which data are likely to be put; nor what the cost of wrong decisions may be (just think of the money supply in this context!). We need extensive study of types of decision, and of their sensitivity to data quality.
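As a toy formalization of this precision-cost trade-off (mine, not the speaker's), suppose the cost of an investigation grows linearly in the sample size $n$ while the expected loss from estimation error shrinks like the standard error:

```latex
C(n) = c_1 n + \frac{c_2}{\sqrt{n}}, \qquad
\frac{dC}{dn} = c_1 - \frac{c_2}{2}\, n^{-3/2} = 0
\quad\Longrightarrow\quad
n^{*} = \left( \frac{c_2}{2 c_1} \right)^{2/3} .
```

Both over-sampling and under-sampling relative to $n^{*}$ waste resources; but note that $c_2$, the cost of a wrong decision, is precisely the quantity we rarely know.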

In the third section of the Address, I was interested in the President's remarks on those situations where one has to adjust the methods of analysis in the light of the data. In economic and social studies one often has to make primary inferences, e.g. whether a particular relationship exists or not, only after one has thoroughly explored the data. As Professor Cox says, in order to deal with such situations rigorously, one needs to set out all the simple models consistent with the data. The trouble is that so often, in the social and economic field, we lack an adequate subject-matter theory on which to set up meaningful models. Indeed it is the softness in subject-matter theory that often floors the statistician and his sophisticated bag of tricks. There is a tantalizing reference in the Address to a new emphasis on data analysis conceived in subject-matter as opposed to statistical terms, to the role of non-probabilistic, descriptive methods and to ways of combining these with rigorous probabilistic techniques. This remains a central puzzle, and I hope the Society will interest itself in this area.

What is at issue is not just the approach to statistical investigation, but the very role of the statistician. He is the custodian, as Professor Cox says, of a host of special techniques, many of which the computer now copes with. So the statistician, whether he likes it or not, should try to make more sophisticated those parts of the process which are at present weak and soft. And he should also become more subject-matter oriented, less buried in techniques and more concerned with data-content. The days of the statistician working primarily at arm's length as a techniques-man, pure and simple, are over; or at least they should be. He has to be an integral part of the subject-matter team, and be prepared for this in his teaching. Here I applaud particularly the President's emphasis on the statistical work of non-statisticians as well as on the need for subject-matter involvement of professionals. The Society has a role in furthering both aims and in urging universities to review their teaching programmes both for specialists and for non-specialists.

I end by welcoming the closing paragraphs of the President's Address. What he has done is to widen our horizons and to remind us that, at present, statistical theory and principle contribute only patchily to the conduct of investigations. But, far from throwing up his hands in discouragement, he sees in this imposing challenges for our profession. He also urges what amounts almost to a change of heart in the way we view our task. Though we must remain true to our statistical techniques and principles, and to total integrity, we need to become more adventurous and less cautious in our approach. Science, as he reminds us, does not advance primarily by caution. Our approach needs to be positive and enthusiastic, avoiding the danger of using statistics primarily in a negative sense. Even uncertainties in evidence should be used positively, to help decisions rather than to stop them. All too often, statisticians fail to make their full contribution because they are too caught up in technical perfectionism.

The President's concluding remarks deserve the closest attention. They carry far-reaching implications for our profession, for teachers and for the Society; and, if followed, they would give us a greater and more constructive role.

It is with admiration as well as pleasure that I propose the vote of thanks to the President.

Dr H. P. WYNN: I think it falls to me to speak for all the students and colleagues of the President, many in the audience tonight, and say how grateful we are to have benefited from his deep understanding and authority in both theoretical and applied statistics. His superb contributions and teachings have been among the most influential in post-war Britain and throughout the world.

It may seem churlish to take issue with the President on one matter. I excuse myself by knowing that following tradition he asked me to look at his address critically.

The President does not make much reference to the role of decision theory. I personally believe that a major advance in statistics was the clear separation of the parameter space from the action space by Abraham Wald around 1947. Whether one takes an objective or a subjective point of view, the classical decision theory model underpins much of modern thinking in statistics. The idea of action by the statistician should be expanded to include not only the post-experimental decision but also the choice of model or hypothesis and the activity of data collection and experimental design. Without it the statistician is reduced to being merely a voyeur, only able to stand passively by and look at the pretty patterns in Nature (including the stems and leaves). He must, rather, learn to quantify risk, whether actual or in terms of error rates and other operating characteristics of his procedures. All these lie in the decision-theoretic framework. If we deviate too much from it, or do not replace it with something better, we run the risk of drifting into self-deception. One famous decision theorist remarked that statistical inference is a "grim business". These dangers are particularly clear in the complex areas of sequential analysis and dynamic processes, where we have to make simultaneous decisions on estimation and observation.
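In the standard textbook notation (not Dr Wynn's own), Wald's separation takes a parameter space $\Theta$, an action space $\mathcal{A}$, a decision rule $\delta$ mapping data $x$ to actions, and a loss $L(\theta, a)$; the rule is judged by its risk, the expected loss at each $\theta$:

```latex
R(\theta, \delta) = \operatorname{E}_{\theta} L\bigl(\theta, \delta(X)\bigr)
= \int L\bigl(\theta, \delta(x)\bigr)\, f(x \mid \theta)\, dx .
```

A rule $\delta$ is inadmissible if some $\delta'$ satisfies $R(\theta, \delta') \le R(\theta, \delta)$ for every $\theta$, with strict inequality somewhere; error rates and other operating characteristics are simply risks under particular choices of loss.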

The importance of action is especially evident where incorrect statements, poor forecasts or a refusal to quantify losses or error rates can result in genuine damage in human terms. Nowhere is this more so than in the Government Statistical Service. (I hope there is at least one Treasury forecaster in the audience.) It is essential that statisticians be brought more closely into the policy-making and not left to a passive role of data processing and editing. Sometimes even the decisions about data collection are made for the statistician, so that he is squeezed from both ends. It is no accident that the operational research departments, with their emphasis on costs and optimization, have taken on some of the more interesting statistical work. Knowing the policy (if there is one), knowing what the data are used for, being able to suggest uses, assessing the costs of inaccuracy or lack of coverage: all these add spice and direction to the analysis. In many small research units the statistician does have this key role.

Another consequence of divorcing action or policy from statistical activity is that the role of the statistician is weakened politically. The best way to stop someone saying "off with your head" is to be an indispensable part of the policy machine. The President hints at this in his closing remarks when he says that enthusiasm is required. All I am saying is that enthusiasm needs direction, and that choice of direction is itself a decision.

However, my concern for the role of the statistician is also heightened by the currently proposed cutbacks in the Government Statistical Service. If these lead to a rationalization and integration of the role of the statistician, that is fine. But if constructive proposals are used as a smokescreen for the wholesale destruction of official statistics, then we should complain. When even the role of the statistician as a cautious, passive processor of information is threatened, everyone will suffer, both Government and those who monitor Government from outside.

The threat of this happening makes me concede any argument I may have with the President because, in the final analysis, the data are paramount.

I have very great pleasure in seconding the vote of thanks.

The vote of thanks was passed by acclamation.
