Statistics in Psychology Using R and SPSS (Rasch/Statistics in Psychology Using R and SPSS) || Psychology - an Empirical Science


Post on 10-Dec-2016






P1: OTA/XYZ P2: ABCJWST094-c03 JWST094-Rasch September 22, 2011 19:27 Printer Name: Yet to Come

3 Psychology - an empirical science

This chapter is about the importance of statistics and its methods for psychology as a science. It will be demonstrated that for the gain of scientific insight in psychology, empirical studies are needed. An example describes the statistical approach to answering the scientific question that a study is based on. Important statistical terms, which will be clear in context, will be introduced.

Empirical research starts with a scientific question. Its concluding answer leads to a gain of insight. The way from the question to the gain of scientific insight is often intricate, not trivial from the outset, and different according to the question. That is why the following section is about a general strategy for gaining insight in an empirical science.

A first example will be based on a question that can act as representative for many other scientific questions in applied psychological research. It is about the psychological consequences of a hysterectomy; that is, the short- and middle-term condition of women whose uteruses had to be removed for medical reasons. It will be demonstrated that this question can only be answered by means of an empirical research study. In this context it will be shown that statistically well-founded planning, as well as an operationalization of the psychological phenomenology (what will be investigated), are both crucial.

During the careful (research) planning of a study, which is necessary to answer a question, the focus is on the balance of concurrent demands of optimality. What the adequate solution strategy generally looks like will be demonstrated with a second example. It does not allude at all to a psychological question, but to an everyday trivial one.
Even here it becomes clear that statistics as a scientific discipline is able to work on and adequately answer complex questions that arise from questions which sound very easy at first but have finally been stated more precisely.

Statistics in Psychology Using R and SPSS, First Edition. Dieter Rasch, Klaus D. Kubinger and Takuya Yanagida. 2011 John Wiley & Sons, Ltd. Published 2011 by John Wiley & Sons, Ltd.

3.1 Gain of insight in psychology

Psychology as a science deals with the (lifelong development of the) behavior and experience (consciousness) of humans as well as with the respective causative conditions. The goals . . . are to describe, explain, predict, and control behavior and seek[s] to improve the quality of each individual's and the collective's well-being (Gerrig & Zimbardo, 2004, p. 4).

From that perspective, there is a need for empirical studies in psychology in order to gain scientific insight. Using rules and methods from the natural sciences, systematic observations of a character must be made and they have to be related to treatment factors that are controlled as far as possible. The actually realized values of our observations we call observed (measurement) values/outcomes of a character.

Master Doctor

Example 3.1 The psychological consequences of a hysterectomy will be assessed

According to clinical psychology, physical illnesses are connected with mental aspects most of the time. Some aspects are coping strategies or the prevention of psychic crises or psychic disorders.
After undergoing a hysterectomy there is reason to fear that patients suffer from lasting psychic crises, for example in the sense of a massive loss of self-esteem, especially concerning self-esteem as a woman.

Let's make the assumption that the cause for the given question is an unsystematic, subjective, or selective perception of some of these patients' advisors. At least in terms of health policy or maybe even in economic terms it appears appropriate to research this question.

For the sake of simplicity let's assume that earlier research has provided a psychological assessment tool that can measure the self-esteem or the psychic stability of a person in a valid way. Let's simply call that tool Diagnosticum Y. Then we can begin to design a study. Usually one thinks about the first group of patients that comes along. The most easily reachable, reasonably sized group (e.g. 30), as concerns the relation of workload and gain of insight, could be psychologically tested with Diagnosticum Y after surgery.

People critical of this empirical research design would at once argue:

1. It is an arbitrary selection of patients. The institution may have systematically chosen patients who were too old: in other institutions the mean age of such patients might be substantially lower. There could be patients with comparatively low educational level, divergent ethnic origin, long-term single status and much more.

2. Every result would be meaningless because one would not know which test scores women without surgery have in Diagnosticum Y (that is to say the (normal/total) population) as well as which test scores the women of the study would have had before surgery.

According to this, one should plan to examine women from the healthy (that is, the (not-yet) positively diagnosed) part of the population at the same time, too. They form the comparison group, as opposed to the target group.
Initially, here again the first group of women that is available will be chosen (presumably similar in number to the first group).

People critical of this empirical research design would argue:

3. More than ever, this is an arbitrary selection of persons. In the group of healthy women, that is to say the comparison group, there could be people with systematically different psychosocial characteristics (age, education level, social status, ethnic origin, relationship) as compared to the patient group; that is, the target group.

4. The two group sizes seem to be too small; hardly anybody will dare to draw generally binding conclusions because of the possibly observable differences between the two arbitrarily examined groups. Therefore no general conclusion about the psychic consequences of a hysterectomy can be made. A compulsory psycho-hygienic need for action for coping or prevention cannot be deduced from this; yet nobody is interested in the differences between the two concrete groups (except the patients themselves and their family members).

Therefore, one must basically reflect on the choice of women that should be examined. Perhaps one comes to the conclusion that a so-called representative group (subsequently termed a sample) of the population is hard to survey (for more details see Chapter 4). Not only is the comparability of the psychosocial characteristics questionable, but also especially the circumstances under which women from the healthy population are willing to undergo the examination (e.g. only if they are paid and presumably not in a hospital) or which women are willing at all (e.g. those with especially high self-esteem or a special degree of psychic stability).
As a consequence one will plan to examine the patient group with Diagnosticum Y not only after surgery (note that here it is important to think about the exact time of examination after surgery; preferably not right after surgery has been performed but shortly before discharge) but also before surgery (note that in this case one must think about the exact time of examination before surgery; preferably not right before surgery is performed but a short time before hospital admission). The respective results could be individually compared and from this the psychic consequences of a hysterectomy could be estimated.

People critical of this empirical research design would argue:

5. Before surgery, presumably no patient will have a test score in Diagnosticum Y that can serve as a comparable value typical of the time before an illness with indication of hysterectomy.

6. And even if this were the case, a change towards loss of self-esteem or psychic stability as a result of a surgery would hardly be surprising, because every surgery means a massive intrusion into a human's bio-(psycho-socio-)tope.

Thus one has to specify the question: it is less about the examination of the psychic consequences of surgeries (in a selective way, that is, a specific surgery indication), but rather about the examination of the consequences of a specific surgery that is of interest (namely hysterectomy), preferably compared with other surgeries (that are less related to the role/functioning as a woman). Accordingly, an empirical research design is indicated that also includes, apart from a group of patients after hysterectomy, a group of patients with surgery that is comparable regarding severity (from a medical point of view; e.g. gallbladder surgery). Both groups would be examined with Diagnosticum Y after surgery.

Critics would again object to the choice of the sample:

7. Neither sample has been chosen in a representative way as concerns all the patients for whom conclusions should be made. We actually want insight that refers to all hysterectomy patients (compared to patients with gallbladder surgery) in Western civilization or at least the English-speaking countries. Our findings should be applicable for the typical age of such patients, for their typical psychosocial characteristics but also especially for patients in the conceivable future.

8. The choice of gallbladder surgery, out of all surgeries that are comparable regarding severity, is arbitrary and therefore may not be suitable.

9. The sample size is still not plausible.

Consequently, preliminary studies have to show that the first patient group that comes along, namely the one from a specific institution, really is typical regarding specific criteria, especially regarding the aforementioned psychosocial characteristics. Otherwise the research design has to be set up as a multi-center study. If necessary one has to take care to pick the patients representatively regarding the calendar month of their surgery, in order to take into consideration seasonal variations of what is examined with Diagnosticum Y (self-esteem, psychic stability). Also, at least through literature, the choice of gallbladder surgery as being typical for all other surgeries that are comparable regarding severity must be proven. Finally the number of investigated women should be considered in detail (see Chapter 8 and the subsequent ones).

The starting point is the just-confirmed question: Are the psychic consequences of a hysterectomy graver than those of surgery with comparable severity?

Critics of the current empirical research design would now have one final grave argument:

10. Women who fall ill such that a hysterectomy is indicated are different from the start (maybe from the time of birth) from women who undergo gallbladder surgery during their lifespan; for example the former could have a systematically different personality structure and, as a consequence, under corresponding environmental conditions, a vulnerability to illnesses of the uterus must be suspected.

Regarding this point of criticism, we ultimately have nothing to offer: this empirical research design is a classical retrospective study (in experimental psychology: an ex-post-facto design); that means that the allocation of patients to the two samples did not happen, as in an experiment, by chance (see Chapter 4) before the exposure to different conditions; but the grouping of the patients was done afterwards (after falling ill), and therefore by definition unable to be influenced by the examiner. Differences between patients after indications for hysterectomy or gallbladder surgery cannot, if once established, necessarily be traced back to the group criterion; instead it can never be ruled out that the differences have been there all along.

The gain of scientific insight in psychology starts, as in all other empirical sciences, with a deductive phase. Besides a general description of the problem, this phase also comprises: the specification of the aim of the study; the exact definition of the population of the units of research for which insights (from a subset, the sample) concerning the scientific question have to be gained; the exact definition of the required accuracy of the final conclusion; and the selection or construction of (optimal) designs of the study. Then the investigation and the collection of data connected with it are carried out.
Afterwards, an inductive phase follows, beginning with the statistical evaluation of the data and the subsequent interpretation of the results. The latter can lead to new questions that initiate further empirical research.

3.2 Steps of empirical research

Empirical research can be divided into seven steps:

1. Exact formulation (specification) of the scientific question.
2. Definition of certain precision requirements for the final conclusions, required for answering the scientific question.
3. Selection of the statistical model for the planning and analysis of the study.
4. (Optimal) planning of the study.
5. Realization of the study.
6. Statistical analysis of the collected data.
7. Interpretation of the results and conclusions.

The first three steps, however, cannot just be completed one after the other. The specification of the precision requirements, for example, can only be accomplished if one knows how the data will be analyzed later on.

Master Doctor

The exact formulation of the scientific question is important because, in contrast to imprecise questions in common speech (which will be understood even if they are posed in the wrong manner), a lack of precision in research will not lead to the desired gain of insight.

For Lecturers:

The answer of the former publisher of the ZEIT, Marion Gräfin Dönhoff, to the question "Do you mind if I smoke in your company?" is quite subtle: "I don't know, nobody ever dared to." Here, she actually answered two questions: the posed one ("Do you stand people smoking?") as well as the intended one ("May I please smoke?").

Doctor

Example 3.2 A manufacturer wants to state the mean fuel consumption for a specific car model. [1, 2]

First we have to point out that the posed question cannot be asked for every single car, but only for the car model as a whole; in the given case all produced cars of a specific model form the population (see also Chapter 4).
Next, we think about factors on which fuel consumption depends. At the same time we have to refrain from looking at the just-bought individual car, which for example as a "Friday car" might have certain defects. We find that fuel consumption depends on the driving style of the driver (e.g. a typically high- or low-revving driving style); on the route (e.g. Sacramento-Reno over the Sierra Nevada vs. San Diego-Los Angeles along the Pacific coastline); on whether one drives in the city, on a highway or on an expressway; on whether one gets into a traffic jam or slowly moving traffic; and perhaps on many other factors. Therefore we ask ourselves: for which situation should the statement in the prospectus be valid? For example, one could state the consumption for the most important situations in a chart. This is unusual, and because of the amount of information it may be daunting. Instead of eliminating the mentioned influences (context factors) with special experimental adjustments (see Section 7.2.2), one could conduct several test drives under some arbitrarily chosen conditions (e.g. 150 miles each) and determine the fuel consumption; we will show later on why this would be an inappropriate approach.

Now we consider that instead of a single number it may be advantageous to state a range (a confidence interval; for more details see Chapter 8) for the average/mean fuel consumption, averaged in particular over all influences; for most cars produced this range should be true. During the construction of such a confidence interval one has to fix the relative frequency (more exactly, the probability; see Chapter 6), 1 − α, with which the mean fuel consumption really is in that range. One will be anxious to keep α (very) small and therefore keep 1 − α (very) large. At the same time, however, we don't want to have too large an interval: the statement of, for example, "between 15 and 40 miles per gallon" would only lead to disapproval in future customers; it hardly contains surprising information.
From this we learn to keep the range within admissible boundaries; we could for example determine that it should not be larger than 3 miles per gallon.

With this we have accomplished a great deal of the seven steps of empirical research. The exactly formulated scientific question now is: Within what boundaries is the mean fuel consumption of the specific car model expected to be? Part of the analysis has also been determined; namely: from the collected data (here, measurement values in the natural sense; that is, fuel consumption in miles per gallon), a confidence interval of a predetermined width will be calculated.

[1] In a modified way this question was posed as a consulting problem to the first author of this book; the consulting led not only to a hardly ever published design of the study but also to a statistical analysis that even nowadays is rarely found in any statistics book.

[2] The chosen non-psychological example can easily be transformed by means of an analogy into a psychological one: a sport psychologist, who gives a certain treatment (mental training) to long jumpers in a competitive sports center, wants to publish the mean achieved training performance.

Let us now come to the choice of the statistical model and the empirical research design: a statistical model, which is an assumption, is needed as a mathematical explanation for the collection of the data. For reasons that will be explained later on (see Chapter 7) we will aim to have a random sample of $a > 1$ cars taken for test drive purposes from the total of all cars produced. With every single one of these cars, $n$ test drives over a predetermined distance will be made, and the amount of fuel consumed will be measured. These measurement values will be termed $y_{iv}$, with $i$ being the (fixed) number of the car, which can take any value between 1 and $a$.
The index $v$ is the number of the test drive with car $i$. For simplicity (and, as we will see later, also because it is, in a well-defined way, optimal) we make the same number of test drives with every car, so that $v$ runs from 1 to $n$. Symbolically, our future data will have the form $y_{iv}$; $i = 1, 2, \ldots, a$; $v = 1, 2, \ldots, n$, which are $a \cdot n$ measurement values. The future data structure therefore has already been determined at the time of planning the study.

Now we want to specify a statistical model for $y_{iv}$. Therefore we assume that the measurement values $y_{iv}$ fluctuate around a mean $\mu$. Out of all the reasons why we don't get the same measurement value every time (these are the so-called causes of variation), we only can/want to look at possible differences between the cars (inter-individual causes of variation). Accordingly, deviations between the single measurement values can be traced back to the effect $a_i$ of the respective car $i$ as well as to a measurement error $e_{iv}$. We model the observed measurement values $y_{iv}$ through a random variable $\mathbf{y}_{iv}$ (note the difference between $y_{iv}$ and $\mathbf{y}_{iv}$: random variables are here made distinguishable from non-random quantities through bold letters). Chance has an effect because we want to assume that the cars in the study have been randomly taken from the population of cars produced. [3] That is why the effects $\mathbf{a}_i$ of the cars must also be modeled or described through random variables. Hence the model equation is:

$$\mathbf{y}_{iv} = \mu + \mathbf{a}_i + \mathbf{e}_{iv} \qquad (i = 1, 2, \ldots, a;\ v = 1, 2, \ldots, n) \tag{3.1}$$

More about this model and its side condition can be found in Section [...]. With that, essentially all of the first three steps of empirical research have been completed. For planning the study it is now necessary to optimally determine the two parameters $a$ and $n$. [4] We can either minimize the number $a \cdot n$ of test drives (that is, the size of the study) or the cost of the study.
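How the two design parameters act on the final confidence interval can be explored with a small simulation. The following Python sketch is only an illustration under stated assumptions, not the procedure used in the consulting case: it assumes normally distributed car effects and measurement errors, and the values for the mean and the two standard deviations are hypothetical. The function names `simulate_drives` and `mean_confidence_interval` are likewise made up for this sketch. Because every car makes the same number of drives, the car means are independent with expectation equal to the overall mean, so an ordinary t-interval on the car means (with a − 1 degrees of freedom) can be used.

```python
import random
import statistics


def simulate_drives(a, n, mu, sigma_a, sigma_e, rng):
    """Draw y[i][v] = mu + a_i + e_iv for a cars with n test drives each.

    Assumption: car effects a_i and errors e_iv are independent and normal.
    """
    data = []
    for _ in range(a):
        car_effect = rng.gauss(0.0, sigma_a)  # random effect of one sampled car
        data.append([mu + car_effect + rng.gauss(0.0, sigma_e) for _ in range(n)])
    return data


def mean_confidence_interval(data, t_quantile):
    """Confidence interval for mu in a balanced design, based on the car means.

    t_quantile is the (1 - alpha/2) quantile of the t distribution with
    a - 1 degrees of freedom; it is passed in rather than computed here.
    """
    car_means = [statistics.fmean(drives) for drives in data]
    a = len(car_means)
    grand_mean = statistics.fmean(car_means)
    half_width = t_quantile * statistics.stdev(car_means) / a ** 0.5
    return grand_mean - half_width, grand_mean + half_width


# Hypothetical planning values (miles per gallon); not taken from the book.
rng = random.Random(2011)
data = simulate_drives(a=14, n=12, mu=30.0, sigma_a=1.5, sigma_e=2.0, rng=rng)

T_975_DF13 = 2.160  # tabulated 0.975 quantile of t with 13 degrees of freedom
lower, upper = mean_confidence_interval(data, T_975_DF13)
print(f"95% interval: {lower:.2f} to {upper:.2f} mpg (width {upper - lower:.2f})")
```

Rerunning the sketch with other values of `a` and `n` shows the trade-off behind the planning step: more cars shrink the interval faster than more drives per car whenever the variation between cars dominates, which is why both numbers have to be chosen together against the width requirement and the cost.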
At this point we don't want to explain in detail how we get to the solution; we only state the result here: with a = 14 cars, according to the price and precision requirements (not given here), the test track has to be driven n = 12 times.

As soon as the study has been carried out according to this design, we then only have to analyze the data and interpret the results, as described in Chapter 10.

[3] Basically all cars have the same opportunity of becoming part of the study, for example by having a lottery to decide which cars will actually be picked.

[4] Neither the calculation for a confidence interval, nor the optimal design of the study could be found in literature at the time of the aforementioned consulting. That is why at that time two colleagues were asked to develop something appropriate (see now Herrendörfer & Schmidt, 1978).

For Lecturers:

Non-statistical reasoning often leads to unjustified generalizations in the interpretation of observations. This can be well demonstrated with the following humorous example. Three Continental Europeans travel through Scotland in a train. One is very uncritical, one is critical, and the third one is statistically educated. They see three black sheep standing on a hill. The uncritical one says to the others: "See, in Scotland the sheep are black." Next the critical one says: "One cannot say that in such a general way. What one can say is that there are at least three black sheep in Scotland." Then the third one says: "Even that doesn't have to be true; we can only say that there are at least three sheep in Scotland that are black on one side."

Most of the time in psychological studies one will have to deal with the collection of more than one single character.
One then has to decide in favor of one character that is the most interesting in order to complete the first four steps of empirical research.

Summary

For the gain of insight in psychology, a statistically (meaning: derived from statistics as a science) founded design of the study is needed. For this purpose, a definition should be given for the population (of persons) for whom findings will be recorded (using a subset/sample of it). Regarding content, the needed observations must be carried out in a way that relevant context factors are controlled. The way of sampling, as well as the size of the sample, depend on fundamental rules of statistics. The collection of data regards the ascertainment (often measurement) of observable phenomena, which leads to actually realized values of our observations; we call them observed (measurement) values/outcomes.

References

Gerrig, R. J. & Zimbardo, P. G. (2004). Psychology and life (17th edn). Boston: Allyn & Bacon.

Herrendörfer, G. & Schmidt, J. (1978). Estimation and test for the mean in a model II of analysis of variance. Biometrical Journal, 20, 355-361.

