Appendix DATA FILES
The data files used in this manual are contained on the Springer-Verlag web site (http://www.springer-ny.com/supplements/voelkl). The data files are provided in two forms, an SPSS data file (.sav extension), and a raw or ASCII data file (.dat format).
BILLIONAIRE.DAT Information on 224 billionaires throughout the world (n = 224) (available: http://lib.stat.cmu.edu/DASL/Datafiles/Billionaires92.html).
Variable              Column(s)  Format  .sav Variable Name
Wealth (in billions)  4-8        f5.2    wealth
Age                   15-16      f2.0    age
Region of World       21-24      f4.2    region
    1=Asia, 2=Europe, 3=Middle East, 4=US, 5=Other
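Coded variables such as region are stored as numbers, so a reader working outside SPSS has to attach the labels by hand. The following is a minimal Python sketch of that step. The function name `region_label` is hypothetical, and it assumes the raw field holds a plain number such as `1.00` or `4`.

```python
# Code-to-label mapping copied from the BILLIONAIRE.DAT region codes.
REGION_LABELS = {1: "Asia", 2: "Europe", 3: "Middle East", 4: "US", 5: "Other"}


def region_label(field):
    """Decode a raw region field such as '1.00' or '   4' to its label.

    Assumes the field is a plain number; surrounding blanks are ignored.
    """
    code = int(float(field))
    return REGION_LABELS[code]


print(region_label("3.00"))  # Middle East
```

A code outside 1-5 raises `KeyError`, which is usually preferable to silently inventing a label.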
BOTTLE.DAT Daily Output of 12 Bottle Capping Machines (n = 12) (Kruskal, W. H., and Wallis, W. A. (1952). Use of ranks in one-criterion analysis of variance. Journal of the American Statistical Association, 47, 583-621).
Variable  Column(s)  Format  .sav Variable Name
Machine   5-8        f4.2    machine
Output    11-16      f6.2    output
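The Column(s) entries give the positions a variable occupies in each line of the .dat file, counted from 1 and inclusive at both ends. A minimal Python sketch of extracting fields by those positions follows. The helper name `slice_columns` and the sample record are invented for illustration and are not taken from BOTTLE.DAT. The sketch also assumes each value either contains an explicit decimal point or is a whole number; it does not handle implied decimal places.

```python
def slice_columns(line, first, last):
    """Return the text in 1-based, inclusive columns first..last."""
    return line[first - 1:last]


# Hypothetical record laid out like BOTTLE.DAT:
# columns 1-4 blank, machine in 5-8, columns 9-10 blank, output in 11-16.
record = " " * 4 + "   1" + " " * 2 + "340.00"

machine = float(slice_columns(record, 5, 8))
output = float(slice_columns(record, 11, 16))

print(machine, output)  # 1.0 340.0
```

Subtracting 1 from the first column converts the 1-based position in the table to Python's 0-based indexing; the last column needs no adjustment because Python slices exclude their end index.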
208 Using SPSS for Windows
CANCER.DAT Exposure to Radioactive Materials and Cancer Mortality Rate (n = 9) (Fadeley, R. C. (1965). Oregon malignancy pattern physiographically related to Hanford, Washington, radioisotope storage. Journal of Environmental Health, 27, 883-897).
Variable                                     Column(s)  Format  .sav Variable Name
Index of Exposure                            5-9        f5.2    expose
Cancer Mortality (per 100,000 person-years)  13-17      f5.1    mortalit
CEREAL.DAT Nutritional information for breakfast cereals (n = 77) (available: http://lib.stat.cmu.edu/DASL/Datafiles/Cereals.html).
Variable                 Column(s)  Format  .sav Variable Name
Name of cereal           1-32       A       name
Manufacturer             42         f1.0    manufac
    1=American Home Foods, 2=General Mills, 3=Kellogg's, 4=Nabisco,
    5=Post, 6=Quaker Oats, 7=Ralston Purina
Type of cereal           50         f1.0    type
    1=cold, 2=hot
Calories per serving     52-58      f6.2    calories
Protein grams            63-66      f4.2    protein
Fat grams                71-74      f4.2    fat
Sodium (milligrams)      77-82      f6.2    sodium
Fiber                    86-90      f5.2    fiber
Carbohydrates            94-98      f5.2    carbo
Sugar                    102-106    f5.2    sugar
Potassium                109-114    f6.2    potass
Vitamins                 117-122    f6.2    vitamin
Shelf position in store  127-130    f4.2    shelf
    1=bottom, 2=middle, 3=top
Weight                   135-138    f4.2    weight
Cups                     142-145    f4.2    cups
Rating                   149-153    f5.2    rating
CLT.DAT 100 Random Samples of Size 50 from Uniform Distribution (n = 100) (generated by SPSS).
Variable    Column(s)  Format  .sav Variable Name
Marker      4-5        f2.0    marker
Sample 1    7-8        f2.0    u1
Sample 2    10-11      f2.0    u2
...
Sample 100  304-305    f2.0    u100
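Only three of the 100 sample variables are listed for CLT.DAT. The listed rows are consistent with each sample occupying two columns followed by one blank, which would put sample k in columns 4 + 3k through 5 + 3k. The Python sketch below encodes that inferred rule; the even spacing of the unlisted samples is an assumption, and the function name `clt_sample_columns` is hypothetical.

```python
def clt_sample_columns(k):
    """Return the (first, last) 1-based columns assumed for sample k in CLT.DAT.

    Inferred from the listed rows: u1 in 7-8, u2 in 10-11, u100 in 304-305.
    """
    if not 1 <= k <= 100:
        raise ValueError("sample number must be between 1 and 100")
    first = 4 + 3 * k
    return first, first + 1


for k in (1, 2, 100):
    print(k, clt_sample_columns(k))
# 1 (7, 8)
# 2 (10, 11)
# 100 (304, 305)
```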
COMPUTER.DAT Gender and Role Portrayal in Computer Magazines (n = 661) (Ware, M. C., and Stuck, M. F. (1985). Sex-role messages vis-a-vis microcomputer use: A look at the pictures. Sex Roles, 13, 205-214).
Variable           Column(s)  Format  .sav Variable Name
Gender (1=M, 2=F)  5          f1.0    gender
Role               7          f1.0    role
    1=Seller, 2=Manager, 3=Clerical, 4=Computer Expert, 5=Other
CONFORM.DAT Husbands and Wives Conformity Ratings (n = 20) (hypothetical data).
Variable         Column(s)  Format  .sav Variable Name
Husband's Score  3-4        f2.0    husband
Wife's Score     7-8        f2.0    wife
DEATH.DAT Data on Number of Months Before, During, or After Birth Month that Death Occurred (n = 348) (Philips, D. (1972). Deathday and birthday: An unexpected connection. In J. M. Tanner, et al. (Eds.), Statistics: A Guide to the Unknown. San Francisco: Holden Day).
Variable        Format  .sav Variable Name
Month of Death  f2.0    month
DELINQ.DAT Data on SES, Population Density, and Delinquency for 75 Community Areas of Chicago (n = 75) (hypothetical data suggested by Galle, O. R., Gove, W. R., and McPherson, J. M. (1972). Population density and pathology: What are the relations for man? Science, 176, 23-30).
Variable                    Column(s)  Format  .sav Variable Name
Socioeconomic Status (SES)  2          f1.0    ses
    1=Low, 2=High
Population Density          4          f1.0    pop_dens
    1=Low, 2=High
Delinquency                 6          f1.0    delinq
    1=Low, 2=High
DIETER.DAT Weights of 25 Dieters (n = 25) (hypothetical).
Column(s)  Format
2-4        f3.0
DIVISION.DAT Data on sex and division of students in a course (n = 25) (hypothetical).
Variable        Column(s)  Format  .sav Variable Name
Name            1-10       A       name
Sex (1=F, 2=M)  13         f1.0    sex
Division        15         f1.0    division
    1=Graduate, 2=Undergraduate
ENROLL.DAT Data on School Districts, Including the Racial Disproportion in Classes for Emotionally Disturbed Children (n = 26) (US Department of Education, Office for Civil Rights).
Variable                                                    Column(s)  Format  .sav Variable Name
District Enrollment                                         1-5        f5.0    enroll
Percentage of Students Who Are African-American             7-11       f5.2    pct_aa
Percentage of Students Who Pay Full-Price for Lunches       15-19      f5.2    pct_lnch
Racial Disproportion in Classes for Emotionally Disturbed*  23-27      f5.2    rac_disp
*Positive index indicates that proportion of African-American students is greater than the proportion of white students.
EXERCISE.DAT Data for 202 Individuals on Exercise Behavior and Health Status (n = 202) (Hypothetical).
Variable           Column(s)  Format  .sav Variable Name
Exercise Category  5          f1.0    exercise
    1=Exerciser, 2=Nonexerciser
Health Status      7          f1.0    health
    1=Good, 2=Poor
FINAL.DAT Final Grade in Statistics for 68 Students (n = 68) (Anderson, T. W., and Finn, J. D. (1996). The New Statistical Analysis of Data. New York: Springer-Verlag).
Variable                Column(s)  Format  .sav Variable Name
Class                   5          f1.0    class
    0=Undergraduate, 1=Graduate
Primary Language        10         f1.0    language
    0=Other, 1=English
Final Statistics Grade  13-15      f3.0    grade
FIRE.DAT Data for 28 Firefighter Applicants (n = 28) (Buffalo, New York records)
Variable                    Column(s)  Format  .sav Variable Name
Candidate Number            1-4        f4.0    candnum
Sex (1=M, 2=F)              5          f1.0    sex
Race (1=White, 2=Minority)  6          f1.0    race
Stair Climb Time            7-11       f5.2    stair
Body Drag Time              12-16      f5.2    body
Obstacle Course Time        17-22      f6.2    obstacle
Agility Score               23-28      f6.2    agility
Written Score               29-33      f5.2    written
Composite Score             34-38      f5.2    composit
FOOTBALL.DAT Weights and Heights of 56 Stanford Football Players (n = 56) (official program, freshmen eliminated).
Variable  Column(s)  Format  .sav Variable Name
Weight    3-5        f3.0    weight
Height    8-9        f2.0    height
GAS.DAT Federal and State Gasoline Tax Rates for the 50 States (n = 50) (Statistical Abstract of the United States (1993), p. 305).
Variable                     Column(s)  Format  .sav Variable Name
Tax Rate (cents per gallon)  5-9        f5.2    gas_tax
HEAD.DAT Head Measurements for 25 Infants (n = 25) (Frets, G. P. (1921). Heredity of head form in men. Genetica, 3, 193-384).
Variable               Column(s)  Format  .sav Variable Name
Length (millimeters)   3-5        f3.0    length
Breadth (millimeters)  8-10       f3.0    breadth
HOTDOG.DAT Nutritional information for different brands of hot dogs (n = 54) (available: http://lib.stat.cmu.edu/DASL/Datafiles/Hotdogs.html).
Variable             Column(s)  Format  .sav Variable Name
Type of meat         5          f1.0    type
    1=Beef, 2=Other type of meat, 3=Poultry
Calories             7-9        f3.0    calories
Sodium (milligrams)  11-13      f3.0    sodium
INCOME.DAT Income of 64 Families (n = 64) (hypothetical data).
Variable  Column(s)  Format  .sav Variable Name
Income    1-5        f5.0    income
IQ.DAT IQ Scores for 23 Children (n = 23) (Anderson, T. W., and Finn, J. D. (1996). The New Statistical Analysis of Data. New York: Springer-Verlag).
Variable        Column(s)  Format  .sav Variable Name
Language IQ     1-3        f3.0    lang
Nonlanguage IQ  4-6        f3.0    nonlang
IQ2.DAT IQ Scores for 24 Children (n = 24) (Anderson, T. W., and Finn, J. D. (1996). The New Statistical Analysis of Data. New York: Springer-Verlag).
Variable        Column(s)  Format  .sav Variable Name
Language IQ     1-3        f3.0    lang
Nonlanguage IQ  4-6        f3.0    nonlang
JOBPROF.DAT Job Proficiencies Based on Educational Attainment for 22 Individuals (n = 22) (hypothetical data).
Variable                Column(s)  Format  .sav Variable Name
Educational Attainment  1          f1.0    educ
    1=Noncompleter, 2=High School, 3=College
Job Proficiency         4-6        f3.0    job_prof
KIDS.DAT Number of Children in 20 Households (n = 20) (hypothetical data).
Variable            Column(s)  Format  .sav Variable Name
Number of Children  1          f1.0    num_chld
LIBRARY.DAT Size of Book Collection and Number of Staff for 22 College Libraries (n = 22) (McGrath, W. E. (1986). Levels of data in the study of library practice: Definition, analysis, inference and explanation. In G. G. Allen and F. C. A. Exon (Eds.), Research and the Practice of Librarianship: An International Symposium (pp. 29-40). Perth, Australia: Western Australian Institute of Technology).
Variable            Column(s)  Format  .sav Variable Name
Volumes (100,000s)  1-4        f4.1    volumes
Staff               6-8        f3.0    staff
LUNCH.DAT Data on Calories Consumed at Lunch and Restaurant Temperature (n = 14) (hypothetical).
Variable     Column(s)  Format  .sav Variable Name
Temperature  1-2        f2.0    temp
Calories     4-6        f2.0    calories
MEAL.DAT Data on Restaurant Type, Cost, and Meal Type (n = 46) (hypothetical).
Variable             Column(s)  Format  .sav Variable Name
Meal                 1          f1.0    meal
    1=Pasta, 2=French, 3=Seafood
Cost                 3          f1.0    cost
    1=Inexpensive, 2=Moderate, 3=Seafood
Chain (1=Yes, 2=No)  5          f1.0    chain
MOVIES.DAT Genre and Gross for 100 Top Movies in 1995 (n = 100) (Daily Variety).
Variable          Column(s)  Format  .sav Variable Name
Genre             5          f1.0    genre
    1=Action-Adventure, 2=Drama, 3=Family, 4=Comedy
Gross (Millions)  9-13       f5.1    gross
NOISE.DAT Average Highway Speed and Noise Level for 30 Sections of Highway (n = 30) (hypothetical data suggested by Drew, D. R., and Dudek, C. L. (1965). Investigation of an Internal Energy Model for Evaluating Freeway Level of Service. College Station: Texas A&M University, Texas Transportation Institute).
Variable             Column(s)  Format  .sav Variable Name
Average Speed (mph)  1-4        f4.1    speed
Noise Level          5-8        f3.2    noise
OCCUP.DAT Occupations of 20 Primary Householders (n = 20) (hypothetical).
Variable    Column(s)  Format  .sav Variable Name
Occupation  1          f1.0    occup
    1=Professional, 2=Sales, 3=Clerical, 4=Laborer
PHONE.DAT Monthly Telephone Calls Received by Airlines (n = 36) (American Airlines (1988). Call workload data from a reservations office. Unpublished raw data).
Variable                       Column(s)  Format  .sav Variable Name
Year                           1          f1.0    year
    1=1985, 2=1986, 3=1987
Number of Calls (nearest 100)  2-6        f5.2    calls
POPULAR.DAT Data on Elementary School Students' Goals (n = 478) (available: http://lib.stat.cmu.edu/DASL/Datafiles/PopularKids.html).
Variable                                                     Column(s)  Format  .sav Variable Name
Gender (1=F, 2=M)                                            2          f1.0    gender
Grade                                                        7          f1.0    grade
Age                                                          9-10       f2.0    age
Race                                                         16         f1.0    race
    1=White, 2=Other
Urbanicity                                                   18         f1.0    urban
    1=Rural, 2=Suburban, 3=Urban
School Name                                                  19-25      A       school
Goals                                                        26         f1.0    goals
    1=Make good grades, 2=Be popular, 3=Be good at sports
Importance of grades (1-4) for Popularity (1=most; 4=least)  31         f1.0    grades
Importance of sports (1-4) for Popularity (1=most; 4=least)  34         f1.0    sports
Importance of looks (1-4) for Popularity (1=most; 4=least)   37         f1.0    looks
Importance of money (1-4) for Popularity (1=most; 4=least)   40         f1.0    money
READING.DAT Reading Scores of 30 Students Before and After Second Grade (n = 30) (records of a second-grade class).
Variable                   Column(s)  Format  .sav Variable Name
Score Before Second Grade  1-4        f3.2    before
Score After Second Grade   5-8        f3.2    after
RULERS.DAT Age of Death of 42 English Rulers (n = 42) (Gebski, V., Leung, O., McNeil, D. R., and Lunn, A. D. (1992). The SPIDA User's Manual. Sydney, Australia: Statistical Computing Laboratory. Reprinted in Hand, D. J., Daly, F., Lunn, A. D., McConway, K. J., and Ostrowski, E. (Eds.) (1994). A handbook of small data sets. London: Chapman & Hall).
Column(s)  Format  .sav Variable Name
1-2        f2.0
SALARY.DAT Salaries by Gender for 33 Half-Time Clerical Workers (n = 33) (hypothetical data).
Variable                  Column(s)  Format  .sav Variable Name
Gender                    1          f1.0    gender
Salary (100s of Dollars)  3-5        f3.0    salary
SEMESTER.DAT Data on 24 Students in a Statistics Course (n = 24) (records from a class).
Variable                         Column(s)  Format  .sav Variable Name
Major                            2          f1.0    major
    1=Electrical Engineering, 2=Chemical Engineering, 3=Statistics,
    4=Psychology, 5=Public Administration, 6=Architecture,
    7=Industrial Engineering, 8=Materials Science
Semesters of Statistics Courses  4          f1.0    semester
Grade                            7-9        f3.0    grade
SLEEP.DAT Data on Mammals' Physical, Environmental, and Sleep Characteristics (n = 62) (available: http://lib.stat.cmu.edu/datasets/sleep.html).
Variable                                                  Column(s)  Format  .sav Variable Name
Species' Name                                             1-27       A       species
Body Weight (kg)                                          28-34      f7.2    bodywt
Brain Weight (g)                                          35-42      f8.0    brainwt
Nondreaming Sleep (hrs/day)                               43-50      f8.2    nodream
Dreaming Sleep (hrs/day)                                  51-58      f8.2    dream
Total Sleep (hrs/day)                                     59-66      f8.2    totsleep
Life Span (years)                                         67-74      f8.2    lifespan
Gestation Time (days)                                     75-82      f8.2    gestate
Predation Index (1=min to 5=max)                          83-90      f8.2    prey
Sleep Exposure Index (1=least exposed to 5=most exposed)  91-98      f8.2    sleepexp
Danger Index (1=least to 5=most)                          99-106     f8.2    danger
Missing values = -999
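Because SLEEP.DAT marks missing values with -999, any summary computed from the raw .dat file outside SPSS has to exclude that code first, or it will badly distort means and totals. A minimal Python sketch of one way to do this is shown below. The helper names and the sample fields are invented for illustration; only the -999 code comes from the file description.

```python
MISSING_CODE = -999.0  # missing-value code documented for SLEEP.DAT


def parse_field(text):
    """Convert a raw numeric field to float, or None if it is the missing code."""
    value = float(text)
    return None if value == MISSING_CODE else value


def mean_ignoring_missing(values):
    """Average the values that are present; return None if none are present."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Hypothetical raw fields for one variable across three records.
raw = ["    6.50", "    -999", "    2.00"]
parsed = [parse_field(field) for field in raw]

print(parsed)                         # [6.5, None, 2.0]
print(mean_ignoring_missing(parsed))  # 4.25
```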
SOCMOB.DAT Data on Family Structure and Occupation of Members (n = 1156) (available: http://lib.stat.cmu.edu/datasets/socmob).
Variable                 Column(s)  Format  .sav Variable Name
Father's Occupation      1          f1.0    f_occup
    1=Laborer, 2=Craftsperson, 3=Salesperson, 4=Manager, 5=Professional
Family Structure         3          f1.0    family
    1=Intact, 2=Nonintact
Race (1=White, 2=Other)  5          f1.0    race
Son's Occupation         7          f1.0    s_occup
    1=Laborer, 2=Craftsperson, 3=Salesperson, 4=Manager, 5=Professional
SPIT.DAT Data on Success of Interventions to Curb Chewing Spitting Tobacco (n = 54) (Greene, J. C., Walsh, M. M., and Mosouredis, C. (1994). Report of a pilot study: A program to help major league baseball players quit using spit tobacco. Journal of the American Dental Association, 125, 559-567).
Variable              Column(s)  Format  .sav Variable Name
Type of Intervention  1          f1.0    interven
    1=Minimum, 2=Extended
Outcome               3          f1.0    outcome
    1=Subject Quit Entirely
    2=Subject Tried Unsuccessfully to Quit
    3=Subject Failed to Try to Quit
VOTE.DAT Data on Voting Patterns (n = 46) (hypothetical).
Variable                          Column(s)  Format  .sav Variable Name
Plan to Vote in Current Election  1          f1.0    voting
    1=Yes, 2=No
Registered Voter                  3          f1.0    register
    1=Yes, 2=No
Voted in Last Election            5          f1.0    voted
    1=Yes, 2=No
WAR.DAT Expectations of Possibility of War (n = 597) (Lazarsfeld, P. F., Berelson, B., and Gaudet, H. (1968). The People's Choice (3rd edition). New York: Columbia University Press).
Variable          Column(s)  Format  .sav Variable Name
June Response     1          f1.0    June
    0=Does Not Expect War, 1=Expects War
October Response  3          f1.0    October
    0=Does Not Expect War, 1=Expects War
WEATHER.DAT Average Precipitation and Temperature on July 2nd for US Cities (n = 78) (data obtained from Internet).
Variable       Column(s)  Format  .sav Variable Name
City           1-29       A       city
Temperature    30-34      f5.2    temp
Precipitation  38-42      f5.2    precip
WORDS.DAT Number of Words 18 Children Memorized Based on Three Different Experimental Conditions (n = 18) (hypothetical data).
Variable                   Column(s)  Format  .sav Variable Name
Information Set            2          f1.0    info_set
    1=No Information, 2="3 Categories", 3="6 Categories"
Number of Words Memorized  4-5        f1.0    words
Index
β, beta, 174, 176, 177
γ, gamma, 168
λ, lambda, 166

Alternative hypothesis, See Hypotheses testing
Analysis of variance, ANOVA, 178, 189, 191, 195, 197, 201
Association between categorical variables, 161
  coefficient of, gamma, 168
  coefficient of, lambda, 166
  coefficient of, phi, 164

b, least-squares estimate of regression coefficient, 176
Bar chart, 30, 31
Bell-shaped curve, 93
Bernoulli distribution, 89
Bonferroni, 201
Box-and-Whisker plots, 56

Categorical data, 27, 157
Central Limit Theorem, 101
Central tendency, 39
Chi-square, 158, 161, 203
Coin toss, 89
Comparison of two independent sample means, 137, 139, 142
Conditional association, 83
Confidence interval, 110, 111, 113, 115, 116, 124, 150, 153, 174, 177
Correlation coefficient, 59, 64, 66, 172, 178, 189
Correlation matrix, 66, 68, 69
Crosstabulation, 73, 75, 143, 161, 164, 166, 168

Data
  categorical, 27
  continuous, 33
  discrete, 32
  numerical, 27
Data files, 5
  ASCII, 7
  opening, 5
  saving, 13, 14
  SPSS, 6
Deciles, 43
Degrees of freedom, 120, 124, 141, 151, 162, 177, 178, 179, 198, 201
Descriptives, 44, 45, 52, 54, 55, 108, 112, 120
Deviation from mean, 51. See also Mean deviation, Standard deviation
Dice, 90, 99
Discrete variable, See Variables, discrete
Dummy coding, 190

Effect size, 202
Estimated regression line, 174, 176
Estimation
  of a mean, 107
  of a median, 113
  of a proportion, 112
  of a standard deviation, 107
  of a standard error, 107
  of a variance, 107
Explore, 33, 46, 56, 111, 113, 120, 150, 151

F
  distribution, 151
  ratio, 178, 198, 199
  statistic, 190
Files
  opening, 5
  printing, 21
  saving, 13, 14
Frequency, 27, 28, 29, 32, 39, 40, 44, 73, 78, 107, 108, 113, 120, 121

Goodness of Fit, 157, 159
Graph, bar, 30, 31

H (H0, H), 119, 120, 123, 125, 126, 127, 128, 133, 138, 139, 142, 149, 151, 157, 158, 177, 178, 183, 195. See also Hypotheses testing
Help, 5
Histogram, 32, 34, 36, 100, 101, 102, 121, 122
Hypotheses, testing, 119. See also Testing hypotheses

Inferential statistics, 107
Intercept, 174
Interquartile range, 53
Interval estimate, 107
Inverse association, 180

Kruskal-Wallis, 202

Least-squares estimation, 176
Linear relationship, 64, 171

Main menu, 4
Marginal totals, 75
Matched samples, 116
Maximum, 44, 51
McNemar, 133, 134, 135
Mean
  arithmetic average, 44
  of a population of differences, 114
  population mean, μ, 107, 109, 137
  sample mean, 120
Mean deviation, 53
Median, 41, 128
Minimum, 44, 51
Missing values, 23, 70
Mode, 39
Multiple correlation, 178
Multiple regression, 171, 185
Multivariate, 59, 73

Nonparametric, 127, 134, 202
Normal distribution, 93, 101
Null hypothesis, See Hypotheses testing

One-tailed tests, 142
One-way analysis of variance, 195, 197
Outlier, 63, 64, 172
Output window, 3, 4, 21

P value, 125, 179
Paired measurements, 114, 131
Percentages, crosstabulation, 76
Percentiles, 43
Phi coefficient, 77, 78, 164
Point estimate, 107
Post hoc, 199, 201
Printing, 21
Probability, 89, 93, 157, 159
Proportion, 47, 126, 133, 142

Quantile, 43
Quartiles, 43

r, correlation coefficient, 64, 172
Random sample, 21, 91, 97
Range, 18, 51
  interquartile, 53
Rank correlation coefficient, 66
Ranks, analysis of variance of, 202
Regression, 171, 174, 185
  line, estimated, 174
  multiple, 171, 185
  weight, 176

s, Sample standard deviation, 108
Sample data, 107
Sampling distribution, 97
Saving, 13, 14
Scatter plot, 59, 60, 61, 172, 179, 196
Sign test, 128, 145
Slope, 174, 176, 182, 185
Spearman's rank correlation coefficient, 66
Standard deviation, 54
  population standard deviation, 119
  population standard deviation, σ, 108, 120, 125, 137, 139, 142
  sample, s, 108, 122
Standard error, 126
Standard normal distribution, 93, 94, 98, 125
Standard score, 55
Stem-and-Leaf, 37
Sunflower, 196
Syntax, 3

T test, 123, 131, 139
Test statistic, 120
  for a correlation, 178
  for a mean, 120, 123
  for a median, 130
  for a proportion, 126
  for a variance, 150
  for independence in two-way tables, 162
  for the slope of a regression line, 177
  for two means, 133, 138, 141
  for two medians, 145, 146
  for two proportions, 135, 144
  for two variances, 151
  in analysis of variance, 201
Testing hypotheses
  about a correlation, 178
  about a variance, 149
  about means, 119
  about medians, 128, 145
  about proportions, 126
  about the mean of a population of differences, 131
  about two means, 137
  of equality of two proportions, 133, 142
  of equality of two variances, 150
Three-way frequency tables, 79
Transform, 14, 90, 94, 97, 99, 102, 129, 143
Transpose, 101
Two-way frequency tables, 73, 78

Variability, 51, 149
Variables
  adding, 10
  categorical, 27, 30
  continuous, 27, 32, 33
  deleting, 11
  discrete, 23, 27, 32
  numerical, 27
Variance, 149, 150

Windows, 3

z value, 93, 94