Data Construction and Data Analysis for Survey Research

RAYMOND KENT

© Raymond Kent 2001

All rights reserved. No reproduction, copy or transmission of this publication may be made without written permission.

No paragraph of this publication may be reproduced, copied or transmitted save with written permission or in accordance with the provisions of the Copyright, Designs and Patents Act 1988, or under the terms of any licence permitting limited copying issued by the Copyright Licensing Agency, 90 Tottenham Court Road, London W1T 4LP.

Any person who does any unauthorised act in relation to this publication may be liable to criminal prosecution and civil claims for damages.

The author has asserted his right to be identified as the author of this work in accordance with the Copyright, Designs and Patents Act 1988.

First published 2001 by
PALGRAVE MACMILLAN
Houndmills, Basingstoke, Hampshire RG21 6XS and
175 Fifth Avenue, New York, N.Y. 10010
Companies and representatives throughout the world

PALGRAVE MACMILLAN is the global academic imprint of the Palgrave Macmillan division of St. Martin's Press, LLC and of Palgrave Macmillan Ltd. Macmillan is a registered trademark in the United States, United Kingdom and other countries. Palgrave is a registered trademark in the European Union and other countries.

ISBN 978-0-333-76306-3 ISBN 978-1-137-08944-1 (eBook) DOI 10.1007/978-1-137-08944-1

This book is printed on paper suitable for recycling and made from fully managed and sustained forest sources.

A catalogue record for this book is available from the British Library.

Library of Congress Cataloging-in-Publication Data
Kent, Raymond A.
Data construction and data analysis for survey research / Raymond Kent.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-333-76306-3
1. Sociology--Statistical methods. 2. Sociology--Data processing. 3. Social sciences--Research. 4. Social surveys--Statistical methods. 5. Market surveys--Statistical methods. I. Title.
HM535.K45 2001
301'.07'27--dc21 2001036165

10 9 8 7 6 5 4 3 2 10 09 08 07 06 05 04 03

Copy-edited and typeset by Povey-Edmondson Tavistock and Rochdale, England

Contents

List of Tables
List of Figures
Preface

1 Introduction
  Learning objectives
  The scope of this book
  The structure of the book
  How to study this book
  The survey in market and social research
  Survey objectives
  Types of survey
  When to use (and when not to use) surveys
  Summary
  Further reading
  References

Part I CONSTRUCTING SURVEY DATA: GETTING GOOD QUALITY DATA

Introduction to Part I

2 Designing the Data Matrix
  Learning objectives
  Introduction
  What are 'data'?
  The process of data construction
  The format of the data matrix
  The design of the data matrix
    The specification, number and selection of respondents
    The variables
    The process of measurement
    Scaling
  The quality of data
  Background discussion: the current 'theory' of measurement
  Summary
  Exercises
  Points for discussion
  Further reading
  References

3 Filling the Data Matrix
  Learning objectives
  Introduction
  Survey design and execution
  Population specification
  Frame error
  Questionnaire design
  Non-response
  Response errors
  Interviewer errors
  Editing
  Coding
  Data entry
  Using computer packages
  Entering data on SPSS
  Saving your work
  Introduction to the table tennis study
  Summary
  Exercises
  Points for discussion
  Further reading
  References

Part II ANALYSING SURVEY DATA: CHOOSING THE RIGHT DATA ANALYSIS TECHNIQUES

Introduction to Part II
  Analysis objectives
  Scale type
  The number of variables
  Choosing data analysis techniques
  Further reading

4 Tables and Charts for Categorical Variables
  Learning objectives
  Introduction
  Univariate frequency tables
  Bivariate crosstabulation
  Three-way and n-way tables
  Bar charts and pie charts
  Using SPSS Frequencies, Graphs, Crosstabs and Recode procedures
    Frequencies
    Graphs and charts
    Crosstabs
    Recode
  Summary
  Exercises

  Points for discussion
  Further reading

5 Tables and Charts for Interval Variables
  Learning objectives
  Introduction
  Frequency tables for interval data
  Metric tables
  Histograms and line graphs
  Scattergrams
  Using SPSS Histogram, Line and Scatter
  Summary
  Exercises
  Points for discussion
  Further reading

6 Summarising Categorical Variables
  Learning objectives
  Introduction
  Univariate data summary
  Bivariate data summary
  Rank correlation
  Using SPSS Crosstabs: Statistics
  Summary
  Exercises
  Points for discussion
  Further reading

7 Summarising Interval Variables
  Learning objectives
  Introduction
  Univariate data summary
    Central tendency
    Dispersion
    Distribution shape
    Percentile values
  Bivariate data summary
    Correlation and regression
    Spearman's rho
  Multivariate procedures
    Multiple regression
    Factor analysis
    Cluster analysis
    Multidimensional scaling
  Using SPSS
  Summary
  Exercises
  Points for discussion
  Further reading

8 Sampling and the Concept of Error
  Learning objectives
  Introduction
  When we need to take samples
  Sample selection
  Sample design
  Sampling in practice
  Sampling errors
    Systematic error
    Random error
  The concept of error
  Total survey error
  Controlling error
  Summary
  Exercises
  Points for discussion
  Further reading
  References

9 Making Inferences from Samples: Categorical Variables
  Learning objectives
  Introduction
  Estimation
  Testing hypotheses for statistical significance
    Univariate hypotheses
    Bivariate hypotheses
    Multivariate hypotheses
  Statistical inference and bivariate data summaries
  Using SPSS
  Summary
  Exercises
  Points for discussion
  Further reading
  References

10 Making Inferences from Samples: Interval Variables
  Learning objectives
  Introduction
  Estimation
  Testing the null hypothesis
    Univariate hypotheses
    Bivariate hypotheses
    Comparing means: the analysis of variance
    The statistical significance of correlation
  The significance test controversy
  Using SPSS
  Summary
  Exercises
  Points for discussion
  Further reading

11 Evaluating Hypotheses and Explaining Relationships
  Learning objectives
  Introduction
  What is an 'hypothesis'?
  Should hypotheses be stated in advance of undertaking the research?
  Evaluating hypotheses
  Analysing and explaining relationships between variables
    What is an 'explanation'?
    Causal analysis
    Providing understanding
    Dialectical analysis
  Summary
  Exercises
  Points for discussion
  Further reading

Part III ANALYSING SURVEY DATA: KNOWING HOW TO HANDLE YOUR DATA

Introduction to Part III

12 Handling Your Data Matrix
  Learning objectives
  Introduction
  Upgrading and downgrading scales
  Data dredging
  How many cases are needed and what size of sample should be taken?
  Strategies for coping with relatively few cases
  Strategies for analysing summated rating scales
    The reliability and validity of summated ratings
  What do you do with 'don't know's?
  Missing values
  Handling multiple response items
  Can you use statistical inference on non-random samples?
  Using SPSS
    Using Compute
    Using Basic Tables
    Using Define Variable/Missing Values
    Using Multiple Response
    Using Reliability Analysis
  Background discussion: the interpretation of Cronbach's coefficient alpha
  Summary
  Exercises
  Points for discussion
  Further reading
  References

13 Analysing Open-ended Questions
  Learning objectives
  Introduction
  Treating responses as qualitative data
  Coding the data
  Open versus closed questions
  Summary
  Exercises
  Points for discussion
  Further reading

Appendix 1: The Table Tennis Questionnaire
Appendix 2: Using Pinpoint
Appendix 3: SPSS Release 10.0

Glossary
References
Index

List of Tables

2.1 Adult press readership by title
2.2 Students by identification number
2.3 Respondents by sex
2.4 Respondents by ethnic group
2.5 Respondents' agreement with the statement: 'This is a first class service'
2.6 Ranking of 6 students by performance in Maths and English
2.7 Household size
2.8 Age distribution of respondents
2.9 Some results from a survey
3.1 Errors in data entry
4.1 A frequency table for a binary variable
4.2 A frequency table of ordinal data
4.3 A multi-variable frequency table: respondents by sex, social class and age
4.4 Table 4.2 regrouped
4.5 A crosstabulation of other household players by sex of respondent
4.6 A crosstabulation of 'where table tennis was first played' by 'age'
4.7 Table 4.5 expressed as column percents
4.8 Age began playing table tennis and other household players layered by whether or not they were encouraged to take up the sport
4.9 An SPSS 'Frequencies' output
4.10 'Play frequency' by whether anybody else in the household plays
4.11 Column percentages: 'how many times played per week' by whether anybody else in the household plays
5.1 A frequency table for age began playing table tennis
5.2 Table 5.1 regrouped into two categories
5.3 Age groups by sex of respondent
6.1 A univariate table for categorical variables
6.2 How many times per week respondents play crosstabulated by whether anybody else in the household plays
6.3 Chi-square and Cramer's V for Table 6.2
6.4 SPSS output for Phi and Cramer's V
6.5 SPSS Chi-square output
7.1 Measures of central tendency for 'age began'
7.2 Measures of dispersion for 'age began'
7.3 A frequency distribution for 'age began'
7.4 SPSS measures of distribution shape

7.5 Percentile values for 'agebegan'
7.6 Scores of A-D on two tests
7.7 The calculation of r from Table 7.6
7.8 SPSS regression coefficients
7.9 SPSS Pearson correlation output
7.10 SPSS Spearman's rho output
7.11 Multiple regression. 'Spend' regressed on 'Enjoyment', 'Social benefits', 'Competition' and 'Health and fitness'
7.12 SPSS model constants for multiple regression
7.13 A correlation matrix
7.14 Factor loading on two factors
7.15 Similarity rankings of six multiples
7.16 SPSS regression output - variables entered
7.17 SPSS regression output - model summary
7.18 SPSS coefficients
9.1 Critical values of Chi-square
9.2 Importance of social benefits by age groups
9.3 Division played in by whether anybody else in the household plays
9.4 A goodness-of-fit test using Chi-square
10.1 Mean importance of social benefits by presence of other household players
10.2 ANOVA of perceived importance of benefits by whether anybody else in the household plays table tennis
10.3 Using SPSS 'Explore' to generate confidence intervals
11.1 Type of wine consumed by income
11.2 Wine consumed by income, controlling for age
11.3 Wine consumed by income, controlling for social class
12.1 Percentage sampling errors: maximum variability at 50:50
12.2 Satisfaction with various elements of playing table tennis
12.3 Mean satisfaction score for elements of playing table tennis
12.4 Total satisfaction scores by sex of player
12.5 Satisfaction with practice facilities by whether anybody else in the household plays table tennis
12.6 Satisfaction with practice facilities by whether anybody else in the household plays table tennis
12.7 Satisfaction with practice facilities by whether anybody else in the household plays, excluding 'neither' category
12.8 Analysis of missing values: univariate
12.9 Analysis of missing values: bivariate
12.10 A multiple response question
12.11 Crosstabulating a multiple response by another variable
13.1 Social grade and socio-economic classification
13.2 Analysis of an open-ended question

List of Figures

2.1 An open-ended question
2.2 A data matrix on SPSS
2.3 A survey data matrix
2.4 A coded data matrix
2.5 A coded single-answer question
2.6 A multiple-response question
2.7 The problem of measurement
2.8 A summated rating scale - customer satisfaction with service provided
2.9 A Likert measurement
2.10 A semantic differential
2.11 A snake diagram
2.12 Summary of scale types
3.1 Telephone survey contact outcomes
3.2 The 'Data Editor' window
3.3 The completed data matrix
3.4 The 'Define Variable' dialog box
II.1 Factors determining choice of technique
4.1 Bar chart: the age distribution of players
4.2 A horizontal bar chart: importance of perceived social benefits
4.3 A stacked bar chart: importance of perceived social benefits by sex of player
4.4 A clustered bar chart: importance of perceived social benefits by sex of player
4.5 Importance of aspects of playing table tennis
4.6 The social benefits of playing table tennis by individual case
4.7 A pie chart
4.8 Summaries of separate variable as a pie chart
4.9 The SPSS 'Frequencies' dialog box
4.10 The 'Crosstabs' dialog box
4.11 The 'Recode into Different Variables' dialog box
4.12 'Old and New Values'
4.13 The new recoded variable
5.1 A bar chart for continuous interval data
5.2 A histogram for continuous interval data
5.3 A bar chart for discrete interval data
5.4 A histogram for discrete interval data
5.5 A line graph of Figure 5.1
5.6 A scattergram of 'spend' by 'agebegan'
5.7 A histogram of 'age began'

5.8 A line graph of 'agebegan'
5.9 The importance of various aspects of playing table tennis
5.10 'Agebegan' plotted by sex of respondent
5.11 A scatterplot with separate markers for males and females
5.12 A matrix scatterplot with three variables
6.1 The 'Crosstabs: Statistics' dialog box
7.1 The distribution of 'agebegan'
7.2 Three normal distributions with differing parameters
7.3 The normal distribution
7.4 Scattergram of X on Y
7.5 A histogram for 'spend'
7.6 A multi-dimensional map based on ranking in Table 7.15
7.7 The SPSS 'Frequencies: Statistics' dialog box
9.1 A sampling distribution of sample size n
10.1 Critical regions: two-tail test
10.2 Critical region: one-tail test
11.1 A 'spurious' relationship
12.1 A scattergram of 'agebegan' by total satisfaction score
12.2 A scattergram of 'spend' by total satisfaction score
12.3 The 'Compute Variable' dialog box
A3.1 SPSS Release 10.0 'Variable View'
A3.2 Completed 'Variable View' for Figure 3.3

Preface

The idea for this book was occasioned by an uncounted, but certainly large, number of students (and in some cases colleagues) who over the years have come to see me and asked, 'OK, so now I've collected all my data, what do I do next?' or 'I've looked at a lot of statistics books, but I don't know which statistics to use to analyse my data' or 'I've used a set of 5-point rating scales, but how do I analyse the results?' or 'I've run off all these tables, but I've no idea how to make sense of them' or 'Do I add the neutral category to the agree or to the disagree group?' or 'What do I do with the don't knows?' or 'My sample is not really a random one, do I have to calculate tests of significance?' or 'Can I use a spreadsheet to analyse my survey data?'

This book is addressed to all these students - and to their tutors and supervisors who will no doubt appreciate having some literature that they can recommend. It should assist tutors and lecturers to give students some good advice. It might also help market and social researchers who have sought in vain for help on the 'nitty gritty' of analysing data from surveys they have undertaken.

The present structure of the book owes a lot to six anonymous reviewers who made many useful suggestions (although one of them did wonder whether I had recently had a disagreement with a statistician, given some of the jibes I had made at their expense in earlier drafts). Although I have toned down some of the comments, I could not bring myself to expunge them all, so I expect some flak from that quarter. If any of the reviewers read the final version of this book, I hope they will recognise that many of their comments have been taken on board. Needless to say, the reviewers did not all express the same viewpoint, so it was not possible to incorporate all their suggestions.

University of Stirling RAYMOND KENT

The author and publishers are grateful for permission given by SPSS, St Andrews House, West Street, Surrey, to produce screen shots of a range of SPSS windows. This book is not sponsored or approved by SPSS and any errors are in no way the responsibility of SPSS.