SW388R6 Data Analysis and Computers I Slide 1 Comparing Central Tendency and Variability across Groups Impact of Missing Data on Group Comparisons Sample Homework Problem Solving the Problem with SPSS Logic for Comparing Central Tendency and Variability Problems

Upload: dortha-cannon

Post on 08-Jan-2018




SW388R6 Data Analysis and Computers I

Slide 1

Comparing Central Tendency and Variability across Groups

Impact of Missing Data on Group Comparisons

Sample Homework Problem

Solving the Problem with SPSS

Logic for Comparing Central Tendency and Variability Problems


Slide 2

Impact of missing data on group comparisons - 1

When we analyze variables individually, we report on the valid and missing cases for each variable.

When we compare a measure of central tendency and variability for groups, we are analyzing two variables simultaneously: one which defines the groups, and one that represents the characteristic we are describing.

We report statistics for the cases that have valid data on both variables.


Slide 3

Impact of missing data on group comparisons - 2

When we compare the measures of central tendency and variability on multiple characteristics for groups, the issue of valid and missing data becomes more complex. For example, if we wanted to compare age, income, and education for males and females, we may get different values for the means and the standard deviations depending on how the analysis is conducted in SPSS.

SPSS can compute the statistics for each characteristic in a separate analysis, or it can compute the statistics for all variables in a single analysis.


Slide 4

Impact of missing data on group comparisons - 3

SPSS uses two rules for deciding how to handle missing data for multiple variables: pairwise deletion of cases missing data, and listwise deletion of cases missing data.

For the Explore procedure used in these problems, the default rule that SPSS applies unless instructed otherwise is listwise deletion.

In listwise deletion, SPSS omits cases that were missing data for any of the variables included in the analysis. Using our example, SPSS would omit a case from its calculations if it was missing data on age, income, education, or sex.


Slide 5

Impact of missing data on group comparisons - 4

In pairwise deletion, SPSS omits a case only if it is missing data for either of the two variables needed for a specific calculation. When using pairwise deletion to compute the mean age for males and females, a case would be omitted only if it were missing data for age or sex. Cases missing data for income or education, but not for age or sex, are included in the calculations.

Computing statistics using listwise deletion of missing cases may produce different answers than one would get using pairwise deletion.
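The difference between the two rules can be sketched in a few lines of Python. This is only an illustration on made-up cases, not SPSS output: the variable names follow the age, income, education, and sex example above, the values are invented, and `group_means` is a hypothetical helper.

```python
from statistics import mean

# Hypothetical toy cases; None marks a missing value.
cases = [
    {"sex": "male",   "age": 30, "income": 40000, "educ": 12},
    {"sex": "male",   "age": 50, "income": None,  "educ": 16},
    {"sex": "female", "age": 40, "income": 35000, "educ": None},
    {"sex": "female", "age": 20, "income": 30000, "educ": 14},
    {"sex": None,     "age": 60, "income": 50000, "educ": 12},
]

def group_means(cases, group_var, char_var, required):
    """Mean of char_var per group, keeping only cases valid on every variable in `required`."""
    groups = {}
    for case in cases:
        if any(case[v] is None for v in required):
            continue  # omit the case from this calculation
        groups.setdefault(case[group_var], []).append(case[char_var])
    return {g: mean(values) for g, values in groups.items()}

# Listwise: a case must be valid on all four variables in the analysis.
listwise = group_means(cases, "sex", "age", ["sex", "age", "income", "educ"])

# Pairwise: a case must be valid only on the two variables used for this statistic.
pairwise = group_means(cases, "sex", "age", ["sex", "age"])

print("listwise:", listwise)
print("pairwise:", pairwise)
```

In this toy data the listwise means for age are based on two cases and the pairwise means on four, so the two rules report different values for the same statistic.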


Slide 6

Impact of missing data on group comparisons - 5

If we can get two different values for the same statistic, the obvious question is which one is correct.

An argument for listwise deletion is that it is correct because the same cases are used for all calculations.

An argument for pairwise deletion is that it better represents the value of the statistic because it makes use of more cases.


Slide 7

Impact of missing data on group comparisons - 6

The problem can become even more complex when we use multiple SPSS procedures for the same set of variables. For example, we will use the “Explore” procedure to get measures for interval and ordinal variables, and the “Crosstabs” procedure to get the mode and modal percentage, which “Explore” does not compute.

If we include only two variables in each procedure, the measures would have the same value under pairwise or listwise deletion, because the list only includes one pair of variables.


Slide 8

Impact of missing data on group comparisons - 7

We can force SPSS to omit cases listwise across procedures if we create a dichotomous variable that indicates whether or not a case had valid data for all variables, and then select only those cases as a subset for the analysis.

While we will not do this in our problems, the method requires that we create a new variable, e.g. “nmissing” which uses the SPSS NMISS function to count the number of variables that are missing data for the specified list of variables. We would then select cases to be included if nmissing = 0.
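The counting-and-selecting idea can be sketched outside SPSS as follows. This is a minimal Python illustration on invented cases; the `nmiss` helper only imitates the role of the SPSS NMISS function and is not SPSS syntax.

```python
# Hypothetical toy cases; None marks a missing value.
cases = [
    {"sex": "male",   "age": 30, "income": 40000, "educ": 12},
    {"sex": "male",   "age": 50, "income": None,  "educ": 16},
    {"sex": "female", "age": 40, "income": 35000, "educ": None},
    {"sex": "female", "age": 20, "income": 30000, "educ": 14},
    {"sex": None,     "age": 60, "income": 50000, "educ": 12},
]
variables = ["sex", "age", "income", "educ"]

def nmiss(case, variables):
    """Count how many of the listed variables are missing for one case."""
    return sum(1 for v in variables if case[v] is None)

# Create the new variable, then select only the cases where nmissing = 0.
for case in cases:
    case["nmissing"] = nmiss(case, variables)

complete = [case for case in cases if case["nmissing"] == 0]
print(len(complete), "of", len(cases), "cases have valid data on all variables")
```

Any procedure run on `complete` then uses the same cases, which is what listwise deletion across procedures amounts to.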


Slide 9

Impact of missing data on group comparisons - 8

In solving the problems for this assignment, we will follow the strategy of including the variables only two at a time in the SPSS procedures: one variable which defines the groups, and one variable that represents the characteristic we are describing.

If you include multiple characteristic variables when you do the statistics in SPSS, you will probably get different answers than the ones stated in the problem and get the answer wrong.


Slide 10

This problem uses the data set "GSS2000R.Sav" to compare the distribution of survey respondents who had not seen an x-rated movie in the last year to the distribution of survey respondents who had seen an x-rated movie in the last year for the variables: "highest year of school completed" [educ], "sex" [sex], "liberal or conservative political views" [polviews] and "frequency of attendance at religious services" [attend]. The groups are based on the variable "seen x-rated movie in last year" [xmovie].

The data available for this study included 136 survey respondents who had not seen an x-rated movie in the last year and 49 survey respondents who had seen an x-rated movie in the last year. Out of the total of 270 cases in the dataset, 85 were omitted because of missing data and 0 cases were in other categories of the variable "seen x-rated movie in last year" [xmovie].

Survey respondents who had not seen an x-rated movie in the last year had completed fewer years of school (M = 13.01) than survey respondents who had seen an x-rated movie in the last year (M = 13.53). The scores for highest year of school completed varied more widely for survey respondents who had not seen an x-rated movie in the last year (SD = 3.25) compared to the scores for survey respondents who had seen an x-rated movie in the last year (SD = 2.73).

Survey respondents who had not seen an x-rated movie in the last year were most likely to have been female (68.4%), while survey respondents who had seen an x-rated movie in the last year were most likely to have been male (69.4%).

Continued on next slide…

Homework problems: Comparing central tendency and variability

This is the general framework for the problems in the homework assignment on comparing central tendency and variability. The measures of central tendency and variability are used to compare and contrast two groups whose differences are important to the research being reported.


Slide 11

Continued from previous slide

Survey respondents who had not seen an x-rated movie in the last year were most likely to describe their political views as moderate (41.7%), as were survey respondents who had seen an x-rated movie in the last year (37.8%).

Survey respondents who had not seen an x-rated movie in the last year attended religious services more often (Mdn = 3.00) than survey respondents who had seen an x-rated movie in the last year (Mdn = 2.00). The scores for frequency of attendance at religious services varied by the same amount for survey respondents who had not seen an x-rated movie in the last year (IQR = 5.00) compared to the scores for survey respondents who had seen an x-rated movie in the last year (IQR = 5.00).

o True

o False

o Inappropriate use of a statistic

Homework problems: Comparing central tendency and variability


Slide 12


Homework problems: Data set, groups and variables

The first paragraph identifies:

• The data set to use, e.g. GSS2000R.Sav

• The groups to be compared in the analysis

• The variables used as the descriptors of the groups

• The variable to use to create the groups

In this problem, the variable used to define groups has only two categories. If the grouping variable had more than two categories, the problem ignores the results for the categories not listed in the problem.


Slide 13


Homework problems: Sample size

The second paragraph describes:

• the number of cases in each group,

• the number of total cases in the data set,

• the number of cases with missing data, and

• the number of cases that were in other categories of the grouping variable.

The answer to the problem can only be true if all of the numbers describing the groups and sample are correct. The number of cases in the analysis will be the number in the two groups mentioned in the problem statement.


Slide 14


Homework problems: Comparing central tendency and variability

The remaining paragraphs describe each demographic characteristic in terms of central tendency and variability.

These paragraphs are written in the descriptive format similar to what would appear in a journal, rather than as tables of statistical values. This will require you to translate the SPSS output, variable, and value labels to more descriptive statements.

The statistics themselves are shown in parentheses at the end of the statements, using APA formatting style where a style has been defined. For example, “M” is defined as the correct abbreviation for the mean.


Slide 15


Homework problems: Comparing interval level variables

Comparison for an interval level variable that is not skewed, e.g. years of schooling, is done using the mean and the standard deviation of each group.

This comparison supports statements about which group had a higher score or greater variability on the variable. For example, we can say that one group has more years of education than the other group, and that its distribution of scores was more or less varied.

If an interval level variable is badly skewed, the comparison is done with the median and the interquartile range, following the same rule which we used for individual variables.
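As a rough illustration of the comparison for an interval level variable, the following Python sketch computes the mean and standard deviation per group, omitting cases that are missing either the grouping variable or the characteristic. The data are invented and `describe_by_group` is a hypothetical helper; `statistics.stdev` is the sample standard deviation (n - 1 in the denominator).

```python
from statistics import mean, stdev

# Hypothetical (group, years of school) pairs; None marks a missing value.
records = [
    ("no", 12), ("no", 16), ("no", 10), ("no", None),
    ("yes", 14), ("yes", 12), ("yes", 16), (None, 18),
]

def describe_by_group(records):
    """Return {group: (mean, sd)} using only cases valid on both variables."""
    groups = {}
    for group, value in records:
        if group is None or value is None:
            continue  # missing on the grouping variable or the characteristic
        groups.setdefault(group, []).append(value)
    return {g: (round(mean(v), 2), round(stdev(v), 2)) for g, v in groups.items()}

summary = describe_by_group(records)
for group, (m, sd) in summary.items():
    print(f"{group}: M = {m}, SD = {sd}")
```

With these made-up values, the "yes" group has the higher mean and the "no" group has the larger standard deviation, which is the kind of statement the problem asks us to verify.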


Slide 16


Homework problems: Comparing ordinal level variables

For an ordinal level variable, e.g. frequency of church attendance, groups are compared using the values for the median and the interquartile range.

This comparison supports statements about which group had a higher score or greater variability on the variable. For example, we can say that one group went to church more often than the other group, and that its distribution of scores was more or less spread out.


Slide 17


Homework problems: Comparing ordinal variables with many tie scores

“Political views” is also an ordinal variable, but it contains an excessive number of tied scores, which compromise the meaning of the median and interquartile range as measures of central tendency and variability.

When a variable has excessive ties, its mode and the percent of cases in the modal category are reported. The value label is used instead of the numeric code.

An ordinal variable will be considered to have excessive tie scores when the median has the same value as either the lower or upper bound of the interquartile range, following the same rule we used for central tendency and variability for individual variables.
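The excessive-ties rule can be written as a short check. This Python sketch is an illustration only: the scores are invented, `has_excessive_ties` is a hypothetical name, and the quartiles come from `statistics.quantiles` with the "exclusive" method, which is assumed here to be close to the percentiles SPSS reports; small differences from SPSS output are possible.

```python
from statistics import median, quantiles

def has_excessive_ties(scores):
    """True when the median equals the first or the third quartile."""
    q1, _, q3 = quantiles(scores, n=4, method="exclusive")
    mdn = median(scores)
    return mdn == q1 or mdn == q3

# Hypothetical ordinal codes: most cases share the same score.
tied = [4, 4, 4, 4, 4, 4, 5, 6]
# Hypothetical ordinal codes: scores are spread across the scale.
spread = [1, 2, 3, 4, 5, 6, 7, 8]

print(has_excessive_ties(tied))    # report mode and modal percent instead
print(has_excessive_ties(spread))  # median and interquartile range are usable
```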


Slide 18


Homework problems: Comparing nominal level variables

Nominal (including dichotomous) variables, e.g. sex, are compared using their modal category and the percent of cases in the modal category.

The value label is used for the modal category instead of the numeric code.
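A minimal Python sketch of the modal category and modal percent for each group follows. The numeric codes, value labels, and cases are all invented for illustration, and `modal_summary` is a hypothetical helper; if two categories tie for the mode, `Counter.most_common` simply returns the one it encountered first.

```python
from collections import Counter

# Hypothetical value labels for the numeric codes of a nominal variable.
labels = {1: "male", 2: "female"}

# Hypothetical (group, code) pairs; None marks a missing value.
records = [
    ("no", 2), ("no", 2), ("no", 1), ("no", None),
    ("yes", 1), ("yes", 1), ("yes", 1), ("yes", 2),
]

def modal_summary(records, labels):
    """Return {group: (label of modal category, percent of valid cases in it)}."""
    groups = {}
    for group, code in records:
        if group is None or code is None:
            continue  # omit cases missing either variable
        groups.setdefault(group, Counter())[code] += 1
    result = {}
    for group, counts in groups.items():
        code, n = counts.most_common(1)[0]
        percent = round(100 * n / sum(counts.values()), 1)
        result[group] = (labels[code], percent)  # report the label, not the code
    return result

modes = modal_summary(records, labels)
print(modes)
```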


Slide 19


Homework problems: Choosing an answer

The answer to a problem will be True if all of the statements about the sample size, and the comparisons of central tendency and variability are correct, both in terms of the statistic selected and the value reported.

The answer to a problem will be Inappropriate use of a statistic if the reported statistic violates the level of measurement criteria, i.e.:

• the mean and standard deviation are reported for an ordinal or nominal variable

• the median and interquartile range are reported for a nominal variable.

The answer to a problem will be False if a wrong value is reported for the sample size or for a statistic, or the wrong statistic is reported but the level of measurement criteria are not violated.
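The three answer rules can be laid out as a small decision function. This Python sketch is one possible reading of the rules above, not a grading key: the dictionary layout and the names `ALLOWED` and `choose_answer` are hypothetical, and checking for a level of measurement violation before checking for wrong values is an assumption about precedence.

```python
# Statistics permitted at each level of measurement, per the rules above.
ALLOWED = {
    "interval": {"mean", "sd", "median", "iqr", "mode"},
    "ordinal": {"median", "iqr", "mode"},
    "nominal": {"mode"},
}

def choose_answer(sample_sizes_correct, claims):
    """Each claim is a dict with keys: level, statistic, expected, value_correct."""
    # A statistic that violates the level of measurement criteria.
    if any(c["statistic"] not in ALLOWED[c["level"]] for c in claims):
        return "Inappropriate use of a statistic"
    # A wrong sample size, a wrong (but permitted) statistic, or a wrong value.
    if not sample_sizes_correct:
        return "False"
    if any(c["statistic"] != c["expected"] or not c["value_correct"] for c in claims):
        return "False"
    return "True"

good = [{"level": "interval", "statistic": "mean", "expected": "mean", "value_correct": True}]
wrong_value = [{"level": "ordinal", "statistic": "median", "expected": "median", "value_correct": False}]
violation = [{"level": "nominal", "statistic": "median", "expected": "mode", "value_correct": True}]

print(choose_answer(True, good))
print(choose_answer(True, wrong_value))
print(choose_answer(True, violation))
```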


Slide 20

Solving the problem with SPSS: Checking the number of cases - 1

Select the Descriptive Statistics > Frequencies… command from the Analyze menu.

Our first task is to use a frequency distribution to verify the number of cases in both groups to check the statement in the problem that: The data available for this study included 136 survey respondents who had not seen an x-rated movie in the last year and 49 survey respondents who had seen an x-rated movie in the last year. Out of the total of 270 cases in the dataset, 85 were omitted because of missing data and 0 cases were in other categories of the variable "seen x-rated movie in last year" [xmovie].


Slide 21

Solving the problem with SPSS: Checking the number of cases - 2

In the Frequencies dialog box, we move the variable used to define the groups, xmovie, to the Variable(s): list box.

Since all we want is the frequency distribution, we click on the OK button to generate the output.
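What the frequency distribution tells us can be sketched in Python as a simple tally. The short `xmovie` list below is invented for illustration; only the final arithmetic check uses the numbers stated in the problem (136, 49, 85, 0, and 270).

```python
from collections import Counter

# Hypothetical values of the grouping variable; None marks a missing value.
xmovie = ["no", "yes", None, "no", "no", None, "yes", "no"]

# Tally valid categories and missing cases, as a frequency table does.
counts = Counter("missing" if value is None else value for value in xmovie)
total = len(xmovie)

for category, n in counts.items():
    print(f"{category}: {n}")
print("total:", total)

# The group counts, missing cases, and other categories must add up to the total.
assert sum(counts.values()) == total

# The same consistency check applied to the numbers stated in the problem.
stated_total = 136 + 49 + 85 + 0
print(stated_total)
```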


Slide 22

Solving the problem with SPSS: Checking the number of cases - 3

The data available for this study included 136 survey respondents who had not seen an x-rated movie in the last year and 49 survey respondents who had seen an x-rated movie in the last year. Out of the total of 270 cases in the dataset, 85 were omitted because of missing data and 0 cases were in other categories of the variable "seen x-rated movie in last year" [xmovie].

As we can see in the frequency table, each of these numbers is correct.


Slide 23

Solving the problem with SPSS: Generating the output - 1

Select the Descriptive Statistics > Explore… command from the Analyze menu.

We will use the Explore procedure to generate the measures of central tendency and variability that we need to evaluate the statements about the individual demographic variables.

The Explore procedure gives us the measures of central tendency and variability needed for interval and ordinal variables. To get the mode and modal percent for nominal level variables, we will use the Crosstabs procedure.


Slide 24

Solving the problem with SPSS: Generating the output - 2

First, in the Explore dialog box, we move the first variable we want to compare, educ, to the Dependent List list box.

Second, we move the variable defining the groups, xmovie, to the Factor List list box.

Third, we click the Statistics option button to limit the output displayed by SPSS.

Fourth, we click on the Statistics… button to select specific statistics.

Following the discussion about missing data above, we will analyze characteristics one at a time.
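To see why analyzing characteristics one at a time can give different results from analyzing them together, here is a minimal sketch. The rows, the variable values, and the helper `group_means` are all hypothetical; the point is only that requiring valid data on an extra variable changes which cases enter each group mean.

```python
from statistics import mean

# Hypothetical cases; None marks a missing value.
rows = [
    {"xmovie": 0, "educ": 12, "attend": 3},
    {"xmovie": 0, "educ": 16, "attend": None},
    {"xmovie": 1, "educ": 14, "attend": 2},
    {"xmovie": 1, "educ": 10, "attend": None},
    {"xmovie": None, "educ": 18, "attend": 5},
]


def group_means(rows, group_var, target, require=()):
    """Mean of target per group, using only rows that are valid on
    group_var, target, and every extra variable named in require."""
    needed = (group_var, target, *require)
    grouped = {}
    for row in rows:
        if all(row[name] is not None for name in needed):
            grouped.setdefault(row[group_var], []).append(row[target])
    return {group: mean(values) for group, values in grouped.items()}


# One characteristic at a time: only xmovie and educ must be valid.
one_at_a_time = group_means(rows, "xmovie", "educ")
# All characteristics in one analysis: attend must be valid as well.
all_together = group_means(rows, "xmovie", "educ", require=("attend",))
print(one_at_a_time, all_together)
```

In this made-up example the ordering of the two group means reverses once cases missing attend are dropped, which is the kind of discrepancy the one-at-a-time approach avoids.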

Slide 25

Solving the problem with SPSS: Generating the output - 3

In the Explore: Statistics dialog box, we can only select the general category of Descriptives. We do not have an option to specify individual measures.

While Descriptives will include the interquartile range, it does not include the values of the first and third quartiles, which we need to identify excessive ties for an ordinal variable. To get the quartiles, we mark the Percentiles check box.

When we have marked the options we want, we click on the Continue button to close the dialog.
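For readers who want to see what the Percentiles option adds, the quartiles and interquartile range can be sketched in Python. The scores are hypothetical, and the choice of `method="exclusive"` is an assumption: it interpolates at position (n + 1)p, which we believe corresponds to the Weighted Average definition that SPSS uses by default, but other percentile definitions can give different values.

```python
from statistics import quantiles


def quartiles_and_iqr(values):
    """Return (Q1, median, Q3, IQR) using interpolation at position (n + 1) * p."""
    q1, median, q3 = quantiles(values, n=4, method="exclusive")
    return q1, median, q3, q3 - q1


# Hypothetical ordinal scores, not taken from the course dataset.
scores = [1, 2, 2, 3, 4, 4, 4, 5, 6, 7, 7]
result = quartiles_and_iqr(scores)
print(result)
```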

Slide 26

Solving the problem with SPSS: Generating the output - 4

Having selected the statistics we want, we click on the OK button to generate the output.

Slide 27

Solving the problem with SPSS: Statistical comparison of education - 1

Since educ is an interval level variable, we check the skewness of the distribution to determine whether we report the mean or the median as the measure of central tendency.

The skewness for survey respondents who had not seen an x-rated movie in the last year is -0.33, and the skewness for survey respondents who had seen an x-rated movie in the last year is 0.45. The skewness for both groups falls between -1 and +1.

The mean and standard deviation should be reported as the measures of central tendency and variability for education.
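The skewness rule can be sketched as follows. The formula is the adjusted sample skewness, which we understand to be the statistic SPSS reports, though that is an assumption here; the function names and the data are hypothetical.

```python
from statistics import mean, stdev


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n - 1)(n - 2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three cases")
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)


def report_mean_and_sd(groups):
    """True if every group's skewness lies between -1 and +1 (inclusive)."""
    return all(-1.0 <= sample_skewness(v) <= 1.0 for v in groups.values())


# Hypothetical groups: both roughly symmetric, so mean and SD would be reported.
symmetric = {"no": [1, 2, 3, 4, 5], "yes": [2, 4, 4, 5, 7]}
# One group with a single extreme value, so median and IQR would be reported.
skewed = {"no": [1, 2, 3, 4, 5], "yes": [1, 1, 1, 1, 1, 1, 1, 1, 1, 20]}
print(report_mean_and_sd(symmetric), report_mean_and_sd(skewed))
```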

Slide 28

Solving the problem with SPSS: Statistical comparison of education - 2

The statement that survey respondents who had not seen an x-rated movie in the last year had completed fewer years of school (M = 13.01) than survey respondents who had seen an x-rated movie in the last year (M = 13.53) is correct.

Slide 29

Solving the problem with SPSS: Statistical comparison of education - 3

The statement that the scores for highest year of school completed varied more widely for survey respondents who had not seen an x-rated movie in the last year (SD = 3.25) compared to the scores for survey respondents who had seen an x-rated movie in the last year (SD = 2.73) is also correct.

Slide 30

Generating the output for sex - 1

Next, we will compute central tendency for the dichotomous variable, sex.

SPSS provides a shortcut for us to use when we want to run the same procedure again. Position the mouse over the Dialog Recall tool button on the toolbar.

Slide 31

Generating the output for sex - 2

Click the mouse on the Dialog Recall tool button. A drop-down menu listing the last procedures run appears. Click on the Explore item at the top of the menu.

Slide 32

Generating the output for sex - 3

Click on the left arrow button to remove educ from the Dependent List.

Slide 33

Generating the output for sex - 4

Move sex to the Dependent List.

Since we only want to change the variable being analyzed and keep all of the options we previously specified, we click on the OK button.

Slide 34

Solving the problem with SPSS: Statistical comparison of sex - 1

The Explore procedure does not supply us with any information about central tendency for a nominal level variable. We will use the Crosstabs procedure to create a contingency table.

Slide 35

Generating more output for sex - 1

Select the Descriptive Statistics > Crosstabs… command from the Analyze menu.

Slide 36

Generating more output for sex - 2

First, move the variable for the characteristic we want to analyze, sex, to the Row(s) list box.

Second, move the group variable, xmovie, to the Column(s) list box.

Third, click on the Cells button to specify what will appear in the cells of the crosstabulated table.

To keep from confusing myself, I always create crosstabs tables with the grouping variable (independent variable) in the columns and the characteristic variable (dependent variable) in the rows. The mode for each group will be on the row that has the largest percentage within the column.

Slide 37

Generating more output for sex - 3

Accept the default Observed check box so that the table contains the tally of cases in each cell.

Mark the check box for Column percentages. The cell with the largest percentage in each column is the mode for the group specified in the column, and the percentage in that cell is the modal percent.

Click on the Continue button to close the dialog box.

Slide 38

Generating more output for sex - 4

Click on the OK button to request the output.

Slide 39

Solving the problem with SPSS: Statistical comparison of sex - 1

The mode for subjects who saw an x-rated movie is “1 MALE” because the largest percentage (69.4%) in the “1 YES” column is located on that row.

The mode for subjects who did not see an x-rated movie is “2 FEMALE” because the largest percentage (68.4%) in the “0 NO” column is located on that row.

The statement that survey respondents who had not seen an x-rated movie in the last year were most likely to have been female (68.4%), while survey respondents who had seen an x-rated movie in the last year were most likely to have been male (69.4%) is correct.
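Reading the mode and modal percent from column percentages can be sketched as below. The cell counts are an assumption: they are back-calculated from the reported percentages and the group sizes of 136 and 49, on the further assumption that no case is missing on sex. The helper name `modes_by_group` is hypothetical.

```python
from collections import Counter


def modes_by_group(pairs):
    """pairs: iterable of (group, category).
    Returns {group: (modal category, modal percent within the group)}."""
    by_group = {}
    for group, category in pairs:
        by_group.setdefault(group, Counter())[category] += 1
    result = {}
    for group, counts in by_group.items():
        category, n = counts.most_common(1)[0]  # largest cell in the column
        result[group] = (category, round(100 * n / sum(counts.values()), 1))
    return result


# Assumed cell counts, chosen to be consistent with 68.4% and 69.4%.
cases = (
    [("NO", "FEMALE")] * 93
    + [("NO", "MALE")] * 43
    + [("YES", "MALE")] * 34
    + [("YES", "FEMALE")] * 15
)
modes = modes_by_group(cases)
print(modes)
```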

Slide 40

Generating the output for political views - 1

After selecting Explore from the Dialog Recall menu, remove the variable sex from the Dependent List and move the variable polviews to the Dependent List.

Since we only want to change the variable being analyzed and keep all of the options we previously specified, we click on the OK button.

Slide 41

Solving the problem with SPSS: Statistical comparison of political views - 1

Since polviews is an ordinal level variable, we identify the medians and interquartile ranges for the two groups. We will compare these as our measures of central tendency and variability, provided that there are not excessive ties in either group for the medians of 4.00.

Slide 42

Solving the problem with SPSS: Statistical comparison of political views - 2

Since the first quartile is the same as the 25th percentile and the third quartile is the 75th percentile, we can use the Percentiles table to detect excessive ties.

We will use the row for Weighted Average (the SPSS default) as the calculation for percentiles. Tukey’s hinges are the percentiles used for box plots and may differ from other calculations for percentile.

The first quartile for the group which saw an x-rated movie, 4.00, is the same value as the median for the group, 4.00, indicating excessive ties. The mode should be used as the measure of central tendency rather than the median.
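The excessive-ties check amounts to comparing the median with the two quartiles. A minimal sketch follows; the scores are hypothetical, and `method="exclusive"` is assumed to approximate the Weighted Average percentiles in the SPSS table.

```python
from statistics import quantiles


def has_excessive_ties(values):
    """True when the median equals the first or the third quartile."""
    q1, median, q3 = quantiles(values, n=4, method="exclusive")
    return median == q1 or median == q3


# Hypothetical 7-point scale responses piled up on the middle category.
tied = [2, 3, 4, 4, 4, 4, 4, 4, 5, 6, 6]
# Hypothetical responses spread more evenly across the scale.
spread = [1, 2, 2, 3, 4, 4, 4, 5, 6, 7, 7]
print(has_excessive_ties(tied), has_excessive_ties(spread))
```

When the check returns True for either group, the mode would be reported instead of the median.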

Slide 43

Generating more output for political views - 1

To generate the mode for political views, we use the Crosstabs procedure, which we used for sex. Click on the Dialog Recall tool button and select Crosstabs from the drop-down menu.

Slide 44

Generating more output for political views - 2

Remove the variable sex from the Row(s) list box and move the variable polviews to the Row(s) list box.

Since we only want to change the variable being analyzed and keep all of the options we previously specified, we click on the OK button.

Slide 45

Solving the problem with SPSS: Statistical comparison of political views - 1

The statement that survey respondents who had not seen an x-rated movie in the last year were most likely to describe their political views as moderate (41.7%), as were survey respondents who had seen an x-rated movie in the last year (37.8%) is correct.

The mode for subjects who saw an x-rated movie is “4 MODERATE” because the largest percentage (37.8%) in the “1 YES” column is located on that row.

The mode for subjects who did not see an x-rated movie is “4 MODERATE” because the largest percentage (41.7%) in the “0 NO” column is located on that row.

Slide 46

Generating output for church attendance - 1

After selecting Explore from the Dialog Recall menu, remove the variable polviews from the Dependent List and move the variable attend to the Dependent List.

Since we only want to change the variable being analyzed and keep all of the options we previously specified, we click on the OK button.

Slide 47

Solving the problem with SPSS: Comparison of church attendance - 1

Since attend is an ordinal level variable, we identify the medians and interquartile ranges for the two groups. We will compare these as our measures of central tendency and variability, provided that there are not excessive ties in either group for the medians of 3.00 and 2.00.

Slide 48

Solving the problem with SPSS: Comparison of church attendance - 2

Since the first quartile is the same as the 25th percentile and the third quartile is the 75th percentile, we can use the Percentiles table to detect excessive ties.

The statement that survey respondents who had not seen an x-rated movie in the last year attended religious services more often (Mdn = 3.00) than survey respondents who had seen an x-rated movie in the last year (Mdn = 2.00) is correct. The statement that the scores for frequency of attendance at religious services varied by the same amount for survey respondents who had not seen an x-rated movie in the last year (IQR = 5.00) compared to the scores for survey respondents who had seen an x-rated movie in the last year (IQR = 5.00) is also correct.

Neither the first quartile nor the third quartile matches the median for either group, so we will report the median and interquartile range.

Since all of the reported statistics were correctly chosen and reported, the answer to the overall problem is True.

Slide 49

Logic for homework problems: Comparing central tendency and variability - 1

Flowchart:
Number of valid and missing cases correct?
No: False
Yes: Measurement level of variable? (Interval, Ordinal, or Nominal (dichotomous))

The logic for the problems in this assignment is the same as the logic for central tendency and variability except for the requirement at the end that the comparative statement must be correct as well as the reported statistical values.
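The choice of statistics that this logic relies on could be sketched as a small function. This is an illustration under assumptions, not part of the course materials: the function name, argument names, and level labels are hypothetical, and pairing the mode with the modal percent for an ordinal variable with excessive ties follows the worked example for political views rather than an explicit rule.

```python
def appropriate_measures(level, skewness=None, excessive_ties=False):
    """Return (central tendency, variability) appropriate for a variable.

    level: "interval", "ordinal", or "nominal".
    skewness: list of skewness values, one per group (interval only).
    excessive_ties: True if any group's median equals Q1 or Q3 (ordinal only).
    """
    if level == "interval":
        if not skewness:
            raise ValueError("skewness values are required for interval variables")
        # Skewed means skewness is not between -1.0 and +1.0 for some group.
        if all(-1.0 <= value <= 1.0 for value in skewness):
            return ("mean", "standard deviation")
        return ("median", "interquartile range")
    if level == "ordinal":
        if excessive_ties:
            return ("mode", "modal percent")  # assumption based on the polviews example
        return ("median", "interquartile range")
    if level == "nominal":
        return ("mode", "modal percent")
    raise ValueError(f"unknown measurement level: {level!r}")


# Values from the worked examples in this problem.
educ = appropriate_measures("interval", skewness=[-0.33, 0.45])
polviews = appropriate_measures("ordinal", excessive_ties=True)
attend = appropriate_measures("ordinal", excessive_ties=False)
sex = appropriate_measures("nominal")
print(educ, polviews, attend, sex)
```

A homework statement would then be judged true only if it reports these measures, the values are correct, and the comparison between groups is stated in the right direction.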

Slide 50

Logic for homework problems: Comparing central tendency and variability - 2

Interval/ratio

Flowchart:
Skewed?
No: Mean/St.Dev. reported? No: False. Yes: Correct values and comparison?
Yes: Median/IQR reported? No: False. Yes: Correct values and comparison?
Correct values and comparison? Yes: True. No: False.

A variable is skewed if its skewness is not between -1.0 and +1.0.

Mode reported? Mode is legitimate for interval variables, but not meaningful unless values are grouped.

Homework problems do not include modes for interval variables.

Slide 51

Logic for homework problems: Comparing central tendency and variability - 3

Ordinal

Flowchart decision points:
Mean/St.Dev. reported? Yes: Inappropriate application of a statistic.
Median/IQR reported?
Excessive ties?
Mode reported?
Correct values and comparison? Yes: True. No: False.

Excessive ties occur when the median is equal to either the lower or upper bound of the IQR.

Slide 52

Logic for homework problems: Comparing central tendency and variability - 4

Nominal (dichotomous)

Flowchart decision points:
Mode reported?
Median/IQR reported? Yes: Inappropriate application of a statistic.
Mean/St.Dev. reported? Yes: Inappropriate application of a statistic.
Correct values and comparison? Yes: True. No: False.