Intro to Statistics and SPSS

Upload: albert-lang

Post on 30-Dec-2015


• Mean (average)

• Median – the middle score once the scores are sorted (with an odd number of scores, the single middle one; with an even number of scores, the average of the two middle ones)

• Percent Rank (percentile) – gives the position of a datapoint in a data set. More precisely, it tells you approximately what percent of the data is less than the datapoint.

• Range – difference between the maximum and minimum values in the data set
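As a rough illustration of these four measures, here is a minimal Python sketch using the Big Bank waiting times from the bank example in these slides. The helper name `percent_rank` is hypothetical, and it uses the simple "percent of values strictly below" definition given above. Spreadsheets and statistics packages may use slightly different formulas.

```python
import statistics

# Big Bank waiting times from the bank example in these slides
big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]


def percent_rank(data, value):
    """Percent of data points strictly less than value (hypothetical helper)."""
    below = sum(1 for x in data if x < value)
    return 100.0 * below / len(data)


mean = statistics.mean(big_bank)
median = statistics.median(big_bank)          # middle (6th) of the 11 sorted values
data_range = max(big_bank) - min(big_bank)    # maximum minus minimum
rank_of_8_5 = percent_rank(big_bank, 8.5)     # 8 of the 11 values are below 8.5

print("mean:", round(mean, 2))
print("median:", median)
print("range:", round(data_range, 1))
print("percent rank of 8.5:", round(rank_of_8_5, 1))
```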


Lower quartile – or first quartile, it is the median of the data values in the lower half of a data set

Middle quartile – or second quartile, this is the overall median

Upper quartile – or third quartile, it is the median of the data values in the upper half of a data set

Quartiles may help in seeing the variation in a data set


For example (bank waiting times):

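Big Bank: 4.1 5.2 5.6 6.2 6.7 7.2 7.7 7.7 8.5 9.3 11.0

Best Bank: 6.6 6.7 6.7 6.9 7.1 7.2 7.3 7.4 7.7 7.8 7.8

In each sorted list of 11 values, the lower quartile is the 3rd value, the median is the 6th value, and the upper quartile is the 9th value: 5.6, 7.2, 8.5 for Big Bank and 6.7, 7.2, 7.7 for Best Bank.

Big Bank range: 11.0 – 4.1 = 6.9

Best Bank range: 7.8 – 6.6 = 1.2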
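The "median of each half" definition of the quartiles can be sketched in Python as follows. The function name `quartiles_by_halves` is hypothetical, and the sketch assumes that when the number of values is odd, the overall median is left out of both halves.

```python
import statistics

big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]


def quartiles_by_halves(data):
    """Return (lower quartile, median, upper quartile).

    Assumption: with an odd number of values, the overall median is
    excluded from both the lower half and the upper half.
    """
    if len(data) < 2:
        raise ValueError("need at least two values")
    values = sorted(data)
    half = len(values) // 2
    lower_half = values[:half]
    upper_half = values[-half:]
    return (
        statistics.median(lower_half),
        statistics.median(values),
        statistics.median(upper_half),
    )


for name, data in (("Big Bank", big_bank), ("Best Bank", best_bank)):
    q1, q2, q3 = quartiles_by_halves(data)
    spread = round(max(data) - min(data), 1)
    print(name, "quartiles:", q1, q2, q3, "range:", spread)
```

Both banks share the same median, but Big Bank's quartiles and range are much further apart, which is the variation the quartiles are meant to reveal.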


The five number summary consists of:

◦ The minimum value

◦ The lower quartile (first quartile)

◦ The median (second quartile)

◦ The upper quartile (third quartile)

◦ The maximum value

In SPSS (sold for a time under the name PASW), the output labels the first quartile as the 25th percentile, the second quartile as the 50th percentile, and the third quartile as the 75th percentile
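To see the quartiles as 25th, 50th and 75th percentiles outside SPSS, here is a small Python sketch. It relies on `statistics.quantiles` from the standard library (Python 3.8+) with its default "exclusive" method. That is one of several percentile conventions, so on other data sets different tools may report slightly different quartiles. The function name `five_number_summary` is hypothetical.

```python
import statistics

big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]


def five_number_summary(data):
    """Return (minimum, 25th, 50th, 75th percentile, maximum)."""
    # n=4 asks for the three cut points that split the data into quarters
    p25, p50, p75 = statistics.quantiles(data, n=4)
    return min(data), p25, p50, p75, max(data)


big_summary = five_number_summary(big_bank)
best_summary = five_number_summary(best_bank)
print("Big Bank: ", [round(v, 2) for v in big_summary])
print("Best Bank:", [round(v, 2) for v in best_summary])
```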


Quartiles are OK for characterizing data, but standard deviation is preferred by statisticians

It is a measure of how far data values are spread around the mean of a data set

Std dev = sqrt( sum of (deviations from the mean)^2 / (total number of data values – 1) )

In symbols: $s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Don’t calculate by hand, use SPSS (which we’ll do in a few minutes)
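For readers who want to see the formula at work before opening SPSS, the following Python sketch applies it step by step to the two bank data sets from the earlier example and compares the result with the standard library's `statistics.stdev`. The function name `sample_std_dev` is hypothetical.

```python
import math
import statistics

big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]


def sample_std_dev(data):
    """sqrt( sum of squared deviations from the mean / (n - 1) )."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = [(x - mean) ** 2 for x in data]
    return math.sqrt(sum(squared_deviations) / (n - 1))


for name, data in (("Big Bank", big_bank), ("Best Bank", best_bank)):
    by_formula = sample_std_dev(data)
    by_library = statistics.stdev(data)  # same n - 1 definition
    print(name, round(by_formula, 2), round(by_library, 2))
```

Rounded to two decimals, the results are 1.96 for Big Bank and 0.44 for Best Bank.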


• A simple way to estimate the standard deviation is the range rule of thumb

• Divide the range by 4

• Watch for outliers. They can ruin your range estimate

• What is an outlier? Two or more standard deviations from the mean (plus OR minus)


• Go back to Big Bank / Best Bank example

• Big Bank: range = 6.9

• 6.9 / 4 = 1.7

• Actual standard deviation is 1.96

• Best Bank: range = 1.2

• 1.2 / 4 = 0.3

• Actual standard deviation is 0.44

• Any outliers? Both means are 7.2

Big Bank: 4.1 5.2 5.6 6.2 6.7 7.2 7.7 7.7 8.5 9.3 11.0

Best Bank: 6.6 6.7 6.7 6.9 7.1 7.2 7.3 7.4 7.7 7.8 7.8
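The estimate and the outlier question can be checked with a short Python sketch. The function names `range_estimate` and `find_outliers` are hypothetical, and the outlier test uses the rule from these slides (two or more standard deviations from the mean, in either direction), which is only one of several conventions.

```python
import statistics

big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]


def range_estimate(data):
    """Estimate the standard deviation as range / 4."""
    return (max(data) - min(data)) / 4


def find_outliers(data, threshold=2.0):
    """Values at least `threshold` standard deviations from the mean."""
    mean = statistics.mean(data)
    std_dev = statistics.stdev(data)
    return [x for x in data if abs(x - mean) >= threshold * std_dev]


for name, data in (("Big Bank", big_bank), ("Best Bank", best_bank)):
    print(
        name,
        "estimate:", round(range_estimate(data), 1),
        "actual:", round(statistics.stdev(data), 2),
        "outliers:", find_outliers(data),
    )
```

On these two data sets no value lies two or more standard deviations from its mean, so by this rule neither bank has an outlier.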


A histogram is a nice way to view a data set

A histogram is a chart similar to a dotplot created by defining a set of bins and counting how many data points lie in each bin. Bars are drawn with height proportional to the number of data points in each bin.
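The binning and counting step can be sketched in a few lines of Python. The bin edges used here (start at 4.0, width 2.0, four bins) are an assumption chosen for the Big Bank data, and `histogram_counts` is a hypothetical helper name.

```python
big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]


def histogram_counts(data, start, width, n_bins):
    """Count how many data points fall in each bin [low, low + width).

    Values outside the covered interval are ignored.
    """
    counts = [0] * n_bins
    for x in data:
        index = int((x - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


start, width, n_bins = 4.0, 2.0, 4  # assumed bins: 4-6, 6-8, 8-10, 10-12
counts = histogram_counts(big_bank, start, width, n_bins)

# Draw each bar with a length proportional to its count
for i, count in enumerate(counts):
    low = start + i * width
    print(f"{low:4.1f} to {low + width:4.1f} | " + "#" * count)
```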


While Excel can do some basic statistics, it is not considered a serious statistics tool

You really should use something like SPSS or SAS

We’ll use SPSS since DePaul has a site license


• Copy the dataset Grades.xls from the QRC website (OlderData) to My Documents and start SPSS (or try the file IncomeGaps.xls)

• Using SPSS, open the Grades.xls spreadsheet

• Change the variable names and make sure the data is numeric, not text

• Click on Analyze -> Descriptive Statistics -> Frequencies


Be careful! If the numeric fields in the dataset have any $, % or #, SPSS will have difficulty converting these to numeric

In particular, if the data has dollar signs, have SPSS first convert the field to Dollar, then convert it to Numeric (IncomeGaps.xls)
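The underlying problem is that a value such as a dollar amount arrives as text rather than as a number. As an illustration only (this is not how SPSS converts fields, and the sample strings below are made up), the same cleanup could be sketched in Python like this, with `to_number` as a hypothetical helper:

```python
def to_number(text):
    """Strip $, %, #, commas and surrounding spaces, then convert to float."""
    cleaned = text.strip()
    for symbol in "$%#,":
        cleaned = cleaned.replace(symbol, "")
    return float(cleaned.strip())


# Made-up sample values, for illustration only
samples = ["$45,300", "12%", "# 7", " 88.5 "]
numbers = [to_number(s) for s in samples]
print(numbers)
```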


Using the grades for Exam 2, find the:

◦ 5 number summary (minimum, 1st quartile, median, 3rd quartile, maximum)

◦ Mean

◦ Range

◦ Standard deviation


Let’s say you have just performed a survey. One of the questions you ask is: what type of home computer Internet connection do you have?

Answers can be: none, dial-up, DSL, cable, other, not sure.


Here are some of your results

Respondent ID    Cable Type
11111            no
11112            ds
11113            cm
11114            dk
11115            du
11116            du

Where no = none; ds = dsl; cm = cable modem; du = dial up; dk = don’t know; ot = other


• You can use SPSS to count the occurrences of data items, just like a pivot table

• Enter your data into SPSS

• Click on Analyze / Descriptive Statistics / Frequencies

• Move the variable that you want to count from the left box to the right box

• Make sure Display Frequencies Table is checked

• Run it
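What the frequencies table computes can be sketched in Python with `collections.Counter`, using the six sample survey responses from these slides. The variable names are hypothetical, and the output layout only loosely resembles an SPSS frequency table.

```python
from collections import Counter

# (respondent ID, connection code) pairs from the sample results
responses = [
    (11111, "no"), (11112, "ds"), (11113, "cm"),
    (11114, "dk"), (11115, "du"), (11116, "du"),
]

# Code legend from the slides
labels = {
    "no": "none", "ds": "dsl", "cm": "cable modem",
    "du": "dial up", "dk": "don't know", "ot": "other",
}

# Count how often each code occurs
counts = Counter(code for _, code in responses)

for code, count in counts.most_common():
    percent = 100.0 * count / len(responses)
    print(f"{labels[code]:<12} {count:>3} {percent:6.1f}%")
```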


Crosstabs are an extension of pivot tables

Let’s say you have asked a number of students: How many schools did you apply to?

You get results something like the following (in a spreadsheet):

Respondent ID    Sex    Number Schools
1                F      2
2                M      6
3                F      1
4                F      4
5                M      9
6                M      10
7                F      3
8                F      2
9                F      7
10               M      5


Now open the data in SPSS

Then pull down the menu Analyze and click on Descriptive Statistics, then Crosstabs

What variable do you want in the row? The column?

When ready, click OK to perform the crosstab.
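To make the idea of a crosstab concrete, here is a minimal Python sketch that counts the ten sample responses by sex (rows) and number of schools (columns). The choice of rows and columns is an assumption made for illustration, and the printed layout is not SPSS output.

```python
from collections import Counter

# (respondent ID, sex, number of schools) from the sample spreadsheet
rows = [
    (1, "F", 2), (2, "M", 6), (3, "F", 1), (4, "F", 4), (5, "M", 9),
    (6, "M", 10), (7, "F", 3), (8, "F", 2), (9, "F", 7), (10, "M", 5),
]

# Each cell counts the respondents with a given (sex, schools) combination
cells = Counter((sex, schools) for _, sex, schools in rows)

sexes = sorted({sex for _, sex, _ in rows})
school_values = sorted({schools for _, _, schools in rows})

# Header row, then one row per sex with a total at the end
print("Sex  " + " ".join(f"{n:>3}" for n in school_values) + "  Total")
for sex in sexes:
    row_counts = [cells[(sex, n)] for n in school_values]
    print(f"{sex:<4} " + " ".join(f"{c:>3}" for c in row_counts)
          + f"  {sum(row_counts):>5}")
```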

Let’s do the activity.
