You’ve collected it … NOW WHAT? Exploratory data analysis for not-very-big data

Telling Stories with your data

Upload: audra-conley

Post on 29-Dec-2015


TRANSCRIPT

Page 1: You’ve collected it … NOW WHAT ? Exploratory data analysis for not-very-big data

Telling Stories with your data


Graphs, Tables and Basic, Basic Statistics with SAS Enterprise Guide®


You’ve collected it … NOW WHAT?

Exploratory data analysis for not-very-big data


Our sample

Data from the pilot study for Spirit Lake: The Game, an educational game for students in grades four through six


Our Enterprise Guide Project

A first look at the data, Filter data sets, Tables of descriptive statistics


Our Enterprise Guide Project

A second look

Cross-tabulations, Graphics, Summary tables, T-test


Let’s replace math class with a game … Incredibly, the schools went along with this!


Our sites

Two intervention schools, six classrooms

Three fourth-grade classes

Fifteen students (five each) from three fifth-grade classes

One control group school with one fourth-grade and one fifth-grade class

All on the same American Indian reservation


Exercise 1: ready

Figure 1.1

FILE > OPEN > DATA


Figure 1.2


Tools > Options

Figure 1.3


Results General > RTF under Result Formats

Figure 1.4


Great. You have data and you are set to have pretty results.

What now?


Tasks > Describe > Characterize Data

Figure 1.5

ALWAYS DO THIS !!


Just click through the windows and accept all of the defaults.


Especially this one (you’ll find out why shortly)


Real grown-up statisticians know ….

Look for errors in data

Look for missing data
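For readers working outside Enterprise Guide, here is a rough Python sketch of the kind of per-variable summary this first look relies on: how many values are present, how many are missing, and the range. The records and the characterize helper are made up for illustration; they are not the study data or a SAS routine.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
records = [
    {"gender": 1, "grade": 4},
    {"gender": 2, "grade": None},
    {"gender": None, "grade": 5},
    {"gender": 1, "grade": 4},
]

def characterize(rows, variable):
    """Return N, NMiss, Min, Mean and Max for one numeric variable."""
    values = [row[variable] for row in rows if row[variable] is not None]
    return {
        "N": len(values),
        "NMiss": len(rows) - len(values),
        "Min": min(values),
        "Mean": mean(values),
        "Max": max(values),
    }

for name in ("gender", "grade"):
    print(name, characterize(records, name))
```

A Min or Max outside the legal range for a variable is a quick sign of a data entry error, and NMiss tells you how much is missing before you run anything else.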


There is no teacher named “test”

This outlier with the perfect score wasn’t a student, it was the answer key!
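The two checks on this slide can be sketched in Python as below. The field names, the records and the assumed maximum score of 24 are all hypothetical; the point is only to flag placeholder teacher names and perfect scores for a human to review before anything is filtered out.

```python
# Hypothetical records; field names and values are made up for illustration.
records = [
    {"username": "s01", "teacher": "Smith", "score": 14},
    {"username": "s02", "teacher": "test", "score": 9},
    {"username": "key", "teacher": "Jones", "score": 24},
    {"username": "s03", "teacher": "Jones", "score": 17},
]
MAX_SCORE = 24  # assumed number of test items

def is_suspicious(row):
    """Flag placeholder teacher names and perfect scores for manual review."""
    return row["teacher"].strip().lower() == "test" or row["score"] == MAX_SCORE

flagged = [row for row in records if is_suspicious(row)]
clean = [row for row in records if not is_suspicious(row)]

print("review these:", [row["username"] for row in flagged])
print("kept:", [row["username"] for row in clean])
```

A perfect score is not an error by itself, so the flagged list is something to look at, not something to delete automatically.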


Filtering data


Exercise 2: set

Figure 2.1

OR……


Tasks > data > filter and sort

Figure 2.2


Double arrows select all of the variables.

Figure 2.3


Filter out records


Click on the …


Exercise 3: go!

Is 64% missing okay?


Exercise 3: go!

Table 3.1

Variable  Label   N   NMiss  Total  Min  Mean    Median  Max  StdMean
Gender    Gender  67  21     99     1    1.4776  1.0     2    0.06148

Table 3.2

Variable  Label  N   NMiss  Total  Min  Mean   Median  Max  StdMean
Grade     grade  83  5      362    4    4.314  4.0     5    0.05305

DATA


We naively believed that there would be little attrition over a 10-week period.

"He's just gone."


Lesson learned

Real data analysis is more than pointing and clicking; it’s learning to ask questions of the results you get and to learn from the answers to those questions.


Data visualization at its most basic

Figure 3.1 Figure 3.2

Take Note!


File > new note

Figure 3.3

Figure 3.4


Right-click on note to link to procedure that inspired your note

Figure 3.4


Exercise 4: Table analysis, looking a little closer

TASKS > DESCRIBE > TABLE ANALYSIS

Figure 4.1


CHANGING THE DATA SET


PULL DOWN TO SELECT DATA SET


Drag the variables, School and Grade, under the Table variables heading.


Drag “School” to top of the table as your column variable and “grade” to the side as your row variable.


Table of grade by School

Rows: grade(grade). Columns: School(School). Cell contents: Frequency, Col Pct

grade    CONTROL   EXPERIME   Total
4        15        38         53
         55.56     67.86
5        12        18         30
         44.44     32.14
Total    27        56         83

Frequency Missing = 5
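To see where the Col Pct numbers come from, here is a small Python sketch that rebuilds one record per student from the cell counts in the table above and recomputes the column percentages. The col_pct helper is my own name for illustration, not something Enterprise Guide exposes.

```python
from collections import Counter

# One (grade, school) pair per student, rebuilt from the cell counts in the table above.
cell_counts = {(4, "CONTROL"): 15, (4, "EXPERIME"): 38,
               (5, "CONTROL"): 12, (5, "EXPERIME"): 18}
students = [pair for pair, n in cell_counts.items() for _ in range(n)]

freq = Counter(students)                                # frequency per cell
col_totals = Counter(school for _, school in students)  # students per school

def col_pct(grade, school):
    """Percentage of a school's students who are in the given grade."""
    return round(100 * freq[(grade, school)] / col_totals[school], 2)

for grade in (4, 5):
    row = [(s, freq[(grade, s)], col_pct(grade, s)) for s in ("CONTROL", "EXPERIME")]
    print(grade, row)
```

Column percentages are the right choice here because the question is whether the grade mix differs between the two schools, and the schools are the columns.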


What about the students with complete data?

Is there still a disproportion by grade?


Right-click on the Table Analysis icon


Drag the variable “missdata” under Group Analysis By


Table of grade by School

Rows: grade(grade). Columns: School(School). Cell contents: Frequency, Col Pct

grade    CONTROL   EXPERIME   Total
4        4         10         14
         80.00     62.50
5        1         6          7
         20.00     37.50
Total    5         16         21

Frequency Missing = 5

TOO MUCH MISSING DATA MAKES PEOPLE GRUMPY


WHO ARE THESE PEOPLE MISSING DATA?


Figure 4.9


What Happened?

To ensure anonymity and protect student data, we never had the students' names; the teachers had a roster of students matched with usernames. The experimental school was able to fill in the blanks for all students missing grade data; the control group school did not.


What We Did About It

This isn't something we just waved our hands about before moving on. It seriously concerned us. The degree of missing data overall disturbed us, as did the fact that we didn't seem to be able to get follow-up data. It concerned us enough that we hired a data coordinator on each reservation where we are testing in the upcoming year; we will be analyzing the pretest data as it comes in and trying to update any missing data we can in the same week it is collected. This is the purpose of pilot studies: to find problems and fix them.


What is an item analysis and how is it helpful?

Two types of item analysis.

Examination of the distribution of responses: which choice (“a”, “b”, “c” or “d”) each student selected as the correct answer. (Look at the CHARACTERIZE DATA plots.)

Does one of the distractors get selected more often than the correct answer?
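The distractor question can also be checked with a simple count. The sketch below is a minimal Python illustration using made-up responses to a single item, with "b" assumed to be the keyed answer.

```python
from collections import Counter

# Hypothetical responses to one multiple-choice item; "b" is assumed to be the keyed answer.
responses = ["b", "c", "c", "a", "c", "b", "d", "c", "b", "c"]
correct = "b"

counts = Counter(responses)
top_choice, top_count = counts.most_common(1)[0]

# A distractor that beats the key suggests a miskeyed or misleading item.
distractor_wins = top_choice != correct and top_count > counts[correct]

print(dict(counts), "distractor beats key:", distractor_wins)
```

When a distractor wins, the next step is to read the item itself: the key may be wrong, or the wording may be steering students to the wrong choice.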


Item difficulty analysis

Examine what percentage of students answered each item correctly

A basic means of establishing test validity. One would expect items at the second-grade level to have the lowest level of difficulty and items at the fifth-grade level to have the highest difficulty, that is, to be answered correctly by the fewest students.

If items are scored 0 = wrong, 1 = right, you can use the item means to see what percentage of students answered each item correctly.
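Here is why the mean works: with 0/1 scoring, the sum counts the right answers, so the mean is the share of students who got the item right. A minimal Python sketch, using made-up scores and the item names sc1 to sc3 purely as labels:

```python
from statistics import mean

# Hypothetical scored responses: 1 = right, 0 = wrong, None = not answered.
scored = {
    "sc1": [1, 1, 0, 1, None],
    "sc2": [1, 0, 0, 1, 1],
    "sc3": [0, 0, 1, None, 0],
}

def proportion_correct(scores):
    """Mean of the 0/1 scores over students who answered the item."""
    answered = [s for s in scores if s is not None]
    return mean(answered)

difficulty = {item: proportion_correct(s) for item, s in scored.items()}

for item, p in difficulty.items():
    print(f"{item}: {p:.0%} correct")
```

Note that missing answers are left out of the denominator here, which matches reporting N and NMiss separately; counting them as wrong would be a different (and defensible) choice.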


EXERCISE 5: ITEM ANALYSIS TWO WAYS WITH THE CHARACTERIZE DATA TASK

Item Difficulty Analysis in Six (or fewer) Easy Steps

 

1. Click on the univariate statistics data set produced by the CHARACTERIZE DATA TASK to select it

2. From the top menu, select TASKS > DESCRIBE > LIST DATA


3. From the Variables to assign pane, select the ones you want in your report, in this case Variable, N, NMISS, Mean, Min and Max.

EDIT


4. Select the records you want in your report.


5. Format the columns in the report

Figure 5.4


I click the CHANGE button next to format.

Then I click on Numeric for the format category.


6. Next, click Options and un-check the box next to Row Numbers

Figure 5.7


Item      N   NMiss  Mean  Min  Max
postsc2   68  20     0.88  0    1
postsc3   68  20     0.87  0    1
postsc4   68  20     0.81  0    1
sc3       82  6      0.8   0    1
sc2       82  6      0.78  0    1
sc4       82  6      0.78  0    1
postsc18  68  20     0.74  0    1
postsc8   68  20     0.68  0    1
postsc1   68  20     0.65  0    1

COPIED AND PASTED INTO EXCEL & SORTED

What does this table tell you?


What the data tell us

Since items are in order of grade level, the first few items should be answered correctly by the most people.

Expected pattern holds for post-test and pre-test, although it's not perfect

A higher percentage of students answered the post-test questions correctly than the pretest, as we would hope

Unexpectedly, items 5 and 6 have some of the lowest percentage correct of any item
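The sorting and the pre/post comparison can be sketched in a few lines of Python. The means below are copied from the sorted table above; the pairing rule (a post-test item is "post" plus the pre-test item name) is my reading of the variable names, so treat it as an assumption.

```python
# Means copied from the table above (pre-test items are "scN", post-test items "postscN").
item_means = {
    "postsc2": 0.88, "postsc3": 0.87, "postsc4": 0.81,
    "sc3": 0.80, "sc2": 0.78, "sc4": 0.78,
    "postsc18": 0.74, "postsc8": 0.68, "postsc1": 0.65,
}

# Sort from easiest to hardest, as the Excel sort did.
ranked = sorted(item_means.items(), key=lambda pair: pair[1], reverse=True)

# Pair each pre-test item with its post-test counterpart where both are listed.
gains = {
    name: round(item_means["post" + name] - mean_pre, 2)
    for name, mean_pre in item_means.items()
    if not name.startswith("post") and "post" + name in item_means
}

print(ranked[0])
print(gains)
```

The post-test means are based on 68 students and the pre-test means on 82, so these gains are only a rough comparison across two different groups of respondents.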


EXERCISE 6: GRAPHING ITEM DIFFICULTY


(If you forget the Sum variable, you'll just get a chart that shows each item occurred in the data set once. Not very helpful.)


SELECT ONLY THE ITEMS OF INTEREST


Click Layout tab under Appearance and select Descending Bar Height


Figure 6.5


Item Difficulty Analysis: Post-Test


Compare with pre-test data

Right-click on the bar chart icon in process flow and select Modify.

Three modifications are needed:

Click on the EDIT button and change the filter. First click the X at the end of the row to delete the current filter. Then select “item” and “in a list” and the variables sc1 – sc24 for your items to chart.

Change the chart title.


The X axis must be the same, from 0 to 1, so you can compare the two charts.

How to set the X axis: click on Major Ticks and then, under Major Horizontal Ticks, click Specify. In the input box on the top right, enter each of the major ticks you want (0, .2 and so on up to 1) and click ADD.
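The reason for fixing the axis is easier to see in a toy version. This Python sketch draws two text bar charts on the same 0-to-1 scale, with major ticks every 0.2 and bars in descending order. The item means are the ones reported for items 2 to 4 earlier in this deck; the chart width and the bar_chart helper are arbitrary choices of mine.

```python
# Item means for items 2 to 4 as reported earlier in this deck.
pre_test = {"sc2": 0.78, "sc3": 0.80, "sc4": 0.78}
post_test = {"postsc2": 0.88, "postsc3": 0.87, "postsc4": 0.81}

WIDTH = 50  # characters that represent the full 0-to-1 axis (arbitrary)

# Major ticks from 0 to 1 in steps of 0.2, built from integers to avoid float drift.
major_ticks = [i / 5 for i in range(6)]

def bar_chart(title, means):
    """Return a text bar chart, sorted by descending bar height, on a fixed 0-1 axis."""
    lines = [title]
    for item, value in sorted(means.items(), key=lambda pair: pair[1], reverse=True):
        lines.append(f"{item:<10}|{'#' * round(value * WIDTH)}")
    return "\n".join(lines)

print(major_ticks)
print(bar_chart("Pre-test", pre_test))
print(bar_chart("Post-test", post_test))
```

Because both charts use the same WIDTH for a value of 1, a longer bar always means a higher proportion correct, whichever chart it is in. If each chart scaled to its own maximum, the bars could not be compared by eye.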


Pre-test

Post-test


Our story so far …

We've found one error, that the answer key was left in as a record.

We've seen that we have an issue with missing data that needs to be fixed

It appears that the test is reasonably reliable, although, of course, more sophisticated statistics are needed to examine that issue further.

We've also realized that we can't really compare the pretest and post-test since we have a large proportion of missing subjects. We need to match pre- and post-test scores.
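Matching pre- and post-test scores means keeping only the students who have both. A minimal Python sketch, assuming scores are keyed by username (the usernames and totals below are made up):

```python
# Hypothetical total scores keyed by username.
pre_scores = {"s01": 10, "s02": 14, "s03": 8, "s04": 12}
post_scores = {"s01": 13, "s03": 11, "s05": 15}

# Keep only students who appear in both files, so each pre score has its post score.
matched = {
    user: (pre_scores[user], post_scores[user])
    for user in sorted(pre_scores.keys() & post_scores.keys())
}

# Students with only one of the two tests; worth listing so you know who dropped out.
unmatched = sorted(pre_scores.keys() ^ post_scores.keys())

print(matched)
print(unmatched)
```

Listing the unmatched usernames alongside the matched pairs keeps the attrition visible instead of silently dropping it.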


EXERCISE 7: GETTING DOWN TO BUSINESS WITH T-TESTS

TASKS > ANOVA > T-test
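For a sense of what the task computes, here is a minimal Python sketch of a paired t statistic on matched pre/post scores. I am assuming a paired design (each student's post-test score compared with that student's own pre-test score); the scores are made up, and the sketch stops at the t statistic and degrees of freedom without a p-value.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical matched (pre, post) scores for one group of students.
pairs = [(10, 13), (8, 11), (12, 12), (9, 13), (11, 12)]

def paired_t(pairs):
    """Return (t statistic, degrees of freedom) for the mean post-minus-pre difference."""
    diffs = [post - pre for pre, post in pairs]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

t, df = paired_t(pairs)
print(f"t = {t:.2f}, df = {df}")
```

Running this once per group (control and experimental) would mirror the two sets of results reported on the following slides.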


Control Group Results


Table 7.1


Experimental Group Results


The moral of the story

Experimental group improved more

Students who played the game less improved less

Differences were not explained by outliers in either group

There was a definite shift in the distribution of the experimental group only


CONCLUSION

Exploratory data analysis is a key first step

A few simple tasks in SAS Enterprise Guide can go a surprisingly long way

While exploring your data, it’s crucial to note the concerns raised, follow-up questions and policy recommendations that come out of your analysis


Thank You

U.S. Department of Agriculture

Small Business Innovation Research

Rural Economic Development

Teachers and students of the Spirit Lake Dakota Nation


Contact

The Julia Group / 7 Generation Games

2111 7th St. #8

Santa Monica, CA 90405

(310) 717-9089

[email protected]

http://www.thejuliagroup.com

www.7generationgames.com