SEEING IS BELIEVING: Telling stories with statistics in pictures

Post on 22-May-2020

Page 1:

SEEING IS BELIEVING: Telling stories with statistics – in pictures

Page 2:

We’re failing

Page 3:

Do you see the same thing here?

            Gender
Military    Male    Female
--------    ----    ------
No           943     1,222
Yes          227        72

Page 4:

This is your brain on statistics

            Gender
Military    Male    Female
--------    ----    ------
No           943     1,222
Yes          227        72

Page 5:

The total sample is (roughly) evenly divided by gender.

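About 300 of the respondents served in the military, so an even split predicts roughly 150 women. Subtracting the observed 72 from that 150 gives a value of about 80, which squared is 6,400.

Divided by the expected 150, that one cell alone contributes more than 40 to the chi-square. It is already obvious this is significant.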

Page 6:

Just for closure ...

e       o       (e-o)^2    (e-o)^2/e
157     72      7225       46.01910828
142     227     7225       50.88028169
1028    943     7225       7.028210117
1137    1222    7225       6.354441513
                Sum        110.2820416
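
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same chi-square calculation. It is not part of the original slides; the function and variable names are made up for illustration. It keeps the expected counts unrounded, so the statistic comes out close to, but not exactly, the 110.28 above, which was computed from rounded expected counts.

```python
# Observed counts from the slide: rows are military service, columns are gender.
observed = {
    ("No", "Male"): 943, ("No", "Female"): 1222,
    ("Yes", "Male"): 227, ("Yes", "Female"): 72,
}

def chi_square(observed):
    """Return (expected counts, Pearson chi-square) for a two-way table."""
    rows = sorted({r for r, _ in observed})
    cols = sorted({c for _, c in observed})
    total = sum(observed.values())
    row_tot = {r: sum(observed[(r, c)] for c in cols) for r in rows}
    col_tot = {c: sum(observed[(r, c)] for r in rows) for c in cols}
    # Expected count under independence: row total * column total / grand total.
    expected = {(r, c): row_tot[r] * col_tot[c] / total
                for r in rows for c in cols}
    stat = sum((expected[k] - observed[k]) ** 2 / expected[k] for k in observed)
    return expected, stat

expected, stat = chi_square(observed)
for cell in sorted(observed):
    print(cell, "e =", round(expected[cell]), "o =", observed[cell])
print("chi-square =", round(stat, 2))
```

Rounding the expected counts reproduces the e column of the table (157, 142, 1028, 1137).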

Page 7:

Seeing is a learned skill

Statisticians may see things in a picture others don’t

Page 8:

My points

(surprisingly, I do have some)

Page 9:

Data Visualization

Graphics do not necessarily stand alone

Page 10:

Data visualization is all around us.

Page 11:

Visual representation in one context is often misapplied to another.

Page 12:

Atomic numbers on your socks?

Page 13:

Data visualization needs to ADD information

Page 14:

Basic Assumptions

• Our audience needs to be taught to read visual data just as we read numeric data, and we need to learn to have some discussion beyond the choices of line graphs vs. pie charts

Page 15:

YOU NEED TO LEARN TO WRITE PICTURES

You learned to read numbers

Or, to be more specific, you need to explain to others what you see in pictures

Page 16:

?

Question + Data > Picture = Story

Page 17:

Bad visualization for one question can be good for another

• Who will win the election?

• Which regions support the Democrats?

Poll dataset did not include Hawaii or Alaska

Page 18:

DATA VISUALIZATION BY EXAMPLE

AN EXAMPLE OF PROGRAM EVALUATION

Page 19:

The government is smarter than you think

(No, I’m serious)

Page 20:

Was the program implemented as planned?

Page 21:

Was the program implemented as planned?

Page 22:
Page 23:

Was the program implemented as planned?

Page 24:

Did the program work?

Page 25:

GOPTIONS HBY = 2 ;
/* UNIFORM keeps the same axis scaling across BY groups */
PROC GPLOT DATA = wussexample UNIFORM ;
  PLOT z_total_post * z_total_pre / VREF = 0 ;
  BY group ;
RUN ;

Page 26:

EQUATIONS IN THE SAS LOG FOR THE STATISTICIAN IN YOU

NOTE: Regression equation : z_total_post = 0.13379 + 0.776552*z_total_pre.

NOTE: The above message was for the following BY group: group=CONTROL

NOTE: Regression equation : z_total_post = 1.233616 + 0.578418*z_total_pre.

NOTE: The above message was for the following BY group: group=EXPERIMENTAL
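
To put the two equations into words, here is a small Python sketch (not from the original slides; the names are invented for illustration) that plugs pretest z-scores into each line and finds where the lines would cross. The only inputs are the coefficients from the log notes above.

```python
# Coefficients (intercept, slope) copied from the SAS log notes.
CONTROL = (0.13379, 0.776552)
EXPERIMENTAL = (1.233616, 0.578418)

def predict(coefs, z_pre):
    """Predicted posttest z-score for a given pretest z-score."""
    intercept, slope = coefs
    return intercept + slope * z_pre

def crossover(a, b):
    """Pretest z-score at which two regression lines predict the same posttest."""
    return (b[0] - a[0]) / (a[1] - b[1])

# Gap between groups for someone who was exactly average at pretest.
gap_at_mean = predict(EXPERIMENTAL, 0.0) - predict(CONTROL, 0.0)
cross = crossover(CONTROL, EXPERIMENTAL)

print("gap at pretest z = 0:", round(gap_at_mean, 3))
print("lines cross at pretest z =", round(cross, 2))
```

Under these equations an average pretest scorer in the experimental group is predicted to finish about 1.1 standard deviations ahead of the control group, and the lines do not cross until a pretest z-score above 5.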

Page 27:

Same plot in JMP

Page 28:

Is the intervention successful under all conditions?

Page 29:

TRAINING WAS ADMINISTERED TO FOUR COHORTS

Admittedly, we did not train people while flying on a trapeze

Page 30:
Page 31:

Creating the interaction graph

First, in the RESULTS window, type

sgedit on

Page 32:

Creating the interaction graph

First, in the RESULTS window, type

sgedit on

ods listing sge = on ;
ods graphics on ;
proc glm data = plots ;
  class TestType cohort ;
  model z_total = TestType cohort TestType*cohort ;
  where group = "EXPERIMENTAL" ;
run ;
quit ;

Page 33:

Click on the sge plot to edit it

Page 34:

ODDLY, THE MOST TIME-CONSUMING PART OF THIS IS MAKING THE LINES THICKER

Of course, that is kind of like being the smaller midget

Page 35:

Using SGEDIT to, well, edit

1. Double-click on the .sge file in the RESULTS window

2. Right-click in the plot area & select PLOT PROPERTIES

3. Select desired line thickness

Page 36:

THANKS FOR ASKING!

Yes, the TestType*Cohort*Group interaction (F=5.84, p < .0001) AND the TestType*Group interaction (F=22.92, p < .0001) in the other repeated measures ANOVA were significant.

Page 37:

LOOKING AT THE LITTLE PICTURE

Page 38:

(Especially true for small samples)

Page 39:
Page 40:

How does our screening test work?

R-square = .05

Page 41:

Don’t be too hasty

Page 42:

Look!

Page 43:

Another example

• Years of Education as predictor of gain score

• R-square = .46

• Correlation = .68

• p < .01

Page 44:

Now looky here …

Is it a real relationship?

Page 45:

What should we do?

Throw the score out?

Keep the score in?

Something else?

Page 46:

Ignoring my partner …

Compare your answers with the people next to you

Page 47:

Sometimes outliers are the most interesting part of your study
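
To see how much one score can matter in a small sample, here is a short Python sketch. The data are invented for illustration and are not the study's data; the helper name is also made up. It computes the Pearson correlation with and without a single extreme point.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: years of education and gain scores for six people.
# The sixth person is the outlier.
education = [10, 11, 12, 13, 14, 20]
gain = [3, 1, 2, 1, 3, 12]

r_all = pearson_r(education, gain)
r_without_last = pearson_r(education[:-1], gain[:-1])

print("r with all six points:", round(r_all, 2))
print("r without the outlier:", round(r_without_last, 2))
```

In this made-up sample the first five points show no linear relationship at all, yet adding the sixth pushes the correlation to roughly .9. Whether that sixth person is an error or the most interesting case in the study is a question the number alone cannot answer.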

Page 48:
Page 49:

PROC CORR

Page 50:

One last example on knowing your data

Not just telling a story,

having a conversation

Page 51:

PROC FREQ

Page 52:

Custom Map-making

How to plot the largest category in a frequency distribution

Page 53:

1, 2, 3

1. PROC TABULATE -> output dataset

2. PROC FORMAT

3. PROC GMAP
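
The idea behind the three steps can be sketched in a few lines of Python. This is an illustration of the logic only, not a translation of the SAS procedures, and the responses below are invented rather than taken from the poll dataset: compute the percentage for each answer within each state, then label each state with its largest category so it can be colored on a map.

```python
from collections import Counter, defaultdict

# Hypothetical survey responses: (state, candidate choice).
responses = [
    ("CA", "Obama"), ("CA", "Obama"), ("CA", "McCain"),
    ("TX", "McCain"), ("TX", "McCain"), ("TX", "Obama"),
    ("OH", "Obama"), ("OH", "McCain"), ("OH", "Obama"), ("OH", "Obama"),
]

def row_percentages(responses):
    """Step 1: percentage of each choice within each state (row percent)."""
    counts = defaultdict(Counter)
    for state, choice in responses:
        counts[state][choice] += 1
    return {
        state: {choice: 100.0 * n / sum(c.values()) for choice, n in c.items()}
        for state, c in counts.items()
    }

def largest_category(pcts):
    """Steps 2 and 3: label each state with its largest category."""
    return {state: max(p, key=p.get) for state, p in pcts.items()}

pcts = row_percentages(responses)
winners = largest_category(pcts)
print(winners)
```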

Page 54:

DATA VISUALIZATION BY EXAMPLE

WHERE IS DEMOCRATIC SUPPORT BASED?

DATA VISUALIZATION IN POLITICAL SURVEYS

Page 55:

PROC TABULATE DATA = in.VOTE2008 OUT = SummaryVOTE2008 ;
  CLASS question3 state ;
  TABLE state, question3 * RowPctN ;
RUN ;

Page 56:

proc format ;
  /* 50 <- 100 means above 50 up to 100, so no value falls between the ranges */
  value vote
    50 <- 100 = "Obama"
    0 - 50 = "McCain" ;
run ;

Page 57:

PROC GMAP DATA = SummaryVOTE2008 MAP = maps.us ;
  ID state ;
  CHORO PctN_01 / DISCRETE LEGEND = LEGEND1 ;

Page 58:

PROC GMAP DATA = SummaryVOTE2008 MAP = maps.us ;
  ID state ;
  CHORO PctN_01 / DISCRETE LEGEND = LEGEND1 ;
  PATTERN1 C = red ;
  PATTERN2 C = blue ;
  FORMAT PctN_01 vote. ;
RUN ;
QUIT ;

Page 59:

PROC GMAP

CHORO PctN_01 / discrete LEGEND=LEGEND1 ;

FORMAT PctN_01 vote. ;

The CHORO statement uses the first observation for each state and ignores the others.

Page 60:
Page 61:

Does Race Matter?

Page 62:

PROC GMAP DATA = wuss MAP = maps.us ;
  ID state ;
  AREA vote2008 / DISCRETE STATISTIC = mean ;
  BLOCK pctmin / DISCRETE STATISTIC = mean ;
  FORMAT pctmin rangep. vote2008 voten. ;

Page 63:

The mean minority percentage in districts where Obama voters live is 21%, versus 13% for McCain voters

(t = 5.73, p < .0001)

Page 64:

The usefulness of visual data

With one statement, I can change the minority-percentage cutoff & re-run the chart

value rangep
  0 - 10 = "0 - 10%"
  10 <- 100 = "> 10%" ;

Page 65:

DATA VISUALIZATION BY EXAMPLE

Decision Trees, ROC & Lift Curves to Predict Military Service

Page 66:

Speaking of easy, interactive, graphics

JMP

Page 67:

How to get a SAS .xpt file into JMP, Step 1

File > Open

Page 68:

DECISION TREE

• ANALYZE > MODELING > PARTITION

• SELECT Y

• SELECT X VARIABLES

• Click on the SPLIT button

Page 69:
Page 70:

In JMP, use of training and testing datasets is REALLY easy

EXCLUDE 25% or 50% of the data and then re-run your analyses with the excluded sample
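
Outside JMP, the same hold-out idea can be sketched in Python as below. This is an illustration under assumptions, not what JMP does internally: the function name, the fixed seed, and the use of row numbers as stand-ins for records are all hypothetical.

```python
import random

def split_holdout(rows, holdout_fraction=0.25, seed=0):
    """Randomly set aside a fraction of rows for testing; return (training, testing)."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]

# Hypothetical dataset of 100 row numbers; exclude 25% as the testing sample.
training, testing = split_holdout(range(100), holdout_fraction=0.25)
print(len(training), "training rows,", len(testing), "testing rows")
```

Fit the model on the training rows, then check how well it predicts the testing rows it never saw.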

Page 71:

Receiver Operating Characteristic

Click on the red arrow at the top left of the Partition window; the pull-down options include ROC and Lift curves.
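
For readers who want to know what the ROC curve is actually plotting, here is a minimal Python sketch. The predicted probabilities and outcomes are invented for illustration, the function names are made up, and the code assumes no two scores are tied. Each point is the false positive rate and true positive rate obtained by lowering the cutoff one case at a time.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per cutoff.

    Assumes labels are 1 (event) or 0 (non-event) and that no two scores tie.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoid-rule area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical predicted probabilities of military service and actual outcomes.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
auc = area_under_curve(points)
print("area under the curve:", round(auc, 3))
```

A curve that hugs the top left corner (area near 1) means the model ranks people who served above people who did not; a curve along the diagonal (area near .5) means it does no better than guessing.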

Page 72:
Page 73:

Comparing models

Page 74:

A statistician is a person who was good at math but didn’t have enough personality to be an accountant?

Page 75:

It is important that people believe you

And that’s my story

Page 76:

AnnMaria De Mars

The Julia Group

2111 7th St #8

Santa Monica, CA 90405

[email protected]

(310) 717-9089