lecture 4w-interpretingregression

Post on 12-Apr-2017


Lecture 4W: Explaining Regression Results
Describing Results in Everyday English

Explaining Regression Results
•Some things are technical, precise
▫Everyone who does same command should get same table
•Other things more open to interpretation
▫Should we care about our results?

Regression Table: Tons of Info
•Covered regression coefficients Monday
•Will cover rest of output today

Picking Up From Monday
•Start by looking at bottom table
▫Column for coefficients usually first
•y = a + bx + E
•conrinc = -11679.16 + 1036.512 * prestg10 + E

Protip: Focus on IV More Than Constant

•Easy to overemphasize constant, particularly when it’s negative
▫Constant usually matters less

Describing Coefficients in Words
•Number under coeff for prestg10 (1036.512).
▫Slope of the line.
•If someone scores 1 point higher on occupational prestige, how much more money would we expect them to earn, given these results?

Explaining the 1036.512 Coefficient

•If someone scores 1 point higher on occupational prestige, how much more money would we expect them to earn, given these results?
▫We would expect them to earn 1 * 1036.512 = $1036.512 more per year.
•If Mary’s occupational prestige is 10 points higher than Jess’s, how much more would we expect her to earn?
▫10 * 1036.512 = $10365.12 more per year.
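The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration, not part of the lecture's output: the function name `expected_income_difference` is made up here, and the only number taken from the regression table is the slope, 1036.512.

```python
# Slope for prestg10, taken from the lecture's regression table.
SLOPE = 1036.512

def expected_income_difference(prestige_gap, slope=SLOPE):
    """Expected gap in annual income for a given gap in prestige points."""
    return prestige_gap * slope

# A one-point gap, then the ten-point gap between Mary and Jess.
one_point = expected_income_difference(1)
ten_points = expected_income_difference(10)
print(f"1 point:   ${one_point:.3f} more per year")
print(f"10 points: ${ten_points:.2f} more per year")
```

Rounding happens only in the `print` formatting, so the stored values stay unrounded until every calculation is done.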

A “One-Unit Increase”
•Common way to describe a regression coefficient in English.
▫A one unit increase in occupational prestige leads to a 1036.512 unit increase in annual income.
▫Better than saying “the regression coefficient is 1036.512.”
•Don’t round until every calculation is done

Why One Unit Increase vs. Ten?
•Start with one unit since it is the easiest to calculate.
•Only time you calculate just to show you can is homework problems.
▫When we talk about interpreting results and making arguments, up to you to pick a reasonable number of units as part of the argument

Predicting Scores
•We can also use this equation to predict how much money someone would make, based on their occupational prestige.
▫conrinc = -11679.16 + 1036.512 * prestg10 + E
•Assume occupational prestige = 50.
▫Income = -11679.16 + 1036.512 * 50 = $40146.44
•What about someone with 80 occ. prestige?
▫Income = -11679.16 + 1036.512 * 80 = $71241.80
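As a rough sketch, the same prediction can be written as a small Python function. The constant and slope come from the lecture's table. The function name `predict_income` is an assumption made for this example, and the error term E is left out because a prediction is the point on the regression line.

```python
# Coefficients from the lecture's regression of conrinc on prestg10.
CONSTANT = -11679.16
SLOPE = 1036.512

def predict_income(prestige, constant=CONSTANT, slope=SLOPE):
    """Predicted annual income on the regression line (error term omitted)."""
    return constant + slope * prestige

# The two prestige scores worked through on the slide.
for prestige in (50, 80):
    print(f"prestige {prestige}: predicted income ${predict_income(prestige):.2f}")
```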

Prediction and Exceptions
•Reminder: most people not exactly on the regression line
▫Exceptions do not invalidate the pattern

Constants can be weird
•Imagine using age to predict income. Toddlers may be predicted to have negative income.
▫It’s because there are no toddlers in our sample.
▫The constant may be nonsense if we never see x = 0 in our sample.

Statistical Significance in Regression

•Goal is to see whether change in independent variable leads to change in dependent variable
▫Is relationship relatively unlikely to appear just from random chance?

•Null hypothesis: regression coefficient = 0

Calculating Statistical Significance
•Each variable has its own standard error term.
•Use standard error to get a t statistic for each term.
▫We don’t care about constant though

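Computing SE for Regression Coeff.

$$V(a) = \frac{\sigma_\varepsilon^2 \sum x_i^2}{n \sum (x_i - \bar{x})^2} = \sigma_\varepsilon^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2} \right)$$

$$V(b) = \frac{\sigma_\varepsilon^2}{\sum (x_i - \bar{x})^2} = \frac{\sigma_\varepsilon^2}{(n - 1)\, s_x^2}$$

•Where $\sigma_\varepsilon^2$ is the variance in error term $\varepsilon_i$
•$s_x^2$ is the sample variance of x, $s_x$ is the sample standard deviation of x.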

SE Formula Implications
•In general, lower SE shows better estimates
▫A worse regression model means bigger error term, higher SE for any variable
▫Large N reduces SE
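To make the standard error and t statistic concrete, here is a minimal sketch in plain Python. It uses the textbook simple-regression result that the slope's variance is the error variance divided by the sum of squared deviations of x, with the error variance estimated from the residuals on n - 2 degrees of freedom. The function name `simple_ols` and the toy data are invented for illustration and are not the lecture's data, so the numbers will not match the table.

```python
import math

def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return a, b, SE(b), and t for b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)  # sum of squared deviations of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Estimated error variance, using n - 2 degrees of freedom.
    sigma2_hat = sum(e ** 2 for e in residuals) / (n - 2)
    se_b = math.sqrt(sigma2_hat / sxx)
    t_b = b / se_b  # t statistic for the null hypothesis that the slope is 0
    return a, b, se_b, t_b

# Made-up toy data (NOT from the lecture): prestige scores and incomes.
prestige = [20, 30, 40, 50, 60, 70]
income = [12000, 18000, 31000, 38000, 52000, 61000]

a, b, se_b, t_b = simple_ols(prestige, income)
print(f"a = {a:.2f}, b = {b:.2f}, SE(b) = {se_b:.2f}, t = {t_b:.2f}")
```

A noisier fit makes the residuals, and therefore SE(b), larger. More observations or more spread in x make the denominator larger and SE(b) smaller.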

P>|t| is p-value for a Variable
•Read across to get the appropriate p-value
•Would we reject the null hypothesis?
▫Yes, p < 0.05
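The decision rule can be sketched as follows. Statistical software normally gets the p-value from the t distribution. To stay within the standard library, this example uses the normal approximation instead, which is close only when the sample is large. The function names and the t statistics are hypothetical, not values from the regression table.

```python
import math

def two_sided_p_normal(t_stat):
    """Two-sided p-value for a t statistic, using the normal approximation."""
    return math.erfc(abs(t_stat) / math.sqrt(2))

def reject_null(p_value, alpha=0.05):
    """Reject the null hypothesis (coefficient = 0) when p < alpha."""
    return p_value < alpha

# Hypothetical t statistics, not taken from the lecture's table.
for t_stat in (0.5, 1.96, 4.0):
    p = two_sided_p_normal(t_stat)
    print(f"t = {t_stat}: p = {p:.4f}, reject null: {reject_null(p)}")
```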

What Does p-value Tell Us?
•A low p-value tells us a relationship this strong is unlikely to appear from random chance alone.
▫We can be very confident that people with higher prestige jobs tend to make more money.
•However, p-value does not tell us whether the relationship has any real world meaning.

Is the following important?
•If we survey everyone in LA, people born in January may make $10/year more than December babies.
▫With millions in the data set, p < 0.001
•But should we care about $10 a year?
•Common problem when people who know a little stats encounter “big data”
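A small sketch shows why a fixed $10 gap can become "significant" purely because the sample grows. It compares two group means with equal group sizes and uses the normal approximation for the p-value. The income standard deviation of 2000 is a hypothetical value chosen only to keep the numbers readable, and the function name `p_value_for_gap` is likewise invented for this example.

```python
import math

def p_value_for_gap(gap, sd, n_per_group):
    """Two-sided p-value for a gap between two group means (normal approx.)."""
    se = sd * math.sqrt(2 / n_per_group)  # SE of the difference, equal group sizes
    z = gap / se
    return math.erfc(abs(z) / math.sqrt(2))

GAP = 10   # the $10/year gap from the slide
SD = 2000  # hypothetical standard deviation of income, for illustration only

# Same gap, same spread; only the sample size changes.
for n in (1_000, 100_000, 1_000_000):
    print(f"n per group = {n:>9,}: p = {p_value_for_gap(GAP, SD, n):.4f}")
```

The effect size never changes across the rows, only the p-value does, which is why the p-value alone cannot say whether $10 a year is worth caring about.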

Statistical vs. Substantive Significance
•Ideally we want both.
•Statistical significance is based strictly on p-values.
•Substantive significance is based on our knowledge of the world. What is worth telling people about?
▫These judgments won’t come from a statistics class!
▫Often worth discussing substantively significant variables that don’t quite reach p < .05

Two Main Criteria for Substantive Significance

Effect Size
•Always need to explain regression coefficient in a sentence (or more).
•Is number large enough for us to care about relationship?
▫If not, need to offer a reasonable benchmark
•Is number nonsense?

Personal Interest
•Could be intellectual or personal interest
•May feel any statistically significant relationship is important

Balancing Effect Size and Interest

[Figure: diagram with axes “Interest in Variable/Relationship” and “Effect Size”; labels “Not Sig.” and “Significant”]

Common Problems: Strike Zone Metaphor

[Figure: strike zone diagram]
•Too low: not enough explanation
•Too high: of threshold for effect size
•Outside (either side): Too Much Spin
•JUST THROW STRIKES!

Describe our results
•Would you say the effect age has on income is substantively significant? Why or why not?

Is There Substantive Significance?
•Arguments for yes:
▫It is statistically significant and
▫Some may feel $573 more per year is enough
•Arguments for no:
▫Some may feel $573 is not enough.
▫It doesn’t make sense. People retire!
•Hold off claims about other variables

Note on r-squared
•We were initially scheduled to cover r-squared today, but I wanted to spend more time on substantive significance because it is the hardest concept.
•r-squared appears on HW 2 but will be pushed to Monday’s lecture. If necessary I will pare down other concepts.
