Economics 105: Statistics


Upload: flo

Post on 05-Jan-2016


TRANSCRIPT

Page 1: Economics 105: Statistics

Economics 105: Statistics
• GH 19 not due Thur
• RAP assignment … datasets to look at
• Find the “codebook” or “survey instrument” and look at the questions they asked

Page 2: Economics 105: Statistics

Brief Introduction to Research Design

Design Notation

Internal Validity

Experimental Design

Page 3: Economics 105: Statistics

Design Notation
• Observations or measures are indicated with an “O”
• Treatments or programs with an “X”
• Groups are shown by the number of rows
• Assignment to group is by “R,N,C”
– Random assignment to groups
– Nonequivalent assignment to groups
– Cutoff assignment to groups
• Time

Page 4: Economics 105: Statistics

Design Notation Example

R O1,2 X O1,2
R O1,2   O1,2

• R indicates the groups are randomly assigned.
• Os indicate different waves of measurement.
• Vertical alignment of Os shows that pretest and posttest are measured at the same time.
• X is the treatment.
• There are two lines, one for each group.
• Subscripts indicate subsets of measures.

Page 5: Economics 105: Statistics

Types of Designs
• Random assignment?
– Yes: Randomized (true experiment)
– No: ask whether there is a control group or multiple measures
• Control group or multiple measures?
– Yes: Quasi-experiment
– No: Nonexperiment

Page 6: Economics 105: Statistics

Non-Experimental Designs

• Post-test only (case study)
X O

• Single-group, pre-test, post-test
O X O

• Two-group, post-test only (static group comparison)
X O
  O

Page 7: Economics 105: Statistics

Experimental Designs

• Pretest-Posttest Randomized Experiment Design
R O1 X O1,2
R O1   O1,2
– If continuous measures, use t-test
– If categorical outcome, use chi-squared test

• Posttest only Randomized Experiment Design
R X O1,2
R   O1,2
– Less common due to lack of pretest
– Probabilistic equivalence between groups
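The comparison recommended for continuous measures can be sketched as follows. This is a minimal illustration, not the course's own procedure: the scores are simulated, every number (group size, means, the 5-point effect) is hypothetical, and Welch's form of the t statistic is an assumption, since the slide does not say which t-test variant is intended.

```python
import random
import statistics
from math import sqrt

def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples."""
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    standard_error = sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

rng = random.Random(105)  # fixed seed so the sketch is reproducible

# "R" in the design notation: shuffle a hypothetical pool, then split it.
subjects = list(range(200))
rng.shuffle(subjects)
treated, control = subjects[:100], subjects[100:]

# Hypothetical posttest scores; the program adds 5 points on average.
TRUE_EFFECT = 5.0
post_treated = [rng.gauss(70 + TRUE_EFFECT, 10) for _ in treated]
post_control = [rng.gauss(70, 10) for _ in control]

diff = statistics.mean(post_treated) - statistics.mean(post_control)
print(f"difference in posttest means: {diff:.2f}")
print(f"Welch t statistic: {welch_t(post_treated, post_control):.2f}")
```

The statistic would then be compared with a t distribution to obtain a p-value; for a categorical outcome the analogous step is a chi-squared test on the group-by-outcome table.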

Page 8: Economics 105: Statistics

Experimental Designs: Solomon Four-Group Design

R O X O
R O   O
R   X O
R     O

• Advantages
– Information is available on the effect of treatment (independent variable), the effect of pretesting alone, possible interaction of pretesting & treatment, and the effectiveness of randomization
• Disadvantages
– Costly and more complex to implement
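One way to see what the four groups buy you is to work through the arithmetic on posttest means. The sketch below uses made-up means and hypothetical key names; it only illustrates which group comparisons correspond to the treatment effect, the effect of pretesting alone, and their interaction.

```python
# Hypothetical posttest means for the four Solomon groups.
posttest_mean = {
    "pretest_treated": 80.0,      # R O X O
    "pretest_untreated": 72.0,    # R O   O
    "nopretest_treated": 77.0,    # R   X O
    "nopretest_untreated": 70.0,  # R     O
}

def solomon_effects(m):
    """Return simple contrasts of posttest means for a Solomon four-group design."""
    effect_with_pretest = m["pretest_treated"] - m["pretest_untreated"]
    effect_without_pretest = m["nopretest_treated"] - m["nopretest_untreated"]
    return {
        # Treatment effect among pretested subjects.
        "treatment_effect_pretested": effect_with_pretest,
        # Treatment effect among subjects who never saw the pretest.
        "treatment_effect_unpretested": effect_without_pretest,
        # Effect of pretesting alone (no treatment in either group).
        "pretest_effect_alone": m["pretest_untreated"] - m["nopretest_untreated"],
        # Pretest-by-treatment interaction: does pretesting change the effect?
        "interaction": effect_with_pretest - effect_without_pretest,
    }

effects = solomon_effects(posttest_mean)
for name, value in effects.items():
    print(f"{name}: {value:+.1f}")
```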

Page 9: Economics 105: Statistics

Internal Validity

• Internal validity is the approximate truth about inferences regarding cause-effect relationships.

Establishing Cause and Effect
• Single-Group Threats
• Multiple-Group Threats
• “Social” Interaction Threats

Page 10: Economics 105: Statistics

Threats to Internal Validity

R X O
R   O

Single-Group
• History
• Maturation
• Testing
• Instrumentation
• Mortality
• Regression to the mean

Multiple-Group
• Selection
• Selection-history
• Selection-maturation
• Selection-testing
• Selection-instrumentation
• Selection-mortality*
• Selection-regression

Social Interaction
• Diffusion or imitation*
• Compensatory equalization*
• Compensatory rivalry*
• Resentful demoralization*

Page 11: Economics 105: Statistics

Single-Group Threats to Internal Validity

Page 12: Economics 105: Statistics

What is a “single-group” threat?

Two designs:
• Post-test only, a single group: X O (administer program, measure outcomes)
• Pre-test and post-test, a single group: O X O (measure baseline, administer program, measure outcomes)

Page 13: Economics 105: Statistics

Example

• Diabetes educational program for newly diagnosed adolescents in a clinic
• Pre-post, single group design
• Measures (O) are paper-pencil, standardized tests of diabetes knowledge (e.g. disease characteristics, management strategies)

Page 14: Economics 105: Statistics

History Threat

• Any other event that occurs between pretest and posttest
• For example, adolescents learn about diabetes by watching The Health Channel

Design: O X O (Pretest, Program, Posttest)

Page 15: Economics 105: Statistics

Maturation Threat

• Normal growth between pretest and posttest.
• They would have learned these concepts anyway, even without program.

Design: O X O (Pretest, Program, Posttest)

Page 16: Economics 105: Statistics

Testing Threat

• The effect on the posttest of taking the pretest
• May have “primed” the kids or they may have learned from the test, not the program
• Can only occur in a pre-post design

Design: O X O (Pretest, Program, Posttest)

Page 17: Economics 105: Statistics

Instrumentation Threat

• Any change in the test between pretest and posttest
• So outcome changes could be due to different forms of the test, not due to program
• Researchers may change the test form to control for the “testing” threat, but doing so may introduce the “instrumentation” threat

Design: O X O (Pretest, Program, Posttest)

Page 18: Economics 105: Statistics

Mortality Threat

• Nonrandom dropout between pretest and posttest
• For example, kids “challenged” out of program by parents or clinicians
• Attrition

Design: O X O (Pretest, Program, Posttest)
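A small simulation can make the mortality (attrition) threat concrete. In the sketch below the program has no effect at all, yet comparing stayers' posttest scores with everyone's pretest scores shows an apparent gain. The dropout rule (pretest below 50 leaves with probability 0.6) and all score parameters are hypothetical.

```python
import random
import statistics

rng = random.Random(18)  # fixed seed for reproducibility

N = 5_000
pretest = [rng.gauss(60, 10) for _ in range(N)]
# No program effect: the posttest is the pretest plus measurement noise.
posttest = [score + rng.gauss(0, 3) for score in pretest]

# Hypothetical nonrandom dropout: low pretest scorers tend to leave.
stayed = [i for i in range(N) if not (pretest[i] < 50 and rng.random() < 0.6)]

post_stayers = statistics.mean(posttest[i] for i in stayed)
pre_everyone = statistics.mean(pretest)
pre_stayers = statistics.mean(pretest[i] for i in stayed)

naive_gain = post_stayers - pre_everyone    # compares different sets of people
matched_gain = post_stayers - pre_stayers   # same people at both waves

print(f"dropped out: {N - len(stayed)} of {N}")
print(f"naive gain (stayers' posttest vs. everyone's pretest): {naive_gain:+.2f}")
print(f"matched gain (stayers only): {matched_gain:+.2f}")
```

Restricting the comparison to people measured at both waves removes the spurious gain here, although the stayers are then no longer representative of the original group.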

Page 19: Economics 105: Statistics

Regression Threat

• Group is a nonrandom subgroup of population.
• For example, mostly low literacy kids will appear to improve because of regression to the mean.
• Example: height

Design: O X O (Pretest, Program, Posttest)

Page 20: Economics 105: Statistics

Regression to the Mean

• pre-test scores ~ N
• post-test scores ~ N, assuming no effect of the treatment program
• When you select a sample from the low end of a distribution, the group will do better on a subsequent measure.
• The group mean on the first measure appears to “regress toward the mean” of the population.

[Figure: pre-test and post-test score distributions marking the selected group’s mean, the overall mean, and the regression to the mean between them]
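Regression to the mean is easy to reproduce in a simulation. The sketch below assumes each person has a stable true score plus independent measurement noise on each test, with no treatment effect; the means, standard deviations, and the "lowest 10%" selection rule are all hypothetical choices.

```python
import random
import statistics

rng = random.Random(2005)  # fixed seed for reproducibility

N = 20_000
TRUE_MEAN, TRUE_SD, NOISE_SD = 50.0, 8.0, 6.0  # hypothetical values

# Stable true score per person; each test adds independent noise.
true_scores = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
pre = [t + rng.gauss(0, NOISE_SD) for t in true_scores]
post = [t + rng.gauss(0, NOISE_SD) for t in true_scores]  # no treatment effect

# Select the lowest-scoring 10% on the pre-test.
cutoff = sorted(pre)[N // 10]
selected = [i for i in range(N) if pre[i] < cutoff]

pre_mean = statistics.mean(pre[i] for i in selected)
post_mean = statistics.mean(post[i] for i in selected)
overall_mean = statistics.mean(pre)

print(f"overall pre-test mean:        {overall_mean:.1f}")
print(f"selected group, pre-test:     {pre_mean:.1f}")
print(f"selected group, post-test:    {post_mean:.1f}")
print(f"apparent 'gain' with no program: {post_mean - pre_mean:+.1f}")
```

The selected group moves toward the overall mean on the second measurement but does not reach it, because part of its low pre-test score reflects genuinely low true scores and part reflects unlucky noise.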

Page 21: Economics 105: Statistics

Regression to the Mean

Page 22: Economics 105: Statistics

Regression to the Mean
• Sir Francis Galton (1822 – 1911)
• 903 adult children & their 250 parents

Page 23: Economics 105: Statistics
Page 24: Economics 105: Statistics

Regression to the Mean

• How to reduce the effects of RTM (Barnett et al., International Journal of Epidemiology, 2005)

1. When designing the study, randomly assign subjects to treatment and control (placebo) groups. Then effects of RTM on responses should be same across groups.

2. Select subjects based on multiple measurements

• RTM increases with larger variance (see graphs) so subjects can be selected using the average of 2 or more baseline measurements.
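The second recommendation can be checked with a simulation. The sketch below is an illustration under assumed parameters, not the method from the cited paper: it selects the lowest 10% of subjects using the average of k baseline measurements and reports how far the group's follow-up mean drifts from its baseline mean when there is no treatment. The function name `rtm_shift` and all numeric values are hypothetical.

```python
import random
import statistics

def rtm_shift(n_baselines, n=20_000, seed=2005):
    """Follow-up mean minus baseline mean for subjects selected on a low baseline."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, 8.0) for _ in range(n)]

    def measure(true_score):
        # One noisy measurement of a subject's stable true score.
        return true_score + rng.gauss(0, 6.0)

    # Baseline used for selection: average of n_baselines measurements.
    baseline = [
        sum(measure(t) for _ in range(n_baselines)) / n_baselines
        for t in true_scores
    ]
    follow_up = [measure(t) for t in true_scores]  # no treatment effect

    cutoff = sorted(baseline)[n // 10]  # lowest 10% are selected
    chosen = [i for i in range(n) if baseline[i] < cutoff]
    return (statistics.mean(follow_up[i] for i in chosen)
            - statistics.mean(baseline[i] for i in chosen))

one, two, four = rtm_shift(1), rtm_shift(2), rtm_shift(4)
print(f"RTM with 1 baseline measurement:  {one:+.2f}")
print(f"RTM with 2 baseline measurements: {two:+.2f}")
print(f"RTM with 4 baseline measurements: {four:+.2f}")
```

Averaging shrinks the measurement noise in the selection variable, so fewer subjects are selected merely because of an unlucky low reading, and the spurious shift gets smaller.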

Page 25: Economics 105: Statistics

Multiple-Group Threats to Internal Validity

Page 26: Economics 105: Statistics

The Central Issue

• When you move from single to multiple group research the big concern is whether the groups are comparable.

• Usually this has to do with how you assign units (e.g., persons) to the groups (or select them into groups).

• We call this issue selection or selection bias.

Page 27: Economics 105: Statistics

The Multiple Group Case

O X O (measure baseline, administer program, measure outcomes)
O   O (measure baseline, do not administer program, measure outcomes)

[Figure labels: “Alternative explanations”]

Page 28: Economics 105: Statistics

Example

• Diabetes education for adolescents

• Pre-post comparison group design

• Measures (O) are standardized tests of diabetes knowledge

Page 29: Economics 105: Statistics

Selection-History Threat

• Any other event that occurs between pretest and posttest that the groups experience differently.

• For example, kids in one group pick up more diabetes concepts because they watch a special show on Oprah related to diabetes.

Design:
O X O
O   O

Page 30: Economics 105: Statistics

Selection-Maturation Threat

• Differential rates of normal growth between pretest and posttest for the groups.

• They are learning at different rates, even without program.

Design:
O X O
O   O

Page 31: Economics 105: Statistics

Selection-Testing Threat

• Differential effect on the posttest of taking the pretest.

• The test may have “primed” the kids differently in each group or they may have learned differentially from the test, not the program.

Design:
O X O
O   O

Page 32: Economics 105: Statistics

Selection-Instrumentation Threat

• Any differential change in the test used for each group between pretest and posttest

• For example, change due to different forms of test being given differentially to each group, not due to program

Design:
O X O
O   O

Page 33: Economics 105: Statistics

Selection-Mortality Threat

• Differential nonrandom dropout between pretest and posttest.

• For example, kids drop out of the study at different rates for each group.

• Differential attrition

Design:
O X O
O   O

Page 34: Economics 105: Statistics

Selection-Regression Threat

• Different rates of regression to the mean because groups differ in extremity.

• For example, program kids are disproportionately lower scorers and consequently have greater regression to the mean.

Design:
O X O
O   O

Page 35: Economics 105: Statistics

“Social Interaction” Threats to Internal Validity

Page 36: Economics 105: Statistics

What Are “Social” Threats?

• All are related to social pressures in the research context, which can lead to posttest differences that are not directly caused by the treatment itself.

• Most of these can be minimized by isolating the two groups from each other, but this leads to other problems (for example, hard to randomly assign and then isolate, or may reduce generalizability).

Page 37: Economics 105: Statistics

What Are “Social” Threats?

• Diffusion or Imitation of Treatment
• Compensatory Equalization of Treatment
• Compensatory Rivalry
• Resentful Demoralization

Page 38: Economics 105: Statistics

What is a Clinical Trial?
• “A prospective study comparing the effect and value of intervention(s) against a control in human beings.”
• Prospective means “over time”; vs. retrospective
• It is attempting to change the natural course of a disease
• It is NOT a study of people who are on drug X versus people who are not
• http://www.clinicaltrials.gov/info/resources

Page 39: Economics 105: Statistics

Example: Job Corps
• What is Job Corps? http://jobcorps.doleta.gov/

• January 5, 2006 Thursday Late Edition – Final

SECTION: Section C; Column 1; Business/Financial Desk; ECONOMIC SCENE; Pg. 3

HEADLINE: New (and Sometimes Conflicting) Data on the Value to Society of the Job Corps

BYLINE: By Alan B. Krueger.

Alan B. Krueger is the Bendheim professor of economics and public affairs at Princeton University. His Web site is www.krueger.princeton.edu.

He delivered the 2005 Cornelson Lecture in the Department of Economics here at Davidson (that’s the big econ lecture each year).

Page 40: Economics 105: Statistics

Example: Job Corps
• Quotations from “New (and Sometimes Conflicting) Data on the Value to Society of the Job Corps” by Alan B. Krueger.

• Since 1993, Mathematica Policy Research Inc. has evaluated the performance of the Job Corps for the Department of Labor.

• Its evaluation is based on one of the most rigorous research designs ever used for a government program. From late 1994 to December 1995, some 9,409 applicants to the Job Corps were randomly selected to be admitted to the program and another 6,000 were randomly selected for a control group that was excluded from the Job Corps.

• Those admitted to the program had a lower crime rate, higher literacy scores and higher earnings than the control group.
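The random selection described in the quotation can be sketched as a shuffle and a split. Only the two group sizes come from the quoted text; the applicant pool, its size (assumed here to be exactly the two groups combined), the identifiers, and the baseline earnings variable are hypothetical and exist only to show why randomization makes the groups comparable before treatment.

```python
import random
import statistics

rng = random.Random(1994)  # fixed seed for reproducibility

N_PROGRAM, N_CONTROL = 9_409, 6_000  # group sizes from the quotation

# Hypothetical applicant pool with a hypothetical baseline covariate.
applicants = [f"applicant-{i:05d}" for i in range(N_PROGRAM + N_CONTROL)]
baseline_earnings = {a: rng.gauss(100.0, 20.0) for a in applicants}

# Random assignment: shuffle, then split.
rng.shuffle(applicants)
program_group = applicants[:N_PROGRAM]
control_group = applicants[N_PROGRAM:]

program_mean = statistics.mean(baseline_earnings[a] for a in program_group)
control_mean = statistics.mean(baseline_earnings[a] for a in control_group)
baseline_gap = program_mean - control_mean

print(f"program group baseline mean: {program_mean:.2f}")
print(f"control group baseline mean: {control_mean:.2f}")
print(f"gap before any treatment:    {baseline_gap:+.2f}")
```

Because assignment ignores every applicant characteristic, the baseline gap is small relative to the spread of the covariate, which is what lets later differences in outcomes be attributed to the program.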

Page 41: Economics 105: Statistics
Page 42: Economics 105: Statistics
Page 43: Economics 105: Statistics
Page 44: Economics 105: Statistics

RCT for Credit Card Offers

Source: Agarwal, et al. (2010), Journal of Money, Credit & Banking, 42 (4)

A1: 0% APR for first 8 months & 9.99% on balance transfers, then 9.99% on purchases

A2: 0% APR for first 12 months, & 9.99% on balance transfers, then 9.99% on purchases

A3: 0% APR for first 8 months & 8.99% on balance transfers, then 8.99% on purchases

Page 45: Economics 105: Statistics

RCT for Education in India

Source: Banerjee, et al. (2007), Quarterly Journal of Economics

Page 46: Economics 105: Statistics

RCT for Education in India

Page 47: Economics 105: Statistics

RCT for the Effect of High Rewards on Performance

Source: Ariely, Gneezy, Loewenstein, and Mazar (2009), Review of Economic Studies

Page 48: Economics 105: Statistics

RCT for the Effect of High Rewards on Performance

Random assignment !

Page 50: Economics 105: Statistics

Introduction to Regression Analysis
• Correlation analysis only measures the strength of the association (linear relationship) between two variables … not necessarily a causal relationship

• Regression analysis is used to:
– Predict the value of a dependent variable based on the value of at least one independent variable
– Explain the impact of changes in an independent variable on the dependent variable

• Dependent variable: the variable we wish to predict or explain variation in ... outcome variable, Y.

• Independent variables: the variables used to explain variation in Y ... covariates, explanatory variables, r.h.s. vars, X-variables

Page 51: Economics 105: Statistics

Types of Relationships

[Figure: scatterplots of Y against X showing linear relationships and curvilinear relationships]

Page 52: Economics 105: Statistics

Types of Relationships (continued)

[Figure: scatterplots of Y against X showing strong relationships and weak relationships]

Page 53: Economics 105: Statistics

Types of Relationships (continued)

[Figure: scatterplots of Y against X showing no relationship]

Page 54: Economics 105: Statistics

Deterministic Linear Models
• Theoretical Model: $Y_i = b_0 + b_1 X_i$
– b0 and b1 are constant terms
– b0 is the intercept
– b1 is the slope
– Xi is a predictor of Yi

[Figure: straight line plotting Yi against Xi, with intercept b0 and slope b1]

Page 55: Economics 105: Statistics

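Stochastic Simple Linear Population Regression Model (continued)

[Figure: Y against X with the population regression line (Pop Intercept = β0, Pop Slope = β1); at a given Xi, the observed value of Y for Xi differs from the line by εi, the population random error for this Xi value]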
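The labels on this slide correspond to the usual simple linear population model, $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$. As a rough illustration, the sketch below draws observations from such a model; the parameter values, the normal error distribution, and the function name `draw_y` are assumptions chosen for the example, not values from the course.

```python
import random
import statistics

rng = random.Random(55)  # fixed seed for reproducibility

BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0  # hypothetical population values

def draw_y(x):
    """One observed Y for a given X: the population line plus a random error."""
    epsilon = rng.gauss(0, SIGMA)  # population random error for this draw
    return BETA0 + BETA1 * x + epsilon

# A few observations scattered around the population line.
for x in [1.0, 4.0, 7.0]:
    y = draw_y(x)
    on_line = BETA0 + BETA1 * x
    print(f"X = {x:.1f}  E[Y|X] = {on_line:.2f}  observed Y = {y:.2f}  "
          f"error = {y - on_line:+.2f}")

# Averaging many draws at one X recovers the height of the line there.
mean_at_4 = statistics.mean(draw_y(4.0) for _ in range(20_000))
print(f"mean of 20,000 draws at X = 4: {mean_at_4:.3f}")
```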

Page 56: Economics 105: Statistics

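Gauss-Markov Assumptions
• (1) Zero conditional mean: $E[\varepsilon_i \mid X_i] = 0$
– Idiosyncratic, “white noise”
– Measurement error on Y
– Omitted relevant explanatory variables … why?
• (2) Homoskedastic errors: $\mathrm{Var}(\varepsilon_i \mid X_i) = \sigma^2$
• (3) No serial correlation among errors (autocorrelation): $\mathrm{Cov}(\varepsilon_i, \varepsilon_j \mid X) = 0$ for $i \neq j$

[Figure: Y against X with the population regression line $E[Y \mid X] = \beta_0 + \beta_1 X$]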

Page 57: Economics 105: Statistics

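Gauss-Markov Assumptions
• (4) Linear in the parameters + error: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
– Variation in Y is caused by $\varepsilon$, the error (as well as X)
– Not [counterexample equation lost in extraction]
• (5) Random sample of data
– $(X_i, Y_i)$ are i.i.d.
• (Ancillary) errors are normally distributed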

Page 58: Economics 105: Statistics

Stochastic Linear Models
• Assumptions so far imply $E[Y_i \mid X_i] = \beta_0 + \beta_1 X_i$
• Need to estimate population intercept & slope
• Take a sample of data & obtain the sample regression line

Page 59: Economics 105: Statistics

Sample Regression Equation (Prediction Line)

The sample regression line equation provides an estimate of the population regression line:

$\hat{Y}_i = b_0 + b_1 X_i$

– $\hat{Y}_i$: estimated (or predicted) Y value for observation i
– $b_0$: estimate of the regression intercept
– $b_1$: estimate of the regression slope
– $X_i$: value of X for observation i

The individual random error terms ei have a mean of zero

Other notation:

Page 60: Economics 105: Statistics

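Sample Regression Equation

[Figure: scatterplot of Y against X distinguishing points chosen in the sample from points not chosen in the sample; at X3 it marks the observed value of Y for X3, the predicted value of Y for X3, the population error ε3, and the estimated error for X3 (residual) e3]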

Page 61: Economics 105: Statistics

Sample Regression Equation
• Residual, ei, is the prediction error
• Positive errors
• Negative errors

[Figure: Y against X with the sample regression line and residuals above and below it]
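To tie the sample regression line and its residuals together, here is a minimal sketch that computes the least-squares intercept and slope by hand and then the residuals $e_i = Y_i - \hat{Y}_i$. The five data points are made up, and ordinary least squares is assumed as the estimation method since these slides do not yet name one.

```python
def ols_fit(xs, ys):
    """Least-squares intercept b0 and slope b1 for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx           # slope estimate
    b0 = y_bar - b1 * x_bar  # intercept estimate
    return b0, b1

# Hypothetical sample of five (X, Y) observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.0]

b0, b1 = ols_fit(xs, ys)
predicted = [b0 + b1 * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
for x, y, y_hat, e in zip(xs, ys, predicted, residuals):
    sign = "positive" if e > 0 else "negative"
    print(f"X = {x:.0f}  Y = {y:.1f}  predicted = {y_hat:.2f}  "
          f"residual = {e:+.2f} ({sign})")
print(f"sum of residuals: {sum(residuals):.10f}")
```

A positive residual means the observed point lies above the fitted line and a negative one means it lies below; with an intercept in the model the least-squares residuals sum to zero, up to floating-point rounding.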