measuring impact

68
Measuring Impact

Upload: clearsateam

Post on 12-Nov-2014

116 views

Category:

Education


1 download

DESCRIPTION

Measuring Impact

TRANSCRIPT

Page 1: Measuring Impact

Measuring Impact

Page 2: Measuring Impact

Course Overview

1. What is Evaluation?

2. Measuring Impact

3. Why Randomize?

4. How to Randomize?

5. Sampling and Sample Size

6. Threats and Analysis

7. Cost-Effectiveness Analysis

8. Project from Start to Finish

Page 3: Measuring Impact

CASE STUDYWomen as Policymakers

Page 4: Measuring Impact

What was the main purpose of the 73rd Amendment of India’s constitution?

A. To reserve leadership positions for women (and caste minorities)

B. To formalize local institutions of leadership

C. To give women the right to vote in local elections

To rese

rve le

adership po...

To form

alize

loca

l insti

tu...

To give w

omen the rig

ht ..

0% 0%0%

Page 5: Measuring Impact

Theory of Change from Reservations to Final Outcomes

Women are empowered

Public goods show

women’s preferences

Pradhan’s preferences

matterDemocracy imperfect

More female Pradhans

Reservations for women

Women have different

preferences

Page 6: Measuring Impact

Data Used Covers All Aspects of Theory of Change

Purpose of Measurement

Sources of Measurement

Indicators

Reservations Policy and GP Priorities

Administrative Data

• Reservation• Budgets• Balance sheets

Preferences of Women and Men

Transcript from village meeting

• Who speaks and when (gender)• Issues raised

Public Investments

Village Leader Interview

• Political experience• Investments undertaken

Public Investments

Village PRA • Village infrastructure + investments

• Perception of public good quality• Participation of men and women

Household (HH) Survey

• Declared HH preference• HH perceptions of quality of

public goods and services

Page 7: Measuring Impact

Results: Reservations Increase Female Leadership

West Bengal Rajasthan

Of Pradhans:Reserved Unreserve

dReserve

dUnreserve

d

Total Number 54 107 40 60

Proportion female 100% 6.5% 100% 1.7%

Page 8: Measuring Impact

Results: Women Raise Different Issues at GP Meetings

West Bengal Rajasthan

Of Pradhans: Women Men Women Men

Drinking water 31% 17% 54% 43%

Road improvement 31% 25% 13% 23%

Page 9: Measuring Impact

Results: Public Investments Show Women’s Preferences

West Bengal Rajasthan

Issue Investment

Issue ReservedInvestmen

t

Issue ReservedInvestme

ntW M W M

Drinking Water

# facilities 31%

17%

9.09 54%

43%

2.62

Road Improvement

Road Condition (0-1)

31%

25%

0.18 13%

23%

-0.08

Irrigation # facilities 4% 20%

-0.38 2% 4% -0.02

Education Informal education center

6% 12%

-0.16 5% 13%

Page 10: Measuring Impact

Why do You Need Data? Follow Theory of Change

Our theory and hypothesis helps us define the set of

outcomes

Need to find indicators that map the outcomes well

Characteristics: Who are the people the program works

with, and what is their environment?

• Sub-groups, covariates, predictors of compliance

Channels: How does the program work, or fail to work?

Outcomes: What is the purpose of the program? Did it

achieve that purpose?

Page 11: Measuring Impact

Lecture Overview

Theory of Change: What Do You Want to Measure?

Theory of Measurement: What Makes a Good Measure?

Practice of Measurement: Measuring the Immeasurable

Collecting Data

Page 12: Measuring Impact

WHAT MAKES A GOOD MEASURETheory of Measurement

Page 13: Measuring Impact

The Main Challenge in Measurement: Getting Accuracy and Precision

More accurate M

ore

p

recis

e

Page 14: Measuring Impact

Terms “Biased” and “Unbiased” Used to Describe Accuracy

More accurate

“Biased” “Unbiased”On average, we get the wrong answer

On average, we get the right answer

Page 15: Measuring Impact

Terms “Noisy” and “Precise” Used to Describe Precision

M

ore

p

recis

e“Noisy”

“Precise”

Random error causes answer to bounce around

Measures of the same thing cluster together

Page 16: Measuring Impact

A Noisy and a Precise Measure Can Both Be Biased

More accurate M

ore

p

recis

e“Noisy”

“Precise”

Random error causes answer to bounce around

Measures of the same thing cluster together

Page 17: Measuring Impact

Choices in Real Measurement Often Harder

More accurate M

ore

p

recis

e“Noisy” but

“Unbiased”

“Precise” but “Biased”

Random error causes answer to bounce around

Measures of the same thing cluster together

Page 18: Measuring Impact

The Main Challenge in Measurement: Getting Accuracy and Precision

More accurate M

ore

p

recis

e

Page 19: Measuring Impact

Is this introducing noise or bias?

A. Noise / Random Error

B. Bias

A surveyor doesn´t follow the exact phrase of the question:Surveyor- “you feel unsecure in this neighborhood, right?” Respondent- “well, sometimes yes and sometimes, well, no…” Surveyor- “so, is more of a yes, right?”

- “mmm…well”Respondent - “ok, thanks. Next question.” code 1=yes

Random Error

Systematic E

rror

0%0%

Page 20: Measuring Impact

Accuracy

In theory:

• How well does the indicator map to the outcome? (e.g. IQ test intelligence)

In practice: Are you getting unbiased answers?

• Respondent bias

• Recall bias

• Anchoring bias

• Social desirability bias (response bias)

• Framing effect / Neutrality

Page 21: Measuring Impact

Precision and Error

In theory: The measure is precise, but not necessarily accurate In practice:

• Length, fatigue

• “How much did you spend on broccoli yesterday?” (as a measure of annual broccoli spending)

• Ambiguous wording (definitions, relationships, recall period, units of question)

Eg. Definition of terms – ‘household’, ‘income’

• Recall period /units of question

• Type of answer -Open/Closed

• Choice of options for closed questions

Likert (i.e. Strongly disagree, disagree, neither agree nor disagree, . . .)

Rankings

Quantitative scales. Numbers or bins.

Page 22: Measuring Impact

Random Error

Mistakes in estimation or recall Question wording Surveyor training/quality Data entry Length, fatigue of survey How do you generalize from certain questions?

Page 23: Measuring Impact

Which is worse?

A. Bias (Low Accuracy)

B. Noise (Low Precision)

C. Equally bad

D. Depends

E. Don’t know/can’t say

Bias (Lo

w Accura

cy)

Noise (L

ow Precision)

Equally bad

Depends

Don’t kn

ow/can’t

say

0% 0% 0%0%0%

Page 24: Measuring Impact

A biased measure will bias our impact estimates

A. True

B. False

C. Depends

D. Don’t know

True

False

Depends

Don’t kn

ow

0% 0%0%0%

Page 25: Measuring Impact

MEASURING THE UNMEASURABLE

Practice of Measurement

Page 26: Measuring Impact

What is Hard to Measure?

(1) Things people do not know very well

(2) Things people do not want to talk about

(3) Abstract concepts

(4) Things that are not (always) directly observable

(5) Things that are best directly observed

Page 27: Measuring Impact

How much juice did you consume last month?

A. <2 liters

B. 2-5 liters

C. 6-10 liters

D. >11 liters

<2 liters

2-5 liters

6-10 liters

>11 liters

0% 0%0%0%

Page 28: Measuring Impact

1. Things people do not know very well

What: Anything to estimate, particularly across time. Prone to recall error and poor estimation

• Examples: distance to health center, profit, consumption, income, plot size

Strategies:

• Frame the question in a way people are likely to know and remember.

• Consistency checks – How much did you spend in the last week on x? How much did you spend in the last 4 weeks on x?

• Multiple measurements of same indicator – How many minutes does it take to walk to the health center? How many kilometers away is the health center?

Page 29: Measuring Impact

How many glasses of juice did you consume yesterday

A. 0

B. 1-3

C. 4-6

D. >6

0

01-Mar

4-6 >6

0% 0%0%0%

Page 30: Measuring Impact

What is Hard to Measure?

(1) Things people do not know very well

(2) Things people do not want to talk about

(3) Abstract concepts

(4) Things that are not (always) directly observable

(5) Things that are best directly observed

Page 31: Measuring Impact

How frequently do you yell at your spouse

A. Daily

B. Several times per week

C. Once per week

D. Once per month

E. Never

Daily

Severa

l times p

er week

Once per w

eek

Once per m

onthNeve

r

0% 0% 0%0%0%

Page 32: Measuring Impact

2. Things people don’t want to talk about

What: Anything socially “risky” or something painful

• Examples: sexual activity, alcohol and drug use, domestic violence, conduct during wartime, mental health

Strategies:

• Don’t start with the hard stuff!

• Consider asking question in third person

• Always ensure comfort and privacy of respondent

Page 33: Measuring Impact

Asking Sensitive Questions: List Randomization

Randomly ask part of the sample the question with / without a sensitive option

Response only a count, not the specific options

How many of these statements are true for you? This morning I took a

shower. My nearest bank branch

office is within walking distance.

I have tea every day. I use my loan for non-

business expenses.

How many of these

statements are true for

you?

This morning I took a

shower.

My nearest bank branch

office is within walking

distance.

I have tea every day.

Page 34: Measuring Impact

Asking Sensitive Questions: List Randomization

Average number of true statements:

2.1

Average number of true statements:

2.8

2.8 – 2.1 = 0.7 full TRUE difference (on average)70% used their loan for non-business purposes.

Page 35: Measuring Impact

List Randomization Shows Big Differences in Some Real Cases

Page 36: Measuring Impact

Asking Sensitive Questions: List Randomization

Some questions are sensitive and people are hesitant to answer truthfully

List randomization is a way to get the answer on average without knowing confidential information on any one person

Page 37: Measuring Impact

What is Hard to Measure?

(1) Things people do not know very well

(2) Things people do not want to talk about

(3) Abstract concepts

(4) Things that are not (always) directly observable

(5) Things that are best directly observed

37

Page 38: Measuring Impact

“I feel more empowered now than last year”

A. Strongly disagree

B. Disagree

C. Neither agree nor disagree

D. Agree

E. Strongly agree

Strongly

disagre

e

Disagre

e

Neither a

gree nor disa

gree

Agree

Strongly

agree

0% 0% 0%0%0%

Page 39: Measuring Impact

3. Abstract concepts

What: Potentially the most challenging and interesting type of difficult-to-measure indicators Examples: empowerment, bargaining power, social cohesion,

risk aversion

Strategies:• Three key steps when measuring “abstract concepts”

• Define what you mean by your abstract concept

• Choose the outcome that you want to serve as the measurement of your concept

• Design a good question to measure that outcome

Often choice between choosing a self-reported measure and a behavioral measure – both can add value!

Page 40: Measuring Impact

“I am involved in the decision to send my child to private vs public school”

A. Strongly disagree

B. Disagree

C. Neither agree nor disagree

D. Agree

E. Strongly agree

Strongly

disagre

e

Disagre

e

Neither a

gree nor disa

gree

Agree

Strongly

agree

0% 0% 0%0%0%

Page 41: Measuring Impact

How “Socially Connected" do You Feel to the Other People in this Room?

ABCDE A B C D E

20% 20% 20%20%20%

You

Everyone else in

this room

Page 42: Measuring Impact

How likely are you to take a taxi with a driver you don’t know after dark?

A. Very unlikely

B. Unlikely

C. Likely

D. Very likely

Very unlikely

Unlik

ely Li

kely

Very lik

ely

0% 0%0%0%

Page 43: Measuring Impact

How likely are you to take a taxi with a driver you don’t know after dark?

A. Very unlikely

B. Unlikely

C. Likely

D. Very likely

Very unlikely

Unlik

ely Li

kely

Very lik

ely

0% 0%0%0%

Page 44: Measuring Impact

Perceptions and Attitudes

Ask directly

• “How effective is your leader?” (ineffective, somewhat

effective, effective, very…)

Indirect approaches often have better accuracy

• Listen to a Vignette (Male v. Female)

• Revealed preference – voting behavior

• Implicit Association tests

Page 45: Measuring Impact

Implicit Association Test

People simplify the world for efficiency

• Use thumb rules to draw connections

• May not even be aware themselves

For some important outcomes, may be worth trying to measure these indirectly

• Implicit association one technique

Page 46: Measuring Impact

Implicit Association Test: Match on Left or Right?

Page 47: Measuring Impact

Implicit Association Test: Match on Left or Right?

Parliament

Page 48: Measuring Impact

Implicit Association Test: Match on Left or Right?

Home

Page 49: Measuring Impact

Implicit Association Test: Match on Left or Right?

Parliament

Page 50: Measuring Impact

Implicit Association Test

People simplify the world for efficiency

• Use thumb rules to draw connections

• May not even be aware themselves

Actually based on response time, not accuracy

• Are respondents faster to select “Parliament” when associated with a man than a woman?

Page 51: Measuring Impact

What is Hard to Measure?

(1) Things people do not know very well

(2) Things people do not want to talk about

(3) Abstract concepts

(4) Things that are not (always) directly observable

(5) Things that are best directly observed

51

Page 52: Measuring Impact

What proportion race X people are denied jobs due to racial discrimination?

A. 0%

B. 1-20%

C. 21-40%

D. 41-60%

E. >60%

0%1-20%

21-40%

41-60%>60%

0% 0% 0%0%0%

Page 53: Measuring Impact

4. Things that aren’t Directly Observable

What: You may want to measure outcomes that you can’t ask directly about or directly observe

• Examples: corruption, fraud, discrimination

Strategies:

• Audit studies, e.g. Rajasthan police experiment to register cases, Delhi Driver’s Licenses, Delhi doctors

• Multiple sources of data, e.g. inputs of funds vs. outputs received by recipients, pollution reports by different parties

• Don’t worry – there have already been lots of clever people before you – so do literature reviews!

Page 54: Measuring Impact

5. Things that are Best Directly Observed

What: Behavioral preferences, anything that is more believable when done than said

Strategies:

• Develop detailed protocols

• Ensure data collection of behavioral measures done under the same circumstances for all individuals

• Example: how often do you wash your hands? Strategy: collect the data directly while disrupting

participants’ lives as little as possible E.g., put movement sensors in soap, measure the

weight of the soap

Page 55: Measuring Impact

The Problem

With the following questions…

Page 56: Measuring Impact

Question: After the last time you used the toilet, did you wash your hands? (The problem with this indicator is….)

A. Accuracy

B. Precision

C. Both

D. Neither

Accuracy

Precision

Both

Neither

0% 0%0%0%

Page 57: Measuring Impact

Outcome: Gender BiasQuestion: How effective are women leaders? (ineffective, somewhat effective, effective, very…)

A. Accuracy

B. Precision

C. Both

D. Neither

Accuracy

Precision

Both

Neither

0% 0%0%0%

Page 58: Measuring Impact

Examples

Study: Double-Fortified Salt (Duflo et al. forthcoming)

• Where: Bihar, India

• Intervention: selling low-cost iron-fortified salt to at-risk-for-anemia families.

• Indicators:Physical fitness (directly observable; step test)Physical development (directly observable; anthro

measures)Cognitive development (indirectly observable;

puzzles, tests)Health history (indirectly observable; surveying on

immunizations received, etc.)

58

Page 59: Measuring Impact

COLLECTING DATA

Page 60: Measuring Impact

When to Collect Data

[ Baseline ] : Before you start

During the intervention

End line : After you’re done

[ Scale-up, intervention ]

Page 61: Measuring Impact

Methods of Data Collection: Not Just Surveys

Surveys-

household/individual

Administrative data

Logs/diaries

Qualitative – eg. focus

groups, RRA

Games and choice

problems

Observation

Health/Education tests and

measures

Page 62: Measuring Impact

Where can we get Data?

The good . . . . and the bad

Administrative data• Collected by a

government or similar body as part of operations

• May already be collected and thus free• Can be extremely accurate (e.g. electricity bills)

• May not exist or not answer the question you want• May itself change behavior (e.g. taxes)

Other secondary data• Collected for

research or other purposes not admin

• May already be collected and thus free• Can inform the larger context of a project

• May not exist or not answer the question you want• Dubious quality

Primary data• Collected by

researchers for study

• Address the exact question of interest• Cover channels and assumptions

• Very costly and time consuming• May be biased answers

Page 63: Measuring Impact

Primary Data Collection

Surveys

• Questions

• Exams, tests, games, vignettes, etc.

Direct Observation

• Personal or technological (e.g. smoke sensor, vibration sensor)

Diaries/Logs

Page 64: Measuring Impact

Common Survey Modules Can be Adapted for a Particular Project

Demographics Economic

• Income, consumption, expenditure, time use

• Yields, production, etc. Beliefs

• Expectations or assumptions

• Bargaining power, patience, risk Anthropometric Cognitive, Learning

Page 65: Measuring Impact

Primary Data Collection Considerations

Quality

• Reliability and validity of the data

Costs / Logistics

• Surveyor recruitment and training

• Field work and transport, interview time

• Electronic vs. paper

• Data entry, reconciliation, cleaning, etc.

Ethics Human subjects, data security

Page 66: Measuring Impact

Reliability of Data Collection

The process of collecting “good” data requires a lot of

efforts and thought

Need to make sure that the data collected is precise and

accurate.

avoid false conclusions, for any research design

The survey process:

Design of questionnaire Survey printed on

paper/electronic filled in by enumerator interviewing the

respondent data entry electronic dataset

Where can this go wrong?

Page 67: Measuring Impact

Things to Take-away :

Theory of change guides measurement

• Want to measure each step

Data collection all about trade-offs

• Quality and cost

• Bias (accuracy) and noise (precision)

Creative techniques can sometimes help

• Think about what outcomes are most important

Page 68: Measuring Impact

Thank you!