Program Evaluation


Page 1

Program Evaluation

Page 2

Evaluation Defined

• Green and Kreuter (1999), broad definition: “comparison of an object of interest against a standard of acceptability”
• Weiss (1998), more targeted: “systematic assessment of the operation and/or the outcomes of a program or a policy, compared to a set of explicit or implicit standards, as a means of contributing to the improvement of the program or policy”.

Page 3

• Fournier (2005) “Evaluation is an applied inquiry process for collecting and synthesizing evidence that culminates in conclusions about the state of affairs, value, merit, worth, significance, or quality of a program, product, person, policy, proposal, or plan.”

Page 4

Program evaluation
• A tool for using science as a basis for:
– Ensuring programs are rational and evidence-based
  • Needs assessed
  • Theory-driven
  • Research-based
– Ensuring programs are outcome-oriented
  • Forces goals and objectives at the outset
– Ascertaining whether goals and objectives are being achieved
  • Performance measures established at the outset

Page 5

Program evaluation
• A tool for using science as a basis for:
– Informing program management about
  • Program processes – adjusted, improved
  • Program quality – effectiveness (see goals and objectives)
  • Program relevance
– Decision-making and action
  • e.g. policy development based on program evaluations
– Transparency and accountability
  • Funders, participants, and other stakeholders.

Page 6

Program evaluation

• Not done consistently in programs
• Often not well integrated into the day-to-day management of most programs

Page 7

• What gets measured gets done
• If you don’t measure results, you can’t tell success from failure
• If you can’t see success, you can’t reward it
• If you can’t reward success, you’re probably rewarding failure
• If you can’t see success, you can’t learn from it
• If you can’t recognize failure, you can’t correct it
• If you can demonstrate results, you can win public support

Reinventing Government, Osborne and Gaebler, 1992

Source: University of Wisconsin-Extension, Cooperative Extension

From the Logic Model presentation: the accountability era

Page 8

Within an organization – evaluation...

• Should be designed at the time of program planning
• Should be a part of ongoing service design and policy decisions
– Evidence that actions conform with strategic directions, community needs, etc.
– Evidence that money is spent wisely
• Framework should include components that are consistent across programs
– In addition to indicators and methods tailor-made for specific programs and contexts
• Extent of evaluation
– related to the original goals
– related to the complexity of the program

Page 9

When not to evaluate (Patton, 1997)

• There are no questions about the program
• Program has no clear direction
• Stakeholders can’t agree on program objectives
• Insufficient funds to evaluate properly

Page 10

(Patton, 2005)

Basic research vs. evaluation

Purpose
• Basic research: discovery of new knowledge
• Evaluation: inform decisions, clarify options, reduce uncertainties, and provide information about programs and policies

• Basic research: adds to an existing body of knowledge
• Evaluation: studies the effectiveness with which existing knowledge is used to inform and guide practical action

• Basic research: conclusions have an empirical aspect
• Evaluation: conclusions encompass both an empirical aspect (that something is the case) and a normative aspect (a judgment about the value of something)

• Basic research: assesses merit
• Evaluation: assesses both merit (absolute quality) and worth

• Basic research: aimed at truth
• Evaluation: aimed at action

Methods
• Basic research: theory-driven; some form of experimental research design is essential
• Evaluation: theory-driven, though pragmatic concerns outweigh theoretical considerations when selecting methods; experimental research design not essential

Page 11

Merit and Worth

• Evaluation looks at the merit and worth of an evaluand (the project, program, or other entity being evaluated)

• Merit is the absolute or relative quality of something, either intrinsically or in regard to a particular criterion

• Worth is an outcome of an evaluation and refers to the evaluand’s value in a particular context. This is more extrinsic.

• Worth and merit are not dependent on each other.

Page 12

Merit and Worth

• A medication review program has merit if it is proven to reduce known risk for falls
– It also has value/worth if it saves the health system money
• An older driver safety program has merit if it is shown to increase confidence among drivers over 80 years of age
– Its value is minimal if it results in more unsafe drivers on the road and increases risk and cost to the community at large.

Page 13

Evaluation vs research

• In evaluation, politics and science are inherently intertwined.
– Evaluations are conducted on the merit and worth of programs in the public domain
  • which are themselves responses to prioritized needs that resulted in political decisions
– Program evaluation is intertwined with political power and decision making about societal priorities and directions (Greene, 2000, p. 982).

Page 14

Formative evaluation

Purpose: ensure a successful program. Includes:
1. Developmental Evaluation (pre-program)
• Needs Assessment – match needs with appropriate focus and resources
• Program Theory Evaluation / Evaluability Assessment – clarity on theory of action, measurability, against what criteria
– Logic Model – ensures aims, processes and evaluations are linked logically
• Community/organization readiness
• Identification of intended users and their needs
• etc.

Page 15

Surveillance, Planning and Evaluating for Policy and Action: PRECEDE-PROCEED MODEL*

[Diagram: PRECEDE-PROCEED model phases, as labelled on the slide]
• Phase 1: Social assessment – quality of life
• Phase 2: Epidemiological assessment – health
• Phase 3: Behavioral & environmental assessment – behavior, environment
• Phase 4: Educational & ecological assessment – predisposing, reinforcing, enabling factors
• Phase 5: Administrative & policy assessment – public health program (health education; policy, regulation, organization)
• Phase 6: Implementation – input, process, output
• Phase 7: Process evaluation
• Phase 8: Impact evaluation – short-term impact, short-term social impact
• Phase 9: Outcome evaluation – longer-term health outcome, long-term social impact

*Green & Kreuter, Health Promotion Planning, 4th ed., 2005.

Page 16

Formative evaluation

Purpose: ensure a successful program
2. Process Evaluation – all activities that evaluate the program once it is running
• Program Monitoring
– Implemented as designed, or analysing/understanding why not
– Efficient operations
– Meeting performance targets (Outputs in the logic model)

Page 17

Page 18

Summative evaluation

Purpose: determine program success in many different dimensions. Also called:
• Effectiveness evaluation
• Outcome/Impact evaluation
• Examples
– Policy evaluation
– Replicability/exportability/transferability evaluation
– Sustainability evaluation
– Cost-effectiveness evaluation

Page 19

Evaluation Science

• Social research methods
• Match research methods to the particular evaluation questions – and specific situation
• Quantitative data collection involves:
– identifying which variables to measure
– choosing or devising appropriate research instruments
  • reliable and valid (a reliability check sketch follows below)
– administering the instruments in accordance with general methodological guidelines.
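The “reliable and valid” point above is usually checked quantitatively before an instrument is used. As a minimal, hypothetical sketch (not from the original slides), the Python snippet below computes Cronbach’s alpha, a common internal-consistency reliability statistic; the function name and the three-item questionnaire data are invented for illustration.

```python
# Hypothetical sketch: internal-consistency reliability of a questionnaire
# scale via Cronbach's alpha, using NumPy only.  All data are made up.
import numpy as np

def cronbach_alpha(items):
    """items: respondents x items matrix of scores for one scale."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Five respondents answering a three-item satisfaction scale (1-5 Likert).
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
]
print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")  # high here; the items move together
```

Values above roughly 0.7 are conventionally read as acceptable reliability, though the appropriate threshold depends on the stakes of the evaluation.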

Page 20

Experimental Design in Evaluation
• Randomized controlled trial (RCT)
– Robust science, internal validity
1. Pre/post-test with equivalent groups (an analysis sketch follows below)
R  O1  X  O2
R  O1      O2
2. Post-test only with equivalent groups
R  X  O2
R     O2
Problems with natural settings:
• Randomization
• Ethics
• Implementation not controlled (staff, situation)
• Participant demands
• Perceived inequity between groups
• etc.
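As a minimal, hypothetical illustration (not part of the original slides) of how design 1 above (R O1 X O2 / R O1 O2) is commonly analysed, the Python sketch below compares gain scores (O2 − O1) between the randomized program and control groups with an independent-samples t-test; all scores are invented example data, and SciPy is an assumed dependency.

```python
# Hypothetical sketch: analysing a randomized pre/post-test design by
# comparing gain scores (O2 - O1) across groups.  All numbers are made up.
import numpy as np
from scipy import stats

program_pre  = np.array([12, 15, 11, 14, 13, 16])   # O1, program group
program_post = np.array([18, 20, 15, 19, 17, 22])   # O2, program group
control_pre  = np.array([13, 14, 12, 15, 11, 16])   # O1, control group
control_post = np.array([14, 15, 13, 16, 12, 17])   # O2, control group

program_gain = program_post - program_pre
control_gain = control_post - control_pre

effect = program_gain.mean() - control_gain.mean()  # estimated program effect
t_stat, p_value = stats.ttest_ind(program_gain, control_gain)
print(f"Estimated effect = {effect:.1f} points (t = {t_stat:.2f}, p = {p_value:.3f})")
```

Randomization is what licenses reading this difference as the program’s effect; the quasi-experimental designs on the next slide lose that guarantee.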

Page 21

Experimental Design in Evaluation

• Quasi-experimental design
– Randomization not possible:
  • Ethics
  • Program underway
  • No reasonable control group
1. One group post-test
X  O2
• Weakest design, so use for exploratory, descriptive work
• Case study. Not for attribution.
2. One group pretest-posttest
O1  X  O2
• Can measure change
• Can’t attribute the change to the program
3. Pre-post non-equivalent (non-random) groups – good, but must construct similar comparators by (propensity) matching individuals or groups (a matching sketch follows below)
N  O1  X  O2
N  O1      O2
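Design 3 above stands or falls on constructing a credible comparison group. The Python sketch below is a minimal, hypothetical illustration (not from the original slides) of nearest-neighbour propensity-score matching followed by a difference-in-differences estimate of the program effect; the covariates, the simulated data, and the scikit-learn dependency are all assumptions made for the example.

```python
# Hypothetical sketch: match untreated comparators to program participants on a
# propensity score, then estimate the effect as a difference-in-differences.
# All data are simulated; the "true" simulated program effect is 5 points.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
age = rng.normal(70, 8, n)
baseline = rng.normal(50, 10, n)
# Older people with lower baseline scores are more likely to join the program.
join_prob = 1 / (1 + np.exp(-(0.05 * (age - 70) - 0.05 * (baseline - 50))))
treated = rng.random(n) < join_prob
followup = baseline + 2 + 5 * treated + rng.normal(0, 5, n)

# 1. Estimate each person's propensity to be in the program from covariates.
X = np.column_stack([age, baseline])
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# 2. Match each participant to the nearest non-participant on that score.
treated_idx = np.where(treated)[0]
control_idx = np.where(~treated)[0]
matches = [control_idx[np.argmin(np.abs(ps[control_idx] - ps[i]))]
           for i in treated_idx]

# 3. Difference-in-differences on the matched sample.
did = ((followup[treated_idx] - baseline[treated_idx]).mean()
       - (followup[matches] - baseline[matches]).mean())
print(f"Estimated program effect = {did:.1f} (simulated true effect: 5)")
```

Matching only adjusts for the covariates used to build the score; unmeasured differences between the groups remain a threat to attribution.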

Page 22

Evaluation Methods (Clarke and Dawson, 1999)

• Strict adherence to a method deemed to be ‘strong’ may result in the wrong problems becoming the focus of the evaluation
– purely because the right problems are not amenable to analysis by the preferred method
• Rarely is only one method used
– A range is required to ensure the depth and detail from which conclusions can be drawn

Page 23

Experimental Design in Evaluation

• Criticism of experimental design in evaluation
– Program is a black box
  • ED measures causality (positivist)
  • Does not capture the nature of causality (realist)
– Internal dynamics of the program are not observed
  • How does the program work?
  – Theory helps explain
  • What are the characteristics of those in the program?
  – Participants need to choose to make a program work
  – The right conditions are needed to make this possible (Clarke and Dawson, 1999)
– What are the unintended outcomes/effects of the program?

Page 24

Naturalistic Inquiry - Qualitative design

• Quantitative (ED) offers little insight into the social processes which actually account for the changes observed

• Can use naturalistic methods to supplement quantitative techniques (mixed methods)

• Can use fully naturalistic paradigm
– Less common

Page 25

Naturalistic Inquiry

– Interpretive:
  • People mistake their own experiences for those of others. So…
  • Emphasis on understanding the lived experiences of (intended) program recipients
– Constructivist:
  • Knowledge is constructed (not discovered by detached scientific observation). So…
  • Program can only be understood within its natural context
  – How it is being experienced by participants, staff, policy makers
  – Can’t construct the evaluation design ahead of time: “don’t know what you don’t know”
  – Theory is constructed from (grounded in) data

Page 26

Evaluation Data

• Quantitative
• Qualitative
• Mixed
• Primary
• Secondary
• One-off surveys, data pulls
• Routine monitoring
• Structured
• Unstructured (open-ended)

Page 27

Data Collection for Evaluation

• Questionnaires
– right targets
– carefully constructed: capture the needed info, wording, length, appearance, etc.
– analysable
• Interviews (structured, semi-, un-)
– Individuals
– Focus groups
• Useful at planning, formative and summative stages of a program

Page 28

Data Collection for Evaluation

• Observation
– Systematic
  • Explicit procedures, therefore replicable
  • Collect primary qualitative data
– Can provide new insights
  • drawing attention to actions and behaviour normally taken for granted by those involved in program activities, and therefore not commented upon in interviews
– Circumstances in which it may not be possible to conduct interviews

Page 29

Data Collection for Evaluation

• Documentary
– Solicited, e.g. journals/diaries
– Unsolicited, e.g. meeting minutes, emails, reports
– Public, e.g. organization’s reports, articles in newspapers/letters
– Private, e.g. emails, journals

Page 30

Page 31

Evaluation in Logic Models

• Look at the Logic Model Template (next slide)
• What types of evaluation do you see?
• What methods are implied?
• What data could be used?

Page 32

From Logic Model presentation

Page 33

Evaluation in Business Case

• Look at the handout: Ontario Ministry of Agriculture, Food and Rural Affairs (OMAFRA)
• Where do you see evaluation?
• What methods are implied?
• What data could be used?