
What is evaluation, and why is it needed ?

Prof. Jeremy Wyatt DM FRCP, Director

Acknowledgments:

• Doug Altman & David Spiegelhalter

• Chuck Friedman & Trish Greenhalgh

International Digital Lab, Warwick Uni.

What is digital healthcare ?

“The redesign of care pathways, health services and systems supported by appropriate digital technologies”

What is evaluation ?

Describing or measuring something, usually with:
– a person or group (the audience) in mind
– a purpose – making a decision, answering a question…

Implies a set of criteria and judgements to be made. May just be data collection and analysis.

Evaluation as an information-generating cycle

1. Question
2. Design a study
3. Collect data, analyse results
4. Make decision

Why evaluate ?

1. To learn as we go along [formative]
2. To ensure our systems are safe & effective, solve more problems than they create
3. To inform decisions made by others [summative]
4. To publish, add to the evidence base
5. To account for money spent (cover our backs)
6. To persuade stakeholders: health professionals, politicians, organisations, patients…

Some of the evaluation stakeholders

[Diagram: stakeholders surrounding the health information system]
– System developers
– System suppliers
– System purchaser
– System users
– Patients
– Friends & family
– Tax payers
– Professional bodies
– Trade associations
– Evaluators
– Academic peers
– Evaluation funder
– Users of similar systems
– Regulators

Evaluation principles

The stakeholders ask & prioritise the questions; evaluators formalise them.

The methods used depend on the question & on the required reliability of answer (not on the technology):
– Qualitative methods describe perceptions, barriers, needs, why things (don’t) work, teams, relationships...
– Quantitative methods measure how much, how often, eg. data quality, system usage, changes in clinical actions, patient outcomes…

Challenge: titrating the evaluation methods to the resources available & the required reliability of answer.

Kinds of evaluation study

[Taxonomy diagram] Evaluation studies divide into qualitative studies and quantitative studies; quantitative studies comprise measurement studies (reliability and validity studies) and demonstration studies (descriptive, correlational and comparative studies).

Objectivist mind set

Real objects exist in the world. These objects have real properties that we can measure or infer from measurements.

An observer can measure these properties without affecting the object. The result of this should be independent of the observer.

Different observers of these objects should agree on the properties – and whether they are good or right.

The better the measurement method, the greater the agreement between observers.

[Image: Latin square stained glass – David Spiegelhalter]

Subjectivist mind set

Real objects exist in the world – but so do important constructs such as organisations, teams, personalities…

Some of these objects have real properties that we can measure, but for others, we can only attempt a rich description.

Sometimes an observer can measure these properties without affecting the object. It’s fine for the result to be dependent on the observer.

Since every observer is different, different observers of these objects will rarely agree on the properties – but the differences will be illuminating.

The better the description, the greater the understanding.

The sociology of research & evaluation

[Diagram] Evaluation study findings enter a society of competing groups:
– Advocates vs opponents
– Risk takers vs regulators
– Innovators vs laggards
– Vested interests vs the media
– Practitioners vs academics
– Public sector vs private sector

Relationships between research and policy – Carol Weiss 1979

1. Knowledge driven – new evidence leads to policy change
2. Problem driven – policy question drives search for evidence / commissioned research
3. Interactive search – with many kinds of evidence and opinions
4. Political model – evidence used to justify entrenched views
5. Tactical model – research as delaying tactic, to shirk responsibility
6. Enlightenment model – policy changes not due to specific results but to osmosis of methods & models
7. Research as part of intellectual enterprise – research & policy both respond to current concerns

[Diagram: EBM / technology appraisal model → practice guidelines → policy decisions ?]

Social constructivism and evaluation

We all / society construct reality. There are many possible interpretations, all equally valid. Context is all (well, very important).

The focus of evaluation should therefore be on interpretation & understanding, not on truth or prediction. Ethnography is a means to elicit and explore this.

Links to post-modernism & de-constructivism in literary criticism, architecture etc. – Derrida, Gehry…

[Image: Gehry’s house, Santa Monica]

Realist evaluation – Ray Pawson

[Diagram: old care pathway + information system → transformation → new care pathway]

– How did this happen ?
– What helped this to happen ?
– Who made this happen ?
– What had to change ?
– What evidence informed it ?
– What are the benefits ?
– Who benefits, who doesn’t ?
– What works, for whom, in what context ?
– How to avoid mistakes next time ?

The evaluator’s mindset

Evaluation, like politics, is the art of the possible. Have realistic goals: aim to be informative, not definitive.

Tailor the study to the problem and collect information to address stakeholder questions. Exploit opportunities in the lab and in the field.

Be:
– Focused (always have a plan)
– Open (to intended & unintended effects)
– Flexible (ie. prepared to change your plan)

Step wedge design

An RCT in which each unit is randomly allocated to cross over to the intervention early or at a later, randomly chosen time. Useful if the intervention is in short supply – the only fair way to allocate it is a lottery ! (A minimal allocation sketch follows below.)

Eg. impact study of a HIS in 28 hospitals in Limpopo province, South Africa in 2002:
– Half randomised to early implementation [but the copper cable linking some to the data centre was stolen – 6 times !]
– Half randomised to late implementation [but the chief execs of some persuaded the HIS team to implement earlier]

Littlejohns & Wyatt. Evaluating computerised health information systems: hard lessons still to be learnt. BMJ 2003
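To make the lottery concrete, here is a minimal sketch – my illustration, not part of the Limpopo study – of randomising 28 hospital units to early or late implementation. The hospital names and seed are invented; a full step wedge would usually roll out in several randomised steps rather than two, but the principle of randomising when each unit crosses over is the same.

```python
import random

# Hypothetical illustration of a two-group, step-wedge style allocation:
# every hospital eventually gets the health information system, but the
# crossover time (early vs late implementation) is decided by lottery.
hospitals = [f"Hospital {i:02d}" for i in range(1, 29)]  # 28 units, as in Limpopo

random.seed(2002)          # invented seed, only to make the example reproducible
random.shuffle(hospitals)  # the lottery

early = sorted(hospitals[:14])  # randomised to early implementation
late = sorted(hospitals[14:])   # randomised to late implementation

print("Early implementation:", early)
print("Late implementation: ", late)
```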

Oncocin clinical workstation – Stanford 1980s

Did Oncocin improve data quality ?

[Bar chart: percent complete toxicity data and symptom data, before Oncocin vs with Oncocin; completeness was higher with Oncocin – Kent & Shortliffe 1986]
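As an aside on how a “percent complete data” outcome like the one in this chart might be computed, here is a small sketch; the record structure and values are invented for illustration and are not from Kent & Shortliffe.

```python
# Invented patient records; None means the item was never documented.
records = [
    {"toxicity": "grade 2", "symptoms": "nausea"},
    {"toxicity": None,      "symptoms": "fatigue"},
    {"toxicity": "grade 1", "symptoms": None},
    {"toxicity": "grade 0", "symptoms": "none reported"},
]

def percent_complete(records, field):
    """Percentage of records in which `field` was actually documented."""
    filled = sum(1 for r in records if r.get(field) is not None)
    return 100 * filled / len(records)

for field in ("toxicity", "symptoms"):
    print(f"{field}: {percent_complete(records, field):.0f}% complete")
```

Note that the answer depends entirely on how “complete” is defined – exactly the measurement artefact raised in response 2 on the next slide.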

Possible responses

1. Yes, this is the intended benefit
– Oncocin required data before the doctor could prescribe; other toxicity data entered from lab reports

2. No, it’s an artefact of measurement methods
– Easier to check if data are complete in a database than in a paper record
– Definition of “complete data” changed (for paper records, no mention = no toxicity present)

3. No, an indirect impact via changes in staff
– New staff coincided with introduction of Oncocin
– Hawthorne effect, stimulated by presence of Oncocin in clinic
– Feedback of baseline results raised motivation

4. Numerous other possible explanations:
– Legal case, poor data quality, letter from chief executive
– New, toxic drug introduced
– Chance effect: small numbers…

Local, specific questions versus generic questions

Answering local, specific questions is relatively easy: “What happened in our clinic after this system was tailored to our requirements and used?”

Answering generic questions is much harder: “What is likely to happen in similar clinics when systems like this are used?”

(Recall that different questions require different evaluation methods)

Does telehealth work ?

UK Whole System Demonstrators – covered 6000 patients with heart failure, COPD and diabetes.

“If used correctly”, telehealth reduced:
– Death rates by 45%
– NHS resource usage by 15-20%
– NHS tariff costs by 8%

[But how many people used it “correctly” ?]

Avoiding undue optimism

Ioannidis (JAMA 2005) criticises clinical optimists and

innovators for their breathless excitement about the

positive results of drug trials - many of which are later

contradicted

This also applies to telehealth:

– Cochrane review (Inglis 2010) concluded that telehealth in

heart failure reduced death rate by 34%

– Now, including several large negative trials, this figure is

halved to 18%.

– Even this could be optimistic, as there is evidence that some

negative trials were never published

37 RCTs measuring the impact of telehealth on heart failure mortality: RR 0.82 (95% CI 0.74-0.92)

Work by Dr Shiva Sathanandam MB MPH
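As a worked aside (my arithmetic, not from the slides): a pooled relative risk of 0.82 corresponds to the 18% relative reduction in mortality quoted above, because the relative risk reduction is 1 - RR. The event counts below are invented purely to make the numbers come out near 0.82.

```python
def relative_risk(events_tx, n_tx, events_ctrl, n_ctrl):
    """Risk of death in the telehealth arm divided by risk in the usual-care arm."""
    return (events_tx / n_tx) / (events_ctrl / n_ctrl)

# Invented counts chosen so the relative risk comes out near the pooled estimate.
rr = relative_risk(events_tx=82, n_tx=1000, events_ctrl=100, n_ctrl=1000)
print(f"RR = {rr:.2f}")                           # 0.82
print(f"Relative risk reduction = {1 - rr:.0%}")  # 18%
```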

Key telehealth study questions

Not “Does telehealth work ?” But:
– Who wants telehealth ?
– Who engages with telehealth ?
– Who benefits from it ?
– In what care pathways is it cost effective?

Why bother with evaluation – can’t we just predict the results ?

No, the real world is too messy / complex:

– Bike Ed – a carefully designed training campaign for boys – doubled injury risk (Carlin J 1998)
– Weekly exercise programme for nurses to reduce back problems – no reduction, and it interfered with work planning (Skargren E 1999)
– Toughened glass tankards to reduce alcohol-related injuries – a randomised trial in 57 bars showed a 60% rise (the tankards shattered more – Warburton A 2000)

Source: Anne Oakley, BMJ 1999

Conclusions

1. We expect too much of studies:
– to be definitive, pass ultimate judgment
– to tell us exactly what to do
– to appeal to every stakeholder

2. Successful evaluations help inform decisions by the identified stakeholders for whom the study is performed – but do not dictate them

3. Rarely is a single study – even a randomised trial – definitive

What is telehealth for ?

– To help clinicians remotely monitor patients with a long term condition (LTC) and deliver advice & care to them at home ?
– To empower patients to monitor and self manage their LTC ?
– To help patients better navigate the health system & negotiate access to resources using data about their LTC ?

The telehealth fallacy

“Technology is the solution”

No, technology is just a channel to deliver a safe, effective self-care package tailored to support people with LTCs.

The problem is, few self-care packages are sufficiently well-defined to be replicated and tested – let alone known to be effective.

Why evaluate health information systems?

– To describe, clarify and understand system problems & inform system design
– To assess if we built our system right (verification)
– To check if the system we built is alleviating the intended problems (validation)
– To provide good quality evidence for policy makers
– To describe & understand unexpected events after the system is installed

What is the IDH (Institute of Digital Healthcare) ?

A 5-year partnership between the NHS Midlands & East cluster & Warwick University to promote excellent R&D in – and NHS uptake of – appropriate digital healthcare.

Emphasis on both eHealth innovation and new technology development. An incubator, demonstrator, facilitator & network hub.

Currently 8 academic and research staff & 7 PhD students. A new Masters in Digital Healthcare with clinical & engineering tracks starts in October.