SCREENING IN CLINICAL RESEARCH
DR TAN TOH LEONG
SENIOR LECTURER & EMERGENCY PHYSICIAN
MINGGU PENYELIDIKAN KE-18 UKM (2016)
ISSUES IN ASSESSING CLINICAL TESTS
• VALIDITY OF THE TEST – HOW WELL THE TEST IDENTIFIES THE
SICK AND THE HEALTHY INDIVIDUALS
• RELIABILITY – HOW STABLE THE RESULTS OF THE TEST ARE
• EFFICIENCY AND COST-EFFECTIVENESS
ISSUES OF MEASUREMENTS
• VALIDITY/ACCURACY
  • CONTENT
  • CONSTRUCT
  • CRITERION
• RELIABILITY/REPRODUCIBILITY
• RANGE
CONTENT VALIDITY
• DEFINITION: THE EXTENT TO WHICH A PARTICULAR METHOD OF
MEASUREMENT INCLUDES ALL OF THE DIMENSIONS OF THE CONSTRUCT ONE
INTENDS TO MEASURE AND NOTHING MORE.
• E.G. A SCALE FOR MEASURING PAIN WOULD HAVE CONTENT VALIDITY IF IT
INCLUDES QUESTIONS ABOUT ACHING, THROBBING, BURNING, AND STINGING
(THE NATURE OF PAIN) BUT NOT ABOUT PRESSURE, ITCHING, NAUSEA,
TINGLING AND THE LIKE.
CONSTRUCT VALIDITY
• DEFINITION: PRESENT TO THE EXTENT THAT THE MEASUREMENT IS
CONSISTENT WITH OTHER MEASUREMENTS OF THE SAME PHENOMENON.
• E.G. THE RESEARCHER MIGHT SHOW THAT RESPONSES TO A SCALE
MEASURING PAIN ARE RELATED TO OTHER MANIFESTATIONS OF THE SEVERITY
OF PAIN SUCH AS SWEATING, MOANING, WRITHING, AND ASKING FOR PAIN
MEDICINE (THE ASSOCIATED SYMPTOMS OF PAIN).
CRITERION VALIDITY
• DEFINITION: PRESENT TO THE EXTENT THAT THE MEASUREMENT PREDICTS A
DIRECTLY OBSERVABLE PHENOMENON.
• E.G. ONE MIGHT SEE IF RESPONSES ON THE SCALE MEASURING PAIN BEAR A
PREDICTABLE RELATIONSHIP TO PAIN OF KNOWN SEVERITY: MILD PAIN FROM
A MINOR ABRASION, MODERATE PAIN FROM AN ORDINARY HEADACHE AND SEVERE
PAIN FROM RENAL COLIC.
RELIABILITY
• DEFINITION: THE EXTENT TO WHICH REPEATED MEASUREMENTS OF A STABLE
PHENOMENON, BY DIFFERENT PEOPLE AND INSTRUMENTS, AT DIFFERENT TIMES
AND PLACES, GIVE SIMILAR RESULTS.
• OTHER TERMS: REPRODUCIBILITY AND PRECISION.
• E.G. THE RELIABILITY OF LABORATORY MEASUREMENTS IS ESTABLISHED BY REPEATED
MEASUREMENT; FOR EXAMPLE, OF THE SAME SERUM OR TISSUE SPECIMEN, SOMETIMES BY
DIFFERENT PEOPLE AND WITH DIFFERENT INSTRUMENTS.
• E.G. THE RELIABILITY OF SYMPTOMS CAN BE ESTABLISHED BY SHOWING THAT THEY
ARE SIMILARLY DESCRIBED TO DIFFERENT OBSERVERS UNDER DIFFERENT CONDITIONS.
RANGE
• DEFINITION: THE RANGE OF LEVELS THAT A DEVICE IS ABLE TO DETECT
• E.G. AN INSTRUMENT MAY NOT REGISTER VERY LOW OR VERY HIGH VALUES OF
THE THING BEING MEASURED, LIMITING THE INFORMATION IT CONVEYS.
THUS THE “FIRST GENERATION” METHOD OF MEASURING SERUM
THYROID-STIMULATING HORMONE (TSH) WAS NOT USEFUL FOR DIAGNOSING
HYPERTHYROIDISM OR FOR PRECISE TITRATION OF THYROXINE
ADMINISTRATION BECAUSE THE METHOD COULD NOT DETECT LOW LEVELS
OF TSH.
ASSESSING THE VALIDITY AND RELIABILITY OF CLINICAL TEST OR
RESEARCH
HOW TO MEASURE?
• VALIDITY/ACCURACY – SENSITIVITY & SPECIFICITY
• RELIABILITY
  • QUALITATIVE
    • KAPPA ANALYSIS
    • AC-1 ANALYSIS
  • QUANTITATIVE
    • CRONBACH'S ALPHA ANALYSIS
    • BLAND-ALTMAN ANALYSIS
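For qualitative (categorical) ratings by two observers, kappa compares the observed agreement with the agreement expected by chance. The following is a minimal Python sketch of Cohen's kappa, not a full reliability analysis; the function name `cohen_kappa` and the ten X-ray readings are hypothetical and made up for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of subjects given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used the same single label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical readings of ten chest X-rays ("abn" = abnormal, "nor" = normal).
reader_1 = ["abn", "abn", "abn", "abn", "nor", "nor", "nor", "nor", "nor", "nor"]
reader_2 = ["abn", "abn", "abn", "nor", "abn", "nor", "nor", "nor", "nor", "nor"]

kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")  # observed 0.80, chance 0.52, kappa = 0.58
```

Here the two readers agree on 8 of 10 films, but 52% agreement would be expected by chance alone, so kappa is noticeably lower than the raw agreement.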
SCREENING
• PRESUMPTIVE IDENTIFICATION OF UNRECOGNIZED DISEASE
• USES A TEST OR EXAMINATION
• CAN BE APPLIED RAPIDLY
• NOT DIAGNOSTIC
• DONE ON APPARENTLY WELL PERSONS
• USED TO SCREEN FOR RISK FACTORS OR EARLY DISEASE
DIAGNOSTIC VS SCREENING TESTS
• TESTS - HISTORY, PHYSICAL EXAMINATION, QUESTIONNAIRE, LAB.
TEST
• DIAGNOSTIC – MAKING DIAGNOSIS AND LEADING TO TREATMENT
• SCREENING – PRESUMPTIVE IDENTIFICATION OF DISEASE OR RISK
FACTORS AMONG APPARENTLY HEALTHY INDIVIDUALS.
STEP BY STEP APPROACH TO SCREENING AND DIAGNOSTIC
RESEARCH.
1. CHOOSE A TOOL: HISTORY TAKING, PHYSICAL EXAMINATION OR
LAB. TEST
2. KNOW WHETHER THE DATA ARE NORMALLY DISTRIBUTED (PARAMETRIC) OR NOT
(NON-PARAMETRIC), AND THE CUT-OFF POINT TO CLASSIFY DISEASE &
NON-DISEASE
3. “GOLD STANDARD” – MOST SPECIFIC/ DEFINITIVE OUTCOME
4. DETERMINE THE VALIDITY OF THE TESTS (SENSITIVITY, SPECIFICITY &
PREDICTIVE VALUES.)
TEST FOR NORMALITY
• THE KOLMOGOROV-SMIRNOV TEST, THE ANDERSON-DARLING TEST,
AND THE SHAPIRO-WILK TEST.
• IF THE TEST IS STATISTICALLY SIGNIFICANT (E.G., P<0.05), THEN
DATA DO NOT FOLLOW A NORMAL DISTRIBUTION, AND A
NONPARAMETRIC TEST IS WARRANTED.
WHAT IS NORMALITY
EVEN HARDER TO DEFINE THAN ABNORMALITY, BUT THE FOLLOWING CHARACTERISTICS ARE USUALLY
PRESENT:
• APPROPRIATE PERCEPTION OF REALITY; REALISTIC IN ASSESSING ACTIONS OF SELF AND
OTHERS AND OWN CAPABILITIES, AND INTERPRETING WHAT IS GOING ON.
• ABILITY TO EXERCISE VOLUNTARY CONTROL OVER BEHAVIOUR; ABLE TO RESTRAIN
(AGGRESSIVE / SEXUAL) URGES - IF FAIL TO CONFORM TO SOCIAL NORMS IT IS A RESULT OF
CHOICE NOT LACK OF CONTROL.
• SELF-ESTEEM AND ACCEPTANCE; COMFORTABLE WITH OTHERS AND CONFIDENT OF SELF
WORTH - FEEL ACCEPTED BY SOCIETY AND VALUED.
• ABILITY TO FORM AFFECTIONATE RELATIONSHIPS; SENSITIVE TO NEEDS OF OTHERS AND NOT
SELF-CENTRED IN THE EXTREME.
• PRODUCTIVITY; HAVE ENERGY AND ENTHUSIASM FOR LIFE.
DEFINING ABNORMALITY
THERE IS NO GENERAL AGREEMENT BUT MOST ATTEMPTS ARE BASED ON ONE OR MORE
(AND USUALLY ALL) OF THE FOLLOWING:
• DEVIATION FROM STATISTICAL NORMS; BASED ON STATISTICAL FREQUENCY, WHERE ABNORMAL IS
STATISTICALLY INFREQUENT. BUT IS BEING EXTREMELY HAPPY ABNORMAL BEHAVIOUR?
• DEVIATION FROM SOCIAL NORMS; NOT FOLLOWING CERTAIN STANDARDS OR NORMS ACCEPTABLE
IN PARTICULAR SOCIETY. BUT CULTURES AND SOCIETIES DIFFER IN WHAT IS NORMAL, AND THE
CONCEPT OF ABNORMALITY CHANGES OVER TIME WITHIN ONE CULTURE.
• MALADAPTIVENESS OF BEHAVIOUR; HAS ADVERSE EFFECTS ON THE INDIVIDUAL AND/OR ON SOCIETY
(E.G. PHOBIAS, ADDICTIONS, EXTREME AGGRESSION ETC).
• PERSONAL DISTRESS; INDIVIDUAL’S SUBJECTIVE FEELINGS OF DISTRESS RATHER THAN BEHAVIOUR.
MOST PEOPLE DIAGNOSED AS MENTALLY ILL ARE ACUTELY MISERABLE.
CUT-OFF POINT?
• FOR ORDINAL/NUMERICAL DATA:
WHEN DOES NORMAL LEAVE OFF AND ABNORMAL BEGIN?
• FOR EXAMPLE:
WHEN DOES A LARGE NORMAL PROSTATE BECOME TOO LARGE TO BE
CONSIDERED NORMAL?
RECEIVER OPERATING CHARACTERISTIC (ROC), OR ROC CURVE
• AIMS TO IMPROVE THE ‘SIGNAL-TO-NOISE’ RATIO
• USED FOR CHOOSING A CUT-OFF POINT FOR CONTINUOUS DATA
• PORTRAYS THE TRADE-OFF BETWEEN IMPROVING EITHER THE SENSITIVITY
OR THE SPECIFICITY OF A TEST
• SELECT SEVERAL CUT-OFF POINTS AND CALCULATE THE SENSITIVITY AND SPECIFICITY AT EACH
• PLOT THE SENSITIVITY (TRUE POSITIVE RATE) AGAINST 1-SPECIFICITY (FALSE
POSITIVE RATE)
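The plotting steps above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `roc_points` and `auc_trapezoid`, the `low_is_positive` flag and the ten MCH-like values are assumptions made for the example, not data from the cited study. A low value is treated as test-positive, as is usual for MCH in thalassaemia screening.

```python
def roc_points(values, diseased, low_is_positive=False):
    """Return (1 - specificity, sensitivity) pairs, one per candidate cut-off."""
    n_pos = sum(diseased)
    n_neg = len(diseased) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both diseased and non-diseased subjects")
    points = [(0.0, 0.0)]  # strictest cut-off: nobody is called positive
    # Walk from the strictest to the most lenient cut-off.
    for cutoff in sorted(set(values), reverse=not low_is_positive):
        tp = fp = 0
        for value, sick in zip(values, diseased):
            positive = value <= cutoff if low_is_positive else value >= cutoff
            if positive and sick:
                tp += 1
            elif positive:
                fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical MCH values (pg); 1 = carrier, 0 = non-carrier.
mch = [20, 22, 23, 25, 26, 24, 27, 28, 29, 30]
carrier = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

points = roc_points(mch, carrier, low_is_positive=True)
auc = auc_trapezoid(points)
print(f"AUC = {auc:.2f}")  # AUC = 0.92
```

Each pair in `points` is one position on the curve; plotting them in order gives the ROC curve, and the trapezoidal sum gives the area reported alongside it.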
[Figure: ROC curve for mean corpuscular haemoglobin (MCH) values in the screening for thalassaemia carriers (Hasniah et al 2002). Y-axis: sensitivity (true positives), 0 to 1.00; x-axis: 1 - specificity (false positives), 0 to 1.00; points on the curve labelled (21), (24), (27), (28), (29); area = .90]
INTERPRETING THE ROC CURVE
• A PERFECT TEST WILL HAVE A TRUE POSITIVE RATE OF 1.0 AND A FALSE POSITIVE RATE OF 0.0,
I.E. A CURVE THAT FILLS THE SQUARE (AREA = 1.00)
INTERPRETING THE ROC CURVE
• A NON-INFORMATIVE CURVE OCCURS WHEN THE TRUE POSITIVE AND FALSE POSITIVE RATES
ARE EQUAL (AREA = 0.50)
INTERPRETING THE ROC CURVE
• CHOOSING A CUT-OFF POINT
  • THE BEST CUT-OFF WILL BE AT THE POINT WHERE THE CURVE TURNS (THE ‘SHOULDER’ NEAR THE TOP-LEFT CORNER)
  • ALSO CONSIDER THE IMPLICATIONS OF THE TWO POSSIBLE ERRORS (MINIMIZING FALSE POSITIVES OR FALSE NEGATIVES)
• AREA UNDER THE CURVE
  • CAN BE USED TO COMPARE TWO TESTS
  • AN AREA OF 0.90 MEANS THAT A RANDOMLY CHOSEN DISEASED SUBJECT HAS A MORE ABNORMAL TEST VALUE THAN A RANDOMLY CHOSEN NON-DISEASED SUBJECT 90% OF THE TIME
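One common way to make "the point where the curve turns" concrete is Youden's index, J = sensitivity + specificity - 1, taking the cut-off with the largest J. Youden's index is not named in these slides and is used here as an assumption; it weights false positives and false negatives equally, so it does not replace a judgement about which error matters more. The function name `best_cutoff_youden` and the data are hypothetical.

```python
def best_cutoff_youden(values, diseased, low_is_positive=False):
    """Return (cutoff, sensitivity, specificity, J) for the cut-off with the largest J."""
    n_pos = sum(diseased)
    n_neg = len(diseased) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both diseased and non-diseased subjects")
    best = None
    for cutoff in sorted(set(values)):
        tp = tn = 0
        for value, sick in zip(values, diseased):
            positive = value <= cutoff if low_is_positive else value >= cutoff
            if sick and positive:
                tp += 1
            elif not sick and not positive:
                tn += 1
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1  # Youden's index for this cut-off
        if best is None or j > best[3]:
            best = (cutoff, sens, spec, j)
    return best


# Hypothetical MCH values (pg); 1 = carrier, 0 = non-carrier; low MCH is test-positive.
mch = [20, 22, 23, 25, 26, 24, 27, 28, 29, 30]
carrier = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

cutoff, sens, spec, j = best_cutoff_youden(mch, carrier, low_is_positive=True)
print(f"cut-off <= {cutoff}: sensitivity {sens:.0%}, specificity {spec:.0%}, J = {j:.2f}")
```

If missing a case is costlier than a false alarm, a more lenient cut-off than the one with the largest J may be preferable, and the reverse when false positives are the greater harm.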
THE VALIDITY OF A TEST RESULT
                 DISEASE STATUS
TEST RESULT      PRESENT                ABSENT
Positive         true positive (TP)     false positive (FP)
Negative         false negative (FN)    true negative (TN)
SENSITIVITY
• PROBABILITY THAT A SICK INDIVIDUAL WILL BE CLASSIFIED AS SICK
• PROPORTION OF SUBJECTS WITH A DISEASE WHO HAVE A POSITIVE
TEST FOR A DISEASE
SENSITIVITY = (Number of sick people who are classified as sick) / (Total number of sick people)
Sensitivity = TP / (TP + FN) x 100%
SPECIFICITY
• PROBABILITY THAT A HEALTHY INDIVIDUAL WILL BE CLASSIFIED AS HEALTHY
• PROPORTION OF SUBJECTS WITHOUT A DISEASE WHO HAVE A NEGATIVE
TEST
SPECIFICITY = (Number of healthy people who are classified as healthy) / (Total number of healthy people)
Specificity = TN / (TN + FP) x 100%
SENSITIVITY AND SPECIFICITY OF BREAST CANCER SCREENING
                              BREAST CANCER
SCREENING TEST (PE & MAMMO)   CANCER CONFIRMED   CANCER NOT CONFIRMED   TOTAL
+VE                           132                983                    1115
-VE                           45                 63650                  63695
TOTAL                         177                64633                  64810
Sensitivity = 132/177 = 74.6%
Specificity = 63650/64633 = 98.5%
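The arithmetic for the breast cancer screening table can be checked with a short Python sketch. The counts are those in the table; the function names `sensitivity` and `specificity` are chosen here for illustration.

```python
def sensitivity(tp, fn):
    """Proportion of diseased subjects with a positive test."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of non-diseased subjects with a negative test."""
    return tn / (tn + fp)


# Counts from the breast cancer screening table.
tp, fp = 132, 983    # screening test positive: cancer confirmed / not confirmed
fn, tn = 45, 63650   # screening test negative: cancer confirmed / not confirmed

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"Sensitivity = {sens:.1%}")  # Sensitivity = 74.6%
print(f"Specificity = {spec:.1%}")  # Specificity = 98.5%
```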
SENSITIVITY AND SPECIFICITY OF SCREENING TEST
                 DISEASE STATUS
SCREENING TEST   POSITIVE   NEGATIVE   TOTAL
+VE              A          B          A + B
-VE              C          D          C + D
TOTAL            A + C      B + D
Sensitivity = A / (A + C)
Specificity = D / (B + D)
SENSITIVITY VS SPECIFICITY
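CHOOSE A HIGHLY SENSITIVE TEST:
• FOR A DANGEROUS BUT TREATABLE DISEASE (TB, SYPHILIS, HODGKIN LYMPHOMA)
• IN THE EARLY WORK-UP (DIFFERENTIAL DIAGNOSIS)
• TO DISCOVER DISEASE (LOW PREVALENCE)
• A VERY SENSITIVE TEST IS MOST USEFUL WHEN THE RESULT IS NEGATIVE
• SN-OUT: A NEGATIVE RESULT ON A SENSITIVE TEST RULES THE DISEASE OUT
CHOOSE A HIGHLY SPECIFIC TEST:
• WHEN A FALSE POSITIVE CAN HARM THE PATIENT (CANCER)
• TO RULE IN OR CONFIRM A DIAGNOSIS
• A VERY SPECIFIC TEST IS MOST USEFUL WHEN THE RESULT IS POSITIVE
• AS A DIAGNOSTIC TEST
• SP-IN: A POSITIVE RESULT ON A SPECIFIC TEST RULES THE DISEASE IN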
PREDICTIVE VALUE OF A TEST
• GIVEN THE RESULT OF THE TEST, WHAT IS THE PROBABILITY THAT THE PERSON
HAS (OR DOES NOT HAVE) THE DISEASE?
• PPV = PROBABILITY OF DISEASE IN A PATIENT WITH A POSITIVE (ABNORMAL)
TEST
• NPV = PROBABILITY OF NOT HAVING THE DISEASE WHEN THE TEST IS
NEGATIVE (NORMAL)
• PPV & NPV ARE DETERMINED BY THE PREVALENCE OF DISEASE AND THE
VALIDITY (SENSITIVITY AND SPECIFICITY) OF THE TEST.
PREDICTIVE VALUE OF THE TEST
                 DISEASE STATUS
SCREENING TEST   POSITIVE   NEGATIVE   TOTAL
+VE              A          B          A + B
-VE              C          D          C + D
TOTAL            A + C      B + D
PPV = A / (A + B)
NPV = D / (C + D)
PPV = (Prev x Sen) / [(Prev x Sen) + (1 - Sp) x (1 - Prev)]
PREDICTIVE VALUE OF A TEST
• PPV = THE LIKELIHOOD THAT A TEST-POSITIVE PERSON ACTUALLY HAS THE DISEASE
• NPV = THE LIKELIHOOD THAT A TEST-NEGATIVE PERSON ACTUALLY DOES NOT HAVE THE DISEASE
• PPV = (PREV X SEN) / [(PREV X SEN) + (1 - PREV) X (1 - SP)]
• NPV = [(1 - PREV) X SP] / [(1 - PREV) X SP + PREV X (1 - SEN)]
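The prevalence-based formulas give the same answer as reading PPV and NPV directly off the 2 x 2 table. The Python sketch below checks this with the breast cancer screening counts given earlier in these slides; the function names `ppv` and `npv` are chosen for illustration, and the final line uses a hypothetical test with 99% sensitivity and specificity at 1% prevalence.

```python
def ppv(prev, sens, spec):
    """Positive predictive value from prevalence, sensitivity and specificity."""
    return (prev * sens) / (prev * sens + (1 - prev) * (1 - spec))


def npv(prev, sens, spec):
    """Negative predictive value from prevalence, sensitivity and specificity."""
    return ((1 - prev) * spec) / ((1 - prev) * spec + prev * (1 - sens))


# Counts from the breast cancer screening table.
tp, fp, fn, tn = 132, 983, 45, 63650
total = tp + fp + fn + tn

prev = (tp + fn) / total  # prevalence in the screened group
sens = tp / (tp + fn)
spec = tn / (tn + fp)

ppv_formula = ppv(prev, sens, spec)
npv_formula = npv(prev, sens, spec)
print(f"PPV = {ppv_formula:.1%} (table: {tp / (tp + fp):.1%})")  # 11.8% both ways
print(f"NPV = {npv_formula:.1%} (table: {tn / (tn + fn):.1%})")  # 99.9% both ways

# Even a very accurate test has a modest PPV when the disease is rare.
print(f"PPV at 1% prevalence, 99% sens/spec = {ppv(0.01, 0.99, 0.99):.1%}")  # 50.0%
```

Although the screening test is 98.5% specific, only about one in eight screen-positive women has confirmed cancer, because the disease is uncommon in the screened group.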
PREDICTIVE VALUE OF A TEST
• HIGH PPV = SCREENING PERFORMANCE SATISFACTORY
• LOW PPV = THE TEST HAS POOR SPECIFICITY AND/OR THE PREVALENCE IS LOW
PPV (%) BY PREVALENCE AND BY SENSITIVITY AND SPECIFICITY
                 SENSITIVITY %    99      95      90      80
PREVALENCE (%)   SPECIFICITY %    99      95      90      80
20                                96.1    82.6    69.2    50.0
10                                91.7    67.9    50.0    30.8
5                                 83.9    50.0    32.1    17.4
1                                 50.0    16.1    8.3     3.9
0.1                               9.0     8.7     4.3     2.0
YIELD IN SCREENING
• THE AMOUNT OF PREVIOUSLY UNRECOGNISED DISEASE WHICH IS DIAGNOSED
AND BROUGHT TO TREATMENT (DETECTED) AS A RESULT OF SCREENING. THIS IS USED
TO MEASURE PROGRAM PERFORMANCE
FACTORS AFFECTING THE YIELD
• SENSITIVITY OF THE TEST
• PREVALENCE OF THE DETECTABLE PRECLINICAL STAGE
• EXTENT OF PREVIOUS SCREENING
• HEALTH BEHAVIOR AND PERCEPTION OF THE DISEASE
CHARACTERISTICS OF A DISEASE SUITABLE FOR SCREENING
• FATAL OR CAUSING PROLONGED MORBIDITY
• MUST HAVE AN EFFECTIVE TREATMENT (MORE EFFECTIVE AT THE DETECTED STAGE THAN AFTER SYMPTOMS APPEAR)
• DETECTABLE PRECLINICAL PHASE THAT IS PREVALENT AMONG THE PERSONS
SCREENED
CRITERIA FOR SCREENING
• THE CONDITION SHOULD BE AN IMPORTANT HEALTH PROBLEM
• ACCEPTABLE TO THE COMMUNITY
• MINIMAL SIDE EFFECTS
• VALIDITY MUST BE SATISFACTORY
• PRESENCE OF GOOD CONFIRMATORY TEST.
• AGREED POLICY ON WHOM TO TREAT
CRITERIA FOR SCREENING
• PRESENCE OF AN EFFECTIVE TREATMENT
• RESOURCES AND FACILITIES FOR TREATMENT
• THE NATURAL HISTORY OF THE ILLNESS IS KNOWN
• REASONABLE YIELD
• EVIDENCE THAT EARLY DETECTION AND TREATMENT REDUCE MORTALITY AND
MORBIDITY
• THE EXPECTED BENEFIT EXCEEDS THE RISK OF SCREENING
THANK YOU