Data Collection


Page 1: Data Collection

EDWARD JAMES R GORGON MPhysio BCHPEd PTRP

Department of Physical Therapy
College of Allied Medical Professions
University of the Philippines Manila
Email: [email protected]

Methods, tools, and issues

Page 2: Data Collection

Learning objectives

Define reliability

Discuss potential sources of measurement error

Explain the types of reliability

Explain concepts in measurement reliability

Define validity

Explain the types of validity

Explain the concepts of sensitivity and specificity

Page 3: Data Collection


Part One

Reliability and validity

Page 4: Data Collection

Measurement reliability

Degree of consistency or agreement between repeated measurements taken when the underlying phenomenon has not changed

Reproducibility and repeatability of an instrument or procedure in measurement

Error = variation without true change

Repeatability = Reproducibility

Page 5: Data Collection

Measurement reliability

Potential sources of measurement error

Rater

Patient / subject

Equipment

Procedure

Page 6: Data Collection

Measurement reliability

Error related to the RATER

Competence / skill

Preparation

Motivation / interest

Fatigue

Page 7: Data Collection

Measurement reliability

Error related to the PATIENT / SUBJECT

Comprehension

Familiarization

Environment

Pain

Fatigue

Page 8: Data Collection

Measurement reliability

Error related to the PATIENT / SUBJECT

Recovery / deterioration

Hawthorne effect

Page 9: Data Collection

Measurement reliability

Error related to the EQUIPMENT

Operation

Maintenance

Calibration

Sensitivity

Page 10: Data Collection

Measurement reliability

Error related to the PROCEDURE

Positioning

Handling

Stabilization

Instructions

Page 11: Data Collection

Measurement reliability

Types of reliability

Internal consistency

Test-retest

Intra-rater

Inter-rater

Page 12: Data Collection

Reliability

[Scatter plot: Rater 1 scores plotted against Rater 2 scores (9.00–13.00)]

Page 13: Data Collection

Reliability

[Scatter plot: Measurement 1 scores plotted against Measurement 2 scores]

Page 14: Data Collection

Internal consistency

Degree to which the test items within an instrument are homogeneous with respect to the attribute being measured

Measured at one point in time

Usually assessed using Cronbach’s alpha (α)
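As a worked illustration (not from the original slides), here is a minimal Python sketch of Cronbach's alpha; the item scores are hypothetical, and alpha is computed from the item variances and the variance of the summed scale.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_subjects x n_items) array of item scores."""
    items = np.asarray(items, dtype=float)
    n_items = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical data: 5 subjects answering a 4-item questionnaire
scores = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
])
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```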

Page 15: Data Collection

Test-retest reliability

Degree to which an instrument is stable, based on repeated (at least 2) measurements on different occasions

Constant test conditions, including subjects and rater(s), on both occasions

Not possible to assess if the variable is labile

Page 16: Data Collection

Test-retest reliability

Barthel Index, BADL (Sackley et al, 2006)

        WEEK 1   WEEK 2
SUBJ1   10       11
SUBJ2   10       10
SUBJ3   11       12
SUBJ4   13       13
SUBJ5   9        11
SUBJ6   11       12
SUBJ7   12       11
SUBJ8   10       9

Page 17: Data Collection

Intra-rater reliability

Stability of data recorded by 1 rater across 2 or more trials done within a single occasion of measurement

Constant test conditions, including subjects, across trials

Page 18: Data Collection

Intra-rater reliability

Goniometry, knee flexion (Lin, 2003)

        TRIAL 1 (deg)   TRIAL 2 (deg)
SUBJ1   76              75
SUBJ2   90              87
SUBJ3   84              82
SUBJ4   83              85
SUBJ5   79              78
SUBJ6   87              86
SUBJ7   80              82
SUBJ8   77              79

Page 19: Data Collection

Inter-rater reliability

Variation between 2 or more raters who measure the same group of subjects at least once each

Constant test conditions, including subjects

Potential bias from differences in raters’ training and experience levels

Page 20: Data Collection

Inter-rater reliability

Peabody, language skills (van Kleeck et al., 2006)

        RATER 1   RATER 2
SUBJ1   45        69
SUBJ2   99        81
SUBJ3   84        75
SUBJ4   80        74
SUBJ5   79        72
SUBJ6   81        85
SUBJ7   60        82
SUBJ8   76        87

Page 21: Data Collection

Reliability coefficient

Formula:

reliability = true score variance / (true score variance + error variance)
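A brief worked example of the formula above, using hypothetical variance components:

```python
# Hypothetical variance components
true_score_variance = 9.0
error_variance = 1.0

reliability = true_score_variance / (true_score_variance + error_variance)
print(reliability)  # 0.9: 90% of the observed variance reflects true differences
```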

Page 22: Data Collection

Kappa (κ)

Represents the rate of agreement, corrected for chance, across an entire set of yes/no responses

Appropriate when data are nominal-level or ordinal-level

Varies from 0 – 1 (no units associated)
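A minimal sketch of how kappa can be computed for two raters' yes/no responses; the ratings below are hypothetical, and the calculation corrects observed agreement for the agreement expected by chance.

```python
import numpy as np

def cohens_kappa(rater1, rater2) -> float:
    """Cohen's kappa for two raters' categorical (e.g. yes/no) ratings."""
    rater1, rater2 = np.asarray(rater1), np.asarray(rater2)
    categories = np.union1d(rater1, rater2)
    p_observed = np.mean(rater1 == rater2)                    # observed agreement
    p_chance = sum(np.mean(rater1 == c) * np.mean(rater2 == c)
                   for c in categories)                       # agreement expected by chance
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical yes/no ratings from two raters on 10 subjects
r1 = ["yes", "yes", "no", "yes", "no", "no",  "yes", "yes", "no", "yes"]
r2 = ["yes", "no",  "no", "yes", "no", "yes", "yes", "yes", "no", "yes"]
print(f"kappa = {cohens_kappa(r1, r2):.2f}")
```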

Page 23: Data Collection

Coefficient of variation (CoV)

Formula:

CoV = (standard deviation / mean) × 100%

Page 24: Data Collection

Coefficient of variation (CoV)

The standard deviation expressed as a percentage of the mean

Useful when comparing variability in different groups

Appropriate when data are interval-level or ratio-level
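A short sketch applying the CoV formula to hypothetical measurements from two groups with different means, the kind of comparison for which the CoV is useful.

```python
import numpy as np

def coefficient_of_variation(values) -> float:
    """CoV: standard deviation expressed as a percentage of the mean."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / values.mean() * 100

# Hypothetical grip-strength measurements (kg) from two groups
group_a = [28, 30, 31, 29, 32]
group_b = [12, 15, 11, 16, 13]
print(f"Group A CoV = {coefficient_of_variation(group_a):.1f}%")
print(f"Group B CoV = {coefficient_of_variation(group_b):.1f}%")
```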

Page 25: Data Collection

Intraclass correlation coefficient (ICC)

Ratio of between-person variance to total variance (between-person + within-person variance)

Reflects both the degree of correspondence and agreement among ratings

Varies from 0 – 1 (no units associated)
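A minimal sketch, assuming the one-way random-effects form ICC(1,1); other ICC forms differ in how rater variance is handled. It is applied here to the knee-flexion goniometry trials shown on Page 18.

```python
import numpy as np

def icc_one_way(data: np.ndarray) -> float:
    """ICC(1,1): one-way random-effects intraclass correlation.
    `data` is an (n_subjects x k_measurements) array."""
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    grand_mean = data.mean()
    subject_means = data.mean(axis=1)
    # Between-subjects and within-subjects mean squares
    bms = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    wms = np.sum((data - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (bms - wms) / (bms + (k - 1) * wms)

# Knee-flexion goniometry, trial 1 vs. trial 2 (Page 18)
trials = np.array([
    [76, 75], [90, 87], [84, 82], [83, 85],
    [79, 78], [87, 86], [80, 82], [77, 79],
])
print(f"ICC(1,1) = {icc_one_way(trials):.2f}")
```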

Page 26: Data Collection

Interpreting reliability estimates

“Rule of thumb”

> 0.80 = Excellent

0.60 – 0.79 = Adequate

< 0.60 = Poor

HOWEVER, estimates are population-specific and use may be context-specific
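The rule of thumb above can be encoded as a small helper (a sketch only; the boundary at exactly 0.80 is not specified on the slide and is treated here as adequate):

```python
def interpret_reliability(coefficient: float) -> str:
    """Rule-of-thumb interpretation; estimates remain population- and context-specific."""
    if coefficient > 0.80:
        return "Excellent"
    if coefficient >= 0.60:   # 0.60-0.79 on the slide; 0.80 itself is not specified
        return "Adequate"
    return "Poor"

print(interpret_reliability(0.92))  # Excellent
print(interpret_reliability(0.55))  # Poor
```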

Page 27: Data Collection

Choosing reliable outcome measures

Rigor of standardization studies for reliability

Excellent: more than 2 well-designed reliability studies completed, with adequate to excellent reliability values

Adequate: 1–2 well-designed reliability studies with adequate to excellent reliability values

Poor: reliability studies poorly completed, or reliability studies showing poor levels of reliability

No evidence available

Page 28: Data Collection

Measurement validity

Extent to which an instrument measures what it is supposed to measure

= TRUENESS OF A MEASURE

Validity implies that a measurement is relatively free from error, i.e., a valid test is also reliable

Validity allows generalizations beyond a specific score

Page 29: Data Collection

Measurement validity

Emphasis is placed on the objectives of a test and the ability to make inferences from test scores or measurements

Validity is specific: it is evaluated within the context of the test's intended use and in a specific population

Page 30: Data Collection

Measurement validity

How can we say that inferences from a test are valid?

Instrument output related and proportional to the actual variable of interest

Values assigned to the variable are representative of response

Page 31: Data Collection

Types of validity

Face validity

Content validity

Criterion-related validity

Construct validity

Page 32: Data Collection

Face validity

The extent to which an instrument appears to test what it is supposed to test

Determined by a non-rigorous process – ALL OR NONE

Insufficient for the overall validity of a test

Page 33: Data Collection

Content validity

The extent to which the items in an instrument address and sample relevant aspects of the concept / variable being measured or assessed

Page 34: Data Collection

Content validity

Important characteristic of questionnaires, examinations, and interviews

Demands that a test is not influenced by factors irrelevant to the purpose of measurement

Page 35: Data Collection

Criterion validity

The extent to which an instrument agrees with an external criterion measurement (a “gold standard”) of that concept

Ergo, outcomes of the instrument can be used as a substitute measure for the gold standard

If the correlation between the target test and criterion is high, the test is a valid predictor of the criterion score

Page 36: Data Collection

Criterion validity

Criterion must be reliable and relevant to the parameter measured by the target test

Criterion and target ratings should be independent and free from bias

If a gold standard does not exist, other similar measures are used

Page 37: Data Collection

Criterion validity

CONCURRENT validity: the target measurement and the criterion measurement are taken at the same time

PREDICTIVE validity: the test is a valid predictor of a future criterion score

Page 38: Data Collection

Construct validity

Ability of an instrument to measure an abstract (typically multidimensional) construct and the degree to which the instrument reflects the theoretical components of that construct

Page 39: Data Collection

Construct validity

CONVERGENT validity: the extent to which an instrument agrees with conceptually similar instruments

DIVERGENT validity: the extent to which an instrument lacks correlation with instruments that are conceptually distinct

Page 40: Data Collection

Validity estimates: Pearson's r

Demonstrates the strength of linear relationship between 2 variables

Often used, though erroneously, as a reliability indicator

Varies from –1 through 0 to +1 (direction of the relationship indicated by the – / + sign)
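A short sketch using SciPy's pearsonr on hypothetical paired scores from two conceptually related measures; the closing comment notes why a high r by itself can overstate reliability.

```python
import numpy as np
from scipy import stats

# Hypothetical paired scores from two conceptually related measures
measure_a = np.array([45, 52, 61, 58, 70, 66, 74, 80])
measure_b = np.array([40, 55, 59, 60, 68, 70, 71, 83])

r, p_value = stats.pearsonr(measure_a, measure_b)
print(f"Pearson's r = {r:.2f} (p = {p_value:.3f})")
# A high r shows that the scores covary, not that the two measures agree;
# systematic offsets are invisible to r, which is why it overstates reliability.
```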

Page 41: Data Collection

Sensitivity and specificity

Sensitivity

The ability of a test to obtain a positive result when the condition is actually present

Specificity

The ability of a test to obtain a negative result when the condition is actually absent

Page 42: Data Collection

Sensitivity

Sensitivity = [a / (a + c)] x 100%

                 Condition +   Condition –   Total
Test result +    a             b             a + b
Test result –    c             d             c + d
Total            a + c         b + d

Page 43: Data Collection

Specificity

Specificity = [d / (b + d)] x 100%

                 Condition +   Condition –   Total
Test result +    a             b             a + b
Test result –    c             d             c + d
Total            a + c         b + d
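A minimal sketch that applies the sensitivity formula from Page 42 and the specificity formula above to hypothetical 2×2 counts.

```python
def sensitivity_specificity(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """2x2 table cells: a = true positives, b = false positives,
    c = false negatives, d = true negatives."""
    sensitivity = a / (a + c) * 100   # positive results among those with the condition
    specificity = d / (b + d) * 100   # negative results among those without the condition
    return sensitivity, specificity

# Hypothetical counts for a screening test
sens, spec = sensitivity_specificity(a=45, b=10, c=5, d=90)
print(f"Sensitivity = {sens:.0f}%, Specificity = {spec:.0f}%")
```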

Page 44: Data Collection

Measure development

Planning

Test construction

Reliability testing

Validation

Page 45: Data Collection

Measure development

Appropriateness of the test for the target group

Interpretation of results in a meaningful way

Sufficient sensitivity to detect small but CLINICALLY RELEVANT change

Application of the test in varied settings and populations to determine useful properties

Page 46: Data Collection

Selection criteria for measures

Appropriateness to the target group

Psychometric properties

Validity

Reliability

Sensitivity to clinically relevant change

Sensitivity and specificity, if used for a diagnostic purpose

Page 47: Data Collection

Selection criteria for measures

Clinical utility / practicality of administration

Clarity of instructions

Format (interview, questionnaire, task performance, naturalistic observation, other)

Ease of administration (time required to complete, scoring, interpretation)

Expertise / training required for administering and/or interpreting

Cost-effectiveness

Page 48: Data Collection


In summary...

Page 49: Data Collection


The End.

Thanks for listening.