lecture notes on reliability and validity


Upload: jahangirian

Post on 28-Feb-2023


Reliability and Validity

Today’s Objectives

Understand the difference between reliability and validity

Understand how to develop valid indicators of a concept

Reliability and Validity

Reliability

•How accurate or consistent is the measure?

•Would two people understand a question in the same way?

•Would the same person give the same answers under similar circumstances?

Validity

•Does the instrument measure what it is intended to measure?

•Does the measure actually reflect the concept?

•Do the findings reflect the opinions, attitudes, and behaviors of the target population?

[Figure: three panels labeled "Valid and reliable", "Valid but not reliable", and "Reliable but not valid"]

Levels of Reliability

Example: Person’s weight

LOW

• Estimate on the part of the subject

• Estimate on the part of the observer

• Old bathroom scale

• Industrial scale

HIGH

Reliability

Reliability is the consistency of your measurement, or the degree to which an instrument measures the same way each time it is used under the same conditions with the same subjects. In short, it is the repeatability of your measurement. A measure is considered reliable if a person's scores on the same test given twice are similar. It is important to remember that reliability is not measured; it is estimated.

Here is a simple example. Suppose that you have a bathroom scale and that the scale is broken. The scale represents the methodology. One person weighs you with the scale and obtains a result. The scale is then passed to a second person, who follows the same procedure and weighs you again. Because both people use the same broken scale, they obtain similar readings, so the results are reliable. Although the results are reliable, they may not be valid: because the scale is faulty, the readings are not a true indicator of your real weight.
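The broken-scale example can be sketched as a small simulation. This is only an illustration under stated assumptions: the true weight, the 5 kg bias, and the noise level are made-up numbers, and `weigh` is a hypothetical helper, not something from the lecture.

```python
import random
import statistics


def weigh(true_weight_kg, bias_kg, noise_sd_kg, rng):
    """Return one reading from a scale with a fixed bias plus random noise."""
    return true_weight_kg + bias_kg + rng.gauss(0.0, noise_sd_kg)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
true_weight = 70.0       # hypothetical true weight in kg

# A "broken" scale: it always reads about 5 kg heavy, but it does so very consistently.
readings = [weigh(true_weight, bias_kg=5.0, noise_sd_kg=0.1, rng=rng) for _ in range(10)]

spread = statistics.stdev(readings)             # small spread: readings agree (reliable)
error = statistics.mean(readings) - true_weight  # large offset: readings are wrong (not valid)

print(f"spread between readings: {spread:.2f} kg")
print(f"average error vs. true weight: {error:.2f} kg")
```

A small spread with a large average error is the "reliable but not valid" case described above.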

Reliability

Accuracy, precision, or consistency of measurement

Degree to which measures are free from error and therefore yield consistent results

Reliable measures mean the same data would have been collected under similar circumstances

Methods used to determine reliability

Test-retest method

• Administer the same measures to the same respondents at two separate points in time

Split-half method

• Correlate one-half of a scale with the other half

Calculate reliability coefficient

• Statistical test that measures the internal consistency of a set of items
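The three methods above can be sketched in a few lines of Python. Several things here are assumptions rather than part of the lecture: the response data are invented, the split-half version uses an odd/even item split with the Spearman-Brown correction, and Cronbach's alpha is used as one common choice of internal-consistency coefficient (the slides do not name a specific one).

```python
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def retest_reliability(scores_time1, scores_time2):
    """Test-retest: correlate the same respondents' scores at two points in time."""
    return pearson(scores_time1, scores_time2)


def split_half_reliability(responses):
    """Split-half: correlate odd-item and even-item half scores,
    then apply the Spearman-Brown correction for the full-length scale."""
    half_a = [sum(row[0::2]) for row in responses]
    half_b = [sum(row[1::2]) for row in responses]
    r = pearson(half_a, half_b)
    return 2 * r / (1 + r)


def cronbach_alpha(responses):
    """Cronbach's alpha: one widely used internal-consistency coefficient."""
    k = len(responses[0])
    item_variances = [statistics.variance([row[j] for row in responses]) for j in range(k)]
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents answering 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
totals_time1 = [sum(row) for row in responses]
totals_time2 = [17, 10, 13, 18, 7]  # hypothetical retest totals for the same people

r_retest = retest_reliability(totals_time1, totals_time2)
r_split = split_half_reliability(responses)
alpha = cronbach_alpha(responses)
print(f"test-retest r = {r_retest:.3f}")
print(f"split-half (corrected) = {r_split:.3f}")
print(f"Cronbach's alpha = {alpha:.3f}")
```

All three values are estimates of reliability; values closer to 1 indicate more consistent measurement.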

How to improve reliability?

Quality of items: concise statements, homogeneous words (some sort of uniformity)

Adequate sampling of content domain; comprehensiveness of items

Longer assessment – less distorted by chance factors

Developing a scoring plan (esp. for subjective items – rubrics)

Ensure VALIDITY
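The point about longer assessments can be illustrated with the Spearman-Brown prophecy formula, a standard psychometric result that is not part of these notes. The sketch assumes the added items are of comparable quality to the existing ones; the function name and the starting reliability of 0.6 are made up for illustration.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`
    (e.g. 2.0 means twice as many comparable items)."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


current = 0.6  # hypothetical reliability of a short test
for factor in (1, 2, 3):
    print(f"{factor}x length -> predicted reliability {spearman_brown(current, factor):.3f}")
```

Under these assumptions the predicted reliability rises as items are added, with diminishing returns.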

Food Quality

What items would you include to get adequate sampling of content domain?

Program Satisfaction

I like the after-school program

I like the after-school teachers

I would sign up again for the after-school program

Validity

The ability of a scale to measure what it is intended to measure

The extent to which a measure reflects the real meaning of the concept under consideration

The extent to which a measure reflects the opinions and behaviors of the population under investigation

A measure cannot be valid unless it is also reliable

Validity

Validity refers to the degree to which a study accurately reflects or assesses the specific concept that the researcher is attempting to measure. While reliability is concerned with the accuracy of the actual measuring instrument or procedure, validity is concerned with the study's success at measuring what the researchers set out to measure.

Validity

Depends on the purpose of the measure

• E.g., a ruler may be a valid measuring device for length, but isn’t very valid for measuring volume

Measuring what ‘it’ is supposed to

Must be inferred from evidence; cannot be directly measured

What would be valid measures of…

Intelligence?

Religiosity?

Knowledge of RPTS 336 material?

Tourism motivations?

Commitment to a leisure activity?

Satisfaction with a leisure service?

Environmental ethic?

Types of validity

Face (content) validity: professional agreement that variables cover the range of meanings included within the concept

• Items should be evaluated for their presumed relevance

• Items should cover a range of ideas rather than a single topic area

• Items should be evaluated in terms of the abilities of the individuals under investigation

Types of validity

Construct validity: the degree to which a measure relates to other variables, as expected, within a given system of theoretical relationships

•Satisfaction and Program Quality

Predictive validity: extent to which a measure predicts some future event

•Self-esteem and GPA
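One common way to gather evidence of predictive validity is to correlate the measure with the later outcome it is supposed to predict. The sketch below uses the self-esteem and GPA pairing from the slide, but the numbers are entirely made up for illustration and do not come from any study; `pearson` is a hypothetical helper.

```python
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical data: self-esteem scores collected first, GPA recorded later
# for the same six students.
self_esteem = [12, 18, 25, 30, 22, 15]
later_gpa = [2.4, 2.9, 3.4, 3.6, 3.0, 2.7]

r_predictive = pearson(self_esteem, later_gpa)
print(f"correlation between earlier measure and later outcome: {r_predictive:.3f}")
```

A strong correlation would count as evidence for predictive validity; a correlation near zero would suggest the measure does not predict the outcome.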

Factors that can lower validity

Unclear directions

Difficult reading vocabulary and sentence structure

Ambiguity in statements

Inadequate time limits

Inappropriate level of difficulty

Poorly constructed test items

Test items inappropriate for the outcomes being measured

Tests that are too short

Improper arrangement of items (complex to easy?)

Identifiable patterns of answers

Teaching

Administration and scoring

Students

Nature of criterion

External Validity

Answers the question of generalizability

To what populations or settings can this effect be generalized?

Two aspects

• Population validity

• Ecological validity

Population Validity

Is the actual sample representative of the theoretical population?

To determine, need to identify:

• Theoretical population

• Accessible population

• Sampling design and selected sample

• Actual sample