
Lesson Five

Validity & Practicality

Contents
• Introduction: Definition of Validity
• Types of validity
  • Non-empirical
    • Face Validity
    • Content Validity
  • Empirical
    • Construct Validity
    • Criterion-related Validity
• Practicality

Introduction
• A writing test asks test takers to write on the following topic: “Is Photography an Art or a Science?”
• A valid writing test? Why or why not?
• You should be clear about what exactly you want to test (i.e., no other irrelevant abilities or knowledge).
• Validity concerns what a test measures and how well it measures what it is intended to measure.

Definition of Validity
• “the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment” (cited in Brown 22)
• A valid test = a test that measures what it is intended to measure, and nothing else (i.e., no external knowledge or other skills measured at the same time).
• E.g., a listening test measures listening skill and nothing else. It shouldn’t favor any students.

Non-empirical Validity
• Involving inspection, intuition, and common sense
  • Consequential validity
  • Face validity
  • Content validity

Consequential Validity
• Encompasses all the consequences of a test (Brown 26):
  • Its accuracy in measuring intended criteria
  • Its impact on the preparation of test-takers
  • Its effect on the learner
  • The social consequences of a test’s interpretation and use
  • The effect on Ss’ motivation, subsequent performance in a course, independent learning, study habits, and attitude toward school work

Face Validity (1)
• You know if the test is valid or not by ‘looking’ at it.
• It “looks right” to other testers, teachers, and testees, the general public, etc.
• It “appears” to measure the knowledge or abilities it claims to measure.

Face Validity (2)
• Face validity asks the question: “Does the test, on the ‘face’ of it, appear from the learner’s perspective to test what it is designed to test?” (Brown 27)
• Face validity cannot be empirically tested.
• It is essential to all kinds of tests, but it is not enough.

Content Validity (1)
• “A test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned.” (Hughes 1989)
• Also called rational or logical validity.
• Esp. important for achievement, progress, & diagnostic tests
• A valid test contains appropriate and representative content.

Content Validity (2)
• A test with content validity contains a representative sample of the course (objectives), and quantifies and balances the test components (given a percentage weighting)
• Check against:
  • Test specifications (test plan)
  • Notes, textbooks
  • Course syllabus/objectives
  • Another teacher or subject-matter experts
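The weighting check above can be sketched in Python. This is only an illustration under stated assumptions, not a standard procedure: the objective names, the target percentages, the item labels, and the 5-point tolerance are all hypothetical.

```python
from collections import Counter

def coverage_gaps(spec_weights, item_objectives, tolerance=5.0):
    """Compare each objective's share of test items with its target weighting.

    spec_weights: objective -> intended percentage of the test (hypothetical plan).
    item_objectives: one objective label per test item.
    Returns objective -> (target %, actual %) for every objective whose
    actual share differs from its target by more than `tolerance` points.
    """
    total = len(item_objectives)
    if total == 0:
        raise ValueError("the test has no items")
    counts = Counter(item_objectives)
    gaps = {}
    for objective in set(spec_weights) | set(counts):
        target = spec_weights.get(objective, 0.0)
        actual = 100.0 * counts.get(objective, 0) / total
        if abs(actual - target) > tolerance:
            gaps[objective] = (target, round(actual, 1))
    return gaps

# Hypothetical test plan and a 20-item draft test.
spec = {"articles": 40.0, "tenses": 40.0, "prepositions": 20.0}
items = ["articles"] * 8 + ["tenses"] * 11 + ["prepositions"] * 1

print(coverage_gaps(spec, items))
```

With these made-up numbers, "tenses" is over-represented and "prepositions" under-represented relative to the plan, so the draft would need rebalancing before it samples the course objectives representatively.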

Content Validity (3)
• An example of a fill-in-the-blank quiz on the use of articles (see Brown 23):
  • Does it have content validity if used as a listening/speaking test?
• Classroom tests should always have content validity.
• Rule of thumb for achieving content validity: always use direct tests

Criterion-related Validity (1)
• The extent to which the “criterion” of the test has actually been reached.
• “how far results on the test agree with those provided by some independent and highly dependable assessment of the candidate’s ability.”

Criterion-related Validity (2)
• Two kinds of criterion-related validity
• Concurrent validity:
  • How closely the test result parallels test takers’ performance on another valid test, or criterion, which is thought to measure the same or similar activities
  • test & criterion administered at about the same time
  • possible criteria = an established test or some other measure within the same domain (e.g., course grades, T’s ratings)

Criterion-related Validity (3)
• E.g., situation: conv. class, objectives = a large # of functions. Testing all of them would take 45 min. for each S.
• Q: Is a shortened 10-min. test a valid measure?
• Method: a random sample of Ss takes the full 45-min. test = criterion test; compare scores on the short version with those on the criterion test → if a high level of agreement → short version = valid test
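The comparison in the method above can be sketched in Python. The slides do not name a statistic for "agreement"; this sketch assumes the Pearson correlation coefficient, and the scores for the six students are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of the deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for a random sample of six students.
short_test = [12, 15, 9, 18, 14, 11]        # 10-min. version
criterion_test = [55, 68, 40, 80, 60, 52]   # full 45-min. version

r = pearson_r(short_test, criterion_test)
print(f"validity coefficient: {r:.2f}")
```

A coefficient close to 1 would support using the short version in place of the full test. How high is "high enough" is a judgment that depends on how the test results will be used.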

Criterion-related Validity (4)
• Validity coefficient:
  • A mathematical measure of similarity
  • Perfect agreement → validity coefficient = 1
  • E.g., a coefficient = 0.7; (0.7)² = 0.49 → 49%, i.e., the two sets of scores share about 49% of their variance, which means almost 50% agreement
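The squaring step above can be tried for a few coefficients in Python. The function name `percent_agreement` is made up for this sketch, and rounding to one decimal place is an arbitrary choice.

```python
def percent_agreement(coefficient):
    """Square a validity coefficient and express the result as a percentage."""
    if not -1.0 <= coefficient <= 1.0:
        raise ValueError("a validity coefficient must lie between -1 and 1")
    return round(coefficient ** 2 * 100, 1)

# Illustrative coefficients only; 0.7 is the value used on the slide.
for r in (1.0, 0.9, 0.7, 0.5):
    print(r, percent_agreement(r))
```

Squaring shows why a coefficient that sounds fairly high can still mean modest agreement: 0.7 corresponds to 49%, and 0.5 to only 25%.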

Criterion-related Validity (5)
• Predictive validity:
  • How well the test result predicts future performance/success
  • correlation done at a future time
  • Important for the validation of aptitude tests, placement tests, admissions tests.
• Criterion:
  • Outcome of the course (pass/fail), T’s ratings later

Construct Validity (1)
• Construct:
  • any underlying ability (trait) which is hypothesized in a theory of language ability
  • Any theory, hypothesis, or model that attempts to explain observed phenomena in our universe of perceptions (Brown 25)

Construct Validity (2)
• Originated for psychological tests
• Refers to the extent to which the test may be said to measure a theoretical construct or trait which is normally unobservable and abstract at different levels (e.g., personality, self-esteem; proficiency, communicative competence)
• It examines whether the test is a true reflection of the theory of the trait being measured.

Construct Validity (3)
• A test has construct validity if it can be demonstrated that it measures just the ability which it is supposed to measure.
• Two examples:
  1. reading ability: involves a # of sub-abilities, e.g., skimming, scanning, guessing meaning of unknown words, etc.

Construct Validity (4)
• → need empirical research to establish if such a distinct ability exists and can be measured
• Need for construct validity: we have to demonstrate that we’re indeed measuring just that ability in a particular test.

Construct Validity (5)
  2. when measuring an ability indirectly:
• E.g., writing ability
• Need to look to a theory of writing ability for guidance as to the form (i.e., content, techniques) an indirect test should take
• Theory of writing tells us that underlying writing abilities = a # of sub-abilities, e.g., punctuation, organization, word choice, grammar . . .
• Based on the theory, we construct multiple-choice tests to measure these sub-abilities

Construct Validity (6)
• But, how do we know this test really is measuring writing ability?
• Validation methods:
  • Compare scores on the pilot test with scores on a writing test (direct test) → if high level of agreement → yes
  • Administer a # of tests; each measures a construct. Score the composition (direct test) separately for each construct. Then compare scores.

Construct Validity (7)
• To examine whether the test is a true reflection of the theory of the trait being measured.
• In lang. testing, construct = any underlying ability/trait which is hypothesized in a theory of language ability.
• Necessary in a case of indirect testing.
• Can be measured by comparing the scores of a group of students for two tests.

Practicality
• Practical consideration when planning tests or ways of measurement, including cost, time/effort required
• Economy (cost, time: administration & scoring)
• Ease of
  • scoring and score interpretation
  • administration
  • test compilation
• A test should be practical to use, but also valid and reliable

