Measurement in Psychology I: RELIABILITY Lawrence R. Gordon

Post on 13-Feb-2016

Page 1: Measurement in Psychology I: RELIABILITY

Measurement in Psychology I: RELIABILITY

Lawrence R. Gordon

Page 2: Measurement in Psychology I: RELIABILITY

Do you support the civil union legislation?

What are some of the ways in which you can ask this question?

How do you measure the response (operational definitions)?

Page 3: Measurement in Psychology I: RELIABILITY

Levels of Measurement

Nominal scales
• giving names to data, putting into categories
• Examples: sex, race labels; baseball uniform numbers

Ordinal scales
• numbers give order but not distance
• Examples: mailbox numbers; class rankings

Page 4: Measurement in Psychology I: RELIABILITY

Levels of Measurement (cont.)

Interval scales
• numbers indicate order and distance (they are separated by equal distances or intervals)
• Example: Fahrenheit temperature

Ratio scales
• numbers indicate order, distance, AND have a true zero point (zero = there isn’t any)
• Examples: height; weight; miles per hour; time

Page 5: Measurement in Psychology I: RELIABILITY

Levels of Measurement Example

Auto race which started at 2 pm

Driver    Make      Finish Order   Finish Time   Elapsed Time (hours)
Mary      Corvette  1              3:00          1.00
Joe       Mustang   2              3:15          1.25
Tom       BMW       3              3:30          1.50
Ann       Ferrari   4              4:00          2.00

Nominal   Nominal   Ordinal        Interval      Ratio
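The race example can be sketched in code to show which comparisons each level of measurement supports. This is a minimal illustration, not part of the original slides: the dictionary layout and the `clock_minutes` helper are assumptions made for the sketch, while the data values come from the slide.

```python
# Race data copied from the slide (the race started at 2 pm).
race = {
    "Mary": {"make": "Corvette", "order": 1, "finish": "3:00", "elapsed": 1.00},
    "Joe":  {"make": "Mustang",  "order": 2, "finish": "3:15", "elapsed": 1.25},
    "Tom":  {"make": "BMW",      "order": 3, "finish": "3:30", "elapsed": 1.50},
    "Ann":  {"make": "Ferrari",  "order": 4, "finish": "4:00", "elapsed": 2.00},
}


def clock_minutes(clock):
    """Hypothetical helper: turn a clock reading such as '3:15' into minutes."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


# Nominal: labels can only be compared for same/different.
same_make = race["Mary"]["make"] == race["Joe"]["make"]

# Ordinal: finish order can be sorted, but says nothing about the size of gaps.
ranking = sorted(race, key=lambda driver: race[driver]["order"])

# Interval: differences between clock times are meaningful
# (a ratio of clock times would not be, because 0:00 is not "no time").
gap_minutes = clock_minutes(race["Ann"]["finish"]) - clock_minutes(race["Tom"]["finish"])

# Ratio: elapsed time has a true zero, so ratios are meaningful.
times_as_long = race["Ann"]["elapsed"] / race["Mary"]["elapsed"]

print(same_make, ranking, gap_minutes, times_as_long)
```

Each step up the scale keeps the operations of the level below and adds one more: sorting for ordinal, subtraction for interval, division for ratio.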

Page 6: Measurement in Psychology I: RELIABILITY

Closed vs. Open Responses

Closed responses (a.k.a. forced choice)
Examples (rate civil union support on a scale 1 to 9)
Advantages
• you know what the responses will be (or what they should be!) because of restrictions on choice
• (relatively) easy to evaluate empirically
• gives data that directly answer the question as you asked it
• coding usually not necessary

Page 7: Measurement in Psychology I: RELIABILITY

Closed vs. Open Responses

Closed responses (a.k.a. forced choice)
Disadvantages
• may not be sensitive enough to get some interesting information
• will not give you as clear an indication of what participants think/feel/report

“Do you agree that same-sex couples should have the right to marry/civil union?”

1            2    3    4    5    6    7    8    9
Disagree                                        Agree
Completely                                      Completely

Page 8: Measurement in Psychology I: RELIABILITY

Support Civil Union (histogram)

[Figure: two histograms titled “Attitudes toward Civil Union”, one labeled Psyc 109, Fall 2001 and one labeled Psyc 109, Fall 2002. x-axis: Agreement, 1.0 to 9.0 (9 = 'Agree Completely'); y-axis: Frequency (of 195), 0 to 140.]

Page 9: Measurement in Psychology I: RELIABILITY

Support Civil Union (area graph)

[Figure: two area graphs titled “Attitudes toward Civil Union”, one labeled Psyc 109, Fall 2001 and one labeled Psyc 109, Fall 2002. x-axis: Agreement, from Disagr Cmpltly through Midpoint to Agree Cmpltly; y-axis: Frequency (of 195), 0 to 140.]

Page 10: Measurement in Psychology I: RELIABILITY

Compare the Graphs: Same Info

[Figure: the histogram and the area graph of “Attitudes toward Civil Union”, Psyc 109, Fall 2002, shown together. x-axis: Agreement (9 = 'Agree Completely'); y-axis: Frequency (of 195), 0 to 140.]

Page 11: Measurement in Psychology I: RELIABILITY

Closed vs. Open Responses

Open responses (a.k.a. free response)
Examples
• Do you support the civil union legislation? Why?
• Example from the survey used the first day: “Please describe yourself in 12 words or less”
• more on this in a bit...
Advantages
• participants can give any answer they want
• not restricted by choices

Page 12: Measurement in Psychology I: RELIABILITY

Closed vs. Open Responses

Open responses (cont.)
Disadvantages
• have to code to empirically evaluate (time intensive, need to find people who will do it)
• reliability issues!

Page 13: Measurement in Psychology I: RELIABILITY

Reliability

• Consistency (stays the same)
• Repeatable (get the same results again and again)
• Measures need to be reliable to be good measures
• Now, some nitty-gritty...

Page 14: Measurement in Psychology I: RELIABILITY

Reliability (cont.)

Measuring closed responses
• you don’t need to put things into categories
• reliable over time (do you get the same answers again and again?)
• if the answers vary greatly from one time of measurement to the next, the measurement is not reliable

Page 15: Measurement in Psychology I: RELIABILITY

Reliability (cont.)

Measuring closed responses (cont.)
• scales (sets of questions designed to measure something) need to be given multiple times, or in multiple forms, and the answers must remain similar for the scale to be reliable
• Example (personality scale?)

Types of reliability
• Stability (“test-retest reliability”)
• Equivalence (“parallel forms reliability”)
• Consistency (“split-half reliability”)
• Homogeneity (“internal consistency reliability”)
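Internal consistency is commonly summarized with Cronbach's alpha, which compares the sum of the individual item variances with the variance of the total score. The sketch below is an illustration only: the function name `cronbach_alpha` and the tiny 0/1 data set are hypothetical and do not come from the slides.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per examinee, one score per item)."""
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose: one tuple of scores per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# HYPOTHETICAL data: 5 examinees answering 4 items, each scored 0 or 1.
hypothetical_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(hypothetical_scores)
print(round(alpha, 2))
```

Alpha rises when the items vary together, that is, when the total-score variance is large relative to the summed item variances.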

Page 16: Measurement in Psychology I: RELIABILITY

Reliability Quick Example

Any test, scale, inventory with items: e.g., a 50-item test, scored 0-50:

                   Form A            9/4               9/4, Form A
    Examinee    9/4    9/25     Form A   Form B      Odd    Even
1   George      27     35       27       33          15     12
2   Alice       49     46       49       40          30     19
3   Mary        30     35       30       27          13     17
4   Larry       16     10       16       19           7      9
5   Linda       27     24       27       20          10     17
6   Doug        40     42       40       48          22     18
7   Chuck       21     18       21       35          10     11
8   Judy        42     39       42       35          19     23

Test-retest: Form A, 9/4 vs 9/25 (“r=.92”)              Stability
Parallel forms: Form A vs Form B, 9/4 (“r=.69”)         Equivalence
Cross form: Form A 9/25 vs Form B 9/4 (“r=.72”)         Stab & Equiv
Split-half: Odd vs Even, Form A 9/4 (“r=.79”)           Consistency
Alpha reliability: No example – data from all 50 items  Internal consistency
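The coefficients in the Quick Example can be checked with a plain Pearson correlation over the eight examinees' scores. The sketch below assumes the score columns read, in order, Form A on 9/4, Form A on 9/25, Form B, and the odd and even halves of Form A on 9/4; the variable names and the `pearson_r` helper are illustrative, not from the slides. One inference worth flagging: the raw odd-even correlation comes out near .65, and the slide's .79 matches that value after the Spearman-Brown correction 2r / (1 + r), so the sketch applies that correction for the split-half figure.

```python
from math import sqrt

# Scores for the eight examinees, in the order George ... Judy.
form_a_first  = [27, 49, 30, 16, 27, 40, 21, 42]   # Form A, 9/4
form_a_second = [35, 46, 35, 10, 24, 42, 18, 39]   # Form A, 9/25
form_b        = [33, 40, 27, 19, 20, 48, 35, 35]   # Form B
odd_half      = [15, 30, 13, 7, 10, 22, 10, 19]    # odd items, Form A, 9/4
even_half     = [12, 19, 17, 9, 17, 18, 11, 23]    # even items, Form A, 9/4


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


test_retest = pearson_r(form_a_first, form_a_second)   # stability
parallel    = pearson_r(form_a_first, form_b)          # equivalence
cross_form  = pearson_r(form_a_second, form_b)         # stability and equivalence
half_r      = pearson_r(odd_half, even_half)           # raw half-test correlation
split_half  = 2 * half_r / (1 + half_r)                # Spearman-Brown step-up to full length

for name, value in [("test-retest", test_retest), ("parallel forms", parallel),
                    ("cross form", cross_form), ("split-half", split_half)]:
    print(f"{name}: r = {value:.2f}")
```

The step-up is needed because each half is only 25 items long, and a shorter test is less reliable than the full 50-item test it stands in for.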

Page 17: Measurement in Psychology I: RELIABILITY

Reliability (cont.)

Measuring open responses
• Will often code into categories (Examples)
• How do you assess reliability?

Page 18: Measurement in Psychology I: RELIABILITY

Reliability (cont.)

Measuring open responses (cont.)
• Does everyone put the response into the same category? If yes, you have good inter-coder reliability
• more specific operational definitions will increase this reliability
• Coding personality responses into categories
• Using positive, negative, and neutral descriptors

Page 19: Measurement in Psychology I: RELIABILITY

Reliability (cont.)

Measuring behavioral responses through observation
• special cases of open response, can’t really control what participants do
• coding and/or rating what you observe
• reliability of ratings (interrater reliability? If all raters agree on the rating, then yes.)
• need to be very clear on operational definitions
• Baggage claim study (Scherer & Ceschi, 2000)

Page 20: Measurement in Psychology I: RELIABILITY

Assessing Reliability

Steps
1. decide on operational definitions of your variables and scale(s) of measurement
2. train your coders/raters, answer questions, and alleviate confusion
3. do the coding and rating
4. compare responses
5. were the measurements reliable?

Page 21: Measurement in Psychology I: RELIABILITY

Reliability Exercise
Measuring your personality
Looking for “big” traits
• defining big traits and training coders

The Big Five Personality Factors
1. Open to Experience (O) vs. Closed to Experience (NO)
2. Conscientious (C) vs. Nonconscientious (NC)
3. Extraverted (E) vs. Introverted (NE)
4. Agreeable (A) vs. Unagreeable (NA)
5. Neurotic (N) vs. Nonneurotic (NN)

Which one best fits the description?
Do the coding!

Page 22: Measurement in Psychology I: RELIABILITY

Reliability Exercise
Measuring your personality
Looking for “big” traits
compare responses to other coders
• intercoder reliability
• List number on which you agreed
• List number on which you disagreed
• Calculate the percentages
were the measurements reliable?
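The counting in this exercise can be sketched as a simple percent-agreement calculation between two coders. The function name `percent_agreement` and both lists of codes are hypothetical, made up for illustration; only the code labels (O, NO, C, NC, E, NE, A, NA, N, NN) come from the Big Five slide.

```python
def percent_agreement(coder_1, coder_2):
    """Return (number agreed, number disagreed, percent agreed) for two coders."""
    if len(coder_1) != len(coder_2):
        raise ValueError("both coders must code the same set of responses")
    agreed = sum(a == b for a, b in zip(coder_1, coder_2))
    disagreed = len(coder_1) - agreed
    return agreed, disagreed, 100 * agreed / len(coder_1)


# HYPOTHETICAL codes assigned by two coders to the same ten descriptions.
coder_1 = ["E", "A", "NC", "O", "N", "E",  "NA", "C", "NN", "O"]
coder_2 = ["E", "A", "C",  "O", "N", "NE", "NA", "C", "NN", "O"]

agreed, disagreed, percent = percent_agreement(coder_1, coder_2)
print(f"agreed on {agreed}, disagreed on {disagreed}: {percent:.0f}% agreement")
```

Percent agreement is the simplest index of intercoder reliability; it does not adjust for agreement that would occur by chance.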

Page 23: Measurement in Psychology I: RELIABILITY

And for next time…is reliability enough?

If your measurement is reliable, does that mean that it is good?

Does being reliable make your measurement valid?