Steps to Complete: Usability Test

Overview: 1. Decide what to test. 2. Design the test. 3. Conduct the test. 4. Write up test findings.

© Karen Thompson ● Department of English ● University of Idaho

Step 1: Decide What to Test

Option 1

Test usability of two communication products and compare outcomes.

Option 2

Test usability of one website.

OR

Test usability of two websites and compare outcomes.

Option 1: Ideas (compare 2)

Word Processing

Microsoft Word, OpenOffice, Google Docs

Image Editing

GIMP, Adobe Photoshop, Pixelmator

Website Creators

Wix, Weebly, Jimdo

3-D Modeling

AutoCAD, OpenSCAD, SolidWorks, FreeCAD

The Two Products:

• Need to be comparable.

They don’t need to have the exact same features, but they need to be the same type of product.

Example: AutoCAD does not have the same features as SolidWorks, but both are 3-D modeling tools. Pixelmator does not have the same features as Adobe Photoshop, but both are image creators/editors.

Option 2: Ideas

US Internal Revenue Service

Idaho Department of Fish and Game

Idaho Department of Labor

University of Idaho

These are just a few ideas. Many other possibilities exist.

Option 2: Test One or Two?

• You may choose to test one website or compare two similar websites, such as two department of fish and game sites or two university sites.

• The key is to find a website that has complexity, where users need to find information and/or complete tasks.

• What users find and do must be accessible to you. For example, you could not compare enrolling in classes at the University of Idaho with enrolling at Boise State because you would not have access to both.

If You Test Two Websites:

• They need to be comparable.

They don’t need to have the exact same features, but they need to be the same type of website.

Example: Some states have fish and game websites with more features than others, but both are the same type of website.

You may also compare two university sites from the viewpoint of a user who is not enrolled in either: a prospective student.

Step 2: Designing the Test

2.1 Identify users and level of knowledge.

2.2 Decide test tasks.

2.3 Design test metrics.

2.4 Create data collection form.

2.5 Decide test environment and equipment.

2.1: Identify Users

• Identify typical users of the product or website and decide their level of knowledge/background.

For example, engineering students who have never used AutoCAD or SolidWorks would still bring a level of knowledge to the products that others outside of engineering would not possess.

A hunter using the Idaho Department of Fish and Game website would bring a level of knowledge to the site that a non-hunter would not possess.

2.2: Decide Test Tasks

• If testing two communication products or two websites, you will need to test ten specific tasks. Avoid tasks that are too basic to evaluate usability, such as testing how to save a document file.

• If you are testing one website, you will need approximately 20 specific tasks.

• If you have no idea what type of tasks can be done with a product, then consult the product’s Help Center.

2.2: Decide Test Tasks continued

Example: Here is a screenshot of Weebly’s Help Center.

Do not spend time trying to learn how to do a task because you will be testing learnability as one component of your test.

Simply use the help page menus to derive a list of the tasks.

2.3: Design Test Metrics

Metrics are the means by which you will evaluate the following five components that define usability.

1. Effective: Were you able to complete the task? Effective also means achieving goals.

2. Efficient: How long (or how many clicks) did it take you to perform a task, and is that amount of time reasonable?

3. Engaging: Did the product’s style and appearance enhance your user experience?

4. Error Tolerant: How many times did you attempt an action that did not work? When you made an error or a false start, did the product give you feedback where needed?

5. Easy to Learn: How easy was it to learn how to do a task?

2.3: Design Test Metrics continued

Your test metrics will need to include the following:

– Qualitative Data: five criteria statements aimed at evaluating the five components that define usability. You must write STATEMENTS and not questions.

– Quantitative Data: a fixed-response Likert scale that allows you to measure your level of agreement or disagreement with each criteria statement.

• Example: the following statement would measure ease of learning (component 5).

• I was able to complete the task without consulting the product’s Help Center.

Strongly Agree 5 4 3 2 1 Strongly Disagree
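If you plan to record scores electronically, the pairing of a criteria statement with a five-point scale can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `SCALE_MIN`, `SCALE_MAX`, `END_LABELS`, and `record_score` are hypothetical, and only the two end points are labeled, as in the example scale above.

```python
# Hypothetical sketch: one criteria statement scored on a 5-point Likert scale.
# Only the end points are labeled, mirroring the example scale above.
SCALE_MIN, SCALE_MAX = 1, 5
END_LABELS = {SCALE_MAX: "Strongly Agree", SCALE_MIN: "Strongly Disagree"}


def record_score(statement: str, score: int) -> dict:
    """Return one scored criteria statement, rejecting off-scale values."""
    if not isinstance(score, int) or not SCALE_MIN <= score <= SCALE_MAX:
        raise ValueError(f"score must be a whole number from {SCALE_MIN} to {SCALE_MAX}")
    return {
        "statement": statement,
        "score": score,
        # Unlabeled intermediate points get an empty label.
        "label": END_LABELS.get(score, ""),
    }


entry = record_score(
    "I was able to complete the task without consulting the product's Help Center.",
    4,
)
print(entry["score"], entry["label"])
```

Rejecting off-scale values early helps keep every statement on the same scale.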

2.3: Design Test Metrics continued

In 1932, Rensis Likert invented a means of measuring responses to a survey.

He showed how asking respondents to place themselves on a scale of favor/disfavor with a neutral midpoint worked better than other means of measuring responses.

The Likert scale has since been adopted throughout the world.


About Likert Scales

The neutral midpoint of Likert’s five-point scale is a matter of considerable debate.

The Debate About the Midpoint

Although Likert advocated for the use of a 5-point scale, researchers have since argued for more points to increase the reliability and validity of the scale.

Their arguments are based on findings from studies about how respondents choose items on the scale.


What research has shown about why respondents choose the midpoint:

• Respondents may be unmotivated to make an effort to choose anything else.

• Respondents may be ambivalent.

• Respondents may be reluctant to voice a socially undesirable response.

Based on these findings, some researchers advocate eliminating the midpoint or adding more items to the scale.


• The debate among researchers also includes how the scales should be labeled.

Here are two ways to avoid the midpoint problem.

1. Eliminate all but the end labels.

2. Add a sixth response choice.

Labeling end points only or labeling all points is also a matter of debate.


There seems to be a pattern here.

What does the research show?

What research has shown about labeling.

• Labeling End Points Only: respondents are more attracted to labeled points, so just labeling the end points may result in a bias towards extreme answers.

• Labeling All Points: respondents are more attracted to intermediate options, so it can lower extremeness of responses, which is good.

But it may also increase levels of positivity bias, the tendency to respond with a positive answer, which is not good.

Even the direction of labels/numbers is debated. Should 1 start the scale, or should the last number? And how many items: 5, 6, 7, or more?

Good grief.

Maybe they should seek counseling.


Hey Karen, thanks for that, but we are now thoroughly confused.

Which scale do you want us to use?

It’s your choice, but here are some things to think about.

• In 1956, a researcher named George Miller conducted some experiments on short-term memory.

• He published the results in a paper: The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information.

• His experiments led him to conclude we can only retain 5 to 9 items in our short-term memory.

His paper is cited as the reason why Bell Telephone chose to issue only seven-digit phone numbers.

Call me back. My number is 1234567

Tip #1: Keep it Simple

Do you really want to deal with interpreting the nuances of these scales? Just sayin...

Those are just plain scary.

Tip #2: Be consistent

• Use the same scale for all criteria statements.

• Don’t switch from a five-point to a seven-point and then to a four-point.

• Keep the positions of the value labels the same.

• If you start with negative value labels on the left of the scale (levels of disagree) and positive on the right (levels of agree), don’t change the positions.

• Example of inconsistent value labels:

Tip #3: Use Numbers

• A Likert scale is not a Likert scale unless it has numbers.

• That’s the whole point of using it.

• You want to connect qualitative criteria statements to a quantitative measure.

Tip #4: Compose Criteria Statements

• A criteria statement has to be written in a way that a respondent (in this project, the respondent is you) can answer by rating the level of agreement with that statement.

• Ease of use.

• I was able to complete the task without consulting the product’s Help Center.

2.4: Create Data Collection Form

• You will need a means of collecting the data for each task you do.

• You may create a data collection form for this purpose or use a spreadsheet.

• Your form or spreadsheet will need to list each task with the five criteria statements plus the Likert scale and a place to take notes.

2.4: Example

You will need 10 pages using a form like this, one for each of the ten tasks.

2.4: Data Collection Option

If you know how to use a spreadsheet, that is another option for collecting the data.
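For those taking the spreadsheet route, one possible starting point is to generate a blank template as a CSV file that any spreadsheet program can open. The sketch below is not a required format: the column names, the `build_template` and `to_csv` helpers, and the placeholder task names are all assumptions made for illustration. It creates one row per product, task, and usability component, with empty cells for the criteria statement, the score, and your notes.

```python
import csv
import io

# The five usability components from section 2.3.
COMPONENTS = ["Effective", "Efficient", "Engaging", "Error Tolerant", "Easy to Learn"]
# Hypothetical column layout; adjust to suit your own form.
FIELDS = ["product", "task", "component", "criteria_statement", "score", "notes"]


def build_template(tasks, products):
    """Return one blank row per product, task, and component."""
    rows = []
    for product in products:
        for task in tasks:
            for component in COMPONENTS:
                rows.append({
                    "product": product,
                    "task": task,
                    "component": component,
                    "criteria_statement": "",
                    "score": "",
                    "notes": "",
                })
    return rows


def to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder task names; replace them with your own ten tasks.
tasks = [f"Task {n}" for n in range(1, 11)]
template = build_template(tasks, ["Wix", "Weebly"])
print(len(template), "rows")
```

To save the template, write the text returned by `to_csv(template)` to a file whose name ends in `.csv`.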

2.5: Choose Test Environment and Equipment

• In the white paper write-up of test findings, you will need to describe the test environment (where you conducted the test) and the equipment you used (PC? Mac? Operating system? Etc.).

• The test environment should be a quiet place where you can work without interruption.

• If you do the test in more than one setting, be consistent with the environment and equipment you use.

Step 3: Conduct the Test

3.1 Become the user.

3.2 Do each task, take notes, score usability.

3.3 Take screenshots.

3.1: Become the User

• To do the test, you need to set aside what you may already know about how to use the products or website(s) because the test is about you playing the role of a new user.

3.2: Do Each Task

• As you do a task, pay close attention to what happens in each of the five component areas that define usability.

• Score your level of agreement or disagreement with the criteria statements in the data collection form or spreadsheet.

AND

• Take notes to help you describe what happened as you did the task and to offer any recommendations about how to improve the product or website.
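Once every task is scored, the numbers can be summarized for comparison. The sketch below shows one way to do that, assuming each score is stored as a record with `product`, `component`, and `score` fields. Those field names, the `summarize` helper, and the sample scores are all invented for illustration. It averages the scores for each product and component; keep in mind that averaging Likert scores treats the scale points as evenly spaced, which is a simplification.

```python
from collections import defaultdict
from statistics import mean


def summarize(records):
    """Average the Likert scores for each (product, component) pair."""
    grouped = defaultdict(list)
    for record in records:
        grouped[(record["product"], record["component"])].append(record["score"])
    return {key: round(mean(scores), 2) for key, scores in grouped.items()}


# Made-up scores for two tasks, used only to show the shape of the data.
records = [
    {"product": "Wix", "component": "Effective", "score": 4},
    {"product": "Wix", "component": "Effective", "score": 5},
    {"product": "Weebly", "component": "Effective", "score": 3},
    {"product": "Weebly", "component": "Effective", "score": 2},
]

for (product, component), average in summarize(records).items():
    print(f"{product} / {component}: {average}")
```

The averages only show where the two products differ; your notes explain why.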

3.3: Take Screenshots


• Take screenshots as you take notes.

• The number of screenshots you take should work to support the descriptions you will write.

• You will need to size these to be appropriate to the page layout in your white paper.

• Often screenshots are difficult to wrap text around. That’s fine, but do keep the screenshot close to the textual information it is supporting. Remember to label, caption, and refer to screenshots in the text.

Write-Up Test Results

• You will be writing up the test results (findings) in a technical white paper.

• What white papers are and how to write yours are covered in another lecture.
