The Process for Conducting a Test, Ch. 11: Analyze Data and Observations

Analyze Data and Observations

Recommendations for improvement

Preliminary analysis: identifies hot spots so the team can act without having to wait for the final test report. Delivered as a small written report or a verbal presentation.

Comprehensive analysis: looks at the larger trends and patterns. Compile data, summarize data, analyze data. Delivered as a final, more exhaustive report.

A word of caution

Be timely.

Err on the conservative side by providing too few preliminary recommendations rather than too many.

Mark the report clearly as preliminary, in large letters.

Begin compiling data as you test

Whether or not you create a preliminary report, keep compiling throughout the test sessions.

Helps you see whether anything important is missing.

Helps you understand what you have collected.

Iterative, fast-turnaround test

Changes are made to the product after every few sessions.

Compiling data each time will be necessary.

Update the test plan, session script, and debriefing guide to reflect the revisions.

Organize raw data

Raw data: recordings, notes, and questions; comments from participants; issues lists from observers

Toolbox for organizing: lists, tallies, matrices, stories, storyboards, structure models, flow diagrams, spreadsheets

Summarize data

Get a snapshot of what happened during the test

To indicate whether performance differed between groups or between versions of a product

Summarize performance data

Descriptive statistics: the most common statistics

Simple techniques for classifying the characteristics of the data

Use simple formulas that are available in most computer spreadsheets

Two kinds: task accuracy and task timings

Task Accuracy

Count the number of errors made per task

Categorize the errors by type

Track the number of participants who performed successfully or required some assistance to succeed

Three types of statistics relate to task accuracy

Percentage of participants performing successfully, including those who required assistance.

Percentage of participants performing successfully.

Percentage of participants performing successfully within a time benchmark.
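The three accuracy percentages can be computed from one record per participant per task. The following is a minimal sketch, not a prescribed method: the `Attempt` record, its field names, and the sample values are hypothetical, and it assumes that the time-benchmark statistic counts only unassisted successes.

```python
from dataclasses import dataclass

@dataclass
class Attempt:
    completed: bool   # participant finished the task
    assisted: bool    # moderator had to give help
    seconds: float    # completion time in seconds

def accuracy_stats(attempts, benchmark_seconds):
    """Return the three task accuracy percentages for one task."""
    n = len(attempts)
    incl_assisted = sum(a.completed for a in attempts)
    unassisted = sum(a.completed and not a.assisted for a in attempts)
    # Assumption: only unassisted successes count toward the benchmark figure.
    within_benchmark = sum(
        a.completed and not a.assisted and a.seconds <= benchmark_seconds
        for a in attempts
    )
    return {
        "success_incl_assisted": 100 * incl_assisted / n,
        "success_unassisted": 100 * unassisted / n,
        "success_within_benchmark": 100 * within_benchmark / n,
    }

# Hypothetical data: five participants on one task, 120-second benchmark.
attempts = [
    Attempt(True, False, 50),
    Attempt(True, True, 90),
    Attempt(True, False, 130),
    Attempt(False, False, 200),
    Attempt(True, False, 100),
]
accuracy = accuracy_stats(attempts, benchmark_seconds=120)
print(accuracy)
```

Reporting all three figures side by side shows how much of the apparent success depended on assistance or extra time.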

Task timings

Relate to how much time participants require to complete each task

Common statistics: mean, median, range, standard deviation

Common task timing statistics: mean time

Mean time (average time)

A rough indication

Can be compared to the original time benchmark

Consider using the median score if the task times are very skewed

Common task timing statistics: median time and range

Median time: the time exactly in the middle position when all the completion times are listed in ascending order

Range: shows the highest and lowest completion times for each task

With a small sample size, each participant's performance is crucial

Common task timing statistics: standard deviation

Standard deviation: indicates how reliable the results of an experiment or test are

Like the range, a measure of variability
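The four timing statistics above can be sketched with Python's standard `statistics` module. The completion times are hypothetical, and the sketch assumes the sample standard deviation (`statistics.stdev`) is the one wanted for a small group of participants.

```python
import statistics

def timing_stats(times):
    """Summarize completion times (in seconds) for one task."""
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "range": (min(times), max(times)),   # lowest and highest time
        "stdev": statistics.stdev(times),    # sample standard deviation
    }

# Hypothetical times for five participants; one struggled badly.
times = [40, 45, 50, 55, 210]
stats = timing_stats(times)
print(stats)
```

With one very slow participant, the mean (80 seconds) sits well above the median (50 seconds), which is the skew that makes the median the safer summary here.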

Summarize preference data

Limited-choice questions: see how many participants selected each possible choice.

For a small sample size this may not be necessary to view trends.
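Counting selections for a limited-choice question is a simple tally. A minimal sketch, assuming a hypothetical question with four options and made-up responses:

```python
from collections import Counter

def tally_choices(responses, choices):
    """Count how many participants selected each possible choice."""
    counts = Counter(responses)
    # Include choices nobody picked so that the gaps are visible too.
    return {choice: counts[choice] for choice in choices}

# Hypothetical responses from seven participants.
responses = ["A", "B", "B", "A", "B", "C", "B"]
tally = tally_choices(responses, choices=["A", "B", "C", "D"])
print(tally)
```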


For free-form questions and comments: list all questions and group all similar answers into meaningful categories.

This lets you scan the results quickly for a general indication of the number of positive and negative comments.


For debriefing sessions: have all interviews transcribed and pull out the critical comments.

Compile and summarize other measures

Number of times returning to main navigation unnecessarily

Number (and type) of hints or prompts

Number of times the site map was accessed

Points of hesitation (and for how long)

Summarize scores by group or version
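Summarizing a score by group or version amounts to grouping the records and averaging each group. The sketch below assumes hypothetical ease-of-use ratings on a 1-5 scale, stored as `(version, score)` pairs.

```python
from collections import defaultdict
from statistics import mean

def mean_by_group(records):
    """Average a score per group, given (group, score) pairs."""
    grouped = defaultdict(list)
    for group, score in records:
        grouped[group].append(score)
    return {group: mean(scores) for group, scores in grouped.items()}

# Hypothetical ease-of-use ratings for two product versions.
ratings = [("A", 4), ("A", 3), ("A", 5), ("B", 3), ("B", 4), ("B", 2)]
summary = mean_by_group(ratings)
print(summary)
```

The same function works for any of the measures listed above, such as the number of hints per participant, as long as each record carries its group label.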

Analyze data

Time to:

Identify tasks that did not meet the success criterion

Identify user errors and difficulties

Conduct a source of error analysis

Prioritize problems

Analyze differences between groups or product versions

Use inferential statistics

Analyzing data: what you have to do

Identify tasks that did not meet the success criterion

Use a 70 percent success criterion for a typical assessment test.

If you run small tests and iterate on the design, the likelihood of reaching the 70 percent success criterion should grow.

If you demand too high a success rate (95 percent) for the first usability test, you will flag almost all tasks.
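Flagging tasks against the success criterion can be sketched as a simple filter. The task names and outcomes below are hypothetical, and the sketch assumes a task is flagged when its success rate falls strictly below the criterion.

```python
SUCCESS_CRITERION = 0.70  # 70 percent, as suggested for an assessment test

def flag_tasks(results, criterion=SUCCESS_CRITERION):
    """Return (task, success_rate) for tasks below the criterion."""
    flagged = []
    for task, outcomes in results.items():
        rate = sum(outcomes) / len(outcomes)
        if rate < criterion:
            flagged.append((task, rate))
    return flagged

# Hypothetical outcomes for ten participants (True = success).
results = {
    "task1": [True] * 9 + [False] * 1,   # 90 percent
    "task2": [True] * 6 + [False] * 4,   # 60 percent
    "task3": [True] * 7 + [False] * 3,   # 70 percent
}
flagged_at_70 = flag_tasks(results)
flagged_at_95 = flag_tasks(results, criterion=0.95)
print(flagged_at_70)
print(flagged_at_95)
```

With these numbers only `task2` is flagged at 70 percent, while a 95 percent criterion flags all three tasks, which leaves nothing to prioritize.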

Identify user errors and difficulties

It is helpful to define what an error is. Do this in a validating or summative test. The purpose is to understand what the possible errors are.

Conduct a source of error analysis

Identify the source of every error.

The transition point from task orientation to product orientation.

The ultimate detective work, and the most labor-intensive portion.

The goal: to attribute a product-related reason to user difficulties and/or poor performance.

Prioritize problems

Rank usability problems by criticality.

Criticality = severity + probability of occurrence

This enables the development team to structure and prioritize the work.

How to categorize a problem by severity: use a 4-point scale.

Rank the problem by estimated frequency of occurrence.

An easier way: ask participants to tell you what was the most problematic situation.
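The criticality ranking can be sketched as a sum followed by a sort. The source gives only a 4-point severity scale, so the direction of that scale (4 = most severe) and the 4-point frequency ranking used here are assumptions, as are the `Problem` record and the example problems.

```python
from dataclasses import dataclass

@dataclass
class Problem:
    description: str
    severity: int    # assumed scale: 1 (least severe) to 4 (most severe)
    frequency: int   # assumed scale: 1 (rare) to 4 (very frequent)

    @property
    def criticality(self):
        # Criticality = severity + probability of occurrence
        return self.severity + self.frequency

def prioritize(problems):
    """Order problems from most to least critical."""
    return sorted(problems, key=lambda p: p.criticality, reverse=True)

# Hypothetical problems from a test.
problems = [
    Problem("Label wording is unclear", severity=1, frequency=4),
    Problem("Cannot find the save command", severity=4, frequency=3),
    Problem("Error message hides the form", severity=3, frequency=1),
]
ranked = prioritize(problems)
for p in ranked:
    print(p.criticality, p.description)
```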

Analyze differences between groups or product versions

Measure (participants)      Ver. A    Ver. B
Tasks correct               77.22%    76.67%
Liked best                  4         8
Prefer to teach a novice    3         9
Ease of use (1-5)           3.8       3.6

Using inferential statistics

Infer something about a larger population from the smaller sample of test participants.

Cautions:

Those conducting the test have often not been sufficiently trained in the use and interpretation of inferential statistics.

Those reading the results are rarely trained in interpreting them and can easily misinterpret them.

The design of the test depends greatly on whether or not you are trying to obtain statistical results.

Sample size
