The Process for Conducting a Test, Ch. 11: Analyze Data and Observations
Post on 21-Dec-2015
The process for conducting a test
Ch.11 Analyze Data and Observations
Analyze Data and Observations
Recommendations for improvement
Preliminary analysis
Identifies the hot spots without having to wait for the final test report
Delivered as a small written report or a verbal presentation
To see the larger trends and patterns: compile data, summarize data, analyze data
Comprehensive analysis
A final, more exhaustive report
A word of caution
Be timely.
Err on the conservative side by providing too few preliminary recommendations rather than too many.
Clearly mark preliminary findings as preliminary, in large letters.
Begin compiling data as you test
Whether or not you create a preliminary report, keep compiling data throughout the test sessions.
This helps you see whether anything important is missing.
It also helps you understand what you have collected.
Iterative, fast-turnaround test
Changes are made to the product after every few sessions.
Compiling data each time will be necessary.
Update the test plan, session script, and debriefing guide to reflect the revisions.
Organize raw data
Raw data
Recordings, notes, and questions
Comments from participants
Issues lists from observers
Toolbox for organizing
Lists
Tallies
Matrices
Stories
Storyboards
Structure models
Flow diagrams
Spreadsheets
Summarize data
Get a snapshot of what happened during the test.
Indicate whether there were differences in performance between different groups or between different versions of a product.
Summarize performance data
Descriptive statistics: the most common statistics
Simple techniques for classifying the characteristics of the data
Use simple formulas that are available in most computer spreadsheets
Task accuracy
Task timings
Task Accuracy
Count the number of errors made per task.
Categorize the errors by type.
Track the number of participants who performed successfully and the number who required some assistance to succeed.
Three types of statistics relate to task accuracy
Percentage of participants performing successfully, including those who required assistance.
Percentage of participants performing successfully.
Percentage of participants performing successfully within a time benchmark.
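The three percentages above can be computed from per-participant records. The sketch below is only an illustration: the `TaskResult` layout, the sample data, and the 120-second benchmark are hypothetical, and it assumes that the "within a time benchmark" figure counts unassisted completions only.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """One participant's outcome on one task (hypothetical record layout)."""
    completed: bool   # the participant finished the task
    assisted: bool    # the moderator had to help
    seconds: float    # time spent on the task


def accuracy_stats(results, benchmark_seconds):
    """Return the three task-accuracy percentages for a single task."""
    n = len(results)
    if n == 0:
        raise ValueError("no results to summarize")
    incl_assisted = sum(r.completed for r in results)
    unassisted = sum(r.completed and not r.assisted for r in results)
    # Assumption: only unassisted completions count toward the benchmark figure.
    in_time = sum(
        r.completed and not r.assisted and r.seconds <= benchmark_seconds
        for r in results
    )
    return {
        "success_incl_assisted": 100.0 * incl_assisted / n,
        "success_unassisted": 100.0 * unassisted / n,
        "success_within_benchmark": 100.0 * in_time / n,
    }


# Made-up data for four participants on one task.
results = [
    TaskResult(completed=True, assisted=False, seconds=40.0),
    TaskResult(completed=True, assisted=True, seconds=95.0),
    TaskResult(completed=True, assisted=False, seconds=130.0),
    TaskResult(completed=False, assisted=False, seconds=180.0),
]
stats = accuracy_stats(results, benchmark_seconds=120.0)
print(stats)
```

With this made-up data the three figures come out as 75, 50, and 25 percent, which shows how much the choice of statistic changes the picture of the same task.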
Task timings
Relates to how much time participants require to complete each task
Common statistics: mean, median, range, standard deviation
Common task timing statistics: mean time
Mean time (the average time)
A rough indication of how the group performed
Can be compared to the original time benchmark
Consider using the median score if the task times are very skewed
Common task timing statistics: median time and range
Median time
The time exactly in the middle position when all the completion times are listed in ascending order
Range
Shows the highest and lowest completion times for each task
With a small sample size, each participant's performance is crucial
Common task timing statistics: standard deviation
Standard deviation
Indicates how much confidence can be placed in the experimental or test results
Like the range, a measure of variability
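The four timing statistics named above are available in Python's standard `statistics` module as well as in spreadsheets. The following is a minimal sketch with made-up completion times. The `timing_summary` helper is hypothetical, and it uses the sample standard deviation (dividing by n - 1), which is one of two common conventions.

```python
import statistics


def timing_summary(times):
    """Summarize completion times (in seconds) for one task."""
    if len(times) < 2:
        raise ValueError("need at least two completion times")
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "range": (min(times), max(times)),   # lowest and highest time
        "stdev": statistics.stdev(times),    # sample standard deviation
    }


# Made-up, deliberately skewed data: one participant took far longer.
times = [50, 55, 60, 65, 270]
summary = timing_summary(times)
print(summary)
```

With these numbers the mean is 100 seconds while the median is 60 seconds. The single slow participant pulls the mean up, which is why the median is the safer figure when times are very skewed.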
Summarize preference data
For limited-choice questions
See how many participants selected each possible choice.
With a small sample size, this may not be necessary to see the trends.
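Counting how many participants selected each choice is a simple tally. Here is a small sketch using Python's `collections.Counter`. The question wording, the answer options, and the `tally_choices` helper are hypothetical.

```python
from collections import Counter


def tally_choices(answers, choices):
    """Count how many participants picked each possible choice.

    Choices nobody selected are reported with a count of 0.
    """
    counts = Counter(answers)
    return {choice: counts.get(choice, 0) for choice in choices}


# Made-up answers to "The checkout was easy to use".
choices = ["Agree", "Neutral", "Disagree", "No opinion"]
answers = ["Agree", "Agree", "Neutral", "Disagree", "Agree"]
tally = tally_choices(answers, choices)
print(tally)
```

Passing the full list of choices keeps unselected options visible in the summary, which can itself be a finding.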
Summarize preference data
For free-form questions and comments
List all questions and group all similar answers into meaningful categories.
This lets you scan the results quickly for a general indication of the number of positive and negative comments.
Summarize preference data
For debriefing sessions
Have all interviews transcribed.
Pull out the critical comments.
Compile and summarize other measures
Number of times returning to the main navigation unnecessarily
Number (and type) of hints or prompts
Number of times the site map was accessed
Points of hesitation (and for how long)
Summarize scores by group or version
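Summarizing a measure by group or version amounts to grouping the records and averaging each group. The sketch below assumes a hypothetical record layout (a `group` key plus one key per measure) and uses made-up counts.

```python
from collections import defaultdict
from statistics import mean


def mean_by_group(records, measure):
    """Average one measure separately for each group (or product version)."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["group"]].append(rec[measure])
    return {group: mean(values) for group, values in grouped.items()}


# Made-up session records; the field names are assumptions for illustration.
records = [
    {"group": "novice", "hints": 3, "site_map_visits": 2},
    {"group": "novice", "hints": 5, "site_map_visits": 4},
    {"group": "expert", "hints": 1, "site_map_visits": 0},
    {"group": "expert", "hints": 0, "site_map_visits": 1},
]
hints_by_group = mean_by_group(records, "hints")
print(hints_by_group)
```

The same function works for product versions if the `group` field holds the version label instead.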
Analyze data
It is time to:
Identify tasks that did not meet the success criterion
Identify user errors and difficulties
Conduct a source of error analysis
Prioritize problems
Analyze differences between groups or product versions
Use inferential statistics
To analyze the data, you have to:
Identify tasks that did not meet the success criterion
Use a 70 percent success criterion for a typical assessment test.
As you run small tests and iterate on the design, the likelihood of reaching the 70 percent success criterion should grow.
If you demand too high a success rate (95 percent) for the first usability test, you will flag almost all tasks.
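Flagging tasks against the success criterion is a simple filter. In the sketch below the task names and success rates are made up, the `flag_tasks` helper is hypothetical, and a task that lands exactly on the criterion is assumed to have met it.

```python
def flag_tasks(success_rates, criterion=70.0):
    """Return the tasks whose success rate (in percent) falls below the criterion."""
    return sorted(
        task for task, rate in success_rates.items() if rate < criterion
    )


# Made-up success rates per task, in percent.
success_rates = {
    "Create account": 90.0,
    "Find order history": 60.0,
    "Change password": 70.0,
    "Export report": 45.0,
}
flagged_70 = flag_tasks(success_rates)                  # typical assessment test
flagged_95 = flag_tasks(success_rates, criterion=95.0)  # overly strict criterion
print(flagged_70)
print(flagged_95)
```

With the 70 percent criterion two tasks are flagged. With 95 percent all four are, which illustrates why too strict a criterion on a first test is of little help in deciding where to focus.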
Identify user errors and difficulties
It is helpful to define what an error is.
Do this in a validation or summative test.
The purpose is to understand what the possible errors are.
Conduct a source of error analysis
Identify the source of every error.
This is the transition point from task orientation to product orientation.
It is the ultimate detective work and the most labor-intensive portion of the analysis.
The goal is to attribute a product-related reason to user difficulties and/or poor performance.
Prioritize problems
Rank usability problems by criticality.
Criticality = severity + probability of occurrence
This enables the development team to structure and prioritize the work.
How to categorize a problem by severity
Categorize each problem on a 4-point scale.
Rank each problem by estimated frequency of occurrence.
An easier way: ask participants to tell you what the most problematic situation was.
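The criticality formula above can be applied to a problem list to produce a ranked work queue. This is a minimal sketch: the slides give a 4-point severity scale, and the example assumes the frequency estimate is also expressed as a rank from 1 to 4. The `criticality` helper and the problem descriptions and ratings are made up.

```python
def criticality(severity, frequency):
    """Criticality = severity + probability of occurrence.

    Both inputs are ranks from 1 (lowest) to 4 (highest). Using a 4-point
    rank for frequency is an assumption made for this example.
    """
    for name, value in (("severity", severity), ("frequency", frequency)):
        if value not in (1, 2, 3, 4):
            raise ValueError(f"{name} must be an integer from 1 to 4")
    return severity + frequency


# Made-up problems: (description, severity rank, frequency rank).
problems = [
    ("Label for Save is ambiguous", 2, 4),
    ("Cannot recover from failed upload", 4, 2),
    ("Help link hard to find", 1, 1),
    ("Search returns no results for plurals", 3, 4),
]

# Highest criticality first; ties keep their original order.
ranked = sorted(problems, key=lambda p: criticality(p[1], p[2]), reverse=True)
for description, severity, frequency in ranked:
    print(criticality(severity, frequency), description)
```

Because the two ranks are simply added, a severe but rare problem and a mild but frequent one can tie, so the ranked list is a starting point for discussion with the development team rather than a final order.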
Analyze differences between groups or product versions
Measure (summary across participants)    Ver. A    Ver. B
# tasks correct                          77.22%    76.67%
Liked best                               4         8
Prefer to teach a novice                 3         9
Ease of use (1-5)                        3.8       3.6
Using inferential statistics
Infer something about a larger population from the smaller sample of test participants.
Caution
Those conducting the test often have not been sufficiently trained in the use and interpretation of inferential statistics.
Those receiving the results are rarely trained in interpreting them and can easily misinterpret them.
How the test is designed and conducted depends greatly on whether or not you are trying to obtain statistical results.
Sample size