
Copyright (C) Te’eni 5/27/23

7 USABILITY AND EVALUATION

Contents

7.1 Context and Synopsis

7.2 Usability Mini-Case

7.3 The definition of usability and its place in systems development

7.4 Usability assessment techniques

7.5 Evaluation

7.6 Other measures of performance

7.7 Conclusion

7.8 Bibliography


7.1 Context and Synopsis

Context

This chapter breaks out of the cognitive/rational perspective of HCI to include experiential and subjective aspects of HCI. This broader perspective emphasizes actual work rather than rational goals and methods. It may seem more practical than the earlier discussions. Indeed, the measurement techniques introduced here have been more popular with industry.

Synopsis

Usability and evaluation are closely related. Usability is about designing a system with a given functional specification so that the user finds it compatible with her own wants and needs. Evaluation has to do with assessing the overall impact of the system; it therefore incorporates the effect of the system's functionality as well as its usability. This chapter includes a collection of techniques for assessing usability. More general evaluation techniques are also discussed because it is sometimes difficult to differentiate between the impact of functionality and that of usability. In particular, we look at techniques that help us compare interfaces, assess their impact and, most importantly, indicate what must be improved.


7.2 Usability Mini-Case

"The 1984 Olympic Message System": a case study adapted from Gould et al., 1987, Communications of the ACM, Vol. 30. pp. 758-769.

The system was built over a decade ago, yet there is still much to learn from the experience of designers who were sensitive to usability issues. This mini-case shows that when you build a system:

1) you will not get it right the first time round;
2) you will get different answers when you use different methods of observation; and
3) you will need to come up with integrative solutions to the problems you encounter.

The Olympic Message System (OMS) allowed Olympians to exchange voice messages among themselves and with friends from all over the world. It worked in 12 languages. OMS kiosks were designed like round phone booths; each had a PC-driven visual display of the names of users with new messages, an electronic bulletin board with news items, and a tutorial on a videodisc. Over 43,000 usages of the OMS were recorded during four weeks of operation. It was reliable and worked 24 hours a day. Users liked the system, and forty percent of the Olympians used it at least once. OMS was definitely a success. The designers explain that they followed three principles of system design:

1) Early focus on users and tasks.
2) Empirical measurement of user behavior.
3) Iterative design.


The designers used a host of techniques in their study of user behavior. First, they created printed scenarios similar to the one shown in Figure 7.1. Next, early iterative tests of user guides included mailing guides to Olympians and their families. Early simulations of the system's input and output functions added the question of how much the user needs to know in order to work with the system. This was followed by early demonstrations that pushed the designers to reduce functionality drastically. The team of designers then invited a user to join the team. They visited the actual sites, which made them realize that some of their original ideas would not work on campus. Interviews with real users of diverse groups, such as overseas users, revealed more problems. Below are some examples of how these researchers improved the initial design after observing use in a pre-Olympic event.

When a user called, OMS asked for the caller's country code. Europeans apparently confused this code with an international dialing code. It is a plausible confusion, since people do not usually associate any generic code with their country but rather think of one in whatever context is relevant. This confusion was not detected in the earlier tests, however, because those tests included Americans only. So the system prompt was changed to ask for the "three-letter Olympic" country code. Note that this change had to be made not only in the interface but also in the help system, the printed handouts and all the translations; inconsistencies between the English instructions and the user's native language would have led to more mistakes.

You: (Dial 740-4567)

OMS: Please keypress your three-letter Olympic country code.

You: U.S.A.

OMS: John Jones - Please keypress your password.

You: 405

OMS: New message sent by Message Center: "John, good luck in your race. Dad." End of message. Press 1 to listen again; 2 to leave a message; 3 to hang up.

You: 3

OMS: Good-bye.

Figure 7-1: A dialog between the user and the OMS, the Olympic Message System (Gould et al., 1987)


There were also problems in getting the user's name. First, some users were not sure whether to input their last or first name. A clever solution was to instruct users (again in the interface, the help system, the printed materials and training) to spell their names exactly as they appeared on their badges. The solution assumed that badges were printed by the system; it would be difficult to rely on manual printing being consistent without both training and control over implementation. Once the badge name was dependable, it could indeed become the standard personal identification throughout the interaction with the system. Second, in response to an earlier demand to minimize keystrokes, OMS identified a person by the minimum number of characters necessary. In the field, however, when users continued to type the full name past the necessary minimal set, the extra characters were taken as part of the password. This obviously created havoc. The simple solution here was to absorb the extra characters. How could you, as a designer, make sure the user knew what was happening?
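To make this name-matching scheme concrete, below is a minimal sketch of how minimal unique prefixes and the absorb-extra-characters fix might work. It is an illustration only; the names, the input format and the function names are invented, not taken from the OMS.

```python
def minimal_unique_prefix(name: str, names: list[str]) -> str:
    """Shortest prefix of `name` that no other registered name shares."""
    for i in range(1, len(name) + 1):
        prefix = name[:i]
        if not any(other != name and other.startswith(prefix) for other in names):
            return prefix
    return name  # name is a prefix of another name; fall back to the full name

def match_keypresses(keys: str, names: list[str]) -> tuple[str | None, str]:
    """Return (matched name, leftover keys) for a stream of typed characters.
    Characters typed past the identifying prefix that still match the badge
    name are absorbed, so they are not mistaken for the password."""
    for name in names:
        prefix = minimal_unique_prefix(name, names)
        if keys.upper().startswith(prefix):
            rest = keys[len(prefix):]
            extra = 0
            for typed, expected in zip(rest.upper(), name[len(prefix):]):
                if typed != expected:
                    break
                extra += 1
            return name, rest[extra:]
    return None, keys

names = ["JONES", "JOHNSON", "SMITH"]
print(match_keypresses("JONES405", names))  # ('JONES', '405'): 'ES' absorbed
print(match_keypresses("JOH2468", names))   # ('JOHNSON', '2468')
```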

These and many other problems that were discovered led to several more general features such as an expanded tutorial and an outreach program. The researchers conclude that piecemeal solutions would not have been enough. Only through integrated design solutions could these usability problems be corrected.


7.3 The definition of usability and its place in systems development

Up to now we have looked at the user's behavior from a cognitive perspective. The cognitive models introduced in Chapters 3 and 4 represent behavior as formulating goals, planning a strategy to achieve these goals and then executing the strategy. This behavior, which we called user activity, results in performance assessed by accuracy and time. The current chapter emphasizes subjective feelings and actual behavior rather than the more rational, idealized flavor of user activity. To do so it introduces the terms usability, attitudes and satisfaction.

In Figure 7.2, user behavior begins with perceptions of the system's usability and usefulness (for a detailed explanation of the effect of perceptions on attitudes see Davis, 1992). The user's attitudes towards the system are based on these perceptions. The attitudes affect the user's intentions and behavior, which in turn, affect performance. Perceptions of performance are in relation to the user's initial expectations of usability (or ease of use) and usefulness. These perceptions affect the user's satisfaction with the system. The whole process is affected by the system's design, individual characteristics (e.g., experience, thinking style) and environmental factors (e.g., time pressure, uncertainty, and organizational culture). This chapter concentrates on measures of usability, attitudes and satisfaction, as they are affected by the system’s design. Individual characteristics and environmental factors are added in later chapters.

ISO (the International Organization for Standardization) defines usability as 'a concept comprising the effectiveness, efficiency and satisfaction with which specified users can achieve specified goals in a particular environment' (ISO CD 9241-11). Other definitions are closer to ease of use, e.g., Ken Eason's (1988): the degree to which users are able to use the system with the skills, knowledge, stereotypes and experience they can bring to bear. Barnard et al. (1981) clarify the difference between functionality, which is a list of functions needed to perform the specified tasks, and usability. They write: "to be truly useable, the system must be compatible not only with characteristics of the human perception and action but with the user's cognitive skills in communication, understanding, memory and problem solving". In other words, you cannot achieve a good fit between user, task and computer without looking into the way users function. To design truly usable human-computer interactions, users cannot be treated as a black box: knowing the inputs and outputs but ignoring the human processes is not enough.

Figure 7-2: Attitudes, use, performance and satisfaction (perceived usefulness and perceived usability feed attitudes; attitudes drive use, i.e., intentions and behavior; use determines performance; and performance shapes satisfaction)


From a practical viewpoint, it is important to define usability so that we can measure it. For example, one of the European centers for developing usability metrics (ESPRIT MUSiC) takes a very broad view of usability. It combines the system's ease of use (which affects user performance and satisfaction) with the system's acceptability (which determines whether the product is used). This definition has led to four categories of usability indicators that are based on performance (see Table 7.1).

These usability indicators are very general. Table 7.2 is a step closer to practical measures. It assumes a given task, but one that is complex enough to require nontrivial behavior. The list is taken from Whiteside, Bennett and Holtzblatt (1988).

The narrower view of usability emphasizes the user's ability: usability is about ensuring that a system with a given level of functionality is easy to use. When does one conduct usability studies? Ideally, aspects of usability should be incorporated into the traditional systems development life cycle.

Table 7-1: Usability indicators based on performance

1. Goal achievement indicators (success rate, failure rate, accuracy, effectiveness).

2. Work rate indicators (speed, completion rate, efficiency, productivity, productivity gain).

3. Operability indicators of the user's ability to make use of the system's features (error rate, problem rate, function usage).

4. Knowledge acquisition indicators of the user's ability and effort in learning to use the system (learnability and learning period).

Table 7-2: Usability measures based on performance and HCI

1. Time to complete a task
2. Number of user commands to complete the task
3. Fraction of task completed
4. Fraction of task completed in a given time
5. Number of errors
6. Time spent on errors
7. Frequency of online help use
8. Number of available commands not used
9. When the task is repeated, ratio of successes to failures
10. Fraction of positive comments made by the user
11. Fraction of good to bad features recalled by the user
12. Number of expressions of frustration and satisfaction
13. Number of times the user loses control over the system
14. Number of times the user needs to devise a way of working around the problem/system
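As an illustration of how some of the measures in Table 7.2 can be operationalized, here is a minimal sketch that computes a few of them from a timestamped interaction log. The log format and event names are assumptions made for the example.

```python
# Compute a few Table 7.2-style measures from a simple interaction log.
from dataclasses import dataclass

@dataclass
class Event:
    t: float    # seconds since session start
    kind: str   # e.g. "command", "error", "help", "task_done"

def usability_measures(log: list[Event]) -> dict:
    return {
        # measure 1: time at which the task was completed (None if never)
        "time_to_complete": next((e.t for e in log if e.kind == "task_done"), None),
        # measure 2: number of user commands issued
        "n_commands": sum(e.kind == "command" for e in log),
        # measure 5: number of errors
        "n_errors": sum(e.kind == "error" for e in log),
        # measure 7: uses of online help
        "help_uses": sum(e.kind == "help" for e in log),
    }

log = [Event(1.2, "command"), Event(3.5, "error"), Event(4.0, "help"),
       Event(9.8, "command"), Event(12.4, "task_done")]
print(usability_measures(log))
# {'time_to_complete': 12.4, 'n_commands': 2, 'n_errors': 1, 'help_uses': 1}
```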


Indeed, one can think of a usability engineering life cycle. Table 7.3 summarizes the life cycle's stages according to Nielsen (1993).

The first step is 'know the user', i.e., study the user and her view of using the system to accomplish a task as she sees it. Individual characteristics and task variability drastically affect usability. Individual characteristics include experience, age, domain expertise, attitudes and much more. These data can be gathered through interviews and questionnaires, and this is best done within a structured framework. Recall, for example, the block interaction model of user knowledge in Figure 3.5. Such a model is a good framework for locating user knowledge that is pertinent to the specific setting. Task analysis is the study of how the user sees and solves the task. Within traditional systems analysis, there are usually instances of use that are left uncovered because they do not fit into the functional decomposition. Task analysis examines use from the user's perspective, taking into account the user's goals, how she approaches the task, and how she deals with exceptional cases.

The next step is to analyze competing products according to standard usability guidelines and, if possible, to test the use of these products with a sample of your users. The third step is to set specific usability goals. Table 7.1 is a good general starting point, but one should be as precise as possible. For example, reducing errors from 5 to 2 per session is a good goal. This could be improved further by suggesting that anything beyond 5 errors would render the system unacceptable for use (a detailed example is given below - Table 7.6). The fourth step according to Nielsen is working on several designs in parallel to enrich the final result with the best of all options. This is especially useful with novel technologies where little is known in advance. In addition, Nielsen advocates participatory design in which the user becomes more active in generating ideas.
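To make the goal-setting step concrete, here is a minimal sketch of encoding quantitative usability goals as bands, following the error-rate example above. The band names and thresholds are illustrative assumptions.

```python
# Encode a quantitative usability goal with a target level and an
# unacceptability threshold, then judge a measured value against it.
GOALS = {
    "errors_per_session": {"target": 2, "unacceptable_above": 5},
}

def judge(measure: str, value: float) -> str:
    g = GOALS[measure]
    if value > g["unacceptable_above"]:
        return "unacceptable"
    if value <= g["target"]:
        return "meets target"
    return "acceptable, needs improvement"

print(judge("errors_per_session", 3))  # acceptable, needs improvement
print(judge("errors_per_session", 7))  # unacceptable
```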

The sixth step is one of coordinating the entire interface in its broadest sense, e.g., consistent terminology across input and output screens, help facilities, manuals, tutorials, etc. Recall the lesson learnt from the mini-case of the 1984 Olympic Message System: usability solutions must be comprehensive.

Table 7-3: The usability engineering life cycle

Know the user
Analyze competing products
Set usability goals
Consider alternative designs
Engage in participatory design
Coordinate the total interface
Check against heuristic guidelines
Prototype
Evaluate interface
Design in iterations
Follow up with studies of installed systems


The seventh step is more testing according to standard usability guidelines (e.g., minimize effort, provide feedback). Whenever possible, prototyping should be encouraged. Then a detailed interface evaluation should be conducted; the appropriate techniques are given below. Finally, the lessons learnt should be incorporated into the design, and follow-up studies should be conducted.

Sutcliffe and McDermott (1991) propose an approach similar to Nielsen's engineering life cycle (see Figure 7.3). The figure adapts the traditional flow of systems development to include usability aspects from the very beginning of work. This approach is important because it integrates usability into common practices of systems development, a tactic that may be politically important given the tendency to minimize the effort and time devoted to 'non-productive' activities.


Figure 7-3: Flow diagram of the usability process integrated into software engineering (Sutcliffe & McDermott, 1991)


The usability steps in the figure are: 1 analyze user tasks; 2 describe user characteristics; 3 analyze user view; 4 allocate tasks and actions; 5 design human tasks; 6 design computer tasks; 7 select interface style; 8 design interface dialogs; 9 design interface displays; leading to user manuals and job design. These run alongside the software engineering activities: requirements analysis, functional analysis, data analysis, data structure design, process design, program design and detailed software design.


7.4 Usability assessment techniques

Methods of usability assessment include think-aloud, observation, interviews, focus groups, automatic logs and many others. Different techniques can be used at different stages of the usability engineering life cycle. Consider, for example, the first step of knowing your user. Interviews, which can be structured, semi-structured or unstructured, are an obvious technique for getting first-hand impressions of a few representative users. With larger numbers of users, questionnaires are more practical.

It is often important to learn about the users' attitudes before implementation. This may produce findings that have strong implications for the need for training and motivation. The attitude questionnaire in Table 7.4 was developed by Magid Igbaria. Construction of such a questionnaire requires several iterations and field testing to ensure a reliable measure; it is therefore recommended to use validated instruments. Table 7.4 contains the final set of items measuring attitudes toward microcomputers in general (taken from Igbaria and Parasuraman, 1991). Even though the questionnaire has been validated, it is not yet ready for a specific setting. The designer must examine what needs to be adapted, e.g., terms may need to be changed to those used in a particular organization. If, for instance, your organization uses the term workstations rather than microcomputers, you should change the questionnaire accordingly.

Table 7-4: Attitude questionnaire

1. Using a microcomputer could provide me with information that would lead to better decisions.

2. I wouldn't use a microcomputer because programming it would take too much time.

3. I'd like to use a microcomputer because it is oriented to user needs.

4. I wouldn't use a microcomputer because it is too time consuming.

5. Using a microcomputer would take too much time away from my normal duties.

6. Using a microcomputer would involve too much time doing mechanical operations (e.g., programming, inputting data) to allow sufficient time for managerial analysis.

7. A microcomputer would be of no use to me because of its limited computing power.

8. I'd like to learn about ways that microcomputers can be used as aids in managerial tasks.

9. Using a microcomputer would result in a tendency to over-design simple tasks.

10. I wouldn't want to have a microcomputer at work because it would distract me from my normal job duties.

11. A microcomputer would give me more opportunities to obtain the information that I need.


12. I wouldn't favor using a microcomputer because there would be a tendency to use it even when it was more time consuming than manual methods.

13. I'd like to have a microcomputer because it is so easy to use.

14. I'd hesitate to acquire a microcomputer for my use at work because of the difficulty of integrating it with existing information systems.

15. I'd discourage my company from acquiring microcomputers because most application packages would need to be modified before they could be useful in our specific situation.

16. It is easy to access and store data in a microcomputer.

17. A microcomputer would be of no use to me because of the limited availability of application program packages.

18. A microcomputer would be of no use to me because of its small storage capacity.

19. It is easy to retrieve or store information from/to a microcomputer.

20. Using a microcomputer would give me much greater control over important information.
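As a sketch of how such an instrument might be scored, the code below assumes a five-point Likert scale and averages the items after reverse-keying the negatively worded ones. The reverse-keyed set is our reading of the item wording above, not part of the published instrument.

```python
# Score a Likert attitude instrument, reverse-keying negatively worded items
# (1 = strongly disagree .. 5 = strongly agree). Item set is an assumption.
NEGATIVE_ITEMS = {2, 4, 5, 6, 7, 9, 10, 12, 14, 15, 17, 18}

def attitude_score(responses: dict[int, int], scale_max: int = 5) -> float:
    keyed = [(scale_max + 1 - v) if item in NEGATIVE_ITEMS else v
             for item, v in responses.items()]
    return sum(keyed) / len(keyed)

# one respondent: agrees (4) with positive items, disagrees (2) with negatives
responses = {i: (2 if i in NEGATIVE_ITEMS else 4) for i in range(1, 21)}
print(round(attitude_score(responses), 2))  # 4.0 on the 1..5 scale
```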

The seventh step in the engineering life cycle was 'check against heuristic guidelines' (Table 7.3). Although the thrust of usability testing is done with the user once the system is working and available, usability can be assessed before release by running through a checklist of heuristic guidelines. Table 7.5 is a list of such standard guidelines, adapted from Nielsen (1993).

Table 7-5: Heuristic guidelines

1. Create simple and natural dialog
2. Speak the user's language
3. Minimize the user's memory load
4. Be consistent
5. Provide feedback
6. Provide clearly marked exits
7. Provide shortcuts
8. Provide specific, corrective and positive error messages
9. Minimize propensity for error


7.5 Evaluation

Evaluation is a broader term than usability. Evaluation emphasizes performance-related assessment and includes usability testing. In this book, we concentrate on evaluating HCI, excluding such considerations as cost or technical maintainability; hence the strong overlap between usability and evaluation. The waterfall model of the systems development life cycle suggests that after each stage of the life cycle, a test is made to ensure the quality of the system thus far. If this evaluation fails, one should go back and revise whatever needs to be changed. Ideally, then, evaluation should occur throughout the development life cycle, e.g., evaluation of design and evaluation of implementation. In practice, evaluation may be limited to one or two stages. A useful distinction is between evaluation of functionality, i.e., with regard to the user's task specifications, and evaluation of usability (discussed above). We will talk about both.

There are several possible goals for evaluation: 1) assess the system's functionality against the intended specifications; 2) assess the system's effect on the user's behavior and attitudes; 3) assess the system's impact on measures of performance that are related to the user or the objective of the system; and 4) discover unintended problems and perhaps opportunities. As we shall see, the goal of evaluation affects the choice of the evaluation technique.

The human factors specialist has a range of evaluation techniques to choose from. It is important to know the styles and characteristics of the different techniques and to choose the most appropriate one for any specific task. The choice also depends on whether the findings of the evaluation study are put to use immediately, i.e., is the setting one of iterative design? Alternatively, it may be a general study with implications for future systems, e.g., testing reading from paper vs. reading from a screen.

Generally, evaluation techniques can be classified on several dimensions:

1) Exploratory vs. model based.
2) Design or implementation.
3) Field study vs. laboratory testing.
4) Design vs. use.
5) Level of performance measures.
6) Degree of designed manipulation and intrusion.

These dimensions are not independent of each other. For example, a high degree of designed manipulation may be possible only in a laboratory setting. Below is a quick summary of these dimensions. More advanced readings on research and usability methods can be found in Downton (1991) and Nielsen (1993).


Some studies are exploratory in nature. They observe the user and are guided purely by the observer's interpretation of the emerging data. Other studies are guided by some theory and are meant to either confirm or disconfirm certain predictions. For example, the GOMS model can be used to predict speed and accuracy, and the predictions can then be put to empirical test. See, for example, the study of using a spreadsheet discussed in Chapter 4.
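To illustrate a model-based prediction, here is a minimal sketch in the spirit of the Keystroke-Level Model, the simplest member of the GOMS family: total task time is the sum of primitive operator times. The operator estimates below are commonly cited averages from the literature (Card, Moran and Newell), and the encoded task is invented for the example.

```python
# Keystroke-Level-Model-style time prediction: sum primitive operator times.
# K = keystroke, P = point with mouse, H = home hands on device, M = mental
# preparation; values are commonly cited averages in seconds.
OPERATOR_TIME = {"K": 0.28, "P": 1.10, "H": 0.40, "M": 1.35}

def klm_predict(sequence: str) -> float:
    """Predict execution time for a string of operators, e.g. 'MHPKK'."""
    return sum(OPERATOR_TIME[op] for op in sequence)

# mentally prepare, home on mouse, point at a field, then type two characters
print(f"{klm_predict('MHPKK'):.2f} s")  # 3.41 s
```

A prediction like this can then be compared against measured task times, which is exactly the confirm-or-disconfirm use of theory described above.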

Evaluating the design prior to implementation is clearly different from evaluating a working system. There are several common techniques for evaluating design, some of which we have already seen. In the usability engineering life cycle, step 7 is a heuristic evaluation of the design according to standard design guidelines (Table 7.5). We have also mentioned evaluation of designs based on models. For example, in the GOMS model, methods that achieve the same goal should be evaluated for consistency. We could use this kind of analysis to eliminate inconsistencies at the design stage. A third type of evaluation is based on past empirical results but not necessarily complete theories. Here one should be cautious to determine whether past experiments are indeed applicable to the new problem, e.g., past studies may have found a touch sensitive screen to be less accurate than a mouse but new technologies for the former may overcome that shortcoming.

One of the most common techniques for evaluating designs is the cognitive walkthrough. Based on knowledge of cognitive engineering, the designer can look at a given task and analyze the cognitive process involved in using the system to accomplish it. In fact, the technique goes beyond that material to include a theory of learning by exploration (discussed in Chapter 8); it is therefore a theory-based technique, and it should point at potential problems in learning and using the system. The advanced reading by Peter Polson et al. (1992) provides an excellent tutorial on the cognitive walkthrough. The technique relies on forms that articulate the user's goal structure and the possible methods that can be used to achieve these goals. Figures 7.4-7.5, taken from Polson et al., demonstrate how detailed an analysis is required for even a simple interaction. Given a description of user steps (a scenario), the evaluator asks four questions for each step: 1) will the user try to achieve the right effect? 2) will the user notice that the correct action is available? 3) will the user associate the correct action with the effect that the user is trying to achieve? 4) if the correct action is performed, will the user see it as progress? A problem with any one of these questions for any of the steps indicates a usability problem.
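A minimal sketch of how the four questions might be recorded for each step of the action sequence is shown below. The record structure is an illustration, not the actual form used by Polson et al.

```python
# Record cognitive-walkthrough judgments, one record per step, mirroring
# the four questions listed above.
from dataclasses import dataclass

@dataclass
class WalkthroughStep:
    action: str
    right_goal: bool         # will the user try to achieve the right effect?
    action_visible: bool     # will the user notice the correct action is available?
    action_associated: bool  # will the user connect the action with the effect?
    progress_visible: bool   # will the user see the response as progress?

    def problems(self) -> list[str]:
        flags = {"goal formation": self.right_goal,
                 "action visibility": self.action_visible,
                 "action-effect association": self.action_associated,
                 "feedback": self.progress_visible}
        return [name for name, ok in flags.items() if not ok]

step = WalkthroughStep("keypress country code", True, True, False, True)
print(step.problems())  # ['action-effect association'] -> a usability problem
```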


Usability laboratories are facilities that are equipped with advanced interactive technologies to allow testing of state-of-the-art systems, as well as with recording devices such as audio/visual recorders and automated logging. The laboratories are typically sound-proof rooms with one-way mirrors separating the observation room from the user room. These methods of observation and data collection enable a detailed analysis of behavior. For example, a complete recording of every keystroke can be analyzed to see the effect of time pressure on system usage. These facilities also allow live discussions among the observers during the action.

Laboratory studies are especially useful when working out of context is not a major threat to the generality of the findings. This means that if a user would behave and use the system differently in a real office with physical file cabinets, phone calls and co-workers around her, then the findings of a study outside of this context cannot represent what will happen in context. Sometimes, the issue may be considered divorced from the surroundings, e.g., when you compare the use of two different colors for a given application. In these cases, and in other situations where it is simply unfeasible to test on site, laboratory studies may be the best solution.

Task description from the first-time user’s viewpoint. Include any special assumptions about the state of the system assumed when the user begins work.

Action sequence: Make a numbered list of the atomic actions that the user should perform to accomplish the task.

Anticipated users: Briefly describe the class of users who will use this system. Note what experience they are expected to have with similar or previous versions.

User's initial goals: List the goals the user is likely to form when starting the task. If there are other likely goals, list them, and estimate for each what percentage of users are likely to have them.

Figure 7-4: Cognitive walkthrough start-up (from Polson et al., 1992)


Field studies attempt to observe the user in her natural environment. In principle, this is preferable to observation out of context. It is, however, difficult to isolate other effects that may obscure the effect we are trying to examine. It is also difficult to record certain types of data that do not fit into the work environment.

A nice example of a laboratory study is reported by Scott and Findlay (1991). They tested two arrangements in a text-editor for presenting information about whether the state was insert mode or type-over mode. Subjects were divided into two groups. One group was presented with information only at the cursor position and the other group was presented with information only within a status window. By comparing the means of several measures between the two groups, the authors were able to conclude that information at the cursor position only resulted in faster performance. What is particularly interesting to us is the measures they used. These are listed in Table 7.6.
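A minimal sketch of the statistical core of such a between-groups comparison: compare the means of a measure (here, task time) across the two display conditions with a two-sample t-test. The data are invented for the example.

```python
# Compare mean task times between the cursor-position group and the
# status-window group with an independent-samples t-test.
from scipy import stats

cursor_group = [41.2, 38.5, 44.0, 39.7, 42.1]  # task times, seconds
window_group = [47.9, 45.3, 50.2, 46.8, 49.0]

t, p = stats.ttest_ind(cursor_group, window_group)
print(f"t = {t:.2f}, p = {p:.4f}")  # a small p suggests a reliable difference
```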


Table 7-6: Measures for comparing displays

Typing mistakes made before correction
Mistakes remaining
Pauses of 3 seconds or more immediately before a mode change
Other pauses of 3 seconds or more
Length of pause immediately before a mode change
Attempting to type-over whilst in insert mode
Attempting to insert whilst in type-over mode
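As a sketch of operationalizing one of these measures, the code below counts pauses of three seconds or more immediately before a mode change in a stream of timestamped events. The event representation is an assumption.

```python
# Count pauses >= threshold seconds that immediately precede a mode change,
# given (timestamp, event-kind) pairs in chronological order.
def pauses_before_mode_change(events: list[tuple[float, str]],
                              threshold: float = 3.0) -> int:
    count = 0
    for (t_prev, _), (t, kind) in zip(events, events[1:]):
        if kind == "mode_change" and t - t_prev >= threshold:
            count += 1
    return count

events = [(0.0, "key"), (1.1, "key"), (5.0, "mode_change"), (5.4, "key")]
print(pauses_before_mode_change(events))  # 1 (a 3.9 s pause before the change)
```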

The important lesson to learn from Table 7.6 is that usability measures need to be operationalized on the basis of a specific situation. From our earlier discussions, we should be careful to take into account not only the task, but also the user, the computer and the context.

1. Goal structure for this step.

1.1 Correct goals. What are the appropriate goals for this point in the interaction? Describe as for initial goals.

1.2 Mismatch with likely goals. What percentage of users will not have these goals, based on the analysis at the end of the previous step? Check each goal in this structure against your analysis at the end of the previous step. Based on that analysis, will all users have the goal at this point, or may some users have dropped it or failed to form it? Also check the analysis at the end of the previous step to see if there are unwanted goals that will be formed or retained by some users (0% 25% 50% 75% 100%).

Figure 7-5: Cognitive walkthrough for a step (from Polson et al., 1992)


Notice that the context is necessary too because it affects performance based measures of the task. In the text editing system above, obviously time and errors were an important consideration. If there is a trade off between the two, the context may dictate the relative importance. The exact types of errors were determined by considering both the interface feature to be tested and the task itself. In different situations it may have made sense to add, say, the time to recover from an error.

Different evaluation techniques attempt to measure performance at different levels, e.g., the time it takes to complete a task vs. the impact on work behavior. One common measure of performance is the user's satisfaction with the system in general and with the system's interface in particular.

A survey that is growing in popularity is QUIS, the Questionnaire for User Interface Satisfaction. It was developed by Ben Shneiderman and Kent Norman in the HCI Laboratory at the University of Maryland. The survey has a short and a long form and can be purchased both as a paper questionnaire and as an online system. Table 7.7 shows the short form. Note that the survey inquires into user characteristics as well as computer attributes concerning the screen, terminology, learning and general system capabilities. QUIS does not relate directly to any specific task. However, it is possible to add domain-specific items to such a questionnaire. For example, in translating this questionnaire into Hebrew, it was necessary to add several items that pertained to translation issues. Similarly, in using it to assess a new benefits system, it was necessary to adapt some terms to relate more closely to the task of filling in the benefit forms.

What is the difference between the attitude questionnaire in Table 7.4 and the satisfaction questionnaire in Table 7.7? Figure 7.2 showed attitudes affecting use, and use affecting satisfaction through performance. The usability engineering life cycle suggests that the attitude questionnaire can be employed at earlier stages to get to know the user, while the satisfaction questionnaire can be used later to assess the quality of the implemented system. The latter is performance related and is similar to other questionnaires that ask about perceived usefulness in terms of improved productivity. Note, however, that because use itself affects attitudes, the next time a user approaches the system her attitude is influenced by perceived usefulness, which in turn is influenced by previous experience.

Table 7-7: Questionnaire for user interface satisfaction


7.6 Other measures of performance

Physiological measures: Under some circumstances, it may be possible and advantageous to use physiological measures to assess usability. These include measures such as cortisol, pupil response, critical flicker frequency, evoked response potentials, heart rate, heart-rate variability, blood pressure, respiration, electromyography and electrodermal activity. For example, cortisol secretion is related to situations of losing controllability and can therefore be used to detect a user losing the sense of control during human-computer interaction. Another example is heart-rate variability, which decreases with increased task difficulty; indeed, it has been used to examine the effect of breaks on the reduction of workload in office automation (Wiethoff, Arnold and Houwing, 1991). For a detailed review see ESPRIT MUSiC 5429 on metrics for usability standards, 1991.

Metrics: Rengger (1991) proposed several metrics for the usability indicators in Table 7.1. Examples of metrics for indicators of goal achievement, work rate, operability and knowledge acquisition are shown in Table 7.8.
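As an illustration, here is a minimal sketch of computing some of these metrics (listed in Table 7-8 below) from session data; the function and argument names follow the table, the time unit is seconds, and the sample values are invented.

```python
# Compute a few of the Rengger (1991) usability metrics from Table 7-8.
def relative_efficiency(user_eff, expert_eff, expert_time, user_time):
    # (user effectiveness / expert effectiveness) * (expert time / user time), in %
    return (user_eff / expert_eff) * (expert_time / user_time) * 100

def productive_period(task_time, problem_time, learning_time):
    # share of the session not spent on problems or learning, in %
    return (task_time - problem_time - learning_time) / task_time * 100

def problem_recovery_level(n_problems, n_unsolved):
    # share of encountered problems that were solved, in %
    return (n_problems - n_unsolved) / n_problems * 100

print(relative_efficiency(0.8, 1.0, 120, 200))  # 48.0 (%)
print(productive_period(200, 30, 20))           # 75.0 (%)
print(problem_recovery_level(4, 1))             # 75.0 (%)
```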


Table 7-8: Metrics of usability indicators (adapted from Rengger, 1991)

1) Effectiveness = Quantity x Quality (%)

2) Relative efficiency = (User effectiveness / Expert effectiveness) x (Expert task time / User task time) (%)

3) Productive period = (Task time - Total problem time - Learning time) / Task time (%)

4) Problem rate = Number of problems encountered / Task time

5) Problem recovery level = (Number of problems encountered - Number of unsolved problems) / Number of problems encountered (%)

6) Complexity factor = (Number of calls for assistance / Number of actions undertaken) + ((Learning time + Total problem time) / Task time)

7.7 Conclusion

In sum, usability and evaluation are critical activities that must be planned and integrated into the development life cycle as early as possible. The designer must choose from the wide variety of techniques shown in this chapter those that are suitable for the particular context. More often than not, a combination of techniques will be used to support the different stages of the life cycle.


Figure 7-6, taken from Olson and Moran (1996), compares popular methods by type of method, benefits and costs. Type of method refers to whether it collects data (empirical), analyzes the task or system (analytic), or constructs system representations (construct).


The methods, grouped by development phase, are:

Define the problem: naturalistic observation (diaries, videotape, etc.); interviews (including focus groups, decision tree analysis, semantic nets); scenarios or use cases (including envisioning); task analysis (including operator function model).

Generate a design: building on previous designs (steal and improve, design guidelines); representing the conceptual model; representing the interaction (GTN, dataflow diagram); representing the visual display; design space analysis (QOC, decomposition analysis).

Reflect on the design: checklists; walkthroughs; mapping analysis (task-action, metaphor, consistency); methods analysis (GOMS, KLM, CPM, CCT); display analyses.

Build a prototype: prototyping tools; participatory prototyping.

Test the prototype: open testing (storefront or hallway, alpha, damage testing); usability testing.

Implement the design: toolkits (Motif, NeXTstep, Apple, etc.).

Deploy the system: internal testing; beta testing (logging, metering, surveys).

Figure 7-6: Comparison of methods (Olson and Moran, 1996)


Table 7-9: ANSI/HFES 200 outline and status in 1997

Section / Status
1 Introduction: draft
2 Accessibility: draft
3 Presentation of information: ISO document
4 User guidance: ISO document
5 Direct manipulation: ISO document
6 Color: draft
7 Forms fill-in: ISO document
8 Command languages: ISO document
9 Voice I/O (voice recognition: draft; non-speech auditory output: draft; interactive voice response: draft)
10 Visually displayed menus: re-drafted ISO document


Concepts in Chapter 7

Usability; Attitudes; Performance; Satisfaction; Evaluation; Heuristic guidelines; Usability engineering life cycle; Cognitive walkthrough

7.8 Bibliography

Barnard, P. et al. (1981).

Davis, F.D. (1992).

Downton, A. (1991).

Eason, K. (1988).

Goodwin, N.C. (1987) Functionality and usability. Communications of the ACM, 30(3), 229-233.

Gould, J.D. et al. (1987) The 1984 Olympic Message System: a test of behavioral principles of system design. Communications of the ACM, 30, 758-769.

Igbaria, M. and Parasuraman, S. (1991).

Nielsen, J. (1993) Usability Engineering. Academic Press.

Olson, J.S. and Moran, T.P. (1996).

Polson, P. et al. (1992).

Rengger, R.E. (1991) Indicators of usability based on performance. MUSiC WP NPL.M1.TW.1.PUBLIC, June 1991.

Scott, D. and Findlay, J.M. (1991) Optimum display arrangements for presenting status information. International Journal of Man-Machine Studies, 35, 399-407.

Sutcliffe, A. and McDermott, M. (1991).

Whiteside, J., Bennett, J. and Holtzblatt, K. (1988).

Wiethoff, M., Arnold, A.G. and Houwing, E.M. (1991) The value of psychophysiological measures in human-computer interaction. In H.J. Bullinger (ed.), Human Aspects in Computing: Design and Use of Interactive Systems and Work with Terminals. Elsevier Science Publishers, 661-665.
