
Interacting with virtual environments: an evaluation of a model of interaction

Kulwinder Kaur*, Neil Maiden, Alistair Sutcliffe

Centre for HCI Design, City University, Northampton Square, London EC1V 0HB, UK

Abstract

There is a need for interface design guidance for virtual environments, in order to avoid common usability problems. To develop such guidance an understanding of user interaction is required. Theoretical models of interaction with virtual environments are proposed, which consist of stages of interaction for task/goal oriented, exploratory and reactive modes of behaviour. The models have been evaluated through user studies and results show the models to be reasonably complete in their predictions about modes and stages of interaction. Particular stages were found to be more predominant than others. The models were shown to be less accurate about the exact flow of interaction between stages. Whilst the general organisation of stages in the models remained the same, stages were often skipped and there was backtracking to previous stages. Results have been used to refine the theoretical models for use in informing interface design guidance for virtual environments. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Virtual environments; Interaction modelling; Usability

1. Introduction

Virtual Environments (VEs) are three-dimensional, computer simulated environments which are rendered in real time according to the behaviour of the user [16]. VEs differ in important ways from conventional interfaces, offering new possibilities and bringing new challenges to human–computer interface design. Compared with direct manipulation (DM) interfaces, VEs are structured as 3D graphical models with only a sub-section of the model presented through the interface at any one time, whereas DM systems provide a 2D presentation area which continually presents objects of interest [25]. In VEs, the spatial structure of the model remains fairly static and the user navigates around the model to locate objects of interest.

Interacting with Computers 11 (1999) 403–426

0953-5438/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0953-5438(98)00059-9

* Corresponding author. Tel.: +44-171-477-8427; fax: +44-171-477-8859. E-mail addresses: [email protected] (K. Kaur), [email protected] (N. Maiden), [email protected] (A. Sutcliffe).

VEs are significantly more difficult to design and use than 2D interfaces [9] and there is a need for better-designed VE systems [2] that support perception, navigation, exploration and engagement [29]. Significant usability problems exist with current VEs. In an evaluation of the Royal Navy's Virtual Submarine [13], submariners experienced major interaction problems, such as maintaining a suitable viewing angle, navigating through tight areas, losing whereabouts after getting too close to objects and recognising interactive 'hot-spots' in the environment. Similar problems have been found in other evaluation studies of VEs (for example [17]), and these problems appear to be different to those found with conventional interfaces (for example [28]).

There are currently no guidelines and little knowledge of how VEs should be designed. Therefore, guidance is needed for VE interface design and, to develop such guidance, an understanding of user interaction behaviour is required [9,10,22]. There are models of interaction for conventional interfaces, but none exist for VEs. This paper describes theoretical models of interaction in VEs and validation studies on the models.

2. Theory of interaction

Previous work in interaction modelling has involved various approaches, such as the use of cognitive architectures (e.g. in the AMODEUS project, [1]), process models (e.g. [19]) that describe interactions at a higher level of granularity, and models of user knowledge and its use (e.g. Cognitive Complexity Theory, [14]). The approach adopted here was that of process modelling, which describes interaction at an appropriate level of detail for defining general requirements to support that interaction. By omitting lower level detail of cognitive tasks, precision in modelling is sacrificed for a wider scope. The theory elaborates on Norman's [19] general model of action to describe interaction in VEs.


Fig. 1. Norman’s seven stage model of interaction (from [19]).

Norman's theory is well known and has been used in developing interaction models for evaluating DM interfaces (see [26]). It consists of a seven-stage cycle of action, see Fig. 1.

The elaboration of Norman's model involved the explicit modelling of exploratory and reactive behaviours, which are important aspects of VE interaction. Tasks in VEs are often loosely structured with more emphasis on exploration and opportunistic action (as defined by [8]). For example, in many simulation and tutorial applications, the user's task is to investigate the environment so behaviour is primarily opportunistic following of cues. VEs are often active, with objects operating independently of the user's actions [3] and these environment events may demand or invite responsive behaviours [7] from the user. Therefore, three inter-connected models have been used to describe important modes of VE interaction:

• Task action model—describes purposeful behaviour in planning and carrying out specific actions as part of the user's task or current goal/intention, and then evaluating the success of actions.

• Explore navigate model—describes opportunistic and less goal-directed behaviour when the user explores and navigates through the environment. A target may be in mind or observed features may arouse interest.

• System initiative model—describes reactive behaviour to system prompts and events, and to the system taking interaction control from the user (for example taking the user on a pre-set tour of the environment).

The task action model was based on Norman’s action cycle, with additions for:

• Consideration of objects involved in an action. Since objects in a VE are not continually presented, the user may need to reason about what environment objects are available for carrying out actions.

• Searching for objects when they are not within the environment section in view. Search tasks are an important part of VE interaction (see [4]).

• Approaching objects and orienting correctly to them. Approaching objects is non-trivial in 3D interaction and appropriate 3D orientations to objects are required.

• Object investigation actions, as opposed to object manipulations. The user may only be interested in examining VE content [18], rather than manipulating it in some way.

Figs. 2, 4 and 6 show flow diagrams for each of the interaction models. Walking through task action mode, the user establishes a goal (stage 'establish goals' in Fig. 2) such as to study the electricity supply in a building, and forms an intention to carry out an action to turn on power ('intention task action'). S/he then considers what power objects are available in the environment ('consider objects'), such as mains boxes and switches. If the mains are not within his/her immediate vicinity, a search for them is carried out in explore navigate mode. Once the mains are found, s/he approaches and takes up a suitable orientation to them ('approach/orient'), see Fig. 3. They then deduce how to turn on the power at the mains ('deduce sequence') and execute the action ('execute'). They interpret feedback in the environment ('feedback') to see whether or not power has been turned on. Alternatively, after approaching the mains, if s/he had had an intended action to study the mains, rather than turn on power, they would closely inspect and investigate the mains ('inspect'). Finally, s/he evaluates the outcome of this inspection or the action to turn on power, on their goal to study the electricity supply ('evaluate').
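To make the structure of this walkthrough concrete, the task action model can be read as a directed graph over the stages of Fig. 2. The following sketch is illustrative only and is not taken from the paper: the stage names are those used in the text, but the transition set is a simplified reading of the predicted flow, including the transfer to explore navigate mode when an object must be searched for.

```python
# Minimal sketch (illustrative, not from the paper): the task action model
# as a stage-transition graph. Stage names follow Fig. 2; the transitions
# are a simplified reading of the predicted flow.

TASK_ACTION_FLOW = {
    "establish goals": ["intention task action"],
    "intention task action": ["consider objects"],
    "consider objects": ["approach/orient",     # object already in view
                         "explore navigate"],   # object must be searched for
    "approach/orient": ["deduce sequence", "inspect"],
    "deduce sequence": ["execute"],
    "execute": ["feedback"],
    "feedback": ["evaluate"],
    "inspect": ["evaluate"],
    "evaluate": ["establish goals"],            # iterate on the task goal
}

def is_predicted(path):
    """True if every consecutive pair of stages follows the predicted flow."""
    return all(b in TASK_ACTION_FLOW.get(a, []) for a, b in zip(path, path[1:]))

# The 'turn on power' walkthrough from the text:
walkthrough = ["establish goals", "intention task action", "consider objects",
               "approach/orient", "deduce sequence", "execute", "feedback",
               "evaluate"]
assert is_predicted(walkthrough)
```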

Fig. 2. Task action model, showing stages and flow of interaction.

Walking through explore navigate mode, the user forms an intention to explore the environment (stage 'explore' in Fig. 4), such as a virtual building. S/he scans the observable environment ('scan') and decides to move forward through the building ('plan'). They navigate forward ('navigate') and re-scan the environment. If, for instance, they see a cupboard which arouses interest, see Fig. 5, they decide to investigate the cupboard ('intention explore action') and this action is now carried out in task action mode. Alternatively, s/he may be searching for targets, such as the mains boxes. When they scan and find the mains boxes, they may return to task action mode to try and switch on power.

Fig. 3. The user's view when approaching the mains object.

Fig. 4. Explore navigate model, showing stages and flow of interaction.

Fig. 5. The user scans and sees a cupboard object which arouses interest.

Fig. 6. System initiative model, showing stages and flow of interaction.

System initiative behaviour may either be events or interaction control. In the case of events, the user perceives and interprets an event (stage 'event' in Fig. 6), such as a ringing telephone, see Fig. 7. S/he plans how to respond to it ('plan'). They may immediately decide to answer the telephone ('intention reactive action'). Alternatively, they may investigate how to use the telephone, in exploratory mode, or evaluate what the telephone ringing event means to their ongoing task, in task action mode. In the case of interaction control, the user acknowledges the beginning of system control ('acknowledge control'), such as an automated tour of the building. S/he watches the tour ('monitor') and acknowledges when the tour has ended ('end control'). They then plan how to respond to this system behaviour ('plan' again). Whilst watching the tour, s/he may decide they have seen enough and would like to quit the tour ('intention control action').

The theory applies to VEs that are single-user, modelled on real world phenomena and, at the level of interaction description involved, either desk-top or immersive. The models aim to capture the basic and typical flow of interaction, and it should be recognised that behaviour will often deviate from such simple patterns. Some stages may be skipped, for example in task action mode the 'deduce sequence' stage may not be needed by the skilled user who has learned the required sequence. There may be repetitions of stages or parts of models, for example some tasks may involve several object searches or manipulations. Backtracking to previous stages may also occur as remedial activity when the user encounters errors. Finally, the ordering of some stages may differ, for example 'deduce sequence' may be carried out before, instead of after, an object 'approach/orient'.

For the theory to be useful in developing guidance, it must be representative of actual interaction behaviour. The following hypotheses were set to test this.

To test the existence and cohesiveness of the three modes of behaviour—in an interaction session:

Hypothesis 1a: there will be significantly more stage-to-stage transitions within mode boundaries, than across different modes;

Hypothesis 1b: observed sequences of up to five stages long will fall within the model–mode boundaries.


Fig. 7. The user perceives the telephone ringing event.

To test that the stages of interaction together describe important behaviour—in an interaction session:

Hypothesis 2: theory stages will occur significantly more times than any other stages of interaction.

To test that the interaction models represent the generalised pattern of interaction flow—in an interaction session:

Hypothesis 3: observed sequences of stage transitions will conform to patterns predicted in the models. More specifically, observed stage transitions will be either exactly as predicted, or jumps forward or backtracks of one stage in the interaction models.
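As an illustration of how hypothesis 3 can be operationalised, an observed transition can be classified by its offset within a model's stage ordering. The sketch below is not the paper's analysis code; it linearises the main path of the task action model (leaving aside the 'inspect' branch) purely for illustration.

```python
# Illustrative sketch: classify an observed stage transition against the
# linearised main path of the task action model. An offset of +1 matches
# the predicted flow, larger offsets are jumps that skip stages, and
# negative offsets are backtracks.
TA_ORDER = ["establish goals", "intention task action", "consider objects",
            "approach/orient", "deduce sequence", "execute", "feedback",
            "evaluate"]

def classify(src, dst, order=TA_ORDER):
    offset = order.index(dst) - order.index(src)
    if offset == 1:
        return "as predicted"
    if offset > 1:
        return f"jump forward skipping {offset - 1} stage(s)"
    if offset == 0:
        return "repetition"
    return f"backtrack of {-offset} stage(s)"

print(classify("feedback", "execute"))             # backtrack of 1 stage(s)
print(classify("intention task action",
               "approach/orient"))                 # jump forward skipping 1 stage(s)
```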

3. Method

3.1. Experiment

Empirical studies were carried out to gather data on user behaviour when interacting with VEs. Pre-study questionnaires were used to select ten participants with a range of experience in direct manipulation interfaces, video games, virtual reality systems, and property evaluation (the experiment task). The participants (seven males and three females) were staff and students at the School of Informatics, City University, and were paid £10 for participating in the studies.

The application was a business park simulation, developed by VR Solutions, and was being used by The Rural Wales Development Board for marketing of business units to potential leaseholders. It was a desk-top application consisting of two worlds—an external view of the park and an inside view of a unit in the park. The unit could be viewed as either an empty, factory or office complex. Hot-keys, 'SHIFT-H' and 'SHIFT-G', were used to move to the external world and to different views of the inside world. Information about features in the unit was available by mouse clicking on related objects, such as windows and lighting. Figs. 8–12 show the external world, the different inside views of the unit and an example information box.

Fig. 8. The external world showing the outside view of the business park. The unit represented internally is shown in the left of the picture.

Changes were made to the application to ensure it allowed for the range of behaviours to be evaluated in the theory. The application lacked any aspect of system control, therefore an automatic guided tour was introduced to show the user around the external world. There was only one system event, therefore two more were added—a speech bubble appearing from a man (upon the user approaching the man), and a telephone ringing. The application was run on a PC with a 21 inch monitor, a joystick was used for navigation and a standard 2D mouse for interacting with objects.

The task scenario told participants they were salespeople who were to gather information about the architecture and basic services of a site, represented in a VE, so that it could be described to potential leaseholders. Participants were told that, following the experiment, they would be questioned on the site. Specific tasks tested all aspects of the theory, such as exploration, target searches, actions and object investigation. Participants were given 10 minutes to explore and familiarise themselves with the VE. There were then eight set tasks, with no time limits. Two of the tasks involved finding and investigating objects, such as the windows. For three tasks, participants were asked to carry out specific actions, such as opening the loading bay door. For the other three tasks, they carried out general analysis and problem solving, such as comparing the three toilets in the building (disabled, men's and women's).

Fig. 9. The inside world showing one of the units in the park, as an empty complex.

Fig. 10. The inside world showing one of the units in the park, as a factory complex.

A pilot study was first undertaken to refine the experimental procedure. In the experiment, participants were asked to provide a concurrent, 'think-aloud' verbal protocol [5]. They were first given a training task to practice 'thinking aloud'. Participants began their interaction session by first completing the exploration phase. They then carried out all the eight set tasks. Their interaction sessions were video recorded. When participants had completed the tasks to their satisfaction, they completed a paper test on the site and took part in a de-brief session with the experimenter.

Fig. 11. The inside world showing one of the units in the park, as an office complex.

Fig. 12. The inside world representing an empty complex. From clicking on one of the windows, an information box is displayed describing the windows in the unit.

3.2. Data analysis

A random selection of the tasks for each participant was analysed. This was because of time constraints for carrying out the data analysis, which meant that it was impractical to analyse participants' full interaction sessions. The exploration phase was analysed along with one object investigation task, one action task and one analysis task. The number of times each of the different set tasks was analysed was balanced overall.

Verbal protocols from selected tasks were transcribed, noting concurrent physical and system behaviour. Speech segments were matched to mental behaviours using general verbalisation categories, independent of the theory. The major categories were:

observations—comments describing the observed environment or assertions giving reasoning and interpretations about the observed environment (e.g. from participant A—"ok windows straight ahead", "erm no window handles");

observations–changes—comments describing observed changes in the environment resulting from system feedback, after an executed action, or system behaviour (e.g. from participant L—"here I'm I seem to be being taken through it", "I err yes its a drive through");

objectives—expression of goals or planning of the experiment task (e.g. from participant I—"let's find the drawing board", "oh I'll have to give up on that task I can't do it");

intentions—verbalisation of intent to do some specified action (e.g. from participant B—"err open the door", "go in");

problem reports—comments describing problems faced, such as being unable to understand the current situation, or making slips (e.g. from participant K—"oops", "bash into the wall a bit");

problem solving—verbalisation of mental activity in considering a task or interaction problem, such as asking questions, developing and testing hypotheses (e.g. from participant R—"maybe the switches are in here it's a bit unlikely", "I don't think you have power switches in the boardroom");

reading documentation—verbalisation arising from participant reading aloud and taking information from the provided documentation, such as task descriptions (e.g. from participant M—"OK task three analysis and problem solving", "right so main building find out what areas of the building have a special factory floor covering").

For physical behaviour, four categories were defined to cover the range of relevant physical operations:

movement—navigating around and thereby changing the current position in the environment;

adjusting view angle—altering the angle of view whilst maintaining a stationary position, such as tilting the view angle down or heightening it;

interacting with objects—mouse-clicking on objects to interact with them;

executing commands—executing commands not directly related to objects in the environment, such as pressing 'SHIFT-H' to change worlds or 'F12' to reset the world.

The categorisation of verbal protocols was validated through cross-marking by two independent observers. Each observer allocated a verbalisation category to each utterance in a 10 minute transcript from one of the interaction sessions. An inter-observer agreement of 83% of the utterances was reached. Resulting differences in categorisation were discussed and, where necessary, changes were agreed by both observers.

Rules, represented in decision trees, were defined for matching the verbalisation and physical behaviour categories to interaction stages in the theory. The tree for the 'observations–changes' category is shown in Fig. 13. There are a possible five theory stages, from the task action (TA) and system initiative (SI) modes, that can be matched to this category, according to the type of change commented on. The decision trees were systematically applied to the data by selecting the appropriate tree, for a category, and then checking the required conditions, from top to bottom, until either a theory stage was reached or no set of conditions were met, in which case an additional behaviour was noted. Later, all additional behaviours noted were refined into a set of additional stages of interaction, by grouping together similar behaviours.
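The shape of this matching procedure can be sketched in code. The conditions below are hypothetical stand-ins, since the actual conditions are those drawn in Fig. 13; only the top-to-bottom, first-match-wins structure is the point being illustrated.

```python
# Sketch of applying a decision tree for the 'observations-changes'
# category. The conditions are hypothetical illustrations; the real tree
# (Fig. 13) distinguishes five theory stages from the task action (TA)
# and system initiative (SI) modes by the type of change commented on.

def match_observations_changes(context):
    """Return a theory stage for an 'observations-changes' utterance,
    or None so that an additional (non-theory) behaviour is noted."""
    if context.get("follows_own_action"):   # change caused by the user's action
        return "feedback (TA)"
    if context.get("system_controlled"):    # e.g. watching the guided tour
        return "monitor (SI)"
    if context.get("discrete_event"):       # e.g. the telephone ringing
        return "event (SI)"
    return None                             # no condition met: additional stage

print(match_observations_changes({"discrete_event": True}))  # -> event (SI)
```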

The matching of interaction stages was also validated through cross-marking by two independent observers. Each observer allocated a stage of interaction to each verbalisation instance and physical behaviour in a 10 minute transcript from one of the interaction sessions. An inter-observer agreement of 81% was reached. Resulting differences between matched stages were discussed and, where necessary, changes were agreed by both observers.

Data on observed stages of interaction were entered into a database which was queried to test hypotheses. The next section details the results.

4. Results

All ten participants attempted all tasks. The time spent interacting with the VE ranged from 28 to 73 minutes (mean 47). Participants' interaction sessions selected for analysis ranged from 16 to 35 minutes (mean 23).


Fig. 13. Decision tree to match verbalisations for the category 'observations–changes' to a possible five theory stages, from the task action (TA) and system initiative (SI) modes.

4.1. Modes of interaction

Table 1 shows for each mode the number and percentage of times it was followed by stages within each of the modes. For task action and explore navigate modes, 79% of stage-to-stage transitions stayed within the mode boundaries and 21% crossed to another mode. System initiative mode had only 51% of transitions being within-mode, with 22% transfers to task action mode and 27% transfers to explore navigate mode. Chi-squared tests of the number of within-mode transitions were found to be significant at p < 0.01, for task action and explore navigate modes. Therefore, hypothesis 1a is supported for only two modes, for which there were significantly more transitions within the mode boundaries than could have been expected by chance.
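The within-mode analysis can be reconstructed from the Table 1 counts. The sketch below is not the paper's analysis script: in particular, it assumes a uniform one-third chance expectation over the three destination modes, which the paper does not spell out, so the expected values here are one plausible reconstruction.

```python
# Sketch of the hypothesis 1a test using the counts from Table 1. The
# chance expectation is assumed uniform over the three destination modes.
from scipy.stats import chisquare

transitions = {  # from-mode -> counts of next-stage mode [TA, EN, SI]
    "task action":       [1594, 388, 31],
    "explore navigate":  [352, 1461, 41],
    "system initiative": [37, 44, 85],
}

modes = list(transitions)
for i, mode in enumerate(modes):
    counts = transitions[mode]
    total = sum(counts)
    expected = [total / 3] * 3          # assumed chance expectation
    stat, p = chisquare(counts, expected)
    print(f"{mode}: {100 * counts[i] / total:.0f}% within-mode, "
          f"chi2 = {stat:.1f}, p = {p:.3g}")
```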

Table 2 shows the number of within-mode stage transition sequences by length of the stage chains. The data are reported as a survivorship function, so of the 1594 transitions that were observed between stages in task action mode (in Table 1), 1162 progressed to three stage chains, 891 progressed to four stages and 712 of the originating sequences had five stages. Other sequences either terminated or crossed a mode boundary, so the percentages express the number of sequences that remained within the mode compared with the total observed at that length, hence 50% of five stage chains remained within the task action mode. Expected values for the stage transition sequences were calculated from the total observed frequencies (at length = three, four, five) multiplied by the probability for the stage combinations, at lengths three, four and five (p = 0.25, 0.125, 0.06). The observed values for all lengths, for all of the modes, were significant at p ≤ 0.01 (χ²). Therefore, hypothesis 1b is supported for all modes since the data showed a pattern of staying within mode boundaries for up to five consecutive stages. Indeed, the pattern continued and observed values for length = six stages were also significant at p < 0.01, for all modes.
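The expected-value calculation described above can be made concrete for task action mode. This is an illustrative reconstruction, not the original analysis: totals at each chain length are recovered from the Table 2 counts and percentages, and the probabilities are those quoted in the text.

```python
# Sketch of the hypothesis 1b expected values for task action mode.
observed_within = {3: 1162, 4: 891, 5: 712}     # Table 2, task action
share_within    = {3: 0.65, 4: 0.56, 5: 0.50}   # Table 2 percentages
p_chance        = {3: 0.25, 4: 0.125, 5: 0.06}  # probabilities quoted in text

for length in (3, 4, 5):
    total = observed_within[length] / share_within[length]  # all sequences
    expected = total * p_chance[length]
    print(f"length {length}: observed {observed_within[length]}, "
          f"expected about {expected:.0f} by chance")
```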


Table 1
Number and percentage of stage-to-stage transitions within mode boundaries and across different modes (data only include transitions between the theory stages)

From                 To: Task action    To: Explore navigate    To: System initiative    Total
Task action          1594 (79%)         388 (19%)               31 (2%)                  2013
Explore navigate     352 (19%)          1461 (79%)              41 (2%)                  1854
System initiative    37 (22%)           44 (27%)                85 (51%)                 166

Table 2
Number and percentage of stage transition sequences that remained within the boundaries of each mode, by length of stage chain (data only include transitions between the theory stages)

Length of stage chain    Task action    Explore navigate    System initiative
3                        1162 (65%)     1055 (65%)          56 (34%)
4                        891 (56%)      811 (55%)           40 (24%)
5                        712 (50%)      677 (47%)           34 (20%)
6                        580 (43%)      606 (40%)           31 (19%)


4.2. Stages of interaction

Table 3 shows the number of times each theory stage occurred. Eighty-six percent of observed stages could be attributed to predicted stages in the theory. The remainder were 24 identified additional stages of interaction, listed in Appendix A. A χ² test of the total occurrence of theory versus additional stages was found to be significant at p < 0.01. Therefore, hypothesis 2 is supported since the theory stages were found to account for significantly more observed stages of interaction than could have been expected by chance. Indeed, theory stages accounted for the majority (80–90%) of observed stages for every participant.

The predominant stages, which accounted for at least 5% of the total, were six stages in task action and explore navigate modes, see Table 4. Together these stages accounted for 71% of the total number of observed stages. Uncommon stages, which accounted for less than 1% of the total, included seven theory stages—'consider objects' and 'intention explore action' from task action and explore navigate modes, and five stages from system initiative mode.


Table 3
Occurrence totals for each of the theory stages. The percentage of observed stages falling into each stage is also given

Theory stages                Total occurrence    % of all stages    No. of participants
execute                      1001                18.36              all
feedback                     446                 8.18               all
approach/orient              345                 6.33               all
intention task action        144                 2.64               all
establish goals              135                 2.48               all
evaluate                     115                 2.11               all
inspect                      68                  1.25               all
deduce sequence              56                  1.03               9
consider objects             20                  0.37               6
Total task action            2330                42.73

navigate                     1433                26.26              all
plan                         340                 6.24               all
scan                         302                 5.54               all
explore                      97                  1.78               all
intention explore action     43                  0.79               8
Total explore navigate       2215                40.62

monitor                      67                  1.23               all
intention control action     54                  0.99               9
event                        30                  0.55               6
acknowledge control          15                  0.28               8
end control                  4                   0.07               4
plan                         2                   0.04               1
intention reactive action    1                   0.02               1
Total system initiative      173                 3.17

Total theory stages          4718                86.52
Overall total                5453


Five of the additional stages did account for at least 1% of the total. Indeed, the seventh highest occurring stage was the additional stage 'interpret navigation feedback' (4% of total). The others were: forming an intention to execute a command (such as switching worlds); scanning and inspecting an area of the environment; forming an intention to approach an object; and considering where an object might be located (see Appendix A).

4.3. Flow of interaction

Figs. 14–16 show the original models with the observed, common stage transitions for each mode. Common stage transitions were taken to be those where at least 10% of transitions from the first stage led to the second stage. This resulted in a manageable range of one to four stages that commonly followed any one stage. There was very little transition from theory stages to the additional stages, so the latter are not included.
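The 10% rule for selecting common transitions is straightforward to state in code. The sketch below is illustrative, with invented counts; it simply keeps a transition when at least 10% of the transitions leaving the first stage lead to the second.

```python
# Sketch of the 10% threshold for 'common' stage transitions.
from collections import Counter

counts = Counter({("scan", "navigate"): 40,   # invented counts for illustration
                  ("scan", "plan"): 3,
                  ("plan", "navigate"): 55})

def common_transitions(counts, threshold=0.10):
    out_totals = Counter()
    for (src, _), n in counts.items():
        out_totals[src] += n
    return [(src, dst) for (src, dst), n in counts.items()
            if n / out_totals[src] >= threshold]

print(common_transitions(counts))  # ('scan', 'plan') is dropped: 3/43 < 10%
```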

Comparing Fig. 14(a) and Fig. 14(b) shows that nine of the 13 within-mode transitions were either as predicted or were jumps forward or backtracks of one stage in task action model. For example, the observed transition from 'establish goals' to 'intention task action' was predicted, there was a jump from 'intention task action' to 'approach/orient' skipping the 'consider objects' stage, and a backtrack from 'feedback' back to 'execute'. Therefore, hypothesis 3 is only partially supported for task action mode, since four observed transitions did not follow the general flow of interaction in this model. Three of these four transitions were jumps or backtracks of more than one stage. The other transition, from 'inspect' to 'execute', crossed paths in the predicted model. Other differences were mainly with the less prevalent stages which did not have any common incoming transitions. For example, there was no iteration back from the 'evaluate' stage to 'establish goals'. Transfers to explore navigate mode occurred from five other stages, as well as the predicted 'consider objects' stage. Finally, the three predominant stages in task action mode were linked together, as highlighted in Fig. 14(b).

Fig. 14. Interaction flow diagrams for task action mode. Left: the predicted model. Right: the observed common transitions for stages in this mode.

Fig. 15. Interaction flow diagrams for explore navigate mode. Left: the predicted model. Right: the observed common transitions for stages in this mode.

Table 4
The six predominant stages

Mode                Predominant stage    % of all stages
Explore navigate    navigate             26.3
Task action         execute              18.4
Task action         feedback             8.2
Task action         approach/orient      6.3
Explore navigate    plan                 6.2
Explore navigate    scan                 5.5

Comparing Fig. 15(a) and Fig. 15(b) shows that four of the five within-mode transitions were either as predicted or were jumps forward of one stage in explore navigate model. For example, the observed transition from 'plan' to 'navigate' was predicted and there was a jump from 'scan' to 'navigate'. Therefore, hypothesis 3 is somewhat supported for explore navigate mode, since only one observed transition did not follow the general flow of interaction in this model. This remaining transition was a jump of two stages from 'explore' to 'navigate'. No common transition was found from 'scan' to 'plan', and the less prevalent 'intention explore action' stage did not have any common incoming transitions. Transfers to task action mode did occur from 'intention explore action' but also from 'explore', and no common transfer was found from the 'plan' stage. The three predominant stages in explore navigate mode were closely coupled, as highlighted in Fig. 15(b).

Fig. 16. Interaction flow diagrams for system initiative mode. Left: the predicted model. Right: the observed common transitions for stages in this mode.

Comparing Fig. 16(a) and Fig. 16(b) shows that all four within-mode transitions were either as predicted or were jumps forward or backtracks of one stage in system initiative model. For example, the observed transition from 'acknowledge control' to 'monitor' was predicted and there was a jump from 'acknowledge control' to 'intention control action'. Therefore, hypothesis 3 is supported for system initiative mode. As before, the less prevalent stages had no common incoming transitions. Transfers to task action and explore navigate modes occurred from 'intention control action' and 'intention reactive action' stages, but the 'plan' stage was skipped and instead transfers made directly from the 'event' and 'end control' stages.

In the next section, the implications of all these results are discussed.

5. Discussion

The study findings have important implications for the proposed theory and provide insights into the behaviour of users interacting with VEs.

5.1. Implications for theory of interaction

Results provide general support for the proposed theory and indicate required refinements. Support was found for the three modes of interaction. However, very little of the interaction was accounted for by stages in system initiative mode, and this mode was found to be less cohesive than the others. A possible reason is that the test environment provided limited system behaviour. Therefore, system initiative mode may be more application dependent, and may be unimportant for applications involving very little system behaviour. For other applications, such as guided training environments or those involving virtual agents, activity in system initiative mode may potentially be an important part of interaction.

Six frequent and closely linked stages of interaction may be considered as the predominant or core model. These included all the physical behaviours 'navigate', 'execute' and 'approach/orient'. In the data analysis, physical behaviours were invariably detected from the videos, but not all participants' thoughts were verbalised because of errors of omission [24]. Therefore, the realistic percentage of physical, over mental, stages may be lower, but the very high occurrence totals for 'navigate' and 'execute' suggest that these stages are very common, and the 'approach/orient' stage reasonably common. The 'navigate' stage appeared to be almost a default behaviour which participants would return to, for example after carrying out actions.

Some theory stages were uncommon and may indicate areas where the models are weak or need to be simplified. Three stages within system initiative mode were especially uncommon, occurring less than five times each. This seems to be due to participants preferring only certain responses to the limited system behaviours. For example, the 'end control' stage was uncommon because participants often exited before the guided tour had completed, and participants often did not answer the ringing telephone, for the 'intention reactive action' stage to occur. Therefore, some types of response to system behaviour may be less important, but further investigation, using a wider range of system behaviours, seems necessary. In explore navigate mode, the 'intention explore action' stage was expected to occur more often since a large part of this model is the predicted exploratory and opportunistic actions. Participants may have had limited opportunity for exploratory actions because, during the exploration phase, they were busy learning the navigation technique and familiarising themselves with the environment. Furthermore, exploratory actions were recorded, and it may be that participants did not always verbalise specific intentions to carry them out, because there was little conscious planning and they were more inclined to immediately try out the action. Therefore, although the 'intention explore action' stage was uncommon, it seems exploratory actions should have a place in the model, but further evaluation is needed to assess their importance. Finally, in task action mode the 'consider objects' stage was found to be a less important behaviour because, for most intended actions, little consideration of objects involved was required. Therefore, the model can be simplified by incorporating this behaviour into 'intention task action', to create a stage where an intention to carry out a task action is formed and a possible consideration made of objects required.

Some common additional stages of interaction were found that perhaps should be incorporated in the theory. The most common was 'interpret navigation feedback' which occurred when participants were learning the navigation technique or had encountered problems in navigation. This behaviour may have been particularly common with the test application because, often during the guided tour, participants tried to navigate for themselves, but found they were not moving in the intended direction. Therefore, instead of assigning a new stage, the existing 'navigate' stage can be extended to include a possible interpretation of navigation feedback. The next most common additional stage was 'intention to execute command' which was similar to the less common 'intention to open door for navigation' (see Appendix A). These stages involved intentions to carry out actions for moving through the environment, for example, transporting to other worlds or set positions, or opening doors to move into rooms. Although the test application provided commands for moving between worlds/views which may not be typical, actions for moving to set positions and opening doors are more common in VEs. Therefore, a required refinement of the theory seems to be a new stage for actions to move through the environment. The other common additional stages indicated three further extensions to existing stages.

Partial support was found for the predicted flow of interaction in the models. The general organisation of stages in the models was found to hold reasonably well, but there were jumps forward, omitting stages, and backtracking to previous stages. Predicted flows to the more uncommon stages were not found. There is not enough data on interaction flow with the uncommon stages for satisfactory evaluation and further assessment of these parts of the models is needed. Stage omissions were expected for automated actions in skilled behaviour, and backtracking when re-trying stages after error or repeating stages for multi-operation tasks. However, the models can be refined to be more representative by including the major jumps and backtracks. As an example, Fig. 17 shows an amended task action model after some preliminary refinements. The 'consider objects' stage has been merged with 'intention task action' and this removes some of the jumps to skip this stage. Major unpredicted flows have been added, such as a backtrack from 'feedback' to 'execute' for re-trying actions after unsatisfactory feedback, a cross from 'inspect' to 'execute' to link object manipulations to prior inspections, and a transfer from 'evaluate' to explore navigate mode for returning to navigation after evaluating an action.

Fig. 17. Task action model after preliminary refinements.

5.2. Wider implications

The study has shown that protocol analysis techniques can be used to evaluate and refine theoretically-based process models. The refined models, in the theory, provide an important initial understanding of the behaviour of users interacting with VEs. The models are clearly different from Norman's initial model, the main difference being the addition of exploratory and reactive behaviours to the pre-planning model. Although reactive behaviour has received less attention, the importance of exploratory behaviour and, more generally, display-based and situated behaviour has been widely recognised. Suchman [27] argued for the importance of situated action, where the environment is made up of a succession of situations that the user encounters and responds to, rather than being controlled by pre-formed plans. Payne [20] argued for the importance of display-based action, where the interface acts as a resource for action, rather than playing the limited role of feedback. Rieman et al. [23] argued that exploratory behaviour is a frequent, effective and often preferred method for a user to learn about an interactive system. They propose a model of exploratory behaviour, IDXL, which has some similarities to the explore navigate model. IDXL describes interactive, unplanned and weakly goal-driven behaviour. The default action is to scan the interface for useful features, using a label-following strategy, then either try out actions or continue scanning. The scanning need not be driven by prior task goals but can be exploratory. Labels may play a much less prominent role in VE interfaces; instead, graphical objects in the immediate vicinity are perceived, identified and may be acted upon. The LICAI model of interaction [15] also points to the importance of display-based cognition in the label-following strategy of interaction. LICAI makes strong, and experimentally verified, predictions about the user's behaviour and usability problems in choosing commands in menu-driven interfaces. Our work complements more precise theories such as LICAI and IDXL by providing a wider view of interaction. Whereas more detailed theories can predict behaviour, and detailed features of an interface design to ensure usability, they sacrifice breadth for depth. For instance, LICAI can advise on command names being compatible with the user task, on visibility of commands and ordering of command labels, but it has little advice on design of icons for commands or objects, or interactive metaphors. We see the process model approach contributing wider-ranging advice, to suggest the design features that a user interface should provide to assure usability at each stage in the cycle of interaction.

Furthermore, the study indicates that various modes of interaction behaviour (such as task driven, exploratory and reactive) are important, and may co-exist in any interaction session. Therefore, interaction models perhaps need to incorporate the different ways in which a user's interactions may be driven. Fields et al. [6] suggest that relying on a single cognitive perspective for interaction modelling provides too narrow a basis for interface design, and that a range of interaction strategies, such as plan following, semantic matching and learning by exploration, need to be considered.

Norman's theory of action was also elaborated by Springett [26] to describe interactive behaviour for direct manipulation interfaces. Springett's model describes interaction at different skill levels using Rasmussen's [21] three levels of user action, but with no explicit modelling of exploratory and reactive behaviour. Three of the six predominant VE stages do not have an equivalent in Springett's model (navigation in 3D space, planning further navigations and approaching and orienting to action objects), indicating that such behaviours may be more specific to VEs, and that virtual reality extends the boundaries of interactive behaviour beyond GUIs.

Insight into the similarities and differences between the nature of VE interaction and interaction with more conventional interfaces is useful when considering usability principles, guidelines and methods for VE design. Guidelines borrowed from standard user interface design, such as ISO 9241 [11], are likely to prove inadequate because they do not address the range of interactive behaviour afforded by VEs. Knowledge of user interaction with VEs, provided by the study, can inform the development of design guidance and evaluation methods specifically for VEs. Indeed, any such methods should address the problem of supporting the identified stages and modes of interaction, and pay particular attention to the predominant stages that were identified in the study.

6. Further work

Our other work involves further assessment of the models, and use of the theory to devise design and evaluation guidance for VEs. More data on interaction behaviour will be analysed to improve validation of certain areas of the theory, such as stages in system initiative mode. Individual differences in interaction behaviour and differences in behaviour by task and environment type will also be analysed.

Required properties of a VE design have been predicted to support the stages of interaction defined in the models (see [12]). Current work involves testing whether the required design properties lead to improved interaction, by usability testing an application before and after implementation of the properties. The set of design properties will then be developed into concrete design guidelines and checklist questionnaires to support design and evaluation of VEs. Our aim is to address problems of VE interaction design using interaction models as a theoretical basis for guidance. This study represents an important step towards this aim by evaluating and refining the interaction models, so they are suitable for use in informing usability requirements when interacting with VEs.

Acknowledgements

We wish to thank VR Solutions and The Rural Wales Development Board for loan of the application used in the study. In particular, we thank Phillip Trotter, Bob Stone and Andrew Connell. Kulwinder Kaur has been supported by an EPSRC post-graduate studentship, number 94314576.


Appendix A. Twenty-four additional stages identified

The occurrence total and percentage for each additional stage

Additional stage                                                           Total occurrence    % of all stages    No. of participants
interpret navigation feedback                                              216                 3.96               all
intention to execute command                                               104                 1.91               9
scan and inspect an area                                                   82                  1.50               all
intention to approach target                                               78                  1.43               all
consider location of target object                                         61                  1.12               9
intention to open door for navigation                                      35                  0.64               7
evaluate exploration carried out                                           33                  0.61               9
deduce the sequence required for navigation                                30                  0.55               9
interpret feedback after an approach                                       17                  0.31               7
deduce the interaction sequence after carrying out an action               12                  0.22               5
scan and check the view angle or orientation of self                       10                  0.18               6
consider attributes of target object                                       8                   0.15               5
evaluate completed tasks                                                   7                   0.13               2
plan for future tasks                                                      6                   0.11               4
consider content of environment                                            6                   0.11               4
predict what a planned navigation will bring into view                     5                   0.09               4
plan how to take control back from the system                              5                   0.09               2
decide to give up on a task                                                4                   0.07               3
intention to opportunistically carry out an action for a different task    4                   0.07               3
evaluate the state prior to the action execution                           3                   0.06               2
perceive the end of an event with a long duration                          3                   0.06               1
evaluate navigation method                                                 3                   0.06               3
predict what will be the outcome of an exploratory action                  2                   0.04               2
locate the current position in the world                                   1                   0.02               1

Total                                                                      735                 13.48

References

[1] P. Barnard, Bridging between basic theories and artefacts of human–computer interaction, in: J.M. Carroll (Ed.), Designing Interaction: Psychology at the Human–Computer Interface, Cambridge University Press, Cambridge, 1991.

[2] M. Bolas, Designing virtual environments, in: C.G. Loeffler, T. Anderson (Eds.), The Virtual Reality Casebook, Van Nostrand Reinhold, New York, 1994.

[3] S. Bryson, Approaches to the successful design and implementation of VR applications, in: R.A. Earnshaw, J.A. Vince, H. Jones (Eds.), Virtual Reality Applications, Academic Press, London, 1995.

[4] R.P. Darken, J.L. Sibert, Wayfinding strategies and behaviours in large virtual worlds, in: Proceedings of CHI '96, Vancouver, Association for Computing Machinery, New York, 1996, pp. 142–149.

[5] K.A. Ericsson, H.A. Simon, Protocol Analysis, MIT Press, Cambridge, MA, 1984.

[6] B. Fields, P. Wright, M. Harrison, Objectives, strategies and resources as design drivers, in: Proceedings of INTERACT 97: 6th IFIP Conference on Human–Computer Interaction, Sydney, International Federation for Information Processing, Chapman & Hall, London, 1997.

[7] J.J. Gibson, The Ecological Approach to Visual Perception, Lawrence Erlbaum Associates, Hillsdale, NJ, 1986.

[8] B. Hayes-Roth, F. Hayes-Roth, A cognitive model of planning, Cognitive Science 3 (1979) 275–310.

[9] K.P. Herndon, A. van Dam, M. Gleicher, The challenges of 3D interaction: a CHI '94 workshop, SIGCHI Bulletin 26 (4) (1994) 36–43.

[10] K. Hook, N. Dahlback, Can cognitive science contribute to the design of VR applications? Paper presented at the 5th MultiG Workshop, Stockholm, December 1992, ftp://ftp.kth.se/pub/MultiG/conferences/MultiG5.

[11] ISO, Draft International Standard 9241, Part 16: Ergonomic requirements for office work with visual display terminals—direct manipulation dialogues, International Standards Organisation, 1996.

[12] K. Kaur, Designing virtual environments for usability, Doctoral Consortium paper, in: Proceedings of INTERACT 97: 6th IFIP Conference on Human–Computer Interaction, Sydney, International Federation for Information Processing, Chapman & Hall, London, 1997.

[13] K. Kaur, N. Maiden, A. Sutcliffe, Design practice and usability problems with virtual environments, in: Proceedings of the Virtual Reality World '96 Conference, Stuttgart, IDG Conferences, Munich, Germany, 1996.

[14] D. Kieras, P.G. Polson, An approach to the formal analysis of user complexity, International Journal of Man–Machine Studies 22 (1985) 365–394.

[15] M. Kitajima, P.G. Polson, A comprehension-based model of exploration, in: Proceedings of CHI '96, Vancouver, Association for Computing Machinery, New York, 1996, pp. 324–331.

[16] C.E. Loeffler, T. Anderson, The Virtual Reality Casebook, Van Nostrand Reinhold, New York, 1994.

[17] L.D. Miller, A usability evaluation of the Rolls-Royce virtual reality for aero engine maintenance system, Masters Thesis, University College London, 1994.

[18] M. Mohageg, R. Myers, C. Marrin, J. Kent, D. Mott, P. Isaacs, A user interface for accessing 3D content on the world wide web, in: Proceedings of CHI '96, Vancouver, Association for Computing Machinery, New York, 1996, pp. 466–472.

[19] D.A. Norman, The Psychology of Everyday Things, Basic Books, New York, 1988.

[20] S.J. Payne, Display-based action at the user interface, International Journal of Man–Machine Studies 35 (1991) 275–289.

[21] J. Rasmussen, Skills, rules, and knowledge; signals, signs, and symbols, and other distinctions in human performance models, IEEE Transactions on Systems, Man and Cybernetics 13 (3) (1983) 257–266.

[22] P. Reisner, Discussion: HCI, what is it and what research is needed?, in: J.M. Carroll (Ed.), Interfacing Thought: Cognitive Aspects of Human–Computer Interaction, MIT Press, Cambridge, MA, 1987.

[23] J. Rieman, R.M. Young, A. Howes, A dual-space model for iteratively deepening exploratory learning, International Journal of Human–Computer Studies 44 (1996) 743–775.

[24] J.E. Russo, E.J. Johnson, D.L. Stephens, The validity of verbal protocols, Memory and Cognition 17 (6) (1989) 759–769.

[25] B. Shneiderman, The future of interactive systems and the emergence of direct manipulation, Behaviour and Information Technology 1 (1982) 237–256.

[26] M.V. Springett, User Modelling for Direct Manipulation Evaluation, Ph.D. Thesis, Centre for HCI Design, City University, London, 1996.

[27] L.A. Suchman, Plans and Situated Actions: The Problem of Human–Machine Communication, Cambridge University Press, Cambridge, 1987.

[28] A. Sutcliffe, M. Ryan, M. Springett, A. Doubleday, Model mismatch analysis: towards a deeper explanation of users' usability problems, Report HCID/95/10, Centre for HCI Design, School of Informatics, City University, London, 1995.

[29] J. Wann, M. Mon-Williams, What does virtual reality NEED?: human factors issues in the design of three-dimensional computer environments, International Journal of Human–Computer Studies 44 (1996) 829–847.
