This article was downloaded by: [Massachusetts Institute of Technology] On: 12 October 2012, At: 06:36 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK The International Journal of Aviation Psychology Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hiap20 Measurement and Modeling of Display Clutter in Advanced Flight Deck Technologies Amy L. Alexander a , David B. Kaber b , Sang-Hwan Kim c , Emily M. Stelzer d , Karl Kaufmann b & Lawrence J. Prinzel III e a Aptima, Inc., Woburn, Massachusetts b Department of Industrial and Systems Engineering, North Carolina State University, Raleigh, North Carolina c Department of Industrial and Manufacturing Systems Engineering, University of Michigan, Dearborn, Michigan d The MITRE Corporation, McLean, Virginia e NASA Langley Research Center, Hampton, Virginia Version of record first published: 11 Oct 2012. To cite this article: Amy L. Alexander, David B. Kaber, Sang-Hwan Kim, Emily M. Stelzer, Karl Kaufmann & Lawrence J. Prinzel III (2012): Measurement and Modeling of Display Clutter in Advanced Flight Deck Technologies, The International Journal of Aviation Psychology, 22:4, 299-318 To link to this article: http://dx.doi.org/10.1080/10508414.2012.718233 PLEASE SCROLL DOWN FOR ARTICLE Full terms and conditions of use: http://www.tandfonline.com/page/terms- and-conditions




This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden.

The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.


THE INTERNATIONAL JOURNAL OF AVIATION PSYCHOLOGY, 22(4), 299–318
Copyright © 2012 Taylor & Francis Group, LLC
ISSN: 1050-8414 print / 1532-7108 online
DOI: 10.1080/10508414.2012.718233

Measurement and Modeling of Display Clutter in Advanced Flight Deck Technologies

Amy L. Alexander,¹ David B. Kaber,² Sang-Hwan Kim,³ Emily M. Stelzer,⁴ Karl Kaufmann,² and Lawrence J. Prinzel, III⁵

¹Aptima, Inc., Woburn, Massachusetts
²Department of Industrial and Systems Engineering, North Carolina State University, Raleigh, North Carolina
³Department of Industrial and Manufacturing Systems Engineering, University of Michigan, Dearborn, Michigan
⁴The MITRE Corporation, McLean, Virginia
⁵NASA Langley Research Center, Hampton, Virginia

Clutter is a key concern in the design of complex displays, particularly in safety-critical domains such as aviation. The objective of this research was to investigate techniques for measuring subjective perceptions of clutter and to model the predicted impacts of clutter on pilot performance within the context of advanced flight deck technologies. Six commercial pilots flew simulated approaches under varied workload conditions with low-, medium-, and high-clutter head-up displays, rating the perceived clutter and subjective mental workload associated with each display configuration. Results revealed that high-clutter displays produced elevated reports of perceived clutter and workload due to information density or redundancy, whereas low-clutter displays were perceived as less cluttered but challenging to use due to lack of relevant information typically used during flight. A multidimensional measure of clutter was found to be more sensitive to display differences than an overall perceived rating of clutter, and low-level visual display properties were successful in predicting clutter perceptions and pilot performance. Finalized products of this research could support optimized display design through the identification of clutter thresholds and the implementation of clutter alerts, decluttering mechanisms, or both, and could be used to support display certification and acquisitions processes.

Correspondence should be sent to Amy L. Alexander, MIT Lincoln Laboratory, 244 Wood Street, Lexington, MA 02420. E-mail: [email protected]


Clutter is a key concern in the design of complex displays given the potential negative impacts of data overload on performance, particularly in safety-critical domains such as aviation. Display clutter has been found to obscure important information and cause confusion (e.g., Lohrenz, Trafton, Beck, & Gendron, 2009), interfere with searching for an important target (e.g., Rosenholtz, Li, & Nakano, 2007), hinder the interpretation and use of visual information in decision processes (e.g., Aretz, 1988), and disrupt visual attention (e.g., Schons & Wickens, 1993). It is therefore critical to develop a reliable technique for measuring clutter in complex displays, and in turn, determine the extent to which given levels of clutter will degrade operator performance. The ability to reliably measure and model display clutter will support optimized display design through the identification of acceptable information–performance trade-offs that could drive the detection of clutter thresholds and the implementation of clutter alerts and decluttering mechanisms.

To measure or model the amount of clutter in a given display, one must have an understanding of what clutter is, operationally speaking, and what causes it. A review of definitions of clutter was provided by Kaber and colleagues (Kaber et al., 2008), focusing on the content or format of the display, degraded performance associated with use of the display, or information relevance. We attempt to capture all of these issues by defining clutter as an unintended effect of displaying visual imagery that obscures or confuses existing information, is redundant, or is not relevant to the task at hand (Alexander, Stelzer, Kim, & Kaber, 2008). Critically, this definition captures both bottom-up (i.e., data-driven) factors inherent to the physical characteristics of a display and top-down (i.e., knowledge-driven) factors influenced by operator experience and knowledge of the task domain.

Previous efforts have examined a variety of measures and models to quantify the amount of clutter in a display and correlate levels of clutter with degradations in operator performance. Rosenholtz and colleagues (Rosenholtz, Li, Mansfield, & Jin, 2005; Rosenholtz et al., 2007) tested three different measures of clutter in the context of visual search tasks: feature congestion (based on the ease of adding a new item that would reliably draw attention), subband entropy (relating clutter to the organization and encoding efficiency of visual information in a display), and edge density (a calculation of the density of object edge pixels). All three measures were found to increase monotonically with the set size used in the visual search task and also correlated well with search-related performance measures. However, these studies did not examine the effect of top-down information on perceptions of clutter based on user goals and tasks or resultant performance in dynamic, real-world visual tasks.
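Of the three measures, edge density is the simplest to illustrate. The sketch below is a deliberately simplified stand-in and not the procedure used in the cited studies: it assumes a grayscale image stored as a list of rows, marks a pixel as an edge pixel when the brightness step to its right or lower neighbor exceeds a hypothetical threshold, and reports the fraction of edge pixels. The function name and the threshold value are assumptions made for illustration.

```python
def edge_density(gray, threshold=50):
    """Fraction of pixels whose brightness step to the right or lower
    neighbor exceeds `threshold` (an assumed, illustrative edge test)."""
    rows, cols = len(gray), len(gray[0])
    edges = 0
    for r in range(rows):
        for c in range(cols):
            # Forward differences; pixels on the last row or column have no
            # neighbor in that direction, so the step is treated as zero.
            dx = gray[r][c + 1] - gray[r][c] if c + 1 < cols else 0
            dy = gray[r + 1][c] - gray[r][c] if r + 1 < rows else 0
            if max(abs(dx), abs(dy)) > threshold:
                edges += 1
    return edges / (rows * cols)


# A uniform image has no edges; a vertical step produces one edge per row.
flat = [[10, 10, 10, 10], [10, 10, 10, 10]]
step = [[0, 0, 255, 255], [0, 0, 255, 255]]
print(edge_density(flat), edge_density(step))
```

A measure of this kind is purely bottom-up: it responds to pixel structure and is blind to whether the displayed content is relevant to the task.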

Lohrenz and colleagues initially examined the use of feature clustering (Lohrenz, Layne, Edwards, Gendron, & Bradley, 2006) and later pursued a color-clustering model (Beck, Lohrenz, Trafton, & Gendron, 2008; Lohrenz et al., 2009) as potential techniques for measuring clutter in complex geospatial displays. The feature clustering algorithm was designed to cluster display features according to geospatial location and color and was found to correlate well with subjective clutter ratings collected from a single rater. Follow-on studies with the more formalized color-clustering model also revealed high correlations with subjective clutter ratings as well as visual search times for predefined targets. Again, these studies did not account for the influence of top-down factors that could be captured by collecting subjective perceptions of clutter and examining actual operator performance in the context of a realistic task environment.

The objective of this research was to investigate techniques for measuring subjective perceptions of display clutter, reflecting both bottom-up and top-down factors, as well as modeling the predicted impacts of clutter on pilot performance within the context of advanced flight deck technologies and varying workload conditions. Technologies such as synthetic and enhanced vision systems are designed to support flight navigation while increasing pilot situation awareness and decreasing workload by providing a database- or sensor-driven view of the outside world, regardless of visibility. Although such displays have been shown to be useful for supporting operations in instrument meteorological conditions (Alexander, Wickens, & Hardy, 2005; Prinzel et al., 2004; Schnell, Kwon, Merchant, & Etherington, 2004), there exists a concern that adding complex information (e.g., terrain representations, advanced guidance symbology) might produce visual clutter that inhibits the very processes or tasks these technologies are designed to support.

We formulated research hypotheses in six areas to highlight the development of a multidimensional measure and model of display clutter. As a basis for presenting the statements of hypotheses here, expert commercial pilots flew simulated landing approaches under varied flight task workload (i.e., crosswind) conditions with predetermined groupings of low-, medium-, and high-clutter head-up displays (HUDs). Pilots were asked to hand-fly the approaches and rate the perceived clutter and subjective mental workload associated with each display configuration. The first hypothesis related to the predetermined groupings or classifications of cockpit displays in terms of clutter. We expected displays classified as high clutter to result in higher perceptions of clutter than low- or medium-clutter displays (Hypothesis 1a), higher workload scores than medium-clutter displays (Hypothesis 1b), and less flight control stability than medium-clutter displays (Hypothesis 1c) due to excessive content relative to flight task information requirements. The second hypothesis related to low-clutter displays, which were expected to yield lower perceptions of clutter than medium- or high-clutter displays (Hypothesis 2a), greater perceived workload than medium-clutter displays (Hypothesis 2b), and less flight control stability than medium-clutter displays (Hypothesis 2c) due to reduced content.


The third hypothesis concerned the inclusion of crosswinds in the flight simulation. We did not expect this manipulation to impact perceptions of clutter (Hypothesis 3a), workload (Hypothesis 3b), or flight control performance (Hypothesis 3c), given the high experience level of the pilots in this study, but included it as a manipulation to determine if the strength of visual display properties for predicting clutter ratings would be influenced by different task conditions. With respect to relations between subjective measures of clutter and workload and objective measures of flight control performance (i.e., errors and deviations), we expected only weak correlations (Hypothesis 4) given the high experience level of the pilots. The fifth hypothesis concerned the prediction of pilot perceptions of clutter for which we expected basic visual display properties (e.g., density, contrast) to be significant (Hypothesis 5a). However, our previous research (Alexander et al., 2008; Alexander, Stelzer, Kim, Kaber, & Prinzel, 2009) revealed the importance of both bottom-up and top-down factors in influencing perceptions of clutter, so we expected that visual display properties would only account for a relatively small amount of variance in calculated clutter scores (Hypothesis 5b). Finally, the sixth area related to the prediction of flight control deviations. We expected a combination of subjective clutter responses and objective visual display properties to explain a significant fraction of the variance in flight control (Hypothesis 6).

METHOD

Participants

Six captains (all male) with more than 15 years of experience flying commercial transport vehicles, and some familiarity with advanced HUDs, participated in the experiment. The pilots ranged in age from 47 to 57 years (M = 50.3 years) with total flight hours ranging from 7,500 to 32,000 hr (M = 16,842 hr). Pilots were compensated for their participation.

Simulation

The experiment was conducted at NASA Langley Research Center using the Integration Flight Deck (IFD) simulator. The IFD is a fixed-based simulator that replicates the forward flight deck of a Boeing 757, outfitted with an overhead HUD projector unit and combiner (color-selective mirror) positioned in the direct line of sight of the pilot flying. The IFD provided a 200° by 40° out-the-window panoramic view. The autopilot, auto-throttles, primary flight display, navigation display, flight management computer, and flight director were disabled to force pilots to hand-fly the aircraft and rely solely on the HUD for critical flight information.


Task and Scenario

Pilots were required to fly six instrument landing system (ILS) approaches to Runway 16R at Reno-Tahoe International Airport. The scenario began with the aircraft established on the localizer course abeam the initial approach fix, PYRAM, at 8,500 ft MSL and 210 kt indicated airspeed (KIAS). As shown in Figure 1, each approach was broken into three segments (1 = PYRAM to glideslope intercept, 2 = aircraft established on localizer course and glideslope to DICEY, and 3 = DICEY to landing decision).

During the first segment, the pilot’s primary task was to maintain the localizer course at a constant altitude before intercepting the glideslope. This was to be accomplished while slowing the aircraft from the initial airspeed to the final approach speed of 138 KIAS and configuring the aircraft for landing by calling to the pilot monitoring for gear and flaps, as desired. The second segment began with the aircraft established on the localizer course and glideslope at 138 KIAS with the landing gear and flaps fully extended. The pilot’s primary task was to maintain the localizer course while flying at the correct rate of descent to achieve the glideslope. The third and final segment began at DICEY and ended with a “land” or “go-around” decision. While the primary task of maintaining the localizer course and glideslope continued, the pilot was also expected to attend to the outside world and verbalize a landing decision.

A confederate pilot (member of the research team) served as the pilot monitoring and sat in the right seat to handle air traffic control (ATC) communications, actuate the gear and flaps as directed, set airspeed and altitude bugs as directed, complete checklists, provide altitude callouts as briefed, and confirm settings and procedures on request. An experimenter followed a scenario script providing pilots with weather information and appropriate ATC communications (e.g., approach and landing clearances, radio frequency changes).

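[Figure 1 diagram: approach profile along the glideslope to Runway 16R, divided into Segment 1, Segment 2, and Segment 3. Labels recovered from the figure: PYRAM (IAF waypoint), Glideslope intercept, DICEY (waypoint), Decision Height, Runway 16R; altitudes 8500 ft, 8500 ft, 6400 ft, 5514 ft, and 4412 ft; distances 13.5 nm and 5.9 nm.]

FIGURE 1 Landing approach scenario followed in simulator trials. Note. IAF = initial approach fix.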


Independent Variables

Display clutter grouping, flight task workload, and approach segment were manipulated as independent variables. HUD configurations were selected to represent low, medium, and high levels of clutter based on expert pilot ratings of previously studied displays (see Kaber et al., 2008). Clutter was manipulated through various combinations of four different display features: synthetic vision, enhanced vision, highway-in-the-sky tunnel guidance, and instrument symbology. Figure 2 shows the nine HUD configurations used in the experiment according to their display clutter groupings. As previously mentioned, flight task workload was manipulated through simulated crosswinds. The low workload condition did not include winds, whereas the high workload condition included a 17 kt wind at either 100° or 230°. Given a localizer course of 164°, pilots were required to apply a heading correction to the right or left, respectively, to compensate for crosswind drift. Consequently, the simulated aircraft achieved a substantial “crab” angle during half the approaches under crosswind. As described earlier, each approach was divided into the three segments involving varying primary task demands.

FIGURE 2 Nine head-up display configurations by display clutter groupings. Note. SV = synthetic vision; EV = enhanced vision; Tunnel = highway-in-the-sky guidance; IMC = reduced symbology set for instrument meteorological conditions; PRIM = complete primary flight display symbology set.
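The size of the crosswind in the high workload condition can be checked with basic trigonometry. The following is a minimal sketch, not part of the study's tooling: the function name is hypothetical, and the sign convention (positive when the stated wind direction lies clockwise of the course) is an assumption chosen only to show that the two wind settings act from opposite sides of the 164° localizer course.

```python
import math


def wind_components(wind_dir_deg, wind_speed_kt, course_deg):
    """Split a wind into components across and along a course.

    Assumed convention: the cross-course component is positive when the
    stated wind direction lies clockwise of the course.
    """
    angle = math.radians(wind_dir_deg - course_deg)
    cross = wind_speed_kt * math.sin(angle)
    along = wind_speed_kt * math.cos(angle)
    return cross, along


# Values taken from the text: 17 kt wind at 100 or 230 degrees, course 164.
for direction in (100, 230):
    cross, along = wind_components(direction, 17, 164)
    print(f"wind {direction} deg: cross {cross:+.1f} kt, along {along:+.1f} kt")
```

Under these assumptions both settings give a cross-course component of roughly 15 kt with opposite signs, which is consistent with the opposite heading corrections described for the two wind directions.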

Dependent Measures

The dependent measures included subjective ratings of display clutter and workload as well as objective performance measures. Subjective impressions of display clutter were collected in two ways: using a multidimensional measure of clutter created through our previous work (see Kaber et al., 2008) and an overall perceived rating of clutter on a 20-point scale. The multidimensional measure of clutter yielded an overall clutter score based on a weighted average of ratings on six 20-point scales (anchors are provided in parentheses): redundancy (redundant–orthogonal), colorfulness (monochromatic–colorful), salience (not salient–salient), dynamics (static–dynamic), variability (monotonous–variable), and density (sparse–dense). Subjective workload ratings were collected via the NASA Task Load Index (NASA–TLX; Hart & Staveland, 1988), a multidimensional rating procedure that provides an overall workload score based on a weighted average of ratings on six 20-point scales: mental demand, physical demand, temporal demand, own performance, effort, and frustration. Objective performance measures included vertical and lateral flight path tracking performance, measured as the: (a) degree of kurtosis (i.e., control stability), (b) number of flight technical errors (i.e., deviations of more than 1 dot on the localizer or glideslope indicators as specified by Federal Aviation Administration Air Transport Pilot Practical Test Standards), and (c) root mean square error (RMSE) of glideslope and localizer deviations (a more fine-grained measure than number of flight technical errors), respectively.
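The measures described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the study's scoring code: all function names are hypothetical, the example weights are invented tallies of the kind a paired-comparison exercise over six descriptors might produce, kurtosis is taken as the non-excess fourth standardized moment (the article does not say which variant was used), and a flight technical error is counted once per continuous excursion beyond 1 dot.

```python
import math


def weighted_score(ratings, weights):
    """Weighted average of scale ratings; weights need not sum to 1."""
    total_weight = sum(weights[name] for name in ratings)
    if total_weight == 0:
        raise ValueError("at least one weight must be nonzero")
    return sum(ratings[n] * weights[n] for n in ratings) / total_weight


def rmse(deviations):
    """Root mean square of deviations from the desired path."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def kurtosis(deviations):
    """Fourth standardized moment (assumed variant; a normal curve gives 3)."""
    n = len(deviations)
    mean = sum(deviations) / n
    m2 = sum((d - mean) ** 2 for d in deviations) / n
    m4 = sum((d - mean) ** 4 for d in deviations) / n
    return m4 / (m2 ** 2)


def count_excursions(deviations_in_dots, limit=1.0):
    """Count continuous runs of samples beyond `limit` dots (assumed rule)."""
    count, outside = 0, False
    for d in deviations_in_dots:
        now_outside = abs(d) > limit
        if now_outside and not outside:
            count += 1
        outside = now_outside
    return count


# Hypothetical ratings on the six 20-point clutter scales and invented weights.
ratings = {"redundancy": 12, "colorfulness": 4, "salience": 10,
           "dynamics": 8, "variability": 6, "density": 16}
weights = {"redundancy": 4, "colorfulness": 1, "salience": 3,
           "dynamics": 2, "variability": 0, "density": 5}
track = [0.2, -0.4, 1.3, 1.1, 0.5, -1.2, 0.0]  # invented deviations in dots

print(round(weighted_score(ratings, weights), 2))
print(round(rmse(track), 3), count_excursions(track))
```

The same `weighted_score` helper applies to the six NASA–TLX subscales, since both composites are described as weighted averages of six ratings.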

Visual Display Properties

A display image analysis software application was developed to measure visual display properties for all HUD configurations in advance of the experiment. Screenshot images of the nine HUD configurations were extracted from videos recorded during flight in the IFD simulator. The image analysis software application analyzed the images pixel by pixel to calculate contrast, occlusion, density, and luminance. Contrast was calculated as the average difference between the brightness of pixels as part of the iconic (i.e., instrument symbology) versus noniconic (e.g., synthetic and enhanced vision) imagery. Occlusion refers to the percentage of pixels used to render noniconic imagery that were also used to present iconic imagery. Display density was the overall number of active pixels for the combined noniconic and iconic imagery divided by the total number of pixels available. Luminance was calculated as a weighting of the three raw RGB values returned by the pixel analyzer. (It was also directly measured with a photometer directed at the actual HUD in the IFD simulator before experiment test trials. The pattern of photometer measurements mimicked the pattern of calculated luminance values from the pixel analysis software, which were used for HUD condition evaluation purposes.) Although these visual display properties were not considered in the classification of HUDs according to clutter groups, the values were analyzed for their ability to predict clutter scores from the multidimensional subjective measure described previously.
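The four properties can be expressed compactly as pixel counts and averages. The sketch below is one possible reading of the definitions above and is not the authors' software: the function names are hypothetical, the image is assumed to be a grid of RGB tuples with two boolean masks marking iconic and noniconic pixels, the RGB weighting uses the Rec. 601 luma coefficients as a placeholder because the article does not report the weights, and contrast is taken as the absolute difference between the mean brightness of iconic pixels and of noniconic pixels not covered by symbology.

```python
# Assumed weighting (Rec. 601 luma coefficients); not reported in the article.
ASSUMED_RGB_WEIGHTS = (0.299, 0.587, 0.114)


def luminance(rgb, weights=ASSUMED_RGB_WEIGHTS):
    """Weighted sum of the raw R, G, B values of one pixel."""
    return sum(channel * weight for channel, weight in zip(rgb, weights))


def display_properties(pixels, iconic, noniconic):
    """Contrast, occlusion (%), density, and mean luminance of one image."""
    total = n_iconic = n_noniconic = n_both = n_active = n_background = 0
    lum_total = lum_iconic = lum_background = 0.0
    for px_row, ic_row, non_row in zip(pixels, iconic, noniconic):
        for rgb, is_iconic, is_noniconic in zip(px_row, ic_row, non_row):
            y = luminance(rgb)
            total += 1
            lum_total += y
            if is_iconic:
                n_iconic += 1
                lum_iconic += y
            if is_noniconic:
                n_noniconic += 1
            if is_iconic and is_noniconic:
                n_both += 1  # noniconic pixel overlaid by symbology
            if is_iconic or is_noniconic:
                n_active += 1
            if is_noniconic and not is_iconic:
                n_background += 1
                lum_background += y
    if total == 0:
        raise ValueError("image has no pixels")
    contrast = None
    if n_iconic and n_background:
        contrast = abs(lum_iconic / n_iconic - lum_background / n_background)
    occlusion = 100.0 * n_both / n_noniconic if n_noniconic else None
    return {"contrast": contrast, "occlusion_pct": occlusion,
            "density": n_active / total, "mean_luminance": lum_total / total}


# Tiny invented 2 x 2 example: one symbology pixel drawn over imagery.
pixels = [[(255, 255, 255), (0, 0, 0)], [(100, 100, 100), (0, 0, 0)]]
iconic = [[True, False], [False, False]]
noniconic = [[True, False], [True, False]]
print(display_properties(pixels, iconic, noniconic))
```

In practice the two masks would have to come from rendering the symbology and imagery layers separately; how the original application separated them is not described in the text.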

Procedures and Experiment Design

On arrival at NASA Langley, pilots were briefed on the purpose of the experiment and provided with an overview of the experimental protocol, asked to complete an informed consent form, and given a brief questionnaire regarding their previous flight experience. Pilots were then briefed on the simulation equipment and advanced HUD technology to be used in the experiment. After completing two training scenarios of basic airwork and an approach under visual conditions, pilots provided preexperiment, paired-comparison rankings of the importance of workload factors and display clutter descriptors (as listed earlier). They then completed six test trials in which the HUD configuration and flight task workload were manipulated within subjects so that all pilots experienced low-, medium-, and high-clutter displays under both low and high workload in a counterbalanced presentation. Each segment of the approach was flown with a different HUD configuration; hence, pilots used three different configurations in each of the six trials. The simulation was paused at the end of each segment for pilots to provide subjective ratings of clutter and workload. Weather conditions were dependent on the presented HUD configuration in the final segment of each trial and were therefore not balanced across trials. On completion of the test trials, pilots provided postexperiment, paired-comparison rankings of the workload factors and clutter descriptors. This step was conducted to determine whether pilot perceptions of the importance of any display characteristics in clutter might have changed during the course of the experiment. Subsequently, pilots provided their own definitions of clutter within the context of advanced flight deck technologies, and were debriefed on the study goals and manipulations. This entire procedure was completed in one 4-hr session. Figure 3 presents a summary of the experimental procedure from the participant’s perspective.

RESULTS

FIGURE 3 Graphical summary of steps in experimental procedure.

An alpha level of .05 was used for statistically significant effects and an alpha level of .10 was used for marginally significant effects. Acceptance of a higher alpha value was deemed important in this applied research given that Type 2 statistical errors (failing to report a “real” difference) have just as important safety implications as Type 1 errors (erroneously reporting a nonexistent difference; Wickens, 1998).
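The two-tier decision rule can be written as a small helper. This is only a sketch of the stated thresholds: the function name is hypothetical, and treating the boundaries as strict inequalities is an assumption, since the text does not say how a p value exactly equal to .05 or .10 was handled.

```python
def classify_p(p, alpha=0.05, marginal_alpha=0.10):
    """Label a p value using the .05 and .10 alpha levels described above.

    Strict inequalities at the boundaries are an assumption.
    """
    if p < alpha:
        return "significant"
    if p < marginal_alpha:
        return "marginally significant"
    return "not significant"


# p values of the kind reported in the Results section.
for p in (0.04, 0.07, 0.17):
    print(p, classify_p(p))
```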

Analysis of Variance

Analysis of variance (ANOVA) results did not differ as a function of approach segment so this independent variable is not discussed in the following subsections covering the various responses.

Clutter ratings. A 3 × 2 (clutter grouping by flight task workload) repeated measures ANOVA was conducted on the overall perceived clutter ratings. As shown in Figure 4, there was no significant effect of workload on the ratings (p = .17). However, a significant effect of clutter grouping, F(2, 93) = 18.33, p < .001, revealed the low-clutter display condition (M = 37.92) generated lower ratings than the medium-clutter condition (M = 59.79), which generated lower ratings than the high-clutter condition (M = 68.33). In other words, pilot ratings of the overall clutter of HUDs were in agreement with the a priori display clutter groupings defined based on our prior research results. The interaction between clutter grouping and flight task workload was not significant (p = .61).

FIGURE 4 Mean clutter ratings by display clutter grouping and flight task workload.

Clutter scores. A 3 × 2 (clutter grouping by flight task workload) repeated measures ANOVA was also conducted on the calculated clutter scores derived from the multidimensional clutter measure. A significant effect of workload, F(1, 93) = 6.05, p = .02, revealed that the crosswind condition was associated with higher calculated clutter scores than the no-wind condition. In addition, a significant effect of clutter grouping, F(2, 93) = 15.34, p < .001, indicated that the low-clutter condition (M = 40.74) generated lower calculated clutter scores than the medium-clutter condition (M = 51.41), which generated lower calculated scores than the high-clutter condition (M = 55.72). As was the case with the perceived clutter ratings, calculated clutter scores were in agreement with the a priori clutter groupings. Furthermore, the interaction between clutter grouping and flight task workload was not significant (p = .44).

Workload scores. A repeated measures ANOVA with display clutter grouping and flight task workload as predictors was also conducted on the composite NASA–TLX scores. Mean TLX scores are shown in Figure 5. Although there was no effect of the workload manipulation (p = .97) on workload scores, there was a significant effect of clutter grouping, F(2, 93) = 3.3, p = .04. The medium-clutter condition (M = 60.15) generated lower workload ratings than the low-clutter condition (M = 65.56) and high-clutter condition (M = 64.72). The interaction of clutter grouping and flight task workload was not significant (p = .64).

FIGURE 5 Mean NASA Task Load Index scores by display clutter grouping and flight task workload.

Flight path tracking performance. A repeated measures ANOVA with dis-play clutter grouping and flight task workload as predictors was conducted onthe glideslope deviation performance data. As shown in Figure 6a, there wasno significant effect of the workload (p = .92) in flight tracking performance.A marginally significant effect of clutter grouping, F(2, 59) = 2.75, p = .07,indicated the low display clutter group (M = 5.31) generated a distribution ofglideslope deviations with lower kurtosis (indicative of less stable control) than

FIGURE 6 (a) Glideslope deviation stability by display clutter grouping and flight taskworkload. (b) Localizer deviation stability by display clutter grouping and flight task workload.

the medium-clutter condition (M = 8.76), which generated a deviation distribution with higher kurtosis (more stable control) than the high-clutter condition (M = 6.22). The interaction of clutter grouping and flight task workload was not significant (p = .29).

Finally, a 3 × 2 (clutter grouping by flight task workload) repeated measures ANOVA was also conducted on the localizer deviation performance data. Although there was no effect of workload (p = .59) on the response, there was a significant effect of clutter grouping, F(2, 93) = 5.42, p = .006. Figure 6b presents the mean localizer deviations for the various display conditions. Post-hoc analysis indicated that the low-clutter condition (M = 8.83) generated a distribution of deviations with lower kurtosis (less stable control) than the medium-clutter condition (M = 13.18) and high-clutter condition (M = 12.29). The interaction between clutter grouping and flight task workload was not significant (p = .91).
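The stability measure used above is the kurtosis of the deviation distribution, with higher values read as more stable control. The following is a minimal sketch of that statistic, assuming the plain moment-based definition (fourth central moment divided by the squared variance). The text here does not state which kurtosis estimator or bias correction the authors used, so the function name `kurtosis`, the `excess` switch, and the sample data are illustrative assumptions only.

```python
from statistics import fmean


def kurtosis(deviations, excess=False):
    """Moment-based (population) kurtosis: m4 / m2**2.

    Assumption: no small-sample bias correction is applied.
    Set excess=True to subtract 3 (a normal distribution then scores 0).
    """
    values = list(deviations)
    if len(values) < 2:
        raise ValueError("need at least two samples")
    mean = fmean(values)
    m2 = fmean([(x - mean) ** 2 for x in values])  # second central moment
    m4 = fmean([(x - mean) ** 4 for x in values])  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    k = m4 / m2 ** 2
    return k - 3.0 if excess else k


# Hypothetical deviation samples (not data from the study).
# Mostly on-path with rare excursions: peaked distribution, high kurtosis.
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -4, 4]
# Evenly spread deviations: flat distribution, low kurtosis.
spread = [-2, -1, 0, 1, 2]

print(kurtosis(peaked), kurtosis(spread))
```

Under the interpretation given in the text, the first sample would be read as more stable tracking than the second, because most of its deviations sit close to the mean.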

Correlations

Table 1 presents the results of the correlation analyses across subjective ratings and performance measures. Overall clutter ratings did not appear to have a significant correlation with any of the performance measures in any flight segment;

TABLE 1
Results of Correlation Analyses on Clutter and Workload Scores Versus Glideslope and Localizer Deviation Performance by Approach Segment

                                      Glideslope            Localizer
                                    Error     RMSE       Error     RMSE

Clutter scores vs. performance
  Segment 1                   r      N/A       N/A       –.07      –.50*
                              p      N/A       N/A        .66      .002
  Segment 2                   r      .07       .03        .16       .12
                              p      .70       .86        .36       .48
  Segment 3                   r      .25       .08       –.27      –.33*
                              p      .14       .63        .11       .05

Workload scores vs. performance
  Segment 1                   r      N/A       N/A        .47*      .37*
                              p      N/A       N/A       .004       .03
  Segment 2                   r      .40*      .45*       .37*      .43*
                              p      .02       .01        .03       .01
  Segment 3                   r     –.03       .29        .30       .38*
                              p      .88       .09        .08       .02

Note. Pearson coefficients are presented on the first line of each cell with significance levels directly below. Significant correlations, p ≤ .05, are marked with an asterisk. RMSE = root mean square error.

therefore, these data are not included in Table 1. Calculated clutter scores based on the multidimensional clutter measure were, however, negatively correlated with localizer RMSE in Segments 1 and 3. This indicated that as HUD visual content increased, localizer deviations decreased. None of the other performance measures was significantly related to the clutter score. (Note that glideslope performance was not evaluated in Segment 1 because the pilot’s primary task was to maintain a constant altitude.) These results also indicate that the calculated clutter score might be a more sensitive measure than overall clutter ratings to differences in display content, which can have relevance to aspects of flight control.

As shown in the bottom section of Table 1, the NASA–TLX workload scores were significantly and positively correlated with observed glideslope errors and RMSE in approach Segment 2 (again, glideslope performance was not evaluated during Segment 1). Workload scores and localizer deviation performance measures were significantly and positively correlated for the first two segments for dot errors and for all segments for RMSE. These findings indicate that increases in perceived workload were associated with increased flight errors and deviations.

Relations between subjective impressions of clutter and workload were also analyzed. Correlation analyses on overall clutter ratings and NASA–TLX workload scores revealed positive linear relations in Segment 2 (r = .36, p = .03) and Segment 3 (r = .38, p = .02). A marginally significant relation of clutter ratings and workload scores was found in Segment 1 (r = .32, p = .06). This suggests that pilots were consistent in associating degrees of display clutter with overall perceived cognitive load; as display clutter increased so did pilot perceptions of workload. In contrast, calculated clutter scores were not related to the TLX scores for test trials. It is possible that the timing of the workload ratings, directly following the overall display clutter ratings, might have led pilots to think about their clutter ratings when assessing flight segment workload. This could have created a correlation between the two sets of ratings. Alternatively, the calculated clutter score might be more capable of capturing the differences between clutter and workload than the overall clutter rating, indicating that clutter and workload are distinct mechanisms.
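The coefficients reported in this section are Pearson correlations. The following minimal sketch shows how such a coefficient, and the t statistic commonly used to test it against zero, can be computed. The function names and the sample ratings are hypothetical, and the number of observations behind each reported correlation is not stated in this section, so no p values are reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("requires n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical workload ratings and localizer RMSE values (not study data).
workload = [40.0, 55.0, 60.0, 62.0, 70.0, 75.0]
loc_rmse = [0.20, 0.35, 0.30, 0.45, 0.50, 0.48]

r = pearson_r(workload, loc_rmse)
print(round(r, 3), round(t_statistic(r, len(workload)), 3))
```

A positive r for these hypothetical values would be read the same way as the workload rows of Table 1: higher perceived workload going along with larger deviations.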

Regression Analysis and Modeling

The four display visual property measures (lumens, contrast, occlusion, and density) were used to create linear equations to model and predict values of the clutter score derived from the multidimensional subjective measure. Regression analysis results for models for the low and high task workload conditions are presented in Table 2 and include magnitudes and signs of all predictor terms along with t statistics and significance levels. Both models were significant with R² = 0.27, p = .003 for low workload and R² = 0.22, p = .02 for high workload. In general, decreases in display lumens and contrast and increases in occlusion and density

TABLE 2
Regression Analysis Results for the Model of Percentage Clutter Score in Display Properties Under Low and High Flight Task Workload Conditions

                     Low Flight Task Workload                 High Flight Task Workload
             Estimate   Error   t Value   Pr > t      Estimate   Error   t Value   Pr > t

Intercept      26.45     6.01     4.40    < .001*       38.04     6.71     5.67    < .001*
Lumen         −13.35     5.25     2.54      .01*        −6.22     5.86     1.06      .29
Contrast       −6.92     5.03     1.37      .17         −9.54     5.62     1.70      .09
Occlusion       0.52     0.69     0.76      .45          0.82     0.77     1.07      .29
Density        17.43     6.74     2.59      .01*         8.40     7.52     1.12      .27

Note. Significance levels of p ≤ .05 are marked with an asterisk.

caused increases in clutter scores. That said, the low R² values indicate that other variables (e.g., top-down factors such as pilot goals and task knowledge) might be required to account for greater variability in the clutter scores.

Under low flight task workload, the contrast and occlusion measures did not prove to be statistically significant in predicting the clutter score. Regression analysis revealed parameter estimates for average lumens and density to have negative and positive signs, respectively, and for density to explain the greatest portion of variance in clutter scores. The complete model for estimating clutter score based on visual display properties under low workload is as follows:

Percent clutter score = 26.45 − 13.35 ∗ lumen + 17.43 ∗ density
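The low-workload equation above can be evaluated directly. The following is a minimal sketch, assuming the lumen and density predictors are supplied on the same scale the authors used when fitting the model; that scale is not given in this section, so the input values below are placeholders rather than realistic display measurements. The function name is hypothetical.

```python
def predicted_clutter_score(lumen, density):
    """Percent clutter score under low flight task workload.

    Coefficients are taken from the low-workload equation in the text
    (intercept 26.45, lumen -13.35, density 17.43). The units and scaling
    of both predictors are an assumption left to the caller.
    """
    return 26.45 - 13.35 * lumen + 17.43 * density


# Placeholder inputs only, chosen to show the direction of each effect.
baseline = predicted_clutter_score(lumen=0.5, density=0.5)
brighter = predicted_clutter_score(lumen=0.6, density=0.5)
denser = predicted_clutter_score(lumen=0.5, density=0.6)

print(baseline, brighter, denser)
```

With the signs reported in Table 2, raising lumens lowers the predicted score and raising density raises it, which matches the direction of effects described in the text.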

The results for the high workload condition revealed a lack of significance of all four visual display properties (all p > .05) for predicting clutter score, although contrast was a marginally significant predictor (p = .09). This finding suggests that bottom-up factors became less important, and perhaps top-down factors became more important, as workload increased.

Models of pilot performance were also generated to determine the extent to which flight path deviations could be predicted based on calculated clutter scores, workload scores, display clutter grouping, flight task workload, approach segment, and visual display properties. A stepwise regression modeling procedure with backward elimination of nonsignificant (p > .05) predictors was conducted for glideslope and localizer RMSE. Some predictors were not included in the models due to multicollinearity among variables. Model diagnostics were conducted to ensure residuals conformed with the normality assumption. The final best-fit regression models for each performance measure were selected based on maximum R² values, and are presented in Table 3. It should be noted that the R² values

TABLE 3
Best-Fit Regression Models for Predicting Glideslope and Localizer Deviation Performance

Performance Measure: Glideslope RMSE
  Best-Fit Model: Log (glideslope RMSE) = –0.43 – 0.49 (contrast) – 0.15 (density)
                  Clutter and workload scores were not significant.
  R² = 0.11, p = .02

Performance Measure: Localizer RMSE
  Best-Fit Model: Log (localizer RMSE) = –1.82 + 0.45 (segment) – 1.52 (display) + 0.87 (luminance) – 1.75 (contrast) – 1.02 (density)
                  Clutter score was not significant.
  R² = 0.38, p < .001

  Best-Fit Model: Log (localizer RMSE) = –3.05 + 0.45 (z-TLX) + 0.69 (segment) – 1.36 (display) + 0.08 (luminance)
  R² = 0.46, p < .001

Note. RMSE = root mean square error; TLX = NASA Task Load Index.

for the models did not exceed 0.46, suggesting that other pilot, task, environment, or system factors beyond those examined could influence flight path control.
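The response variable in the Table 3 models is the logarithm of RMSE. The following minimal sketch shows how such a response could be derived from a series of sampled path deviations. It assumes deviations are measured relative to the on-path value (a target of zero) and that the natural logarithm is used; the base of the logarithm, the sampling scheme, and the function names are assumptions not stated in the text.

```python
import math


def rmse(deviations, target=0.0):
    """Root mean square error of sampled deviations around a target value."""
    values = list(deviations)
    if not values:
        raise ValueError("need at least one sample")
    return math.sqrt(sum((d - target) ** 2 for d in values) / len(values))


def log_rmse(deviations, target=0.0):
    """Log-transformed RMSE (natural log assumed here, not confirmed by the text)."""
    value = rmse(deviations, target)
    if value == 0:
        raise ValueError("log of zero RMSE is undefined")
    return math.log(value)


# Hypothetical localizer deviations for one approach segment (not study data).
segment_deviations = [0.10, -0.25, 0.30, -0.05, 0.15]

print(rmse(segment_deviations), log_rmse(segment_deviations))
```

A log transform of this kind is a common way to reduce the right skew of error magnitudes before fitting a linear model, which is consistent with the residual normality checks mentioned above.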

DISCUSSION

Table 4 presents a summary of the overall findings of this study by each hypothesis along with brief explanations for hypotheses that were only partially supported. The ANOVA results on the pilot clutter ratings and calculated clutter scores were in agreement with the predetermined groupings of low-, medium-, and high-clutter displays. This finding supported Hypotheses 1a and 2a. Although overall clutter ratings were not influenced by flight task workload (i.e., crosswinds), calculated clutter scores based on the multidimensional subjective measure were higher in the high-workload condition compared to the low-workload condition. These conflicting results provide partial support for Hypothesis 3a, given that we did not expect the workload manipulation to influence perceptions of clutter regardless of measurement technique. It is possible that the calculated clutter scores were more sensitive to the influence of high crosswinds due to overlap of symbology in the HUD that occurred when the aircraft achieved a substantial crab angle. For example, the guidance symbology ordinarily appearing in the center of the HUD moved to the right or left of the display (depending on the wind direction) and obscured the glideslope indicator and altitude tape (on the right) or speed tape (on the left).

Medium-clutter displays appeared to provide some optimal amount of information relative to pilot goals and tasks across segments that led to lower workload scores (supporting Hypotheses 1b and 2b) and better flight control performance

TABLE 4
Summary of Overall Findings by Hypothesis

Hypothesis                                                                    Supported  Partially Supported

1a: High-clutter displays → higher perceptions of clutter                     X
1b: High-clutter displays → higher workload scores                            X
1c: High-clutter displays → less flight control stability                                X (Localizer deviations were not significant)
2a: Low-clutter displays → lower perceptions of clutter                       X
2b: Low-clutter displays → lower workload scores                              X
2c: Low-clutter displays → less flight control stability                      X
3a: Crosswinds → no impact on perceptions of clutter                                     X (Clutter scores were higher in crosswind condition)
3b: Crosswinds → no impact on workload                                        X
3c: Crosswinds → no impact on flight control performance                      X
4: Clutter and workload ratings → flight control performance                             X (Clutter ratings were not significant)
5a: Visual display properties → perceptions of clutter                                   X (Only lumens and density were significant)
5b: Visual display properties → small amount of variance in clutter scores    X
6: Clutter scores + visual display properties → flight control performance               X (Clutter scores were not significant)

(supporting Hypotheses 1c and 2c) compared to low- and high-clutter displays. Generally speaking, high-clutter displays produced elevated reports of workload due to increased display feature density, redundant information presentation, or both. On the other hand, low-clutter displays were found to be challenging for pilots to use due to limited visual imagery that might typically be expected or used for flight tasks. These challenges were mirrored in flight control performance (i.e., control stability, or the degree of deviation distribution kurtosis), supporting critical links between manipulations of HUD content, pilot perceptions of clutter, and resulting objective flight performance. Localizer tracking performance, however, was not significantly different between the medium- and high-clutter displays, tempering support for Hypothesis 1c. This lack of difference might have been due to the placement of localizer information toward the bottom of the HUD, a location that was not also occupied by noniconic features during display use. This prevented the localizer image from becoming obscured, like the glideslope information, as additional information elements were included in the display or

the aircraft was exposed to crosswinds. As expected, ANOVAs revealed that the manipulation of flight task workload did not influence subjective workload scores or flight control performance, supporting Hypotheses 3b and 3c, respectively.

Hypothesis 4 was only partially supported by the various correlation analyses conducted. Although overall clutter ratings were not correlated with flight control performance, the calculated clutter scores were found to be negatively correlated with localizer deviations (i.e., RMSE). This suggested that the calculated clutter score was once again more sensitive than the overall clutter rating, and that as pilots perceived higher clutter in the HUD, lateral path control improved. This finding is in agreement with the earlier ANOVA, which indicated that control was less stable with the low-clutter displays compared to the medium- and high-clutter displays. It is likely that the stability (kurtosis) measure and the path deviation (RMSE) measure tapped the same aspects of flight performance. In general, it appeared that variability in flight path control was lower with the high-content displays and so were absolute path deviations. In support of this proposition, pilot perceptions of flight task workload were positively and significantly related to vertical (i.e., glideslope) and lateral (i.e., localizer) flight control performance, indicating that as pilot ratings of workload increased, flight control errors and deviations also increased (although as indicated earlier, flight control stability was not impacted).

Although we did not set out to examine the sensitivity of different flight control performance metrics, it appears that errors and deviations are more sensitive to both perceptions of clutter and flight task workload than flight control stability. In comparing perceptions of clutter and workload, only the overall clutter ratings were found to be positively and significantly correlated with workload scores. It is possible that the timing of the workload ratings, directly following the overall clutter ratings, might have led pilots to recall their clutter ratings when assessing flight segment workload, creating a correlation between the two sets of ratings. A more interesting explanation, however, is that the calculated clutter score might be more capable of capturing the differences between clutter and workload than the overall clutter rating, indicating that clutter and workload are distinct mechanisms.

To address Hypothesis 5a, we developed multiple linear regression models of calculated clutter scores based on HUD visual properties. Contrary to expectation, only lumens and density were significant in predicting the percentage clutter score under low flight task workload conditions. It is possible that pilot experience with HUDs might decrease attention to the underlying visual properties of the display and focus attention on the information presented (i.e., pilots might become accustomed to HUD image quality and overlook limitations of intensity or density due to display size). Under high flight task workload, none of the visual properties were significant predictors of calculated clutter scores, indicating that bottom-up factors became less important, and perhaps top-down factors became more important, as workload increased. It could be the case that definitions of clutter change as workload changes, a relationship that should be investigated further

in future work. In addition, as predicted in Hypothesis 5b, only a relatively small amount of variance in calculated clutter scores was accounted for by the measured visual display properties. Top-down (i.e., knowledge-driven) factors such as pilot expectancy and task relevance need to be considered in such models. Individual differences among the pilot sample might also be important.

Contrary to Hypothesis 6, neither the calculated clutter scores nor the workload scores were significant in the majority of regression models predicting pilot performance (glideslope and localizer deviations). Even though the correlations between calculated clutter and workload scores and flight control deviations were strong, as discussed previously, other variables such as flight segment, display configuration, and a select number of visual display properties dominated the regression models of performance and the subjective measures became statistically nonsignificant. However, the R² values were once again relatively low, indicating that other factors not examined in this study might influence flight path control. Collectively, these findings point to the need to develop more complete models of pilot performance by integrating other contextual (i.e., task, environmental, pilot) variables.

CONCLUSION

This study achieved the objective of investigating techniques for measuring subjective perceptions of display clutter as well as modeling the predicted impacts of clutter on pilot performance within the context of advanced flight deck technologies and varying workload conditions. Primary contributions of this work include the development of a multidimensional measure and model of display clutter. The model, in particular, could be used to complement objective assessments of display clutter by taking into account top-down factors as relevant dimensions. Furthermore, given that the strength of visual display properties in predicting clutter ratings changes with flight task workload, the model could be used as a powerful assessment of clutter that is sensitive to different task conditions.

A key next step in this line of research would be to validate the multidimensional measure of clutter as well as the predictive performance models in evaluating other complex displays for different work domains. In fact, a follow-up study has already been conducted in which the clutter measure and models were extended to a head-down primary flight and navigation display in a vertical takeoff and landing aircraft simulator (see Kaufmann, Kaber, Alexander, Kim, & Naylor, 2011). Continued work on developing reliable measures and models of display clutter could support optimized display design through the identification of clutter thresholds and the implementation of clutter alerts, decluttering mechanisms, or both. Decluttering mechanisms could include a reduction in the amount of visual imagery and features portrayed, perhaps through temporary filtering, or

highlighting critical information to guide operator attention. Furthermore, finalized products of this research could be used to support display certification and acquisitions processes.

ACKNOWLEDGMENTS

This article is based on work supported by the National Aeronautics and Space Administration (NASA) under Contract Number NNL06AA21A issued through the Integrated Intelligent Flight Deck (IIFD) project under the Aviation Safety Program. Any opinions, findings, and conclusions or recommendations expressed in this article are those of the authors and do not necessarily reflect the views of NASA. This research was conducted while Amy L. Alexander and Emily M. Stelzer were employed by Aptima, Inc., and Sang-Hwan Kim was a graduate student at North Carolina State University. We would like to thank Mr. Randy Bailey and Dr. Steve Young for committing substantial NASA Langley Research Center resources to this project; Mr. Jerry Karwac, Ms. Wei Anderson, and Mr. Dennis Frasca for providing simulation support; Mr. Mike Norman for providing pilot subject matter expertise in the implementation of the display configurations; Ms. Angela Allamandola for serving as a confederate copilot during the simulator test trials; and Mr. Theo Veil for his assistance with experiment planning and data collection and the pixel analyzer software development.

Portions of this research were reported at the 53rd annual meeting of the Human Factors and Ergonomics Society, October 19–23, 2009, San Antonio, TX.

REFERENCES

Alexander, A. L., Stelzer, E. M., Kim, S-H., & Kaber, D. B. (2008). Bottom-up and top-down contributors to pilot perceptions of display clutter in advanced flight deck technologies. In Proceedings of the Human Factors and Ergonomics Society 52nd annual meeting (pp. 1180–1184). Santa Monica, CA: HFES.

Alexander, A. L., Stelzer, E. M., Kim, S-H., Kaber, D. B., & Prinzel, L. J. (2009). Data and knowledge as predictors of perceptions of display clutter, subjective workload and pilot performance. In Proceedings of the Human Factors and Ergonomics Society 53rd annual meeting (pp. 21–25). Santa Monica, CA: HFES.

Alexander, A. L., Wickens, C. D., & Hardy, T. J. (2005). Synthetic vision systems: The effects of guidance symbology, display size, and field of view. Human Factors, 47, 693–707.

Aretz, A. J. (1988). A model of electronic map interpretation. In Proceedings of the Human Factors Society 32nd annual meeting (pp. 130–135). Santa Monica, CA: HFS.

Beck, M. R., Lohrenz, M. C., Trafton, J. G., & Gendron, M. L. (2008). The role of local and global clutter in visual search [Abstract]. Journal of Vision, 8, 1071.

Hart, S. G., & Staveland, L. E. (1988). Development of NASA–TLX (Task Load Index): Results of empirical and theoretical research. In P. A. Hancock & N. Meshkati (Eds.), Human mental workload (pp. 239–250). Amsterdam, The Netherlands: North Holland.

Kaber, D. B., Alexander, A. L., Stelzer, E. M., Kim, S-H., Kaufmann, K., & Hsiang, S. M. (2008). Perceived clutter in advanced cockpit displays: Measurement and modeling with experienced pilots. Aviation, Space, & Environmental Medicine, 79, 1007–1018.

Kaufmann, K., Kaber, D., Alexander, A., Kim, S-H., & Naylor, J. T. (2011). Testing measures of aviation display clutter for predicting pilot perceptions and flight performance. In H. Khalid, A. Hedge, & T. Z. Ahram (Eds.), Advances in ergonomics modeling and usability evaluation (pp. 235–245). New York, NY: CRC.

Lohrenz, M. C., Layne, G. L., Edwards, S. S., Gendron, M. L., & Bradley, J. T. (2006). Feature clustering to measure clutter in electronic displays. In Proceedings of the 12th Industry, Engineering, and Management Systems conference. Cocoa Beach, FL: The Association for Industry, Engineering, and Management Systems.

Lohrenz, M. C., Trafton, J. G., Beck, M. R., & Gendron, M. L. (2009). A model of clutter for complex, multivariate geospatial displays. Human Factors, 51, 90–101.

Prinzel, L. J., Comstock, J. R., Glaab, L. J., Kramer, L. J., Arthur, J. J., & Barry, J. S. (2004). The efficacy of head-down and head-up synthetic vision display concepts for retro- and forward-fit of commercial aircraft. International Journal of Aviation Psychology, 14, 53–77.

Rosenholtz, R., Li, Y., Mansfield, J., & Jin, Z. (2005). Feature congestion, a measure of display clutter. In Proceedings of the 2005 Conference for the Association for Computing Machinery Special Interest Group on Computer-Human Interaction (pp. 761–770). Portland, OR: ACM SIGCHI.

Rosenholtz, R., Li, Y., & Nakano, L. (2007). Measuring visual clutter. Journal of Vision, 7, 1–22.

Schnell, T., Kwon, Y., Merchant, S., & Etherington, T. (2004). Improved flight technical performance in flight decks equipped with synthetic vision information system displays. International Journal of Aviation Psychology, 14, 79–102.

Schons, V., & Wickens, C. D. (1993). Visual separation and information access in aircraft display layout (Tech. Rep. No. ARL-93-7/NASA-A3I-93-1). Savoy, IL: University of Illinois Institute of Aviation.

Wickens, C. D. (1998). Commonsense statistics. Ergonomics in Design, 6(4), 18–22.

Manuscript first received: July 2011
