
Developing Qualitative Metrics for Visual Analytic Environments
Jean Scholtz, Pacific Northwest National Laboratory
BELIV 2010


DESCRIPTION

Presenter: Jean Scholtz. BELIV 2010 Workshop Presentation. http://www.beliv.org/beliv2010/

TRANSCRIPT

Page 1: Developing Qualitative Metrics for Visual Analytic Environments

Developing Qualitative Metrics for Visual Analytic Environments

Jean Scholtz, Pacific Northwest National Laboratory

BELIV 2010

Page 2: Developing Qualitative Metrics for Visual Analytic Environments

Why Are Qualitative Metrics Needed?

Quantitative metrics:

- Time to accomplish a task
- Accuracy with which a task is done (requires ground truth)
- Percentage of tasks that the user is able to complete

Qualitative information:

- Provides more knowledge about what in the software actually helps the user, and how it helps the user

Problems with obtaining qualitative information:

- It is subjective, and varies by individual, so generalizations are often difficult

Page 3: Developing Qualitative Metrics for Visual Analytic Environments

Qualitative Assessment and the VAST Challenge

In the VAST Challenge we are able to produce quantitative metrics easily, as ground truth is embedded in the data set.

But what we really want to know is how well the tool, particularly its visualizations, helps the user arrive at the ground truth.

While we have always used reviewers (both analysts and visualization researchers), the reviews were conducted informally until 2009.

In 2009 we introduced a review system, so we now have a body of reviews that we can analyze to see what is important to reviewers.

Page 4: Developing Qualitative Metrics for Visual Analytic Environments

Evaluating the Reviews

Our study asked the following questions:

- What materials should be provided to evaluators to ensure that they can adequately assess the visual analytic systems?
- What aspects of visual analytic systems are important to reviewers?
- Is there an advantage to selecting evaluators from different domains of expertise (visualization researchers and professional analysts)?

We analyzed the following information to answer these questions:

- Reviews (2-3 per entry) of 42 entries
- Each reviewer provided a clarity rating, plus ratings for the usefulness, efficiency, and intuitiveness of the visualizations, the analytic process, and the interactions. Reviewers were also asked to rate the novelty of the submission.
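The slides do not show how the 2-3 reviews per entry were tabulated; a minimal sketch of one plausible aggregation step, averaging each rating dimension across an entry's reviews. The entry names and score values below are invented for illustration; the dimension names come from the slide.

```python
from statistics import mean

# Invented example reviews; dimension names follow the slide.
reviews = [
    {"entry": "team-01", "clarity": 5, "usefulness": 4, "efficiency": 3,
     "intuitiveness": 4, "novelty": 2},
    {"entry": "team-01", "clarity": 6, "usefulness": 5, "efficiency": 4,
     "intuitiveness": 5, "novelty": 3},
    {"entry": "team-02", "clarity": 3, "usefulness": 2, "efficiency": 2,
     "intuitiveness": 3, "novelty": 4},
]

def average_scores(reviews):
    """Average each rating dimension across the 2-3 reviews per entry."""
    by_entry = {}
    for r in reviews:
        by_entry.setdefault(r["entry"], []).append(r)
    dims = ["clarity", "usefulness", "efficiency", "intuitiveness", "novelty"]
    return {
        entry: {d: mean(r[d] for r in rs) for d in dims}
        for entry, rs in by_entry.items()
    }

print(average_scores(reviews)["team-01"]["clarity"])  # 5.5
```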

Page 5: Developing Qualitative Metrics for Visual Analytic Environments

What materials should be provided to evaluators to ensure that they can adequately assess the visual analytic systems?

Because of the number of teams, the research nature of the tools and the number of reviewers, it is not possible for reviewers to actually use the system. They rely on:

- A textual description, including screenshots
- A video showing how a certain answer was achieved (the process)
- The accuracy of the team's answers

And we found:

- The materials are sufficient
- The clarity of the submission definitely affects the score
- Accuracy metrics impact reviewers' scores

Conclusions:

- Emphasize to participants that they need to make sure their descriptions (text and video) are understandable
- Reconsider the decision to show reviewers the accuracy scores

[Figure: Clarity Score vs. Average Score. Bar chart showing, for clarity scores 1-7, the percentage of entries (0-40%) receiving each average score, grouped into clarity <= 4 and clarity > 4.]
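The chart's grouping step can be sketched as follows. The clarity > 4 threshold comes from the slide; the entries and their scores are invented for illustration, not the actual VAST Challenge data.

```python
# Invented example entries; only the clarity <= 4 / > 4 split is from the slide.
entries = [
    {"clarity": 3, "avg_score": 2.1},
    {"clarity": 5, "avg_score": 4.0},
    {"clarity": 6, "avg_score": 4.5},
    {"clarity": 2, "avg_score": 1.8},
]

# Split entries by the clarity threshold, then compare mean average scores.
low = [e["avg_score"] for e in entries if e["clarity"] <= 4]
high = [e["avg_score"] for e in entries if e["clarity"] > 4]
print(sum(low) / len(low), sum(high) / len(high))
```

With the invented data, the higher-clarity group also shows the higher mean score, mirroring the slide's finding that clarity affects the score.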

Page 6: Developing Qualitative Metrics for Visual Analytic Environments

What aspects of visual analytic systems are important to reviewers?

We analyzed the reviews to see what comments reviewers made.

We classified these comments into three categories:

- Analytic process
- Visualizations
- Interactions

Notes:

- There is obviously overlap, so there may be disagreements about which category a comment belongs in
- Most comments are stated negatively; to restate them positively we need more generalization, or we have to describe the actual situation (visualization, interaction, or process)
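The slides do not say how comments were assigned to the three categories; a naive keyword sketch of such a classifier is below. The keyword lists are invented, and as the notes above observe, the categories overlap, so a real pass would need human judgment.

```python
# Invented keyword lists; only the three category names come from the slides.
KEYWORDS = {
    "visualization": ["color", "label", "symbol", "scale", "legend"],
    "interaction": ["scroll", "menu", "filter", "drill", "click"],
    "analytic process": ["step", "manual", "automation", "rationale"],
}

def classify(comment):
    """Assign a comment to the first category whose keywords match."""
    text = comment.lower()
    for category, words in KEYWORDS.items():
        if any(w in text for w in words):
            return category
    return "uncategorized"

print(classify("Too much scrolling"))  # interaction
```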

Page 7: Developing Qualitative Metrics for Visual Analytic Environments

Comments on Analytic Process

- Highly manual processes
- Repetitive steps in process
- Large amount of data that analysts have to visually inspect
- Automation that might cause an analyst not to see an important piece of data
- Need for analyst to remember previously seen information
- Too many steps
- For automatic identification of patterns and/or behaviors, users need an explanation of what the software is programmed to identify
- Analysts need to document their rationale for assumptions in their reports
- Document the selection of a particular visualization if several are available
- Show filters and transformations applied
- Participants need to explain how the visualizations helped the analysis process

Page 8: Developing Qualitative Metrics for Visual Analytic Environments

Comments on Visualizations

- Complexity of visualizations
- Misleading color coding, inconsistent use of color
- Lack of labels; non-intuitive labels
- Non-intuitive symbols
- Using tooltips instead of labels causes analysts to mouse over too many items
- No coordination or linking between visualizations
- Using line thickness to represent strength of association is difficult to differentiate
- Difficult to compare visualizations if they can only be viewed serially
- Is the visualization useful for analysis, or is it a reporting tool?
- Use of different scales in visualizations is confusing
- Need to relate anomalies seen in visualizations to the analysis

Page 9: Developing Qualitative Metrics for Visual Analytic Environments

Comments on Interactions

Too much scrolling

Interactions embedded in menus

Too many levels of menus and options to check

Need to be able to filter complex visualizations

Need to be able to drill down to actual data

Page 10: Developing Qualitative Metrics for Visual Analytic Environments

Conclusions

Reviewers were asked to comment on the categories of process, visualization and interaction

They provided many comments but the comments were not always in the appropriate category

Reviewers were asked to comment on efficiency, usefulness and intuitiveness

They did mention many issues impacting each of these qualities but we did not use the individual ratings as we had expected

For the VAST 2010 Challenge we are:

- Providing guidelines for teams (from this study plus another analyst study)
- Providing definitions of process, visualization, and interaction
- Looking forward to identifying more guidelines (pertaining to new types of data)

Page 11: Developing Qualitative Metrics for Visual Analytic Environments

Is there an advantage to selecting evaluators from different domains of expertise?

Are there differences between what visualization researchers say and what analysts say?

We looked at entries where there were large differences in scores between the analyst and the visualization researchers

Eight entries (1 analyst review, 2 researcher reviews each):

- Analyst ratings were lower in 2 instances
- Visualization researcher ratings were lower in 5 instances
- In one instance, the analyst and one visualization researcher were lower than the other visualization researcher
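Flagging the entries with large analyst/researcher disagreement can be sketched as below. The entry names, scores, and the 1.5-point gap threshold are all invented for illustration; the slides only say that entries with large score differences were examined.

```python
# Invented example scores; each entry has 1 analyst and 2 researcher reviews.
entries = {
    "team-01": {"analyst": 2.0, "researchers": [4.0, 4.5]},
    "team-02": {"analyst": 4.0, "researchers": [3.8, 4.2]},
}

def flag_disagreements(entries, gap=1.5):
    """Return entries where the analyst score differs from the mean
    researcher score by at least `gap` points (threshold is invented)."""
    flagged = []
    for name, scores in entries.items():
        researcher_avg = sum(scores["researchers"]) / len(scores["researchers"])
        if abs(scores["analyst"] - researcher_avg) >= gap:
            flagged.append(name)
    return flagged

print(flag_disagreements(entries))  # ['team-01']
```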

Page 12: Developing Qualitative Metrics for Visual Analytic Environments

Why Visualization Researchers Gave Lower Ratings

Comments:

- The tool does not seem flexible enough to investigate other scenarios
- Should a way to quickly browse through video snippets be considered a visualization?
- Only one visualization is provided; more are needed to look at other data
- The analytic process was not well described
- Suspicious events were highlighted in the visualization, but it was unclear how they were found
- The visualization is distracting and not useful for analysis
- The analytic process is described in terms of using this tool to detect certain kinds of events, without justifying whether this type of event is related to the mini-challenge question
- The visualization was too compressed; it was difficult to see groupings
- The tool required an iterative process, and it was difficult to remember what had been done

Conclusion: Visualization researchers are gaining a good understanding of what users need.

Page 13: Developing Qualitative Metrics for Visual Analytic Environments

Overall Conclusions

- Reviewers have the appropriate material to assess the submissions, assuming the material is clear and understandable
- Analysts and visualization researchers provide excellent comments on aspects of the system (analytic process, visualizations, and interactions), although not necessarily classified correctly
- There is little or no difference between comments provided by analysts and those provided by visualization researchers