Situated evaluation of visual analytics systems Ann Blandford Professor of Human–Computer Interaction Director, UCL Interaction Centre With thanks to Simon Attfield, Sarah Faisal and Stephann Makri for interesting discussions and examples


Page 1:

Situated evaluation of visual analytics systems

Ann Blandford
Professor of Human–Computer Interaction

Director, UCL Interaction Centre

With thanks to Simon Attfield, Sarah Faisal and Stephann Makri for interesting discussions and examples

Page 2:

How often do you…

• Wake up thinking “I’m going to seek information today”?
  – Or “I’m going to use such-and-such a system today”?

• Go to bed thinking “I didn’t find out anything new today”?

Page 3:

Systems are there to support activities

• …but are rarely the focus of attention
• They need to support people:
  – “If the user can’t use it, it doesn’t work” (Dray)
• We need to know how well they do their job and how to improve them
• Evaluation is concerned with such questions
• Situated evaluation focuses on how a system supports people’s work in context

Page 4:

Visual Analytics is…

• …the design and evaluation of visualisation systems that support people’s work in…
  – Predicting
  – Deciding
  – Sensemaking
  – Etc.
• …with large bodies of data / information
• In other words: it’s about designing visualisation systems that support situated information interaction

Page 5:

Structure of this talk

• The “information journey”
• Different approaches to evaluating systems
• Planning evaluation studies
• A focus on user concepts: CASSM analysis

Page 6:

The information journey

Page 7:

Example from legal investigation

• Question: did fraud occur? – an information need
• Refine into sub-questions / hypotheses
• Find, validate and interpret information to answer each sub-question
• …which may generate more sub-questions
• …for which further information needs to be found
• …leading eventually to a conclusion

Page 8:

People’s attentional resources

• The way someone views a visualisation will depend on what questions are already in their mind

• A well designed visualisation will draw the user’s attention to “interesting” features of the information (e.g. patterns, anomalies…) to raise new questions

• It will also support users in interpreting information


Page 9:

A view of design

[Diagram: the design cycle linking Requirements, Design and Evaluation]

Page 10:

Approaches to evaluating systems

• There are many possible evaluation questions.
  – Some concern just the system (e.g. reliability and performance)
  – Some concern the use of the system
  – Some concern the situation in which the system is used
• Verification and validation:
  – Does the system perform as intended, and does it do what the users need?

• Situated evaluation is concerned with the question of whether the system performs as well as possible for its users in the situated context of use.

Page 11:

With and without…

• Evaluation studies can be conducted…
  – With or without a running system
    • E.g. with storyboard or specification
  – With or without users
    • E.g. expert review
  – With or without the context of use
    • In a laboratory or in the “field”

Page 12:

Examples of evaluation approaches

• Heuristic evaluation
  – Done by experts, usually with a running system, in the laboratory, evaluating a system against a checklist of desirable properties (e.g. does it provide suitable feedback?)
• Think-aloud study
  – Done with users, with a running system, usually in a laboratory, evaluating a system against defined tasks
• Contextual Inquiry
  – Done with users, with a running system, situated, with their tasks
• Conceptual Structures for Information Interaction (CSII)
  – Done with users, maybe without a running system, situated

Page 13:

Planning evaluation studies

• There are many possible evaluation questions for a VA system. E.g.:
  – What information should be displayed?
  – How should the information be laid out?
  – How can the system be made easier to use?
  – How might a system fit into the user’s work practices?

• Different questions are relevant at different stages of development.

Page 14:

The “PRET A Rapporter” checklist for planning an evaluation study

• Purpose of the evaluation
• Resources and constraints
• Ethics
• Techniques for data collection
• Analysis techniques
• Report findings

» NB: These steps are not strictly ordered!
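To make the checklist concrete, here is a minimal sketch (in Python) of how the six headings might be used as a planning record for a study of a visual analytics tool. The class and field names and the example study details are my own, invented for illustration; they are not part of the framework.

```python
# A sketch only: field names follow the slide's headings; details are invented.
from dataclasses import dataclass

@dataclass
class EvaluationPlan:
    purpose: str                          # P - what question the study addresses
    resources_and_constraints: list[str]  # R - time, people, equipment, ...
    ethics: list[str]                     # E - consent, confidentiality, ...
    data_collection: list[str]            # T - techniques for data collection
    analysis: list[str]                   # A - analysis techniques
    reporting: str                        # R - how findings will be reported

plan = EvaluationPlan(
    purpose="How well do analysts' concepts fit the visualisation's concepts?",
    resources_and_constraints=["two weeks", "four analyst participants", "running prototype"],
    ethics=["informed consent", "anonymised transcripts"],
    data_collection=["think-aloud sessions", "screen and audio recording"],
    analysis=["CSII coding of transcripts and system description"],
    reporting="short report to the development team, prioritising misfits",
)
```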

Page 15:

Purpose of the evaluation: what question is it to address?

• Accessibility
• Acceptability
• Efficiency
• Effectiveness
• Experience
• Improvement
• Other (e.g. Culture)
• Usability

• …leading to detailed questions, e.g.:

• What questions are analysts trying to answer with the system?

• What concepts are people working with when answering those questions?

• How easily do users understand the system concepts?

• What difficulties do people have understanding the system and how could it be improved?

Page 16:

Resources and constraints

• Cost
• Time
• System representations
• Participants
• Test environment and equipment
• Data capture tools
• Analysis tools
• Expertise of analysts
  – Or availability of materials to learn new technique

• Security considerations

Page 17:

Ethics

• Always consider ethical dimensions. In particular:
  – Vulnerable participants (young, old, etc.)
  – Informed consent
  – Privacy and confidentiality

Page 18:

Techniques for data collection

• Qualitative or quantitative (or both)?
• Audio, video, form-based?
• From users, systems, contexts of work?

Page 19:

Analysis of data

• Quick or detailed?
• Qualitative or quantitative, and which specific techniques:
  – Descriptive or inferential statistics?
  – Grounded theory, or which pre-existing questions?
  – Structured (e.g. CSII)?
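As a small illustration of the descriptive/inferential distinction (not an example from the talk), the sketch below uses made-up task-completion times for two interface variants; it assumes SciPy is available.

```python
from statistics import mean, stdev
from scipy import stats  # assumed to be installed

# Invented task-completion times (seconds) for two interface variants
variant_a = [41.2, 38.5, 44.0, 39.8, 42.1, 40.3]
variant_b = [47.9, 45.2, 50.1, 46.8, 48.3, 44.7]

# Descriptive statistics: summarise what was observed
print(f"A: mean={mean(variant_a):.1f}s, sd={stdev(variant_a):.1f}")
print(f"B: mean={mean(variant_b):.1f}s, sd={stdev(variant_b):.1f}")

# Inferential statistics: is the difference likely to generalise?
t, p = stats.ttest_ind(variant_a, variant_b)
print(f"independent t-test: t={t:.2f}, p={p:.3f}")
```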

Page 20:

Reporting

• Tailor your reporting to the purpose of the evaluation

• If it’s academic research, it needs to conform to appropriate standards of scientific inspectability / reproducibility.

• If it’s a report to your development team, find out what they require in terms of brevity / depth / evidence / illustration / redesign suggestions.

Page 21:

A focus on concepts: CSII (simplified from CASSM)

• What are the concepts that people are working with when making sense of a dataset?

• What are the concepts implemented in a system?
• How well do these fit each other?

Page 22:

Early example

• Ambulance control:
  – Allocators deal with incidents (although they refer to them as ‘calls’)
  – Computer system only permits processing of calls
  – Mediating paper system enables allocators to group call information by incident
  – There is a misfit between two concepts although they’re both called ‘calls’
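A small code sketch of this misfit (the record layout below is hypothetical, purely for illustration): the system holds only calls, so the grouping into incidents has to live outside it, in the mediating paper system.

```python
from dataclasses import dataclass

@dataclass
class Call:              # the only concept the computer system supports
    call_id: int
    caller: str
    details: str

# The allocator's concept is the *incident*: one road accident may generate
# several calls. The system has no incident entity, so the grouping is kept
# outside it - here a plain dict stands in for the mediating paper system.
incidents: dict[str, list[Call]] = {
    "collision on A40, 09:14": [
        Call(101, "passer-by", "two cars collided"),
        Call(102, "bus driver", "same collision, one person injured"),
    ],
}
# Misfit: user and system both say 'call', but the user-side concept
# 'incident' is simply absent from the system.
```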

Page 23:

So how are “concepts” identified?

• Answer: data!
  – Verbal user data for user concepts
    • Interviews
    • Think-aloud
    • Contextual Inquiry
    • Focus group
    • Rich descriptions of user activities + analyst expertise
  – System descriptions for interface & system concepts
    • Maybe a running system
    • Maybe a mature specification / storyboard

Page 24:

Sketchy worked example: user concepts

• Data gathered by Stephann Makri on Masters students searching for information in digital libraries to support their projects

• Participants thought aloud, with occasional interventions by Stephann to probe their understanding and intentions

• Data transcribed and analysed
• System also reviewed to compare concepts

Page 25:

Analysing transcript: the topic of interest

Participant: So now I’d go the ACM because it’s an HCI topic and I think anything I find on there on my topic will be more highly related on there than if I used anything else. So I’ll type in ‘pattern languages’ again . . . [Loads ACM Digital Library homepage and conducts search].

Interviewer: Why did you type in ‘pattern languages’?

Participant: Umm . . . Well I probably should have typed in ‘language patterns’ so it’s exactly like before [with Google], but I forgot. I know the topic I want is either ‘language patterns’ or ‘pattern languages’ and I can’t remember what I searched on Google [which search terms were used]. So I’ll just type it into Google again to see if I’ve got the terms round the right way. Yeah, it is ‘pattern languages’ . . . [Begins reading article titles out loud]. I’m reading the titles to see if there’s anything. [Pauses for a second]. So this is the guy [author] that wrote the article I read yesterday, so that’s interesting. [Reads abstract out loud (at speed)]. And perhaps this one? Yeah, I’ll probably have a look at these two.

Interviewer: How did you decide that those two might be interesting?

Participant: Reading the titles, just to get a quick overview. For example this title that I read just seemed too technical. The second one, ‘a debate on language and tool support for design patterns’ . . . that didn’t seem that relevant either because I don’t think it’s about pattern languages, whereas this one talks about pattern languages specifically in the title.

Page 26:

Some user concepts from the transcript

• Information resource (e.g. ACM or Google)
  – Subject areas covered
• Topic
  – Search terms to describe topic
  – Of interest (or not)
• Article
  – Title
  – Author
  – Topic (overview)
  – Relevance to self
• Self
  – Search history
  – Interests
  – Subject area
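One way to write these coded user concepts down as entities with attributes, ready for comparison with the system side, is sketched below. The notation is my own shorthand, not something CSII prescribes.

```python
# User concepts coded from the transcript, as entity -> attributes
user_concepts = {
    "information resource": ["subject areas covered"],  # e.g. ACM, Google
    "topic": ["search terms to describe topic", "of interest (or not)"],
    "article": ["title", "author", "topic (overview)", "relevance to self"],
    "self": ["search history", "interests", "subject area"],
}
```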

Page 27:

Comment on qualitative analysis

• Qualitative analysis isn’t as easy as it might at first appear.
• Interviewees don’t give you the data ‘on a plate’: you have to “listen” to the data: probe it to really understand what people are talking about.
• Entities and attributes are often about nouns (that’s a good heuristic), but the nouns may go unspoken. Or the same word might mean different things in different contexts. Or different words might mean the same thing…
• For CSII, you’ll often abstract away from the domain details to something more general.

Page 28:

The device perspective

• The focus should be on domain-relevant concepts, not interface widgets, except insofar as the user has to interact with widgets to manipulate domain concepts
  – E.g. the check boxes that determine what is displayed would probably be considered…
  – …but the scroll bars would not

Page 29:

Interface and system

• In many situations, the interface and underlying system representations are the same.

• Sometimes, people can infer things from the interface that are not represented in the underlying system
  – E.g. clusters or patterns in data may be visually apparent
• Less commonly, there are important features of the underlying system that are not represented at the interface
  – E.g. the “layers” in most drawing programs

• For CSII, interface and system are merged, though you should note points where this merging hides important features of the device

Page 30:

Identifying device concepts

• Aim to work at the same level of abstraction as the user concepts
• Consider interface and underlying system
  – Only differentiate between these when it matters
• It’s often an iterative process
• It can be useful to distinguish between entities and attributes
  – CSII is intentionally sketchier than E-R modelling to avoid “death by detail”
• Later in analysis, it can be useful to consider how easy or difficult it is to make important changes to the system state (i.e. through actions)

Page 31:

Summary: progressive deepening of analysis

1. Identify core concepts on user and system sides.

2. Refine description
   • User and system concepts.
   • Present, absent or difficult?
   • Entities or attributes?
   • How easy to work with?
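A minimal sketch of the “present, absent or difficult?” check follows, assuming the user-side and system-side concepts have already been listed as in the earlier sketches. Only presence/absence is mechanical; judging whether a present concept is difficult to work with remains the analyst’s call.

```python
def classify_fit(user_concepts: dict[str, list[str]],
                 system_concepts: dict[str, list[str]]) -> dict[str, str]:
    """Label each user entity as present, absent, or partially supported."""
    fit = {}
    for entity, attributes in user_concepts.items():
        if entity not in system_concepts:
            fit[entity] = "absent"
            continue
        missing = [a for a in attributes if a not in system_concepts[entity]]
        fit[entity] = "present" if not missing else f"partial (missing: {', '.join(missing)})"
    return fit

# Toy example using the ambulance-control concepts from earlier
user = {"incident": ["location", "calls received"], "call": ["caller", "details"]}
system = {"call": ["caller", "details"]}
print(classify_fit(user, system))  # {'incident': 'absent', 'call': 'present'}
```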

Page 32:

So CSII…

• Focuses on the core concepts that define a domain

• Goes into no more detail than necessary
• Is a description that focuses on particular aspects, leading to identification of ‘misfits’ between user and system

• Takes the user’s perspective.

Page 33:

Core concepts of CSII

• Entities: concepts the user is working with that can be created or deleted, or that have attributes.

• Attributes: properties of entities.
• Actions: change the state of an entity or attribute.

Page 34:

Example

• Entity:
  – Author
  – Paper
  – Year
  – Idea*
• Attribute of author (domain object):
  – Co-authored paper
  – Co-authored with…
• Action:
  – Focus attention on particular paper / author
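The toy sketch below writes this example out using the three CSII core concepts. The class names and example values are mine, chosen for illustration; CSII prescribes no particular notation.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    attributes: list[str] = field(default_factory=list)

@dataclass
class Action:
    name: str
    affects: str  # which entity or attribute the action changes

# User-side concepts from the slide
author = Entity("author", ["co-authored paper", "co-authored with…"])
paper = Entity("paper")
year = Entity("year")
idea = Entity("idea")  # marked with an asterisk on the slide

focus = Action("focus attention on a particular paper / author",
               affects="current selection at the interface")
```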

Page 35:

Notes

• In many situations, the activity of constructing a model provides more insights than inspecting the resulting model.

• While doing the analysis, keep a note of any usability issues you notice along the way.

Page 36:

Summary

• Visual Analytics systems need to be evaluated from many different perspectives if they are to be usable, useful and used.

• Situated evaluation involves working with the intended users of a system to understand their practices and needs.

• In this talk, we have considered how to plan a study and how to assess a system in terms of conceptual fit.

• A longer session would have considered process, interactivity, affordances, and other contextual factors.

Page 37:

References

• CSII is a simplified version of CASSM:
  – Blandford, A., Green, T., Furniss, D. & Makri, S. (2008). Evaluating system utility and conceptual fit using CASSM. International Journal of Human-Computer Studies, 66, 393-409. ISSN 1071-5819.
• PRET A Rapporter is described in the context of evaluating digital libraries:
  – Blandford, A., Adams, A., Attfield, S., Buchanan, G., Gow, J., Makri, S., Rimmer, J. & Warwick, C. (2008). The PRET A Rapporter framework: Evaluating digital libraries from the perspective of information work. Information Processing and Management, 44(1), 4-21. ISSN 0306-4573.
• Both are presented briefly in:
  – Blandford, A. & Attfield, S. (2010). Interacting with Information. Synthesis Lectures series. Morgan & Claypool.

Page 38:

Thank you!

www.uclic.ucl.ac.uk/annb/