TRANSCRIPT
Visualization and Interaction for Business and Entertainment
MSR UW Workshop
2007
HCI and SWE: Tool Support for Crazed Developers
Mary Czerwinski, Research Area Manager, Human-Centered Computing
Manager, VIBE, Microsoft Research
Overview: HCI and Our Research Efforts in the SWE Domain
• Why UCD? Background on psychology
  – Learning, memory, and perception
  – Traditional view of HCI
• Methodologies and when they are useful
• Information worker and developer productivity and group awareness
• Future directions
Why user-centered design?
• Cost savings (well documented; see Nielsen, 1993)
  – Not always directly visible (support calls, resales, product returns, distributed productivity benefits to users, SW development costs)
• Competitive market--user expectations
• Political demands
• Help might not help
What is Usable SW?
Useful - Does it do what is needed? (teach, find, manage $, communicate, share, escape)
Usable - Is it easy to learn? Is it efficient to use? Do few or no errors occur? Is it easy to remember?
Desirable - Is it fun to use? Do you want to keep using it again and again?
What Can Research Tell us about Making Usable Software?
• Psychological research on human cognitive abilities:
  – Attention; visual perception
  – Memory; learning
• Research on human-computer interaction
  – Applied research; task-oriented studies
  – Heuristics for software design
Basic Cognitive Principles: Memory
• Associations are built by repetition
• Scaffold model - more likely to remember items that have many associations
• Recognition is easier than recall
• Working memory has small capacity (time & size)
• Long-term memory has large capacity (time & size)
Basic Cognitive Principles: Attention
• Attention is a resource - gets divided between the different senses, different tasks
• Automatic, well-learned processes don’t require much attention so we can concentrate on new items
• Good design can
  – Provide information where it is needed
  – Make the observer focus on one part of the display
  – Prime an observer so they're biased toward what you want them to see
Basic Cognitive Principles: Visual Perception
• We excel at pattern recognition
• We automatically try to organize visual displays and look for cues about what the organization should be - gestalt principles
• Motion, grouping, contrast, color can make different parts of a display more or less salient
Basic Cognitive Principles: Memory, Attention, and Visual Perception Interact
[Diagram: MEMORY, ATTENTION, and PERCEPTION interacting in a loop - "What is this feature? Does it match the task?", recognition pulling information from memory, and feedback.]
HCI: Some Important Facts about Human Learning
• Learning is improved by organization– Also, grouping and levels of processing
• Consistency and mnemonics improve learning
• Targeted feedback facilitates learning
• Learning occurs across people and organizations
HCI: Human Learning Facts continued…
• Learning proceeds faster and more effectively when info is presented incrementally
• Some users like to explore systems to learn; others will not
• Workers focus on accomplishing tasks, not learning software
What Can Research Tell us about Making Usable Software?
• Research on human-computer interaction
  – Applied research; task-oriented lab studies
  – Heuristics for software design
  – In situ studies
  – Logging
  – Surveys
Usability in your product cycle: the earlier the better!
Planning
• Establish usability goals
• Field research--tasks
• Cognitive modeling
• Competitive testing
• Participatory design
• UI design guidelines
• Applied research
• PSS communication
• Roundtables
• Low fidelity prototyping
• Focus groups
• Surveys
Development
• Iterative test and design
• Heuristic evaluation
• Spec reviews
• Low/hi fidelity prototyping
Quality Assurance
• Competitive testing
• Field testing
• PSS communication
Toward user-centered design…early stages of cycle
• Modeling customers' activities (even mental ones)
  – Understand activities, then create a solution
  – GOMS-style models (a worked sketch follows below)
  – A way to share information as a team
• Generate multiple solutions
• Develop usability goals
  – Measuring against clear, quantifiable goals
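To make the GOMS idea concrete, here is a minimal Keystroke-Level Model sketch in Python. The operator times are the standard Card, Moran & Newell values; the example task sequence is hypothetical, not from the talk.

```python
# Minimal Keystroke-Level Model (KLM) sketch: predicted expert task time is
# the sum of primitive operator times (standard values from Card, Moran &
# Newell). The example task sequence below is hypothetical.
OPERATOR_SECONDS = {
    "K": 0.28,  # keystroke (average-skill typist)
    "P": 1.10,  # point with the mouse at a target
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def predict_time(operators: str) -> float:
    """Sum operator times for a sequence such as 'HMPK'."""
    return sum(OPERATOR_SECONDS[op] for op in operators)

# Hypothetical task: home to mouse, think, point at a menu item, click,
# think again, then type a 5-character name.
print(predict_time("HMPK" + "M" + "KKKKK"))  # ~5.88 s
```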
Usability metrics--data
♦Collecting data: video, protocols, subjective ratings and objective observations; debrief
♦Averages: times, % error time, # of trials before success, # of experimenter interventions, subjective ratings, # of task interrupts, % completed
♦Usability issues with # of Ss
♦Look for patterns and lines of converging evidence
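As a concrete illustration of turning session records into these averages, here is a small Python sketch; the trial fields and numbers are made up, not data from any study mentioned here.

```python
# Hypothetical per-trial records from a lab session; field names and values
# are illustrative only.
trials = [
    {"time_s": 95,  "errors": 1, "completed": True,  "rating": 4},
    {"time_s": 140, "errors": 3, "completed": False, "rating": 2},
    {"time_s": 80,  "errors": 0, "completed": True,  "rating": 5},
]

n = len(trials)
mean_time = sum(t["time_s"] for t in trials) / n
pct_done = 100 * sum(t["completed"] for t in trials) / n
mean_err = sum(t["errors"] for t in trials) / n
mean_rating = sum(t["rating"] for t in trials) / n
print(f"mean time {mean_time:.0f} s, {pct_done:.0f}% completed, "
      f"{mean_err:.1f} errors/trial, rating {mean_rating:.1f}/5")
```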
Development stage: design, test & redesign
• Not traditional “waterfall” model
• Developing low-fi/hi-fi prototypes– Formative and heuristic evaluations first
• Test with a small number of users
  – Nielsen's famous number 6
• Redesign based on feedback
• Evaluate again
Toward beta….
♦ Identify usability “showstoppers” before ship; fit and finish (e.g., audio tweaks, aesthetics)
♦Competitive benchmarking
♦Prioritize usability enhancements for next version
♦Field research to understand real usage of products in context and usability opportunities
Ship
♦Usability issues and recommendations for v. 2.0
♦Important to mark specifics down and publish so that positives and negatives of the design solution are archived
♦Usability issues should be tracked with PSS if unresolved
Important considerations...
♦Ethical treatment of Ss, consent forms and NDAs
♦Statistical power and significance
♦Guided exploration v. free discovery, learning v. initial use
♦Validity, reliability, and generalizability
♦Objectivity
Cautions about lab testing
• Doesn’t tell you what to design--structured user visits and interviews do
• We set the tasks, the design, and the analysis
• Best case performance
• Look for patterns of behaviors--the usability issues with the UI design; not necessarily hypothesis testing (but may in competitive or complex studies)
Some Examples
• Now, some examples of how we do user-centered design for information workers (iworkers) and developers
• Work with Rob DeLine, Gina Venolia, George Robertson, Andy Begel, Kori Inkpen, and many others
iWorker Diary Study: Motivation
• Hypothesis: Current software does not support multitasking well
  – How bad/universal is the problem?
• Seek SW design ideas…
  – Research shows users developing workaround strategies
  – Interruptions research shows harmful effects of incoming notifications on the current task
  – Memory for To Dos poor, undersupported
  – Need to better understand task switching and multitasking
Method
• 10 multitasking users recruited
• An Excel spreadsheet was used as a diary "template" to be filled out each day
• Diaries emailed back to me each evening
• Participants instructed to write down every "task switch"
  – How hard to switch, # of docs required, # of interrupts experienced, task time, anything forgotten, notes, etc.
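A minimal sketch of one diary row as a record, with fields mirroring what participants were asked to log; the exact column layout of the original Excel template is not known, so these names are assumptions.

```python
from dataclasses import dataclass

# One row of the daily diary "template". Field names are guesses based on
# the fields listed above, not the original spreadsheet's columns.
@dataclass
class TaskSwitch:
    task: str
    difficulty: int       # 1 = low, 2 = medium, 3 = high
    num_docs: int
    num_interrupts: int
    minutes: float
    forgot_something: bool
    notes: str = ""

day = [
    TaskSwitch("email triage", 1, 2, 1, 25, False),
    TaskSwitch("project report", 3, 6, 3, 90, True, "waiting on data"),
]
print(sum(1 for s in day if s.forgot_something),
      "switches with something forgotten")
```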
Partial diary for MS (6 hours)
About the same time…Large Display Findings
• Started exploring how user behavior changes as displays increase in size and resolution
• Found that users were significantly more productive when performing knowledge work (multitasking, task switching) with large displays
• Less window management = less cognitive load
• But still needed help with task management
• Created robust logger to determine how windowing behavior changed with larger displays
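VibeLog itself is not reproduced here; as a stand-in, this Windows-only Python sketch polls the Win32 foreground window via ctypes and logs every focus switch - the simplest form of the windowing data such a logger collects.

```python
# Windows-only sketch in the spirit of VibeLog: poll the foreground window
# and record every focus switch with a timestamp. The real logger was far
# richer (window sizes, moves, z-order); this is a minimal stand-in.
import ctypes
import time
from datetime import datetime

user32 = ctypes.windll.user32

def foreground_title() -> str:
    hwnd = user32.GetForegroundWindow()
    buf = ctypes.create_unicode_buffer(256)
    user32.GetWindowTextW(hwnd, buf, 256)
    return buf.value

last = None
while True:
    title = foreground_title()
    if title != last:
        print(datetime.now().isoformat(timespec="seconds"), "->", title)
        last = title
    time.sleep(1.0)  # 1 Hz polling is enough for task-switch analysis
```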
Tools for Task Management
• GroupBar joins related items in the taskbar, remembers spatial layouts of tasks (Smith et al., 2003)
  – Desktop "snapshots"
  – Can "rehydrate" tasks with the press of a button
• Scalable Fabric and VibeLog (AVI 2004)
  – Over 5000 downloads of SF
  – Logging of task activity
Color Plate 1. Scalable Fabric showing the representation of three tasks as clusters of windows, and a single window being dragged from the focus area into the periphery.
Visualization and Interaction for Business and Entertainment
MSR UW Workshop
2007
Clipping Lists and Change Borders
Peripheral Information Display
Tara Matthews, Mary Czerwinski, George Robertson, and Desney Tan
Study of Proposed Solutions: Clipping Lists and Change Borders
• Compare interfaces w/ varying types of abstraction
  – All interfaces based on Scalable Fabric (SF)
• Abstraction types:
  – Change detection
  – Semantic content extraction
• 4 interfaces (2 x 2):
  – SF (baseline)
  – SF + Change Detection
  – Semantic Content Extraction (Clippings)
  – Semantic Content Extraction + Change Detection
Baseline: Scalable Fabric
• Tasks as piles
• Windows shrunken
SF Clippings
SF + Change Detection
Clippings + Change Detection
Change Borders
• Adds red borders around windows with changing content
• Border turns green when the change is complete
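The slides do not say how change detection was implemented; one plausible minimal sketch hashes periodic screenshots of a window region - a changed hash maps to the red border, and a hash that stays stable for a quiet period maps to green. The window rectangle below is hypothetical.

```python
# Hash-based change detection sketch: a changed hash means the window is
# "changing" (red border); a hash stable for a quiet period means the
# change is complete (green border).
import hashlib
import time

from PIL import ImageGrab  # pip install Pillow; Windows/macOS only

def region_hash(bbox) -> str:
    return hashlib.md5(ImageGrab.grab(bbox=bbox).tobytes()).hexdigest()

bbox = (0, 0, 400, 300)  # hypothetical window rectangle
prev, quiet_since = region_hash(bbox), time.time()
for _ in range(30):      # poll every 2 s for one minute
    time.sleep(2)
    cur = region_hash(bbox)
    if cur != prev:
        prev, quiet_since = cur, time.time()
        print("content changing -> red border")
    elif time.time() - quiet_since > 6:
        print("stable for 6 s -> green border")
```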
Clipping Lists
• Extracts window content
• Two ways to select content
  – Default: title bar
  – User-defined: WinCuts
  – Future: AI
• Goal of selection:
  – Help w/ recognition, resumption timing, and flow
Clipping Lists + Change Borders
• Extracts window content
• Adds green highlight to task boundary & windows that have changed
Study Results
• Semantic content extraction (Clipping Lists)
  – Is more effective than both change detection and scaling
  – Significantly benefits:
    • Task flow
    • Resumption timing
    • Reacquisition
[Chart: average task times in seconds (y-axis 540-700 s) for the four conditions: SF, SF + Change, Clippings, Clippings + Change.]
[Chart: average time to resume the quiz in seconds (y-axis 0-90 s) for the four conditions: SF, SF + Change, Clippings, Clippings + Change.]
Programmer Productivity: Team Tracks w/Rob Deline et al.
• We have observed devs struggling with unfamiliar code– Inefficient navigation to find task-relevant code– Misleading results of text searches– Disorientation from too much navigation, too many
open files, interruptions– [DeLine, Khella, Czerwinski, Robertson SoftVis ’05],
[Ko, Aung, Myers ICSE ’05]
• Team Tracks guides code exploration
  – Records the team's code navigation during development
  – Mines that data to prune the working set and guide navigation (a minimal sketch follows below)
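A minimal sketch of the frequency-based mining step, with an invented event format: count the team's recorded visits per code item and keep only the most-visited items as the pruned working set. This illustrates the idea, not Team Tracks' actual algorithm.

```python
from collections import Counter

# Hypothetical navigation log: (developer, code item) pairs recorded
# during development.
nav_log = [
    ("alice", "Parser.Parse"), ("alice", "Lexer.Next"),
    ("bob", "Parser.Parse"), ("bob", "Parser.Error"),
    ("bob", "Parser.Parse"), ("alice", "Parser.Error"),
]

visits = Counter(item for _, item in nav_log)

def pruned_working_set(min_visits: int = 2):
    """Keep only items the team visited at least min_visits times."""
    return [item for item, n in visits.most_common() if n >= min_visits]

print(pruned_working_set())  # ['Parser.Parse', 'Parser.Error']
```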
Evaluating Team Tracks
• Study 1: Does nav frequency indicate importance?
  – Setup: Four programming tasks, then ratings questionnaire and quiz
  – Dependent measures: code paths, task completion, ratings, quiz scores
  – Hypothesis: Navigation frequency correlates with importance rating [reported at SoftVis '05]
• Study 2: Does Team Tracks improve productivity?
  – Use Team Tracks with Group 1's navigation data
  – Same setup and dependent measures
  – Hypothesis: Team Tracks improves task completions and quiz scores
Navigation frequency does correlate with importance ratings
• Pearson product-moment correlation, r = 0.79, p < 0.01
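For reference, here is the same statistic computed with scipy on made-up data; the arrays below are illustrative, not the study's data.

```python
# Pearson product-moment correlation between per-item navigation frequency
# and rated importance (illustrative numbers only).
from scipy.stats import pearsonr

nav_freq = [12, 3, 7, 1, 9, 15]   # visits per code item
importance = [5, 2, 4, 1, 4, 5]   # participant ratings, 1-5
r, p = pearsonr(nav_freq, importance)
print(f"r = {r:.2f}, p = {p:.3f}")
```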
Team Tracks does improve task completion rates and quiz scores
• Improved task completion rates
  – All completed tasks 1 and 2
  – Task 3 (localized code): 1/7 without, 3/9 with Team Tracks
  – Task 4 (dispersed code): 1/7 without, 7/9 with Team Tracks
• Group 2 quiz scores significantly higher, t(16) = -2.04, p < .03
• Next steps
  – IE 8.0 team deployment ethnography next
  – Added annotations and other features
Visualization and Interaction for Business and Entertainment
MSR UW Workshop
2007
Code Thumbnails: Using Spatial Memory to Navigate Source Code (with larger displays)
DeLine, Czerwinski, Meyers, Venolia, Drucker, Robertson ▪ VL/HCC 06
Code navigation is a problem
• Recent studies have documented the problem
  – Ko, Aung, and Myers 2005: 35% of developer task time is navigation
  – DeLine, Khella, Czerwinski, Robertson 2005 report disorientation
• Current navigation UI relies on remembering symbols
  – Most common are text search, symbol search, file boxes, project tree view, class tree view
• Could developers use their spatial memory instead?
Code Thumbnails is designed to leverage spatial memory
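One way such a thumbnail might be rendered (the actual Code Thumbnails rendering is not reproduced here): one pixel row per source line, preserving indentation and line length so each file keeps a stable visual shape that spatial memory can latch onto. The file path in the usage line is hypothetical.

```python
# Sketch of rendering a code thumbnail: one pixel row per source line, with
# indentation and line length preserved so the file keeps a recognizable
# visual shape. Colors and scale are guesses, not the shipped rendering.
from PIL import Image  # pip install Pillow

def thumbnail(source: str, width: int = 120) -> Image.Image:
    lines = source.splitlines() or [""]
    img = Image.new("RGB", (width, len(lines)), "white")
    px = img.load()
    for y, line in enumerate(lines):
        indent = len(line) - len(line.lstrip())
        for x in range(indent, min(len(line), width)):
            px[x, y] = (60, 60, 60)  # draw the text body as dark pixels
    return img

# Hypothetical usage on some source file:
with open("Program.cs", encoding="utf-8", errors="ignore") as f:
    thumbnail(f.read()).save("thumb.png")
```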
Formative Evaluation
• Areas for feedback
  – Do developers like Code Thumbnails?
  – Do developers find CT useful for navigation?
  – Do developers form a spatial memory of the CT visualizations?
• Participants
  – 11 developers (10 external, 1 MS), average 15 years of experience
Task structure
• Two-hour sessions
  – Introduction to Code Thumbnails (10 min)
  – Three programming tasks on 3000 KLOC C# code (75 min)
  – Targeted search (10 min)
  – Spatial memory quiz (10 min)
  – Survey and feedback (15 min)
• IDE operations logged for 5 participants
  – Includes both CT and standard features
  – Collected during programming tasks and targeted search
High survey marks
[Chart: average responses on a 1-5 unfavorable-to-favorable scale for learnability, ease of use, preference, satisfaction, global navigation, utility, divided attention, local navigation, and lack of frustration.]
Frequent use during programming tasks
[Chart: percent of actions per participant (1-5), broken down by operation: click symbol search result, Solution Explorer, Go To Definition, click text search result, CTD double-click, CTD thumbnail click, CTD title click, CTS scrollbar scroll, CTS thumbnail click.]
Frequent use during targeted search
• Find fifteen targets, using any feature
  – Find five files by name
  – Find five methods by name
  – Find five methods by functional description
• Often, multiple operations were used per search trial
  – e.g., CT Desktop to select a file, then scrolling within the file
• CT Desktop used in more trials than other operations
  – CT Desktop used in 64% of trials
  – Text search used in 16%
  – CT Scrollbar used in 11%
  – Solution Explorer used in 8%
Spatial memory quiz
• File searches significantly slower than method searches (just Fitts's law; see the worked example below)
• File searches significantly slower without thumbnails
• Method searches not significantly slower without thumbnails (always fast)
• Frequently accessed files had smaller first-click distance (368 pixels vs. 511)
[Chart: quiz times across four blocks - 5 files by name, 5 methods by name.]
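The Fitts's law point can be made concrete: pointing time grows with the index of difficulty log2(D/W + 1), so the smaller 368 px first-click distances predict faster target acquisition than 511 px. The constants a and b below are illustrative, not fitted to this study's data.

```python
import math

# Fitts's law (Shannon formulation): MT = a + b * log2(D/W + 1).
# a, b, and the 30 px target width are illustrative values.
def movement_time(distance_px: float, width_px: float,
                  a: float = 0.1, b: float = 0.15) -> float:
    return a + b * math.log2(distance_px / width_px + 1)

# Reported first-click distances: 368 px (frequent files) vs. 511 px.
for d in (368, 511):
    print(d, "px ->", round(movement_time(d, width_px=30), 2), "s")
```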
Related work
• Seesoft– Used code thumbnails to show statistics per line
• Eclipse scrollbar– Shows errors and file result as tick marks
• Aspect Browser– Shows search results in Seesoft style to help find aspects
• Data Mountain– Replacement for web Favorites leveraging spatial memory
Visualization and Interaction for Business and Entertainment
MSR UW Workshop
2007
FASTDash: A Visual Dashboard for Fostering Awareness in Software Teams
Jacob T. Biehl*, Mary Czerwinski, Greg Smith & George G. Robertson
*Department of Computer Science, University of Illinois
VIBE Group, Microsoft Research
Problem
• Dev coordination breakdowns are frequent and costly
  – Defects cost ~$60 billion to the US economy [NIST '02]
• Actions of team members are difficult to acquire
  – Common, unsatisfied information need [Ko et al. '07]
• Lack of techniques for gaining awareness information
Contextual Inquiry
• 90 surveys/13 structured interviews with MS developers
• Key set of detailed information needed– What source files are team members working in?– How are those files being used?– Are the files changing? If so, what parts?– Am I affected by the changes?
• Scattered resources used– Source code– Emails/IMs– Diagrams/notes on whiteboard– Bug DBs, check-in logs, status reports
• Frequently changing information
FASTDash
• Map information need onto a visualization
• Combines multiple sources of activity information
  – Source repository actions (e.g., check-ins, check-outs, conflicts)
  – Active file actions (e.g., open files, changing files, edit/debug state, etc.)
  – Project-related comments/notes (e.g., status, assistance messages)
• Designed to be a persistent visualization
• Targeted for project groups of 2-8 programmers
System Design
• Works automatically alongside existing tools
• Source repository independent
  – Works with SourceDepot and Team Foundation Server
  – Extendable to others (e.g., CVS or SVN)
• Utilizes IDE plug-in capabilities
  – Currently implemented for Visual Studio
  – Other IDE plug-ins could be easily integrated (e.g., Eclipse)
• SQL database to centrally manage information
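A sketch of that central store, with sqlite3 standing in for the real SQL back end and a guessed-at schema for the kind of activity rows the slides describe; each IDE plug-in would push events here and the dashboard would poll them.

```python
import sqlite3
from datetime import datetime, timezone

# Central activity store sketch. FASTDash used a SQL database; sqlite3 is a
# stand-in here, and the schema is an assumption based on the slide above.
db = sqlite3.connect("fastdash.db")
db.execute("""CREATE TABLE IF NOT EXISTS activity (
    ts TEXT, developer TEXT, source_file TEXT,
    action TEXT,   -- e.g. 'checkout', 'checkin', 'open', 'edit', 'debug'
    note TEXT)""")

def report(developer, source_file, action, note=""):
    """Called by an IDE plug-in whenever a tracked action occurs."""
    db.execute("INSERT INTO activity VALUES (?,?,?,?,?)",
               (datetime.now(timezone.utc).isoformat(),
                developer, source_file, action, note))
    db.commit()

report("alice", "Parser.cs", "checkout", "fixing bug 1234")
# Dashboard side: who currently has files checked out?
for row in db.execute(
        "SELECT developer, source_file FROM activity WHERE action='checkout'"):
    print(row)
```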
Study
• Evaluate impact on programmer awareness and overall behavior
• Observation-based field study
  – Provide semi-longitudinal exposure
  – Actual projects/workspace
  – Impact on use of existing practices/tools
• 6 experienced programmers
  – µ = 12.3 years as professional developers
Methodology
• Coding scheme
  – Influenced by existing coding schemes
  – 5 coding categories (further details in paper)
    • Communication, shared display use, shared physical artifact use, collaboration type, collaboration configuration
  – Can be leveraged/applied in future studies
• Pre/post design– 2 days pre-visualization– 2 days with visualization
• Other measures include situation awareness ratings and pre/post questionnaires
Workspace
Results
• General increase in project-related communication
  – "Why are you editing that file? It's not part of what you are working on."
  – "You can't leave yet, you have files still checked out and we need to run the build tonight."
Results
• Reduction in use of physical artifacts
• Trend toward improved situational awareness– Division of attention ratings reduced by 30%– Instability of situation ratings reduced by 30%
Use & Feedback
• Enabled global view of project activity
  – "makes it easier to verify if no item has been checked out… before making the build for the final release"
• Provided instant reflection of project state
  – "I liked the real time info…and to know what [fellow] developers are working on"
  – "The visualization of possible conflicts was useful"
• Increased utility of information through contextual notes/comments
  – "We usually make comments… but those 'verbal' comments are often lost. Placing [flags] with the comment on the context where it applies is cool"
• Voluntary continued use
FASTDash Future Work
• Better understand which features of FASTDash were most/least useful
• Evaluate the long-term efficacy and impact of FASTDash
• Extend the visualization to support other information and iworker workgroups
• Explore augmenting the visualization to address issues of scale and artifact importance
Conclusion
• Basic psychology can be used to– Derive HCI principles– Offer methodologies for HCI
• More opportunities for UCD early in the product life cycle; look for converging lines of evidence
  – Different methods appropriate at different times
• Can’t know what to design without understanding current practice or what’s wrong with current designs
Visualization and Interaction for Business and Entertainment
MSR UW Workshop
2007
Thank you for your attention!
http://research.microsoft.com/research/vibe
Task Frequencies Breakdown
[Pie chart: frequency of task type - Routine Task 27%, Email 23%, Project 18%, Task Tracking 13%, Telephone Call 8%, Meeting 6%, Personal 5%, Downtime 0%.]
[Annotations: "Indicative of difficulty tracking tasks"; "'Returned to' tasks from this group."]
Frequency of Task Shift Initiators
[Pie chart: frequency of switch causes - Self-Initiated 40%, Next Task 19%, Telephone Call 14%, Appointment 9%, Return to Task 7%, Email 3%, New Information Request 3%, Deadline 2%, Other Person 1%, Emergency 1%, App Prompt 1%.]
Difficulty Switching by Type
[Chart: rated difficulty switching to task (1 = low, 2 = med, 3 = high) by task type, comparing other tasks vs. returned-to tasks.]
Task Length by Type
[Chart: average task duration in minutes (0-160) by task type, comparing other tasks vs. returned-to tasks.]
Document Requirements by Task Type
[Chart: average # of documents (0-3) by task type, comparing other tasks vs. returned-to tasks.]
Interruptions by Task Type
[Chart: average number of interruptions (0-2) by task type, comparing other tasks vs. returned-to tasks.]
Focus on Returned to Tasks
• Elapsed time spanned hours to days
• Maintaining desktop state isn't always the answer
  – Often, users said they were waiting on info from other people or places (web, server); prospective reminders needed here
  – Info came in via phone, email, web, or personal contacts (better app integration needed here)
  – But reminding about task context and info assembly/layout was a key problem identified
General Design Ideas from Participants
• Smarter, adjustable To Do list tracking & alarming
  – In the projects versus just in Calendar
  – Consider sticky notes for partial/future tasks
• Auto-categorization of email and files
• Better reminders for things forgotten
  – Track events we know about and visualize them, or rely on manual user tagging
• Better user adaptivity
  – e.g., knowing what kinds of paste operations a user typically performs and automating them
Findings
• During a given week, knowledge workers (KWs) task shift an awful lot (avg. 10 task shifts a day)
• Long-term projects are more complex shifts
  – Lengthier (11.25% of the week), more documents, interrupts, "returns"
  – Rated significantly harder to return to
• Passage of time also takes its toll
• What designs will help?