supporting research collaboration through bi-level file synchronization
Cathy Marshall, Ted Wobber, Rama Ramasubramanian, and Doug Terry
MSR-SVC, Mountain View, CA
GROUP 2012 October 30, 2012
Sanibel Island, Florida
this is a story about developing & using a custom sync app in the wild, among a
population of CS researchers...
“I commute by train from San Francisco. So when I'm writing a paper, I often edit text on the train on my laptop. And to a lesser extent, at home. But when I'm in the office, I'd be using my desktop. And I just copy things manually back and forth between them…” –a researcher in the lab (interviewed 9/6/2007)
file sync may be used in 3 ways in paper writing
• sharing files with collaborators
• working from multiple devices, each of which may have intermittent connectivity
• backing up active files
using sync to share files is different from using sync to work from multiple devices (or as backup)
Sync is important AND harder than it looks
Getting sync right is the essence of everything. If you don’t, everything else fails.
-- Ray Ozzie in Wired, 11/2008
“Automatic file synchronization has the potential to simplify managing information and roles across devices, but only if users are willing to [use] it. Our findings suggest that people do not trust automatic file synchronization, even though they employ automatic synchronization for other types of information: music, email messages, contact information, calendar data, and task lists.”
-- Dearman and Pierce, Proc. CHI 2008
meanwhile, my CS collaborators had a (systems) agenda...
[Diagram: two replica stacks; labels: Application, Update API, Item Store, Comm., Sync Utilities, Security]
Cimbiosys sync protocol:
• eventual consistency
• flexible network topology
• efficient reconciliation of (filtered) sync metadata
from a slide titled “evaluation” in our NSDI 2008 Cimbiosys talk
Evaluation is different in the systems community
soliciting use
giving a short talk in the lab: “so… try Cimetric next time you write a paper/share files”
• easy to install (all you need is .NET framework)
• work like you normally would (e.g. editor-agnostic)
• easy to use
• cheerful user support
• easy to bail out if you don’t like it.
meanwhile, other sync services are gradually becoming available.
Windows Live Mesh
Cimetric still has some crucial distinctions that meet the lab’s needs
translating sync into Cimetric
• Working files are folders that you want replicated among your computers. The files don’t all need to be shared with your colleagues; work is seamless among devices.
• Repositories contain files you share with your collaborators. Repositories have every version you or your colleagues have shared.
• Check in/check out allows you to move working files to your copy of the repository and vice versa. You can choose when to incorporate other people’s changes and when to share yours; locking is advisory.
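The three terms above can be sketched as a toy model. This is a minimal illustration, not Cimetric's implementation: the class names `Repository` and `WorkingFiles`, their fields, and the warning text are all hypothetical, and the synchronization of replicas across computers is left out entirely. It only shows that a repository keeps every shared version, that check in and check out happen when the user chooses, and that an advisory lock warns rather than blocks.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical stand-in for one replica of a shared repository."""
    versions: dict[str, list[str]] = field(default_factory=dict)  # name -> every shared version
    locks: dict[str, str] = field(default_factory=dict)           # name -> advisory lock holder

    def latest(self, name: str) -> str:
        return self.versions[name][-1]


@dataclass
class WorkingFiles:
    """Hypothetical stand-in for one person's private working folder."""
    owner: str
    files: dict[str, str] = field(default_factory=dict)

    def check_out(self, repo: Repository, name: str) -> list[str]:
        """Copy the latest shared version into the working folder.

        Locking is advisory: a lock held by someone else produces a
        warning, not a refusal.
        """
        warnings = []
        holder = repo.locks.get(name)
        if holder not in (None, self.owner):
            warnings.append(f"{name} is locked by {holder}")
        repo.locks.setdefault(name, self.owner)
        self.files[name] = repo.latest(name)
        return warnings

    def check_in(self, repo: Repository, name: str) -> None:
        """Append the working copy as a new version; older versions are kept."""
        repo.versions.setdefault(name, []).append(self.files[name])
        if repo.locks.get(name) == self.owner:
            del repo.locks[name]


repo = Repository()
cathy = WorkingFiles("cathy", {"paper.tex": "draft 1"})
ted = WorkingFiles("ted")

cathy.check_in(repo, "paper.tex")                    # Cathy chooses when to share
ted.check_out(repo, "paper.tex")                     # Ted chooses when to incorporate
cathy_warnings = cathy.check_out(repo, "paper.tex")  # allowed despite Ted's lock
ted.files["paper.tex"] = "draft 2"
ted.check_in(repo, "paper.tex")
```

After this run the repository holds both drafts, and Cathy's second check-out succeeded with a warning about Ted's lock rather than failing.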
[Diagram: Cathy’s files and Ted’s files on several computers (work desktop, home, laptop), each with a repository and working files. Labeled operations: check out/check in, synchronize repository, synchronize working files]
original Cimetric design (pre-adoption)
lesson 1: no cloud = no users
• MSFT IT enforced power-saving settings
– often no viable path among peers
– e.g. idle computers powered themselves down
• most researchers worked with colleagues outside the firewall
• most researchers used at least some computers—e.g. home computers—that were outside the firewall
[Diagram: Cathy’s files and Ted’s files on several computers (work desktop, home, laptop), each with a repository and working files, plus one additional repository. Labeled operations: check out/check in, synchronize repository, synchronize working files]
revised Cimetric design
who used Cimetric?
ID    # of people   project                 external collaborator?   cloud
UC1   4             sharing project files   no                       no
UC2   2             writing a paper         no                       no
UC3   2             sharing project files   yes                      yes
UC4   2             writing a paper         yes                      yes
UC5   4             writing a paper         no                       yes
UC6   2             writing a paper         no                       yes
UC7   1             developing algorithms   no                       no
UC8   3             writing a paper         yes                      yes
UC9   2             writing a paper         yes                      yes
9 small groups, mostly writing papers, some with external collaborators.
what’s the worst that could happen?
the danger of infrastructure interventions
what do we want to learn?
• What is the role of the cloud?
• Is a bi-level design effective?
• Which aspects of sync should we reveal (and allow people to control)?
the role of the cloud
• cloud-optional combines best features of cloud-based sync with peer-to-peer
– peer-to-peer is faster when it’s possible
– cloud-optional design addresses security concerns with storing files on external cloud
– cloud-optional design minimizes storage and transaction costs (and gets around things like quotas)
• but working files (not just repository) need to be able to sync with the cloud
• Cimetric needed to handle on-the-fly configuration changes (e.g. adding a cloud replica mid-stream)
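One way to picture "cloud-optional" is as a partner-selection rule over a replica list that can change while the system is running. The sketch below is an assumption-laden illustration, not Cimetric's actual logic: the `Replica` type, the `choose_sync_partner` function, and the strict peer-before-cloud ordering are hypothetical, chosen only to mirror the point that peer-to-peer is faster when it is possible and that a cloud replica may be added mid-stream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    name: str
    kind: str        # "peer" or "cloud"
    reachable: bool  # a real system would probe this rather than store it


def choose_sync_partner(replicas):
    """Prefer a reachable peer; otherwise use a reachable cloud replica; else None."""
    for kind in ("peer", "cloud"):
        for replica in replicas:
            if replica.kind == kind and replica.reachable:
                return replica
    return None


# The only peer has powered itself down, so there is no viable path.
replicas = [Replica("work-desktop", "peer", reachable=False)]
before = choose_sync_partner(replicas)

# A cloud replica is added mid-stream; sync can proceed through it.
replicas.append(Replica("cloud-store", "cloud", reachable=True))
after = choose_sync_partner(replicas)

# Once a peer is reachable again, it is preferred over the cloud.
replicas.append(Replica("laptop", "peer", reachable=True))
preferred = choose_sync_partner(replicas)
```

Because the replica list is re-read on every call, a configuration change takes effect at the next sync attempt without any restart.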
bi-level sync
• supports practice
• but...
– difficult to discover
– difficult to configure
– again, configurations change on the fly
• workarounds attest to potential need (e.g. UC9’s use of Cimetric to sync repository + Dropbox to sync working files)
revealing sync processes (part 1)
revealing sync history: which computer did this file come from?
controlling sync: who initiates, what’s the interval, which peers?
revealing sync processes (part 2)
• over the long haul, it’s hard to tell inactivity from malfunction (e.g. is cloud connection okay?)
• sometimes there’s no good way to get at the things users want to know the most (e.g. what to expect from initial sync)
• sadly, people are most interested in sync when something breaks
conclusions, speculations, + future work
Conclusions:
• differences between sync-to-share and sync-for-ubiquitous-access need further (design) thought!
• there’s a role for peer-to-peer sync and cloud-mediated sync
Speculations:
• this is a specialized application... UX/architecture might not be understood or configured by ‘civilians’
• something similar to this app could be built using existing sync services; someone should try it
“Future” work:
• see Marshall & Tang from DIS 2012: users form conceptual models of sync that are incomplete and sometimes wrong.
Q: Is user-facing file sync worth it?
contact info:
http://research.microsoft.com/~cathymar
http://www.csdl.tamu.edu/~marshall
twitter: @ccmarshall
my collaborators: {terry,wobber,rama}@microsoft.com