supporting research collaboration through bi-level file synchronization

Cathy Marshall, Ted Wobber, Rama Ramasubramanian, and Doug Terry

MSR-SVC, Mountain View, CA

GROUP 2012 October 30, 2012

Sanibel Island, Florida

this is a story about developing & using a custom sync app in the wild, among a population of CS researchers...

“I commute by train from San Francisco. So when I'm writing a paper, I often edit text on the train on my laptop. And to a lesser extent, at home. But when I'm in the office, I'd be using my desktop. And I just copy things manually back and forth between them…” –a researcher in the lab (interviewed 9/6/2007)

file sync may be used in 3 ways in paper writing

• sharing files with collaborators

• working from multiple devices, each of which may have intermittent connectivity

• backing up active files

using sync to share files is different from using sync to work from multiple devices (or as backup)

Sync is important AND harder than it looks

Getting sync right is the essence of everything. If you don’t, everything else fails.

-- Ray Ozzie in Wired, 11/2008

“Automatic file synchronization has the potential to simplify managing information and roles across devices, but only if users are willing to [use] it. Our findings suggest that people do not trust automatic file synchronization, even though they employ automatic synchronization for other types of information: music, email messages, contact information, calendar data, and task lists.”

-- Dearman and Pierce, Proc. CHI 2008

meanwhile, my CS collaborators had a (systems) agenda...

[diagram: two copies of a component stack (Item Store, Comm., Sync Utilities, Security), two Applications, and an Update API]

Cimbiosys sync protocol:

• eventual consistency

• flexible network topology

• efficient reconciliation of (filtered) sync metadata
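To make "reconciliation of (filtered) sync metadata" concrete, here is a minimal sketch and not the actual Cimbiosys protocol. It assumes each replica tracks the set of versions it already knows about and carries a filter that decides which items it stores. The names `Replica`, `Item`, `accepts`, `knowledge`, and `sync` are all hypothetical, and concurrent updates to the same item are deliberately ignored.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple

# Hypothetical version stamp: (replica that made the update, its local counter).
Version = Tuple[str, int]


@dataclass
class Item:
    item_id: str
    content: str
    version: Version


@dataclass
class Replica:
    name: str
    accepts: Callable[[Item], bool]                     # the replica's filter (assumed)
    items: Dict[str, Item] = field(default_factory=dict)
    knowledge: Set[Version] = field(default_factory=set)
    counter: int = 0

    def update(self, item_id: str, content: str) -> None:
        """Create or modify an item locally and record the new version."""
        self.counter += 1
        version = (self.name, self.counter)
        self.items[item_id] = Item(item_id, content, version)
        self.knowledge.add(version)


def sync(source: Replica, target: Replica) -> int:
    """One-way sync: send only versions the target has not seen.

    Items outside the target's filter are not stored, but their versions are
    still recorded so they are not offered again. Returns the number of items
    stored. Conflict detection is out of scope for this sketch.
    """
    stored = 0
    for item in list(source.items.values()):
        if item.version in target.knowledge:
            continue                                    # target already knows this version
        if target.accepts(item):
            target.items[item.item_id] = item
            stored += 1
        target.knowledge.add(item.version)
    return stored
```

Because any pair of replicas can run `sync` in either direction, updates can travel over whatever paths happen to be available, which is the sense in which the topology is flexible here.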

from a slide titled “evaluation” in our NSDI 2008 Cimbiosys talk

Evaluation is different in the systems community

soliciting use

giving a short talk in the lab: “so… try Cimetric next time you write a paper/share files”

• easy to install (all you need is .NET framework)

• work like you normally would (e.g. editor-agnostic)

• easy to use

• cheerful user support

• easy to bail out if you don’t like it.

meanwhile, other sync services are gradually becoming available.

Windows Live Mesh

Cimetric still has some crucial distinctions that meet the lab’s needs

translating sync into Cimetric

• Working files are folders that you want replicated among your computers. the files don’t all need to be shared with your colleagues; work is seamless among devices

• Repositories contain files you share with your collaborators. repositories have every version you or your colleagues have shared

• Check in/check out allows you to move working files to your copy of the repository and vice-versa. you can choose when to incorporate other people’s changes and when to share yours; locking is advisory
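The three concepts above can be sketched as a small data model. This is an illustration under assumptions, not Cimetric's implementation: the `Repository` class, its method names, and the warning string are hypothetical. The sketch keeps every checked-in version and treats locks as advisory, so a check-in by a non-holder still succeeds but returns a warning.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Repository:
    """Shared level: holds every version that anyone has checked in."""
    versions: Dict[str, List[str]] = field(default_factory=dict)
    locks: Dict[str, str] = field(default_factory=dict)   # path -> user holding the lock

    def check_in(self, path: str, content: str, user: str) -> Optional[str]:
        """Copy a working file into the repository.

        The lock is advisory: the check-in always goes through, and a warning
        is returned if someone else holds the lock.
        """
        holder = self.locks.get(path)
        warning = None
        if holder is not None and holder != user:
            warning = f"{path} is locked by {holder}"
        self.versions.setdefault(path, []).append(content)
        return warning

    def check_out(self, path: str, user: str, lock: bool = False) -> str:
        """Return the latest shared version, optionally taking the advisory lock."""
        if lock and path not in self.locks:
            self.locks[path] = user
        return self.versions[path][-1]


# Private level: each person's working files, modelled here as a plain dict.
cathy_working = {"paper.tex": "v1"}
repo = Repository()
repo.check_in("paper.tex", cathy_working["paper.tex"], user="cathy")
```

Working files never reach collaborators until their owner checks them in, which is what separates sync-for-my-devices from sync-to-share in this model.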

[diagram: original Cimetric design (pre-adoption). Cathy’s files and Ted’s files each span several computers (work desktop, home, laptop); each computer is shown with a repository and working files. Labels: check out/check in, synchronize repository, synchronize working files.]

lesson 1: no cloud = no users

• MSFT IT enforced power saving settings.

– often no viable path among peers

– e.g. idle computers powered themselves down

• most researchers worked with colleagues outside the firewall

• most researchers used at least some computers—e.g. home computers—that were outside the firewall

[diagram: revised Cimetric design. Cathy’s files and Ted’s files each span several computers (work desktop, home, laptop); each computer is shown with a repository and working files, and there is one additional repository. Labels: check out/check in, synchronize repository, synchronize working files.]

who used Cimetric?

ID    # of people   project                  external collaborator?   cloud
UC1   4             sharing project files    no                       no
UC2   2             writing a paper          no                       no
UC3   2             sharing project files    yes                      yes
UC4   2             writing a paper          yes                      yes
UC5   4             writing a paper          no                       yes
UC6   2             writing a paper          no                       yes
UC7   1             developing algorithms    no                       no
UC8   3             writing a paper          yes                      yes
UC9   2             writing a paper          yes                      yes

9 small groups, mostly writing papers, some with external collaborators.

what’s the worst that could happen?

the danger of infrastructure interventions

what do we want to learn?

• What is the role of the cloud?

• Is a bi-level design effective?

• Which aspects of sync should we reveal (and allow people to control)?

the role of the cloud

• cloud-optional combines best features of cloud-based sync with peer-to-peer

– peer-to-peer is faster when it’s possible

– cloud-optional design addresses security concerns with storing files on external cloud

– cloud-optional design minimizes storage and transaction costs (and gets around things like quotas)

• but working files (not just repository) need to be able to sync with the cloud

• Cimetric needed to handle on-the-fly configuration changes (e.g. adding a cloud replica mid-stream)
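One way to read "cloud-optional" is as a partner-selection rule: use a reachable peer when there is one, and fall back to a cloud replica otherwise. The sketch below assumes exactly that rule. `Endpoint` and `choose_sync_partner` are hypothetical names, not part of Cimetric.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    is_cloud: bool
    reachable: bool


def choose_sync_partner(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Prefer a reachable peer (assumed faster); otherwise use a reachable cloud replica."""
    peers = [e for e in endpoints if e.reachable and not e.is_cloud]
    if peers:
        return peers[0]
    clouds = [e for e in endpoints if e.reachable and e.is_cloud]
    return clouds[0] if clouds else None


# The endpoint list is re-read on every call, so a cloud replica added
# mid-stream is picked up on the next sync round without a restart.
endpoints = [Endpoint("work-desktop", is_cloud=False, reachable=False)]
endpoints.append(Endpoint("cloud-replica", is_cloud=True, reachable=True))
partner = choose_sync_partner(endpoints)
```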

bi-level sync

• supports practice

• but...

– difficult to discover

– difficult to configure

– again, configurations change on the fly

• workarounds attest to potential need (e.g. UC9’s use of Cimetric to sync repository + Dropbox to sync working files)

revealing sync processes (part 1)

revealing sync history: which computer did this file come from?

controlling sync: who initiates, what’s the interval, which peers?
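The three controls named above (who initiates, what interval, which peers) could be captured in a small settings object. This is a hypothetical sketch; the field names, the five-minute default, and the `is_due` rule are assumptions rather than Cimetric's actual options.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Initiator(Enum):
    USER = "user"     # sync runs only when the user asks for it
    TIMER = "timer"   # sync also runs automatically on an interval


@dataclass
class SyncPolicy:
    initiator: Initiator = Initiator.TIMER
    interval_seconds: int = 300                      # assumed default
    peers: List[str] = field(default_factory=list)   # which computers to sync with

    def is_due(self, seconds_since_last_sync: float, user_requested: bool = False) -> bool:
        """An explicit user request always wins; otherwise only timer policies fire."""
        if user_requested:
            return True
        return (self.initiator is Initiator.TIMER
                and seconds_since_last_sync >= self.interval_seconds)


manual = SyncPolicy(initiator=Initiator.USER, peers=["laptop"])
automatic = SyncPolicy(interval_seconds=60, peers=["laptop", "home"])
```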

revealing sync processes (part 2)

• over the long haul, it’s hard to tell inactivity from malfunction (e.g. is cloud connection okay?)

• sometimes there’s no good way to get at the things users want to know the most (e.g. what to expect from initial sync)

• sadly, people are most interested in sync when something breaks
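Telling inactivity from malfunction requires the sync process to record attempts separately from successes. The sketch below is one possible approach under that assumption. `SyncStatus`, `describe`, the status strings, and the two-interval staleness threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SyncStatus:
    last_attempt: Optional[float]   # timestamp of the most recent sync attempt
    last_success: Optional[float]   # timestamp of the most recent successful sync
    files_transferred: int          # files moved by the most recent successful sync


def describe(status: SyncStatus, now: float, attempt_interval: float = 300.0) -> str:
    """Classify a sync connection as not running, failing, idle, or active."""
    if status.last_attempt is None or now - status.last_attempt > 2 * attempt_interval:
        return "not running"        # no recent attempt at all: the process may be down
    if status.last_success is None or status.last_success < status.last_attempt:
        return "failing"            # attempts are happening, but the latest one failed
    if status.files_transferred == 0:
        return "idle"               # healthy, there was simply nothing to sync
    return "active"
```

An "idle" result is the case users could not see: the connection is fine, but nothing changed.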

conclusions, speculations, + future work

Conclusions:

• differences between sync-to-share and sync-for-ubiquitous-access need further (design) thought!

• there’s a role for peer-to-peer sync and cloud-mediated sync

Speculations:

• this is a specialized application... UX/architecture might not be understood or configured by ‘civilians’

• something similar to this app could be built using existing sync services; someone should try it

“Future” work:

• see Marshall & Tang from DIS 2012: users form conceptual models of sync that are incomplete and sometimes wrong.

Q: Is user-facing file sync worth it?

contact info:

cathymar@microsoft.com

http://research.microsoft.com/~cathymar

http://www.csdl.tamu.edu/~marshall

twitter: @ccmarshall

my collaborators: {terry,wobber,rama}@microsoft.com
