supporting research collaboration through bi-level file synchronization Cathy Marshall, Ted Wobber, Rama Ramasubramanian, and Doug Terry MSR-SVC, Mountain View, CA GROUP 2012 October 30, 2012 Sanibel Island, Florida


Page 1: supporting research collaboration through bi-level file ... › ~cathycmarshall › group_2012-final_for_web.pdf · Sync is important AND harder than it looks Getting sync right is

supporting research collaboration through bi-level file synchronization

Cathy Marshall, Ted Wobber, Rama Ramasubramanian, and Doug Terry

MSR-SVC, Mountain View, CA

GROUP 2012 October 30, 2012

Sanibel Island, Florida

Page 2

this is a story about developing & using a custom sync app in the wild, among a population of CS researchers...

“I commute by train from San Francisco. So when I'm writing a paper, I often edit text on the train on my laptop. And to a lesser extent, at home. But when I'm in the office, I'd be using my desktop. And I just copy things manually back and forth between them…” –a researcher in the lab (interviewed 9/6/2007)

Page 3

file sync may be used in 3 ways in paper writing

• sharing files with collaborators

• working from multiple devices, each of which may have intermittent connectivity

• backing up active files

using sync to share files is different from using sync to work from multiple devices (or as backup)

Page 4

Sync is important AND harder than it looks

“Getting sync right is the essence of everything. If you don’t, everything else fails.”

-- Ray Ozzie in Wired, 11/2008

“Automatic file synchronization has the potential to simplify managing information and roles across devices, but only if users are willing to [use] it. Our findings suggest that people do not trust automatic file synchronization, even though they employ automatic synchronization for other types of information: music, email messages, contact information, calendar data, and task lists.”

-- Dearman and Pierce, Proc. CHI 2008

Page 5

meanwhile, my CS collaborators had a (systems) agenda...

[Architecture diagram: two Applications, each on an Update API, with component boxes labeled Item Store, Comm., Sync Utilities, and Security]

Cimbiosys sync protocol:

• eventual consistency

• flexible network topology

• efficient reconciliation of (filtered) sync metadata
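To make the protocol properties above concrete, here is a minimal sketch of filtered, knowledge-based sync. It is not the Cimbiosys protocol itself: the names (`Replica`, `sync`, `accepts`) are hypothetical, knowledge is simplified to a version vector, conflicts are ignored, and the source is assumed to hold every item the target's filter selects.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Version = Tuple[str, int]  # (originating replica, that replica's update counter)


@dataclass
class Item:
    item_id: str
    content: str
    version: Version


def accept_all(item: Item) -> bool:
    return True


@dataclass
class Replica:
    name: str
    accepts: Callable[[Item], bool] = accept_all  # this replica's filter
    items: Dict[str, Item] = field(default_factory=dict)
    knowledge: Dict[str, int] = field(default_factory=dict)  # highest counter seen per replica
    counter: int = 0

    def update(self, item_id: str, content: str) -> None:
        """Create or modify an item locally and record the new version."""
        self.counter += 1
        self.items[item_id] = Item(item_id, content, (self.name, self.counter))
        self.knowledge[self.name] = self.counter

    def knows(self, version: Version) -> bool:
        origin, n = version
        return self.knowledge.get(origin, 0) >= n


def sync(source: Replica, target: Replica) -> int:
    """One-way sync: send only items the target wants and has not yet seen."""
    sent = 0
    for item in source.items.values():
        if target.accepts(item) and not target.knows(item.version):
            target.items[item.item_id] = item  # simplification: no conflict detection
            sent += 1
    # Merging knowledge is what keeps later syncs cheap (only metadata is compared).
    for origin, n in source.knowledge.items():
        target.knowledge[origin] = max(target.knowledge.get(origin, 0), n)
    return sent


desktop = Replica("desktop")
laptop = Replica("laptop", accepts=lambda item: item.item_id.endswith(".tex"))
desktop.update("paper.tex", "draft 1")
desktop.update("results.bin", "raw data")
first_round = sync(desktop, laptop)   # only the .tex item passes the laptop's filter
second_round = sync(desktop, laptop)  # nothing new, so nothing is sent
```

Repeating `sync` in either direction between any pair of replicas, in any order, is what lets replicas converge without a fixed topology.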

Page 6

from a slide titled “evaluation” in our NSDI 2008 Cimbiosys talk

Evaluation is different in the systems community

Page 7

soliciting use

Page 8

Page 9

giving a short talk in the lab: “so… try Cimetric next time you write a paper/share files”

• easy to install (all you need is .NET framework)

• work like you normally would (e.g. editor-agnostic)

• easy to use

• cheerful user support

• easy to bail out if you don’t like it.

Page 10

meanwhile, other sync services are gradually becoming available.

Windows Live Mesh

Cimetric still has some crucial distinctions that meet the lab’s needs

Page 11

translating sync into Cimetric

• Working files are folders that you want replicated among your computers. The files don’t all need to be shared with your colleagues; work is seamless among devices.

• Repositories contain files you share with your collaborators. Repositories have every version you or your colleagues have shared.

• Check in/check out allows you to move working files to your copy of the repository and vice versa. You can choose when to incorporate other people’s changes and when to share yours; locking is advisory.
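The three concepts above can be sketched as a small data model. This is an illustration only, not Cimetric's code: `Repository`, `Workspace`, `check_in`, and `check_out` are hypothetical names, and a single shared `Repository` object stands in for the synchronized per-user repository copies.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Repository:
    history: Dict[str, List[str]] = field(default_factory=dict)  # path -> every shared version
    locks: Dict[str, str] = field(default_factory=dict)          # path -> user holding the lock


@dataclass
class Workspace:
    user: str
    repo: Repository
    working: Dict[str, str] = field(default_factory=dict)  # private working files

    def check_out(self, path: str, lock: bool = False) -> Optional[str]:
        """Copy the latest shared version into the working files.

        Locking is advisory: a lock held by someone else produces a
        warning string but never blocks the check-out.
        """
        holder = self.repo.locks.get(path)
        warning = None
        if holder is not None and holder != self.user:
            warning = f"{path} is locked by {holder}"
        if lock and holder is None:
            self.repo.locks[path] = self.user
        versions = self.repo.history.get(path)
        if versions:
            self.working[path] = versions[-1]
        return warning

    def check_in(self, path: str) -> int:
        """Share the working copy as a new version; return the version count."""
        self.repo.history.setdefault(path, []).append(self.working[path])
        if self.repo.locks.get(path) == self.user:
            del self.repo.locks[path]  # checking in releases our own lock
        return len(self.repo.history[path])


repo = Repository()
cathy = Workspace("cathy", repo)
ted = Workspace("ted", repo)

cathy.working["paper.tex"] = "intro v1"
cathy.check_in("paper.tex")                   # Cathy chooses when to share
ted.check_out("paper.tex", lock=True)         # Ted pulls it and takes the lock
warning = cathy.check_out("paper.tex")        # allowed, but Cathy is warned
ted.working["paper.tex"] = "intro v2"
ted.check_in("paper.tex")                     # new version; lock released
cathy.check_out("paper.tex")                  # Cathy chooses when to incorporate it
```

The point of the two levels is that edits to `working` stay private until an explicit `check_in`, while the repository keeps every shared version.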

repository

working files

Page 12

[Diagram: original Cimetric design (pre-adoption). Cathy’s files and Ted’s files are spread across devices labeled Work desktop, Home, and Laptop; each device holds a repository and working files. Labeled operations: check out/check in, synchronize repository, synchronize working files.]

Page 13

lesson 1: no cloud = no users

• MSFT IT enforced power saving settings.

– often no viable path among peers

– e.g. idle computers powered themselves down

• most researchers worked with colleagues outside the firewall

• most researchers used at least some computers—e.g. home computers—that were outside the firewall

Page 14

[Diagram: revised Cimetric design. Same layout as the original design (Cathy’s files and Ted’s files across Work desktop, Home, and Laptop devices, each holding a repository and working files, with check out/check in, synchronize repository, and synchronize working files), plus one additional repository.]

Page 15

who used Cimetric?

ID    # of people   project                 external collaborator?   cloud
UC1   4             sharing project files   no                       no
UC2   2             writing a paper         no                       no
UC3   2             sharing project files   yes                      yes
UC4   2             writing a paper         yes                      yes
UC5   4             writing a paper         no                       yes
UC6   2             writing a paper         no                       yes
UC7   1             developing algorithms   no                       no
UC8   3             writing a paper         yes                      yes
UC9   2             writing a paper         yes                      yes

9 small groups, mostly writing papers, some with external collaborators.
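As a quick check of that summary, the rows of the table can be tallied directly. The tuple layout and variable names are just for illustration; the values are copied from the table above.

```python
# (id, people, project, external collaborator?, used cloud?)
GROUPS = [
    ("UC1", 4, "sharing project files", False, False),
    ("UC2", 2, "writing a paper", False, False),
    ("UC3", 2, "sharing project files", True, True),
    ("UC4", 2, "writing a paper", True, True),
    ("UC5", 4, "writing a paper", False, True),
    ("UC6", 2, "writing a paper", False, True),
    ("UC7", 1, "developing algorithms", False, False),
    ("UC8", 3, "writing a paper", True, True),
    ("UC9", 2, "writing a paper", True, True),
]

paper_groups = sum(1 for g in GROUPS if g[2] == "writing a paper")
external_groups = sum(1 for g in GROUPS if g[3])
cloud_groups = sum(1 for g in GROUPS if g[4])
# Groups with an external collaborator that did NOT use the cloud.
external_without_cloud = [g[0] for g in GROUPS if g[3] and not g[4]]

print(paper_groups, external_groups, cloud_groups, external_without_cloud)
```

In this table, every group with an external collaborator also used the cloud.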

Page 16

what’s the worst that could happen?

Page 17

the danger of infrastructure interventions

Page 18

what do we want to learn?

• What is the role of the cloud?

• Is a bi-level design effective?

• Which aspects of sync should we reveal (and allow people to control)?

Page 19

the role of the cloud

• cloud-optional combines best features of cloud-based sync with peer-to-peer

– peer-to-peer is faster when it’s possible

– cloud-optional design addresses security concerns with storing files on external cloud

– cloud-optional design minimizes storage and transaction costs (and gets around things like quotas)

• but working files (not just repository) need to be able to sync with the cloud

• Cimetric needed to handle on-the-fly configuration changes (e.g. adding a cloud replica mid-stream)
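One way to picture "cloud-optional" is as a partner-selection rule: use a reachable peer when there is one, fall back to a cloud replica if one is configured. The sketch below is an assumption about how such a rule could look, not Cimetric's actual logic; `Endpoint` and `choose_sync_partner` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    kind: str               # "peer" or "cloud"
    reachable: bool = True


def choose_sync_partner(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Prefer any reachable peer; otherwise use a reachable cloud replica."""
    for kind in ("peer", "cloud"):
        for endpoint in endpoints:
            if endpoint.kind == kind and endpoint.reachable:
                return endpoint
    return None  # no viable path right now


work_desktop = Endpoint("work-desktop", "peer")
config = [work_desktop]

partner_when_peer_up = choose_sync_partner(config)

# The idle desktop powers itself down: with no cloud there is no path at all.
work_desktop.reachable = False
partner_when_peer_down = choose_sync_partner(config)

# Adding a cloud replica mid-stream restores a path without restarting anything.
config.append(Endpoint("cloud-replica", "cloud"))
partner_after_adding_cloud = choose_sync_partner(config)
```

Because the configuration is just data consulted on every sync attempt, replicas can be added or removed on the fly.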

Page 20

bi-level sync

• supports practice

• but...

– difficult to discover

– difficult to configure

– again, configurations change on the fly

• workarounds attest to potential need (e.g. UC9’s use of Cimetric to sync repository + Dropbox to sync working files)

Page 21

revealing sync processes (part 1)

revealing sync history: which computer did this file come from?

controlling sync: who initiates, what’s the interval, which peers?
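The two questions above map naturally onto two small records: a log of where each file came from, and a settings object for the controls. This is a hypothetical sketch; the field names and defaults are assumptions, not Cimetric's actual options.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class SyncEvent:
    path: str
    from_computer: str   # which computer did this file come from?
    received_at: float   # seconds since some epoch


@dataclass
class SyncSettings:
    initiated_by: str = "schedule"                   # who initiates: "schedule" or "user"
    interval_seconds: int = 300                      # what's the interval?
    peers: List[str] = field(default_factory=list)   # which peers?


def came_from(history: List[SyncEvent], path: str) -> Optional[str]:
    """Return the computer that most recently supplied `path`, if any."""
    events = [e for e in history if e.path == path]
    if not events:
        return None
    return max(events, key=lambda e: e.received_at).from_computer


history = [
    SyncEvent("paper.tex", "laptop", 100.0),
    SyncEvent("paper.tex", "work-desktop", 200.0),
]
settings = SyncSettings(initiated_by="user", interval_seconds=600, peers=["laptop"])
latest_source = came_from(history, "paper.tex")
unknown_source = came_from(history, "refs.bib")
```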

Page 22

revealing sync processes (part 2)

• over the long haul, it’s hard to tell inactivity from malfunction (e.g. is cloud connection okay?)

• sometimes there’s no good way to get at the things users want to know the most (e.g. what to expect from initial sync)

• sadly, people are most interested in sync when something breaks
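One possible way to separate inactivity from malfunction is to track two timestamps: the last successful contact with a sync partner and the last actual file transfer. The sketch below illustrates that idea under assumed names and thresholds; it does not describe how Cimetric reported status.

```python
from typing import Optional


def classify_sync_state(
    now: float,
    last_contact: Optional[float],
    last_transfer: Optional[float],
    interval: float,
    missed_rounds: int = 3,
) -> str:
    """Classify sync health from two timestamps (all values in seconds).

    - no contact for several intervals -> "possible malfunction"
    - contact is recent, no transfers  -> "inactive" (nothing to sync)
    - contact and transfers recent     -> "active"
    """
    limit = missed_rounds * interval
    if last_contact is None or now - last_contact > limit:
        return "possible malfunction"
    if last_transfer is None or now - last_transfer > limit:
        return "inactive"
    return "active"


now = 10_000.0
quiet_but_fine = classify_sync_state(now, last_contact=9_900.0, last_transfer=2_000.0, interval=300.0)
lost_cloud = classify_sync_state(now, last_contact=2_000.0, last_transfer=2_000.0, interval=300.0)
busy = classify_sync_state(now, last_contact=9_900.0, last_transfer=9_800.0, interval=300.0)
```

A status display built on this distinction could say "connected, nothing to sync" instead of going silent, which is the case users cannot otherwise tell apart from a broken cloud connection.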

Page 23

conclusions, speculations, + future work

Conclusions:

• differences between sync-to-share and sync-for-ubiquitous-access need further (design) thought!

• there’s a role for peer-to-peer sync and cloud-mediated sync

Speculations:

• this is a specialized application... UX/architecture might not be understood or configured by ‘civilians’

• something similar to this app could be built using existing sync services; someone should try it

“Future” work:

• see Marshall & Tang from DIS 2012: users form conceptual models of sync that are incomplete and sometimes wrong.

Q: Is user-facing file sync worth it?

Page 24

contact info:

[email protected]

http://research.microsoft.com/~cathymar

http://www.csdl.tamu.edu/~marshall

twitter: @ccmarshall

my collaborators: {terry,wobber,rama}@microsoft.com