Collaboration on Large Datasets using Globus
Rachana Ananthakrishnan
University of Chicago
Data sharing in collaborations
[Diagram: Registry, Staging Store, Ingest Store, Analysis Store, Community Store, Archive, Mirror]
Data Management User Stories
• “I need a good place to store / backup / archive my (big) research data”
• “I need to easily, quickly, and reliably move or mirror portions of my data to other places.”
• “I need a way to easily and securely share my data with my colleagues at other institutions.”
• “I want to publish my data.”
• “I want to discover published data.”
• …
Exemplar: ISI-MIP
• Inter-Sectoral Impact Model Intercomparison Project
• Framework to collate climate impact data across scales and sectors
• World-wide collaboration with data assets managed by the collaboration
• Inputs come from various climate models; outputs form the basis for model evaluation and improvement
Credits: Dr. Joshua Elliott, University of Chicago
ISI-MIP Use Cases
• Share data with researchers across institutions world-wide
  – Restricted sharing
  – Multiple institutions
• Accept data submissions
  – Restricted writing to archive
• Publish results
  – Move selected results to other locations
  – Track metadata
  – Discover data
What is Globus?
Big data publish*, transfer and sharing…
…with Dropbox-like simplicity…
…directly from your own storage systems
* In pilot phase
Publish walk-through
Collaboration Archive: Univ. of Chicago, Argonne, IIT, UIUC
Roles: Scientist, Curator
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
Login with Campus Identity
New submission
Assemble the Dataset
Move data to publish archive
Grant Submission License
Submission Complete
Curator Logs in
Curation Workflow Options
Verify Metadata & Files
Approve the Submission
Submission is now Published with DOI
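The publish walk-through can be read as a small state machine: a submission is described, assembled, licensed, submitted, and finally approved by a curator, at which point it receives a DOI. The sketch below is illustrative only. The `Submission` class, the state names, the file path, and the placeholder DOI are assumptions made for this example and are not part of any Globus API.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional


class State(Enum):
    NEW = auto()
    DESCRIBED = auto()
    ASSEMBLED = auto()
    LICENSED = auto()
    SUBMITTED = auto()
    PUBLISHED = auto()


# Allowed transitions, in the order the slides walk through them (hypothetical model).
NEXT = {
    State.NEW: State.DESCRIBED,
    State.DESCRIBED: State.ASSEMBLED,
    State.ASSEMBLED: State.LICENSED,
    State.LICENSED: State.SUBMITTED,
    State.SUBMITTED: State.PUBLISHED,
}


@dataclass
class Submission:
    title: str
    state: State = State.NEW
    metadata: dict = field(default_factory=dict)
    files: list = field(default_factory=list)
    doi: Optional[str] = None

    def _advance(self, expected: State) -> None:
        # Refuse to skip or repeat a step.
        if self.state is not expected:
            raise ValueError(f"expected {expected.name}, got {self.state.name}")
        self.state = NEXT[expected]

    def describe(self, **metadata) -> None:
        self._advance(State.NEW)
        self.metadata.update(metadata)

    def assemble(self, files) -> None:
        self._advance(State.DESCRIBED)
        self.files = list(files)

    def grant_license(self) -> None:
        self._advance(State.ASSEMBLED)

    def submit(self) -> None:
        self._advance(State.LICENSED)

    def approve(self, doi: str) -> None:
        # Curator step: verify metadata and files before publishing.
        if not self.metadata or not self.files:
            raise ValueError("cannot approve a submission without metadata and files")
        self._advance(State.SUBMITTED)
        self.doi = doi


# Scientist steps, then the curator step (all values are placeholders).
sub = Submission("Example dataset")
sub.describe(creator="A. Scientist")
sub.assemble(["results/run1.nc"])
sub.grant_license()
sub.submit()
sub.approve(doi="10.0000/example")
```

Keeping the transitions in one table makes it easy to see that a curator cannot approve something that was never submitted.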
Discover walk-through
Collaboration Archive: Univ. of Chicago, Argonne, IIT, UIUC
Roles: Scientist, Curator
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
5. Search
6. Download
Search Published Datasets
Discovering a Published Dataset
Download the Published Dataset
Select Download Destination
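Discovery depends on the metadata captured at publication time. As a rough illustration, the sketch below runs a keyword search over a small in-memory catalog. The `search` function, the record fields, and the records themselves are placeholders invented for this example; they do not describe how the Globus publication service indexes or queries datasets.

```python
def search(catalog, query):
    """Return records whose metadata contains every query term (case-insensitive)."""
    terms = query.lower().split()
    hits = []
    for record in catalog:
        # Flatten all metadata values into one searchable string.
        text = " ".join(str(value) for value in record.values()).lower()
        if all(term in text for term in terms):
            hits.append(record)
    return hits


# Placeholder records, invented purely for illustration.
catalog = [
    {"title": "Crop yield projections",
     "keywords": "agriculture climate impact",
     "doi": "10.0000/example-1"},
    {"title": "River discharge runs",
     "keywords": "hydrology climate impact",
     "doi": "10.0000/example-2"},
]

results = search(catalog, "climate hydrology")
```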
Globus Under the Covers
Identity, Group, Profile Management Services
…
Sharing Service
Transfer Service
Globus Toolkit
Globus APIs
Globus Connect
Reliable, secure, high-performance file transfer and synchronization
• “Fire-and-forget” transfers
• Automatic fault recovery
• Seamless security integration
• Powerful GUI and APIs
Data Source to Data Destination:
1. User initiates transfer request
2. Globus moves and syncs files
3. Globus notifies user
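To make "moves and syncs" and "automatic fault recovery" concrete, here is a minimal local sketch: it copies only files whose checksums differ and retries a copy that fails. It works on local directories with the Python standard library and is not the Globus transfer service or its API. The `sync` function, its parameters, and the simulated fault are assumptions made for this example.

```python
import hashlib
import shutil
import tempfile
import time
from pathlib import Path


def checksum(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sync(src: Path, dst: Path, copy=shutil.copy2, retries: int = 3, delay: float = 0.0) -> list:
    """Copy files from src to dst, skipping identical ones; return relative paths copied."""
    copied = []
    for source in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = source.relative_to(src)
        target = dst / rel
        if target.exists() and checksum(target) == checksum(source):
            continue  # already in sync
        target.parent.mkdir(parents=True, exist_ok=True)
        for attempt in range(1, retries + 1):
            try:
                copy(source, target)
                break
            except OSError:
                if attempt == retries:
                    raise  # give up after the last attempt
                time.sleep(delay)
        copied.append(rel.as_posix())
    return copied


# Demo in a temporary directory, including one simulated fault.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp, "src"), Path(tmp, "dst")
    (src / "sub").mkdir(parents=True)
    (src / "a.txt").write_text("alpha")
    (src / "sub" / "b.txt").write_text("beta")

    first = sync(src, dst)    # everything is copied
    second = sync(src, dst)   # nothing changed, nothing copied

    (src / "a.txt").write_text("changed")
    calls = {"n": 0}

    def flaky_copy(source, target):
        # Fails once, then behaves like a normal copy.
        calls["n"] += 1
        if calls["n"] == 1:
            raise OSError("simulated fault")
        shutil.copy2(source, target)

    third = sync(src, dst, copy=flaky_copy)
    final_text = (dst / "a.txt").read_text()
```

The caller issues one `sync` call and does not handle the fault itself, which is the "fire-and-forget" idea in miniature.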
Simple, secure sharing off existing storage systems
Data Source
1. User A selects file(s) to share, selects user or group, and sets permissions
2. Globus tracks shared files; no need to move files to cloud storage!
3. User B logs in to Globus and accesses shared file
• Easily share large data with any user or group
• No cloud storage required
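The sharing model above amounts to keeping access rules next to data that never leaves the owner's storage. The sketch below models that idea with a simple rule list. The `SharedEndpoint` class, the user and group names, and the paths are hypothetical and are not the Globus sharing API.

```python
from dataclasses import dataclass, field


@dataclass
class SharedEndpoint:
    """Hypothetical model: files stay in place; only access rules are recorded."""
    owner: str
    groups: dict = field(default_factory=dict)  # group name -> set of user names
    rules: list = field(default_factory=list)   # (principal, path prefix, permissions)

    def share(self, principal: str, path_prefix: str, permissions: str = "r") -> None:
        """Grant a user or group 'r' or 'rw' access below path_prefix."""
        self.rules.append((principal, path_prefix, permissions))

    def can(self, user: str, path: str, permission: str) -> bool:
        """Check whether user holds permission ('r' or 'w') on path."""
        if user == self.owner:
            return True
        for principal, prefix, perms in self.rules:
            is_member = principal == user or user in self.groups.get(principal, set())
            # Match the folder itself or anything below it, not sibling names
            # that merely start with the same characters.
            under_prefix = path == prefix or path.startswith(prefix.rstrip("/") + "/")
            if is_member and under_prefix and permission in perms:
                return True
        return False


# User A shares results read-only with a group and opens an incoming
# folder for writing to one collaborator (all names are placeholders).
endpoint = SharedEndpoint(owner="userA", groups={"collab": {"userB"}})
endpoint.share("collab", "/archive/results", "r")
endpoint.share("userC", "/archive/incoming", "rw")
```

The same rule list covers both restricted sharing and restricted writing to an archive, since read and write permissions are granted separately.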
Thank you
• Sign up and use Globus to transfer and share
• globus.org/signup
• Sign up as an early adopter of publish
• globus.org/data-publication
• Support
Thank you to our sponsors!
U.S. Department of Energy